Introduction to AI and cybersecurity

Artificial intelligence (AI) is one of the perennial buzzwords of computer science. For decades, we’ve been promised robots that will take care of our every need, but science hasn’t quite come through.

AI has certainly made significant strides in some areas, one of which is cybersecurity. Many cybersecurity providers now offer products that leverage artificial intelligence and machine learning (ML) to help detect and respond to cyber threats. Even the Pentagon has publicly released a strategy detailing how the U.S. Department of Defense plans to leverage AI to predict and defend against digital and physical threats.

Why is AI entering cybersecurity?

A large portion of a cyberdefender’s role consists of boring, thankless tasks. Detecting most cyberthreats requires trawling through massive amounts of data, looking for anomalies or indicators of a possible attack. Once a potential threat has been detected, further analysis is required to identify the details of the attack, the scope of the breach and the effects on the affected systems. All of it is number-crunching and combing through data at scale.

To be honest, humans are awful at this sort of work. Analysts are known to suffer from security alert fatigue, where too many false alarms cause them to miss real ones. We get bored, we miss things, and the result is a breach.

Computers, on the other hand, never get bored, and large-scale data analysis and anomaly detection are among the things they’re best suited for. The main reason that AI is entering the cyberdomain is that it is a scalable way to ensure the security of an organization: it acts as a force multiplier for human effort. By filtering out most of the noise and bringing only the data points most likely to be significant to the attention of analysts, AI and ML systems ensure that analysts’ attention is focused where it’s most needed.

Applications of AI in cybersecurity

The field of AI is still evolving, and not all of the possible applications of AI and ML in the cybersecurity space have been developed or explored yet. However, many AI-based systems are already in use to help defend computers and networks.

Automated network analysis

Network analysis is a perfect fit for machine learning systems, due to the sheer volume of available data that requires analysis. Most malware and cyberattackers operate over the network, so monitoring network communications is a good way to detect attempted installations of malware and the command-and-control (C2) communications of successful intrusions.

Most malware authors now misuse common protocols for C2 in order to blend in with the rest of the traffic on the network. Placing data in HTTP header values or embedding it in DNS requests and responses allows the data to get past the firewall and increases the probability that it will be overlooked. ML-based detection algorithms use keyword matching, statistical monitoring, anomaly detection and other mechanisms to determine whether a given packet is sufficiently “different” from the norm. If so, it’s brought to the attention of a human for further analysis.
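To make the idea concrete, one cheap anomaly signal for DNS-based C2 is the length and character entropy of a query’s leftmost label: encoded payloads look far more random than ordinary hostnames. The sketch below is a toy heuristic with made-up thresholds and example domains, not any product’s actual detection logic:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of the string."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_like_dns_tunneling(query: str, entropy_threshold: float = 3.5,
                             length_threshold: int = 40) -> bool:
    """Flag queries whose leftmost label is unusually long or random-looking."""
    label = query.split(".")[0]
    return len(label) > length_threshold or shannon_entropy(label) > entropy_threshold

# An ordinary hostname scores low; a label carrying encoded data scores high.
print(looks_like_dns_tunneling("mail.example.com"))                                 # benign
print(looks_like_dns_tunneling("mzxw6ytboi2dqmrrgeztinjwg44dsnbv.badguy.example"))  # suspicious
```

A real system would combine many such features (query rate, record types, response sizes) and learn its thresholds from baseline traffic rather than hard-coding them.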

Email scanning

One of the biggest threats to organizational cybersecurity is phishing. Adversaries have discovered that it’s much easier to get a human to click on a link than to discover and exploit a zero-day or unpatched vulnerability in a target’s systems. Detecting and blocking these malicious emails is an extremely active area of research in the cyberfield.

Machine learning and AI-based algorithms help detect phishing emails at every level. Some anti-phishing programs perform deep link inspection, simulating clicks on all links in an email and examining the resulting pages for signs of phishing. Computer vision is used to see emails as the recipient would and look for suspicious visual features. Natural language processing is employed to determine whether the email’s word choice, grammar and tone match expectations. Finally, anomaly detection is applied at every level to determine whether any feature of the email’s sender, recipient, body, attachments or other attributes is cause for suspicion.
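In the same spirit, a hand-rolled feature-based score might look like the sketch below. The features, weights and keyword list are invented for illustration; a production filter would use far richer features and learned weights:

```python
import re

# Credential- and urgency-themed wording is a classic phishing signal.
URGENCY_WORDS = {"urgent", "verify", "suspended", "immediately"}

def phishing_score(sender_domain: str, subject: str, body: str) -> int:
    """Toy heuristic: higher score means more suspicious."""
    score = 0
    text = (subject + " " + body).lower()
    score += sum(1 for word in URGENCY_WORDS if word in text)
    # Links that point at a raw IP address instead of a hostname.
    if re.search(r"https?://\d{1,3}(?:\.\d{1,3}){3}", body):
        score += 2
    # Links whose host doesn't belong to the sender's domain.
    # (Naive suffix check -- a real filter would parse domains properly.)
    for host in re.findall(r"https?://([^/\s]+)", body):
        if not host.endswith(sender_domain):
            score += 1
    return score

print(phishing_score("example.com", "Urgent: verify your account",
                     "Log in at http://198.51.100.7/login immediately"))
print(phishing_score("example.com", "Meeting notes",
                     "Agenda at https://docs.example.com/agenda"))
```

An ML-based filter learns which combinations of such features predict phishing from labeled mail, instead of relying on hand-picked rules like these.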

While a human could perform this analysis with the aid of a checklist or similar, the sheer volume of emails that people receive means that most people don’t. Using machine learning and AI to weed out or warn the user about the majority of suspicious emails provides protection at scale and decreases vulnerability to phishing attacks.

Machine learning for antivirus

Traditionally, antivirus programs are signature-based. As malware is discovered, indicators of compromise (IoCs) for it are collected and distributed to antivirus engines. As each file enters a network or computer, it’s scanned against the signature list and quarantined or deleted in the event of a match.
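At its simplest, a signature database is a set of known-bad file hashes. The sketch below seeds one from a made-up sample purely for illustration; real engines ship a continuously updated signature feed and match far richer indicators than whole-file hashes:

```python
import hashlib

def sha256(data: bytes) -> str:
    """Hex digest used as a simple file signature."""
    return hashlib.sha256(data).hexdigest()

# Invented sample standing in for a real signature feed.
malware_sample = b"drop database; exfiltrate(secrets)"
KNOWN_BAD_HASHES = {sha256(malware_sample)}

def scan_file(data: bytes) -> bool:
    """Quarantine decision: True means the file matches a known signature."""
    return sha256(data) in KNOWN_BAD_HASHES

print(scan_file(malware_sample))            # matches the signature list
print(scan_file(b"completely benign file")) # no match
```

Note that flipping a single byte of the sample changes its hash and defeats the match entirely, which is exactly the applicability problem with per-sample signatures.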

The issues with traditional approaches to antivirus detection are delay, scalability and applicability of signatures. By definition, there is always a delay between an attack starting and signatures becoming available, since someone needs to detect the malware, generate a signature and deploy it. The use of large lists of signatures is also problematic, since the lists grow continuously, making storage and scanning less efficient. Finally, malware authors may use different malware for each infection to ensure that no signature exists for a given threat.

Antivirus systems using AI focus on detecting unusual behavior by programs rather than matching signatures. Since most malware is designed to do things that differ from the standard operation of the computer, it can be detected based on these actions. This allows AI-based antivirus engines to detect zero-day exploits and other previously unknown malware.
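The core of a behavior-based check is comparing a program’s observed actions against a profile of its normal behavior. The action names and profile below are invented for illustration; real systems learn the profile from system-call or API traces:

```python
# Invented profile: actions a hypothetical document viewer normally performs.
EXPECTED_ACTIONS = {"read_config", "open_document", "render_page", "check_updates"}

def suspicious_actions(observed: list[str]) -> list[str]:
    """Return observed actions that fall outside the program's normal profile."""
    return [action for action in observed if action not in EXPECTED_ACTIONS]

# Ransomware-like behavior from a document viewer stands out immediately,
# even if no signature for the malware exists yet.
print(suspicious_actions(["open_document", "encrypt_files", "delete_backups"]))
```

The advantage over signatures is that this catches *what the program does*, so a brand-new sample that encrypts files is flagged just as readily as a known one.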

User behavioral modeling

Beyond modeling the behavior of programs on the computer, some AI-based cyberdefenses model the behavior of users on the system. This is designed to detect and remediate account takeover attacks where an attacker has stolen a user’s credentials and used them to gain access to the account through legitimate means. Even attackers that limit themselves to using legitimate programs in their attacks (“living off the land”) often use them in a way that is different from legitimate users.

By observing these deviations, AI-based algorithms can detect these account takeovers and initiate an account lockout and further investigation.
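As a minimal sketch of the idea, a per-user baseline can be as simple as a z-score over a numeric feature such as typical login hour. The data and threshold here are invented; real systems model many features at once:

```python
from statistics import mean, stdev

def is_anomalous(history: list[float], value: float, threshold: float = 3.0) -> bool:
    """Flag a measurement more than `threshold` standard deviations from the user's baseline."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# A user who always logs in around 9 a.m. suddenly logging in at 3 a.m.
login_hours = [9.0, 9.5, 10.0, 8.5, 9.0]
print(is_anomalous(login_hours, 3.0))   # 3 a.m. login: far outside the baseline
print(is_anomalous(login_hours, 10.0))  # ordinary morning login
```

In practice the flagged event would trigger a step-up response (re-authentication or lockout) rather than an outright block, since legitimate users do occasionally behave unusually.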

The future of AI in cybersecurity

Artificial intelligence still has a lot of growing to do, and this further development will doubtless create new applications in cybersecurity. As algorithms improve, the need for humans to separate false positives from true ones will decrease, enabling fully automated cyberdefense systems.

One challenge with the increasing use of AI for cybersecurity is that the effectiveness of AI and ML depends heavily on the quality of the data used to train them. Poorly trained algorithms, or algorithms trained on data deliberately corrupted by an adversary, are likely to miss detections or generate large numbers of false alarms.

A common belief about the future of AI in cybersecurity is that it won’t be fully on the side of the network defender. A survey at Black Hat found that 62% of participants believed an AI-powered cyberattack was possible within the next year, and Darktrace CEO Nicole Eagan believes that the future of cybersecurity is AI vs. AI.

As artificial intelligence and machine learning improve, they will be leveraged by malware authors and cyberattackers for a variety of applications, including network scanning, automated phishing attacks and AI-enabled botnets. AI-based defenses will have to adapt and grow to meet the threat of these rapidly evolving systems as well.

 

Sources

  1. Summary of the 2019 Department of Defense Artificial Intelligence Strategy: Harnessing AI to Advance Our Security and Prosperity, U.S. Department of Defense
  2. Artificial Intelligence and Cybersecurity: Attacking and Defending, The State of Security
  3. Darktrace CEO: The Future of Cybersecurity is A.I. vs. A.I., Fortune