Using Adversarial Machine Learning to Combat Attacks on ML Systems

As AI and machine learning (ML) become more common in every field, the risk of cyberattacks that target these technologies grows. We are not at the point at which machines threaten to take over humanity, as depicted in sci-fi films like The Matrix. But there are real ML security issues that can affect critical fields such as healthcare, transportation, and finance. Security failures in these fields can prove disastrous, which is why adversarial machine learning is more important than ever.

Adversarial machine learning is the study of how machine learning algorithms can be compromised and how they can be defended. It is an important tool for combatting adversarial examples, one of the most potent threats to machine learning today. Adversarial examples are inputs that an attacker has deliberately altered so that an ML model misinterprets them. The alterations are often too small for a person to notice, yet they are enough to change what the model predicts. Attackers can use adversarial examples to exploit the way ML algorithms work and cause serious harm.

How adversarial examples exploit machine learning algorithms

Machine learning models go through a training phase, during which they “learn” to identify what they “see” and to carry out various tasks. During training, an image model is shown images and their corresponding labels. The system examines the pixel values in each image and adjusts its internal parameters so that it links those pixel values with the correct label. An adversarial example takes an image and changes the values of some or all of its pixels, usually by a small amount. The number of pixels stays the same, but the carefully chosen changes cause the ML algorithm to read the image incorrectly.
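To make the idea concrete, here is a minimal sketch of one well-known way to build such a change, the fast gradient sign method, applied to a toy linear classifier. Everything in it is an illustrative assumption rather than a real system: the 784-pixel "image", the random weights, the bias chosen so that the clean image scores +2.0, and the helper names (`make_toy_problem`, `perturb`, and so on) are all hypothetical. For a linear model, the gradient of the score with respect to each pixel is simply that pixel's weight, so nudging every pixel a small step against the sign of its weight lowers the score as quickly as possible.

```python
import random


def sign(v):
    # Returns 1, -1, or 0 depending on the sign of v
    return (v > 0) - (v < 0)


def score(x, w, b):
    # Linear score: weighted sum of pixel values plus a bias
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(x, w, b):
    # Class 1 if the score is positive, class 0 otherwise
    return 1 if score(x, w, b) > 0 else 0


def perturb(x, w, b, eps):
    # Move each pixel by at most eps in the direction that pushes the
    # score toward the opposite class, then clip to the valid range [0, 1]
    direction = -1 if predict(x, w, b) == 1 else 1
    x_adv = [xi + direction * eps * sign(wi) for xi, wi in zip(x, w)]
    return [min(1.0, max(0.0, v)) for v in x_adv]


def make_toy_problem(n_pixels=784, seed=0):
    # Hypothetical setup: random weights, a mid-gray noisy "image",
    # and a bias chosen so that the clean image scores +2.0
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(n_pixels)]
    x = [rng.uniform(0.3, 0.7) for _ in range(n_pixels)]
    b = 2.0 - sum(wi * xi for wi, xi in zip(w, x))
    return x, w, b


x, w, b = make_toy_problem()
x_adv = perturb(x, w, b, eps=0.05)

print("clean label:", predict(x, w, b))
print("adversarial label:", predict(x_adv, w, b))
print("largest pixel change:", max(abs(a - c) for a, c in zip(x, x_adv)))
```

No single pixel moves by more than `eps`, yet the predicted label changes because hundreds of tiny, coordinated changes add up in the score. Real attacks on deep networks compute the gradient of a loss function through the whole network, but the principle is the same.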

One well-known example comes from Google researchers, who showed that a popular image classification model could look at two nearly identical pictures of a panda and classify one as a panda and the other as a gibbon. To the human eye, both images looked like the same panda, but the pixel values of one image had been manipulated so that the model read it as a gibbon.

To top it off, the model reported that it was 57.7% confident that the original image was a panda, and 99.3% confident that the altered image was a gibbon. In other words, it was far more confident in its wrong answer than in its right one.
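One commonly cited explanation for this kind of overconfidence is that many tiny pixel changes, all pushing in the same direction, add up to a large change in the model's internal score. The sketch below illustrates that arithmetic for a simplified two-class linear model. It is not the model from the panda experiment, and the function name, step size, and weight magnitude are all assumed values chosen only for illustration. The starting score of 0.31 corresponds to roughly 58% confidence in the correct class.

```python
import math


def sigmoid(z):
    # Maps a raw score (logit) to a confidence between 0 and 1
    return 1.0 / (1.0 + math.exp(-z))


def wrong_class_confidence(n_pixels, eps, mean_abs_weight, clean_logit):
    # Worst case for a linear model: every pixel moves by eps against
    # its weight, so the logit shifts by eps * n_pixels * mean_abs_weight
    shift = eps * n_pixels * mean_abs_weight
    return sigmoid(shift - clean_logit)


# Assumed illustrative values: step of 0.01 per pixel, average weight
# magnitude of 0.05, clean logit of 0.31 in favor of the correct class
for n in (100, 1_000, 10_000):
    conf = wrong_class_confidence(n, eps=0.01, mean_abs_weight=0.05, clean_logit=0.31)
    print(f"{n:>6} pixels -> confidence in wrong class: {conf:.3f}")
```

With the same per-pixel step, the attack fails on a tiny image but produces near-certain confidence in the wrong class once the image has enough pixels. This is why a change that is invisible in any one pixel can still dominate the final prediction.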

The above example is fairly benign. It is not a life-and-death situation if an image is identified as a panda or a gibbon. Rather, it is an illustration of the potential vulnerabilities of ML systems. There are many other situations in which the misidentification of an object can literally be life-threatening. For example, ML systems that run autonomous cars must be able to identify the things they see, such as stop signs, speed limit signs, and pedestrian crosswalks. If an adversarial attack changes the pixel values of a stop sign so that the ML algorithm sees something else, or nothing at all, the car's occupants and everyone around them would be in grave danger. Unfortunately, these are the types of adversarial attacks that cybersecurity experts are worried about today.

Detecting adversarial attacks

Microsoft, IBM, MITRE, Nvidia, and nine other organizations have teamed up to confront the threat of adversarial attacks. They developed a framework called the Adversarial ML Threat Matrix, a guide to help security teams detect and respond to these types of attacks. Still in its early stages, the Adversarial ML Threat Matrix is modeled on ATT&CK, a widely used knowledge base developed by MITRE that catalogs the tactics and techniques attackers use.

The matrix includes documented techniques that attackers use against ML systems and the infrastructure around them. The vulnerabilities of ML systems differ greatly from those of other technologies, and traditional cybersecurity tools are not necessarily designed to detect them. The ML Threat Matrix also includes a set of case studies to help companies understand what they are up against. The case studies include examples of how attackers used adversarial techniques to poison data, bypass spam detectors, and force AI systems to reveal confidential data.

While there is no easy solution at this point, the creators of the ML Threat Matrix hope that it will generate awareness of adversarial attacks and encourage security teams to beef up this area of security.