You created a machine learning application. Now make sure it’s secure.

The software industry has demonstrated, all too clearly, what happens when you don’t pay attention to security.

By Ben Lorica and Mike Loukides
February 28, 2019
Security (source: Pixabay)

In a recent post, we described what it would take to build a sustainable machine learning practice. By “sustainable,” we mean projects that aren’t just proofs of concept or experiments. A sustainable practice means projects that are integral to an organization’s mission: projects by which an organization lives or dies. These projects are built and supported by a stable team of engineers, and backed by a management team that understands what machine learning is, why it’s important, and what it’s capable of accomplishing. Finally, sustainable machine learning means that as many aspects of product development as possible are automated: not just building models, but cleaning data, building and managing data pipelines, testing, and much more. Machine learning will penetrate our organizations so deeply that it won’t be possible for humans to manage it unassisted.

Organizations throughout the world are waking up to the fact that security is essential to their software projects. Nobody wants to be the next Sony, the next Anthem, or the next Equifax. But while we know how to make traditional software more secure (even though we frequently don’t), machine learning presents a new set of problems. Any sustainable machine learning practice must address machine learning’s unique security issues. We didn’t do that for traditional software, and we’re paying the price now. Nobody wants to pay the price again. If we learn one thing from traditional software’s approach to security, it’s that we need to be ahead of the curve, not behind it. As Joanna Bryson writes, “Cyber security and AI are inseparable.”


The presence of machine learning in any organization won’t be a single application, a single model; it will be many applications, using many models—perhaps thousands of models, or tens of thousands, automatically generated and updated. Machine learning on low-power edge devices, ranging from phones to tiny sensors embedded in assembly lines, tools, appliances, and even furniture and building structures, increases the number of models that need to be monitored. And the advent of 5G mobile services, which significantly increases the network bandwidth to mobile devices, will make it much more attractive to put machine learning at the edge of the network. We anticipate billions of machines, each of which may be running dozens of models. At this scale, we can’t assume that we can deal with security issues manually. We need tools to assist the humans responsible for security. We need to automate as much of the process as possible, but not too much, giving humans the final say.

In “Lessons learned turning machine learning models into real products and services,” David Talby writes that “the biggest mistake people make with regard to machine learning is thinking that the models are just like any other type of software.” Model development isn’t software development. Models are unique—the same model can’t be deployed twice; the accuracy of any model degrades as soon as it is put into production; and the gap between training data and live data, representing real users and their actions, is huge. In many respects, the task of modeling doesn’t get started until the model hits production, and starts to encounter real-world data.

Unfortunately, one characteristic that software development has in common with machine learning is a lack of attention to security. Security tends to be a low priority. It gets some lip service, but falls out of the picture when deadlines get tight. In software, that’s been institutionalized in the “move fast and break things” mindset. If you’re building fast, you’re not going to take the time to write sanitary code, let alone think about attack vectors. You might not “break things,” but you’re willing to build broken things; the benefits of delivering insecure products on time outweigh the downsides, as Daniel Miessler has written. You might be lucky; the vulnerabilities you create may never be discovered. But if security experts aren’t part of the development team from the beginning, if security is something to be added on at the last minute, you’re relying on luck, and that’s not a good position to be in. Machine learning is no different, except that the pressure of delivering a product on time is even greater, the issues aren’t as well understood, the attack surface is larger, the targets are more valuable, and companies building machine learning products haven’t yet engaged with the problems.

What kinds of attacks will machine learning systems see, and what will they have to defend against? All of the attacks we have been struggling with for years, plus a number of vulnerabilities that are specific to machine learning. Here’s a brief taxonomy of attacks against machine learning:

Poisoning, or injecting bad (“adversarial”) data into the training data. We’ve seen this many times in the wild. Microsoft’s Tay was an experimental chatbot that was quickly taught to spout racist and anti-semitic messages by the people who were chatting with it. By inserting racist content into the data stream, they effectively gained control over Tay’s behavior. The appearance of “fake news” in channels like YouTube, Facebook, Twitter, and even Google searches, was similar: once fake news was posted, users were attracted to it like flies, and the algorithms that made recommendations “learned” to recommend that content. danah boyd has argued that these incidents need to be treated as security issues, intentional and malicious corruption of the data feeding the application, not as isolated pranks or algorithmic errors.

Any machine learning system that constantly trains itself is vulnerable to poisoning. Such applications could range from customer service chatbots (can you imagine a call center bot behaving like Tay?) to recommendation engines (real estate redlining might be a consequence) or even to medical diagnosis (modifying recommended drug dosages). To defend against poisoning, you need strong control over the training data. Such control is difficult (if not impossible) to achieve. “Black hat SEO” to improve search engine rankings is nothing if not an early (and still very present) example of poisoning. Google can’t control the incoming data, which is everything that is on the web. Their only recourse is to tweak their search algorithms constantly and penalize abusers for their behavior. In the same vein, bots and troll armies have manipulated social media feeds to spread views ranging from opposition to vaccination to neo-Nazism.
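To make the mechanism concrete, here is a minimal sketch of poisoning against a toy classifier that retrains on whatever labeled examples it receives. Everything in it is an assumption made for illustration: the single "toxicity" feature, the nearest-centroid model, and the data are hypothetical and far simpler than any production system.

```python
from statistics import mean

def train(examples):
    # examples: list of (feature, label); the "model" is one centroid per label.
    by_label = {}
    for feature, label in examples:
        by_label.setdefault(label, []).append(feature)
    return {label: mean(values) for label, values in by_label.items()}

def predict(model, feature):
    # Return the label whose centroid is closest to the feature.
    return min(model, key=lambda label: abs(model[label] - feature))

# Hypothetical clean training data: a toxicity score and a human label.
clean = [(0.10, "ok"), (0.20, "ok"), (0.15, "ok"),
         (0.80, "abusive"), (0.90, "abusive"), (0.85, "abusive")]

# The attacker floods the feedback channel with toxic content labeled "ok".
poison = [(0.90, "ok")] * 6

before = predict(train(clean), 0.70)
after = predict(train(clean + poison), 0.70)
print(before, after)
```

With the clean data, a message scoring 0.70 is classified as abusive; after six poisoned examples drag the "ok" centroid upward, the same message is accepted. The attacker never touches the code, only the data.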

Evasion, or crafting input that causes a machine learning system to misclassify it. Again, we’ve seen this both in the wild and in the lab. CV Dazzle uses makeup and hair styles as “camouflage against face recognition technology.” Other research projects have shown that it’s possible to defeat image classification by changing a single pixel in an image: a ship becomes a car, a horse becomes a frog. Or, just as with humans, image classifiers can miss an unexpected object that’s out of context: an elephant in the room, for example. It’s a mistake to think that computer vision systems “understand” what they see in ways that are similar to humans. They’re not aware of context, they don’t have expectations about what’s normal; they’re simply doing high-stakes pattern matching. Researchers have reported similar vulnerabilities in natural language processing, where changing a word, or even a letter, in a way that wouldn’t confuse human researchers causes machine learning to misunderstand a phrase.

Although these examples are often amusing, it’s worth thinking about real-world consequences: could someone use these tricks to manipulate the behavior of autonomous vehicles? Here’s how that could work: I put a mark on a stop sign—perhaps by sticking a fragment of a green sticky note at the top. Does that make an autonomous vehicle think the stop sign is a flying tomato, and if so, would the car stop? The alteration doesn’t have to make the sign “look like” a tomato to a human observer; it just has to push the image closer to the boundary where the model says “tomato.” Machine learning has neither the context nor the common sense to understand that tomatoes don’t appear in mid-air. Could a delivery drone be subverted to become a weapon by causing it to misunderstand its surroundings? Almost certainly. Don’t dismiss these examples as academic. A stop sign with a few pixels changed in the lab may not be different from a stop sign that has been used for target practice during hunting season.
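The idea of pushing an input toward a decision boundary can be sketched with a toy linear classifier. This is an illustration in the spirit of gradient-sign attacks, not a reproduction of any of the research mentioned above; the weights, the four-feature "image," and the labels are all made up.

```python
def score(w, b, x):
    # Linear score: positive means "stop sign", otherwise "other".
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    return "stop sign" if score(w, b, x) > 0 else "other"

def sign(v):
    return (v > 0) - (v < 0)

def evade(w, x, epsilon):
    # For a linear model the gradient of the score with respect to x is w,
    # so stepping each feature by epsilon against sign(w) lowers the score
    # as much as possible while changing no feature by more than epsilon.
    return [xi - epsilon * sign(wi) for wi, xi in zip(w, x)]

# Hypothetical model parameters and input features.
w = [0.9, -0.4, 0.3, 0.7]
b = -0.5
x = [0.6, 0.2, 0.5, 0.4]

x_adv = evade(w, x, epsilon=0.2)
print(classify(w, b, x), "->", classify(w, b, x_adv))
```

No feature moves by more than 0.2, yet the label flips. The altered input doesn't need to resemble the new class; it only needs to land on the other side of the boundary.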

Impersonation attacks attempt to fool a model into misidentifying someone or something. The goal is frequently to gain unauthorized access to a system. For example, an attacker might want to trick a bank into misreading the amount written on a check. Fingerprints obtained from drinking glasses, or even high resolution photographs, can be used to fool fingerprint authentication. South Park trolled Alexa and Google Home users by using the words “Alexa” and “OK Google” repeatedly in an episode, triggering viewers’ devices; the devices weren’t able to distinguish between the show voices and real ones. The next generation of impersonation attacks will be “deep fake” videos that place words in the mouths of real people.

Inversion means using an API to gather information about a model, and using that information to attack it. Inversion can also mean using an API to obtain private information from a model, perhaps by retrieving data and de-anonymizing it. In “The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets,” the authors show that machine learning models can unintentionally memorize parts of their training data, including rare secrets, and that it’s possible to extract protected information from a model. Common approaches to protecting information don’t work; the model still incorporates secret information in ways that can be extracted. Differential privacy (the practice of adding carefully calibrated random noise, during training or when answering queries, so that the results reveal very little about any single record) has some promise, but with significant cost: the authors point out that training is much slower. Furthermore, the number of developers who understand and can implement differential privacy is small.

While this may sound like an academic concern, it’s not; writing a script to probe machine learning applications isn’t difficult. Furthermore, Michael Veale and others write that inversion attacks raise legal problems. Under the GDPR, if protected data is memorized by models, are those models subject to the same regulations as personal data? In that case, developers would have to remove personal data from models—not just the training data sets—on request; it would be very difficult to sell products that incorporated models, and even techniques like automated model generation could become problematic. Again, the authors point to differential privacy, but with the caution that few companies have the expertise to deploy models with differential privacy correctly.
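As a rough illustration of the idea behind differential privacy, the sketch below applies the Laplace mechanism to a simple counting query. This is a much simpler setting than the differentially private training discussed in the papers above; the records, the `epsilon` value, and the function names are assumptions chosen for the example.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at zero.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng):
    # Adding or removing one record changes a count by at most 1, so the
    # query's sensitivity is 1 and the noise scale is 1 / epsilon.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
ages = [23, 35, 41, 29, 52, 61, 38, 47]  # hypothetical private records

noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(noisy)
```

Each answer is deliberately wrong by a random amount, so a single response says little about whether any one person is in the data, while the average over many queries still centers on the true count. A smaller `epsilon` means more noise and stronger privacy, which is the kind of accuracy and cost trade-off the authors warn about.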

Other vulnerabilities, other attacks

This brief taxonomy of vulnerabilities doesn’t come close to listing all the problems that machine learning will face in the field. Many of these vulnerabilities are easily exploited. You can probe Amazon to find out what products are recommended along with your products, possibly finding out who your real competitors are, and discovering who to attack. You might even be able to reverse-engineer how Amazon makes recommendations and use that knowledge to influence the recommendations they make.

More complex attacks have been seen in the field. One involves placing fake reviews on an Amazon seller’s site, so that when the seller removes the reviews, Amazon bans the seller for review manipulation. Is this an attack against machine learning? The attacker tricks the human victim into violating Amazon’s rules. Ultimately, though, it’s the machine learning system that’s tricked into taking an incorrect action (banning the victim) that it could have prevented.

“Google bowling” means creating large numbers of links to a competitor’s website in hopes that Google’s ranking algorithm will penalize the competitor for purchasing bulk links. It’s similar to the fake review attack, except that it doesn’t require a human intermediary; it’s a direct attack against the algorithm that analyzes inbound links.

Advertising was one of the earliest adopters of machine learning, and one of the earliest victims. Click fraud is out of control, and the machine learning community is reluctant to talk about (or is unaware of) the issue—even though, as online advertising becomes ever more dependent on machine learning, fraudsters will learn how to attack models directly in their attempts to appear legitimate. If click data is unreliable, then models built from that data are unreliable, along with any results or recommendations generated by those models. And click fraud is similar to many attacks against recommendation systems and trend analysis. Once a “fake news” item has been planted, it’s simple to make it trend with some automated clicks. At that point, the recommendation takes over, generating recommendations which in turn generate further clicks. Anything automated is prone to attack, and automation allows those attacks to take place at scale.

The advent of autonomous vehicles, ranging from cars to drones, presents yet another set of threats. If the machine learning systems on an autonomous vehicle are vulnerable to attack, a car or truck could conceivably be used as a murder weapon. So could a drone—either a weaponized military drone or a consumer drone. The military already knows that drones are vulnerable; in 2011, Iran captured a U.S. drone, possibly by spoofing GPS signals. We expect to see attacks on “smart” consumer health devices and professional medical devices, many of which we know are already vulnerable.

Taking action

Merely scolding and thinking about possible attacks won’t help. What can be done to defend machine learning models? First, we can start with traditional software. The biggest problem with insecure software isn’t that we don’t understand security; it’s that software vendors, and software users, never take the basic steps they would need to defend themselves. It’s easy to feel defenseless before hyper-intelligent hackers, but the reality is that sites like Equifax become victims because they didn’t take basic precautions, such as installing software updates. So, what do machine learning developers need to do?

Security audits are a good starting point. What are the assets that you need to protect? Where are they, and how vulnerable are they? Who has access to those resources, and who actually needs that access? How can you minimize access to critical data? For example, a shipping system needs customer addresses, but it doesn’t need credit card information; a payment system needs credit card information, but not complete purchase histories. Can this data be stored and managed in separate, isolated databases? Beyond that, are basic safeguards in place, such as two-factor authentication? It’s easy to fault Equifax for not updating their software, but almost any software system depends on hundreds, if not thousands, of external libraries. What strategy do you have in place to ensure they’re updated, and that updates don’t break working systems?

Like conventional software, machine learning systems should use monitoring systems that generate alerts to notify staff when something abnormal or suspicious occurs. Some of these monitoring systems are already using machine learning for anomaly detection—which means the monitoring software itself can be attacked.
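A monitoring check of this kind can be very simple. The sketch below, which assumes a hypothetical fraud model whose daily flag rate is logged, raises an alert when the recent average drifts too far from a baseline. The threshold and the numbers are illustrative only.

```python
from statistics import mean, stdev

def drift_alert(baseline, recent, threshold=3.0):
    # Compare the recent average with the baseline average, measured in
    # standard errors; alert when the gap exceeds the threshold.
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return mean(recent) != mu
    standard_error = sigma / len(recent) ** 0.5
    z = abs(mean(recent) - mu) / standard_error
    return z > threshold

# Hypothetical daily fraction of transactions flagged by a fraud model.
baseline = [0.021, 0.019, 0.020, 0.022, 0.018, 0.020, 0.021, 0.019]
normal_week = [0.020, 0.021, 0.019, 0.020]
suspicious_week = [0.004, 0.003, 0.005, 0.004]

print(drift_alert(baseline, normal_week))
print(drift_alert(baseline, suspicious_week))
```

A sudden drop in flagged transactions might mean fraud has stopped, or it might mean someone has learned how to slip past the model. The alert can't tell the difference; it only tells a human where to look.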

Penetration testing is a common practice in the online world: your security staff (or, better, consultants) attack your site to discover its vulnerabilities. Attack simulation is an extension of penetration testing that shows you “how attackers actually achieve goals against your organization.” What are they looking for? How do they get to it? Can you gain control over a system by poisoning its inputs?

Tools for testing computer vision systems by generating “adversarial images” are already appearing, such as cleverhans and IBM’s ART. We are starting to see papers describing adversarial attacks against speech recognition systems. Adversarial input is a special case of a more general problem. Most machine learning developers assume their training data is similar to the data their systems will face in the real world. That’s an idealized best case. It’s easy to build a face identification system if all your faces are well-lit, well-focused, and have light-skinned subjects. A working system needs to handle all kinds of images, including images that are blurry, badly focused, poorly lighted—and have dark-skinned subjects.

Safety verification is a new area for AI research, still in its infancy. Safety verification asks questions like whether models can deliver consistent results, or whether small changes in the input lead to large changes in the output. If machine learning is at all like conventional software, we expect an escalating struggle between attackers and defenders; better defenses will lead to more sophisticated attacks, which will lead to a new generation of defenses. It will never be possible to say that a model is “verifiably safe.” But it is important to know that a model has been tested, and that it is reasonably well-behaved against all known attacks.
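One of those questions, whether small input changes cause large output changes, can be probed empirically. The following sketch uses random perturbations around a single input; it is a crude test under assumed toy models, not a formal verification method, and a low result does not prove that a model is safe.

```python
import random

def max_output_change(model, x, radius, trials, rng):
    # Nudge every feature by at most `radius` and record the largest
    # change in the model's output seen across all trials.
    base = model(x)
    worst = 0.0
    for _ in range(trials):
        nudged = [xi + rng.uniform(-radius, radius) for xi in x]
        worst = max(worst, abs(model(nudged) - base))
    return worst

def smooth_model(x):
    # Hypothetical well-behaved model: output changes gradually.
    return 0.3 * x[0] + 0.2 * x[1]

def brittle_model(x):
    # Hypothetical brittle model: output jumps when x[0] crosses 0.5.
    return 1.0 if x[0] > 0.5 else 0.0

rng = random.Random(1)
x = [0.499, 0.8]
smooth_change = max_output_change(smooth_model, x, 0.01, 200, rng)
brittle_change = max_output_change(brittle_model, x, 0.01, 200, rng)
print(smooth_change, brittle_change)
```

The smooth model's output barely moves, while the brittle model swings across its full range for the same tiny perturbations. Random probing finds only the weaknesses it happens to hit; a motivated attacker searches far more efficiently.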

Model explainability has become an important area of research in machine learning. Understanding why a model makes specific decisions is important for several reasons, not the least of which is that it makes people more comfortable with using machine learning. That “comfort” can be deceptive, of course. But being able to ask models why they made particular decisions will conceivably make it easier to see when they’ve been compromised. During development, explainability will make it possible to test how easy it is for an adversary to manipulate a model, in applications from image classification to credit scoring. In addition to knowing what a model does, explainability will tell us why, and help us build models that are more robust, less subject to manipulation; understanding why a model makes decisions should help us understand its limitations and weaknesses. At the same time, it’s conceivable that explainability will make it easier to discover weaknesses and attack vectors. If you want to poison the data flowing into a model, it can only help to know how the model responds to data.

In “Deep Automation in Machine Learning,” we talked about the importance of data lineage and provenance, and tools for tracking them. Lineage and provenance are important whether or not you’re developing the model yourself. While there are many cloud platforms to automate model building and even deployment, ultimately your organization is responsible for the model’s behavior. The downside of that responsibility includes everything from degraded profits to legal liability. If you don’t know where your data is coming from and how it has been modified, you have no basis for knowing whether your data has been corrupted, either through accident or malice.

“Datasheets for Datasets” proposes a standard set of questions about a data set’s sources, how the data was collected, its biases, and other basic information. Given a specification that records a data set’s properties, it should be easy to test and detect sudden and unexpected changes. If an attacker corrupts your data, you should be able to detect that and correct it up front; if not up front, then later in an audit.

Datasheets are a good start, but they are only a beginning. Whatever tools we have for tracking data lineage and provenance need to be automated. There will be too many models and data sets to rely on manual tracking and audits.
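An automated check against a recorded specification might look something like the sketch below. It is not the format proposed in “Datasheets for Datasets”; the profile fields, the 10% tolerance, and the sample rows are assumptions made for illustration.

```python
from statistics import mean

def profile(rows):
    # Record a few basic properties of a data set of numeric columns.
    columns = sorted(rows[0])
    return {
        "row_count": len(rows),
        "columns": columns,
        "means": {c: mean(r[c] for r in rows) for c in columns},
    }

def check(expected, rows, tolerance=0.10):
    # Compare a new batch against the recorded profile and list the problems.
    actual = profile(rows)
    if actual["columns"] != expected["columns"]:
        return ["column set changed"]
    problems = []
    if abs(actual["row_count"] - expected["row_count"]) > tolerance * expected["row_count"]:
        problems.append("row count changed")
    for c in expected["columns"]:
        want, got = expected["means"][c], actual["means"][c]
        if abs(got - want) > tolerance * abs(want):
            problems.append(f"mean of {c} shifted from {want:.2f} to {got:.2f}")
    return problems

# Hypothetical data: the recorded baseline and a tampered delivery.
clean = [{"age": 30, "income": 50}, {"age": 40, "income": 60}, {"age": 50, "income": 70}]
tampered = [{"age": 30, "income": 50}, {"age": 40, "income": 60}, {"age": 50, "income": 700}]

spec = profile(clean)
print(check(spec, clean))
print(check(spec, tampered))
```

A check this coarse will miss subtle, targeted corruption, but it runs on every batch without human effort, which is the point: with thousands of models and data sets, anything that depends on a person remembering to look will not happen.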

Balancing openness against tipping off adversaries

In certain domains, users and regulators will increasingly prefer machine learning services and products that can provide simple explanations for how automated decisions and recommendations are being made. But we’ve already seen that too much information can lead to certain parties gaming models (as in SEO). How much to disclose depends on the specific application, domain, and jurisdiction.

This balancing act is starting to come up in machine learning and related areas, where researchers (who tend to work in the open) are up against adversaries who prize unpublished vulnerabilities. Whether or not to “temporarily hold back” research results is a question that the digital media forensics community has been discussing. In a 2018 essay, Hany Farid noted: “Without necessarily advocating this as a solution for everyone, when students are not involved on a specific project, I have held back publication of new techniques for a year or so. This approach allows me to always have a few analyses that our adversaries are not aware of.”

Privacy and security are converging

Developers will also need to understand and use techniques for privacy-preserving machine learning, such as differential privacy, homomorphic encryption, secure multi-party computation, and federated learning. Differential privacy is one of the few techniques that protects user data from “inverting” a model and extracting private data from it. Homomorphic encryption allows systems to do computations directly on encrypted data, without the need for decryption. And federated learning allows individual nodes to compute parts of a model, and then send their portion back to be combined to build a complete model; individual users’ data doesn’t have to be transferred. Federated learning is already being used by Google to improve suggested completions for Android users. However, some of these techniques are slow (in some cases, extremely slow), and require specialized expertise that most companies don’t have. And you often will need a combination of these techniques to achieve privacy. It’s conceivable that future tools for automated model building will incorporate these techniques, minimizing the need for local expertise.
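The federated learning idea can be sketched in a few lines. The example below is a toy version of federated averaging with a two-parameter linear model; the clients, their data, the learning rate, and the number of rounds are all assumptions, and real deployments add secure aggregation, client sampling, and much more.

```python
def local_update(weights, data, lr=0.1, steps=50):
    # Each client refines the shared model y = w * x + b on its own data
    # with plain gradient descent on mean squared error.
    w, b = weights
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(updates, sizes):
    # The server combines client models, weighted by how much data each has.
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

# Two hypothetical clients whose private data follow y = 2x + 1.
client_data = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(0.5, 2.0), (1.5, 4.0)],
]

weights = (0.0, 0.0)
for _ in range(20):
    updates = [local_update(weights, data) for data in client_data]
    weights = federated_average(updates, [len(d) for d in client_data])

print(weights)
```

Only model parameters travel between the clients and the server; the raw `(x, y)` pairs never leave the client. That alone is not a privacy guarantee, since parameters can leak information about the data behind them, which is one reason these techniques are often combined.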

Live data

Machine learning applications increasingly interact with live data, complicating the task of building safe, reliable, and secure systems. An application as simple as a preference engine has to update itself constantly as its users make new choices. Some companies are introducing personalization and recommendation models that incorporate real-time user behavior. Disinformation campaigns occur in real time, so detecting disinformation requires knowledge bases that can be updated dynamically, along with detection and mitigation models that can also be updated in real time. Bad actors who create and propagate disinformation are constantly getting more sophisticated, making it harder to detect, particularly with text-based content. And recent developments in automatic text generation mean that the creation of “fake news” can be automated. Machine learning can detect potential misinformation, but at present, humans are needed to verify and reject misinformation. Machine learning can aid and support human action, but humans must remain in the loop.

Applications of reinforcement learning frequently interact with live data, and researchers are well aware of the need to build reinforcement learning applications that are safe and robust. For applications like autonomous driving, failures are catastrophic; but at the same time, the scarcity of failure makes it harder to train systems effectively.

Organization and culture

In traditional software development, we are finally learning that security experts have to be part of development teams from the beginning. Security needs to be part of the organization’s culture. The same is true for machine learning: from the beginning, it’s important to incorporate security experts and domain experts who understand how a system is likely to be abused. As Facebook’s former chief security officer Alex Stamos has said, “the actual responsibility [for security] has to be there when you’re making the big design decisions.” Every stage of a machine learning project must think about security: the initial design, building the data pipelines, collecting the data, creating the models, and deploying the system. Unfortunately, as Stamos notes, few teams are actually formed this way.


Whatever they might believe, most organizations are in the very early stages of adopting machine learning. The companies with capabilities equivalent to Google, Facebook, Amazon, or Microsoft are few and far between; at this point, most are still doing some early experiments and proofs of concept. Thought and effort haven’t gone into security. And maybe that’s fair; does a demo need to be secure?

Perhaps not, but it’s worth thinking carefully about history. Security is a problem in part because the inventors of modern computer networking didn’t think it was necessary. They were building the ARPAnet: an academic research net that would never go beyond a few hundred sites. Nobody anticipated the public internet. And yet, even on the proto-internet, we had the Morris worm in the ’80s, and email spam in the ’70s. One of the things we do with any technology is abuse it. By ignoring the reality of abuse, we entered a never-ending race; it’s impossible to win, impossible to quit, and easy to lose.

But even if we can give the internet’s early naivete a pass, there’s no question that we live in a world where security is a paramount concern. There is no question that applications of machine learning will touch (indeed, invade) people’s lives, frequently without their knowledge or consent. It is time to put a high priority on security for machine learning.

We believe that attacks against machine learning systems will become more frequent and sophisticated. That’s the nature of the security game: an attack is countered by a defense, which is countered in turn by a more sophisticated attack, in a game of endlessly increasing complexity. We’ve listed a few kinds of attacks, but keep in mind we’re in the early days. Our examples aren’t exhaustive, and there are certainly many vulnerabilities that nobody has yet thought of. These vulnerabilities will inevitably be discovered; cybercrime is a substantial international business, and the bad actors even include government organizations.

Meanwhile, the stakes are getting higher. We’ve only begun to pay the penalty for highly vulnerable networked devices—the Internet of Things (IoT)—and while the security community is aware of the problems, there are few signs that manufacturers are addressing the issues. IoT devices are only becoming more powerful, and 5G networking promises to extend high-bandwidth, low-latency connectivity to the edges of the network. We are already using machine learning in our phones; will machine learning extend to near-microscopic chips embedded in our walls? There are already voice activity detectors that can run on a microwatt; as someone on Twitter suggests, a future generation could possibly run on energy generated from sound waves. And there are already microphones where we least suspect them. Deploying insecure “smart devices” on this scale isn’t a disaster waiting to happen; it’s a disaster that’s already happening.

We have derived a lot of value from machine learning, and we will continue to derive value from pushing it to the limits; but if security issues aren’t addressed, we will have to pay the consequences. The software industry has demonstrated, all too clearly, what happens when you don’t pay attention to security. As machine learning penetrates our lives, those stakes will inevitably become higher.
