What machine learning means for software development
“Human in the loop” software development will be a big part of the future.
Machine learning is poised to change the nature of software development in fundamental ways, perhaps for the first time since the invention of FORTRAN and LISP. It presents the first real challenge to our decades-old paradigms for programming. What will these changes mean for the millions of people who are now practicing software development? Will we see job losses and layoffs, or will we see programming evolve into something different, perhaps even something more focused on satisfying users?
We’ve built software more or less the same way since the 1970s. We’ve had high-level languages, low-level languages, scripting languages, and tools for building and testing software, but what those tools let us do hasn’t changed much. Our languages and tools are much better than they were 50 years ago, but they’re essentially the same. We still have editors. They’re fancier: they have color highlighting, name completion, and they can sometimes help with tasks like refactoring, but they’re still the descendants of emacs and vi. Object orientation represents a different programming style, rather than anything fundamentally new—and, of course, functional programming goes all the way back to the 50s (except we didn’t know it was called that). Can we do better?
We will focus on machine learning rather than artificial intelligence. Machine learning has been called “the part of AI that works,” but more important, the label “machine learning” steers clear of notions like general intelligence. We’re not discussing systems that can find a problem to be solved, design a solution, and implement that solution on their own. Such systems don’t exist, and may never exist. Humans are needed for that. Machine learning may be little more than pattern recognition, but we’ve already seen that pattern recognition can accomplish a lot. Indeed, hand-coded pattern recognition is at the heart of our current toolset: that’s really all a modern optimizing compiler is doing.
We also need to set expectations. McKinsey estimates that “fewer than 5% of occupations can be entirely automated using current technology. However, about 60% of occupations could have 30% or more of their constituent activities automated.” Software development and data science aren’t going to be among the occupations that are completely automated. But good software developers have always sought to automate tedious, repetitive tasks; that’s what computers are for. It should be no surprise that software development itself will increasingly be automated.
This isn’t a radical new vision. It isn’t as if we haven’t been working on automated tools for the past half-century. Compilers automated the process of writing machine code. Scripting languages automate many mundane tasks by gluing together larger, more complex programs. Software testing tools, automated deployment tools, containers, and container orchestration systems are all tools for automating the process of developing, deploying, and managing software systems. None of these take advantage of machine learning, but that is certainly the next step.
Will machine learning eat software, as Pete Warden and Andrej Karpathy have argued? After all, “software eating the world” has been a process of ever-increasing abstraction and generalization. A laptop, phone, or smart watch can replace radios, televisions, newspapers, pinball machines, locks and keys, light switches, and many more items. All these technologies are possible because we came to see computers as general-purpose machines, not just number crunchers.
From this standpoint, it’s easy to imagine machine learning as the next level of abstraction, the most general problem solver that we’ve found yet. Certainly, neural networks have proven they can perform many specific tasks: almost any task for which it’s possible to build a set of training data. Karpathy is optimistic when he says that, for many tasks, it’s easier to collect the data than to explicitly write the program. He’s no doubt correct about some very interesting, and very difficult, programs: it’s easy to collect training data for Go or Chess (players of every level have been recording games for well over 150 years), but very hard to write an explicit program to play those games successfully. So, machine learning is an option when you don’t know how to write the software, but you can collect the data. On the other hand, data collection isn’t always easy. We couldn’t even conceive of programs that automatically tagged pictures until sites like Flickr, Facebook, and Google assembled billions of images, many of which had already been tagged by humans. For tasks like face recognition, we don’t know how to write the software, and it has been difficult to collect the data. For other tasks, like billing, it’s easy to write a program based on a few simple business rules. It’s hard to imagine collecting the data you’d need to train a machine learning algorithm—but if you are able to collect data, the program you produce will be better at adapting to different situations and detecting anomalies, particularly if there’s a human in the loop.
Machine learning is already making code more efficient: Google’s Jeff Dean has reported that 500 lines of TensorFlow code have replaced 500,000 lines of code in Google Translate. Although lines of code is a questionable metric, a thousand-fold reduction is huge, both in programming effort and in the volume of code that has to be maintained. But what’s more significant is how this code works: rather than half a million lines of statistical code, it’s a neural network that has been trained to translate. As language changes and usage shifts, as biases and prejudices are discovered, the neural network can be revisited and retrained on new data. It doesn’t need to be rewritten. We shouldn’t understate the difficulty of training a neural network of any complexity, but neither should we underestimate the problem of managing and debugging a gigantic codebase.
We’ve seen research suggesting that neural networks can create new programs by combining existing modules. The system is trained using execution traces from other programs. While the programs constructed this way are simple, it’s significant that a single neural network can learn to perform several different tasks, each of which would normally require a separate program.
Pete Warden characterizes the future of programming as becoming a teacher: “the developer has to become a teacher, a curator of training data, and an analyst of results.” We find this characterization very suggestive. Software development doesn’t disappear; developers have to think of themselves in much different terms. How do you build a system that solves a general problem, then teach that system to solve a specific task? On one hand, this sounds like a risky, troublesome prospect, like pushing a rope. But on the other hand, it presumes that our systems will become more flexible, pliable, and adaptable. Warden envisions a future that is more about outcomes than about writing lines of code: training a generic system, and testing whether it meets your requirements, including issues like fairness.
Thinking more systematically, Peter Norvig has argued that machine learning can be used to generate short programs (but not long ones) from training data; to optimize small parts of larger programs, but not the entire program; and possibly, with the help of humans, to be a better tutor to beginning programmers.
There are early indications that machine learning can outperform traditional database indexes: it can learn to predict where data is stored, or if that data exists. Machine learning appears to be significantly faster and require much less memory, but it is fairly limited: current tools based on machine learning do not cover multidimensional indexes, and assume that the database isn’t updated frequently. Retraining takes longer than rebuilding traditional database indexes. However, researchers are working on multidimensional learned indexes, query optimization, re-training performance, and other issues.
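To make the idea of a learned index concrete, here is a minimal sketch in Python. It is not the design used by the researchers mentioned above; it is a toy that assumes numeric keys, a non-empty and rarely updated key set, and a single straight-line model. The class name LearnedIndex and its methods are hypothetical names chosen for illustration. The model predicts where a key should sit in a sorted list, and the worst prediction error seen during training bounds the small range that still has to be searched.

```python
import bisect


class LearnedIndex:
    """Toy learned index: a fitted line predicts where a key sits in a sorted list."""

    def __init__(self, keys):
        self.keys = sorted(keys)  # assumption: at least one numeric key
        n = len(self.keys)
        mean_key = sum(self.keys) / n
        mean_pos = (n - 1) / 2
        # Ordinary least-squares fit of position against key value.
        var = sum((k - mean_key) ** 2 for k in self.keys)
        cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_pos - self.slope * mean_key
        # Record the worst prediction error so lookups can bound their search.
        self.max_err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))

    def _predict(self, key):
        guess = int(round(self.slope * key + self.intercept))
        return min(max(guess, 0), len(self.keys) - 1)  # clamp to a valid position

    def lookup(self, key):
        """Return the position of key in the sorted list, or None if it is absent."""
        guess = self._predict(key)
        lo = max(0, guess - self.max_err)
        hi = min(len(self.keys), guess + self.max_err + 1)
        # Binary search only inside the window the model says is possible.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None
```

Even this toy shows the trade-off described above: the whole index is two numbers plus an error bound, but inserting a new key means refitting the model and recomputing the bound, which is why frequent updates are a problem.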
Machine learning is already making its way into other areas of data infrastructure. Data engineers are using machine learning to manage Hadoop, where it enables quicker response to problems such as running out of memory in a Hadoop cluster. Kafka engineers also report using machine learning to diagnose problems. And researchers have had success using machine learning to tune databases for performance, where it simplifies the problem of managing the many configuration settings that affect behavior. Data engineers and database administrators won’t become obsolete, but they may have to develop machine learning skills. And in turn, machine learning will help them to make difficult problems more manageable. Managing data infrastructure will be less about setting hundreds of different configuration parameters correctly than about training the system to perform well on your workload.
Making difficult problems manageable remains one of the most important issues for data science. Data engineers are responsible for maintaining the data pipeline: ingesting data, cleaning data, feature engineering, and model discovery. They are responsible for deploying software in very complex environments. Once all this infrastructure has been deployed, it needs to be monitored constantly to detect (or prevent) outages, and also to ensure that the model is still performing adequately. These are all tasks for which machine learning is well-suited, and we’re increasingly seeing software like MLflow used to manage data pipelines.
Among the early manifestations of automated programming were tools designed to enable data analysts to perform more advanced analytic tasks. The Automatic Statistician is a more recent tool that automates exploratory data analysis and provides statistical models for time series data, accompanied by detailed explanations.
With the rise of deep learning, data scientists find themselves needing to search for the right neural network architectures and parameters. It’s also possible to automate the process of learning itself. After all, neural networks are nothing if not tools for automated learning: while building a neural network still requires a lot of human work, it would be impossible to hand-tune all the parameters that go into a model. One application is using machine learning to explore possible neural network architectures; as this post points out, a 10-layer network can easily have 10^10 possibilities. Other researchers have used reinforcement learning to make it easier to develop neural network architectures.
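One way to arrive at a search space of that size is to assume that each of 10 layers is chosen independently from 10 candidate layer types, which gives 10 to the 10th power combinations. That breakdown is our assumption for illustration, not necessarily the one the cited post uses. The sketch below counts the space and shows the simplest possible automated search, random sampling. The names LAYER_OPTIONS, sample_architecture, and random_search are hypothetical, and the score function stands in for the expensive step of training and evaluating each candidate network.

```python
import random

NUM_LAYERS = 10
LAYER_OPTIONS = list(range(10))  # hypothetical: 10 candidate layer types, labeled 0-9


def search_space_size(options_per_layer, num_layers):
    """Number of distinct architectures when each layer is chosen independently."""
    return options_per_layer ** num_layers


def sample_architecture(rng):
    """Pick one layer type per layer, uniformly at random."""
    return tuple(rng.choice(LAYER_OPTIONS) for _ in range(NUM_LAYERS))


def random_search(score, trials, seed=0):
    """Return the best-scoring architecture among `trials` random samples.

    `score` is a placeholder for training and evaluating a real network.
    """
    rng = random.Random(seed)
    return max((sample_architecture(rng) for _ in range(trials)), key=score)
```

With ten billion candidates, exhaustive evaluation is out of the question, which is why smarter strategies than random sampling are an active area of work.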
Taking this further: companies like DataRobot automate the entire process, including using multiple models and comparing results. This process is being called “automated machine learning”; Amazon’s SageMaker and Google’s AutoML provide cloud-based tools to automate the creation of machine learning models.
Model creation isn’t a one-time thing: data models need to be tested and re-tuned constantly. We are beginning to see tools for constant monitoring and tuning. These tools aren’t particularly new: bandit algorithms for A/B testing have been around for some time, and for many companies, bandit algorithms will be the first step toward reinforcement learning. Chatbase is a Google startup that monitors chat applications so developers can understand their performance. Do the applications understand the questions that users are asking? Are they able to resolve problems, or are users frequently asking for unsupported features? These are problems that could be solved by going through chat logs manually and flagging problems, but that’s difficult even with a single bot, and Chatbase envisions a future where many organizations have dozens or even hundreds of sophisticated bots for customer service, help desk support, and many other applications.
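For readers who haven’t met bandit algorithms, here is a minimal sketch of one common variant, epsilon-greedy, applied to an A/B-style test. It is an illustration under stated assumptions, not a description of any particular product: the class EpsilonGreedyBandit, the simulate helper, and the conversion rates passed to it are all made up for the example. Most of the time the algorithm shows the variant with the best observed conversion rate; a small fraction of the time (epsilon) it tries a random variant, so it keeps learning while it earns.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit: mostly exploit the best-looking arm, sometimes explore."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # times each arm (variant) was shown
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # explore
        # Exploit: arm with the highest observed mean (ties go to the lowest index).
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental update of the running mean for this arm.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_rates, rounds=5000, seed=0):
    """Run the bandit against simulated users with made-up conversion rates."""
    bandit = EpsilonGreedyBandit(len(true_rates), epsilon=0.1, seed=seed)
    env = random.Random(seed + 1)
    for _ in range(rounds):
        arm = bandit.select_arm()
        reward = 1.0 if env.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit
```

Unlike a fixed-split A/B test, the bandit shifts traffic toward the better variant while the experiment is still running. That is the sense in which it is a first step toward reinforcement learning.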
It is also possible to use machine learning to look for vulnerabilities in software. There are systems that will go over the code and look for known flaws. These systems don’t necessarily fix the code, nor do they promise to find every potential problem. But they can easily highlight dangerous code, and they can allow developers working on a large codebase to ask questions like “are there other problems like this?”
Game developers are looking to machine learning in several ways. Can machine learning be used to make backgrounds and scenes that look more realistic? Drawing and modeling realistic scenes and images is very expensive and time consuming. Currently, everything a non-player character (NPC) does has to be programmed explicitly. Can machine learning be used to model the behavior of NPCs? If NPCs can learn behavior, we can expect game play that is more creative.
What does the future look like for software developers? Will software development take the same path that McKinsey forecasts for other industries? Will 30% of the activities involved in software development and data science be automated?
Perhaps, though that’s a simplistic reading of the situation. Machine learning will no doubt change software development in significant ways. And it wouldn’t be surprising if a large part of what we now consider “programming” is automated. That’s nothing new, though: compilers don’t do machine learning, but they transformed the software industry by automating the generation of machine code.
The important question is how software development and data science will change. One possibility—a certainty, really—is that software developers will put much more effort into data collection and preparation. Machine learning is nothing without training data. Developers will have to do more than just collect data; they’ll have to build data pipelines and the infrastructure to manage those pipelines. We’ve called this “data engineering.” In many cases, those pipelines themselves will use machine learning to monitor and optimize themselves.
We may see training machine learning algorithms become a distinct subspecialty; we may soon be talking about “training engineers” the way we currently talk about “data engineers.” In describing his book Machine Learning Yearning, Andrew Ng says, “This book is focused not on teaching you ML algorithms, but on how to make ML algorithms work.” There’s no coding, and no sophisticated math. The book focuses almost entirely on the training process, which, more than coding, is the essence of making machine learning work.
The ideas we’ve presented have all involved augmenting human capabilities: they enable humans to produce working products that are faster, more reliable, better. Developers will be able to spend more time on interesting, important problems rather than getting the basics right. What are those problems likely to be?
Discussing intelligence augmentation in “How to Become a Centaur,” Nicky Case argues that computers are good at finding the best answer to a question. They are fundamentally computational tools. But they’re not very good at finding interesting questions to answer. That’s what humans do. So, what are some of the important questions we’ll need to ask?
We’re only starting to understand the importance of ethics in computing. Basic issues like fairness aren’t simple and need to be addressed. We’re only starting to think about better user interfaces, including conversational interfaces: how will they work? Even with the help of AI, our security problems are not going to go away. Regardless of the security issues, all of our devices are about to become “smart.” What does that mean? What do we want them to do? Humans won’t be writing as much low-level code. But because they won’t be writing that code, they’ll be free to think more about what that code should do, and how it should interact with people. There will be no shortage of problems to solve.
It’s very difficult to imagine a future in which humans no longer need to create software. But it’s very easy to imagine that “human in the loop” software development will be a big part of the future.