Evolutionary computation: Stepping stones and unexpected solutions
An interview with Risto Miikkulainen.
Download our free report, “Future of Machine Intelligence: Perspectives from Leading Practitioners.” The following interview is one of many included in the report.
As part of our ongoing series of interviews surveying practitioners at the frontiers of machine intelligence, I recently interviewed Risto Miikkulainen, professor of computer science and neuroscience at the University of Texas at Austin, and a fellow at Sentient Technologies, Inc. Miikkulainen’s work focuses on biologically inspired computation, such as neural networks and genetic algorithms.
Key takeaways:
- Evolutionary computation can be viewed as a way of solving reinforcement learning problems: rather than learning from labeled examples, it evolves solutions that optimize a fitness function.
- Its applications include robotics, software agents, design, and Web commerce.
- It enables the development of truly novel solutions.
David Beyer: Why don’t we start with your background and how you got to your current role?
Risto Miikkulainen: I completed my Ph.D. in 1990 at the UCLA computer science department. Following that, I became a professor in the computer science department at the University of Texas at Austin. My dissertation and early work focused on building neural network models of cognitive science — language processing and memory, in particular. That work has continued throughout my career. I recently dusted off those models to work toward understanding cognitive dysfunction, such as schizophrenia and aphasia in bilinguals.
Neural networks, as they relate to cognitive science and engineering, have been a main focus throughout my career. In addition to cognitive science, I spent a lot of time working in computational neuroscience.
More recently, my team and I have been focused on neuroevolution; that is, optimizing neural networks using evolutionary computation. We have discovered that neuroevolution research involves a lot of the same challenges as cognitive science — for example, memory, learning, communication, and so on. Indeed, these fields are really starting to come together.
DB: Can you give some background on how evolutionary computation works, and how it intersects with deep learning?
RM: Deep learning is a supervised learning method on neural networks. Most of the work involves supervised applications where you already know what you want — e.g., weather prediction, stock market prediction, the consequences of a certain action when driving a car. You are, in these cases, learning a nonlinear statistical model of that data, which you can then reuse in future situations. The flip side of that approach is unsupervised learning, where you learn the structure of the data: what kinds of clusters there are, what things are similar to other things. These efforts can provide a useful internal representation for a neural network.
A third approach is called “reinforcement learning.” Suppose you are driving a car or playing a game: it’s harder to define the optimal actions, and you don’t receive much feedback. In other words, you can play the whole game of chess, and by the end, you’ve either won or lost. You know that if you lost, you probably made some poor choices. But which? Or, if you won, which were the well-chosen actions? This is, in a nutshell, a reinforcement learning problem.
Put another way, in this paradigm, you receive feedback only periodically. Furthermore, this feedback tells you how well you did, but not which of your steps or actions were responsible. Instead, you have to discover those actions through exploration — testing diverse approaches and measuring their performance.
Enter evolutionary computation, which can be posed as a way of solving reinforcement learning problems. That is, there exists some fitness function, and you focus on evolving a solution that optimizes that function.
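To make that framing concrete, here is a minimal sketch in Python (my own toy example, not something from the interview): the total reward of an episode serves as the fitness of a candidate policy, and a simple (1+1) evolution strategy mutates the policy’s parameters and keeps whichever variant scores higher. The one-dimensional reaching task, and every name in the code, is invented purely for illustration.

```python
import random

def episode_return(params):
    """Run one episode of a toy 1-D reaching task; return total reward."""
    position, total = 0.0, 0.0
    for _ in range(20):
        action = params[0] * position + params[1]   # linear policy
        position += max(-1.0, min(1.0, action))     # take a clipped step
        total -= abs(10.0 - position)               # reward closeness to goal
    return total

def evolve(generations=500, sigma=0.1):
    """(1+1) evolution strategy: mutate, keep the better of parent/child."""
    parent = [random.uniform(-1.0, 1.0) for _ in range(2)]
    parent_fit = episode_return(parent)
    for _ in range(generations):
        child = [w + random.gauss(0.0, sigma) for w in parent]  # variation
        child_fit = episode_return(child)
        if child_fit >= parent_fit:                             # selection
            parent, parent_fit = child, child_fit
    return parent, parent_fit

best, fit = evolve()
print("evolved policy:", best, "fitness:", fit)
```

Note that nothing here requires a gradient or a per-step error signal; the episode’s total reward is the only feedback, which is exactly the reinforcement learning setting described above.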
In many cases, however, in the real world, you do not have a full state description — a full accounting of the facts on the ground at any given moment. You don’t, in other words, know the full context of your surroundings. To illustrate this problem, suppose you are in a maze. Many corridors look the same to you. If you are trying to learn to associate a value for each action/state pair, and you don’t know what state you are in, you cannot learn. This is the main challenge for reinforcement learning approaches that learn such utility values for each action in each respective state.
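A small hypothetical example makes the aliasing problem concrete. Suppose two distinct maze states produce the same observation; a value table keyed on the observation then receives conflicting updates and settles on a compromise that is right in neither state. The maze, values, and names below are all made up for illustration.

```python
# Two distinct maze states that look identical to the agent (made up).
observation = {"corridor_A": "corridor", "corridor_B": "corridor"}

# Suppose turning left is good in one corridor and bad in the other.
true_value = {"corridor_A": +1.0, "corridor_B": -1.0}

q = {}          # value table keyed by (observation, action)
alpha = 0.1     # learning rate
for _ in range(1000):
    for state in ("corridor_A", "corridor_B"):
        key = (observation[state], "left")    # aliased: same key for both states
        old = q.get(key, 0.0)
        q[key] = old + alpha * (true_value[state] - old)

print(q)  # hovers near 0.0: correct for neither corridor
```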
Evolutionary computation, on the other hand, can be very effective in addressing these problems. In this approach, we use evolution to construct a neural network, which then ingests the state representation, however noisy or incomplete, and suggests an action that is most likely to be beneficial, correct, or effective. It doesn’t need to learn values for each action in each state. It always has a complete policy of what to do — evolution simply refines that policy. For instance, it might at first always turn left at corners and avoid walls, then gradually evolve toward other actions as well. Furthermore, the network can be recurrent, and consequently remember how it “got” to that corridor, which disambiguates the state from other states that look the same. Neuroevolution can perform better on problems where part of the state is hidden, as is the case in many real-world problems.
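As a rough sketch of that idea (again a toy of my own, not Miikkulainen’s actual system), one can evolve the weights of a one-neuron recurrent network so that its hidden state remembers a cue visible only at the first time step, which is precisely the kind of memory a per-state value table cannot express:

```python
import math
import random

def run(weights, cue, steps=5):
    """One-neuron recurrent net; the cue is visible only at t = 0."""
    w_in, w_rec, w_out = weights
    h = 0.0
    for t in range(steps):
        x = cue if t == 0 else 0.0
        h = math.tanh(w_in * x + w_rec * h)   # recurrent state = memory
    return w_out * h                          # answer after the delay

def fitness(weights):
    """Reward recalling the cue (+1 or -1) after the delay."""
    return -sum((run(weights, c) - c) ** 2 for c in (+1.0, -1.0))

population = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(30)]
for _ in range(200):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                        # truncation selection
    offspring = [[w + random.gauss(0.0, 0.1) for w in random.choice(parents)]
                 for _ in range(20)]                 # mutated copies
    population = parents + offspring

best = max(population, key=fitness)
print("recall of +1 cue:", round(run(best, +1.0), 2))
print("recall of -1 cue:", round(run(best, -1.0), 2))
```

Evolution here never assigns credit to individual time steps; it only compares complete policies, which is why the hidden recurrent state can be discovered without any gradient through time.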
DB: How formally does evolutionary computation borrow from biology, and how are you driving toward potentially deepening that metaphor?
RM: Some machine learning comprises pure statistics or is otherwise mathematics-based, but some of the inspiration in evolutionary computation, and in neural networks and reinforcement learning in general, does in fact derive from biology. To your question, it is indeed best understood as a metaphor; we aren’t systematically replicating what we observe in the biological domain. That is, while some of these algorithms are inspired by genetic evolution, they don’t yet incorporate the overwhelming complexity of genetic expression, epigenetic influence, and the nuanced interplay of an organism with its environment.
Instead, we take the aspects of biological processes that make computational sense and translate them into a program. The guiding idea of this work, and indeed the governing principle of biological evolution, can be understood as selection on variation.
At a high level, it’s quite similar to the biological story. We begin with a population from which we select the members that reproduce the most, and through selective pressure, yield a new population that is more likely to be better than the previous one. In the meantime, researchers are working on incorporating increasing degrees of biological complexity into these models. Much work remains to be done in this regard.
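In code, that selection-on-variation loop can be very small. The sketch below assumes the classic OneMax toy objective (maximize the number of 1 bits in a genome), which is not from the interview; tournament selection supplies the selective pressure, while crossover and mutation supply the variation:

```python
import random

def fitness(genome):
    return sum(genome)                 # OneMax: count the 1 bits

def tournament(pop, k=3):
    return max(random.sample(pop, k), key=fitness)      # selection

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]                            # variation

def mutate(genome, rate=0.02):
    return [bit ^ 1 if random.random() < rate else bit for bit in genome]

pop = [[random.randint(0, 1) for _ in range(50)] for _ in range(40)]
for _ in range(100):
    pop = [mutate(crossover(tournament(pop), tournament(pop)))
           for _ in range(len(pop))]

print("best fitness:", max(map(fitness, pop)))   # approaches 50
```

Everything domain-specific lives in the fitness function and the genome encoding; the select-and-vary loop itself stays the same across applications.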
DB: What are some of the applications of this work?
RM: Evolutionary algorithms have existed for quite a while, indeed since the ’70s. The lion’s share of the work centered on engineering applications — e.g., trying to build better power grids, antennas, and robotic controllers through various optimization methods. What got us really excited about this field were the numerous instances where evolution not only optimizes something that you know well, but goes one step further and generates novel and indeed surprising solutions.
We encountered such a breakthrough when evolving a controller for a robot arm. The arm had six degrees of freedom, although you really only needed three to control it. The goal was to get its fingers to a particular location in 3D space. This was a rather straightforward exercise, so we complicated things by inserting obstacles along its path, all the while evolving a controller that would get to the goal while avoiding said obstacles. One day while working on this problem, we accidentally disabled the main motor — i.e., the one that turns the robot around its main axis. Without that particular motor, it could not reach its goal location.
We ran the evolution program, and although it took five times longer than usual, it ultimately found a solution that would guide the fingers into the intended location. We only understood what was going on when we looked at a graphical visualization of its behavior. When the target was, say, all the way to the left, and the robot needed to turn around the main axis to get its arm into close proximity, it was, by definition, unable to turn without its main motor. Instead, it turned the arm from the elbow and the shoulder, away from the goal, then swung it back with quite some force. Thanks to momentum, the robot would turn around its main axis and get to the goal location, even without the motor. This was surprising, to say the least.
This is exactly what you want in a machine learning system. It fundamentally innovates. If a robot on Mars loses its wheel or gets stuck on a rock, you still want it to creatively complete its mission.
Let me further underscore this sort of emergent creativity with another example (of which there are many!). In one of my classes, we assigned students to build a game-playing agent to win a game similar to tic-tac-toe, only played on a very large grid where the goal is to get five in a row. The class developed a variety of approaches, including neural networks and some rule-based systems, but the winner was an evolved system that made its first move to a location really far away, millions of spaces from where the gameplay began. Opposing players would then expand memory to capture that move, until they ran out of memory and crashed. It was a very creative way of winning, something that you might not have considered a priori.
Evolution thrives on diversity. If you supply it with representations and allow it to explore a wide space, it can discover solutions that are truly novel and interesting. In deep learning, most of the time you are learning a task you already know — weather prediction, stock market prediction, etc. — but here, we are being creative. We are not just predicting what will happen; we are creating objects that didn’t previously exist.
DB: What is the practical application of this kind of learning in industry? You mentioned the Mars rover, for example, responding to some obstacle with evolution-driven ingenuity. Do you see robots and other physical or software agents being programmed with this sort of on-the-fly, ad hoc, exploratory creativity?
RM: Sure. We have shown that evolution works. We’re now focused on taking it out into the world and matching it to relevant applications. Robots, for example, are a good use case: they have to be safe, they have to be robust, and they have to work under conditions that no one can fully anticipate or model. An entire branch of AI called evolutionary robotics centers on evolving behaviors for these kinds of real, physical robots.
At the same time, evolutionary approaches can be useful for software agents, from virtual reality to games and education. Many systems and use cases can benefit from the optimization and creativity of evolution, including Web design, information security, optimizing traffic flow on freeways or surface roads, optimizing the design of buildings, computer systems, and various mechanical devices, as well as processes such as bioreactors and 3D printing. We’re beginning to see these applications emerge.
DB: What would you say is the most exciting direction of this research?
RM: I think it is the idea that, in order to build really complex systems, we need to be able to use “stepping stones” in evolutionary search. It is still an open question: using novelty, diversity, and multiple objectives, how do we best discover components that can be used to construct complex solutions? That is crucial in solving practical engineering problems, such as making a robot run fast or making a rocket fly with stability, but also in constructing intelligent agents that can learn during their lifetimes, utilize memory effectively, and communicate with other agents.
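One concrete mechanism explored in this line of research is novelty search (due to Lehman and Stanley), in which candidates are selected for behaving unlike anything in an archive of past behaviors rather than for raw fitness; the archived behaviors accumulate as potential stepping stones. The sketch below is a heavily simplified rendition of my own, with an invented behavior descriptor standing in for a real task:

```python
import random

def behavior(params):
    """Stand-in for the outcome of an episode (e.g., where the agent ended up)."""
    return (sum(params[:2]), sum(params[2:]))

def novelty(b, archive, k=5):
    """Average distance to the k nearest previously seen behaviors."""
    if not archive:
        return float("inf")
    dists = sorted(abs(b[0] - a[0]) + abs(b[1] - a[1]) for a in archive)
    return sum(dists[:k]) / min(k, len(dists))

archive = []
pop = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(20)]
for _ in range(50):
    scored = sorted(pop, key=lambda p: novelty(behavior(p), archive),
                    reverse=True)
    archive.extend(behavior(p) for p in scored[:3])   # remember novel points
    parents = scored[:10]                             # select for novelty
    pop = [[w + random.gauss(0.0, 0.2) for w in random.choice(parents)]
           for _ in range(20)]

print("behaviors archived as potential stepping stones:", len(archive))
```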
But equally exciting is the emerging opportunity to take these techniques to the real world. We now have plenty of computational power, and evolutionary algorithms are uniquely poised to take advantage of it. They run in parallel and can, as a result, operate at very large scale. The upshot of all of this work is that these approaches can be successful on large-scale problems that cannot currently be solved in any other way.