In 1852, mathematics professor Augustus De Morgan described a problem posed by a student:

If a figure be anyhow divided and the compartments differently coloured so that figures with any portion of common boundary line are differently coloured—four colours may be wanted, but not more—the following is the case in which four colours are wanted. Query cannot a necessity for five or more be invented.

That is, you’ll never need more than four colors on an ordinary two-dimensional map in order to color every country differently from the countries adjoining it. A proof of the four-color conjecture evaded mathematicians until 1976, when Kenneth Appel and Wolfgang Haken announced one. They had reduced the set of all possible map configurations to 1,936 fundamental configurations, then used 1,200 hours of computer time to verify that each could be colored with only four colors. In all, their program performed billions of individual calculations.
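To make the coloring rule concrete, here is a minimal sketch that colors one small map by brute-force backtracking. It is not Appel and Haken’s method, which reasoned about whole classes of configurations; the function name `four_color` and the toy map are assumptions made up for illustration.

```python
def four_color(adjacency, colors=("red", "green", "blue", "yellow")):
    """Return a dict mapping region -> color, or None if no valid coloring exists."""
    regions = list(adjacency)
    assignment = {}

    def solve(i):
        if i == len(regions):
            return True  # every region has a color
        region = regions[i]
        for color in colors:
            # A color is allowed only if no already-colored neighbor uses it.
            if all(assignment.get(n) != color for n in adjacency[region]):
                assignment[region] = color
                if solve(i + 1):
                    return True
                del assignment[region]  # backtrack and try the next color
        return False

    return assignment if solve(0) else None


# Hypothetical map: four regions, each bordering the other three,
# so four colors are genuinely needed.
toy_map = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["A", "B", "C"],
}
coloring = four_color(toy_map)
```

With only three colors available, the same call returns `None` for this map, which is the situation De Morgan’s student described as the case “in which four colours are wanted.”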

Appel and Haken’s computer-based casewise approach was immediately controversial. Theirs was the first major proof that couldn’t be entirely comprehended by an individual human, what a critic labeled a “non-surveyable proof.” Since then, mathematicians have continued to be skeptical of computer proofs. In 1998, Thomas Hales offered a casewise computer-checked proof of the Kepler conjecture; a reviewer likened the years-long process of checking it to proofreading a phone book.

At first glance, these sorts of arguments look like academic quirks, the preoccupation of a rarefied group of intellectuals who think computer proofs aren’t beautiful enough. But the four-color proof was the first landmark application of a problem-solving approach that has become commonplace in the digital realm and will soon be commonplace in the physical world. Our environment is about to become computer optimized.

We’re surrounded by what you might call non-surveyable objects: a modern airliner results from millions of person-hours of design and engineering work, and no individual engineer could possibly verify every aspect of its design or search the entire space of design alternatives to identify improvements. The Web takes this even further. Not only is the codebase for Google Search large and complex beyond the comprehension of any individual, but it also routinely generates web pages on demand that no human has ever seen before, or will ever see again.

What’s new is the use of machine-learning techniques to create and optimize physical designs. Researchers have been investigating the idea for decades, but cheap cloud computing and advances in machine learning will make these techniques accessible in the design of nearly anything. The result is a design whose rationale is known only to an artificial-intelligence algorithm, one that embodies design decisions no human can articulate. These designs are likely to be much more complex than human designs, and they open up a vast design space that ordinary human iteration couldn’t explore.

Machine-learning techniques are now widely used to optimize complex information systems. A genetic algorithm, for example, starts with a fundamental description of a desired outcome, such as an airline timetable that’s optimized for fuel savings and passenger convenience. It adds in constraints: the number of planes an airline owns, the airports it operates in, and the number of seats on each plane. It then loads what you might think of as independent variables: details on thousands of flights from an existing timetable, or perhaps randomly generated dummy information.

The algorithm tests that timetable against the optimization goals, discerns the effect that each change to the timetable might have on its performance against the goals, adjusts the timetable incrementally, and tests again. It might also ingest data on past performance—data that might not have any evident pattern in itself, but that enriches the model.

Over thousands, millions, or billions of iterations, the timetable gradually improves to become more efficient and more convenient. And the algorithm gains an understanding of how each element of the timetable—the takeoff time of flight 37 from O’Hare, for instance—affects the dependent variables of fuel efficiency and passenger convenience.
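The loop described above can be sketched in a few lines. This is a toy under stated assumptions, not any airline’s system: the `PREFERRED` hours, the clash penalty, and the names `cost` and `evolve` are all invented for illustration, and a real timetable would have far richer constraints. A plain genetic algorithm like this one also doesn’t explicitly model the effect of each change; it simply keeps the variations that score better.

```python
import random

# Hypothetical data: the hour of day passengers would prefer for each flight.
PREFERRED = [7, 8, 8, 9, 12, 17, 17, 18]


def cost(timetable):
    """Lower is better: passenger inconvenience plus a penalty for clashes."""
    inconvenience = sum(abs(t - p) for t, p in zip(timetable, PREFERRED))
    # Flights sharing a departure hour stand in for a resource constraint
    # such as a limited number of gates or aircraft.
    clashes = len(timetable) - len(set(timetable))
    return inconvenience + 5 * clashes


def evolve(generations=200, pop_size=40, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    n = len(PREFERRED)
    # Start from randomly generated dummy timetables.
    population = [[rng.randrange(24) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=cost)
        history.append(cost(population[0]))
        parents = population[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]  # crossover: splice two parents
            if rng.random() < mutation_rate:
                i = rng.randrange(n)
                child[i] = (child[i] + rng.choice([-1, 1])) % 24  # small mutation
            children.append(child)
        population = parents + children
    population.sort(key=cost)
    history.append(cost(population[0]))
    return population[0], history


best, history = evolve()
```

Because the better half of each generation survives untouched, the best cost recorded in `history` never gets worse from one generation to the next, which is the gradual improvement the paragraph describes.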

Researchers have looked for ways to apply this kind of iterative optimization to physical design for decades, initially applying it to specialized objects with well-defined optimization goals, like antennas. In 2006, NASA launched three spacecraft with so-called evolved antennas that “resemble bent paper clips,” making those the first generatively designed objects to fly in space. Today, lattices and other complex, structurally critical elements are often designed generatively.

Generalizing this kind of process to design any kind of object is difficult. The more you generalize, the more complex the definition process becomes. You need a lot of data to understand the interactions between independent variables and dependent variables, and the models become exponentially more complex as you add parameters.

For relief, we look to cloud computing. As Autodesk CEO Carl Bass has pointed out, we’re approaching the point where one second of computing time on 10,000 computers working in parallel is as inexpensive as 10,000 seconds of computing time on a single computer. (To save you the arithmetic, that’s nearly three hours’ worth of computing delivered in one second.) If machine learning can be made highly parallel, the cloud can deliver results almost instantaneously.
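The arithmetic behind that parenthetical, written out as a quick check (the variable names are only for illustration):

```python
machines = 10_000          # computers working in parallel
seconds_each = 1           # wall-clock seconds each one runs

total_compute_seconds = machines * seconds_each
hours = total_compute_seconds / 3600   # 3,600 seconds in an hour

print(round(hours, 2))  # 2.78, i.e. nearly three hours
```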

Autodesk is testing that insight in Project Dreamcatcher, software that can design optimized physical objects with genetic algorithms. A designer using Dreamcatcher begins by specifying constraints and fundamental requirements: a chair, for instance, needs a platform raised a certain height off the floor that can support a human load. Then she specifies optimization goals: minimize weight, maximize rigidity, minimize 3D-printing time, and so on. And then she hits “start.”

From there, Dreamcatcher takes a solid mass that fits the constraints and starts whittling away. Each time it shears off a bit of material, it tests the new design against the designer’s goals. Is the new design lighter, more rigid, and easier to manufacture? The algorithm uses the test to gradually build up an understanding of how each bit of material affects the design’s performance. It steers the design process accordingly, making larger and smaller modifications in different directions as it seeks improvement, like a ball rolling around on a lumpy surface, searching for a low point to settle into.
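A heavily simplified sketch of that whittling loop follows. It is not Dreamcatcher’s algorithm, which isn’t described in detail here; it assumes a 2D grid of material, a single goal (less weight), and a crude stand-in for structural testing in which the seat (top row) must stay intact and connected to the floor (bottom row). The names `supported` and `whittle` are hypothetical.

```python
import random


def supported(grid):
    """True if the top row (the seat) is intact and connected to the floor row."""
    rows, cols = len(grid), len(grid[0])
    if not all(grid[0]):
        return False
    # Flood fill upward from the floor through whatever material remains.
    stack = [(rows - 1, c) for c in range(cols) if grid[rows - 1][c]]
    seen = set(stack)
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] and (nr, nc) not in seen):
                seen.add((nr, nc))
                stack.append((nr, nc))
    return all((0, c) in seen for c in range(cols))


def whittle(rows=6, cols=8, seed=0):
    rng = random.Random(seed)
    grid = [[True] * cols for _ in range(rows)]  # start from a solid block
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    changed = True
    while changed:  # keep making passes until nothing more can be removed
        changed = False
        rng.shuffle(cells)
        for r, c in cells:
            if not grid[r][c]:
                continue
            grid[r][c] = False      # shear off a bit of material
            if supported(grid):
                changed = True      # constraint still holds: keep the cut
            else:
                grid[r][c] = True   # constraint broken: put the material back
    return grid


chair = whittle()
weight = sum(map(sum, chair))  # number of cells of material left
```

The result is a seat balanced on a single thin leg, light but obviously not rigid, which hints at why a real tool has to weigh several goals against each other rather than minimizing weight alone.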

Along the way, the designer can give feedback to the algorithm. Maybe the optimal design isn’t an aesthetically pleasing one, or it reveals an essential constraint that the designer hadn’t considered. By adding new constraints as she observes the output of the algorithm, the designer can direct the process at a high level and use the impartial algorithm to build the details of a design that’s pleasing to humans. And just like the ball on the lumpy surface, the algorithm might encounter a local minimum—a small depression that appears to be an optimized design. Here, the operator might give the algorithm a gentle nudge and encourage it to search elsewhere.
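The ball-on-a-lumpy-surface picture is easy to reproduce. In this sketch the surface is a made-up list of heights, `descend` rolls downhill until no neighbor is lower, and the operator’s nudges are modeled, as an assumption, as extra starting positions to try.

```python
# Hypothetical lumpy surface: lower is better, and index 9 is the true low point.
SURFACE = [5, 3, 4, 6, 2, 1, 3, 7, 4, 0, 2]


def descend(heights, start):
    """Roll downhill from `start` until neither neighbor is lower."""
    i = start
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]
        best = min(neighbors, key=lambda j: heights[j])
        if heights[best] >= heights[i]:
            return i  # stuck in a depression, possibly only a local one
        i = best


def descend_with_nudges(heights, start, nudges):
    """Try the original start plus each operator-supplied nudge; keep the lowest."""
    candidates = [descend(heights, s) for s in [start, *nudges]]
    return min(candidates, key=lambda i: heights[i])


stuck_at = descend(SURFACE, 0)                         # settles at index 1 (height 3)
nudged_to = descend_with_nudges(SURFACE, 0, [7, 10])   # finds index 9 (height 0)
```

Left alone, the search settles into the first shallow dip it meets; with a couple of nudges it finds the deeper one.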

In the process, the designer’s role becomes one of high-level vision and curation, and the role of software in relation to the designer changes. As Erin Bradner, a researcher at Autodesk, puts it, computer-assisted design “has, until now, been about transcription: a human designs something in his head, then uses software to transcribe the design into the computer. Now, CAD is about augmentation.”

What happens when you generalize that idea to any kind of process? You’d have a human describe an optimal outcome and the process that leads up to it, and ask a computer to discover how each input and each step in the process relates to the outcome. The whole thing could be driven by the enormous quantities of data that are available through connected sensors and the Internet of Things.

That’s essentially what Riffyn, a startup based in Oakland, is headed toward *[disclosure: Riffyn is a portfolio company of O’Reilly AlphaTech Ventures, a venture capital firm affiliated with O’Reilly Media]*. It’s developing software for designing and analyzing scientific experiments, but, really, it’s a generalized platform for optimizing any physical process.

Alastair Dant, Riffyn’s lead interactive developer, showed me a tongue-in-cheek experiment he cooked up that searches for the best coffee, as measured in lines of code written by his engineers. Riffyn presents its user with a flow chart that reads left to right; coffee beans and engineers go in, and lines of code come out. Each step in the process has its own node—grind coffee, brew coffee, serve coffee—and its own inputs and outputs that make up a “genealogy” linking the steps together. The experimenter switches out the variety of beans and sees whether engineer output changes.

That’s all normal science: isolate individual variables that might have an effect on some measurement, and test variations one by one. But, midway through a run of experiments, you can decide to start adding and varying parameters that you hadn’t considered at the outset. Add water as an input and vary its temperature at the same time that you’re varying the brand of coffee beans. Add a sub-process to the brewing node that represents resting and measure the effect of a post-brew rest on outcome.

It’s the kind of seeking that computers are good at: start with a simple model, then make it more sophisticated as you go. The contours of causality emerge as the computer iterates and its human operator adds variables and relationships between them.

“The process itself, written in a lab manual, is almost a Platonic ideal,” says Dant. The scientist’s challenge is to link that process to reality.

The experiment begins to look less like an ordinary cause-and-effect test and more like a genetic algorithm that’s searching for an optimal combination of many variables. With repeated variation and testing, the effects of each variable, singly and jointly with others, become clear.
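One way to see the effects of variables “singly and jointly” is a small full-factorial experiment. The sketch below is not Riffyn’s software or API. The factor names, their levels, and especially `simulated_output`, a made-up function standing in for real measurements, are assumptions chosen so the arithmetic is easy to follow.

```python
from itertools import product
from statistics import mean

# Hypothetical factors and levels for the coffee experiment.
FACTORS = {
    "beans": ["A", "B"],
    "water_temp_c": [88, 96],
    "rest_min": [0, 2],
}


def simulated_output(beans, water_temp_c, rest_min):
    """Made-up stand-in for a measurement (lines of code per engineer)."""
    output = 100
    output += 10 if beans == "B" else 0
    output += 5 if water_temp_c == 96 else 0
    # Built-in interaction: resting only helps when the water was hot.
    output += 8 if (water_temp_c == 96 and rest_min == 2) else 0
    return output


# Run every combination of levels once.
runs = [dict(zip(FACTORS, levels)) for levels in product(*FACTORS.values())]
results = [(run, simulated_output(**run)) for run in runs]


def main_effect(factor):
    """Average output at the high level minus average at the low level."""
    low, high = FACTORS[factor]
    return (mean(y for run, y in results if run[factor] == high)
            - mean(y for run, y in results if run[factor] == low))


def interaction(f1, f2):
    """How much the effect of f1 changes between the two levels of f2."""
    (l1, h1), (l2, h2) = FACTORS[f1], FACTORS[f2]

    def cell(a, b):
        return mean(y for run, y in results if run[f1] == a and run[f2] == b)

    return (cell(h1, h2) - cell(l1, h2)) - (cell(h1, l2) - cell(l1, l2))


rest_alone = main_effect("rest_min")                     # 4: looks like a modest help
rest_with_heat = interaction("water_temp_c", "rest_min")  # 8: the help comes only with hot water
```

Varying one factor at a time would report that resting helps a little; varying factors together shows that it helps only in combination with hot water.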

An important difference between Riffyn and Dreamcatcher is the amount of time that a test might take: Dreamcatcher can generate and test a model in a matter of seconds. Even in a lab that uses fully automated, high-volume equipment, a scientific experiment designed in Riffyn’s software might take years to carry out, and years more to refine iteratively. But that makes the computer modeling all the more important: it’s essential that lab time is used to vary parameters in the right direction.

In either case, though, indefinitely complex optimization becomes practical, and our experimental envelope grows dramatically. Computers can search through immense solution spaces for the ideal design; we might someday talk about “discovering a design” through the joint efforts of human and computer neurons, rather than “creating a design.”