A new infrastructure for biology
The revolution in automation is fueling biology at scale.
BioFabricate, Synbiobeta, and a few conversations over the last few weeks have helped me to refine my take on the future of biology. We’re seeing a revolution in biology and related sciences — one prerequisite for that revolution is an ongoing revolution in tooling; part of that tooling revolution is a revolution in automation.
I’ve written about robotic lab equipment. Lab automation is central to reproducibility: it forces more complete descriptions of experimental procedure, and minimizes errors and dependencies on “technique.” Robots don’t understand lab culture and folklore: everything that goes into the experiment has to be part of the program. And they don’t have bad days, or better bench technique than the robot in the lab down the hall. They don’t get tired, and they don’t make mistakes.
But Christina Agapakis’ talk at BioFabricate, followed up by Emily LeProust’s talk at Synbiobeta, showed me another dimension of laboratory robotics. Agapakis talked about Ginkgo Bioworks’ work on synthesizing scents for a perfume company, but what really caught my eye was the automated lab that they have built. Why an automated lab? Of course, Ginkgo is interested in reliability and repeatability: you can’t make a commercial product if you can’t produce it reliably. But that’s not the only reason. Ginkgo is doing biology at a scale where automation is the only way forward.
The process from an experimental result to a product is long and arduous. A researcher does an experiment on a lab bench that shows a microbe can be engineered to create a desirable compound. Great. The researcher can write a paper — but we’re still far away from a commercial process. A lot of questions still have to be answered.
- Can the compound be produced reliably? A process that only works sometimes may be interesting, and may have important implications for research, but it’s hard to build a business around something that only works half of the time.
- Can the compound be produced in quantities that are commercially useful? Microscopic quantities are fine for academic papers, but commercial processes need to produce results in volumes that can be shipped to customers.
- Can the compound be produced cheaply? Microbes have to be fed. And not all feedstocks are inexpensive.
- Will the process work in real-world conditions? What are the tolerances? If the temperature is one degree high or low, does the whole thing stop? If you grow a yeast in a large fermentation vat rather than a test tube, will waste products choke it out?
That’s a start. To answer these questions, companies like Ginkgo run thousands of variations of the experiment: different microbes, different ways of inserting DNA, different kinds of food, different environments. This process is all about making a reaction commercially viable: making the transition from a successful experiment to an industrial process. It’s part of what we mean when we talk about synthetic biology as an engineering discipline. Since we don’t understand biological processes well enough to design circuits that work the first time, we need to proceed by trial and error. At a massive scale.
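To see how quickly "thousands of variations" arise, here is a minimal Python sketch that enumerates every combination of a handful of experimental factors. The factor names and values are hypothetical placeholders chosen for illustration; they are not Ginkgo's actual strains, methods, or conditions.

```python
from itertools import product

# Hypothetical factors: every name and value below is an illustrative assumption.
factors = {
    "microbe": [f"strain_{i}" for i in range(12)],
    "insertion_method": ["method_a", "method_b", "method_c"],
    "feedstock": ["glucose", "sucrose", "glycerol", "xylose"],
    "temperature_c": [28, 30, 32, 34, 37],
    "ph": [5.0, 6.0, 7.0],
}


def enumerate_variants(factors):
    """Return one dict per combination of factor levels (a full factorial design)."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


variants = enumerate_variants(factors)
print(len(variants))  # 12 * 3 * 4 * 5 * 3 = 2160
```

Five modest factors already yield more than two thousand distinct experiments, and adding one more factor multiplies the total again.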
But how do you run thousands of experiments? You don’t do that by hiring a lot of bench biologists. Even with the cheapest labor, that would be prohibitively expensive. And you can’t do the experiments one at a time, or ten at a time, either. That would be prohibitively slow. Results 10 years from now aren’t useful.
Hence, automated labs: not just for reliability, but for scale. Emily LeProust of Twist gave me a sense for what scale means. We’re used to thinking about test tubes, but biologists frequently talk about “plates,” which are essentially an array of miniature test tubes built together. There are typically 96 wells (test tubes) on a plate, which is roughly the size of a 3×5 card. That’s small, but still something that biologists can work with by hand. It’s also well within the capability of a simple lab robot. So are plates with 384 wells.
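For readers who think in code, a plate is just a two-dimensional array with labeled coordinates. The sketch below assumes the conventional layouts (8 rows by 12 columns for 96 wells, 16 by 24 for 384 wells) and the usual letter-plus-number well labels; the function names are my own, not part of any lab robot's API.

```python
import string

# Conventional layouts: 96 wells = 8 rows x 12 columns, 384 wells = 16 rows x 24 columns.
PLATE_SHAPES = {96: (8, 12), 384: (16, 24)}


def well_name(index, plate_size=96):
    """Map a zero-based, row-major well index to a label such as 'A1' or 'H12'."""
    rows, cols = PLATE_SHAPES[plate_size]
    if not 0 <= index < rows * cols:
        raise ValueError(f"index {index} is outside a {plate_size}-well plate")
    row, col = divmod(index, cols)
    return f"{string.ascii_uppercase[row]}{col + 1}"


def plates_needed(n_experiments, plate_size=96):
    """Number of plates required when each experiment occupies one well."""
    return -(-n_experiments // plate_size)  # ceiling division


print(well_name(0), well_name(95))   # A1 H12
print(well_name(383, 384))           # P24
print(plates_needed(2000, 96))       # 21
print(plates_needed(2000, 384))      # 6
```

The last two lines show why density matters: two thousand one-well experiments fill 21 plates at 96 wells each, but only 6 at 384.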
LeProust showed me how Twist delivers DNA samples: on plates with 9,600 wells. The wells are all but microscopic. At this scale, there’s no way humans can do the experiment by hand. You’re not going to stick a pipette into a well that you can barely see to extract a few nanoliters of fluid. You need a laboratory that’s robotic from start to finish. This lab needs to be able to work with fluids at a microscopic scale. It needs to use techniques like acoustic liquid handling, microfluidics, and even nanofluidics.
Everything — the start of the experiment to data collection — has to be automated. There’s no sitting around with a lab notebook. The role of the biologist isn’t to “do the experiment,” but to design the experiments (at scale), create the programs needed to run them on the automated lab, and analyze the data that comes back.
This changes the game completely. At this scale, biology is no longer a laboratory science, if a laboratory means a large room with countertops and Bunsen burners. You design an experiment on your laptop, in your office, in a coffee shop, or on a beach. You send that experiment to the lab, which might be a facility owned by your employer or research institution; or it might be a “lab as a service,” such as Transcriptic or Emerald. Vendors like Twist deliver supplies directly to the automated lab.
How, specifically, does the game change? The speed at which we can synthesize DNA is increasing at a rate that exceeds Moore’s law, and the cost of synthesized DNA is plummeting. It’s down to $0.10 per base, and expected to drop further as the market grows. Many of the reagents used in biology are extremely expensive, but cost is less meaningful when you’re using chemicals in nanoliter quantities. And we can run thousands of experiments at a time: 24/7, using automated equipment that never makes mistakes because it’s tired or bored.
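The per-base price is easy to turn into a budget. This is a back-of-the-envelope sketch that uses the $0.10 per base figure quoted above; the construct length and count are hypothetical, and real pricing may well depend on factors this ignores.

```python
from decimal import Decimal

PRICE_PER_BASE_USD = Decimal("0.10")  # the per-base figure quoted in the text


def synthesis_cost(length_bases, n_constructs=1, price_per_base=PRICE_PER_BASE_USD):
    """Cost in US dollars of synthesizing n_constructs sequences of the given length."""
    return price_per_base * length_bases * n_constructs


# Hypothetical example: one 1,000-base construct, then a thousand of them.
print(synthesis_cost(1_000))         # 100.00
print(synthesis_cost(1_000, 1_000))  # 100000.00
```

At that price a single construct is cheap, but a thousand-variant campaign is a six-figure line item, which is why further price drops matter.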
And, rather than dealing with data from a single experiment, we’re dealing with data from thousands. Even when the number of experiments is relatively small, the data involved can be huge: DNA sequences, high-resolution microscopic images, calibration information from instruments, and large databases.
This is the new infrastructure for biology: large robotic foundries, running experiments in parallel; the tools to program and manage these foundries; and the tools for managing, analyzing, and sharing the data that results. It’s biology at scale.