This past June, a driverless truck passed a 200-mile test drive from Yuma, Arizona, to San Diego, California—a milestone for autonomous trucking in the U.S. This feat was achieved by the company TuSimple, which trained its driving system using an AI technique known as deep learning to simulate tens of millions of miles of road driving.
Deep learning can tackle tasks that are easy for a person but hard for computers, such as identifying people and objects in photos; detecting mood and intention in an image, text, or voice interaction; or recognizing handwriting. Rather than hand-coding software routines with specific instructions, developers train the system using large amounts of data and algorithms that give it the ability to learn how to perform the task.
With deep learning, we are now able to build software that can detect fraud; identify patterns in trading; recommend new products to customers; and, in the very near future, provide autonomous trucks to an industry plagued by a chronic shortage of drivers, fatal accidents, and high overhead costs due to fuel inefficiencies. TuSimple’s driverless trucks are poised to disrupt the $700 billion U.S. trucking industry, just one thin slice of how advances in deep learning and AI will transform our lives. “For me, it was the challenge of working on unsolved problems that drew me to this project,” says Xiaodi Hou, co-founder and CTO of TuSimple.
So how is TuSimple using deep learning to make self-driving trucks possible? “We use cameras as our primary sensors,” Hou explains. “Each camera ingests 20-30 frames per second, about 100 megabytes of data, which passes through layers and layers of deep learning network stacks and gets some result. The results are further combined in algorithms to make real-time decisions based on the truck’s self-information, velocity, and angle, where the other cars or obstacles are, detecting lanes, etc.”
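To get a feel for what those numbers imply, here is a back-of-the-envelope sketch in Python. It assumes the "about 100 megabytes" in Hou's quote means per camera per second, which the quote does not spell out; the function name `per_frame_budget` is made up for illustration. It shows how little time and how much data each frame represents at the quoted frame rates.

```python
def per_frame_budget(fps: float, megabytes_per_second: float = 100.0) -> dict:
    """Return the time and data budget for a single frame.

    The 100 MB/s default is an assumption about the quoted figure,
    not a number confirmed by TuSimple.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return {
        "milliseconds_per_frame": 1000.0 / fps,
        "megabytes_per_frame": megabytes_per_second / fps,
    }


# The two ends of the quoted 20-30 frames-per-second range.
for fps in (20, 30):
    budget = per_frame_budget(fps)
    print(
        f"{fps} fps: {budget['milliseconds_per_frame']:.1f} ms and "
        f"{budget['megabytes_per_frame']:.2f} MB per frame"
    )
```

Under that assumption, every camera frame has to be read, pushed through the network stack, and turned into a decision within a few tens of milliseconds if the system is to keep up with the camera.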
TuSimple’s deep learning requirements for training versus implementation are very different. The models are created and trained in a multiple-GPU-based Amazon Web Services cloud environment. Once the models are trained, they are transferred to truckborne computers where the models interpret sensor input into results that enable real-time understanding and reaction to road conditions and other road users.
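The split between training and deployment can be illustrated with a toy sketch. This is not TuSimple's pipeline and does not use MXNet; it is a minimal pure-Python stand-in in which a one-variable linear model plays the role of the network, and the functions `train`, `export_model`, and `load_model` are hypothetical names. The point is the shape of the workflow: the expensive fitting happens in one place, only the learned parameters are shipped, and the receiving side does nothing but inference.

```python
import json


def train(samples, epochs=2000, lr=0.05):
    """'Cloud' side: fit y = w*x + b by plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return {"w": w, "b": b}


def export_model(params):
    """Serialize only the learned parameters, not the training code or data."""
    return json.dumps(params)


def load_model(blob):
    """'Truck' side: rebuild an inference-only function from the parameters."""
    params = json.loads(blob)
    return lambda x: params["w"] * x + params["b"]


# Toy training data drawn from y = 2x + 1.
training_data = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9)]
blob = export_model(train(training_data))  # happens once, offline
predict = load_model(blob)                 # happens on the deployed machine
print(round(predict(10), 3))
```

The deployed side never sees the training data or the optimizer, which is why the two environments can have such different hardware requirements.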
In 2015, when Hou started thinking seriously about adopting a deep learning framework to speed development, Apache MXNet was still a fledgling project known as CXXNET. Hou had jotted down his requirements for a framework, which included specific constraints on time, cost, capacity, and scalability. MXNet’s breakthrough, and what attracted Hou and his team, is that it successfully combined the ability to scale to multiple GPUs across multiple hosts with high performance and cross-platform portability.
MXNet was also efficient in training, which offsets the cost of the computational power needed to train models. “The choice of a deep learning framework is really important,” Hou emphasizes. “You can waste a lot of computational resources.” Hou also cites a recent article that benchmarked MXNet against TensorFlow and found that, with eight GPUs on the well-known CIFAR-10 data set, MXNet was significantly faster, much more memory-efficient, and more accurate than TensorFlow.
Given the fast pace of development in deep learning, Hou and his team are avid supporters of the open source community. They have been contributing to MXNet since its early days, and their contributions range from classic algorithm implementations to tutorials and talks. TuSimple has released widely adopted pipelines on MXNet, such as Faster R-CNN, and, in turn, the company benefits from a thriving community of developers collaborating on a framework that is focused on solving fundamental problems.
The value of the community behind MXNet and other open source projects cannot be overstated. The Apache Way, the Apache Software Foundation’s voting procedures and governance processes that its open source projects follow, has given rise to many popular tools, including Hadoop, Spark, Kafka, and Mahout. The more developers these tools attract, the better the code gets.
In TuSimple’s case, MXNet solves only about 10% of the problems the team is working on, but those problems could easily have taken up 90% of its time. TuSimple’s contributions to MXNet have in turn been enhanced by other developers. And while Hou could have created his own fork of MXNet with a separate codebase, he chose not to, because TuSimple would then be running on a framework that is isolated from the main branch and would not benefit from its continued development and enhancement.
Deep learning is only one aspect of creating a comprehensive autonomous driving solution for commercial trucks. The technique is best at solving interpretation problems such as image recognition, object detection, estimating the speed of vehicles relative to the ground, bump detection, and lane detection, but it has limited capacity for meaningfully implementing the rules that govern driving. The “black box” approach inherent in most of today’s deep learning algorithms gives little insight into the internal logic of the algorithms.
What might happen in a court of law when a system cannot explain its own logic? If a truck makes a sudden stop to avoid a hazard that human drivers didn’t see, for example, it would have to “explain” its behavior. Or what would happen if the truck begins learning behavior that is against the law, such as when the most efficient path involves sweeping across four lanes of traffic? For a system with immediate real-world consequences, this would add too much uncertainty, which necessitates other models to govern the system and provide a holistic view of the external world. “Imagine the whole solution looking like a tree,” explains Hou. “The leaf nodes would be deep learning approaches and the branching nodes would be based on Bayesian inference that incorporates priors and rules.”
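Hou's tree can be sketched in a few lines of Python. This is an illustration of the idea, not TuSimple's design: the classes `Leaf` and `Branch`, the detection rates, the prior, and the braking threshold are all hypothetical, and the update treats the detectors as conditionally independent, a simplification. Each leaf stands in for a deep learning detector that either fires or does not; the branching node starts from a prior belief that a hazard is present, applies Bayes' rule once per leaf, and then applies an explicit, inspectable rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Leaf:
    """Stand-in for a deep learning detector (hypothetical rates)."""
    name: str
    detect: Callable[[dict], bool]
    true_positive_rate: float   # P(fires | hazard present)
    false_positive_rate: float  # P(fires | no hazard)


def bayes_update(prior: float, leaf: Leaf, fired: bool) -> float:
    """Posterior probability of a hazard after observing one leaf's output."""
    if fired:
        like_hazard, like_clear = leaf.true_positive_rate, leaf.false_positive_rate
    else:
        like_hazard = 1 - leaf.true_positive_rate
        like_clear = 1 - leaf.false_positive_rate
    evidence = like_hazard * prior + like_clear * (1 - prior)
    return like_hazard * prior / evidence


@dataclass
class Branch:
    """Branching node: a prior, some leaves, and an explicit rule."""
    prior: float
    leaves: List[Leaf]
    brake_threshold: float

    def decide(self, frame: dict) -> Tuple[str, float]:
        belief = self.prior
        for leaf in self.leaves:
            # Assumes leaves are conditionally independent given the hazard state.
            belief = bayes_update(belief, leaf, leaf.detect(frame))
        action = "brake" if belief >= self.brake_threshold else "continue"
        return action, belief


camera = Leaf("camera", lambda f: f["camera_hit"], 0.9, 0.1)
second_camera = Leaf("second_camera", lambda f: f["second_hit"], 0.9, 0.1)
node = Branch(prior=0.01, leaves=[camera, second_camera], brake_threshold=0.4)

action, belief = node.decide({"camera_hit": True, "second_hit": True})
print(action, round(belief, 3))
```

Unlike the detectors themselves, the branching node can account for its decision: the prior, each update, and the threshold that triggered the action are all available for inspection.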
As for the future, Hou is excited about TVM, a newly released submodule of MXNet, and TuSimple’s contributions to it. Driverless trucks are fast becoming a reality, and as deep learning matures, it will reduce barriers to entry and spark even more innovations.