Built by Jess Dixon of Andalusia, Alabama. Can fly forward, backward, straight up, or hover in the air. Circa 1940. (source: Kobel Feature Photos on Wikimedia Commons).

Ask a random person for an example of an AI system and chances are he or she will name self-driving vehicles. In this episode of the O’Reilly Data Show, I sat down with Shaoshan Liu, co-founder of PerceptIn and previously the senior architect (autonomous driving) at Baidu USA. We talked about the technology behind self-driving vehicles, their reliance on rule-based decision engines, and deploying large-scale deep learning systems.

Here are some highlights from our conversation:

Advanced sensors for mapping, localization, and obstacle avoidance

The first part is sensing. How do you gather data about the environment? You have different types of sensors. The main type of sensor used in today's autonomous driving is LIDAR, a laser-based radar. A main problem with LIDAR is cost. However, there are startups that are working on low-cost LIDAR systems. Then, of course, there is GPS, and in addition there is a sensor called the inertial measurement unit (IMU). People today usually combine the data from GPS, IMU, and LIDAR to localize the vehicle to centimeter accuracy.
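
To make the idea of combining sensors concrete, here is a minimal one-dimensional sketch of a Kalman-style filter. It is an illustration under simplifying assumptions, not the pipeline any particular vehicle uses: the `Estimate` class, the function names, and all noise values are hypothetical, and the car is assumed to move along a straight line. The IMU-derived velocity pushes the estimate forward, and each absolute position fix (GPS, or a LIDAR match against a map) pulls it back toward the measurement in proportion to how much that sensor is trusted.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    position: float  # meters along a straight road (assumed 1-D world)
    variance: float  # uncertainty of the position estimate, in m^2


def predict(est: Estimate, velocity: float, dt: float, process_var: float) -> Estimate:
    """Dead-reckon forward with an IMU-derived velocity; uncertainty grows."""
    return Estimate(est.position + velocity * dt, est.variance + process_var)


def update(est: Estimate, measured_pos: float, meas_var: float) -> Estimate:
    """Blend in an absolute position fix, weighted by relative uncertainty."""
    gain = est.variance / (est.variance + meas_var)
    return Estimate(
        est.position + gain * (measured_pos - est.position),
        (1.0 - gain) * est.variance,
    )


# Hypothetical readings: one IMU step, one noisy GPS fix, one precise LIDAR match.
est = Estimate(position=0.0, variance=4.0)
est = predict(est, velocity=10.0, dt=0.1, process_var=0.01)
est = update(est, measured_pos=1.2, meas_var=1.0)      # GPS, about 1 m std dev
est = update(est, measured_pos=1.02, meas_var=0.0004)  # LIDAR, about 2 cm std dev
```

After the LIDAR update the variance is smaller than that of any single measurement, which is the intuition behind fusing several sensors to reach centimeter-level accuracy.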

There's one more sensor, a radar, used for obstacle avoidance. It's a reactive mechanism. If all of the above sensors fail to recognize that there's an object in front of you, this radar can still detect objects 5 to 10 meters away. It is hooked up directly to the control system, so when it detects an object in front of you, it can drive the car away from the object autonomously.
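
A reactive mechanism of this kind can be sketched as a direct mapping from a radar range reading to a control command, with no planning in between. The sketch below is a simplification and an assumption on my part: it only brakes rather than steering away, and the thresholds and the `reactive_brake` name are hypothetical, chosen to match the 5 to 10 meter range mentioned above.

```python
from typing import Optional

BRAKE_DISTANCE_M = 10.0  # assumed outer edge of the reactive range
STOP_DISTANCE_M = 5.0    # assumed distance at which braking is full


def reactive_brake(radar_range_m: Optional[float]) -> float:
    """Return a brake command in [0, 1] from one forward radar range.

    None means the radar currently reports no object ahead.
    """
    if radar_range_m is None or radar_range_m >= BRAKE_DISTANCE_M:
        return 0.0
    if radar_range_m <= STOP_DISTANCE_M:
        return 1.0
    # Ramp linearly from no braking at 10 m to full braking at 5 m.
    return (BRAKE_DISTANCE_M - radar_range_m) / (BRAKE_DISTANCE_M - STOP_DISTANCE_M)
```

Because the function depends only on the latest reading, it keeps working even when the localization and perception stages upstream have failed, which is the point of wiring the radar straight into the control system.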

Sophisticated machine learning pipelines for perception

To me, perception has three major components. The first component is how you localize your vehicle, and then based on localization information, you can make decisions about where to navigate. The second component is object recognition. Here, deep learning technology is commonly used to take camera data and recognize the objects around your vehicle. The third component is object tracking. You might be in a car on a highway, for example. You want to know what the car next to you is doing. … A deep learning-based object-tracking mechanism is what you would normally use to track the car or the objects next to you.

Largely rule-based decision engines

The decision pipeline normally includes a few major components. The first one is path planning. How do you want to go from point A to point B and plan your path? How do you issue instructions to the vehicle to go from point A to point B? There are many research papers and algorithms on route planning; the well-known A* search algorithm is one example.
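
For readers who have not seen A* before, here is a minimal sketch on a small occupancy grid. It is a textbook version under stated assumptions (4-connected moves, unit step cost, Manhattan-distance heuristic), not a description of how a production vehicle plans routes; the grid and the `a_star` function are hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def a_star(grid: List[List[int]], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Shortest 4-connected path on a grid where 1 marks a blocked cell."""

    def heuristic(cell: Cell) -> int:
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from: Dict[Cell, Cell] = {}
    best_cost: Dict[Cell, int] = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:  # walk back to the start
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was found later
        row, col = cell
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            r, c = nxt
            if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
                continue
            if grid[r][c] == 1:
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                came_from[nxt] = cell
                heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None  # goal unreachable


grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = a_star(grid, (0, 0), (2, 0))
```

In this example the wall in the middle row forces the path around the right-hand side of the grid. Real route planners work on road graphs with travel-time costs rather than grid cells, but the search structure is the same.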

The second part is prediction. We discussed that as part of the perception pipeline—there's object tracking to track nearby objects. Then, we have a prediction algorithm based on the tracking results. The algorithm measures the likelihood of crashing into or avoiding nearby objects. Based on these predictions, we derive the object- or obstacle-avoidance decisions. How do we drive away from these obstacles or moving objects such that we don't get into an accident?
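
One simple way to turn tracking results into a prediction is to assume every object keeps its current velocity and compute when and how closely it will pass. This is a deterministic stand-in for the likelihood estimate described above, not the method discussed in the interview; the function names, the 2 meter safety distance, and the 3 second horizon are all assumptions made for illustration.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def closest_approach(rel_pos: Vec, rel_vel: Vec) -> Tuple[float, float]:
    """Time (s) and distance (m) of closest approach under constant velocity.

    Inputs are the tracked object's position and velocity relative to our car.
    """
    px, py = rel_pos
    vx, vy = rel_vel
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        return 0.0, math.hypot(px, py)  # no relative motion
    # Minimize |rel_pos + rel_vel * t|; clamp to the future (t >= 0).
    t = max(0.0, -(px * vx + py * vy) / speed_sq)
    return t, math.hypot(px + vx * t, py + vy * t)


def should_avoid(rel_pos: Vec, rel_vel: Vec,
                 safe_distance_m: float = 2.0, horizon_s: float = 3.0) -> bool:
    """Flag objects predicted to come too close within the planning horizon."""
    t, distance = closest_approach(rel_pos, rel_vel)
    return t <= horizon_s and distance < safe_distance_m
```

For instance, a car 20 meters directly ahead and closing at 10 m/s is flagged, while the same car offset by 5 meters to the side, or moving away, is not.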

Today, you’ll find largely rule-based engines, but there are many research projects on the use of reinforcement learning and deep learning networks to make autonomous decisions about prediction, obstacle avoidance, path planning, and so on.
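
To illustrate what "rule-based" means here, the sketch below encodes a decision engine as an ordered list of condition and action pairs, where the first matching rule wins. Everything in it is hypothetical: the `Scene` fields, the thresholds, and the action names are invented for illustration and do not come from any real system.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Scene:
    obstacle_distance_m: float  # distance to the nearest object ahead
    closing_speed_mps: float    # positive when the gap is shrinking
    lane_clear_left: bool


def closing_fast(scene: Scene) -> bool:
    """True when the gap would close in under 3 seconds (assumed threshold)."""
    return (scene.closing_speed_mps > 0
            and scene.obstacle_distance_m / scene.closing_speed_mps < 3.0)


Rule = Tuple[Callable[[Scene], bool], str]

# Rules are checked top to bottom, so order encodes priority.
RULES: List[Rule] = [
    (lambda s: s.obstacle_distance_m < 5.0, "emergency_brake"),
    (lambda s: closing_fast(s) and s.lane_clear_left, "change_lane_left"),
    (closing_fast, "slow_down"),
]


def decide(scene: Scene) -> str:
    for condition, action in RULES:
        if condition(scene):
            return action
    return "keep_lane"  # default when no rule fires
```

The appeal of this style is that every decision can be traced to a specific rule; the drawback is that the rule list must anticipate every situation by hand, which is part of why learned approaches are being researched.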

Related resources: