From determining the most convenient rider pickup points to predicting the fastest routes, Uber uses data-driven analytics to create seamless trip experiences.
Uber’s analysts and engineers wanted to run real-time analytics with deep learning models, but copying data from one source to another is expensive.
Zhenxiao Luo explains how Uber supports real-time analytics with deep learning on the fly, without copying any data. He starts with the company’s big data infrastructure, specifically Hadoop, Spark, and Presto, and discusses how Uber uses Presto as an interactive SQL engine and has deployed Hadoop Distributed File System, Pinot, MySQL, and Elasticsearch as storage solutions. He then details how Uber built a Presto Elasticsearch connector from scratch to support real-time analytics on heterogeneous data. He concludes by sharing the company’s production experience and roadmap.
This session was recorded at the 2019 O'Reilly Strata Data Conference in San Francisco.
- Title: Real-time analytics at Uber: Bring SQL into everything
- Release date: October 2019
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 0636920339793