Spark is one of today’s most popular distributed computation engines for processing and analyzing big data. This hands-on course gives data engineers, data scientists, and data analysts interested in data streaming practical experience in using Spark. You’ll learn about the Spark Structured Streaming API, the powerful Catalyst query optimizer, the Tungsten execution engine, and more as you build several small applications that leverage all aspects of Spark 2.0. Scala experience is not required, but the course works best for those who have some.
Michael Li is the founder of The Data Incubator, which provides big data corporate training and a selective eight-week fellowship for PhDs transitioning into industry. Previously, he worked as a data scientist, software engineer, and researcher at Foursquare, Google, Andreessen Horowitz, J.P. Morgan, and NASA. He is a regular contributor to VentureBeat, The Next Web, and Harvard Business Review. Michael earned his PhD at Princeton and was a Marshall Scholar at Cambridge.