Strata + Hadoop World San Jose 2017 gathered 325 of the globe's leading minds in technology and business to demonstrate how big data, machine learning, and analytics are changing not only business, but society itself. This video compilation provides a complete recording of each of the conference's 167 technical sessions, 23 long-form tutorials, and 17 keynotes. Some of the featured speakers you'll hear from include: Confluent CEO Jay Kreps on stream processing and its impact on how businesses deal with real-time data; Amazon Ad platform leader Alice Zheng on the best feature engineering methods for machine learning pipelines; Eric Colson of Stitch Fix on how to build a great data science team; DataVisor CEO Yinglian Xie on how Spark's in-memory big data security analytics can identify nefarious sleeper cells; and Pinterest Chief Scientist Jure Leskovec on Pixie, the graph-based system that makes personalized recommendations to 100+ million users in real time.
Get this compilation and you'll enjoy unfettered access to the Strata Business Summit and a set of 29 carefully curated sessions specifically tailored for the C-level business executive. Taught by top data strategists and thinkers at Silicon Valley Data Science, MapR Technologies, LinkedIn, Unisys, UC Berkeley, and Deloitte Touche Consulting, and by VCs at Kleiner Perkins Caufield & Byers, the Summit is like an MBA in data-driven business. You'll receive a hand-picked lineup of executive briefings on key issues, such as predictive analytics and machine learning, cloud strategy, governance, security and privacy, IoT, and artificial intelligence.
The 23 tutorials included in the compilation cover big data topics such as a review of Apache Spark 2.0 core concepts; an exploration of stream processing from the basics through Apache Beam; a practical look at how to do scalable, end-to-end data science in R on single machines and on Spark clusters; overviews of how to get started in TensorFlow, architect a data platform, work with Scala and Spark, build data applications in AWS, build a data pipeline with Kafka, and secure your Hadoop clusters; and a look at how to visualize large, complex datasets with R, Hadoop, and Spark. Each of the conference's 17 keynote sessions is included, as well as all of the 167 specialized sessions, covering topics such as PyTorch, a flexible and intuitive framework for deep learning; Docker on YARN; Spark Structured Streaming; the Netflix data platform; RubiX, a caching framework for big data engines in the cloud; Stanford University's Weld, an optimizing runtime for high-performance data analytics; and much, much more.