Learn how Spark 2.3.0+ integrates with Kubernetes (K8s) clusters on Google Cloud and Azure.
Holden Karau is a transgender Canadian open source developer advocate at Google focusing on Apache Spark, Beam, and related big data tools. Previously, she worked at IBM, Alpine, Databricks, Google (yes, this is her second time), Foursquare, and Amazon. Holden is the coauthor of Learning Spark, High Performance Spark, and another Spark book that’s a bit more out of date. She is a committer and PMC member for Apache Spark and a committer on the Apache SystemML and Apache Mahout projects. When not in San Francisco, Holden speaks internationally about different big data technologies (mostly Spark). She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal. Outside of work, she enjoys playing with fire, riding scooters, and dancing. You can follow her on Twitter (@holdenkarau), Twitch (holdenkarau), and YouTube.
How to use the wordcount example as a starting point (and you thought you’d escape the wordcount example).
Early methods to integrate machine learning using Naive Bayes and custom sinks.