Spark ML provides a rich set of tools and models for training, scoring, evaluating, and exporting machine learning models. This video walks you through each step in the process. You'll explore the basics of Spark's DataFrame, Transformer, Estimator, Pipeline, and Parameter abstractions, and how to use the Spark API to create model uniformity and comparability. You'll learn how to create meaningful models and labels from a raw dataset; train and score a variety of models; target price predictions; compare results using MAE, MSE, and other scores; and employ the Spark ML evaluator to automate the parameter-tuning process using cross-validation. To complete the lesson, you'll learn to export and serialize a Spark-trained model as PMML (an industry standard for model serialization), so you can deploy it in applications outside the Spark cluster environment.
- Gain hands-on experience in training, scoring, evaluating, and exporting machine learning models
- Understand how to utilize the Spark API to create model uniformity and comparability
- Explore feature extraction, training, scoring, and hyper-parameter tuning using Spark ML
- Understand how to use a model trained in Spark and deploy it in other applications
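As a rough illustration of the score comparison mentioned in the description, the sketch below computes MAE (mean absolute error) and MSE (mean squared error) for two sets of price predictions in plain Python. It is not taken from the video and does not use Spark; the function names and the sample prices are made up for the example.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical prices and predictions from two hypothetical models.
actual = [200.0, 150.0, 320.0, 275.0]
model_a = [210.0, 140.0, 300.0, 280.0]  # small errors on every row
model_b = [200.0, 150.0, 280.0, 275.0]  # exact on three rows, one large miss

for name, predicted in (("model_a", model_a), ("model_b", model_b)):
    mae = mean_absolute_error(actual, predicted)
    mse = mean_squared_error(actual, predicted)
    print(f"{name}: MAE={mae} MSE={mse}")
```

With these made-up numbers, model_b has the lower MAE while model_a has the lower MSE, because squaring penalizes the single large miss more heavily. This is one reason to compare models on more than one score.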
Hollin Wilkins is the cofounder of Combust, Inc., an ML/AI start-up in the SF Bay Area. A data scientist and software engineer formerly with True Car, Hollin has worked with machine learning, high-performance microservices, and software development since 2011.
Jason Slepicka is a senior data engineer with DataScience, where he builds pipelines and data science platform infrastructure. Jason is working on his PhD in Computer Science at the University of Southern California Information Sciences Institute.
- Title: Training and Exporting Machine Learning Models in Spark
- Release date: December 2017
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491988824