Video description
Quick, no nonsense. What more can you wish?
Jonathan Rioux, Senior Analyst
Spark in Motion teaches you how to use Spark for batch and streaming data analytics. In nearly 3 hours of hands-on video lessons, you'll get up and running with Spark, starting with the basic architecture of a Spark application. You'll explore data partitioning and accessing common application state, and then you'll dive deep into using Spark SQL and DataFrames for structured analytics. Finally, you'll use Spark Streaming to handle and process real-time data flowing into your application.
When you're doing analytics on big data systems, it can be a challenge to efficiently query, stream, filter, and consolidate data sharded across a cluster. Built especially for efficiently operating over large distributed datasets, the Spark data processing engine takes some of the weight off your shoulders. Spark features an easy-to-use interface, near-limitless upgrade potential, and performance that will knock your socks off. Spark simplifies your data infrastructure so you can focus on creating top-notch analytics.
Inside:
- Exploring the Spark Ecosystem
- Deploying Spark on a cluster
- Analytics with Spark SQL
- Real-time applications with Spark Streaming
Jason Kolter is an instructor for the University of Washington certificate program in Big Data Technologies. He has also worked at a wide range of technology companies, gaining extensive experience leading teams that build large-scale distributed analytics systems for production.
Best course I have seen so far.
Peter J. Hampton, AI Researcher
Spark is a very valuable library, but it's very hard to use (the learning step is very steep). This video course makes the learning smoother, and takes the users to a place where they can experiment by themselves.
Alberto Boschetti, Data Scientist
Table of contents
AN INTRODUCTION TO APACHE SPARK
- What is Spark? 00:04:45
- Exploring the Spark ecosystem 00:06:26
- Functional programming using the Spark shell 00:08:48
- Rich programming using notebooks 00:06:24
- Using RDDs part 1: Features and creating/loading 00:08:06
- Using RDDs part 2: Transformations and actions 00:08:19
- Spark application architecture 00:06:22
- Summary 00:01:49
BUILDING REALISTIC SPARK APPLICATIONS
- Deploying Spark on a cluster 00:07:11
- Scaling Spark applications 00:08:58
- Making iterative applications fly 00:06:43
- Accessing common application state 00:04:42
- Configuring the Spark runtime 00:06:05
- Monitoring and metrics with the Spark Web UI 00:04:52
- Summary 00:01:12
ADVANCED ANALYTICS WITH SPARK SQL AND DATASETS
- Creating and using datasets 00:05:30
- Structured processing using Spark SQL 00:05:27
- Bringing SQL to Spark with the DataFrame API 00:05:26
- Working with Spark SQL data sources 00:04:32
- Interactive queries with the Spark SQL server 00:03:44
- Summary 00:01:01
LOW LATENCY APPLICATIONS WITH SPARK STREAMING
- What is a streaming application? 00:03:32
- Understanding Spark Streaming 00:04:48
- Programming Spark Streaming 00:05:24
- Spark Streaming data sources 00:05:35
- What is Structured Streaming? 00:07:22
- Building continuous applications using Structured Streaming 00:07:20
- Summary and course wrap-up 00:01:54
APPENDICES
- Installing Spark 00:03:19
- Installing Jupyter Notebook 00:05:04
Product information
- Title: Spark in Motion
- Author(s): Jason Kolter
- Release date: March 2019
- Publisher(s): Manning Publications
- ISBN: None