Book description
- Understand the Spark unified data processing platform
- Run Spark in the Spark shell or Databricks
- Use and manipulate RDDs
- Deal with structured data using Spark SQL through its operations and advanced functions
- Build real-time applications using Spark Structured Streaming
- Develop intelligent applications with the Spark Machine Learning library
Product information
- Title: Beginning Apache Spark 2: With Resilient Distributed Datasets, Spark SQL, Structured Streaming and Spark Machine Learning library
- Author(s):
- Release date: August 2018
- Publisher(s): Apress
- ISBN: 9781484235799