November 2024
Intermediate to advanced
306 pages
7h 57m
English
Data processing is an essential step in data analysis and machine learning: it transforms, cleans, and integrates raw data into a format suitable for downstream analysis. This chapter introduces the basic concepts and principles of data processing and works through practical examples and use cases of data preprocessing with Apache Spark.
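To make the three activities named above concrete, here is a minimal, Spark-free sketch in plain Python. It is not code from this chapter: the records, field names, and helper functions (`clean`, `transform`, `integrate`) are all hypothetical and chosen only for illustration. In Spark, the same steps would typically be expressed as DataFrame operations rather than list comprehensions.

```python
# Hypothetical raw records; names and values are illustrative only.
raw_orders = [
    {"order_id": "1", "customer_id": "c1", "amount": " 19.90 "},
    {"order_id": "2", "customer_id": "c2", "amount": None},  # missing amount
    {"order_id": "3", "customer_id": "c1", "amount": "5.10"},
]
customers = {"c1": "Alice", "c2": "Bob"}


def clean(records):
    """Cleaning: drop records whose amount is missing."""
    return [r for r in records if r["amount"] is not None]


def transform(records):
    """Transforming: strip whitespace and cast string fields to proper types."""
    return [
        {
            "order_id": int(r["order_id"]),
            "customer_id": r["customer_id"],
            "amount": float(r["amount"].strip()),
        }
        for r in records
    ]


def integrate(records, customer_names):
    """Integrating: enrich each order with the customer's name (a lookup join)."""
    return [
        {**r, "customer": customer_names.get(r["customer_id"])}
        for r in records
    ]


processed = integrate(transform(clean(raw_orders)), customers)
for row in processed:
    print(row)
```

Keeping each step as its own function mirrors how a processing pipeline is usually composed: each stage takes records in and returns records out, so stages can be reordered or tested in isolation.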
In this chapter, we will cover the following topics:
By the end of this chapter, you will be familiar with the main data processing techniques available in Spark.
You can find the code files for this chapter on GitHub ...