February 2022
Intermediate to advanced
574 pages
10h 50m
English
Welcome to the next major section of the book. In this section, we will focus on designing and developing data processing systems.
In the last chapter, we learned how to implement the serving layer and saw how to share data between services such as Synapse SQL and Spark using metastores. In this chapter, we will focus on data transformation: the process of converting data from its raw format into a more useful format that downstream tools and projects can use. Once you complete this chapter, you will be able to read data in different file formats and encodings, perform data cleansing, and run transformations using services such as Spark, SQL, and Azure Data Factory (ADF).
We will cover ...