Chapter 2: Creating Robust Data Pipelines and Data Transformation

In this chapter, we will cover how to load and enrich data using Apache Spark in Azure Synapse Analytics. The recipes introduce the concepts needed to work with Spark data frames: reading data from Azure Data Lake Storage (ADLS) and writing it to a SQL pool using PySpark.

This chapter comprises the following recipes:

Reading and writing data from ADLS Gen2 using PySpark
Visualizing data in a Synapse notebook

Reading and writing data from ADLS Gen2 using PySpark

Azure Synapse can use Apache Spark to read and write data in files stored in ADLS Gen2. You can read different file formats from Azure Storage with ...
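As a rough illustration of what reading and writing ADLS Gen2 files from PySpark can look like, here is a minimal sketch. It is not the recipe from the book. The helper names `abfss_path` and `copy_csv_to_parquet`, and the container, account, and file names in the usage comments, are hypothetical. The sketch assumes a Spark session named `spark` is already available (as it typically is in a Synapse notebook) and that the source file is a CSV with a header row.

```python
def abfss_path(container: str, account: str, relative_path: str) -> str:
    """Build an ADLS Gen2 URI of the form
    abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    return (
        f"abfss://{container}@{account}.dfs.core.windows.net/"
        f"{relative_path.lstrip('/')}"
    )


def copy_csv_to_parquet(spark, source_path: str, target_path: str):
    """Read a CSV file into a Spark data frame and write it back as Parquet.

    `spark` is an existing SparkSession passed in by the caller.
    Assumes the CSV has a header row; existing output is overwritten.
    """
    df = spark.read.option("header", "true").csv(source_path)
    df.write.mode("overwrite").parquet(target_path)
    return df


# Hypothetical usage inside a notebook where `spark` already exists:
# src = abfss_path("raw", "mystorageacct", "sales/2021.csv")
# dst = abfss_path("curated", "mystorageacct", "sales/2021_parquet")
# df = copy_csv_to_parquet(spark, src, dst)
# df.printSchema()
```

Keeping the path construction separate from the Spark calls makes it easy to check the URI on its own before running anything against storage.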


Azure Synapse Analytics Cookbook by Gaurav Agarwal, Meenakshi Muralidharan
