Chapter 5. Construct the Bronze Layer
Having established the foundation of your data platform, whether on Microsoft Fabric or Azure Databricks, it’s time to build the Bronze layer. This is where all raw data first lands and is kept in its original form. The layer serves both as a historical archive and as a reliable single source of raw data for the layers that follow.
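To make the "land it unchanged" idea concrete, here is a minimal sketch in plain Python. It is an illustration only: the standard library stands in for Spark and Delta Lake, and the function name `land_in_bronze`, the `ingest_date=` folder convention, and the `.meta.json` sidecar file are all hypothetical choices made for this example, not something prescribed by Microsoft Fabric or Azure Databricks.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def land_in_bronze(source_file: Path, bronze_root: Path, source_system: str) -> Path:
    """Copy a raw file into a date-partitioned Bronze folder without altering it.

    Hypothetical helper: the folder layout and sidecar format are assumptions
    chosen for illustration.
    """
    ingested_at = datetime.now(timezone.utc)

    # One folder per source system and ingestion date keeps the history browsable.
    target_dir = bronze_root / source_system / f"ingest_date={ingested_at:%Y-%m-%d}"
    target_dir.mkdir(parents=True, exist_ok=True)

    # Byte-for-byte copy: no parsing, cleansing, or type conversion at this stage.
    target_file = target_dir / source_file.name
    shutil.copy2(source_file, target_file)

    # Ingestion metadata lives next to the file, so the raw payload stays untouched.
    metadata = {
        "source_system": source_system,
        "source_file": source_file.name,
        "ingested_at_utc": ingested_at.isoformat(),
        "sha256": hashlib.sha256(target_file.read_bytes()).hexdigest(),
    }
    sidecar = target_dir / (source_file.name + ".meta.json")
    sidecar.write_text(json.dumps(metadata, indent=2))

    return target_file
```

Calling `land_in_bronze(Path("orders.csv"), Path("bronze"), "erp")` would place an identical copy of the file under `bronze/erp/ingest_date=<today>/`. Recording the checksum and timestamp separately, rather than adding columns to the data, is one way to keep the landed file identical to its source while still supporting later audits.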
In setting up this first layer, you’ll configure connections, build your first data pipeline, and explore how to handle data ingestion and schema management. You’ll come across various code snippets along the way. They are there to clarify the process: some are purely for learning, and others you can use in your own coding exercises. Keep in mind that these examples are streamlined for educational purposes, so you may need to adapt them for real-world scenarios.
By the end of this chapter, you will understand how to build and implement the Bronze layer of your Medallion architecture, including the nuances of ingesting and managing data in that layer. This solid base will prepare you for the subsequent Silver and Gold stages. Let’s start by building the data pipeline.
Building the Data Pipeline
In this section, we will construct a data pipeline using Data Factory,1 while integrating Spark and Delta Lake into the process. This hands-on journey will equip you with the skills to understand how these tools interconnect ...