Chapter 2: Introduction to Important AWS Glue Features

In the previous chapter, we discussed the evolution of data management strategies, such as data warehouses, data lakes, data lakehouses, and data meshes, and the key differences between them. We introduced the Apache Spark framework, briefly discussed how Spark executes workloads, learned how Spark workloads can be run on the AWS cloud, and introduced AWS Glue and its components.

In this chapter, we will discuss the different components of AWS Glue and how each can be used to perform data integration tasks.

Upon completing this chapter, you will be able to define data integration and explain how AWS Glue can be used for this. You ...
