Chapter 13: Data Analysis

In the previous chapter, we looked at the various categories of Glue job exception messages, why they occur, and how to handle them.

We learned how data skew can adversely affect job execution and which techniques you can use to fix it. Additionally, we looked at some of the common causes of out-of-memory (OOM) errors and the out-of-the-box mechanisms available in AWS Glue to handle them. Some of these tools and techniques can help you use resources more efficiently in a pay-as-you-go, cloud-native world. They not only make processing more efficient but also help you reduce processing time in a world that increasingly needs answers as quickly as ...

