4 Getting data into the platform
This chapter covers
- Understanding databases, files, APIs, and streams
- Ingesting data from RDBMSs using SQL versus change data capture
- Parsing and ingesting data from various file formats
- Developing strategies to deal with source schema changes
- Designing an ingestion pipeline to handle the challenges of data streams
- Building an ingestion pipeline for SaaS data
- Implementing quality control and monitoring in your ingestion pipeline
- Discussing network and security considerations for cloud data ingestion
If you’ve read the chapters up to this point, you’re able to architect a good, layered data lake. Now it’s time to start diving into a few of these layers in much greater detail.
In this chapter, we’ll focus on ...