Table of Contents
Preface
Part 1: Upstream Data Ingestion and Cleaning
1. Data Ingestion Techniques
Technical requirements
Ingesting data in batch mode
Advantages and disadvantages
Common use cases for batch ingestion
Batch ingestion use cases
Batch ingestion with an example
Ingesting data in streaming mode
Advantages and disadvantages
Common use cases for streaming ingestion
Streaming ingestion in an e-commerce platform
Streaming ingestion with an example
Real-time versus semi-real-time ingestion
Common use cases for near-real-time ingestion
Semi-real-time mode with an example
Data source solutions
Event data processing solution
Ingesting event data with Apache Kafka
Ingesting data from databases
Performing data ingestion from cloud-based file ...
Get Python Data Cleaning and Preparation Best Practices now with the O’Reilly learning platform.