8. Ingestion from databases
This chapter covers
- Ingesting data from relational databases
- Understanding the role of dialects in communication between Spark and databases
- Building advanced queries in Spark to address the database prior to ingestion
- Understanding advanced communication with databases
- Ingesting from Elasticsearch
In big data and enterprise contexts, relational databases are often the source of the data on which you will perform analytics. It makes sense to understand how to extract data from those databases, either by reading a whole table or through SQL SELECT statements.
In this chapter, you’ll learn several ways to ingest data from those relational databases, ingesting either the full table at once or asking the database to ...
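As a rough illustration of the two approaches mentioned above (reading a whole table versus letting the database run a SELECT first), here is a minimal PySpark-oriented sketch. The helper names `jdbc_read_options` and `ingest`, the alias `src`, and the connection values in the usage note are hypothetical and chosen only for illustration. The sketch relies on the `dbtable` option of Spark's JDBC data source, which accepts either a table name or a parenthesized subquery.

```python
from typing import Dict, Optional


def jdbc_read_options(url: str, user: str, password: str,
                      table: Optional[str] = None,
                      select: Optional[str] = None) -> Dict[str, str]:
    """Build an option map for Spark's JDBC data source (hypothetical helper).

    Exactly one of `table` (whole-table ingestion) or `select`
    (a SQL SELECT executed by the database) must be provided.
    """
    if (table is None) == (select is None):
        raise ValueError("provide exactly one of table or select")
    options = {"url": url, "user": user, "password": password}
    if table is not None:
        # Whole table: pass the table name directly.
        options["dbtable"] = table
    else:
        # SELECT statement: wrap it as an aliased subquery so the database
        # filters or joins before Spark receives the rows.
        options["dbtable"] = f"({select}) src"
    return options


def ingest(spark, options: Dict[str, str]):
    """Load a DataFrame over JDBC.

    Assumes `spark` is an existing SparkSession and that the matching
    JDBC driver is available on Spark's classpath.
    """
    return spark.read.format("jdbc").options(**options).load()
```

With a running session, something like `ingest(spark, jdbc_read_options("jdbc:mysql://localhost:3306/demo", "user", "secret", table="actor"))` would read a whole table, while passing `select="SELECT ..."` instead would ingest only the rows that query returns. The URL, credentials, and table name here are placeholders, not values from the book.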