June 2020
Intermediate to advanced
576 pages
15h 41m
English
This chapter covers
In many use cases, I have had to get data from nontraditional data sources to use in Apache Spark. Imagine that your data is in an enterprise resource planning (ERP) package, and you want to ingest it via the ERP’s REST API. Of course, you could create a standalone application that dumps all the data into a CSV or JSON file and then ingest the file or files, but you don’t want to deal with the life cycle of each file. When will you be able to delete it? Who has access to it? Can the disk be full ...