May 2017
Beginner to intermediate
596 pages
15h 2m
English
A spool file, in simple terms, is a file containing data to be processed. Most of the time, such files contain delimited information (fields separated by a character) and are read line by line for processing, with each line representing one record. Optionally, each of these lines may also contain an XML or JSON data structure.
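As a rough illustration of the line-per-record idea, the following Python sketch reads a delimited spool file one line at a time. The pipe delimiter, the function name `read_spool_records`, and the sample field layout are assumptions made for this example only; they are not taken from any particular system.

```python
import csv
import io


def read_spool_records(stream, delimiter="|"):
    """Yield one record (a list of field values) per non-empty line.

    The pipe delimiter is an assumption; real spool files may use commas,
    tabs, or any other separator agreed with the producing system.
    """
    reader = csv.reader(stream, delimiter=delimiter)
    for fields in reader:
        if fields:  # csv.reader returns an empty list for blank lines
            yield fields


# Hypothetical sample data standing in for a spool file on disk.
sample = io.StringIO(
    "1|alice|alice@example.com\n"
    "2|bob|bob@example.com\n"
)
records = list(read_spool_records(sample))
```

For a real file, the same function could be given a handle from `open(path, newline="")` in place of the in-memory sample.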
Spool files emitted by other systems are one possible source of data. They may contain user data, and they can serve as integration points into the Data Lake.
The Flume framework supports a number of spool format variations. Here we consider the most common one, in which the spool file contains data as JSON messages.
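To make the JSON variant concrete, here is a minimal Python sketch that treats each line of a spool file as one JSON message. It assumes one complete JSON object per line; the function name `read_json_spool` and the sample fields are hypothetical, and the sketch says nothing about how Flume itself parses such files.

```python
import io
import json


def read_json_spool(stream):
    """Yield one decoded JSON message per non-blank line.

    Assumes each line holds exactly one complete JSON document.
    """
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines between messages
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Report the offending line so a bad record can be located.
            raise ValueError(f"line {line_number} is not valid JSON") from exc


# Hypothetical sample data standing in for a spool file on disk.
sample = io.StringIO(
    '{"id": 1, "name": "alice"}\n'
    "\n"
    '{"id": 2, "name": "bob"}\n'
)
messages = list(read_json_spool(sample))
```

Raising on a malformed line is one possible design choice; a pipeline that must not stop on bad input might instead log the line and continue.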
A spool source may be configured with the following steps: ...