December 2018
Intermediate to advanced
318 pages
8h 28m
English
The raw or aggregated data from data collectors is stored in data stores such as SQL databases, NoSQL databases, data warehouses, and distributed file systems like HDFS. If the data is unstructured, it may need cleaning and preparation before use. The format in which the data arrives varies: database dumps, JSON files, Parquet files, Avro files, and even flat files. In distributed storage systems such as HDFS, ingested data is split into blocks and distributed across the nodes of the cluster.
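To make the cleaning and preparation step concrete, here is a minimal sketch that loads records from two of the formats mentioned above (a JSON dump and a flat CSV file) and normalizes them. It uses only the Python standard library. The function names `load_records` and `clean` and the sample data are hypothetical, chosen for illustration. Parquet and Avro are left out because reading them requires third-party libraries.

```python
import csv
import io
import json


def load_records(text, fmt):
    """Parse raw text in the given format into a list of dict records."""
    if fmt == "json":
        data = json.loads(text)
        # Accept either a single object or an array of objects.
        return data if isinstance(data, list) else [data]
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    raise ValueError(f"unsupported format: {fmt}")


def clean(record):
    """Strip whitespace from keys and string values; map empty strings to None."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip() or None
        cleaned[key.strip()] = value
    return cleaned


# Hypothetical sample inputs standing in for files received from collectors.
json_dump = '[{"id": 1, "city": " Pune "}, {"id": 2, "city": ""}]'
csv_dump = "id,city\n3, Oslo \n4,\n"

records = [clean(r) for r in load_records(json_dump, "json")]
records += [clean(r) for r in load_records(csv_dump, "csv")]
```

Note that CSV values stay as strings (the `id` from the flat file is `"3"`, not `3`), so a real pipeline would usually add a type-casting step before loading the records into a data store.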
Some of the data stores widely used in industry are: