Architecting the Industrial Internet

by Robert Stackowiak, Carla Romano, Shyam Varan Nath
September 2017
Intermediate to advanced
360 pages
9h 43m
English
Packt Publishing
Content preview from Architecting the Industrial Internet

Data lakes

A data lake is a repository that holds data in its native format and can contain data of all types: structured, semi-structured, and unstructured, with or without a defined schema. Its purpose is to serve as a single store for all analyzable data. This includes raw and transformed structured data from applications and relational systems, as well as semi-structured data such as document collections (for example, email), logs, clickstreams, device data, geolocation trails, social media, and weather data, stored using HDFS. Unstructured data such as images, video, and audio can also be included. Data can simply be loaded into the data lake as-is, without any up-front integration or transformation.

Data stored in its native format can later be parsed for analysis. It can serve as a staging area for ...
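
The idea of landing data in its native format and applying structure only when it is read for analysis can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the book's implementation: a local temporary directory stands in for HDFS, and the names `land_raw`, `read_device_events`, and `demo`, the `raw/<source>/` folder layout, and the sample device records are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def land_raw(lake_root: Path, source: str, filename: str, payload: bytes) -> Path:
    """Store the payload byte-for-byte under <lake_root>/raw/<source>/.

    No parsing, validation, or transformation happens at write time.
    (The folder layout is an assumption made for this sketch.)
    """
    target_dir = lake_root / "raw" / source
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_bytes(payload)
    return target


def read_device_events(path: Path) -> list:
    """Apply structure at read time: parse each non-empty line as a JSON record."""
    records = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.strip():
            records.append(json.loads(line))
    return records


def demo() -> list:
    # A temporary local directory stands in for the data lake's storage layer.
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        # Hypothetical device readings, one JSON document per line.
        raw = (
            b'{"device": "pump-1", "temp_c": 71.5}\n'
            b'{"device": "pump-2", "temp_c": 64.0}\n'
        )
        path = land_raw(lake, "devices", "events.jsonl", raw)
        # The stored file is identical to what was received.
        assert path.read_bytes() == raw
        # Parsing is deferred until the data is needed for analysis.
        return read_device_events(path)


if __name__ == "__main__":
    for record in demo():
        print(record)
```

The write path knows nothing about the record format, so the same `land_raw` call would accept an image or a log file unchanged; only the reader chosen at analysis time decides how the bytes are interpreted.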

Publisher Resources

ISBN: 9781787282759