October 2018
Beginner to intermediate
496 pages
16h 24m
English
This chapter begins to examine the complexities of networking data. Understanding and preparing the data that comes from the IT infrastructure is part of the data engineering work involved in building analytics solutions. Data engineering involves setting up data pipelines that move data from its source to a centralized data environment, in a format that is ready for use by analytics tools. From there, data may be stored, shared, or streamed into dedicated environments where you perform data science analysis. In most cases, the data is also cleaned up or normalized at this layer. ETL (Extract, Transform, Load) is a carryover acronym from database systems that were commonly used at the data ...
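As a rough illustration of the extract, transform, and load stages described above, the following minimal sketch normalizes interface records from two differently shaped sources into one common schema. The record layouts, field names, and function names are all hypothetical and chosen only for illustration; they are not taken from the book or from any specific tool.

```python
import json

# Hypothetical raw records, shaped as two different sources might report them.
RAW_RECORDS = [
    {"hostname": "RTR1.example.com", "ifName": "Gi0/1", "status": "UP"},
    {"device": "sw2", "interface": "GigabitEthernet0/2", "state": "down"},
]


def extract(source):
    """Extract: yield raw records from the source (here, an in-memory list)."""
    yield from source


def transform(record):
    """Transform: map source-specific field names onto one common schema."""
    device = record.get("hostname") or record.get("device") or ""
    interface = record.get("ifName") or record.get("interface") or ""
    status = record.get("status") or record.get("state") or ""
    return {
        # Keep only the short host name, lowercased, so devices compare equal.
        "device": device.split(".")[0].lower(),
        "interface": interface,
        "status": status.lower(),
    }


def load(records, store):
    """Load: append each normalized record to the destination as a JSON line."""
    for rec in records:
        store.append(json.dumps(rec, sort_keys=True))
    return store


def run_pipeline(source):
    """Chain the three stages; a list stands in for the central data store."""
    return load((transform(r) for r in extract(source)), [])


normalized = run_pipeline(RAW_RECORDS)
```

In a real pipeline the in-memory list would typically be replaced by a collector or message bus on the extract side and a database or data lake on the load side; the list is used here only to keep the sketch self-contained.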