December 2017
Intermediate to advanced
468 pages
13h 12m
English
Big data engineering is about getting the most value out of large volumes of disparate data through data staging, profiling, and cleansing on a big data platform. It also covers efficient ways of migrating data from back-office systems to the front office, where data analysts and data scientists can work with it:

The preceding diagram shows a sample ecosystem of the big data engineering landscape. Numerous tools are available at each stage of that landscape. Examples include Hadoop, Oozie, Flume, Hive, HBase, Apache Pig, Apache Spark, MapReduce, YARN, Sqoop, ZooKeeper, text analytics tools, and so on. ...