10. Top 10 Mistakes Integrating Hadoop Data

1. Integrating Data Without a Business Purpose

It’s not enough to just do something; whatever you’re doing must truly matter. Data should be integrated so that it is captured in a structure suitable for both immediate and long-term uses. Decisions must be made during the integration process to prevent redundant data.

For example, storing data “just in case” would be foolish from a cost and maintenance standpoint. With that reasoning, things would soon get out of hand, even if an army were deployed to maintain the data. Data is usually integrated into Hadoop because both the volume of data to be stored and the pace at which that data is generated are high. Also, this does not mean the ...
