
Integrating Hadoop by Jake Dolezal, William McKnight


Chapter 10. Top 10 Mistakes Integrating Hadoop Data

1. Integrating Data Without a Business Purpose

It’s not enough to just do something; whatever you’re doing must truly matter. Integrate data because you want to capture it in a structure suitable for both immediate and long-term uses, and decide up front how to prevent redundant data from creeping into the integration process.

For example, it would be foolish from a cost and maintenance standpoint to store data “just in case.” Things would soon get out of hand with that reasoning, even if an army were deployed to maintain the data. Data is usually integrated into Hadoop because both the volume to be stored and the pace at which it is generated are high. Also, this does not mean the ...
