The data integration pattern

The data integration pattern deals with methods to integrate data from multiple sources and techniques to address data inconsistencies that arise out of this activity.

Background

This pattern discusses ways of integrating data from multiple sources. Integration can introduce inconsistencies into the combined data; for example, different sources may record the same quantity in different units of measurement. The pattern therefore also covers techniques for resolving such inconsistencies.
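The unit-of-measurement case can be illustrated with a minimal sketch. It is written in plain Python rather than Pig, and everything in it is an assumption made for illustration: the two sources, the field names (`id`, `distance`, `unit`), and the choice of metres as the common unit do not come from the book.

```python
# Hypothetical records from two sources that report the same
# quantity (distance) in different units.
source_a = [{"id": "s1", "distance": 1.5, "unit": "km"}]
source_b = [{"id": "s2", "distance": 2300.0, "unit": "m"}]

# Conversion factors to the assumed common unit (metres).
TO_METRES = {"m": 1.0, "km": 1000.0}


def normalize(record):
    """Return a new record with the distance expressed in metres."""
    factor = TO_METRES[record["unit"]]  # KeyError on an unknown unit
    return {"id": record["id"], "distance_m": record["distance"] * factor}


def integrate(*sources):
    """Combine records from all sources after normalizing their units."""
    return [normalize(record) for source in sources for record in source]


combined = integrate(source_a, source_b)
```

Normalizing each record before combining keeps the merged dataset in a single unit, so later comparisons and aggregations do not mix kilometres with metres. Raising an error on an unknown unit, instead of passing the value through, is a design choice that surfaces a new inconsistency as soon as it appears.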

Motivation

In many Big Data solutions, data commonly resides in several places, such as SQL tables, log files, and HDFS. In order to discover interesting relationships among data held in these different places, they have to be ...
