Patterns: Information Aggregation and Data Integration with DB2 Information Integrator

Chapter 3. Data Integration and Information Aggregation patterns

A Data Server/Services node is a generic data storage node that provides managed, persistent storage of any type of data and a means to directly access and manipulate that data. The data may be stored in files and accessed through file I/O routines, or it may be stored in a database with more structured and managed access methods.

The flow is as follows.

1. A requesting application queries the "federated" data source, for example, with a simple SQL SELECT request.

2. The Data Integration node processes the request and, using its metadata (which defines the data sources), passes the request on to the appropriate data sources.

   In many cases, the data integration/federation logic within the Data Integration node may be logically separate from the data connector logic. The data connector logic spreads the overhead of querying multiple data sources, allowing the queries to run in parallel against each database.

   When performance is a major concern, multiple logical data connectors may exist to process queries against a single data source, so that no single node in the process becomes a bottleneck when too many requests run against one data source.

3. In all cases, the results returned from each individual data source must then be aggregated and normalized by the data integration layer so that they appear to come from one "virtual" data source.

4. The results are then sent back to the requesting application, which is unaware that multiple data sources were involved.
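
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DB2 Information Integrator API: two in-memory SQLite databases stand in for remote data sources, and the names `SOURCES`, `METADATA`, `run_connector`, and `federated_select`, as well as the table layouts, are hypothetical. The metadata records which query to send to each source and how to normalize its rows into one common shape.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def open_source(ddl, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one remote source."""
    # check_same_thread=False lets a worker thread use this connection later.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Two hypothetical sources that store the same kind of data differently.
SOURCES = {
    "orders_eu": open_source(
        "CREATE TABLE orders (id INTEGER, amount_cents INTEGER)",
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 1250), (2, 1000)],
    ),
    "orders_us": open_source(
        "CREATE TABLE sales (order_no INTEGER, amount REAL)",
        "INSERT INTO sales VALUES (?, ?)",
        [(3, 20.0)],
    ),
}

# Hypothetical metadata: per source, the native query to run and a function
# that normalizes each native row to the virtual shape (order_id, amount).
METADATA = {
    "orders_eu": ("SELECT id, amount_cents FROM orders",
                  lambda r: (r[0], r[1] / 100)),
    "orders_us": ("SELECT order_no, amount FROM sales",
                  lambda r: (r[0], float(r[1]))),
}


def run_connector(name):
    """Data connector: run the source-specific query and normalize its rows."""
    sql, normalize = METADATA[name]
    rows = SOURCES[name].execute(sql).fetchall()
    return [normalize(row) for row in rows]


def federated_select():
    """Fan the request out to every source in parallel, then aggregate."""
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(run_connector, METADATA))
    # Merge the per-source results so they look like one virtual table.
    return sorted(row for part in parts for row in part)


if __name__ == "__main__":
    for order_id, amount in federated_select():
        print(order_id, amount)
```

The caller of `federated_select` sees a single list of `(order_id, amount)` rows and has no indication that two differently shaped sources were queried. A real federation server would also push predicates down to the sources and handle source failures, which this sketch omits.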

3.3.3 Federation: Cache variation pattern

Figure 3-5 represents the Federation: Cache variation pattern.

Note: Although omitted for simplicity of representation, an Application Server/Services node can be substituted for the Data Server/Services node where access to the data is provided through an application API rather than directly to the database management system.


Figure 3-5 Federation: Cache variation application pattern

Local temporary storage can be used to cache data returned from read-only queries to remote data sources. Under defined circumstances, this cache can be used to speed up query response time or to compensate for a data source that is temporarily offline. This capability must be used carefully, however, because the cached data may no longer be in sync with its underlying source (that is, the cache may lag behind the source).
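
A minimal sketch of this caching behavior follows, assuming a read-only fetch function supplied by the caller. The class `CachedFederatedReader`, the exception `SourceUnavailable`, and the `max_age_seconds` freshness limit are hypothetical names introduced for illustration; the pattern itself does not prescribe how "defined circumstances" are configured. The reader serves cached rows while they are within the age limit, and falls back to older cached rows, flagged as stale, only when the source cannot be reached.

```python
import time


class SourceUnavailable(Exception):
    """Raised by a (hypothetical) connector when the remote source is offline."""


class CachedFederatedReader:
    """Caches the results of read-only queries in a local temporary store."""

    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        self._fetch = fetch              # callable: query -> rows
        self._max_age = max_age_seconds  # how long a cached result counts as fresh
        self._clock = clock              # injectable so the behavior can be tested
        self._cache = {}                 # query -> (time stored, rows)

    def read(self, query):
        """Return (rows, is_stale) for a read-only query."""
        now = self._clock()
        entry = self._cache.get(query)
        if entry is not None and now - entry[0] <= self._max_age:
            # Fresh enough: answer locally to speed up response time.
            return entry[1], False
        try:
            rows = self._fetch(query)
        except SourceUnavailable:
            if entry is not None:
                # Source is offline: compensate with out-of-date cached data,
                # and tell the caller that it may be out of sync.
                return entry[1], True
            raise
        self._cache[query] = (now, rows)
        return rows, False
```

Returning the `is_stale` flag alongside the rows is one way to make the synchronization risk visible to the requesting application instead of hiding it. Note that even a "fresh" entry can lag the source by up to `max_age_seconds`.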

LEGEND:

Data sources are represented by disks in three different colors / shades:

   Blue / plain: Read/write
   Yellow / diagonal hatching: Read-only
   Green / vertical hatching: Temporary

Read/write and read-only refer only to the interaction between the overall pattern and that data source, as also indicated in most cases by annotation on the linkages. In general we may assume that the application associated with a particular data source has read/write access.

A dotted box around an application and source data indicates that the source data may need to be accessed through the owning application via its API, or may be accessed directly via a database API. In general, a dotted box around a number of components indicates that we are not specifying which of those components we are interacting with.

A beveled box represents an additional Application pattern.

A dashed line, arrow or component indicates an optional component.

[Figure 3-5 component labels: Population; Federation; Metadata; Temporary store; Application (Source / Target); Application (Source). Link annotations: read only; read/write.]


Patterns: Information Aggregation and Data Integration with DB2 Information Integrator by Nagraj Alur, YunJung Chang, Barry Devlin, Bill Mathews, John Matthews, Sreeram Potukuchi, Uday Sai Kumar, Raj Datta
