Chapter 2. Financial services business scenario
2.4.2 Step 2: Identify differences between the sources and targets
Differences between the source and target can be determined as follows:
• From active relational database catalogs, dictionaries, repositories, and other
  documentation that provide details about the metadata of the various data
  sources and targets.
  More often than not, metadata stored in sources other than the active
  relational database catalogs is out of date because it is not maintained
  regularly as systems evolve.
• From an analysis of the data itself using tools such as IBM WebSphere
  Information Analyzer. Metadata is deduced from the data and presented to
  the data analyst, who reviews each deduction and confirms or rejects it.
These approaches are complementary, and both are essential to understanding
how well the metadata definitions match the actual data content. Together, they
also enable you to keep the metadata about data sources current, which is
critical for building new systems that require data integration from existing
systems.
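The two approaches can be combined by comparing the metadata a catalog declares with the metadata the data itself suggests. The following is a minimal sketch in Python, not a depiction of how IBM WebSphere Information Analyzer works internally. The ColumnMetadata structure, the three type names, and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ColumnMetadata:
    """Metadata for one column (hypothetical structure for this sketch)."""
    data_type: str   # "integer", "decimal", "string", or "unknown"
    nullable: bool
    max_length: int


def infer_type(value):
    """Classify one non-empty string value by the narrowest type that parses."""
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        float(value)
        return "decimal"
    except ValueError:
        return "string"


def infer_metadata(values):
    """Deduce column metadata from sample values; None or "" counts as null."""
    present = [v for v in values if v not in (None, "")]
    types = {infer_type(v) for v in present}
    if not types:
        data_type = "unknown"          # no data to deduce a type from
    elif types <= {"integer"}:
        data_type = "integer"
    elif types <= {"integer", "decimal"}:
        data_type = "decimal"
    else:
        data_type = "string"
    return ColumnMetadata(
        data_type=data_type,
        nullable=len(present) < len(values),
        max_length=max((len(v) for v in present), default=0),
    )


def compare(declared, inferred):
    """List discrepancies for a data analyst to confirm or reject."""
    findings = []
    if declared.data_type != inferred.data_type:
        findings.append(
            f"type: declared {declared.data_type}, "
            f"data looks like {inferred.data_type}"
        )
    if not declared.nullable and inferred.nullable:
        findings.append("nullability: declared NOT NULL, but nulls were found")
    if inferred.max_length > declared.max_length:
        findings.append(
            f"length: declared max {declared.max_length}, "
            f"found {inferred.max_length}"
        )
    return findings


# Hypothetical catalog entry and data sample for an account number column.
declared = ColumnMetadata(data_type="string", nullable=False, max_length=10)
inferred = infer_metadata(["10023", "10024", "", "10025"])
for finding in compare(declared, inferred):
    print(finding)
```

Each finding is only a candidate difference. As in the analysis approach described above, a data analyst still decides whether the deduction reflects the real definition or just the sample.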
Commonly encountered differences and potential actions:

Difference: Data elements in the target are not found in the source, and the
target data element is not nullable or has no default value
Potential action:
• Define default values in the target; some impact on target applications

Difference: Code mismatch between the source and target data elements; for
example, a salutation can be Mr, Mrs, Dr, Miss, or Ms, and the source and target
do not have corresponding codes (coarse to fine, or fine to coarse)
Potential actions:
• Coarse to fine
  - Use transformation to perform the mapping
• Fine to coarse
  - Transform with loss of granularity; loss of function
  - Modify target definition to match fine granularity of the source;
    significant impact on target applications likely

Difference: Multiple data elements in the source map to a single data element
in the target; for example, an address
Potential action:
• Use transformation to perform the mapping

Difference: A single data element in the source maps to multiple data elements
in the target; for example, an address
Potential action:
• Use transformation to perform the mapping; might require standardization
  software

Difference: Different character maps, such as Unicode and ASCII
Potential action:
• Use transformation to perform the mapping
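Several of the potential actions listed here are simple transformations. The following Python sketch shows one possible shape for them: a default for a target-only element, a fine-to-coarse code mapping, merging and splitting an address, and a Unicode-to-ASCII conversion. The code tables, field names, and default value are hypothetical. The address split assumes a clean "street, city, postal code" string, which real data rarely provides, so standardization software might be needed instead.

```python
import unicodedata

# Hypothetical code table: the source has five salutations, the target three.
# Mrs, Miss, and Ms collapse into one code, so granularity is lost.
SALUTATION_FINE_TO_COARSE = {
    "Mr": "MR", "Mrs": "MS", "Miss": "MS", "Ms": "MS", "Dr": "OTHER",
}

# Hypothetical target-only element that is not nullable and needs a default.
TARGET_DEFAULTS = {"preferred_language": "EN"}


def map_salutation(code):
    """Fine-to-coarse mapping; unknown source codes fall back to OTHER."""
    return SALUTATION_FINE_TO_COARSE.get(code, "OTHER")


def to_ascii(text):
    """Decompose accented characters, then drop whatever ASCII cannot hold."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def merge_address(street, city, postal_code):
    """Multiple source elements to one target element."""
    parts = (street, city, postal_code)
    return ", ".join(p.strip() for p in parts if p and p.strip())


def split_address(address):
    """One source element to multiple target elements (naive comma split)."""
    parts = [p.strip() for p in address.split(",")]
    if len(parts) != 3:
        raise ValueError(f"cannot split address reliably: {address!r}")
    street, city, postal_code = parts
    return {"street": street, "city": city, "postal_code": postal_code}


def to_target(source_row):
    """Apply the defaults and transformations to one source record."""
    target = dict(TARGET_DEFAULTS)
    target["salutation"] = map_salutation(source_row["salutation"])
    target["name"] = to_ascii(source_row["name"])
    target["address"] = merge_address(
        source_row["street"], source_row["city"], source_row["postal_code"]
    )
    return target


row = {
    "salutation": "Miss",
    "name": "José Müller",
    "street": "12 Main St",
    "city": "Springfield",
    "postal_code": "62701",
}
print(to_target(row))
```

Note that the ASCII conversion and the fine-to-coarse mapping both discard information: the target cannot recover the original accents or tell Miss from Mrs, which is the loss of granularity the second entry warns about.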
