144 Enabling Smarter Government with Analytics to Streamline Social Services
Information Integration Services
Information Integration Services provide Information Integrity Services, ETL
services, and EII services for federated query access to structured and
unstructured data distributed over disparate data sources. Information Integrity
Services include data profiling, analysis, cleansing, data standardization, and
probabilistic matching services. Data profiling and analysis services are critical
for understanding the quality of master data across enterprise systems, and for
defining data validation, data cleansing, matching, and standardization logic
required to improve master data quality and consistency.
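
As an illustration of what a data profiling service might report, the following minimal Python sketch summarizes completeness and value distribution for a single field. The function name profile_column, the record layout, and the sample client data are hypothetical and are not taken from any specific product.

```python
from collections import Counter

def profile_column(records, field):
    """Summarize completeness and value distribution for one field."""
    values = [r.get(field) for r in records]
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    return {
        "total": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }

# Hypothetical client records drawn from several enterprise systems.
clients = [
    {"client_id": "C1", "state": "NY"},
    {"client_id": "C2", "state": "ny"},
    {"client_id": "C3", "state": ""},
    {"client_id": "C4", "state": "NY"},
]

state_profile = profile_column(clients, "state")
print(state_profile)
```

A profile like this one shows a missing value and inconsistent casing ("NY" and "ny"), which is the kind of finding that would drive the definition of standardization and validation logic.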
MDM services can request data cleansing, standardization, and probabilistic
matching services to cleanse, standardize, and match master data updates
received by business systems. Data cleansing, standardization, and probabilistic
matching services should be available as information services for real-time
business transactions and to support batch processing for the loading and
matching of records from multiple data sources into a target system. Address
Standardization and Validation services are examples of information integrity
services that standardize address information and validate that address against
a published list of valid addresses for that region. Data cleansing services
provide functionality to scrub data, such as:
• Validating fields against simple validation rules, such as known dimensions
for a product or valid reference table codes
• Comparing data values against a range of values
• Populating missing required data fields with default values
• Accessing an external data source to look up information that validates a
data field or populates data fields
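
The four cleansing functions listed above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the reference codes, the age range, the default values, and the postal directory that stands in for an external data source are all hypothetical.

```python
# Hypothetical reference data; a real service would read these from
# reference tables or an external source.
VALID_STATE_CODES = {"NY", "NJ", "CT"}
AGE_RANGE = (0, 120)
DEFAULTS = {"country": "US"}

def cleanse(record, postal_directory):
    """Return a cleansed copy of the record and a list of rule violations."""
    cleaned = dict(record)
    errors = []

    # 1. Validate a field against valid reference table codes.
    state = (cleaned.get("state") or "").strip().upper()
    if state in VALID_STATE_CODES:
        cleaned["state"] = state
    else:
        errors.append("state: not a valid reference code")

    # 2. Compare a data value against a range of values.
    age = cleaned.get("age")
    if age is None or not (AGE_RANGE[0] <= age <= AGE_RANGE[1]):
        errors.append("age: outside allowed range")

    # 3. Populate missing required fields with default values.
    for field, default in DEFAULTS.items():
        if not cleaned.get(field):
            cleaned[field] = default

    # 4. Look up an external source to populate a data field.
    if not cleaned.get("city"):
        city = postal_directory.get(cleaned.get("postal_code"))
        if city:
            cleaned["city"] = city
        else:
            errors.append("city: could not be derived from postal code")

    return cleaned, errors

# Stand-in for an external address data source (assumption).
postal_directory = {"10001": "New York"}

cleaned, errors = cleanse(
    {"state": " ny ", "age": 134, "postal_code": "10001"},
    postal_directory,
)
print(cleaned)
print(errors)
```

In this example the state code is standardized, the country and city are populated, and the out-of-range age is reported instead of being silently corrected.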
Matching services provide probabilistic matching capability to match and merge
data records based upon survivorship rules and are used to eliminate the
duplicate entry of master data entities such as clients or providers into the MDM
database. Matching services are based on configurable matching logic to match
duplicate records and implement survivorship capability that determines how to
remove duplicate entries and merge content from those duplicate records into a
single consolidated record.
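
To make the matching and survivorship steps concrete, the following sketch scores a pair of records with agreement and disagreement weights in the style of the Fellegi-Sunter model, a common basis for probabilistic matching, and then merges matched records with a simple "most recent non-empty value wins" survivorship rule. The m and u probabilities, the threshold, the field names, and the survivorship rule are illustrative assumptions, not the configuration of any particular MDM product.

```python
import math

# Hypothetical per-field probabilities:
#   m = P(field agrees | records are the same entity)
#   u = P(field agrees | records are different entities)
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.97, 0.003),
    "postal_code": (0.90, 0.05),
}
MATCH_THRESHOLD = 8.0  # assumed cutoff for declaring a match

def match_score(a, b):
    """Sum log-likelihood weights over the configured matching fields."""
    score = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        va, vb = a.get(field), b.get(field)
        if not va or not vb:
            continue  # a missing value contributes no evidence either way
        if va.strip().lower() == vb.strip().lower():
            score += math.log2(m / u)              # agreement weight
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement weight
    return score

def merge(a, b):
    """Survivorship: the newer record's non-empty values win."""
    newer, older = (a, b) if a["updated"] >= b["updated"] else (b, a)
    merged = dict(older)
    merged.update({k: v for k, v in newer.items() if v not in (None, "")})
    return merged

rec_a = {"last_name": "Rivera", "birth_date": "1980-04-02",
         "postal_code": "10001", "phone": "", "updated": "2023-01-10"}
rec_b = {"last_name": "rivera ", "birth_date": "1980-04-02",
         "postal_code": "10002", "phone": "555-0100", "updated": "2022-06-01"}

if match_score(rec_a, rec_b) >= MATCH_THRESHOLD:
    merged = merge(rec_a, rec_b)
    print(merged)
```

Here the two records agree on name and birth date but not on postal code, which still clears the assumed threshold. The consolidated record keeps the newer postal code and retains the phone number that only the older record carried.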
ETL services support the initial and incremental ETL of data from one or more
source systems to meet the needs of one or more targets, such as a data
warehouse or the MDM database. The transport of low volumes of changed data,
using asynchronous or synchronous communication techniques, could occur
within the connectivity and interoperability layer.
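
The difference between an initial and an incremental ETL run can be sketched with a high-water mark: the first run loads every source row, and later runs extract only the rows changed since the last recorded timestamp. The function names, the row layout, and the in-memory dictionary that stands in for the target are hypothetical; a high-water mark is only one of several possible change-detection techniques.

```python
def extract(source_rows, since):
    """Return all rows on an initial run, or only rows changed after 'since'."""
    return [r for r in source_rows if since is None or r["updated"] > since]

def transform(row):
    """Map a source row to the target layout and standardize the name."""
    return {
        "client_id": row["id"],
        "name": row["name"].strip().title(),
        "updated": row["updated"],
    }

def load(target, rows):
    """Insert or update rows in the target, keyed by client_id."""
    for row in rows:
        target[row["client_id"]] = row

def run_etl(source_rows, target, since=None):
    """Run one ETL cycle and return the new high-water mark."""
    changed = extract(source_rows, since)
    load(target, [transform(r) for r in changed])
    return max((r["updated"] for r in changed), default=since)

# Hypothetical source system rows; ISO dates compare correctly as strings.
source = [
    {"id": "C1", "name": " ana rivera ", "updated": "2024-01-01"},
    {"id": "C2", "name": "LI CHEN", "updated": "2024-01-05"},
]
warehouse = {}

# Initial load: every row is extracted.
watermark = run_etl(source, warehouse)

# A later change in the source is picked up by the incremental run.
source[0] = {"id": "C1", "name": "ana r. rivera", "updated": "2024-02-01"}
watermark = run_etl(source, warehouse, since=watermark)
print(warehouse)
```

Only the changed row is reprocessed in the second run, and the returned watermark would be persisted for the next cycle.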
