7.2

Analyzing Repetitive Data

Abstract

Unstructured repetitive data must be parsed before it can be analyzed. Parsing reveals where the records and the attributes of the data reside. One type of unstructured repetitive data that is commonly analyzed is log tape data. Typically, links of data are found in the log tape. One important technique for analyzing unstructured repetitive data is the creation of both active indexes and passive indexes.

Keywords

metadata
log tape
links
parsing
Much of the data found in Big Data is repetitive. Analyzing repetitive data in the Big Data environment is quite different from analyzing data in the nonrepetitive environment. As a point of departure, we need to look at what the repetitive Big Data environment looks like. ...
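
As a rough illustration of the parsing step named in the abstract, the following minimal sketch shows how parsing can locate the records and the attributes in a block of repetitive log data. The log layout (one record per line, a leading timestamp, then `key=value` attributes), the sample values, and the names `SAMPLE_LOG` and `parse_records` are all assumptions made for this example; they are not taken from the book, and real log tapes will usually need a format-specific parser.

```python
import re
from typing import Dict, List

# Hypothetical log layout, assumed for illustration only:
# one record per line, a leading timestamp, then key=value attributes.
SAMPLE_LOG = """\
2024-01-05T10:00:00 txn=1001 account=A17 amount=25.00
2024-01-05T10:00:02 txn=1002 account=B42 amount=310.50
2024-01-05T10:00:05 txn=1003 account=A17 amount=12.75
"""

# Matches one attribute of the assumed form name=value.
ATTRIBUTE = re.compile(r"(\w+)=(\S+)")


def parse_records(text: str) -> List[Dict[str, str]]:
    """Split repetitive log text into records, each a dict of attributes."""
    records: List[Dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        # The first whitespace-delimited token is treated as the timestamp.
        timestamp, _, rest = line.partition(" ")
        record = {"timestamp": timestamp}
        # Every key=value pair in the rest of the line becomes an attribute.
        record.update(ATTRIBUTE.findall(rest))
        records.append(record)
    return records


if __name__ == "__main__":
    for record in parse_records(SAMPLE_LOG):
        print(record)
```

Because every record has the same structure, one pattern is enough to find each attribute in each record, which is what makes repetitive data comparatively easy to parse.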
