Chapter 19 DW 2.0 and unstructured data

It is estimated that more than 80% of the data in corporations is unstructured text. Unfortunately, most of the technology that runs on computers today is dedicated to handling structured, repeatable data. As a result, the valuable information found in text goes largely unused in corporate decision making.

DW 2.0 AND UNSTRUCTURED DATA

The DW 2.0 architecture for the next generation of data warehousing recognizes that unstructured text contains valuable information. It also recognizes that quite a bit of work must be done to the text to make it fit for analytical processing.

The starting ...
