The ability to take unstructured text, transform it into a database format, and then derive context from it is a significant achievement in itself. But there is a problem with the output of Textual ETL. Figure 12.1 illustrates a simplified architectural rendition of Textual ETL.
The problem with the output from Textual ETL is that it is difficult to visualize. Unless a person is looking for a single specific word, which is almost never the case, the output taken directly from Textual ETL is difficult to use.
Patterns
What most people are looking for in the output of Textual ETL is patterns. Patterns are formed by multiple occurrences of data, where the instances of data are gathered, organized, and accumulated. Figure ...
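To make the idea of gathering, organizing, and accumulating occurrences concrete, here is a minimal Python sketch. It is an illustration only: the row layout (document id, term, context), the sample values, and the function name `accumulate_patterns` are all assumptions made for this example, not the actual output format or interface of Textual ETL.

```python
from collections import Counter, defaultdict

# Hypothetical rows standing in for Textual ETL output.
# The (document id, term, context) layout is assumed for illustration.
rows = [
    ("doc1", "refund", "complaint"),
    ("doc2", "refund", "complaint"),
    ("doc2", "delay", "shipping"),
    ("doc3", "refund", "complaint"),
    ("doc3", "delay", "shipping"),
    ("doc3", "lawsuit", "legal"),
]


def accumulate_patterns(rows):
    """Gather rows by (context, term), count them, and list their documents."""
    counts = Counter()
    documents = defaultdict(set)
    for doc_id, term, context in rows:
        key = (context, term)
        counts[key] += 1            # accumulate occurrences
        documents[key].add(doc_id)  # gather where they occurred
    # Organize: most frequent (context, term) pairs first.
    return [
        (context, term, n, sorted(documents[(context, term)]))
        for (context, term), n in counts.most_common()
    ]


patterns = accumulate_patterns(rows)
for context, term, n, docs in patterns:
    print(f"{context:10} {term:8} {n}  {', '.join(docs)}")
```

A single row says little on its own. The accumulated view shows which terms recur, in which context, and across which documents, and that summary is what a visualization would then present.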