CHAPTER 3 Natural Language Processing in Support of a Cognitive System

One of the aspects that distinguish a cognitive system from other data-driven techniques is the capability to manage, understand, and analyze unstructured data in the context of the questions being asked. In many organizations, as much as 80 percent of the data that is collected and stored is unstructured. To make good decisions, these documents, reports, e-mail messages, speech recordings, images, and videos must be understood and analyzed. For example, millions of articles are published in medical journals in a single year, and they can offer new treatment options. In the retail market, billions of social media conversations are leading indicators of future trends. Important information is also buried inside voice and video recordings and can have an impact on a variety of fields. Unlike structured database data, which relies on schemas to add context and meaning to data, unstructured information must be parsed and tagged to find the elements of meaning. Tools for this process of identifying the meaning of the individual words include categorization, thesauri, ontologies, tagging, catalogs, dictionaries, and language models.
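To make the idea of parsing and tagging concrete, the following is a minimal sketch of dictionary-based tagging in Python. It is an illustration under stated assumptions, not the method of any particular cognitive system: the `DICTIONARY` contents, the category labels, and the function names `tokenize`, `tag`, and `tagged_terms` are all hypothetical, and a real system would likely combine a dictionary with thesauri, ontologies, and language models.

```python
import re

# Hypothetical mini-dictionary mapping terms to categories (illustrative only).
DICTIONARY = {
    "aspirin": "DRUG",
    "ibuprofen": "DRUG",
    "headache": "SYMPTOM",
    "fever": "SYMPTOM",
}


def tokenize(text):
    """Parse raw text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tag(text, dictionary=DICTIONARY):
    """Pair every token with its dictionary category, or None if unknown."""
    return [(token, dictionary.get(token)) for token in tokenize(text)]


def tagged_terms(text, dictionary=DICTIONARY):
    """Keep only the tokens that the dictionary could assign a category to."""
    return [(token, cat) for token, cat in tag(text, dictionary) if cat is not None]


if __name__ == "__main__":
    note = "Patient reports fever; Aspirin was prescribed."
    print(tagged_terms(note))
```

The untagged tokens are kept by `tag` and only filtered in `tagged_terms`, so a later step could still use the surrounding words as context when a term is ambiguous.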

In a cognitive system, the developer needs to generate and test hypotheses and provide alternative answers or insights with associated confidence levels. Often the body of knowledge used within the cognitive system is text-based. In this ...
