Text Analytics

Video Description

Hear the legendary Bill Inmon discuss text analytics. Not only will Bill teach you about textual analytics, he will give you exercises and explain the answers as if you are in the classroom with him! Over a dozen exercises are included in this video.

Text is everywhere in the corporation. Yet corporations use only a paltry amount of it. In today’s world, 95% or more of corporate decisions are made based on classical structured data. Yet there is a wealth of information locked up in text.

So what is the problem with text? There are many challenges, but the primary one is that text does not fit comfortably into standard database structures. Standard database structures require data to be nice and uniform, and text is anything but uniform. There are other important challenges as well. Language is inherently complex, and processing text requires more than just handling the text itself: managing text also requires identifying and managing its context.

This overview course examines the challenges facing an organization that wishes to incorporate text into its decision-making process. It contains both lecture and interactive activities.

Upon completing this course, the viewer will be able to identify the activities that must be done in order to start using text in management decisions.

Clips covered include:

  • Quantifying Business Value. Most business decisions are made based upon structured data, yet most of the data in the corporation is unstructured/textual. The benefits of each type of unstructured data are discussed, including the voice of the customer. The challenges of analyzing text are covered, as well as some statistics around the quantity of business decisions made based upon structured versus unstructured data.
  • Visualizing Text. We will explore transforming raw text into a visual that management can use to make important corporate decisions.
  • Identifying the Correct Audience. Learn who receives the benefits of textual analytics (HINT: it is not the IT department!). Marketing, Sales, Finance, and Management use cases will be discussed.
  • More on Visualizations. The benefits of visualization are discussed with many examples, including those involving demographics and geography.
  • Iterative Processing. We discuss the process for working with text, starting with capturing raw data, then creating taxonomies, preparing textual disambiguation technology, building databases, and finally visualization.
  • Acquiring Text. Text comes from many different places, including voice communication, paper documents, telephone transcriptions, and email conversions. We discuss these different sources along with the challenges they raise.
  • Formatting Raw Text. Several examples are provided, showing how to go from raw text into something more useful.
  • Deciphering a Taxonomy. A taxonomy is defined and several examples are provided.
  • Categorizing Taxonomies. We explore the two main categories of taxonomies (language and industry-specific), and give examples of each.
  • Leveraging Taxonomies. Taxonomies help us organize text. They help us identify the important words and form the foundation of sentiment analysis.
  • Generalizing Taxonomies. The same or very similar taxonomies apply to different companies in the same industry. We go through a number of examples.
  • Editing Text Basics. We cover the basic ways to do textual disambiguation, including stop word processing, alternate spelling, acronym resolution, and stemming.
  • Resolving Taxonomies. The process of matching your taxonomies against your documents is called “taxonomy resolution”. We discuss taxonomy resolution and provide several examples.
  • Exploring the Database. We discuss the standard relational record and how text is transformed to fit into the relational database.
  • Post Processing. We discuss post processing, the set of steps that take you from a traditional relational database to an analytical database. Some of the functions discussed include inference processing, conjunctions, and negations.
  • Processing Sentiment Analysis. We discuss going from the analytic database to visualization using several examples of sentiment analysis.
  • Leveraging Textual Analytics. Several examples, including a call center example, illustrate the value of textual analytics.
  • Improving on Textual Analytics. We discuss the evolution of textual analytics and how it has improved over the years, such as through soundex and stemming.
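
To make the "Editing Text Basics" and "Resolving Taxonomies" clips more concrete, here is a minimal Python sketch of stop word processing, alternate spelling, acronym resolution, stemming, and taxonomy resolution. It is not taken from the course: the word lists, the taxonomy, and the function names (`naive_stem`, `disambiguate`, `resolve_taxonomy`) are all hypothetical, and the suffix-stripping stemmer is a crude stand-in for a real stemming algorithm.

```python
import re

# Hypothetical, illustrative data: none of these lists come from the course.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "to", "of", "my"}
ALTERNATE_SPELLINGS = {"colour": "color", "cancelled": "canceled"}
ACRONYMS = {"atm": "automated teller machine"}
TAXONOMY = {
    "fee": "charge",
    "charge": "charge",
    "angry": "negative sentiment",
    "unhappy": "negative sentiment",
}


def naive_stem(word):
    """Strip a few common English suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are not mangled.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def disambiguate(text):
    """Lowercase and tokenize, then apply the four basic editing steps."""
    tokens = re.findall(r"[a-z]+", text.lower())
    result = []
    for token in tokens:
        if token in STOP_WORDS:  # stop word processing
            continue
        token = ALTERNATE_SPELLINGS.get(token, token)  # alternate spelling
        expansion = ACRONYMS.get(token)  # acronym resolution
        if expansion:
            result.extend(expansion.split())
        else:
            result.append(naive_stem(token))  # stemming
    return result


def resolve_taxonomy(tokens):
    """Pair each token found in the taxonomy with its category."""
    return [(token, TAXONOMY[token]) for token in tokens if token in TAXONOMY]


if __name__ == "__main__":
    words = disambiguate("The customer was angry about the ATM fees.")
    print(words)
    print(resolve_taxonomy(words))
```

In this sketch the output of `resolve_taxonomy` is a list of (word, category) pairs, which is roughly the shape of data that could be loaded into relational records and later summarized for sentiment analysis.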