Text Mining Fundamentals

Video Description

This video explains what text mining is and how it is performed. It has eight modules:

  • Text Mining Overview. In this first clip, we define text mining and provide examples, such as spam filtering, review analysis, and sentiment analysis.
  • Related Disciplines. In this second clip, we distinguish text mining from Information Retrieval, Information Extraction, and Natural Language Processing (NLP).
  • Text Mining Sources. In this third clip, we define the sources of text (textual data) and explore structured documents, semi-structured documents, and unstructured documents.
  • Text Mining Process. In this fourth clip, we provide an overview of the steps in performing text mining, including Given Data (Text), Text Preprocessing, Feature Generation/Extraction, Feature Selection, Text Mining Methods, and Results Evaluation.
  • Text Preprocessing. In this fifth clip, we cover Text Preprocessing, including its two methods: Lexical Analysis and Syntactic Analysis.
  • Feature Extraction. In this sixth clip, we explore Feature Extraction including Stop Word Elimination, Stemming, and Lemmatization.
  • Weighting Models. In this seventh clip, we describe how to transform a bag of words into a vector representation so that text mining algorithms can use it for further processing, such as document classification. We cover the Boolean Model, Term Frequency (TF), and Term Frequency-Inverse Document Frequency (TF-IDF).
  • Dimension Reduction. In this eighth clip, we explore dimension reduction, which means reducing the size of the vocabulary to avoid the curse of dimensionality. We cover Latent Semantic Analysis techniques, which are widely used in text mining for dimension reduction, and provide an example using the k-nearest neighbor algorithm.
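
As a rough illustration of the three weighting models named in the seventh module, the sketch below turns a bag of words into Boolean, TF, and TF-IDF vectors. It is not taken from the video: the toy corpus and the function names are hypothetical, TF is assumed to be a raw count, and IDF is assumed to be the unsmoothed log(N / df). Libraries and textbooks often use other variants.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented for this sketch (not from the video).
docs = [
    "free prize claim your free prize",
    "meeting agenda for monday",
    "claim your meeting room",
]

# Whitespace tokenization stands in for a real preprocessing step.
tokenized = [d.split() for d in docs]

# The vocabulary fixes the order of the vector dimensions.
vocab = sorted({term for tokens in tokenized for term in tokens})


def boolean_vector(tokens):
    """Boolean model: 1 if the term occurs in the document, else 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocab]


def tf_vector(tokens):
    """Term frequency: raw count of each vocabulary term (assumed variant)."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


def idf(term):
    """Inverse document frequency, assumed here as log(N / df), unsmoothed."""
    df = sum(1 for tokens in tokenized if term in tokens)
    return math.log(len(tokenized) / df)


def tfidf_vector(tokens):
    """TF-IDF: term frequency scaled by inverse document frequency."""
    return [tf * idf(term) for tf, term in zip(tf_vector(tokens), vocab)]


if __name__ == "__main__":
    for name, fn in [("boolean", boolean_vector),
                     ("tf", tf_vector),
                     ("tfidf", tfidf_vector)]:
        print(name, fn(tokenized[0]))
```

In this sketch a term that appears in only one document, such as "free", receives a higher IDF than a term shared by two documents, such as "claim", so repeated and distinctive terms dominate the TF-IDF vector.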

Table of Contents

  1. Text Mining Overview 00:02:46
  2. Related Disciplines 00:05:53
  3. Text Mining Sources 00:02:27
  4. Text Mining Process 00:03:00
  5. Text Preprocessing 00:05:41
  6. Feature Extraction 00:14:11
  7. Weighting Models 00:10:39
  8. Dimension Reduction 00:16:58