Text Mining and Analysis by Satish Garla, Murali Pagolu, Goutam Chakraborty

Chapter 4 Parsing and Extracting Features

Introduction

Tokens and Words

Lemmatization

POS Tags

Parsing Tree

Text Parsing Node in SAS Text Miner

Stemming and Synonyms

Identifying Parts of Speech

Using Start and Stop Lists

Spell Checking

Entities

Building Custom Entities Using SAS Contextual Extraction Studio

Summary

References

Introduction

In this chapter, we discuss the next step, and perhaps the most important one, in the text mining process flow: text parsing. Chapters 2 and 3 covered various methods for collecting and processing textual documents. The next task is to convert the collected text documents (in unstructured form) to a vector representation (a structured form). Fundamentally, parsing is the first step in converting unstructured ...
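To make the idea of moving from unstructured text to a vector representation concrete, here is a minimal sketch in Python. It is not how SAS Text Miner implements parsing; it assumes a deliberately naive tokenizer (lowercase alphabetic runs only) and raw term counts, and the names `tokenize` and `term_document_matrix` are hypothetical helpers introduced for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of alphabetic characters.

    Assumption: a real parser would also handle punctuation, numbers,
    multi-word terms, and languages other than English.
    """
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(documents):
    """Return (vocabulary, rows), where rows[i][j] is the count of
    vocabulary[j] in documents[i]."""
    token_lists = [tokenize(doc) for doc in documents]
    # The vocabulary is every distinct token across all documents, sorted
    # so that column order is reproducible.
    vocabulary = sorted({tok for tokens in token_lists for tok in tokens})
    rows = []
    for tokens in token_lists:
        counts = Counter(tokens)  # missing terms count as 0
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows


# Two toy documents, invented for this example.
docs = [
    "Text parsing converts text to tokens.",
    "Tokens become features.",
]
vocabulary, matrix = term_document_matrix(docs)

for term, column in zip(vocabulary, zip(*matrix)):
    print(f"{term:10s} {list(column)}")
```

Each document becomes one row of counts over a shared vocabulary, which is the kind of structured form that later steps can work with. Steps such as stemming, part-of-speech tagging, and stop lists, listed in the chapter outline, would change which tokens end up as columns.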
