Chapter 4 Parsing and Extracting Features


Tokens and Words


POS Tags

Parsing Tree

Text Parsing Node in SAS Text Miner

Stemming and Synonyms

Identifying Parts of Speech

Using Start and Stop Lists

Spell Checking


Building Custom Entities Using SAS Contextual Extraction Studio




In this chapter, we discuss the next step, and perhaps the most important one, in the text mining process flow: text parsing. In Chapters 2 and 3, we saw how to collect and process textual documents using various methods. The next task is to convert the collected text documents (in unstructured form) to a vector representation (a structured form). Fundamentally, parsing is the first step in converting unstructured ...
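To make the idea of a vector representation concrete, the following is a minimal Python sketch of turning a few raw documents into term-count vectors. It is an illustration only and not how the Text Parsing node in SAS Text Miner works internally. The function names `tokenize` and `term_document_matrix`, the sample sentences, and the simple lowercase, letters-only tokenization rule are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters as tokens.

    This is a deliberately naive rule chosen for illustration; it ignores
    digits, punctuation, and multiword terms.
    """
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(documents):
    """Return (vocabulary, rows), where rows[i][j] is the number of times
    vocabulary[j] occurs in documents[i]."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is every distinct token seen in any document, sorted
    # so that the column order is reproducible.
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for a missing key, so absent terms become zeros.
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


# Hypothetical sample documents.
docs = ["The cat sat on the mat.", "The dog chased the cat."]
vocabulary, rows = term_document_matrix(docs)

print(vocabulary)
for row in rows:
    print(row)
```

Each document becomes one row of counts over a shared vocabulary, which is the structured form that later analysis steps can work with. A real parser would also handle the concerns listed in this chapter's outline, such as stemming, part-of-speech tags, and start and stop lists, none of which this sketch attempts.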
