Chapter 4

Processing Text and Other Sequences

IN THIS CHAPTER

Understanding natural language processing

Considering raw text processing and use of sparse matrices

Performing scoring and classification

Natural Language Processing (NLP) takes text that humans understand as words, even when those words don’t form a sentence, and converts it into a form that a computer can process to look for patterns. For example, the computer doesn’t understand “Turn on the radio,” but it can process that command into a specific pattern. Sometimes the computer also reacts to the processed pattern by performing a task, such as turning on the radio. Processing the text and performing the task are separate steps. In this chapter, you start with the basics needed to understand NLP and see how it can help you build better applications for language problems. For example, you discover some of the issues in processing even raw text, and you see how storing some types of data in sparse matrices keeps the data from taking so much space. You also discover how to score and classify text.
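To make the idea of turning words into a pattern and storing it sparsely more concrete, here is a minimal sketch that uses only the Python standard library. It is not the chapter's own code: the function names (tokenize, build_vocabulary, to_sparse_rows) and the sample sentences are hypothetical, and a real project would more likely rely on a dedicated sparse matrix library. The sketch counts words per document and keeps only the nonzero counts.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in first-seen order."""
    vocabulary = {}
    for doc in documents:
        for token in tokenize(doc):
            vocabulary.setdefault(token, len(vocabulary))
    return vocabulary


def to_sparse_rows(documents, vocabulary):
    """Store each document as {column_index: count}, omitting zero entries."""
    rows = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        rows.append(
            {vocabulary[token]: n for token, n in counts.items() if token in vocabulary}
        )
    return rows


# Hypothetical sample documents, chosen only for illustration.
documents = [
    "Turn on the radio",
    "Turn off the radio",
    "The radio is on the table",
]

vocabulary = build_vocabulary(documents)
sparse_rows = to_sparse_rows(documents, vocabulary)

# A dense table needs one cell per (document, word) pair;
# the sparse form stores only the words that actually occur.
dense_cells = len(documents) * len(vocabulary)
stored_cells = sum(len(row) for row in sparse_rows)
print(f"dense cells: {dense_cells}, stored cells: {stored_cells}")
```

With only three short sentences the saving is modest, but the gap between the dense and stored cell counts widens quickly as the vocabulary grows, because each document uses only a small fraction of all known words.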

You don’t have to type the source code for this chapter manually. In fact, using the downloadable ...
