12

Natural Language Toolkit

One of the first things taught in introductory statistics textbooks is that correlation is not causation. It is also one of the first things forgotten.

Thomas Sowell

In This Chapter

Using a computer to derive insights from text is incredibly useful. The subset of data science devoted to this task is called natural language processing (NLP). The Natural Language Toolkit (NLTK) is a Python package that covers a wide range of language processing tasks. This chapter takes a quick look at this powerful package.
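As a small taste of what the package does, the following sketch tokenizes a sentence and counts word frequencies. It assumes NLTK is installed (for example with `pip install nltk`). The sample sentence and the variable names are made up for illustration and do not come from this chapter. The sketch uses `wordpunct_tokenize` because it relies on a regular expression and needs no downloaded data.

```python
from nltk import FreqDist
from nltk.tokenize import wordpunct_tokenize

# Illustrative sample sentence (an assumption, not taken from the chapter)
text = "Correlation is not causation. Correlation is also easy to forget."

# Split the text into word and punctuation tokens, then keep only
# lowercase alphabetic tokens so punctuation is not counted
tokens = [t.lower() for t in wordpunct_tokenize(text) if t.isalpha()]

# FreqDist is a Counter subclass that tallies how often each token occurs
freq = FreqDist(tokens)

print(freq.most_common(2))
```

Because `FreqDist` behaves like a dictionary of counts, you can also look up an individual word, for example `freq["correlation"]`.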

NLTK Sample Texts

The NLTK package offers sample texts from many sources that you can ...
