Chapter 2. Natural Language Processing

Our intelligence is what makes us human, and AI is an extension of that quality.

Yann LeCun, professor, New York University

Humans have been creating the written word for thousands of years, and we’ve become pretty good at reading and interpreting it quickly. Most native speakers of a language can process intention, tone, slang, and abbreviations quite well, in both written and spoken form. But machines are another story. As early as the 1950s, computer scientists began attempting to use software to process and analyze textual components, sentiment, parts of speech, and the various entities that make up a body of text. Until relatively recently, processing and analyzing language was quite a challenge.

Ever since IBM’s Watson won on the game show Jeopardy! in 2011, the promise of machines being able to understand language has slowly edged closer to reality. In today’s world, where people live out their lives through social media, the opportunity to gain insights from the millions of words of text being produced every day has led to an arms race. New tools allow developers to easily create models that understand words used in the context of their industry. Such models can inform better business decisions, which has resulted in a high-stakes competition in many industries to be the first to deliver.

A 2017 study from IBM reported that 90% of the world’s data had been created in the past two years, and that 80% of that data was unstructured ...

