Chapter 2. Natural Language Basics

What Is Natural Language?

Language is one of humanity’s most important faculties and an essential part of how our society operates. Although it is so integral to human life, it is still a phenomenon that is not fully understood. It is primarily studied by observing usage and by observing pathologies. There has also been much philosophical work exploring meaning and language’s relationship to cognition, truth, and reality. What makes language so difficult to understand is that it is ubiquitous in our experience: the very act of producing and consuming statements about language carries the biases and ambiguity that permeate language itself. Fortunately, we do not need to go into such high philosophy! However, I like to keep the grandeur and mystery of language in mind as an anchor as we dive into the material in this book.

Many animals have complex communication systems, and some even have complex societies, but no animal has shown the ability to communicate abstractions as complex as those humans do. This complexity is great if you are looking to survive the Neolithic period or to order a pizza, but if you are building an application that processes language, you have your work cut out for you. Human language appears to be much more complex than any other communication system. Not only do the rules of our language allow infinitely many unique sentences (e.g., “The first horse came before the second horse, which came before ...
