Preface
WE LIVE IN A MAGICAL TIME. While lounging on my living room sofa, using only my voice, I can order a pound of gummy bears to be delivered to my door within two hours. (Whether or not it’s a good thing that I can do this is a discussion for another book.)
The technology of speech recognition—having a computer understand what you say to it—has grown in leaps and bounds in the past few years. In 1999, when I began my career in voice user interface (VUI) design at Nuance Communications, I was amazed that a computer could understand the difference between me saying “checking” versus “savings.” Today, you can pick up your mobile phone—another magical device—and say, “Show me coffee shops within two miles that have WiFi and are open on Sundays,” and get directions to all of them.
In the 1950s, when computers were beginning to spark people’s imaginations, understanding the spoken word was considered to be a relatively easy problem. “After all,” it was thought, “even a two-year-old can understand language!”
As it turns out, comprehending language is quite complex. It’s filled with subtleties and idiosyncrasies that take humans years to master. Decades were spent trying to program computers to understand the simplest of commands. Some believed that only an entity that lived in the physical world could ever truly understand language, because without context it is impossible to understand the meaning behind the words.
Speech recognition was around in science fiction long before it came ...