Chapter 4. A History of Language Modeling
Natural language processing (NLP) is the science of creating artificial systems that simulate or duplicate human language use in a way that seems natural. Nearly all NLP systems to date fall into one of two categories: rule-based NLP and statistical NLP. The distinction is somewhat arbitrary, since statistical systems are technically still rule based; their rules simply were not explicitly written by the system designer. It would be more accurate to say that rule-based systems operate on rules explicitly defined in the system's code, whereas statistical systems operate on rules the system derives for itself through statistical analysis routines, which were themselves defined in the code. The only fundamental difference, then, is that in a rule-based system the creator explicitly defines how the machine will interact, while in a statistical system the creator only defines how the machine will learn to interact.
In rule-based models, the creator gives the system its rules of interaction; in statistical models, the creator gives the system a framework with which to create its own rules of interaction, as the sketch below illustrates. ...
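A minimal sketch (not from the book) can make the contrast concrete. In the rule-based half, the designer hard-codes every pattern/response pair; in the statistical half, the designer writes only a learning procedure (bigram counting, chosen here purely for illustration), and the data supplies the "rules." All function names and the toy corpus are hypothetical.

```python
import re
from collections import defaultdict, Counter

# --- Rule-based: the designer writes the interaction rules directly. ---
RULES = [
    (re.compile(r"\bhello\b|\bhi\b", re.I), "Hello! How can I help you?"),
    (re.compile(r"\bbye\b", re.I), "Goodbye!"),
]

def rule_based_reply(text):
    """Return the response tied to the first matching hand-written rule."""
    for pattern, response in RULES:
        if pattern.search(text):
            return response
    return "I don't understand."

# --- Statistical: the designer writes only the learning procedure; ---
# --- the "rules" (bigram counts) are derived from the data itself. ---
def train_bigrams(corpus):
    """Count word-to-next-word transitions in a list of sentences."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for w1, w2 in zip(words, words[1:]):
            counts[w1][w2] += 1
    return counts

def generate(counts, start, length=5):
    """Follow the most frequent transitions learned from the corpus."""
    word, out = start, [start]
    for _ in range(length):
        if word not in counts:
            break
        word = counts[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

corpus = ["the cat sat on the mat", "the dog sat on the rug"]
model = train_bigrams(corpus)
print(rule_based_reply("Hi there"))  # behavior fixed by explicit rules
print(generate(model, "the"))        # behavior induced from the corpus
```

Changing the rule-based system's behavior requires editing `RULES` by hand; changing the statistical system's behavior requires only changing the corpus it learns from. That difference in where the behavior comes from is the whole distinction.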