
Chapter 9: Information Processing Techniques
Natural Language Processing
Attempting to derive meaning from text requires natural language processing. The most
fundamental level of natural language processing is the ability to parse the text into words.
The words themselves can be further broken down through morphological analysis.
Applications for this technology include spelling and grammar checkers, along with tagging
and indexing. The following sections will explore natural language processing techniques,
along with some of their applications.
Word Parsing and Morphological Analysis
Parsing most Western-language sentences into their constituent words is a somewhat
trivial operation due to the intervening—and necessary—spaces between words. Other
than the occasional punctuation and dealing with case issues, there is very little difficulty.
Most CJKV sentences, however, offer significant challenges in this area of information
processing. The ultimate goal of parsing sentences into their component words is to serve
common purposes such as determining the key words of a document, which is called tagging.
Tagging is useful for categorizing or indexing documents based on their content, and it is
a key function for search engines.
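The contrast described above can be illustrated with a minimal Python sketch. It assumes that words are simply runs of letters, and it uses plain frequency counting as a stand-in for tagging. The function names parse_words and tag_document and the stop_words parameter are hypothetical, chosen only for this illustration, and real tagging systems are considerably more elaborate.

```python
import re
from collections import Counter


def parse_words(sentence):
    """Split a sentence into lowercase words.

    Assumption: a word is a run of letters (any script), optionally
    joined by apostrophes. Punctuation and spaces act as separators.
    """
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", sentence.lower())


def tag_document(text, stop_words=frozenset(), top_n=3):
    """Return the top_n most frequent words that are not stop words.

    Frequency counting is only an illustrative stand-in for tagging.
    """
    counts = Counter(w for w in parse_words(text) if w not in stop_words)
    return [word for word, _ in counts.most_common(top_n)]


# Spaces, punctuation, and case are easy to handle for Western text.
western = parse_words("The cat sat; the CAT slept.")

# A Japanese sentence has no spaces, so the same approach returns
# the whole sentence as a single unsegmented "word".
japanese = parse_words("私は学生です")
```

For the Western sentence, the space-based approach recovers the individual words after case folding and punctuation removal. For the Japanese sentence it cannot find any word boundaries at all, which is why CJKV text needs dictionary-based or statistical segmentation before tagging or indexing can take place.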
Most CJKV sentences, such as those for Chinese and Japanese, include no intervening
spaces between words. Korean, on the other hand, does use spaces ...