December 2004
608 pages
The tools presented in the preceding few chapters have dealt with structured data: data organized into rows in a database, or into trees in the form of beans or XML. This chapter and Chapter 14 address tools for working with an important type of unstructured data: raw text.
Text appears within applications in a variety of forms and contexts. Text may exist within a larger structured context, such as a name, address, or description field in a database. Text may also live in large collections with minimal structure, such as e-mail folders or hierarchies of directories of memos or reports. Applications may need to perform similar tasks in all these cases: determine whether a particular piece of text ...