Chapter 14. Working with Text 2: Searching

Chapter 13 discussed one common approach to working with text: defining a structure through a language called regular expressions and checking an input against this structure. One application of this approach is finding portions of text in a collection that meet certain precisely defined criteria.

A very common related problem is finding documents within a large collection that meet a less precisely defined requirement, for example, finding all Web pages that discuss JavaServer Pages or all e-mails from John Smith. This is called a text search, or free text search, to reinforce the idea that the desired text is “free” to appear anywhere in the documents.

Text searches could be tackled by the ...

