15.1 INTRODUCTION
String matching is employed in several applications, such as packet classification, computational biology, spam blocking, and information retrieval. String search operates on a given alphabet set Σ of size |Σ|, a pattern P = p0p1 … pm−1 of length m, and a text string T = t0t1 … tn−1 of length n, with m ≤ n. The problem is to find all occurrences of the pattern P in the text string T. The average time complexity for implementing the string search problem on a single processor was proven to be O(n) [99]. We refer the reader to Reference 100 for a comprehensive review of the different hardware implementations of the string matching problem.
A hardware implementation for the search engine can be assumed to have the following characteristics:
- The text length n is typically big and variable.
- The pattern length m varies from a word of few characters to hundreds of characters (e.g., a URL address).
- The word length w is determined by the data storage organization and datapath bus width.
- Typically, the search engine is looking for the existence of the pattern P in the text T; that is, the search engine only locates the first occurrence of the P in T.
- The text string T is supplied to the hardware in word serial format.
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access