Chapter 22. Examining Regular Expressions

In Chapter 7 we covered PHP strings: how to create them, print them, and (to some extent) how to examine and modify them. In this chapter, we delve into more advanced string-manipulation techniques, starting with functions that split (or tokenize) strings into parts. We'll soon run into the limitations of the basic tokenization functions, which show the need for regular expressions.

Finally, we'll cover some of the more advanced string functions that enhance the effectiveness of regular expressions and the use of strings in general.

Tokenizing and Parsing Functions

Sometimes you need to take strings apart at the seams, and you have your own notions of what should count as a seam. The process of breaking up a long string into smaller pieces, such as words, is called tokenizing. Among other things, it is part of the internals of interpreting or compiling any computer program, including PHP. PHP offers a dedicated function for this purpose, called strtok().

The strtok() function takes two arguments: the string to be broken up into tokens and a string containing all the delimiters (characters that count as boundaries between tokens). On the first call, both arguments are used, and the string value returned is the first token. To retrieve subsequent tokens, make the same call, but omit ...
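As a minimal sketch of this calling pattern, the following loop collects every token from a sample sentence. The input string, the delimiter set, and the variable names are made up for illustration. The sketch relies on the documented behavior that strtok() returns false once no tokens remain, and that later calls take only the delimiter string.

```php
<?php
// Hypothetical sample input; space, comma, and period count as delimiters.
$text = "Hello, world. This is PHP";
$delimiters = " ,.";

$tokens = [];

// First call: pass both the string and the delimiter set.
$token = strtok($text, $delimiters);

// strtok() returns false when no tokens remain, so compare strictly;
// a token such as "0" would otherwise end the loop early.
while ($token !== false) {
    $tokens[] = $token;
    // Later calls: pass only the delimiters; strtok() keeps its position.
    $token = strtok($delimiters);
}

print_r($tokens);
```

Runs of adjacent delimiters (such as the comma followed by a space) do not produce empty tokens here, so the array holds only the five words.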
