Limitations

Some of these normal characters endowed with superpowers are necessary because they represent a single character (such as \n and \t) that cannot easily be typed into a regular expression any other way. Others are just shortcuts that let you write less code (such as \w and \s). The shortcuts are useful, as long as you recognize their limitations. The \w shortcut matches only the letters of the English alphabet, the digits 0 through 9, and the underscore. Add just a little Spanish flair to your string (as with olé), and your regular expression will fail to match the whole word. Also, \s matches a lot of whitespace characters that you might not actually want to match. In both cases, writing out the exact set of characters you want to match (within brackets) is ...
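The two limitations described above can be sketched in a few lines. This is a minimal illustration, assuming JavaScript's regular expression engine without the u or v flags; the variable names and the sample characters are hypothetical and chosen only for the demonstration. The é is written as the escape \u00e9 so the example does not depend on how the file is encoded.

```javascript
// Sample word from the text: "olé" (é written as an escape).
const word = "ol\u00e9";

// \w covers only A-Z, a-z, 0-9 and _, so it cannot match the whole word.
const shortcutMatchesWholeWord = /^\w+$/.test(word);

// Without anchors, \w+ stops right before the é.
const shortcutPartialMatch = word.match(/\w+/)[0];

// Writing out the exact set of characters in brackets matches the whole word.
const explicitMatchesWholeWord = /^[A-Za-z\u00e9]+$/.test(word);

// \s matches many kinds of whitespace: here a tab, a newline,
// and a non-breaking space (\u00a0).
const samples = ["\t", "\n", "\u00a0"];
const matchedByShortcut = samples.map((ch) => /\s/.test(ch));

// An explicit set containing only a space and a tab is more selective.
const matchedByExplicitSet = samples.map((ch) => /[ \t]/.test(ch));

console.log(shortcutMatchesWholeWord, shortcutPartialMatch, explicitMatchesWholeWord);
console.log(matchedByShortcut, matchedByExplicitSet);
```

The explicit bracket sets are longer to type, but they state exactly which characters are allowed, which is the trade-off the paragraph describes.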
