Quick Reference

Character Representations

\x{nn}

Two-digit hexadecimal code: \x{20} represents the space character.

\x{nnnn}

Four-digit hexadecimal code (Unicode): \x{0020} represents the space character.

\N{unicode name}

Unicode names. \N{Latin small letter a with ogonek} represents ą. The name itself can be typed in any mix of upper- and lowercase, but the character it stands for is matched case-sensitively. Thus, both \N{latin small letter a with ogonek} and \N{Latin Small letter A with ogonek} match ą, and neither matches Ą.
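The behavior of these three notations can be tried outside InDesign. The following is only a sketch in Python, whose re module is a different regex engine from InDesign's: it writes code points as \xNN and \uNNNN instead of \x{nn} and \x{nnnn}, and the name lookup is done here with the standard unicodedata module. The variable names are illustrative and not taken from the book.

```python
import re
import unicodedata

# Assumption: Python's re module stands in for InDesign's GREP engine.
# Python spells hexadecimal code points \xNN and \uNNNN, not \x{...}.
two_digit = re.fullmatch(r"\x20", " ")     # two-digit hexadecimal code
four_digit = re.fullmatch(r"\u0020", " ")  # four-digit hexadecimal code

# The Unicode name may be written in any case; both resolve to U+0105.
lower_name = unicodedata.lookup("latin small letter a with ogonek")
mixed_name = unicodedata.lookup("Latin Small letter A with ogonek")

# The resulting character is still matched case-sensitively.
pattern = re.compile(lower_name)
matches_lowercase = pattern.search("ąę") is not None
matches_uppercase = pattern.search("ĄĘ") is not None

print(lower_name == mixed_name, matches_lowercase, matches_uppercase)
```

Both spellings of the name resolve to the same character, and the pattern built from it finds the lowercase ą but not the uppercase Ą.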

Character Classes 1: Standard Classes

I call these classes “standard” for lack of a better term. They were part of the first implementations of GREP, and of the three types of class, to this day the standard classes are the easiest to use.

[char]

Any one character from the listed character or group of characters

[^char]

Any one character except the listed character or group of characters

.

Any character except paragraph break

\w

Word character: letters, digits, and underscore

\W

Non-word character

\l

Lowercase letter

\L

Any character that is not a lowercase letter

\u

Uppercase letter

\U

Any character that is not an uppercase letter

\d

Digit

\D

Nondigit

\h

Horizontal space: all spaces and tabs

\H

Any character that is not a horizontal space

\s

Whitespace character: all spaces, tabs, and returns

\S

Non-whitespace character

\v

Vertical space: break characters (paragraph break, forced line break, and page, column, and frame breaks)

\V

Whatever is not \v
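Several of the standard classes above can be tried in any regex engine. The sketch below uses Python's re module, which is an assumption of convenience and not InDesign's engine. Python shares [char], [^char], ., \w, \d, and \s, but it has no \l, \u, or \h classes, so those are approximated here with ASCII-only bracket classes. The sample strings and variable names are hypothetical.

```python
import re

sample = "Page 12:\tGrep_tips, 2nd ed."

# Classes that Python spells the same way as the table above.
word_runs = re.findall(r"\w+", sample)        # letters, digits, underscore
digit_runs = re.findall(r"\d+", sample)       # digits only
vowels = re.findall(r"[aeiou]", "grep")       # [char]: any listed character
non_vowels = re.findall(r"[^aeiou]", "grep")  # [^char]: anything not listed

# The dot stops at a line break, much as it stops at a paragraph break.
lines = re.findall(r".+", "one\ntwo")

# Assumption: ASCII-only stand-ins for \u, \L, and \h, which Python lacks.
uppercase = re.findall(r"[A-Z]", "InDesign")      # roughly \u
not_lowercase = re.findall(r"[^a-z]", "InDesign") # roughly \L
fields = re.split(r"[ \t]+", "a \t b")            # roughly \h+

print(word_runs, digit_runs, vowels, non_vowels, lines)
print(uppercase, not_lowercase, fields)
```

The bracket-class stand-ins cover only unaccented Latin letters, plain spaces, and tabs, so they are narrower than the classes they imitate.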

Character Classes 2: Posix Expressions

There is much overlap between the Posix class and the standard class. Most Posix expressions listed here can ...
