Regex Literals

Problem

You need a regular expression that matches regular expression literals in your source code files so you can easily find them in your text editor or with a grep tool. Your programming language uses forward slashes to delimit regular expressions. Forward slashes in the regex must be escaped with a backslash.

Your regex only needs to match whatever looks like a regular expression literal. It doesn’t need to verify that the text between a pair of forward slashes is actually a valid regular expression.

Because you will be using just one regex rather than writing a full compiler, your regular expression does need to be smart enough to tell a forward slash used as a division operator from one that starts a regex. In your source code, regex literals appear as part of assignments (after an equals sign), in equality or inequality tests (also after an equals sign, as in == or !=), possibly with a negation operator (an exclamation point) before the regex, in object literal definitions (after a colon), and as a parameter to a function (after an opening parenthesis or a comma). Whitespace between the regex and the character that precedes it needs to be ignored.
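
The contexts listed above can be sketched as a small prefix check. The following Python snippet is only an illustration: the name slash_starts_regex and the sample lines are hypothetical, and the pattern covers just the preceding character, the optional exclamation point, the optional whitespace, and the opening slash.

```python
import re

# Hypothetical prefix pattern: one of = : ( , followed by optional
# whitespace and "!", more optional whitespace, then the opening slash.
REGEX_START = re.compile(r"[=:(,](?:\s*!)?\s*/")

def slash_starts_regex(line):
    """Return True if a slash in `line` follows one of the listed contexts."""
    return REGEX_START.search(line) is not None

# Slashes that should be treated as the start of a regex literal.
regex_lines = [
    "var re = /ab+c/;",          # assignment
    "var ok = !/x/.test(s);",    # negation before the regex
    "opts = { pattern: /x/ };",  # object literal, after a colon
    "foo(a, /x/);",              # function parameter, after a comma
]

# Slashes that are division operators and should be ignored.
division_lines = [
    "total = a / b;",
    "x = (a + b) / 2;",
]

for line in regex_lines + division_lines:
    print(slash_starts_regex(line), line)
```

The check is deliberately narrow: a slash after an identifier, a number, or a closing parenthesis is treated as division because none of those characters appears in the character class.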

Solution

(?<=[=:(,](?:\s*!)?\s*)/[^/\\\r\n]*(?:\\.[^/\\\r\n]*)*/
Regex options: None
Regex flavors: .NET

[=:(,](?:\s*!)?\s*\K/[^/\\\r\n]*(?:\\.[^/\\\r\n]*)*/
Regex options: None
Regex flavors: PCRE 7.2, Perl 5.10

(?<=[=:(,](?:\s{0,10}+!)?\s{0,10})/[^/\\\r\n]*(?:\\.[^/\\\r\n]*)*/
Regex options: None
Regex flavors: ...
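
Python's built-in re module supports neither variable-length lookbehind nor \K, so none of the regexes above runs there unchanged. As a rough sketch of one workaround (an adaptation assumed here, not one of the solutions listed above), the context can be matched normally and the literal itself placed in a capturing group. The names REGEX_LITERAL and find_regex_literals are hypothetical.

```python
import re

# Adaptation for Python: the context is consumed by the match, and the
# regex literal is retrieved from capturing group 1.
REGEX_LITERAL = re.compile(
    r"[=:(,](?:\s*!)?\s*"                  # context: = : ( or , then optional "!"
    r"(/[^/\\\r\n]*(?:\\.[^/\\\r\n]*)*/)"  # group 1: /.../ allowing escaped slashes
)

def find_regex_literals(source):
    """Return the text of everything that looks like a regex literal."""
    return [m.group(1) for m in REGEX_LITERAL.finditer(source)]

# Sample JavaScript-like source, made up for this illustration.
source = "\n".join([
    r"var a = /ab+c/;",
    r"var b = x / y / z;",
    r"foo(/\/path\/to/, bar);",
    r"var ok = !/\d+/.test(s);",
])

for literal in find_regex_literals(source):
    print(literal)
```

The division on the second sample line is skipped because neither slash follows one of the context characters. Like the regexes above, this sketch stops at the closing slash and does not include any flags that follow it.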
