Regex Literals

Problem

You need a regular expression that matches regular expression literals in your source code files so you can easily find them in your text editor or with a grep tool. Your programming language uses forward slashes to delimit regular expressions. Forward slashes in the regex must be escaped with a backslash.

Your regex only needs to match whatever looks like a regular expression literal. It doesn’t need to verify that the text between a pair of forward slashes is actually a valid regular expression.

Because you will be using just one regex rather than writing a full compiler, your regular expression does need to be smart enough to know the difference between a forward slash used as a division operator and one used to start a regex. In your source code, literal regular expressions appear as part of assignments (after an equals sign), in equality or inequality tests (after an equals sign), possibly with a negation operator (exclamation point) before the regex, in literal object definitions (after a colon), and as a parameter to a function (after an opening parenthesis or a comma). Whitespace between the regex and the character that precedes it needs to be ignored.

Solution

(?<=[=:(,](?:\s*!)?\s*)/[^/\\\r\n]*(?:\\.[^/\\\r\n]*)*/
Regex options: None
Regex flavors: .NET
[=:(,](?:\s*!)?\s*\K/[^/\\\r\n]*(?:\\.[^/\\\r\n]*)*/
Regex options: None
Regex flavors: PCRE 7.2, Perl 5.10
(?<=[=:(,](?:\s{0,10}+!)?\s{0,10})/[^/\\\r\n]*(?:\\.[^/\\\r\n]*)*/
Regex options: None
Regex flavors: ...

Get Regular Expressions Cookbook, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.