8.7. Add a cellspacing Attribute to <table> Tags That Do Not Already Include It
Problem
You want to search through an (X)HTML file and add cellspacing="0" to all tables that do not already include a cellspacing attribute.
This recipe serves as an example of adding an attribute to XML-style tags that do not already include it. You can swap in whatever tag and attribute names and values you prefer.
Solution
Regex 1: Simplistic solution
You can use negative lookahead to match <table> tags that do not contain the word cellspacing, as follows:
<table\b(?![^>]*?\scellspacing\b)([^>]*)>
| Regex options: Case insensitive |
| Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
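As a minimal sketch of how Regex 1 might be applied, the Python snippet below performs the search-and-replace. The replacement text (the tag rebuilt with cellspacing="0" followed by backreference 1), the helper name add_cellspacing, and the sample markup are assumptions made for illustration; they are not part of the regex shown above.

```python
import re

# Regex 1 from the recipe, compiled with the case-insensitive option.
TABLE_WITHOUT_CELLSPACING = re.compile(
    r"<table\b(?![^>]*?\scellspacing\b)([^>]*)>",
    re.IGNORECASE,
)

def add_cellspacing(html: str) -> str:
    """Insert cellspacing="0" into <table> tags that lack the attribute.

    Hypothetical helper; \\1 re-inserts whatever attributes the tag had.
    """
    return TABLE_WITHOUT_CELLSPACING.sub(r'<table cellspacing="0"\1>', html)

# Assumed sample input: two tables without the attribute, one with it.
sample = '<table border="1"><TABLE cellspacing="2"><table>'
print(add_cellspacing(sample))
# <table cellspacing="0" border="1"><TABLE cellspacing="2"><table cellspacing="0">
```

Note that this replacement text writes the tag name in lowercase, so a matched <TABLE> tag would come back as <table>. A table that already carries a cellspacing attribute is left untouched.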
Here’s the regex again in free-spacing mode:
<table \b             # Match "<table", followed by a word boundary
(?!                   # Assert that the regex below cannot be matched here
  [^>]                #   Match any character except ">"...
    *?                #     zero or more times, as few as possible (lazy)
  \s cellspacing \b   #   Match whitespace, then "cellspacing" as a complete word
)                     #
(                     # Capture the regex below to backreference 1
  [^>]                #   Match any character except ">"...
    *                 #     zero or more times, as many as possible (greedy)
)                     #
>                     # Match a literal ">" to end the tag
| Regex options: Case insensitive, free-spacing |
| Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby |
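The free-spacing form can be tried in Python, where the re.VERBOSE flag turns on free-spacing mode. This is only a sketch: the comments are abbreviated, and the helper name add_cellspacing_verbose, the replacement text, and the sample tag are assumptions made for illustration.

```python
import re

# Regex 1 in free-spacing form. re.VERBOSE ignores unescaped whitespace
# in the pattern and treats "#" as the start of a comment.
TABLE_VERBOSE = re.compile(
    r"""
    <table \b             # "<table" followed by a word boundary
    (?!                   # fail if the rest of the tag...
      [^>]*?              # (any characters except ">", lazily)
      \s cellspacing \b   # ...contains whitespace plus the word cellspacing
    )
    ( [^>]* )             # group 1: the existing attributes of the tag
    >                     # closing ">" of the tag
    """,
    re.IGNORECASE | re.VERBOSE,
)

def add_cellspacing_verbose(html: str) -> str:
    # Hypothetical helper name; \1 restores the captured attributes.
    return TABLE_VERBOSE.sub(r'<table cellspacing="0"\1>', html)

print(add_cellspacing_verbose('<table class="grid">'))
# <table cellspacing="0" class="grid">
```

Because whitespace in the pattern is ignored in this mode, the space before cellspacing has to be matched explicitly with \s.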
Regex 2: More reliable solution
The following regex replaces both instances of the negated
character class ‹[^>]› from the simplistic solution with
‹(?:[^>"']|"[^"]*"|'[^']*')›. This improves the regular expression’s reliability in two ways. First, it adds ...