3.13. Find a Match Within Another Match
Problem
You want to find all the matches of a particular regular expression, but only within certain sections of the subject string. Another regular expression matches each of the sections in the string.
Suppose you have an HTML file in which various passages are marked as bold with <b> tags. You want to find all numbers marked as bold. If some bold text contains multiple numbers, you want to match all of them separately. For example, when processing the string 1 <b>2</b> 3 4 <b>5 6 7</b>, you want to find four matches: 2, 5, 6, and 7.
Solution
C#
StringCollection resultList = new StringCollection();
Regex outerRegex = new Regex("<b>(.*?)</b>", RegexOptions.Singleline);
Regex innerRegex = new Regex(@"\d+");
// Find the first section
Match outerMatch = outerRegex.Match(subjectString);
while (outerMatch.Success) {
    // Get the matches within the section
    Match innerMatch = innerRegex.Match(outerMatch.Groups[1].Value);
    while (innerMatch.Success) {
        resultList.Add(innerMatch.Value);
        innerMatch = innerMatch.NextMatch();
    }
    // Find the next section
    outerMatch = outerMatch.NextMatch();
}
VB.NET
Dim ResultList = New StringCollection
Dim OuterRegex As New Regex("<b>(.*?)</b>", RegexOptions.Singleline)
Dim InnerRegex As New Regex("\d+")
'Find the first section
Dim OuterMatch = OuterRegex.Match(SubjectString)
While OuterMatch.Success
    'Get the matches within the section
    Dim InnerMatch = InnerRegex.Match(OuterMatch.Groups(1).Value)
    While InnerMatch.Success
        ResultList.Add(InnerMatch.Value)
        ...
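The nested-loop idea in the listings above (an outer regex selects each section, and an inner regex runs only on the text captured from that section) can be sketched in Python as well. This is a minimal illustration rather than code taken from the listings: the function name find_bold_numbers and the variable names are hypothetical, and re.DOTALL is assumed to play the role that RegexOptions.Singleline plays in the .NET code.

```python
import re

# Outer regex: captures the text between <b> and </b>.
# re.DOTALL lets "." match line breaks, so a bold section may span lines.
outer_regex = re.compile(r"<b>(.*?)</b>", re.DOTALL)
# Inner regex: matches each run of digits.
inner_regex = re.compile(r"\d+")

def find_bold_numbers(subject):
    """Return every number that appears inside a <b>...</b> section.

    Hypothetical helper name, used only for this illustration.
    """
    results = []
    # Walk through the sections matched by the outer regex ...
    for outer_match in outer_regex.finditer(subject):
        # ... and search only the captured section text with the inner regex.
        results.extend(inner_regex.findall(outer_match.group(1)))
    return results

if __name__ == "__main__":
    print(find_bold_numbers("1 <b>2</b> 3 4 <b>5 6 7</b>"))
```

Run on the example string from the problem statement, the sketch returns the four bold numbers and skips 1, 3, and 4, because the inner regex never sees text outside the captured group.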