3.20. Split a String, Keeping the Regex Matches
Problem
You want to split a string using a regular expression. After the split, you will have an array or list of strings with the text between the regular expression matches, as well as the regex matches themselves.
Suppose you want to split a string with HTML tags in it along the HTML tags, and also keep the HTML tags. Splitting I●like●<b>bold</b>●and●<i>italic</i>●fonts should result in an array of nine strings: I●like●, <b>, bold, </b>, ●and●, <i>, italic, </i>, and ●fonts.
Solution
C#
You can use the static call when you process only a small number of strings with the same regular expression:
string[] splitArray = Regex.Split(subjectString, "(<[^<>]*>)");
Construct a Regex object if you want to use the same regular expression with a large number of strings:
Regex regexObj = new Regex("(<[^<>]*>)");
string[] splitArray = regexObj.Split(subjectString);
VB.NET
You can use the static call when you process only a small number of strings with the same regular expression:
Dim SplitArray = Regex.Split(SubjectString, "(<[^<>]*>)")
Construct a Regex object if you want to use the same regular expression with a large number of strings:
Dim RegexObj As New Regex("(<[^<>]*>)")
Dim SplitArray = RegexObj.Split(SubjectString)
Java
List<String> resultList = new ArrayList<String>();
Pattern regex = Pattern.compile("<[^<>]*>");
Matcher regexMatcher = regex.matcher(subjectString);
int lastIndex = 0;
while (regexMatcher.find()) {
    resultList.add(subjectString.substring(lastIndex, ...
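As a self-contained illustration of this approach in Java, the following sketch walks the matches with Matcher.find(), adding the text before each match, then the match itself, and finally whatever follows the last match. The class name SplitKeepingMatches and the method splitKeepingMatches are hypothetical names chosen for this example, and the ● marks in the sample string are assumed to stand for spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; the names are not from the text.
final class SplitKeepingMatches {

    // Splits subject along the matches of regex, keeping the matches
    // themselves as separate items in the returned list.
    static List<String> splitKeepingMatches(String subject, Pattern regex) {
        List<String> result = new ArrayList<String>();
        Matcher matcher = regex.matcher(subject);
        int lastIndex = 0;
        while (matcher.find()) {
            // Text between the end of the previous match and this match.
            result.add(subject.substring(lastIndex, matcher.start()));
            // The regex match itself.
            result.add(matcher.group());
            lastIndex = matcher.end();
        }
        // Text after the last match (the whole string if nothing matched).
        result.add(subject.substring(lastIndex));
        return result;
    }

    public static void main(String[] args) {
        Pattern tag = Pattern.compile("<[^<>]*>");
        String subject = "I like <b>bold</b> and <i>italic</i> fonts";
        for (String piece : splitKeepingMatches(subject, tag)) {
            System.out.println("[" + piece + "]");
        }
    }
}
```

With this sketch, adjacent matches, or a match at the very start or end of the subject, yield empty strings in the list. Whether to keep or drop those is a design choice that depends on how the result is consumed.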