3.19. Split a String

Problem

You want to split a string using a regular expression. After the split, you will have an array or list of strings with the text between the regular expression matches.

For example, you want to split a string with HTML tags in it along the HTML tags. Splitting "I like <b>bold</b> and <i>italic</i> fonts" should result in an array of five strings: "I like ", "bold", " and ", "italic", and " fonts".

Solution

C#

You can use the static call when you process only a small number of strings with the same regular expression:

string[] splitArray = Regex.Split(subjectString, "<[^<>]*>");
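As a minimal sketch of what the static call returns for the example from the Problem section, the call can be wrapped in a small helper. The class name HtmlTagSplitDemo and the method name SplitOnTags are hypothetical, chosen only for illustration, and the subject string is assumed to be written with ordinary spaces between the words.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class for illustration; not part of the .NET API.
public static class HtmlTagSplitDemo
{
    // Splits the subject along anything that looks like an HTML tag
    // and returns the text found between the tags.
    public static string[] SplitOnTags(string subject) {
        return Regex.Split(subject, "<[^<>]*>");
    }
}
```

Under those assumptions, HtmlTagSplitDemo.SplitOnTags("I like <b>bold</b> and <i>italic</i> fonts") should return five strings: "I like ", "bold", " and ", "italic", and " fonts". The spaces next to the tags stay in the results because only the text matched by the regex is removed.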

If the regex is provided by the end user, you should use the static call with full exception handling:

string[] splitArray = null;
try {
    splitArray = Regex.Split(subjectString, "<[^<>]*>");
} catch (ArgumentNullException ex) {
    // Cannot pass null as the regular expression or subject string
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
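One way to package this exception handling, sketched here under the assumption that callers only need to know whether the split succeeded, is a helper that reports failure through its return value. UserRegexSplitDemo and TrySplit are hypothetical names, not part of the .NET API. ArgumentNullException is caught first because it derives from ArgumentException.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the .NET API.
public static class UserRegexSplitDemo
{
    // Returns true and fills 'parts' when the split succeeds.
    // Returns false when either argument is null or the pattern is invalid.
    public static bool TrySplit(string subject, string userPattern, out string[] parts) {
        parts = null;
        try {
            parts = Regex.Split(subject, userPattern);
            return true;
        } catch (ArgumentNullException) {
            // Null was passed as the regular expression or subject string
            return false;
        } catch (ArgumentException) {
            // Syntax error in the regular expression
            return false;
        }
    }
}
```

A caller could, for instance, show an error message to the end user when TrySplit returns false instead of letting the exception propagate.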

Construct a Regex object if you want to use the same regular expression with a large number of strings:

Regex regexObj = new Regex("<[^<>]*>");
string[] splitArray = regexObj.Split(subjectString);
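To illustrate reusing one Regex object across many strings, the following sketch constructs the object once and applies it to every subject in a collection. The class ReusedRegexSplitDemo, the field TagRegex, and the method SplitAll are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class for illustration; not part of the .NET API.
public static class ReusedRegexSplitDemo
{
    // Constructed once, so the pattern is parsed a single time
    // no matter how many subject strings are processed.
    private static readonly Regex TagRegex = new Regex("<[^<>]*>");

    // Splits every subject string with the same Regex object.
    public static List<string[]> SplitAll(IEnumerable<string> subjects) {
        List<string[]> results = new List<string[]>();
        foreach (string subject in subjects) {
            results.Add(TagRegex.Split(subject));
        }
        return results;
    }
}
```

The design choice here is simply to keep the Regex object outside the loop, so the pattern does not have to be handed to the regex engine again for each subject string.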

If the regex is provided by the end user, you should use the Regex object with full exception handling:

string[] splitArray = null;
try {
    Regex regexObj = new Regex("<[^<>]*>");
    try {
        splitArray = regexObj.Split(subjectString);
    } catch (ArgumentNullException ex) {
        // Cannot pass null as the subject string
    }
} catch (ArgumentException ex) {
    // Syntax ...
