4.24. Using Regular Expressions to Split a String
Problem
You want to split a string into tokens, but you require more sophisticated searching or flexibility than Recipe 4.7 provides. For example, you may want delimiters that are more than one character long or that can take on many different forms. Handling such cases by hand often results in convoluted code and causes confusion in consumers of your class or function.
Solution
Use Boost’s regex class template. regex enables the use of regular expressions on string and text data. Example 4-33 shows how to use regex to split strings.
Example 4-33. Using Boost’s regular expressions
#include <iostream>
#include <string>
#include <boost/regex.hpp>
int main() {
  std::string s = "who,lives:in-a,pineapple under the sea?";
  boost::regex re(",|:|-|\\s+");        // Create the reg exp
  boost::sregex_token_iterator          // Create an iterator using a
    p(s.begin(), s.end(), re, -1);      // sequence and that reg exp
  boost::sregex_token_iterator end;     // Create an end-of-reg-exp
                                        // marker
  while (p != end)
    std::cout << *p++ << '\n';
}
Discussion
Example 4-33 shows how to use regex to iterate over the pieces of a string that lie between matches of a regular expression. The following line sets up the regular expression:
boost::regex re(",|:|-|\\s+");
What it says, essentially, is that each match of the regular expression is either a comma, or a colon, or a dash, or one or more whitespace characters. The pipe character is the alternation operator that ORs the delimiters together. The next two lines set up the iterator:
boost::sregex_token_iterator p(s.begin(), s.end(), re, ...
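For comparison, the same approach can be sketched with the C++11 standard library, whose std::regex and std::sregex_token_iterator mirror the Boost classes used in Example 4-33. This is a minimal sketch that assumes a C++11 or later compiler. The function name split_on_delims is hypothetical and is not part of the recipe or of either library. As in the example, passing -1 as the last constructor argument asks the iterator for the text between matches rather than the matches themselves.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the recipe): split s on the same
// delimiters as Example 4-33, using std::regex in place of boost::regex.
std::vector<std::string> split_on_delims(const std::string& s) {
    // Same pattern as the recipe: a comma, a colon, a dash,
    // or a run of one or more whitespace characters.
    static const std::regex re(",|:|-|\\s+");

    // Submatch index -1 selects the text between matches (the tokens).
    std::sregex_token_iterator p(s.begin(), s.end(), re, -1);
    std::sregex_token_iterator end;  // default-constructed end marker

    std::vector<std::string> tokens;
    for (; p != end; ++p)
        tokens.push_back(p->str());  // copy each token out as a std::string
    return tokens;
}
```

Calling split_on_delims("who,lives:in-a,pineapple under the sea?") yields eight tokens, from "who" through "sea?". Returning a vector rather than printing inside the loop is a design choice made here so that the result can be inspected. The recipe itself streams each token directly to std::cout.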