2.6. The Poor Man’s Tokenizer
Problem
You need a quick method of breaking up a string into a series of discrete tokens or words.
Solution
Use the Split instance method of the string class. For example:
string equation = "1 + 2 - 4 * 5";
string[] equationTokens = equation.Split(new char[1]{' '});
foreach (string tok in equationTokens)
    Console.WriteLine(tok);
This code produces the following output:
1
+
2
-
4
*
5
The Split method may also be used to separate people’s first, middle, and last names. For example:
string fullName1 = "John Doe";
string fullName2 = "Doe,John";
string fullName3 = "John Q. Doe";
string[] nameTokens1 = fullName1.Split(new char[3]{' ', ',', '.'});
string[] nameTokens2 = fullName2.Split(new char[3]{' ', ',', '.'});
string[] nameTokens3 = fullName3.Split(new char[3]{' ', ',', '.'});
foreach (string tok in nameTokens1)
{
    Console.WriteLine(tok);
}
Console.WriteLine("");
foreach (string tok in nameTokens2)
{
    Console.WriteLine(tok);
}
Console.WriteLine("");
foreach (string tok in nameTokens3)
{
    Console.WriteLine(tok);
}
This code produces the following output:
John
Doe

Doe
John

John
Q

Doe
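Notice that an empty token appears between Q and Doe in the output for fullName3. The '.' and the space that follows it are adjacent delimiters, so Split returns an empty string for the zero-length span between them; this is correct behavior. If you do not want to process this empty token, your code can simply skip it.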
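If the empty token is unwanted, one option is the Split overload that accepts a StringSplitOptions value, assuming a framework version that provides it (.NET Framework 2.0 or later). The following is a minimal sketch rather than a definitive implementation; the TokenizerSketch class and the SplitName method are hypothetical names introduced here for illustration only.

```csharp
using System;

public static class TokenizerSketch
{
    // Hypothetical helper: splits a name on space, comma, and period,
    // discarding the empty tokens produced by adjacent delimiters.
    public static string[] SplitName(string fullName)
    {
        char[] delimiters = { ' ', ',', '.' };
        return fullName.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // "John Q. Doe" contains ". " (two adjacent delimiters).
        foreach (string tok in SplitName("John Q. Doe"))
        {
            Console.WriteLine(tok);
        }
    }
}
```

With RemoveEmptyEntries, the three tokens John, Q, and Doe are returned with no empty string between them. The trade-off is that empty fields can no longer be detected, which matters if a missing value between two delimiters is meaningful in your input.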
Discussion
If you have a consistently formatted string whose parts, or tokens, are separated by well-defined characters, the Split method can tokenize the string. Tokenizing a string consists ...