C# Cookbook by Jay Hilyard, Stephen Teilhet

8.7. A Better Tokenizer

Problem

Recipe 2.6 presented a simple method of tokenizing a string, that is, breaking it up into its discrete elements. However, that method is not powerful enough to handle all your string-tokenizing needs. You need a tokenizer (also referred to as a lexer) that can split a string on a well-defined set of characters.

Solution

Using the Split method of the Regex class, we can use a regular expression to indicate the types of tokens and separators that we are interested in gathering. This technique works especially well with equations, since the tokens of an equation are well-defined. For example, the code:

using System;
using System.Text.RegularExpressions;

public static string[] Tokenize(string equation)
{
    // Delimiters: + - * ( ) ^ \
    Regex RE = new Regex(@"([\+\-\*\(\)\^\\])");
    return RE.Split(equation);
}

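will divide up a string according to the regular expression specified in the Regex constructor. In other words, the string passed in to the Tokenize method is split at each occurrence of the delimiters +, -, *, (, ), ^, or \. Because the character class is wrapped in a capturing group, Split also returns each matched delimiter as its own element, and it returns an empty string wherever a delimiter starts the input or directly follows another delimiter. The following method calls the Tokenize method to tokenize the equation (y - 3)(3111*x^21 + x + 320):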

public void TestTokenize( )
{
    foreach(string token in Tokenize("(y - 3)(3111*x^21 + x + 320)"))
        Console.WriteLine("String token = " + token.Trim( ));
}

which displays the following output:

String token =
String token = (
String token = y
String token = -
String token = 3
String token = )
String token =
String token = (
String token = 3111
String token = ...
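The output above contains empty tokens, for instance before the leading ( and between ) and (. If those are not wanted, one possible variation is to trim and filter the result of Split. The following is only a sketch: the class name TokenizerSketch and the method TokenizeNonEmpty are hypothetical names introduced for illustration and are not part of the recipe; the pattern is the one used above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenizerSketch
{
    // Same pattern as the recipe: the capturing group makes Split
    // return each delimiter along with the text between delimiters.
    private static readonly Regex Delimiters =
        new Regex(@"([\+\-\*\(\)\^\\])");

    // Hypothetical helper (not from the recipe): trims every token
    // and drops the ones that are empty after trimming.
    public static string[] TokenizeNonEmpty(string equation)
    {
        return Delimiters.Split(equation)
                         .Select(token => token.Trim())
                         .Where(token => token.Length > 0)
                         .ToArray();
    }

    public static void Main()
    {
        foreach (string token in TokenizeNonEmpty("(y - 3)(3111*x^21 + x + 320)"))
            Console.WriteLine("String token = " + token);
    }
}
```

For the sample equation this should leave only the operators, parentheses, and operands, beginning with (, y, -, 3, and ). Whether empty tokens should be dropped depends on the consumer; a parser that relies on token positions may prefer to keep them.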
