A Complete S-Expression Parser

As it turns out, there is a formal definition for S-expressions to handle many applications requiring the representation of hierarchical data. You can imagine that this goes beyond data represented as words and integers—the authentication certificate sample in the previous section requires a variety of different field types.

The Internet Draft describing S-expressions for data representation can be found at http://people.csail.mit.edu/rivest/sexp.txt. Fortunately, the draft defines the BNF for us:

<sexp> :: <string> | <list> <string> :: <display>? <simple-string> ; <simple-string> :: <raw> | <token> | <base-64> | <hexadecimal> | <quoted-string> ; <display> :: "[" <simple-string> "]" ; <raw> :: <decimal> ":" <bytes> ; <decimal> :: <decimal-digit>+ ; -- decimal numbers should have no unnecessary leading zeros <bytes> -- any string of bytes, of the indicated length <token> :: <tokenchar>+ ; <base-64> :: <decimal>? "|" ( <base-64-char> | <whitespace> )* "|" ; <hexadecimal> :: "#" ( <hex-digit> | <white-space> )* "#" ; <quoted-string> :: <decimal>? <quoted-string-body> <quoted-string-body> :: "\"" <bytes> "\"" <list> :: "(" ( <sexp> | <whitespace> )* ")" ; <whitespace> :: <whitespace-char>* ; <token-char> :: <alpha> | <decimal-digit> | <simple-punc> ; <alpha> :: <upper-case> | <lower-case> | <digit> ; <lower-case> :: "a" | ... | "z" ; <upper-case> :: "A" | ... | "Z" ; <decimal-digit> :: "0" | ... | "9" ; <hex-digit> :: <decimal-digit> | "A" | ... | "F" | ...

Get Getting Started with Pyparsing now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.