EBNF Grammar for XML 1.0 (Third Edition)

Document

[1] document ::= prolog element Misc*

Character range

[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
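
As an illustration only, production [2] can be read as a code point test. The sketch below follows the ranges in the W3C XML 1.0 recommendation, where the fourth alternative starts at #x20 (the space character). The function name is_xml_char is hypothetical and not part of any library.

```python
def is_xml_char(ch: str) -> bool:
    """Return True if the single character ch matches production [2] Char."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)           # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF         # space up to just below the surrogates
        or 0xE000 <= cp <= 0xFFFD       # skips the surrogates, stops before FFFE
        or 0x10000 <= cp <= 0x10FFFF    # supplementary planes
    )
```

Under this reading, other C0 control characters such as #x0 and #xB are rejected, as are #xFFFE and #xFFFF.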

Whitespace

[3] S ::= (#x20 | #x9 | #xD | #xA)+

Names and tokens

[4] NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender

[5] Name ::= (Letter | '_' | ':') (NameChar)*

[6] Names ::= Name (#x20 Name)*

[7] Nmtoken ::= (NameChar)+

[8] Nmtokens ::= Nmtoken (#x20 Nmtoken)*
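
The difference between Name and Nmtoken is easiest to see in code. The following is a rough sketch, not a conforming implementation: it assumes ASCII-only stand-ins for Letter [84] and Digit [88] and omits CombiningChar [87] and Extender [89] entirely. All identifiers are hypothetical.

```python
import re

# ASSUMPTION: ASCII-only approximation. Letter is taken as [A-Za-z] and
# Digit as [0-9]; CombiningChar and Extender are left out.
NAME_CHAR = r"[A-Za-z0-9._:-]"
NAME_RE = re.compile(rf"[A-Za-z_:]{NAME_CHAR}*")   # production [5]
NMTOKEN_RE = re.compile(rf"{NAME_CHAR}+")          # production [7]


def is_name(s: str) -> bool:
    """A Name must start with a letter, '_' or ':'."""
    return NAME_RE.fullmatch(s) is not None


def is_nmtoken(s: str) -> bool:
    """An Nmtoken is one or more name characters, with no start restriction."""
    return NMTOKEN_RE.fullmatch(s) is not None


def is_names(s: str) -> bool:
    """Production [6]: Names separated by exactly one #x20 each."""
    return all(is_name(part) for part in s.split(" "))
```

With these definitions, "1st" is accepted as an Nmtoken but rejected as a Name, because a Name may not begin with a digit.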

Literals

[9] EntityValue ::= '"' ([^%&"] | PEReference | Reference)* '"' | "'" ([^%&'] | PEReference | Reference)* "'"

[10] AttValue ::= '"' ([^<&"] | Reference)* '"' | "'" ([^<&'] | Reference)* "'"

[11] SystemLiteral ::= ('"' [^"]* '"') | ("'" [^']* "'")

[12] PubidLiteral ::= '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"

[13] PubidChar ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
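
Productions [12] and [13] translate fairly directly into a regular expression. The sketch below is one possible transcription. The names are hypothetical, and the only subtlety is that the apostrophe is allowed inside a double-quoted literal but not inside a single-quoted one.

```python
import re

# PubidChar [13] without the apostrophe, written as the inside of a character class.
_COMMON = r" \r\na-zA-Z0-9\-()+,./:=?;!*#@$_%"

# Double-quoted form: every PubidChar, including the apostrophe.
_DOUBLE = '"[' + _COMMON + "']*" + '"'
# Single-quoted form: PubidChar minus the apostrophe.
_SINGLE = "'[" + _COMMON + "]*'"

PUBID_LITERAL_RE = re.compile(_DOUBLE + "|" + _SINGLE)


def is_pubid_literal(s: str) -> bool:
    """Return True if s, quotes included, matches production [12]."""
    return PUBID_LITERAL_RE.fullmatch(s) is not None
```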

Character data

[14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)

Comments

[15] Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
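
Production [15] has two consequences that are easy to miss: a comment body may not contain "--", and it may not end in a hyphen (so "--->" is not a legal terminator). A minimal regular-expression sketch of the production follows. It assumes the input has already been checked against Char [2], and the names are hypothetical.

```python
import re

# Direct transcription of production [15]: each step consumes either one
# non-hyphen character, or a hyphen followed by one non-hyphen character.
# ASSUMPTION: the Char [2] restriction is not enforced here.
COMMENT_RE = re.compile(r"<!--(?:[^-]|-[^-])*-->")


def is_comment(s: str) -> bool:
    """Return True if s as a whole matches the Comment production."""
    return COMMENT_RE.fullmatch(s) is not None
```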

Processing instructions

[16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'

[17] PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
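
Taken together, productions [16] and [17] say that a processing instruction is a target name, optionally followed by whitespace and data that does not contain "?>", and that the target may not be "xml" in any mix of upper and lower case. The sketch below shows one way to split a PI along those lines. It assumes an ASCII-only approximation of Name [5] and does not enforce Char [2]. The function parse_pi is hypothetical.

```python
import re

# ASSUMPTION: ASCII-only approximation of Name [5].
_NAME = r"[A-Za-z_:][A-Za-z0-9._:-]*"
_PI_RE = re.compile(rf"<\?({_NAME})(?:[ \t\r\n]+(.*?))?\?>", re.DOTALL)


def parse_pi(text: str):
    """Return (target, data) for a well-formed PI, or None otherwise."""
    m = _PI_RE.fullmatch(text)
    if m is None:
        return None
    target, data = m.group(1), m.group(2) or ""
    if target.lower() == "xml":   # excluded by production [17]
        return None
    if "?>" in data:              # excluded by production [16]
        return None
    return target, data
```

Note that a target such as xml-stylesheet is accepted: only the exact three-letter name is excluded by production [17].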

CDATA sections

[18] CDSect ::= CDStart CData CDEnd

[19] CDStart ::= '<![CDATA['

[20] CData ::= (Char* - (Char* ']]>' Char*))

[21] CDEnd ::= ']]>'
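
Because production [20] forbids "]]>" inside CData, text that contains that sequence cannot be placed in a single CDATA section. A common workaround, which the grammar permits but does not prescribe, is to split the text across two adjacent sections so that "]]" ends one and ">" starts the next. The sketch below illustrates the idea. Both function names are hypothetical, and the Char [2] restriction is not enforced.

```python
import re


def to_cdata(text: str) -> str:
    """Wrap text in CDATA sections, splitting at every ']]>'."""
    # "]]>" becomes "]]" + CDEnd + CDStart + ">", so no single section
    # ever contains the forbidden sequence.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def from_cdata(markup: str) -> str:
    """Concatenate the contents of consecutive CDATA sections."""
    # The lazy match stops at the first ']]>', which is where a section ends.
    return "".join(re.findall(r"<!\[CDATA\[(.*?)\]\]>", markup, re.DOTALL))
```

For example, to_cdata("a]]>b") produces two sections whose contents are "a]]" and ">b", and from_cdata restores the original text.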

Prolog

[22] prolog ...
