Block Escapes

A block escape is a simple way to refer to a range of characters that have some property in common. Each block escape has a name; to use a block name, prepend Is to it. Block escapes are used with the \p and \P operators. For example, the expression \p{IsThai} matches any Thai character (&#x0E00; through &#x0E7F;), and the expression \P{IsThai} matches any character outside that range. The block names are listed here in the format defined in the XML Schema spec.
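To make the meaning of \p{IsThai} and \P{IsThai} concrete, here is a minimal Python sketch that expands block escapes into explicit character ranges. Python's built-in re module does not understand \p, so the sketch rewrites the escapes before compiling. The names BLOCKS and expand_block_escapes are hypothetical helpers invented for this illustration, and only three blocks are included.

```python
import re

# A few block ranges, keyed by block name without the "Is" prefix.
# Only three blocks are included here, purely for illustration.
BLOCKS = {
    "BasicLatin": (0x0000, 0x007F),
    "Cyrillic": (0x0400, 0x04FF),
    "Thai": (0x0E00, 0x0E7F),
}

def expand_block_escapes(pattern):
    """Rewrite \\p{IsName} and \\P{IsName} as explicit character classes.

    Hypothetical helper: it handles escapes that stand on their own, not
    escapes nested inside an existing [...] class, and it raises KeyError
    for a block name that is not in BLOCKS.
    """
    def replace(match):
        negated = match.group(1) == "P"   # \P means "everything except"
        start, end = BLOCKS[match.group(2)]
        caret = "^" if negated else ""
        return "[{}\\u{:04X}-\\u{:04X}]".format(caret, start, end)

    return re.sub(r"\\([pP])\{Is([A-Za-z0-9-]+)\}", replace, pattern)

# \p{IsThai}+ becomes a class covering the Thai block.
thai_only = re.compile(expand_block_escapes(r"\p{IsThai}+"))
# \P{IsThai}+ becomes the negated class.
no_thai = re.compile(expand_block_escapes(r"\P{IsThai}+"))

is_thai_word = thai_only.fullmatch("\u0e44\u0e17\u0e22") is not None
is_thai_free = no_thai.fullmatch("XSLT") is not None
```

An XSLT 2.0 processor needs no such rewriting, because the block escape is part of the regular expression syntax it already accepts; the sketch only shows which characters each escape stands for.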

Table E-1 shows the complete list of block escapes. This table was generated from version 5.0.0 of the file Blocks.txt. The list of block names is part of the Unicode Character Database; see http://www.unicode.org/ for the latest version of the Unicode standard.
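As a rough sketch of how such a table can be derived, the following Python reads lines in the "start..end; name" layout used by the Unicode block list and removes the spaces from each name, which yields names in the form shown in the table (for example, Latin-1Supplement). The sample text and the function name block_escape_names are assumptions made for this illustration; this is not the script that produced Table E-1.

```python
import re

# Sample input in the "start..end; name" layout (a hand-typed excerpt,
# not the full file).
SAMPLE = """\
# comment lines start with a hash mark
0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
0370..03FF; Greek and Coptic
0E00..0E7F; Thai
"""

# One data line: two hexadecimal code points, then the block name.
LINE = re.compile(r"^([0-9A-F]+)\.\.([0-9A-F]+);\s*(.+?)\s*$")

def block_escape_names(text):
    """Return (name, start, end) tuples, with spaces removed from each name."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match is None:
            continue  # skip comments and blank lines
        start, end, name = match.groups()
        rows.append((name.replace(" ", ""), int(start, 16), int(end, 16)))
    return rows

rows = block_escape_names(SAMPLE)
for name, start, end in rows:
    # Print each row as a block name followed by two character references.
    print("{:<27}&#x{:04X};  &#x{:04X};".format(name, start, end))
```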

Table E-1. Block escape names
Block name                 Starting character  Ending character
BasicLatin                 &#x0000;            &#x007F;
Latin-1Supplement          &#x0080;            &#x00FF;
LatinExtended-A            &#x0100;            &#x017F;
LatinExtended-B            &#x0180;            &#x024F;
IPAExtensions              &#x0250;            &#x02AF;
SpacingModifierLetters     &#x02B0;            &#x02FF;
CombiningDiacriticalMarks  &#x0300;            &#x036F;
GreekandCoptic             &#x0370;            &#x03FF;
Cyrillic                   &#x0400;            &#x04FF;
CyrillicSupplement         &#x0500;            &#x052F;
Armenian                   &#x0530;            &#x058F;
Hebrew                     &#x0590;            &#x05FF;
Arabic                     &#x0600;            &#x06FF;
Syriac                     &#x0700;            &#x074F;
ArabicSupplement           &#x0750;            &#x077F;
Thaana                     &#x0780;            &#x07BF;
NKo                        &#x07C0;            &#x07FF;
Devanagari                 &#x0900;            &#x097F;
Bengali                    &#x0980;            &#x09FF;
Gurmukhi                   &#x0A00;            &#x0A7F;
Gujarati                   &#x0A80;            &#x0AFF;
Oriya                      &#x0B00;            &#x0B7F;
Tamil                      &#x0B80;            &#x0BFF;
Telugu                     &#x0C00;            &#x0C7F;
Kannada                    &#x0C80;            &#x0CFF;
Malayalam                  &#x0D00;            &#x0D7F;
Sinhala                    &#x0D80;            &#x0DFF;
Thai                       &#x0E00;            &#x0E7F;
Lao                        &#x0E80;            ...
