3.3. Unicode Escapes

A compiler for the Java programming language (“Java compiler”) first recognizes Unicode escapes in its input, translating the ASCII characters \u followed by four hexadecimal digits to the UTF-16 code unit (§3.1) of the indicated hexadecimal value, and passing all other characters unchanged. Representing supplementary characters requires two consecutive Unicode escapes. This translation step results in a sequence of Unicode input characters.
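As a small illustration of this translation step, the following sketch shows escapes being replaced before the rest of the program is read. The class and method names are hypothetical and chosen only for this example. One escape yields one UTF-16 code unit, a supplementary character (here U+1F600) is written as two consecutive escapes forming a surrogate pair, and an escape may even appear inside an identifier.

```java
public class UnicodeEscapeDemo {

    // A single escape denotes one UTF-16 code unit; this literal is the letter A.
    static String bmp() {
        return "\u0041";
    }

    // A supplementary character needs two consecutive escapes (a surrogate pair).
    static String supplementary() {
        return "\uD83D\uDE00";
    }

    // Escapes are translated before tokens are formed, so the variable
    // declared on the next line is simply named abc.
    static int viaIdentifier() {
        int \u0061bc = 7;
        return abc;
    }

    public static void main(String[] args) {
        System.out.println(bmp().equals("A"));
        System.out.println(supplementary().length());
        System.out.println(supplementary().codePointCount(0, supplementary().length()));
        System.out.println(Integer.toHexString(supplementary().codePointAt(0)));
        System.out.println(viaIdentifier());
    }
}
```

Because the translation applies to the whole input, including comments, the comments in this sketch deliberately avoid spelling out an escape.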

UnicodeInputCharacter:
    UnicodeEscape
    RawInputCharacter

UnicodeEscape:
    \ UnicodeMarker HexDigit HexDigit HexDigit HexDigit

UnicodeMarker:
    u
    UnicodeMarker u

RawInputCharacter:
    any Unicode character

HexDigit: one of
    0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F
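The grammar above can be read as a simple scanning procedure. The following is a minimal sketch of such a scanner, not the algorithm used by any particular compiler. The names UnicodeEscapeTranslator, translate, and hexValue are hypothetical. The sketch also assumes one rule that the excerpt here does not show in full: a backslash begins an escape only when an even number of backslashes contiguously precede it. How the backslash count behaves right after a translated escape is a simplifying assumption of this sketch.

```java
public class UnicodeEscapeTranslator {

    // Returns the value of an ASCII HexDigit, or -1 if c is not one.
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Translates each UnicodeEscape in raw to one UTF-16 code unit and
    // passes every other character through unchanged.
    public static String translate(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int backslashRun = 0; // contiguous backslashes just before index i
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c == '\\' && backslashRun % 2 == 0) {
                // UnicodeMarker: one or more occurrences of the letter u.
                int j = i + 1;
                while (j < raw.length() && raw.charAt(j) == 'u') {
                    j++;
                }
                if (j > i + 1) {
                    if (j + 4 > raw.length()) {
                        throw new IllegalArgumentException("truncated escape at index " + i);
                    }
                    int value = 0;
                    for (int k = j; k < j + 4; k++) {
                        int d = hexValue(raw.charAt(k));
                        if (d < 0) {
                            throw new IllegalArgumentException("bad hex digit at index " + k);
                        }
                        value = value * 16 + d;
                    }
                    out.append((char) value);
                    i = j + 4;
                    backslashRun = 0; // assumption: the run restarts after an escape
                    continue;
                }
            }
            out.append(c);
            backslashRun = (c == '\\') ? backslashRun + 1 : 0;
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslashes below are ordinary string-literal escapes,
        // so each argument holds a single backslash followed by the marker.
        System.out.println(translate("\\u0041").equals("A"));
        System.out.println(translate("\\uuu0041").equals("A"));
        System.out.println(translate("\\uD83D\\uDE00").codePointAt(0) == 0x1F600);
    }
}
```

Digits are checked against the ASCII ranges directly rather than with Character.digit, because that library method also accepts non-ASCII digits, which the HexDigit production does not allow.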

The \, u, and ...

