Character Sets
The character sets that C++ uses at compile time and at runtime are implementation-defined. A source file is read as a sequence of characters in the physical character set. As the file is read, the physical characters are mapped to the compile-time character set, which is called the source character set. The mapping is implementation-defined, but many implementations use the same character set for both.
At a minimum, the source character set includes the characters listed below. The numeric values of these characters are implementation-defined.
Space
Horizontal tab
Vertical tab
Form feed
Newline
a ... z
A ... Z
0 ... 9
_ { } [ ] # ( ) < > % : ; . ? *
+ - / ^ & | ~ ! = , \ " '
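Because the numeric values are implementation-defined, portable code should not hard-code them. The following is a minimal sketch of that point, not a definitive implementation. The helper names code_of, digit_value, and print_codes are hypothetical. The sketch also relies on one guarantee that comes from the C++ standard rather than from the text above: the digits 0 through 9 have contiguous, increasing values. No such guarantee exists for letters.

```cpp
#include <initializer_list>
#include <iostream>

// Numeric value of a character on this implementation.
// The result is implementation-defined: 'A' is 65 on ASCII-based
// systems but 193 on EBCDIC systems.
constexpr int code_of(char c) {
    return static_cast<unsigned char>(c);
}

// Converts a decimal digit character to its value, or returns -1.
// Assumption (from the C++ standard, not the text above): the digits
// '0'..'9' are contiguous and increasing, so c - '0' is portable.
// The same trick with c - 'a' is NOT portable for letters.
constexpr int digit_value(char c) {
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}

// Hypothetical helper: prints this implementation's codes for a few
// characters that every source character set must contain.
void print_codes(std::ostream& out) {
    for (char c : {'A', 'a', '0', '_'}) {
        out << c << " = " << code_of(c) << '\n';
    }
}
```

Calling print_codes(std::cout) from main shows the values chosen by the implementation at hand. The output differs between, say, an ASCII-based and an EBCDIC-based system, whereas digit_value('7') yields 7 on both.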
The runtime character set, called the execution character set, might be different from the source character set (though it is often the same). If the character sets are different, the compiler automatically converts all character and string literals from the source character set to the execution character set. The basic execution character set includes all the source characters listed above, plus the characters listed below. The execution character set is a superset of the basic execution character set; additional characters are implementation-defined and might vary depending on locale.
Alert
Backspace
Carriage return
Null
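These four characters cannot be typed directly into a literal, so source code names them with escape sequences. The sketch below is illustrative only and assumes the standard escapes \a, \b, \r, and \0. The helper names control_name and storage_size are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Returns a readable name for the four characters that the basic
// execution character set adds, or "other" for anything else.
std::string control_name(char c) {
    switch (c) {
        case '\a': return "alert";
        case '\b': return "backspace";
        case '\r': return "carriage return";
        case '\0': return "null";
        default:   return "other";
    }
}

// Number of chars a string literal occupies, including the null
// character that the compiler appends to terminate it.
template <std::size_t N>
constexpr std::size_t storage_size(const char (&)[N]) {
    return N;
}
```

For example, storage_size("abc") is 4 rather than 3, because the literal is stored with a terminating null character from the execution character set.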
Conceptually, source characters are mapped to Unicode (ISO/IEC 10646) and from ...