Strings are also data: they're arrays of bytes. But unlike plain Data, the String type knows how to interpret those bytes. To interpret them correctly, we need to know the encoding. The most widespread options are UTF-8 and UTF-16, both of which are encodings of Unicode.
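To see why the encoding matters, here is a minimal sketch that decodes the same two bytes in two different ways. It assumes Swift with Foundation; the byte values and variable names are chosen only for illustration.

```swift
import Foundation

// Two raw bytes with no encoding attached (illustrative values).
let rawBytes = Data([0xC3, 0xA9])

// Read as UTF-8, the two bytes form a single character, U+00E9.
let decodedAsUTF8 = String(data: rawBytes, encoding: .utf8)

// Read as ISO Latin-1, each byte is its own character: U+00C3 then U+00A9.
let decodedAsLatin1 = String(data: rawBytes, encoding: .isoLatin1)

print(decodedAsUTF8 ?? "not valid UTF-8")
print(decodedAsLatin1 ?? "not valid Latin-1")
```

The bytes never change; only the interpretation does, which is why a String has to know its encoding.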
UTF-8 uses one to four bytes to encode one character (more precisely, one Unicode code point). Characters from the ASCII range, including the basic Latin letters, use one byte. If the text contains only those letters, digits, spaces, and standard punctuation symbols, its ASCII and UTF-8 encodings are byte-for-byte identical.
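Both claims can be checked with a short sketch, again assuming Swift with Foundation. The sample strings are illustrative; the non-ASCII characters are written as Unicode escapes so the source file's own encoding does not affect the result.

```swift
import Foundation

// ASCII-only text: the ASCII and UTF-8 encodings produce the same bytes.
let asciiText = "Hello, world!"
let asciiBytes = asciiText.data(using: .ascii)!
let utf8Bytes = asciiText.data(using: .utf8)!
let identical = (asciiBytes == utf8Bytes)
print(identical)

// One code point from each UTF-8 length class:
// U+0061 (a), U+00E9, U+4E16, U+1F30D.
let samples = "a\u{E9}\u{4E16}\u{1F30D}"
let byteCounts = samples.unicodeScalars.map { String($0).utf8.count }
print(byteCounts)  // [1, 2, 3, 4]
```

The force unwraps are safe here only because the sample text is known to be pure ASCII; `data(using: .ascii)` returns nil for text that ASCII cannot represent.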
UTF-16 uses two or four ...