October 2018
Beginner to intermediate
466 pages
12h 2m
English
If we have an array of bytes from somewhere, we can convert it to Unicode using the .decode method on the bytes class. This method accepts a string for the name of the character encoding. There are many such names; common ones for Western languages include ASCII, UTF-8, and latin-1.
The sequence of bytes (in hex), 63 6c 69 63 68 e9, actually represents the characters of the word cliché in latin-1 encoding. The following example will encode this sequence of bytes and convert it to a Unicode string using latin-1 encoding:
characters = b'\x63\x6c\x69\x63\x68\xe9'
print(characters)
print(characters.decode("latin-1"))
The first line creates a bytes object. Analogous to an f-string, the b character immediately before the ...