December 2018
Beginner to intermediate
796 pages
19h 54m
English
Using the encode and decode methods, we can encode Unicode strings into bytes objects and decode bytes objects back into strings. UTF-8 is a variable-length character encoding, capable of encoding all possible Unicode code points. It is the dominant encoding for the web. Notice also that by adding a literal b in front of a string declaration, we're creating a bytes object:
>>> s = "This is üŋíc0de"  # unicode string: code points
>>> type(s)
<class 'str'>
>>> encoded_s = s.encode('utf-8')  # utf-8 encoded version of s
>>> encoded_s
b'This is \xc3\xbc\xc5\x8b\xc3\xadc0de'  # result: bytes object
>>> type(encoded_s)  # another way to verify it
<class 'bytes'>
>>> encoded_s.decode('utf-8')  # let's revert to the original
'This is üŋíc0de'
>>> bytes_obj = b"A bytes object"
...
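To make the variable-length point concrete, here is a minimal sketch that reuses the string from the session above. The names byte_lengths, char_count, and byte_count are illustrative and do not come from the original text. The non-ASCII characters are written as \u escapes only so that the example does not depend on how this file itself is encoded.

```python
# Same text as "This is üŋíc0de", spelled with escapes for the non-ASCII characters.
s = "This is \u00fc\u014b\u00edc0de"

# str -> bytes
encoded = s.encode("utf-8")

# UTF-8 is variable length: an ASCII character takes one byte,
# while other code points take two, three, or four bytes.
# The sample characters are T, u-umlaut, eng, and the euro sign.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in "T\u00fc\u014b\u20ac"}

# len() counts code points on a str and bytes on a bytes object.
char_count = len(s)
byte_count = len(encoded)

# bytes -> str, using the same codec, gives back the original string.
round_trip = encoded.decode("utf-8")

# A bytes literal may only contain ASCII characters (plus escapes).
bytes_obj = b"A bytes object"

print(byte_lengths)
print(char_count, byte_count)  # 15 18
print(round_trip == s)         # True
print(type(bytes_obj))
```

The three accented characters each need two bytes in UTF-8, which is why the encoded form is three bytes longer than the number of code points in the string.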