Introducing Python

by Bill Lubanovic
November 2014
Content level: Beginner
481 pages
10h 11m
English
O'Reilly Media, Inc.

Chapter 7. Mangle Data Like a Pro

In this chapter, you’ll learn many techniques for taming data. Most of them concern these built-in Python data types:

strings

Sequences of Unicode characters, used for text data.

bytes and bytearrays

Sequences of eight-bit integers, used for binary data.
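
As a minimal sketch of the difference between these types (the sample values are illustrative assumptions, not taken from the book), a string is indexed by character, whereas bytes and bytearray objects are indexed by integers from 0 to 255:

```python
# A text string is a sequence of Unicode characters.
text = 'café'
print(len(text))   # 4 characters, not bytes
print(text[3])     # é

# A bytes object is an immutable sequence of integers from 0 to 255.
data = bytes([99, 97, 102, 233])
print(data[0])     # 99

# A bytearray holds the same kind of values but can be changed in place.
buf = bytearray(data)
buf[0] = 67        # replace 99 ('c') with 67 ('C')
print(buf)         # bytearray(b'Caf\xe9')
```

Trying `data[0] = 67` on the bytes object instead would raise a TypeError, because bytes are immutable.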

Text Strings

Text is the most familiar type of data to most readers, so we’ll begin with some of the powerful features of text strings in Python.

Unicode

All of the text examples in this book thus far have been plain old ASCII. ASCII was defined in the 1960s, when computers were the size of refrigerators and only slightly better at performing computations. The basic unit of computer storage is the byte, which can store 256 unique values in its eight bits. For various reasons, ASCII used only 7 of those bits (128 unique values): 26 uppercase letters, 26 lowercase letters, 10 digits, some punctuation symbols, some spacing characters, and some nonprinting control codes.
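
The arithmetic here can be checked in Python itself. This is a rough sketch using the standard string module, whose constants are assumed to be a fair stand-in for the ASCII groups described above:

```python
import string

# Eight bits give 256 values; ASCII's seven bits give 128.
byte_values = 2 ** 8
ascii_values = 2 ** 7
print(byte_values, ascii_values)   # 256 128

# The letter and digit groups mentioned above.
print(len(string.ascii_uppercase))  # 26
print(len(string.ascii_lowercase))  # 26
print(len(string.digits))           # 10

# Every ASCII character maps to an integer below 128, and back again.
print(ord('A'))   # 65
print(chr(97))    # a

# Plain ASCII text encodes to one byte per character.
encoded = 'hot dog'.encode('ascii')
print(encoded)    # b'hot dog'
```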

Unfortunately, the world has more letters than ASCII provides. You could have a hot dog at a diner, but never a Gewürztraminer at a café. Many attempts have been made to add more letters and symbols, and you’ll see them at times. Two of them are:

  • Latin-1, or ISO 8859-1

  • Windows code page 1252

Each of these uses all eight bits, but even that’s not enough, especially when you need non-European languages. Unicode is an ongoing international standard to define the characters of all the world’s languages, plus symbols ...
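
To see these limits in practice, here is a small sketch using Python's built-in codecs for ASCII, Latin-1, and Windows code page 1252. The euro sign and the snowman character are illustrative choices, not examples from the text:

```python
word = 'Gewürztraminer'

# ASCII has no ü, so encoding fails.
try:
    word.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError as err:
    ascii_ok = False
    bad_char = err.object[err.start:err.end]
    print('ASCII cannot encode:', bad_char)   # ü

# Latin-1 uses all eight bits, so ü fits in a single byte.
latin1 = word.encode('latin-1')
print(len(word), len(latin1))   # 14 14
print(latin1[3])                # 252

# The eight-bit encodings disagree with each other: cp1252 has a euro
# sign at byte 0x80, but Latin-1 has no euro sign at all.
euro = '\u20ac'
print(euro.encode('cp1252'))    # b'\x80'
try:
    euro.encode('latin-1')
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False

# Some characters have code points far beyond what eight bits can hold.
snowman = '\u2603'
print(ord(snowman))             # 9731
```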

Publisher Resources

ISBN: 9781449361167