Chapter 8

Shaping Data


Bullet Manipulating HTML data

Bullet Manipulating raw text

Bullet Discovering the bag of words model and other techniques

Bullet Manipulating graph data

Chapter 7 demonstrates techniques for working with data as an entity — as something you work with in Python. However, data doesn’t exist in a vacuum. It doesn’t just suddenly appear within Python for absolutely no reason at all. As demonstrated in Chapter 6, you load the data. However, loading may not be enough — you may have to shape the data as part of loading it. That’s the purpose of this chapter. You discover how to work with a variety of container types in a way that makes it possible to load data from a number of complex container types, such as HTML pages. In fact, you even work with graphics, images, and sounds.

Remember As you progress through the book, you discover that data takes all kinds of forms and shapes. As far as the computer is concerned, data consists of 0s and 1s. Humans give the data meaning by formatting, ...

Get Python for Data Science For Dummies, 2nd Edition now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.