Data Visualization with Python and JavaScript by Kyran Dale

Part III. Cleaning and Exploring Data with Pandas

In this part of the book, the second phase of our toolchain (see Figure III-1), we take the Nobel Prize dataset we scraped with Scrapy in Chapter 6, clean it up, and then explore it for interesting nuggets. The principal tools we’ll be using are the large Python libraries Matplotlib and Pandas.

Pandas will be introduced in the next couple of chapters, along with its building block, NumPy. In Chapter 9 we’ll use Pandas to clean the Nobel dataset. Then in Chapter 11 we’ll use it, in conjunction with Python’s plotting library Matplotlib, to explore the cleaned data.

In Part IV we’ll see how to deliver the freshly cleaned Nobel Prize dataset to the browser, using Python’s Flask web server.

Figure III-1. Our dataviz toolchain: cleaning and exploring the data
