Visualizing Data

by Ben Fry
December 2007
Beginner to intermediate
382 pages
10h 29m
English
O'Reilly Media, Inc.
Content preview from Visualizing Data

Chapter 10. Parsing Data

Parsing converts a raw stream of data into a structure that can be manipulated in software. Lots of parsing is detective work, requiring you to spend time looking at files or data streams to figure out what’s inside. The data might be available in an easily parsed format (such as an RSS feed in XML format) or in a proprietary binary format. This chapter covers some of the methods used to store data, methods for reading common data formats, and some detective procedures for dissecting data. Even if your particular data format is not covered in this chapter, the methods discussed are applicable to any data source.
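As a minimal sketch of what "converting a raw stream into a structure" can look like for an easily parsed format, the following Python example turns a small RSS document into a list of dictionaries. The sample feed, the `RSS_SAMPLE` name, and the `parse_rss_items` helper are assumptions made for illustration and do not come from the book; only the standard library's `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 document used only for illustration.
RSS_SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return one dict per <item> element, holding its title and link."""
    root = ET.fromstring(xml_text)
    items = []
    # iter("item") walks the whole tree, so the <channel> nesting
    # does not need to be spelled out.
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return items


for entry in parse_rss_items(RSS_SAMPLE):
    print(entry["title"], entry["link"])
```

The raw text becomes a list that can be sorted, filtered, or drawn. A proprietary binary format would need far more investigation before reaching the same point.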

Parsing may also seem to be quite disconnected from the actual process of data visualization. However, it’s part of the process for a reason: chances are, you’ll have to obtain data from a source that’s not under your control and will spend a lot of time figuring out how to use the data that you’re given. This chapter aims to give you a sense of how files are typically structured because, more likely than not, the data you acquire will be poorly documented (if it’s documented at all). Recognizing the basic file format, or even whether the data is compressed, provides a valuable clue for unpacking unknown information.
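One common piece of this detective work is checking the first few bytes of a file against known signatures ("magic numbers"). The sketch below is one possible approach, not the book's own code: the `SIGNATURES` table and the `sniff_format` helper are hypothetical names, the list of formats is deliberately short, and the text/binary fallback is only a rough heuristic.

```python
import gzip

# Leading bytes of a few common formats. Illustrative, not exhaustive.
SIGNATURES = [
    (b"\x1f\x8b", "gzip"),
    (b"PK\x03\x04", "zip"),
    (b"BZh", "bzip2"),
    (b"%PDF", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"<?xml", "xml"),
]


def sniff_format(data):
    """Guess a format name from the first bytes of a file or stream."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    # Rough fallback: a NUL byte near the start suggests binary data,
    # since plain text files rarely contain one.
    if b"\x00" in data[:512]:
        return "unknown binary"
    return "text"


print(sniff_format(gzip.compress(b"some data")))  # compressed input
print(sniff_format(b"a,b,c\n1,2,3\n"))            # plain delimited text
```

In practice the same check would run on the first bytes read from a file opened in binary mode. A result of `"text"` or `"unknown binary"` only narrows the search and says nothing definite about the actual layout.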

Generally, data boils down to lists (one-dimensional sets), matrices (two-dimensional tables, such as a spreadsheet), or trees and graphs (individual “nodes” of data and sets of “edges” that describe connections between them). Strictly ...
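The three shapes named above can be sketched with plain Python containers. All values and names here (`readings`, `table`, `nodes`, `edges`, `neighbors`) are made up for illustration, and treating the edges as undirected is an assumption of this sketch.

```python
# A list: a one-dimensional set of values.
readings = [3, 1, 4, 1, 5]

# A matrix: a two-dimensional table, stored as a list of rows,
# with the first row acting as a header (as in a spreadsheet).
table = [
    ["name", "count"],
    ["alpha", 10],
    ["beta", 20],
]
counts = [row[1] for row in table[1:]]  # pull out one column

# A graph: individual nodes plus edges describing connections between them.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("c", "d")]


def neighbors(node, edge_list):
    """Return the nodes connected to `node`, treating edges as undirected."""
    found = []
    for src, dst in edge_list:
        if src == node:
            found.append(dst)
        elif dst == node:
            found.append(src)
    return found


print(counts)
print(neighbors("c", edges))
```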

Publisher Resources

ISBN: 9780596514556