Chapter 6. Data Encoding and Processing
The main focus of this chapter is using Python to process data presented in different kinds of common encodings, such as CSV files, JSON, XML, and binary packed records. Unlike the chapter on data structures, this chapter is not focused on specific algorithms, but instead on the problem of getting data in and out of a program.
6.1. Reading and Writing CSV Data
Problem
You want to read or write data encoded as a CSV file.
Solution
For most kinds of CSV data, use the csv module from the standard library. For example, suppose you have some stock market data in a file named stocks.csv like this:
Symbol,Price,Date,Time,Change,Volume
"AA",39.48,"6/11/2007","9:36am",-0.18,181800
"AIG",71.38,"6/11/2007","9:36am",-0.15,195500
"AXP",62.58,"6/11/2007","9:36am",-0.46,935000
"BA",98.31,"6/11/2007","9:36am",+0.12,104800
"C",53.08,"6/11/2007","9:36am",-0.25,360900
"CAT",78.29,"6/11/2007","9:36am",-0.23,225400Here’s how you would read the data as a sequence of tuples:
import csv
with open('stocks.csv') as f:
    f_csv = csv.reader(f)
    headers = next(f_csv)
    for row in f_csv:
        # Process row
        ...
In the preceding code, row will be a list of strings, one per field. Thus, to access certain fields, you will need to use indexing, such as row[0] (Symbol) and row[4] (Change).
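As a rough illustration of positional access, the sketch below reads two of the sample rows from an in-memory string rather than from stocks.csv, so it can be run without any file on disk. The names SAMPLE and read_symbol_changes are hypothetical and exist only for this example. Note that the csv module does not convert values for you: every field comes back as a string, so numeric columns need an explicit conversion.

```python
import csv
import io

# In-memory stand-in for stocks.csv (header plus the first two sample rows).
SAMPLE = '''Symbol,Price,Date,Time,Change,Volume
"AA",39.48,"6/11/2007","9:36am",-0.18,181800
"AIG",71.38,"6/11/2007","9:36am",-0.15,195500
'''

def read_symbol_changes(fileobj):
    """Return (symbol, change) pairs using positional indexing."""
    f_csv = csv.reader(fileobj)
    next(f_csv)  # Skip the header row
    results = []
    for row in f_csv:
        # row[0] is Symbol, row[4] is Change; both arrive as strings,
        # so the numeric field is converted explicitly.
        results.append((row[0], float(row[4])))
    return results

pairs = read_symbol_changes(io.StringIO(SAMPLE))
print(pairs)
```

The magic numbers 0 and 4 are what make this style fragile: if a column is added or reordered, every index has to be rechecked by hand.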
Since such indexing can often be confusing, this is one place where you might want to consider the use of named tuples. For example:
import csv
from collections import namedtuple
with open('stocks.csv') as f:
    f_csv = csv.reader(f)
    headings = next(f_csv)
    Row = namedtuple ...
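The listing above is cut off, so what follows is only one plausible way the named-tuple approach could be completed, not necessarily the original continuation. It is a minimal sketch that assumes every column header is a valid Python identifier (true for the sample data, but not for CSV files in general). The names SAMPLE and read_named_rows are hypothetical, and the data is read from an in-memory string so the example is self-contained.

```python
import csv
import io
from collections import namedtuple

# In-memory stand-in for stocks.csv (header plus the first two sample rows).
SAMPLE = '''Symbol,Price,Date,Time,Change,Volume
"AA",39.48,"6/11/2007","9:36am",-0.18,181800
"AIG",71.38,"6/11/2007","9:36am",-0.15,195500
'''

def read_named_rows(fileobj):
    """Return each data row as a named tuple keyed by the header names."""
    f_csv = csv.reader(fileobj)
    headings = next(f_csv)
    # Assumption: the headers are valid identifiers. namedtuple raises
    # ValueError otherwise (for example, on a header containing a space).
    Row = namedtuple('Row', headings)
    return [Row(*r) for r in f_csv]

rows = read_named_rows(io.StringIO(SAMPLE))
for row in rows:
    # Fields are now reachable by name instead of by position.
    print(row.Symbol, row.Change)
```

With this arrangement, row.Symbol and row.Change replace row[0] and row[4]. The values are still strings, so any numeric conversion remains a separate step.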