Chapter 6. Data Encoding and Processing
The main focus of this chapter is using Python to process data presented in different kinds of common encodings, such as CSV files, JSON, XML, and binary packed records. Unlike the chapter on data structures, this chapter is not focused on specific algorithms, but instead on the problem of getting data in and out of a program.
6.1. Reading and Writing CSV Data
Problem
You want to read or write data encoded as a CSV file.
Solution
For most kinds of CSV data, use the csv library. For example, suppose you have some stock market data in a file named stocks.csv like this:
    Symbol,Price,Date,Time,Change,Volume
    "AA",39.48,"6/11/2007","9:36am",-0.18,181800
    "AIG",71.38,"6/11/2007","9:36am",-0.15,195500
    "AXP",62.58,"6/11/2007","9:36am",-0.46,935000
    "BA",98.31,"6/11/2007","9:36am",+0.12,104800
    "C",53.08,"6/11/2007","9:36am",-0.25,360900
    "CAT",78.29,"6/11/2007","9:36am",-0.23,225400
Here's how you would read the data as a sequence of rows:
    import csv
    with open('stocks.csv') as f:
        f_csv = csv.reader(f)
        headers = next(f_csv)
        for row in f_csv:
            # Process row
            ...
In the preceding code, row will be a list of strings. Thus, to access certain fields, you will need to use indexing, such as row[0] (Symbol) and row[4] (Change).
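To make the indexing concrete, here is a minimal sketch that reads two of the sample rows from an in-memory string rather than from stocks.csv. The io.StringIO stand-in and the names SAMPLE and changes are assumptions made for illustration. Note that csv.reader delivers every field as a string, so any numeric conversion is left to your code.

```python
import csv
import io

# A small in-memory stand-in for stocks.csv (illustrative only)
SAMPLE = '''Symbol,Price,Date,Time,Change,Volume
"AA",39.48,"6/11/2007","9:36am",-0.18,181800
"AIG",71.38,"6/11/2007","9:36am",-0.15,195500
'''

f_csv = csv.reader(io.StringIO(SAMPLE))
headers = next(f_csv)            # first row holds the column names
changes = {}
for row in f_csv:
    symbol = row[0]              # Symbol column
    change = float(row[4])       # Change column; fields arrive as strings
    changes[symbol] = change
```

Positional access like this works, but it silently breaks if the column order changes, which is the motivation for the named approach discussed next.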
Since such indexing can often be confusing, this is one place where you might want to consider the use of named tuples. For example:
    from collections import namedtuple
    with open('stocks.csv') as f:
        f_csv = csv.reader(f)
        headings = next(f_csv)
        Row = namedtuple ...
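As a rough sketch of the named-tuple idea, and not necessarily the exact code this recipe goes on to use, one way to build a Row class from the header line is shown below. It assumes the column headings are valid Python identifiers, which holds for the sample data; the in-memory SAMPLE string and the names rows and first are hypothetical and used only for illustration.

```python
import csv
import io
from collections import namedtuple

# A small in-memory stand-in for stocks.csv (illustrative only)
SAMPLE = '''Symbol,Price,Date,Time,Change,Volume
"AA",39.48,"6/11/2007","9:36am",-0.18,181800
"AIG",71.38,"6/11/2007","9:36am",-0.15,195500
'''

f_csv = csv.reader(io.StringIO(SAMPLE))
headings = next(f_csv)

# Assumption: every heading is a valid identifier; otherwise
# namedtuple raises ValueError and the names must be cleaned first.
Row = namedtuple('Row', headings)

rows = [Row(*r) for r in f_csv]  # one Row instance per data line
first = rows[0]
```

With this in place, fields can be read by name, such as first.Symbol and first.Change, instead of by position. The values are still strings at this point.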