Chapter 16. Data in a Box: Persistent Storage

It is a capital mistake to theorize before one has data.

Arthur Conan Doyle

An active program accesses data stored in Random Access Memory, or RAM. RAM is very fast, but it is expensive and requires a constant supply of power; if the power goes out, all the data in memory is lost. Disk drives are slower than RAM but have more capacity, cost less, and retain data even after someone trips over the power cord. Thus, a huge amount of effort in computer systems has been devoted to making the best trade-offs between storing data on disk and in RAM. As programmers, we need persistence: storing and retrieving data using nonvolatile media such as disks.

This chapter is all about the different flavors of data storage, each optimized for different purposes: flat files, structured files, and databases. File operations other than input and output are covered in Chapter 14.

A record is one chunk of related data, made up of individual fields.

Flat Text Files

The simplest persistence is a plain old flat file. This works well if your data has a very simple structure and you exchange all of it between disk and memory. Plain text data might be suitable for this treatment.
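As a minimal sketch of that whole-file exchange, the following writes a string to disk in one step and reads it all back in another. The helper names save_text and load_text and the file name notes.txt are hypothetical, chosen only for illustration, and a temporary directory is used so the example cleans up after itself.

```python
import tempfile
from pathlib import Path


def save_text(path, text):
    """Write the entire string to the file, replacing any old contents."""
    path.write_text(text, encoding="utf-8")


def load_text(path):
    """Read the entire file back into memory as one string."""
    return path.read_text(encoding="utf-8")


original = "First line\nSecond line\n"

# A temporary directory keeps the demonstration self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"   # hypothetical file name
    save_text(path, original)
    restored = load_text(path)

print(restored == original)
```

This approach is reasonable only while the whole file fits comfortably in memory, which is the case the paragraph above describes.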

Padded Text Files

In this format, each field in a record has a fixed width, and is padded (usually with space characters) to that width in the file, giving each line (record) the same width. A programmer can use seek() to jump around the file and only read and write ...
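Because every record has the same width, the byte offset of record n is simply n times the record width, and seek() can jump straight to it. The sketch below illustrates the idea under some assumptions that are not from the text: two fields (a 10-character name and an 8-character city), ASCII-only data, space padding, and a trailing newline on each record. The function names are hypothetical.

```python
import tempfile

NAME_WIDTH = 10                                # assumed field width
CITY_WIDTH = 8                                 # assumed field width
RECORD_WIDTH = NAME_WIDTH + CITY_WIDTH + 1     # +1 for the newline


def format_record(name, city):
    """Pad (or truncate) each field to its fixed width and encode as bytes."""
    line = f"{name[:NAME_WIDTH]:<{NAME_WIDTH}}{city[:CITY_WIDTH]:<{CITY_WIDTH}}\n"
    return line.encode("ascii")


def read_record(f, index):
    """Seek to record number `index` and return its fields without padding."""
    f.seek(index * RECORD_WIDTH)
    raw = f.read(RECORD_WIDTH).decode("ascii")
    name = raw[:NAME_WIDTH].rstrip()
    city = raw[NAME_WIDTH:NAME_WIDTH + CITY_WIDTH].rstrip()
    return name, city


def write_record(f, index, name, city):
    """Overwrite record number `index` in place, leaving the others untouched."""
    f.seek(index * RECORD_WIDTH)
    f.write(format_record(name, city))


# Binary mode keeps offsets in bytes and avoids newline translation.
with tempfile.TemporaryFile() as f:
    for name, city in [("Ada", "London"), ("Grace", "New York"), ("Linus", "Helsinki")]:
        f.write(format_record(name, city))

    second = read_record(f, 1)              # jump directly to the second record
    write_record(f, 1, "Grace", "Boston")   # update it without rewriting the file
    updated = read_record(f, 1)
    third = read_record(f, 2)               # neighboring record is unchanged

print(second, updated, third)
```

The trade-off in this sketch is that values longer than the field width are silently truncated; a real format would need to decide whether to reject or truncate such values.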
