Chapter 13. Files and Databases

Most of the programs we have seen so far are ephemeral in the sense that they run for a short time and produce output, but when they end, their data disappears. Each time you run an ephemeral program, it starts with a clean slate.

Other programs are persistent: they run for a long time (or all the time); they keep at least some of their data in long-term storage; and if they shut down and restart, they pick up where they left off.

A simple way for programs to maintain their data is by reading and writing text files. A more versatile alternative is to store data in a database. Databases are specialized files that can be read and written more efficiently than text files, and they provide additional capabilities.
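As a minimal sketch of the text-file approach, the following keeps a run counter in a file so that each run picks up where the previous one left off. The names `load_count`, `save_count`, and `run_count.txt` are hypothetical, chosen only for illustration, and the loop merely simulates three separate runs inside one script. The file lives in a temporary directory here; a real program would use a location that survives between runs.

```python
import os
import tempfile


def load_count(path):
    """Return the count saved in the file, or 0 if the file does not exist yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return 0


def save_count(path, count):
    """Overwrite the file with the current count as text."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(str(count))


# Hypothetical location for the saved state (a temporary directory).
state_dir = tempfile.mkdtemp()
state_path = os.path.join(state_dir, "run_count.txt")

# Simulate three separate runs: each one loads the old value,
# updates it, and writes it back before "exiting".
for _ in range(3):
    count = load_count(state_path) + 1
    save_count(state_path, count)

final_count = load_count(state_path)
print(final_count)
```

Because the state is read from the file at the start of each pass rather than kept in a variable, the same pattern would work across real restarts of the program.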

In this chapter, we’ll write programs that read and write text files and databases, and as an exercise you’ll write a program that searches a collection of photos for duplicates. But before you can work with a file, you have to find it, so we’ll start with filenames, paths, and directories.

Filenames and Paths

Files are organized into directories, also called “folders.” Every running program has a current working directory, which is the default directory for most operations. For example, when you open a file, Python looks for it in the current working directory.

The os module provides functions for working with files and directories (“os” stands for “operating system”). It provides a function called getcwd that gets the name of the current ...
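A short sketch of how this might look in practice: `os.getcwd` returns the current working directory as a string, and a relative filename is resolved against that directory. The filename `example.txt` is a hypothetical placeholder; the file does not need to exist for this check.

```python
import os

# Ask the operating system for the current working directory.
cwd = os.getcwd()
print(cwd)

# Hypothetical filename, used only to show how relative names are resolved.
relative_name = "example.txt"

# Joining the working directory and the relative name gives the same
# absolute path that os.path.abspath computes for the relative name.
full_path = os.path.normpath(os.path.join(cwd, relative_name))
same_path = os.path.abspath(relative_name) == full_path
print(same_path)
```

The printed directory depends on where the script is run, which is why the same relative filename can refer to different files in different runs.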
