Chapter 3. Excel Files

Microsoft Excel is ubiquitous. We use Excel to store data on customers, inventory, and employees. We use it to track operations, sales, and financials. The list of ways people use Excel in business is long and diverse. Because Excel is such an integral tool in business, knowing how to process Excel files in Python will enable you to add Python into your data processing workflows, receiving data from other people and sharing results with them in ways they’re comfortable with.

Unlike Python’s csv module, there is not a standard module in Python for processing Excel files (i.e., files with the .xls or .xlsx extension). To complete the examples in this section, you need to have the xlrd and xlwt packages. The xlrd and xlwt packages enable Python to process Excel files on any operating system, and they have strong support for Excel dates. If you installed Anaconda Python, then you already have the packages because they’re bundled into the installation. If you installed Python from the Python.org website, then you need to follow the instructions in Appendix A to download and install the two packages.

A few words on terminology: when I refer to an “Excel file” that’s the same thing as an “Excel workbook.” An Excel workbook contains one or more Excel worksheets. In this chapter, I’ll be using the words “file” and “workbook” interchangeably, and I’ll refer to the individual worksheets within a workbook as worksheets.

As we did when working with CSV files in Chapter 2 ...

Get Foundations for Analytics with Python now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.