Automate the Boring Stuff with Python, 2nd Edition

by Al Sweigart
November 2019
Beginner to intermediate
592 pages
14h 43m
English
No Starch Press
Content preview from Automate the Boring Stuff with Python, 2nd Edition

15 WORKING WITH PDF AND WORD DOCUMENTS

PDF and Word documents are binary files, which makes them much more complex than plaintext files. In addition to text, they store lots of font, color, and layout information. If you want your programs to read or write to PDFs or Word documents, you’ll need to do more than simply pass their filenames to open().
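To make the "binary file" point concrete, here is a minimal sketch that opens files in binary mode ('rb') and inspects their first few bytes. It relies on two well-known facts: a PDF begins with the bytes %PDF-, and a .docx file is a ZIP archive, so it begins with PK\x03\x04. The helper name sniff_document_type() and the sample filenames are hypothetical, invented for this illustration, and the demo files are throwaway stand-ins rather than valid documents.

```python
import os
import tempfile
import zipfile


def sniff_document_type(path):
    """Guess a file's type from its leading bytes (hypothetical helper)."""
    # Binary mode returns raw bytes instead of decoded text.
    with open(path, "rb") as f:
        header = f.read(5)
    if header.startswith(b"%PDF-"):
        return "pdf"
    if header.startswith(b"PK\x03\x04"):
        # Every ZIP-based format matches here, not only .docx.
        return "zip-based (possibly .docx)"
    return "unknown"


# Build small stand-in files in a temporary folder and classify them.
with tempfile.TemporaryDirectory() as tmp:
    pdf_path = os.path.join(tmp, "sample.pdf")
    with open(pdf_path, "wb") as f:
        f.write(b"%PDF-1.4\n")  # just the header, not a complete PDF

    docx_path = os.path.join(tmp, "sample.docx")
    with zipfile.ZipFile(docx_path, "w") as zf:
        zf.writestr("word/document.xml", "<w:document/>")  # placeholder content

    txt_path = os.path.join(tmp, "notes.txt")
    with open(txt_path, "w") as f:
        f.write("Hello, world!")

    pdf_kind = sniff_document_type(pdf_path)
    docx_kind = sniff_document_type(docx_path)
    txt_kind = sniff_document_type(txt_path)

print(pdf_kind, docx_kind, txt_kind, sep="\n")
```

Recognizing the header is as far as open() alone will take you. Getting at the text, fonts, and layout stored after it requires a module that understands the format.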

Fortunately, there are Python modules that make it easy for you to interact with PDFs and Word documents. This chapter will cover two such modules: PyPDF2 and Python-Docx.

PDF Documents

PDF stands for Portable Document Format and uses the .pdf file extension. Although PDFs support many features, this ...

Publisher Resources

ISBN: 9781098122584