O'Reilly logo

Mastering Python for Networking and Security by Jose Manuel Ortega

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Introduction to PyPDF2

One of the modules available in Python to extract data from PDF documents is PyPDF2. The module can be downloaded directly with the pip install utility since it is located in the official Python repository .

In the https://pypi.org/project/PyPDF2/ URL, we can see the last version of this module:

This module offers us the ability to extract document information, and encrypt and decrypt documents. To extract metadata, we can use the PdfFileReader class and the getDocumentInfo() method, which returns a dictionary with the data of the document:

The following function would allow us to obtain the information of all the PDF ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required