One of the modules available in Python for extracting data from PDF documents is PyPDF2. The module can be installed directly with pip, since it is published in the official Python package repository, PyPI.
The latest version of this module is listed on its PyPI page, https://pypi.org/project/PyPDF2/.
This module offers us the ability to extract document information, and encrypt and decrypt documents. To extract metadata, we can use the PdfFileReader class and the getDocumentInfo() method, which returns a dictionary with the data of the document:
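As a minimal sketch of this approach, the snippet below opens a single PDF with PdfFileReader and returns the result of getDocumentInfo() as a plain dictionary. The function names read_pdf_metadata and clean_metadata are hypothetical, chosen only for illustration. Note that PdfFileReader and getDocumentInfo() belong to the older PyPDF2 API; PyPDF2 3.x replaced them with PdfReader and the metadata property, so this sketch assumes a PyPDF2 release earlier than 3.0.

```python
def clean_metadata(info):
    """Turn a PDF document-information mapping into a plain dict.

    PDF Info keys start with '/', for example '/Author' or '/Producer'.
    This hypothetical helper strips that prefix and stringifies the values.
    """
    if not info:
        # getDocumentInfo() can return None when the PDF has no Info dictionary
        return {}
    return {str(key).lstrip("/"): str(value) for key, value in info.items()}


def read_pdf_metadata(pdf_path):
    """Return the document information of one PDF file as a plain dict."""
    # Imported here so clean_metadata() can be used without PyPDF2 installed.
    # Assumption: PyPDF2 < 3.0, where the legacy PdfFileReader class exists.
    from PyPDF2 import PdfFileReader

    with open(pdf_path, "rb") as pdf_file:
        reader = PdfFileReader(pdf_file)
        return clean_metadata(reader.getDocumentInfo())


# Example usage (the file name is a placeholder):
# for key, value in read_pdf_metadata("example.pdf").items():
#     print(f"{key}: {value}")
```

Typical keys in the returned dictionary include Author, Creator, Producer, Title and CreationDate, although any of them may be absent depending on the tool that generated the document.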
The following function would allow us to obtain the information of all the PDF ...