Chapter 8. Encrypted Documents

PDF documents can be encrypted using a variety of industry-standard schemes, which have increased in complexity and security over the years, starting with PDF version 1.1. The PDF standard also provides a general mechanism for encapsulating third-party encryption and security policies.

Encryption applies, with a few exceptions, to streams and strings in the file, but does not encrypt numbers or other PDF data types, nor does it encrypt the file as a whole. Thus, the document’s object structure remains visible to applications without the need for decryption, but the substantive content of the document is safeguarded.
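Because the object structure stays readable, a program can tell that a file is encrypted without knowing any password. The following Python sketch illustrates this by scanning the raw bytes for an /Encrypt entry, which is the key a trailer dictionary uses to point at the encryption dictionary. The function name looks_encrypted and the sample byte strings are made up for illustration, and the scan is deliberately naive: it assumes the same byte sequence does not occur by coincidence inside stream data, so a real tool would parse the trailer properly.

```python
import re

# Matches the name /Encrypt followed by whitespace or the start of a
# dictionary, so that longer names such as /EncryptMetadata do not match.
ENCRYPT_KEY = re.compile(rb"/Encrypt(?=[\s<])")


def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Return True if the raw PDF data contains an /Encrypt entry.

    No decryption is attempted: only the unencrypted object structure
    is inspected.
    """
    return ENCRYPT_KEY.search(pdf_bytes) is not None


# Hypothetical trailer fragments, not taken from real files.
plain_trailer = b"trailer\n<< /Size 6 /Root 1 0 R >>\nstartxref\n0\n%%EOF"
encrypted_trailer = (
    b"trailer\n<< /Size 7 /Root 1 0 R /Encrypt 6 0 R >>\nstartxref\n0\n%%EOF"
)

print(looks_encrypted(plain_trailer))
print(looks_encrypted(encrypted_trailer))
```

To try it on a real file, read the file in binary mode and pass the bytes to looks_encrypted.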

The more modern PDF encryption methods allow the file’s XMP metadata stream (XML Metadata) to be left unencrypted, so that it can be extracted and read by programs that cannot open encrypted PDF files, or when the password is not known.
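As a rough illustration of why unencrypted metadata is useful, the Python sketch below pulls an XMP packet out of raw file bytes without parsing the PDF at all. It relies on the xpacket processing instructions that conventionally wrap an XMP packet, and it assumes the metadata stream is stored both unencrypted and uncompressed. The function name extract_xmp_packet and the sample data are hypothetical.

```python
import re
from typing import Optional

# An XMP packet is conventionally wrapped in <?xpacket begin=...?> and
# <?xpacket end=...?> processing instructions.
XMP_PACKET = re.compile(rb"<\?xpacket begin=.*?<\?xpacket end=.*?\?>", re.DOTALL)


def extract_xmp_packet(pdf_bytes: bytes) -> Optional[bytes]:
    """Return the first XMP packet found in the raw data, or None."""
    match = XMP_PACKET.search(pdf_bytes)
    return match.group(0) if match else None


# Hypothetical metadata stream object, for illustration only.
sample = (
    b"5 0 obj\n<< /Type /Metadata /Subtype /XML /Length 0 >>\nstream\n"
    b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/"></x:xmpmeta>\n'
    b'<?xpacket end="w"?>\n'
    b"endstream\nendobj\n"
)

packet = extract_xmp_packet(sample)
if packet is not None:
    print(packet.decode("utf-8"))
```

If the metadata stream were encrypted along with the rest of the file, the scan would find nothing and the function would return None.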

Introduction

Due to the complexity of encrypted documents, it isn’t possible to build an example manually (as we have in other chapters), but we can use pdftk to process our standard hello.pdf file into an encrypted one, encrypted.pdf:

pdftk hello.pdf output encrypted.pdf encrypt_40bit owner_pw fred

This creates the output file encrypted.pdf using the 40-bit RC4 method with an owner password of fred. The owner password is the master password for the file. Someone who has it can do anything with the file, including re-encrypting it or changing the security settings. The user password allows ...
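For readers unfamiliar with RC4, the cipher named in the command above, here is a minimal Python sketch of the RC4 stream cipher itself. It is not the full PDF procedure: the way a PDF reader derives the actual key from the password is not shown, and the 5-byte (40-bit) key below is an arbitrary placeholder. The function name rc4 is chosen for illustration.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is symmetric)."""
    if not key:
        raise ValueError("key must not be empty")

    # Key-scheduling algorithm: permute the 256-entry state using the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Pseudo-random generation: XOR each input byte with a keystream byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


# Placeholder 40-bit key; a real PDF key is derived from the password.
key = bytes.fromhex("0102030405")
ciphertext = rc4(key, b"Hello, World!")

# Applying the same function with the same key recovers the plaintext.
assert rc4(key, ciphertext) == b"Hello, World!"
```

Because RC4 simply XORs the data with a keystream, encryption and decryption are the same call, which is why one function serves both directions.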
