CHAPTER 12
Document Metadata

Document metadata is information stored within a file that provides context for, or a description of, that file. It is usually not visible when the document is viewed normally.

Document metadata can include pieces of information such as the document's title, the software used to create it, the name of the author or organization where it was created, the name of the computer on which it was created, and the date and time the file was created or last modified.

Beyond this basic information, different file types can contain different kinds of metadata. How much is actually saved with a document ultimately depends on the software that was used to create it.
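As one concrete illustration of the basic fields described above, the following is a minimal Python sketch that reads them from an Office Open XML file (.docx, .xlsx, .pptx). It relies on the fact that these files are ZIP archives that normally carry an optional `docProps/core.xml` part. The function name `read_core_properties` and the `FIELDS` mapping are hypothetical names chosen for this example, and other file types store their metadata differently.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part (docProps/core.xml).
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly key -> element name. This mapping is an assumption made for
# the example; it covers only a handful of the possible properties.
FIELDS = {
    "title": "dc:title",
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(source):
    """Return a dict of core metadata from an Office Open XML file.

    `source` may be a file path or a binary file-like object.
    An empty dict means the core-properties part is absent, for
    example because the metadata was stripped.
    """
    with zipfile.ZipFile(source) as archive:
        try:
            xml_bytes = archive.read("docProps/core.xml")
        except KeyError:
            return {}

    root = ET.fromstring(xml_bytes)
    properties = {}
    for key, tag in FIELDS.items():
        element = root.find(tag, NS)
        # Skip elements that are missing or present but empty.
        if element is not None and element.text:
            properties[key] = element.text
    return properties
```

Calling `read_core_properties("report.docx")` (the file name is a placeholder) would return whichever of these fields the authoring software chose to write, which is why the result can differ from one document to the next.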

Things get really interesting when you come across sensitive documents that have not been stripped of their metadata. That metadata, especially in photos, can contain incredibly sensitive information.

To give you an example, in 2016, there was a scandal involving leaked nude celebrity photos. The photos were stolen from the victims’ personal accounts and leaked online. Upon analysis, the metadata within the photos was found to contain very specific and identifying information such as the camera (or phone) type, lens settings, the date and time the photos were taken, and even the location where they were taken!
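To make the geolocation point concrete: photo metadata in the EXIF format typically stores a GPS position as degrees, minutes, and seconds, together with a hemisphere letter (N/S for latitude, E/W for longitude). The sketch below converts such values into the signed decimal degrees that mapping tools accept. The function name `dms_to_decimal` is hypothetical, the sample coordinates in the usage note are arbitrary, and reading the raw values out of an image file would need an EXIF-capable tool or library that is not shown here.

```python
def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) to signed decimal degrees.

    `dms` is a sequence of three numbers, as EXIF GPS tags provide them.
    `ref` is the hemisphere letter: "N" or "E" yields a positive value,
    "S" or "W" a negative one.
    """
    degrees, minutes, seconds = (float(value) for value in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    if ref.upper() in ("S", "W"):
        decimal = -decimal
    return decimal
```

For example, `dms_to_decimal((40, 26, 46), "N")` and `dms_to_decimal((79, 58, 56), "W")` give roughly 40.4461 and -79.9822, a coordinate pair precise enough to place a photo at a specific address.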

This type of information can provide a ...
