CHAPTER 12
Document Metadata
Document metadata is information stored within a file that describes the file or provides context about it. It is usually invisible when the file is opened and viewed normally.
Document metadata can include pieces of information such as the document's title, the software used to create it, the name of the author or organization where it was created, the name of the computer on which it was created, and the date and time the file was first created or modified.
Beyond these basic fields, different file types carry different kinds of metadata, and the fields can also vary with the software used to create the file. How much metadata is saved with a document ultimately depends on that software.
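To make this concrete, here is a minimal sketch of reading a few of these fields from a modern Microsoft Office file (.docx, .xlsx, .pptx). It assumes the file is a ZIP package that stores its core properties in `docProps/core.xml`, which is the usual layout for the Office Open XML format. The function name `read_core_properties` and the `FIELDS` mapping are illustrative choices, not part of any library, and other file types keep their metadata elsewhere.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside docProps/core.xml of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Illustrative mapping from friendly names to the XML elements that hold them.
FIELDS = {
    "title": "dc:title",
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(path_or_file):
    """Return the core metadata fields present in an OOXML file as a dict.

    Accepts a filesystem path or a binary file-like object. Returns an
    empty dict if the package has no docProps/core.xml part.
    """
    with zipfile.ZipFile(path_or_file) as package:
        try:
            xml_bytes = package.read("docProps/core.xml")
        except KeyError:
            return {}

    root = ET.fromstring(xml_bytes)
    found = {}
    for name, tag in FIELDS.items():
        element = root.find(tag, NS)
        # Skip fields that are missing or empty in this particular document.
        if element is not None and element.text:
            found[name] = element.text
    return found
```

Calling `read_core_properties("report.docx")` (a hypothetical filename) would return only the fields that the authoring software actually wrote, which is one way to see how much the saved metadata varies from one program to another.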
Where things get really interesting is when you come across sensitive documents that have not been stripped of their metadata. The metadata within these documents (especially in photos) can contain incredibly sensitive information.
To give you an example, in 2016, there was a scandal involving leaked nude celebrity photos. The photos were stolen from the victims’ personal accounts and leaked online. Analysis showed that the metadata within the photos contained very specific, identifying information such as the camera (or phone) type, lens settings, the date and time the photos were taken, and even the location where they were taken!
This type of information can provide a ...
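The geolocation mentioned in the photo example is typically stored in EXIF data as degrees, minutes, and seconds (the `GPSLatitude` and `GPSLongitude` tags) together with a hemisphere reference (`GPSLatitudeRef` and `GPSLongitudeRef`). The short sketch below shows only the arithmetic for turning those values into the signed decimal degrees that mapping tools expect. The function name `dms_to_decimal` is illustrative, extracting the raw tag values from an image is assumed to happen elsewhere, and the coordinates in the usage note are made-up sample values.

```python
from fractions import Fraction


def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus a hemisphere reference
    ("N", "S", "E", or "W") to signed decimal degrees.

    Southern latitudes and western longitudes come out negative.
    """
    # Fractions avoid accumulating rounding error before the final conversion.
    value = Fraction(degrees) + Fraction(minutes) / 60 + Fraction(seconds) / 3600
    if ref.upper() in ("S", "W"):
        value = -value
    return float(value)
```

For example, `dms_to_decimal(40, 26, 46, "N")` and `dms_to_decimal(79, 58, 56, "W")` yield roughly 40.4461 and -79.9822, a pair that can be pasted directly into most mapping services.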