Chapter 10. Archive Software Methods

Note

This chapter is written by Dan Frith, aka @penguinpunk, an industry veteran from Down Under. Dan’s a fan of my work, but not what I would call a fanboy. (He has enough chutzpah to tell me when he thinks I’m wrong. Must be an Australian thing.) He’s got great field experience, so in addition to being a tech reviewer of the book, I asked him to write this chapter.

An archive is the one data protection system you probably need and most likely don’t have. In Chapter 3, I defined an archive as a separate copy of data stored in a separate location, made to serve as a reference copy, and stored with enough metadata to find the data in question without knowing where it came from. A backup, on the other hand, is a secondary copy of data that you use for recovery when your primary copy is affected in some way, whether by corruption, deletion, or some other misfortune.
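To make the distinction concrete, here is a minimal Python sketch of the two lookup styles. It is an illustration only: every name in it (ArchiveRecord, restore_from_backup, find_in_archive, the metadata fields, and the sample path) is hypothetical and does not describe how any real backup or archive product works.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveRecord:
    """Hypothetical archive entry: the content plus descriptive metadata."""
    content: str
    metadata: dict = field(default_factory=dict)


def restore_from_backup(backup: dict, source_path: str) -> str:
    # A backup is keyed by where the data came from, so a restore
    # requires knowing the original location.
    return backup[source_path]


def find_in_archive(archive: list, **criteria) -> list:
    # An archive is searched by what the data is about; the original
    # location is not needed. A record matches only if every criterion
    # equals the corresponding metadata value.
    return [
        record for record in archive
        if all(record.metadata.get(key) == value for key, value in criteria.items())
    ]


# Assumed sample data, invented for this sketch.
backup = {"/srv/fileserver01/contracts/acme.txt": "Acme contract text"}
archive = [
    ArchiveRecord("Acme contract text",
                  {"customer": "Acme", "doc_type": "contract", "year": 2019}),
    ArchiveRecord("Globex invoice text",
                  {"customer": "Globex", "doc_type": "invoice", "year": 2020}),
]

restored = restore_from_backup(backup, "/srv/fileserver01/contracts/acme.txt")
matches = find_in_archive(archive, customer="Acme", doc_type="contract")
```

The restore call works only because the caller supplies the exact source path, while the archive query finds the same document from its description alone.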

You most likely have a backup system, but you just as likely do not have an archive system. Most people therefore have no idea what an actual archive system does or why you might want one. Let’s explore that topic a bit.

A Deeper Dive into Archive

Another way I like to define archive is the primary copy of your data that has secondary value (i.e., is no longer primary data). Typically, archive data is no longer current, infrequently accessed, or simply no longer valued (as much) by the users who created it. Not every piece of data needs to be kept in a prominent ...
