Chapter 13. Serialized Data

Next to image data, serialized content is the second most common data format you’ll be sending around in your networked applications. And even though the lowest-hanging data compression fruit will clearly come from image data, it’s equally important to take a hard look at serialized content.

What do we mean by “serialized”? Serialization is the process of taking a high-level data object and converting it to a binary string; the inverse is deserialization. The term can be applied to a plethora of different data types, but it most accurately describes converting an in-memory structure or class into a file or an in-memory binary large object (BLOB) that can be sent over a network.
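As a minimal sketch of that round trip, the Python below turns an in-memory object into a byte string and back. The `StatusUpdate` class and the `serialize`/`deserialize` helpers are hypothetical names chosen for illustration, and JSON encoded as UTF-8 is just one convenient format among many.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class StatusUpdate:
    """Hypothetical in-memory object; not taken from any real app."""
    user_id: int
    text: str
    likes: int


def serialize(update: StatusUpdate) -> bytes:
    # Object -> dict -> JSON text -> bytes that can go to a file or socket.
    return json.dumps(asdict(update)).encode("utf-8")


def deserialize(blob: bytes) -> StatusUpdate:
    # Bytes -> JSON text -> dict -> object (the inverse transform).
    return StatusUpdate(**json.loads(blob.decode("utf-8")))


original = StatusUpdate(user_id=7, text="Hello!", likes=0)
blob = serialize(original)
restored = deserialize(blob)
```

Here `blob` is the BLOB that would travel over the network, and `restored` compares equal to `original` once it has been deserialized on the other side.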

This particular use case dominates the mountain of data transfers we see from modern mobile and web applications. Consider your favorite social media app. When you load it for the first time, a flurry of serialized data is passed between the client and the server in order to show you the right information on the screen. And this continues as you receive updates, news, and messages. When you post your own status update, this input has to go into memory, be serialized to a format, and be uploaded to the server, which deserializes it, adds it to its database, and then serializes it again in order to send updates to all of your friends.
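The posting flow described above can be sketched in a few lines, with no real network involved. Everything here is an assumption made for illustration: the function names `client_post` and `server_handle`, the use of JSON, a plain list standing in for the database, and the three friend names.

```python
import json


def client_post(text, user_id):
    # Client side: the in-memory input is serialized for upload.
    return json.dumps({"user_id": user_id, "text": text}).encode("utf-8")


def server_handle(payload, database, friends):
    # Server side: deserialize the upload and store it.
    update = json.loads(payload.decode("utf-8"))
    database.append(update)
    # Serialize again so the update can be sent to every friend.
    outgoing = json.dumps(update).encode("utf-8")
    return {friend: outgoing for friend in friends}


database = []  # stand-in for a real database
upload = client_post("Hello!", 7)
fanout = server_handle(upload, database, ["ana", "bo", "cy"])

# One upload plus one copy per friend crosses the network.
total_bytes = len(upload) + sum(len(p) for p in fanout.values())
```

Even in this toy version, a single post produces one upload and one outgoing payload per friend, so the total bytes moved grow with the size of the friend list.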

Although images take up the bulk of your data footprint by size, serialized content makes up for it in the sheer number of transfers.

This means that performance ...
