Introduction

Imagine you had a time machine that could go back one minute, or an hour. Think about what you could do with it. From the perspective of other people, it would seem like there was nothing you couldn’t do, no contest you couldn’t win.

In the real world, there are three basic ways to win. One way is to have something, or to know something, that your competition does not. Nice work if you can get it. The second way to win is to simply be more intelligent. However, the number of people who think they are smarter is much larger than the number of people who actually are smarter.

The third way is to process information faster so you can make and act on decisions faster. Being able to make more decisions in less time gives you an advantage in both information and intelligence. It allows you to try many ideas, correct the bad ones, and react to changes before your competition. If your opponent cannot react as fast as you can, it does not matter what they have, what they know, or how smart they are. Taken to extremes, it’s almost like having a time machine.

An example of the third way can be found in high-frequency stock trading. Every trading desk has access to a large pool of highly intelligent people, and pays them well. All of the players have access to the same information at the same time, at least in theory. Being more or less equally smart and informed, the most active area of competition is the end-to-end speed of their decision loops. In recent years, traders have gone to the trouble of building their own wireless long-haul networks, to exploit the fact that microwaves move through the air 50% faster than light can pulse through fiber optics. This allows them to execute trades a crucial millisecond faster.

Finding ways to shorten end-to-end information latency is also a constant theme at leading tech companies. They are forever working to reduce the delay between something happening out there in the world or in their huge clusters of computers, and when it shows up on a graph. At Facebook in the early 2010s, it was normal to wait hours after pushing new code to discover whether everything was working efficiently. The full report came in the next day. After building their own distributed in-memory database and event pipeline, their information loop is now on the order of 30 seconds, and they push at least two full builds per day. Instead of slowing down as they got bigger, Facebook doubled down on making more decisions faster.

What is your system’s end-to-end latency? How long is your decision loop, compared to the competition? Imagine you had a system that was twice as fast. What could you do with it? This might be the most important question for your business.

In this book we’ll explore new models of quickly processing information end to end that are enabled by long-term hardware trends, learnings from some of the largest and most successful tech companies, and surprisingly powerful ideas that have survived the test of time.

Get Building Real-Time Data Pipelines now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.