Chapter 1. Introduction
In the last couple of years, eBPF has gone from relative obscurity to one of the hottest technology areas in modern infrastructure computing. Personally, I’ve been excited about the possibilities that eBPF enables ever since seeing Thomas Graf speak about it in a Black Belt session at DockerCon 17.¹ At the Cloud Native Computing Foundation (CNCF), my colleagues on the Technical Oversight Committee put eBPF forward as one of the areas to watch in our predictions of the technologies that would take off in 2021. Over 2,500 people signed up for that year’s eBPF Summit virtual conference, and several of the world’s most advanced software engineering companies came together to create the eBPF Foundation. Clearly, there is a lot of interest in this technology.
In this short report, I hope to give you some insight into why people are so excited about eBPF and the capabilities it offers for tooling in modern compute environments. You’ll get a mental model for what eBPF is and why it’s so powerful. There are some code examples to help make it more concrete (but you can skip over these if you prefer). You’ll get an understanding of what’s involved when building eBPF-enabled tools, and why eBPF has become so seemingly ubiquitous in such a short period of time.
Inevitably, in this short report there isn’t room to go into all the details, but I’ll leave you with some pointers for more information if you want to dive in more deeply.
Extended Berkeley Packet Filter
Let’s get the acronym out of the way: eBPF stands for Extended Berkeley Packet Filter. From that name, you can see that its roots lie in filtering network packets, and the original paper² was written at the Berkeley Lab (Lawrence Berkeley National Laboratory). But (in my opinion) the name is not terribly helpful for conveying the true power of eBPF, as the “extended” versions enable so much more than packet filtering. These days, eBPF is used as a standalone name that encompasses more than its acronym suggests.
So, if it’s not just about packet filtering, what is eBPF? eBPF is a framework that allows users to load and run custom programs within the kernel of the operating system. That means it can extend or even modify the way the kernel behaves.
As an eBPF program is loaded into the kernel, a verifier ensures that it is safe to run, and rejects it if not. Once loaded, an eBPF program needs to be attached to an event, so that whenever the event happens, the program is triggered.
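To make that load, verify, attach, and trigger sequence a little more concrete, here is a minimal sketch. It assumes a Linux machine with the BCC Python library installed and root privileges; the names PROGRAM, hello, and main are my own choices for illustration, and attaching to the execve() system call is just one convenient example of an event.

```python
# The eBPF program itself, written in C. BCC compiles it to eBPF bytecode.
PROGRAM = r"""
int hello(void *ctx) {
    bpf_trace_printk("Hello from eBPF!");
    return 0;
}
"""


def main():
    # Imported here so this file can be read or imported without BCC installed.
    from bcc import BPF

    # Compile the C source above.
    b = BPF(text=PROGRAM)

    # Find the kernel function that implements execve() on this system.
    event = b.get_syscall_fnname("execve")

    # Load the program into the kernel (where the verifier checks it) and
    # attach it to a kprobe, so it is triggered each time execve() is entered.
    b.attach_kprobe(event=event, fn_name="hello")

    # Print the kernel trace output until interrupted with Ctrl-C.
    b.trace_print()
```

Calling main() as root and then launching any program from another terminal should produce a trace line for each new executable that is started. If the verifier considered the program unsafe, loading would fail before anything was attached.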
eBPF was originally developed for Linux, and that is the operating system I’ll focus on in this report; but it is notable that, as of this writing, Microsoft is developing an eBPF implementation for Windows.
Because the Linux kernels in widespread use all support the “extended” parts, the terms eBPF and BPF are now largely used interchangeably.
eBPF-Based Tools
As you’ll see in this report, the ability to dynamically change the behavior of the kernel is tremendously useful. Traditionally, if we want to observe how our applications are behaving, we add code into those apps to generate logs and traces. eBPF allows us to collect customized information about how an app is behaving without having to change the app in any way, by observing it from within the kernel. We can build on this observability to create eBPF security tools that detect or even prevent malicious activity from within the kernel. And we can create powerful, high-performance networking capabilities with eBPF, handling network packets within the kernel and avoiding costly transitions to and from user space.
The concept of observing applications from the kernel’s perspective isn’t entirely new. It builds on older Linux features, such as perf,³ which also collects behavior and performance information from within the kernel without having to modify the applications being measured. But these tools fix the scope of the data that can be collected and the formats in which that data is made available. With eBPF, we have far more flexibility because we can write entirely custom programs, allowing us to build a wide range of tools for different purposes.
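As an illustration of what “entirely custom” can mean, the sketch below counts how many times each user ID starts a new program, without touching any application. It is an assumption-laden example rather than a finished tool: it needs Linux, root privileges, and the BCC Python library, and the names exec_counts, count_exec, format_counts, and main are hypothetical ones I picked for this example.

```python
import time

# eBPF program in C: keeps a per-user-ID counter in a hash map that is
# shared between the kernel side and the user-space script below.
PROGRAM = r"""
BPF_HASH(exec_counts);

int count_exec(void *ctx) {
    u64 uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;
    u64 count = 0;
    u64 *existing = exec_counts.lookup(&uid);
    if (existing != 0) {
        count = *existing;
    }
    count++;
    exec_counts.update(&uid, &count);
    return 0;
}
"""


def format_counts(counts):
    """Render a {uid: count} dict as one line, sorted by user ID."""
    return "  ".join(f"uid {uid}: {n}" for uid, n in sorted(counts.items()))


def main(interval_seconds=2):
    # Imported here so this file can be read or imported without BCC installed.
    from bcc import BPF

    b = BPF(text=PROGRAM)
    # Run count_exec every time the execve() system call is entered.
    b.attach_kprobe(event=b.get_syscall_fnname("execve"), fn_name="count_exec")

    # Periodically read the map from user space and print a summary line.
    while True:
        time.sleep(interval_seconds)
        counts = {k.value: v.value for k, v in b["exec_counts"].items()}
        print(format_counts(counts))
```

The choice of what to count, how to key it, and how to present it is entirely up to the author of the program, which is the flexibility that fixed-scope tools don’t offer.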
eBPF programming is incredibly powerful, but it’s also complex. For most of us, the utility of eBPF is going to come not from writing programs ourselves but from using tools created by others. An increasing number of projects and vendors are building on the eBPF platform to create a new generation of tooling, covering observability, security, networking, and more.
I’ll discuss some more of these higher-level tools later in this report, but if you’re comfortable on the Linux command line and can’t wait to see eBPF in action, a great place to start is the BCC project. It includes a huge collection of tracing tools; even just glancing at the list should give you some idea of the vast scope of operations we can instrument with eBPF, including file operations, memory usage, CPU stats, and even observing any bash command entered anywhere in the system.
In the next chapter, we’ll look at why changing the kernel’s behavior is useful, and why eBPF makes it vastly easier to do this than writing kernel code directly.
1 Thomas Graf, “Cilium: Network and Application Security with BPF and XDP” (DockerCon 17, April 17–20, 2017).
2 Steven McCanne and Van Jacobson, “The BSD Packet Filter: A New Architecture for User-Level Packet Capture” (working paper, Lawrence Berkeley National Laboratory, Berkeley, December 19, 1992).
3 perf is a Linux subsystem for collecting performance data.