Chapter 5. Data Enrichment

As you’ve learned, Falco’s architecture allows you to capture events from different data sources. This process delivers raw data, which can be very rich but isn’t very useful for runtime security unless it is paired with the right context. That’s why Falco first extracts the raw data and then enriches it with contextual information, so that rule authors can use it comfortably. We typically refer to this information as the event metadata. Getting metadata can be a complex task, and getting it efficiently is even more complex.

You’ve already seen that the system-state collection capabilities in libscap and the state engine implemented by libsinsp (discussed in Chapter 3) are central to this activity, but there’s much more to discover. In this chapter, we’ll delve into the design of the Falco stack to help you better understand how data enrichment works. In particular, we will show you libsinsp’s efficient layered approach to obtaining system, container, and Kubernetes metadata for system call (syscall) events. This is what lets you access the information you need from different contexts (depending on your use case), such as a container’s ID or the name of the Pod where a suspicious event occurred. Finally, we’ll show you how plugins, Falco’s other main data source, can implement their own data enrichment mechanisms, which opens up many further possibilities.

Understanding Data Enrichment for Syscalls

Understanding how data ...
