Chapter 4. Open Source Telemetry Standards: Prometheus, OpenTelemetry, and Beyond

In the previous chapters, we discussed how observability data has been growing in scale while delivering diminishing business outcomes. We delved into real-life examples of how well-known companies have solved observability data issues. Finally, we introduced a new framework for reliably solving those issues, distilling the core principles we have gathered from our experience with those real-world use cases.

In this chapter, we will explore implementations using open source software and how open source instrumentation has increasingly become the de facto standard for monitoring and observability in cloud native environments. We’ll trace the evolution of this trend, highlighting the pivotal roles of Prometheus and OpenTelemetry (OTel) in shaping the landscape. These tools have simplified the collection and analysis of vast amounts of telemetry data and established new benchmarks for flexibility, scalability, and community-driven development in observability.

Instrumentation Before Prometheus and OTel

Before the industry standardized on Prometheus and OTel, many companies were forced to use proprietary collection solutions, such as AppDynamics, Dynatrace, or New Relic. These vendors control the instrumentation and aggregation of telemetry using agents, which are software processes that run alongside an application to collect data and then send it to an external server.

If you are running an AppDynamics observability setup, you have no choice but to use the AppDynamics agent to send telemetry to its system, as shown in Figure 4-1. This is called agent-based application instrumentation. To run these agents, teams typically need to install a software library or software development kit (SDK) in the application, which then sends the data back to the aggregation server.

Figure 4-1. Applications instrumented using agents

In addition to the agents themselves, the data formats the agents use are also proprietary. This means that, for all intents and purposes, your data is locked into the vendor: you cannot easily migrate from one vendor to another without losing all the dashboards and alerts built on the original vendor’s platform, making migrations labor-intensive and wasteful.

Agents are also largely noninteroperable. This means that if you rely on AppDynamics, you cannot use the same agents to send those same metrics to New Relic’s system.

Agent-based systems consume the same system resources as the application and in some cases can slow down or even crash applications. Worse, site reliability engineering (SRE) teams can’t observe when agents cause performance issues, because they rely on those same agents to send the data back to the aggregation servers.

Data Collection Is Controlled by Users

In 2012, as many organizations were making the switch to microservices architectures, SoundCloud ran into a set of challenges while scaling its existing monitoring system. To solve these challenges, SoundCloud created Prometheus: a way to instrument once and output everywhere.

In August 2018, Prometheus graduated as an official Cloud Native Computing Foundation (CNCF) project.1 An open source ecosystem grew around Prometheus, driven largely by Kubernetes and its increasing ubiquity in the cloud native space.

Because of Prometheus, most organizations running in cloud native architectures today no longer have to deal with a myriad of tools and agents to instrument their applications. Effectively, this moved the data collection from being controlled by vendors to being controlled by cloud native observability practitioners.

Prometheus

Prometheus is inspired by Borgmon, the monitoring system Google built for its Borg cluster manager. Instead of using a sink that pushes data to an aggregator system, Prometheus instrumentation exposes a metrics endpoint (usually an HTTP endpoint at /metrics), and the Prometheus server scrapes that endpoint. While most other systems are push-based, pushing data out toward an aggregator, Prometheus is pull-based. This represents a major innovation: push-based agents must wait for the aggregation server to accept each send, which can introduce delays and degrade application performance.
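
To make the pull model concrete, here is a minimal sketch using the official Python client library, prometheus_client. The metric name, port, and simulated workload are illustrative; the application only exposes an endpoint, and a separately configured Prometheus server decides when to scrape it.

```python
from prometheus_client import Counter, start_http_server
import random
import time

# A counter the application increments as it does its work.
REQUESTS_SERVED = Counter(
    "myapp_requests_served_total",
    "Total number of requests served by the application.",
)

if __name__ == "__main__":
    # Expose the /metrics endpoint on port 8000. The Prometheus server is
    # configured separately to scrape this endpoint on its own schedule.
    start_http_server(8000)
    while True:
        REQUESTS_SERVED.inc()  # instrument the "work" as it happens
        time.sleep(random.uniform(0.1, 0.5))
```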

Interoperability Between Different Observability Tools

In a pull-based system, the application simply exposes its data; collectors “listen in” by scraping that data, without affecting, or even notifying, the system producing it. This eliminates the need for agents, and for most applications the impact on performance is negligible. Figure 4-2 shows agentless metrics in Prometheus.

The shift to pull-based metrics collection has allowed SRE teams to better control the metrics they collect. Further, pull-based collection allowed interoperability between different observability tools.

Figure 4-2. Prometheus’s exposition format, supported by vendors

Standardization to Prometheus

The caveat is that for a pull-based system to be effective, it needs a standard data format to eliminate the need for conversion. Similar to Borgmon, Prometheus defined its own exposition format and then provided client libraries that make it simple to expose metrics in that format.

Since Prometheus shifted the responsibility for emitting a standard data format to the clients, anything that supports that format can consume the data. A cottage industry grew up around it: nearly every piece of software that emits telemetry now supports Prometheus’s exposition format, resulting in massive adoption and standardization around Prometheus.

In essence, you instrument once, and you can output to anything that supports the Prometheus exposition format!
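
To illustrate that anything speaking the format can consume the data, the following hedged sketch scrapes the /metrics endpoint from the previous example and parses it with the parser bundled in prometheus_client; the endpoint URL and the use of the requests library are assumptions for the example.

```python
import requests
from prometheus_client.parser import text_string_to_metric_families

# Pull the current exposition-format text, exactly as the Prometheus server would.
body = requests.get("http://localhost:8000/metrics").text

for family in text_string_to_metric_families(body):
    for sample in family.samples:
        # Each sample carries a metric name, a label set, and a value.
        print(sample.name, sample.labels, sample.value)
```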

Prometheus Reliability

With Prometheus, metrics instrumentation is part of the application rather than a separate process, as shown in Figure 4-3. This contributes to greater reliability.

Figure 4-3. Push-based agent instrumentation versus pull-based agentless instrumentation

Another contributor to Prometheus’s reliability is the nature of the pull model. The pull model is inherently resilient: if the collector is temporarily down, the metrics simply wait to be pulled on the next scrape, whereas a push-based agent fails outright (and can drop data) when its aggregation server is unreachable.

Prometheus thus solved the two big problems: reliability and collection scalability. It has since been so widely adopted that most open source tools in the cloud native ecosystem support Prometheus metrics exposition.

Its pervasiveness became especially evident when the cloud native ecosystem started to build tools and standards on top of Prometheus.2 This means that any tool sets or vendors that are compatible with Prometheus are effectively interoperable with each other.

Prometheus tools and standards give SRE teams greater control over their metrics instrumentation. What impact has this had on the cloud native ecosystem?

Prometheus: The Good

For good or ill, the industry has adopted Prometheus, and it has become the de facto standard for metrics in cloud native observability. After Kubernetes, it was the second project to attain “graduated” status from the CNCF, which requires meeting stringent criteria.

Prometheus itself has many advantages, the foremost of which are:

Dimensional metric data model

Prometheus uses a dimensional metric data model that allows flexible labeling of metric data, and you can query across those dimensions using the PromQL language (see the sketch after this list). This contrasts with StatsD, which employs a simpler model focused on counters and timers, without inherent support for dimensional data or a specialized query language.

Service discovery

Prometheus can use service discovery native to the system Prometheus is monitoring. For example, Prometheus can self-discover pod endpoints using Kubernetes’s own service discovery APIs.

Deep integration between PromQL and alerting

Prometheus ships with Alertmanager, a companion subsystem that routes notifications to paging and messaging systems like PagerDuty and Slack. Alerting rules are written in PromQL, so the same language you use to query metrics also defines your alerts and thresholds.

Mature specification

Prometheus has reached a level of maturity that makes it a stable and reliable solution for many organizations.
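
As a concrete illustration of the dimensional data model and PromQL integration described above, here is a minimal sketch using the Python client; the metric name, labels, and PromQL expression are illustrative.

```python
from prometheus_client import Counter

# One metric, two label dimensions: every (method, status) pair becomes its own time series.
HTTP_REQUESTS = Counter(
    "myapp_http_requests_total",
    "Total HTTP requests handled.",
    ["method", "status"],
)

HTTP_REQUESTS.labels(method="GET", status="200").inc()
HTTP_REQUESTS.labels(method="POST", status="500").inc()

# Example PromQL over those dimensions (run on the Prometheus server, not in Python):
#   sum by (status) (rate(myapp_http_requests_total[5m]))
# The same kind of expression can back an alerting rule, for example firing
# when the rate of status="500" responses crosses a threshold.
```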

Prometheus: The Not-So-Good

However, as with all good systems, Prometheus has disadvantages as well:

Generic use case

The use case for Prometheus is too generic: it isn’t built for any one type of application, so you have to configure it for your specific system, including creating metadata labels for each metric type. The relabeling configuration becomes complex as you collect more metrics.

Annotation leads to complexity

The more dimensions your metrics have, the more complicated the Prometheus scraping configuration gets, because you have to coordinate collection between multiple instances. You can mitigate this by using tools like PromLens and by annotating metrics only when necessary.

Hard to operate

Prometheus is hard to operate. It runs as a single binary, which makes it easy to stand up but harder to keep running when unexpected errors occur. Running Prometheus in production means constant tweaking and fine-tuning to keep it healthy. You end up spending time on Prometheus that you could (and should!) be spending on your core business applications instead.

Horizontal scalability

The biggest disadvantage of Prometheus is that its server scales only vertically.

In general, there are two types of scaling: horizontal and vertical. Horizontal scaling, also called scaling out, adds more servers, while vertical scaling adds resources to a single server. Most distributed systems scale horizontally because it is faster and more cost-effective.

Prometheus, by default, lacks horizontal scaling capabilities, leading to reliance on vertical scaling for large deployments. This necessitates powerful servers with extensive CPU and memory resources. It also poses a significant challenge: it creates a single point of failure, as an outage in the server’s region can disrupt the entire system, which is particularly problematic in cloud native environments where reliability is paramount. Additionally, managing such a setup is complex and resists automation, making it more akin to treating the server as a “pet” rather than “cattle,” per the popular cloud analogy. Finally, the scalability of Prometheus is inherently limited; even in the cloud, with its vast array of compute resources, there is a ceiling to how far a single server can be scaled vertically.

That said, there are ways to scale Prometheus servers horizontally. Projects such as Thanos, Cortex, and Mimir aim to add horizontal scalability to Prometheus. However, once you reach the point where you need to scale Prometheus horizontally, we suggest you look into fully managed options. The complexity of running horizontally scaling Prometheus usually outweighs the benefits of maintaining these systems, with very few exceptions.

OpenTelemetry

As more organizations and practitioners standardized on Prometheus for their metrics, another question arose: what about logs and traces? This led to the challenge of dealing with fragmented tooling for generating telemetry across logs, metrics, and traces.

Many tools and solutions were crafted to solve this challenge; the most well-known ones were OpenCensus and OpenTracing. OpenTracing focused on telemetry for tracing, while OpenCensus focused on telemetry for both metrics and tracing.

By 2019, a committee had formed to combine the efforts of OpenCensus and OpenTracing into a standardized, unified set of tools, which was dubbed OpenTelemetry.

What Is OTel?

OTel is an observability framework and toolkit designed to create and manage telemetry data such as traces, metrics, and logs. Crucially, OTel is vendor- and tool-agnostic, meaning that it can be used with a wide variety of observability backends.

OTel generates, collects, processes, and exports telemetry. However, OTel is not a backend system for logs, metrics, or traces; you still need a backend to receive the telemetry OTel generates for further analysis or safekeeping.

OTel is not one system like Prometheus; it’s an umbrella project that combines the effort of building multiple subsystems to generate high-quality, ubiquitous, and portable telemetry to enable effective observability.

The OTel Specification

Unlike Prometheus, which inadvertently built a standard, OTel is deliberately building a standard, the OTel specification, that can be used for any implementation.

OTel SDK

The OTel SDKs, also known as client libraries, allow us to create telemetry; you install the SDK that matches the programming language your application is written in. The client libraries can generate telemetry either automatically or manually.

Many libraries have built-in automatic instrumentation. For example, HTTP metrics, gRPC tracing, and even Express.js metrics are generated automatically once you install these libraries and set them up in your JavaScript application. However, there are edge cases: not every library generates metrics automatically, and some need to be configured explicitly.

Manual instrumentation uses primitives exposed by the client libraries to generate specific signals about your application or to add contextual metadata to the metrics, spans, or logs being emitted.
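
As a concrete example of manual instrumentation, here is a minimal sketch using the OTel Python SDK, whose tracing support is stable as of this writing. The service, span, and attribute names are illustrative, and the ConsoleSpanExporter simply prints spans so you can inspect them locally.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Wire up a tracer provider that prints finished spans to the console.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("checkout-service")

# Manually create a span and attach contextual metadata to it.
with tracer.start_as_current_span("charge-card") as span:
    span.set_attribute("payment.provider", "example-pay")
    span.set_attribute("order.id", "12345")
```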

OpenTelemetry Collector

The OpenTelemetry Collector functions as an intermediary for telemetry data and is built from three core components: receivers, which ingest data and translate incoming telemetry into the collector’s internal representation (accepting, among other formats, the OpenTelemetry Protocol [OTLP] over HTTP and over gRPC); processors, which handle filtering, batching, and transforming the data; and exporters, which transmit the processed telemetry data to various backends.

There are also multiple vendor exporters that you can use depending on where you want your telemetry data to end up. Additionally, there is a growing list of exporter, collector, receiver, and client instrumentation libraries in the OpenTelemetry Registry.
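
As a hedged sketch of how an application hands telemetry to the collector, the following Python snippet exports traces over OTLP/gRPC. It assumes a collector is listening on the default gRPC port 4317 and that the Python OTLP exporter package is installed; the endpoint and service name are illustrative.

```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Batch spans and ship them to a local collector over OTLP/gRPC (default port 4317).
provider = TracerProvider()
exporter = OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

with trace.get_tracer("my-service").start_as_current_span("do-work"):
    pass  # the collector receives, processes, and forwards this span
```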

OTel: The Promise

OTel can be used for instrumenting logs, metrics, and traces to emit telemetry via a standard format. It promises a single unified standard for observability, simplifying the telemetry process, and supports multiple vendors and open source software (OSS) with no vendor lock-in. Further, it allows extensibility. Developers can build upon the specifications to extend OTel to fit their specific needs.

The willingness of popular vendors, libraries, and languages to support OTel means it’s easier for developers to emit telemetry in OTel format.

Another promise of OTel is the ability to correlate signals from multiple sources, like logs to metrics correlation, metrics to traces, and even logs and metrics to traces, using a standard specification. Imagine jumping into one correlation ID for a failed HTTP request and finding all the downstream logs, metrics, and even traces!
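
As a small, hedged illustration of what such correlation can look like at the code level, the following Python sketch reads the current span’s trace ID and attaches it to a log line so the log can later be joined to the trace; it assumes a tracer provider has already been configured, and the logger setup is illustrative.

```python
import logging

from opentelemetry import trace

logging.basicConfig(format="%(levelname)s trace_id=%(trace_id)s %(message)s")
logger = logging.getLogger("checkout")

tracer = trace.get_tracer("checkout-service")
with tracer.start_as_current_span("charge-card"):
    ctx = trace.get_current_span().get_span_context()
    # Render the 128-bit trace ID as the 32-character hex string trace UIs display.
    logger.error("card charge failed", extra={"trace_id": format(ctx.trace_id, "032x")})
```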

OTel: The Reality

The learning curve to fully understand all the components of OTel and effectively use it in production can be steep, especially for practitioners who are used to working with proprietary observability systems.

Another important difference to note is that, unlike Prometheus, which uses a pull system, OTel uses a push system with a collector.

Limitations of maturity

Being an umbrella project, OTel has multiple levels of maturity depending on which programming language you are using and what types of signal you want to emit.

For example, as of this writing in November 2023, the Python trace and metric client libraries are stable, but logs are experimental. Golang traces are stable, but metrics are mixed, and logs have not yet been implemented. For the full maturity matrix, visit the OpenTelemetry status page.

The current limitations of maturity mean that fully adopting OTel across telemetry types will be an ongoing project until support for all the languages and frameworks your organization uses is stable.

Backend support

OTel is vendor neutral; however, different backends offer varying levels of support for it, and some may not take full advantage of OTel’s capabilities.

Where to Start with OTel

Because of the increased complexity, we suggest those interested in adopting OTel begin by just running the collector. Simply running the collector will give you a good feel for how the rest of the OTel ecosystem works.

Having the collector in place will allow you to start using telemetry already emitted by tools you are familiar with. For example, if you are running NGINX Ingress Controller, you can follow the Kubernetes NGINX Ingress OpenTelemetry guide to start sending telemetry to your collector.

Once you have a collector running, you will want to try your hand at configuring auto-instrumentation in your applications and sending the results through the collector, to see what telemetry you can get from your system out of the box (a sketch follows below). Additionally, we suggest you try Chronosphere’s walkthrough of OTel using JavaScript and automatic instrumentation, or view a practical demonstration of OTel in action.
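
For a taste of what automatic instrumentation provides, here is a hedged sketch using the Flask instrumentation package for Python (it assumes opentelemetry-instrumentation-flask is installed and a tracer provider is configured); once the app is instrumented, every incoming request produces a span without any further application code.

```python
from flask import Flask
from opentelemetry.instrumentation.flask import FlaskInstrumentor

app = Flask(__name__)
FlaskInstrumentor().instrument_app(app)  # every incoming request now produces a span

@app.route("/hello")
def hello():
    return "hello"

if __name__ == "__main__":
    app.run(port=5000)
```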

Implications of OTel’s Approach

For bigger organizations with a greater need for flexibility in their telemetry systems, OTel is a better way than proprietary or vendor-specific collectors to handle telemetry. OTel provides a standard, flexible, and interoperable way to generate telemetry.

As an industry-wide standard, OTel makes practitioners’ skills more portable when they move between organizations or between divisions of a larger organization, and vendor lock-in becomes less of a concern for practitioners.

The ability to correlate data using OTel is perhaps its greatest advantage. There is no other system in the cloud native observability space that has that potential out of the box.

But as with any new standard, there is an adoption curve. The trick is to understand when OTel is stable enough for your organization and when the complexity of adoption is low enough to justify the move.

As more and more systems adopt OTel, it will become an indispensable project that allows all practitioners to better organize and standardize telemetry systems.

A key weakness in OTel adoption is perhaps its uneven support for logs, a gap that Fluent Bit can fill.

Fluent Bit

Fluent Bit is a vendor-neutral, open source solution that enables organizations to connect any data source to any destination. Organizations leverage Fluent Bit to create observability pipelines that can collect, process, and route data. It has a fully pluggable architecture that allows users to connect telemetry sources with various other destinations and perform many different types of processing (such as filtering, parsing, etc.) on the data while in flight.

Fluent Bit began as an outgrowth of the Fluentd project, which was created by Sadayuki Furuhashi in 2011 as an open source data collector that lets users unify log data collection and consumption. Fluent Bit was created in 2014 as a more lightweight, performant version for resource-constrained environments.

With over 12 billion downloads, the Fluent Bit project is one of the most widely adopted solutions to address logging challenges in cloud native environments. It includes support for OTel and Prometheus as both an input and an output, supports connectors that allow it to integrate with hundreds of other systems, and allows extensibility via plugins written in WebAssembly and Golang. The synergy between the broad telemetry capabilities of projects like OTel and Fluent Bit’s specialized log-processing abilities allows for a solution that works for any scale of organization—from a lightweight system for transforming log entries to structured metric data to large-scale processing of logs via backends like Kafka or OpenSearch. Fluent Bit is available as a default logging option in the environments of most cloud service providers.

1 “Prometheus Graduates Within CNCF,” Cloud Native Computing Foundation, August 9, 2018, https://oreil.ly/qt-vt.

2 Anne McCrory, “Ubiquitous? Pervasive? Sorry, They Don’t Compute,” Computerworld, March 20, 2000, https://oreil.ly/juHHV.
