Chapter 11. Ensuring Reliability with Linkerd

As discussed back in Chapter 1, microservices applications rely on the network for all of their communications. Networks are slower and less reliable than in-process communication, which introduces new failure modes and presents new challenges for our applications.

Because a service mesh mediates all of your application traffic, it can make intelligent choices about what to do when things go wrong. In this chapter, we’ll talk about the mechanisms that Linkerd provides to mitigate the unreliability of the network, helping to address the inherent instability of microservices applications.

Load Balancing

Load balancing might seem like an odd reliability feature to lead with, since many people think that Kubernetes already handles it. As we first discussed in Chapter 5, Kubernetes Services make a distinction between the IP address of the Service and the IP addresses of the Pods associated with the Service. When traffic is sent to the ClusterIP, it ends up being redirected to one of the endpoint IPs.

However, the load balancing built into Kubernetes operates on entire connections: an endpoint is chosen once, when the connection is opened, and every request sent over that connection goes to the same endpoint. Linkerd improves on this by using the proxy, which understands the protocol in use on the connection, to choose an endpoint for each individual request, as shown in Figure 11-1.
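To make the difference concrete, here is a minimal sketch in Python that compares the two approaches. It is a toy simulation, not Linkerd or Kubernetes code: the endpoint IPs, the function names, and the simple round-robin choice are all assumptions made for illustration, and Linkerd's real endpoint selection is more sophisticated than round robin.

```python
from collections import Counter
from itertools import cycle

# Hypothetical Pod IPs behind a single Service (illustrative only).
ENDPOINTS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def balance_per_connection(request_counts):
    """Choose an endpoint once per connection.

    request_counts holds the number of requests sent over each
    connection. Every request on a connection lands on the endpoint
    that was picked when the connection was opened.
    """
    picker = cycle(ENDPOINTS)
    hits = Counter()
    for count in request_counts:
        endpoint = next(picker)  # one choice for the whole connection
        hits[endpoint] += count
    return hits


def balance_per_request(request_counts):
    """Choose an endpoint for every individual request."""
    picker = cycle(ENDPOINTS)
    hits = Counter()
    for count in request_counts:
        for _ in range(count):
            hits[next(picker)] += 1  # a fresh choice per request
    return hits


if __name__ == "__main__":
    # One busy long-lived connection and two quiet ones.
    workload = [1000, 10, 10]
    print("per connection:", dict(balance_per_connection(workload)))
    print("per request:   ", dict(balance_per_request(workload)))
```

With this workload, per-connection balancing sends 1,000 requests to the first endpoint and 10 to each of the others, while per-request balancing sends 340 requests to each endpoint. A single long-lived connection is enough to skew the load when the balancer can only see connections.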

Figure 11-1. Service discovery in Linkerd

As you can see from Figure 11-1 ...
