Chapter 15. Debugging Linkerd

If you’ve come this far, you understand how valuable Linkerd can be as part of a platform. It secures your apps, provides powerful insights into those applications, and can compensate for underlying application and network issues by making your connections more reliable. In this chapter, we’re going to look at what to do when you need to troubleshoot Linkerd itself.

In all cases, the first step when diagnosing an issue with Linkerd is to run linkerd check, the Linkerd CLI’s built-in health checking tool, which we discussed in more detail in Chapter 6. linkerd check is a very quick way to pinpoint many of the most common issues with Linkerd installations; for example, it can immediately diagnose expired certificates, the most common cause of Linkerd outages in practice.
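If you want to fold that first step into a script, one option is to ask linkerd check for JSON output and list only the checks that did not pass. The following is a minimal sketch, not a definitive implementation. The function names failing_checks and run_linkerd_check are hypothetical, and the JSON field names used here (categories, categoryName, checks, description, result) are an assumption about the output format that you should verify against the Linkerd version you run.

```python
import json
import subprocess


def failing_checks(report: dict) -> list[str]:
    """Return one line per check whose result is not "success".

    ASSUMPTION: the report is shaped like
    {"categories": [{"categoryName": str,
                     "checks": [{"description": str, "result": str}]}]}.
    Confirm this against your Linkerd version's actual JSON output.
    """
    failures = []
    for category in report.get("categories", []):
        name = category.get("categoryName", "?")
        for check in category.get("checks", []):
            result = check.get("result")
            if result != "success":
                description = check.get("description", "?")
                failures.append(f"{name}: {description} [{result}]")
    return failures


def run_linkerd_check() -> dict:
    """Run `linkerd check --output json` and parse what it prints."""
    # The command may exit nonzero when a check fails, so we do not
    # treat a nonzero exit code as an exception here.
    proc = subprocess.run(
        ["linkerd", "check", "--output", "json"],
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)


if __name__ == "__main__":
    for line in failing_checks(run_linkerd_check()):
        print(line)
```

Keeping the parsing separate from the subprocess call means the filter can be exercised against a saved report without a cluster or the CLI installed.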

Diagnosing Data Plane Issues

Linkerd is somewhat famous for not requiring a ton of hands-on work with the data plane; however, it’s still useful to be able to do some basic troubleshooting. Many proxy issues end up involving fairly similar sets of solutions.

“Common” Linkerd Data Plane Failures

While Linkerd is generally a fault-tolerant service mesh, there are a few conditions that we see arise more than others. Knowing how to tackle these can be extremely helpful.

Pods failing to start

If you run into a situation where injected Pods are failing to start, the first step will be to identify exactly where the failure is occurring. This is where the ...
