Arguably, just dumping log data to disk is one solution, and it’s what most mobile applications do (using “debug logs”). But most failures can only be understood by correlating events from two nodes, which means searching through lots of debug logs by hand to find the entries that matter. It’s not a very clever approach.
We want to send log data somewhere central, either immediately or opportunistically (i.e., store and forward). For now, let’s focus on immediate logging. My first idea for sending the data is to use Zyre itself: just send log data to a group called “LOG” and hope someone collects it.
But using Zyre to log Zyre itself is a Catch-22. Who logs the logger? What if we want a verbose log of every message sent? Do we include logging messages in that, or not? It quickly gets messy. We want a logging protocol that’s independent of Zyre’s main ZRE protocol. The simplest approach is a PUB-SUB protocol, where all nodes publish log data on a PUB socket and a collector picks that up via a SUB socket (Figure 8-3).
Figure 8-3. Distributed log collection
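The shape of this protocol can be sketched before committing to a wire format. The Python below is a dependency-free stand-in, not Zyre's actual logging code and not a ZeroMQ program: it models each published message as a single frame of the form "LOG <node> <timestamp> <text>", and models the collector as a subscriber that keeps only frames matching a subscribed prefix, much as a SUB socket filters on its subscription prefix. The frame layout and the names make_frame, Collector, and demo are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


def make_frame(node: str, timestamp: float, text: str) -> bytes:
    """Build one log frame (hypothetical layout); 'LOG' is the subscription prefix."""
    return f"LOG {node} {timestamp:.3f} {text}".encode("utf-8")


@dataclass
class Collector:
    """Stand-in for a SUB socket plus the code that reads from it."""
    prefixes: list = field(default_factory=list)
    records: list = field(default_factory=list)

    def subscribe(self, prefix: bytes) -> None:
        self.prefixes.append(prefix)

    def deliver(self, frame: bytes) -> None:
        # Mimic prefix matching: ignore frames that match no subscription.
        if not any(frame.startswith(p) for p in self.prefixes):
            return
        # Node names are assumed to contain no spaces; the text may.
        _, node, stamp, text = frame.decode("utf-8").split(" ", 3)
        self.records.append((float(stamp), node, text))

    def timeline(self) -> list:
        # Merge records from all nodes into one time-ordered view.
        return sorted(self.records)


def demo() -> list:
    collector = Collector()
    collector.subscribe(b"LOG")
    # Two nodes publish independently, so arrival order is not time order.
    frames = [
        make_frame("node-b", 2.0, "HELLO received from node-a"),
        make_frame("node-a", 1.0, "HELLO sent to node-b"),
        b"TRACE node-a 1.500 frame with a prefix nobody subscribed to",
    ]
    for frame in frames:
        collector.deliver(frame)
    return collector.timeline()


if __name__ == "__main__":
    for stamp, node, text in demo():
        print(f"{stamp:8.3f} {node}: {text}")
```

The point of the sketch is the merged timeline: once log frames from both nodes land in one place, correlating what one node sent with what the other received is a sort, not a manual search. In a real deployment the deliver step would be replaced by reads from a SUB socket, and the timestamps would need to come from clocks that are comparable across nodes, which this sketch simply assumes.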
The collector can, of course, run on any node. This gives us a nice range of use cases:
A passive log collector that stores log data on disk for eventual statistical analysis. This would be a PC with sufficient hard disk space for weeks or months of log data.
A collector that stores log data into a database where it ...