Components of a data mining pipeline

There are likely many ways to build near real-time data mining for traces. In Canopy, the feature extraction functionality is built directly into the tracing backend, whereas in Jaeger, it can be done via post-processing add-ons, as we will do in this chapter's code exercise. The major components required are shown in Figure 12.1:

  • The tracing backend, or tracing infrastructure in general, collects tracing data from the microservices of the distributed application
  • The trace completion trigger makes a judgement call that all spans of a trace have been received and that the trace is ready for processing
  • The feature extractor performs the actual calculations on each trace
  • An optional Aggregator combines features from individual ...
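
To make the division of responsibilities concrete, the components listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Canopy or Jaeger implementation: the names `Span`, `CompletionTrigger`, `extract_features`, and `aggregate` are hypothetical, the trigger assumes a simple inactivity timeout as its "judgement call", and the extracted feature (span count per service) is just one assumed example of a per-trace calculation.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Span:
    # Hypothetical minimal span: real spans carry far more data.
    trace_id: str
    service: str


class CompletionTrigger:
    """Treats a trace as complete once no new span has arrived for
    `idle_seconds` (an assumed heuristic; it can misjudge slow traces)."""

    def __init__(self, idle_seconds: float) -> None:
        self.idle_seconds = idle_seconds
        self._spans: Dict[str, List[Span]] = defaultdict(list)
        self._last_seen: Dict[str, float] = {}

    def on_span(self, span: Span, now: float) -> None:
        # Buffer the span and remember when the trace was last active.
        self._spans[span.trace_id].append(span)
        self._last_seen[span.trace_id] = now

    def pop_completed(self, now: float) -> List[List[Span]]:
        # Collect and remove every trace that has been idle long enough.
        done = [trace_id for trace_id, seen in self._last_seen.items()
                if now - seen >= self.idle_seconds]
        completed = []
        for trace_id in done:
            del self._last_seen[trace_id]
            completed.append(self._spans.pop(trace_id))
        return completed


def extract_features(trace: List[Span]) -> dict:
    """Feature extractor: computes per-trace features from a non-empty trace."""
    return {
        "trace_id": trace[0].trace_id,
        "span_count": len(trace),
        "spans_per_service": dict(Counter(s.service for s in trace)),
    }


def aggregate(features: List[dict]) -> Dict[str, int]:
    """Optional aggregator: sums span counts per service across traces."""
    total: Counter = Counter()
    for f in features:
        total.update(f["spans_per_service"])
    return dict(total)
```

In this sketch the tracing backend is represented only by whatever code calls `on_span` for each received span. A driver loop would periodically call `pop_completed`, pass each completed trace to `extract_features`, and optionally hand the results to `aggregate`. Passing the current time in explicitly, rather than reading a clock inside the trigger, keeps the timeout logic easy to test.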
