Chapter 4. More Complex, Adaptive Models
As we saw in the previous chapter, it is relatively easy to build the very simplest anomaly detector that looks for deviations from an ideal value. Tools like the t-digest can help by analyzing historical data to accurately find a good threshold. Statistically, such a system is building a model of the input data that describes the data as a constant value with some additive noise. For a system like most of the ones we have seen so far, the model is nearly trivial, but it is a model nonetheless.
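To make the "constant value with some additive noise" model concrete, here is a minimal sketch of a threshold detector fit from historical data. It is an illustration only, not the book's implementation: it uses a plain sorted-list quantile as a stand-in for the t-digest, and the names `fit_threshold` and `is_anomaly`, the 0.999 quantile, and the synthetic history are all assumptions made for this example.

```python
import random


def fit_threshold(history, q=0.999):
    """Fit the 'constant plus noise' model from historical samples.

    Returns (center, threshold): center is the median of the history
    (the constant value), and threshold is the empirical q-quantile of
    the absolute deviations from that center. A t-digest would estimate
    the same quantile in bounded memory; sorting is used here only to
    keep the sketch self-contained.
    """
    ordered = sorted(history)
    center = ordered[len(ordered) // 2]
    deviations = sorted(abs(x - center) for x in history)
    index = min(len(deviations) - 1, int(q * len(deviations)))
    return center, deviations[index]


def is_anomaly(x, center, threshold):
    """Flag a sample whose deviation from the center exceeds the threshold."""
    return abs(x - center) > threshold


# Synthetic history (assumed for illustration): a constant level of 10
# with Gaussian noise of standard deviation 1.
rng = random.Random(42)
history = [10.0 + rng.gauss(0.0, 1.0) for _ in range(10_000)]

center, threshold = fit_threshold(history)
print(is_anomaly(10.0, center, threshold))  # a typical value
print(is_anomaly(25.0, center, threshold))  # far outside the noise band
```

Because the center and threshold are fixed once they are fit, a detector like this cannot follow a signal whose normal level drifts or cycles, which is the limitation the rest of this chapter addresses.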
But what about more complicated situations, such as the one shown at the end of the last chapter in Figure 3-3? Systems that are not stationary, or that have complicated patterns even when they are roughly periodic, require something more than a simple threshold detector. And what happens when conditions change?
What is needed is an adaptive machine-learning model for anomaly detection. In Chapter 2 we discussed the idea of a probabilistic model that is trained using histories of past events to estimate their likelihood of occurrence as a way to describe what is normal. This type of model is adaptive: as small fluctuations occur in the majority of events, our model can adjust its view of “normal” accordingly. In other words, it adapts to reasonable variations. To return to our bird-watching analogy, if our bird detector is looking for unusual species (“accidentals” in bird-watching jargon) or significant and possibly catastrophic changes in the population ...