16.1 Introduction

Data relevant to insider threats is typically accumulated over many years of organizational and system operations, and is therefore best characterized as an unbounded data stream. Such a stream can be partitioned into a sequence of discrete chunks; for example, each chunk might comprise a week’s worth of data. Figure 16.1 illustrates how a classifier’s decision boundary changes when such a stream exhibits concept drift. Each circle in the figure denotes a data point, with unfilled circles representing true negatives (TNs) (i.e., nonanomalies) and solid circles representing true positives (TPs) (i.e., anomalies). The solid line in each chunk represents the decision boundary for ...
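The partitioning of a stream into week-long chunks, mentioned above, can be sketched as follows. This is a minimal illustration, not the book's implementation: the function name `weekly_chunks`, the `(timestamp, record)` event layout, and the sample data are all assumptions introduced here, and the stream is assumed to arrive in time order.

```python
from datetime import datetime, timedelta
from itertools import groupby


def weekly_chunks(events, start, days=7):
    """Lazily group a time-ordered stream of (timestamp, record) pairs.

    Yields (chunk_index, list_of_events), where chunk_index counts
    fixed-width windows of `days` days measured from `start`.
    Hypothetical helper; assumes events arrive in timestamp order.
    """
    width = timedelta(days=days)

    def chunk_index(event):
        # Floor division of two timedeltas gives an integer window number.
        return (event[0] - start) // width

    for index, group in groupby(events, key=chunk_index):
        yield index, list(group)


# Illustrative data only: six events at day offsets 0, 3, 6, 7, 13, 15.
start = datetime(2024, 1, 1)
events = [(start + timedelta(days=d), f"event-{d}") for d in (0, 3, 6, 7, 13, 15)]

chunks = [(i, [rec for _, rec in evs]) for i, evs in weekly_chunks(events, start)]
for index, records in chunks:
    print(index, records)
```

Because `groupby` consumes its input lazily, the same generator works on an iterator that never ends, which suits the unbounded-stream setting; each chunk is emitted once the first event of the following week arrives.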
