Numerosity reduction – clustering design pattern

This design pattern explores the implementation of the clustering technique for data reduction.


Clustering belongs to the numerosity reduction category of data reduction: it replaces the original records with a smaller representation of the same data. Clustering is a nonparametric technique that relies on unsupervised learning, so it works without prior knowledge of class labels.
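
To make the idea of numerosity reduction concrete, the following is a minimal sketch in Python, not the book's own implementation. It assumes k-means as the clustering algorithm and reduces a dataset to one (centroid, count) pair per cluster. The function name `kmeans_reduce`, its parameters, and the sample points are hypothetical and chosen only for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans_reduce(points, k, iterations=20):
    """Reduce `points` to k (centroid, count) pairs using plain k-means."""
    # Assumption: seed the centroids with the first k points. This keeps the
    # sketch deterministic, but it is not a robust initialization strategy.
    centroids = [tuple(p) for p in points[:k]]

    def nearest(p):
        # Index of the centroid closest to point p.
        return min(range(k), key=lambda c: dist(p, centroids[c]))

    for _ in range(iterations):
        # Assignment step: label each point with its nearest centroid.
        labels = [nearest(p) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster's centroid in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated

    # Final pass: count how many original points each centroid stands in for.
    labels = [nearest(p) for p in points]
    return [(centroids[c], labels.count(c)) for c in range(k)]


# Illustrative data only: six 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
reduced = kmeans_reduce(points, k=2)
print(reduced)
```

Here six records are replaced by two centroids and their member counts, which is the reduced representation. How much detail is lost depends on the choice of `k` and on how well the data actually separates into groups.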


Clustering is a general approach to the problem of grouping data. It can be achieved by various algorithms that differ in how they define what belongs in a group and in how they find candidates for that group. There are more than 100 different implementations of clustering algorithms, each addressing different problems and objectives. There is no single size that fits all the clustering ...
