
Maintaining Histograms from Data Streams 59
4.4 Applications to Data Mining
Discretization of continuous attributes is an important task for certain
types of machine learning algorithms. Bayesian approaches, for instance,
require assumptions about the data distribution. Decision trees require
sorting operations to deal with continuous attributes, which considerably
increases learning time. Hundreds of discretization methods now exist; see,
for example, Dougherty et al. (1995), Yang (2003), Kerber (1992), and
Boulle (2004). Dougherty et al. (1995) define three dimensions along which
discretization methods may be classified: supervised vs. unsupervised, global ...