September 2015
Beginner to intermediate
608 pages
13h 43m
English
There is one problem that even the Mahalanobis distance measure cannot overcome, though: the curse of dimensionality. As the number of dimensions in a dataset rises, every point tends to become equally far from every other point. We can demonstrate this quite simply with the following code:
(defn ex-6-27 []
  (let [distances (for [d (range 2 100)
                        :let [data (->> (dataset-of-dimension d)
                                        (s/mahalanobis-distance)
                                        (map first))]]
                    [(apply min data) (apply max data)])]
    (-> (c/xy-plot (range 2 101) (map first distances)
                   :x-label "Number of Dimensions"
                   :y-label "Distance Between Points"
                   :series-label "Minimum Distance"
                   :legend true)
        (c/add-lines (range 2 101) (map second distances)
                     :series-label "Maximum Distance")
        ...
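To see the same effect without any plotting or statistics library, we can sketch it in plain Clojure. This is an illustration under stated assumptions rather than the listing above: it swaps the Mahalanobis distance for the ordinary Euclidean distance from the centroid, draws each coordinate from a standard normal distribution, and uses hypothetical helper names (random-point, euclidean-distance, centroid, and min-max-ratio) that do not appear elsewhere in the text. If the nearest and farthest points become almost equally distant as dimensions are added, the ratio of the minimum to the maximum distance should move towards 1.

```clojure
(import 'java.util.Random)

;; Hypothetical helper: a point with d coordinates, each drawn from a
;; standard normal distribution (an assumption made for this sketch).
(defn random-point
  [^Random rng d]
  (vec (repeatedly d #(.nextGaussian rng))))

;; Plain Euclidean distance, used here in place of the Mahalanobis distance.
(defn euclidean-distance
  [a b]
  (Math/sqrt (reduce + (map (fn [x y]
                              (let [diff (- x y)]
                                (* diff diff)))
                            a b))))

;; Coordinate-wise mean of a collection of points.
(defn centroid
  [points]
  (let [n (count points)]
    (apply mapv (fn [& xs] (/ (reduce + xs) n)) points)))

;; Ratio of the smallest to the largest distance from the centroid
;; for n random points in d dimensions. The seed makes runs repeatable.
(defn min-max-ratio
  [seed n d]
  (let [rng    (Random. seed)
        points (vec (repeatedly n #(random-point rng d)))
        c      (centroid points)
        dists  (map #(euclidean-distance c %) points)]
    (/ (apply min dists) (apply max dists))))

;; Print the ratio for a few dimensionalities.
(doseq [d [2 10 100 1000]]
  (println d (min-max-ratio 42 100 d)))
```

With only two dimensions some points sit very close to the centroid while others lie far out, so the ratio is small. With a thousand dimensions the closest and farthest points are nearly the same distance away, which is the behaviour the chart of minimum and maximum distances is intended to show.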