February 2018
Intermediate to advanced
378 pages
10h 14m
English
The ways to assess the quality of a model's predictions quantitatively are known as metrics. The simplest metric in classification is accuracy, the proportion of correctly classified cases. Accuracy can be misleading, however. Imagine that you have a training set of 1000 samples: 999 of them belong to class A and 1 to class B. A dataset like this is called imbalanced. The baseline (simplest) solution in this case would be to always predict class A. The accuracy of such a model would be 0.999, which looks impressive, but only if you don't know the ratio of classes in the training set. Now imagine that class A corresponds to an outcome of healthy, and class B to cancer, in the ...
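The accuracy arithmetic for the 999-to-1 example can be checked with a short sketch. It is written in plain Python as an illustration only; the names `y_true`, `y_pred`, `accuracy`, and `detected_b` are hypothetical and do not come from the original text.

```python
# Hypothetical labels matching the example: 999 samples of class A, 1 of class B.
y_true = ["A"] * 999 + ["B"]

# Baseline model: always predict the majority class A.
y_pred = ["A"] * len(y_true)


def accuracy(true_labels, predicted_labels):
    """Return the proportion of correctly classified cases."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


baseline_accuracy = accuracy(y_true, y_pred)

# Count how many class B cases the baseline actually identifies.
detected_b = sum(t == "B" and p == "B" for t, p in zip(y_true, y_pred))

print(baseline_accuracy)  # 0.999
print(detected_b)         # 0
```

The score is high only because class A dominates the data: the baseline never identifies the single class B sample, which accuracy alone does not reveal.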