Developing analytics without metrics is insufficient. It is important to build measures that examine whether the analytics are generating classifications that are statistically significant, economically useful, and stable. For an analytic to be statistically valid, it should meet some criterion that signifies classification accuracy and power. Being economically useful sets a different bar: does it make money? Stability has two aspects. First, does the analytic perform well both in sample and out of sample? Second, is the behavior of the algorithm stable across training corpora?
Here, we explore some of the metrics that have been developed and propose others. No doubt, as the range of analytics grows, so will the range of metrics.
2.4.1 Confusion matrix
The confusion matrix is the classic tool for assessing classification accuracy. Given n categories, the matrix is of dimension n × n. The rows relate to the category assigned by the analytic algorithm and the columns refer to the correct category in which the text resides. Each cell (i, j) of the matrix contains the number of text messages that were of type j and were classified as type i. The cells on the diagonal of the confusion matrix state the number of times the algorithm got the classification right. All other cells are instances of classification error. If an algorithm has no classification ability, then the rows and columns of the matrix will be independent of each other. Under this null hypothesis, the statistic ...
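The cell convention described above (rows are assigned categories, columns are correct categories) can be made concrete with a minimal Python sketch. The category names, the sample labels, and the helper functions `confusion_matrix`, `accuracy`, and `expected_under_independence` are hypothetical, introduced only for illustration. The last helper computes the cell counts one would expect if rows and columns were independent, using the usual product of row and column totals divided by the grand total. It is meant only as the no-ability benchmark and does not attempt to reproduce any particular test statistic.

```python
def confusion_matrix(assigned, correct, categories):
    """Build an n x n matrix of counts.

    Cell [i][j] counts messages whose correct category is categories[j]
    and whose assigned category is categories[i] (rows = assigned,
    columns = correct, as in the text).
    """
    index = {c: k for k, c in enumerate(categories)}
    n = len(categories)
    matrix = [[0] * n for _ in range(n)]
    for a, c in zip(assigned, correct):
        matrix[index[a]][index[c]] += 1
    return matrix


def accuracy(matrix):
    """Share of messages on the diagonal, i.e. classified correctly."""
    total = sum(sum(row) for row in matrix)
    right = sum(matrix[k][k] for k in range(len(matrix)))
    return right / total


def expected_under_independence(matrix):
    """Expected cell counts if rows and columns were independent:
    (row total * column total) / grand total."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    return [[r * c / total for c in col_sums] for r in row_sums]


# Hypothetical example: eight messages, three categories (assumed names).
categories = ["buy", "hold", "sell"]
correct = ["buy", "buy", "hold", "sell", "sell", "hold", "buy", "sell"]
assigned = ["buy", "hold", "hold", "sell", "buy", "hold", "buy", "sell"]

cm = confusion_matrix(assigned, correct, categories)
expected = expected_under_independence(cm)

for row in cm:
    print(row)
print("accuracy:", accuracy(cm))
```

In this toy sample, six of the eight messages fall on the diagonal, so the accuracy is 0.75. Comparing `cm` cell by cell with `expected` shows how far the observed counts sit from what an algorithm with no classification ability would produce. Note that some libraries place the correct category on the rows instead, so the orientation should be checked before comparing matrices across tools.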