A Framework for Supervised Classification Performance Analysis with Information-Theoretic Methods
Abstract: We introduce a framework for the evaluation of multiclass classifiers by exploring their confusion matrices. Instead of using error-counting measures of performance, we concentrate on quantifying the information transfer from true to estimated labels using information-theoretic measures. First, the Entropy Triangle allows us to visualize the balance of mutual information, variation of information and the deviation from uniformity in the true and estimated label distributions. Next, the Entropy-Modified Accuracy allows us to rank classifiers by performance, while the Normalized Information Transfer rate allows us to evaluate classifiers by the amount of information accrued during learning. Finally, if the question arises of which errors the classifier commits systematically, we use a generalization of Formal Concept Analysis to elicit such knowledge. All such techniques can be applied to either artificially or biologically embodied classifiers, e.g. human performance on perceptual tasks. We instantiate the framework in a number of examples to provide guidelines for using these tools to assess single classifiers or populations of them (whether induced with the same technique or not), on either single tasks or sets of them. These include well-known UCI tasks and the more complex KDD Cup 99 competition on intrusion detection.
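As an illustration of the entropy balance that the Entropy Triangle visualizes, the following minimal sketch computes the three quantities named above from a confusion matrix. It is not the authors' implementation: the function names `entropy` and `entropy_balance` are hypothetical, the matrix is assumed to hold raw counts with true labels on rows and estimated labels on columns, and all entropies are in bits. With these standard definitions, the deviation from uniformity plus twice the mutual information plus the variation of information equals the sum of the entropies of uniform distributions over the true and estimated label sets.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a probability vector (zero entries are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def entropy_balance(confusion):
    """Split the entropy of a confusion matrix of counts.

    Assumption: rows index true labels, columns index estimated labels.
    Returns (delta_h, mi, vi) in bits, where
      delta_h = deviation of both marginals from uniformity,
      mi      = mutual information between true and estimated labels,
      vi      = variation of information, H(X|Y) + H(Y|X).
    """
    total = sum(sum(row) for row in confusion)
    joint = [[count / total for count in row] for row in confusion]
    p_true = [sum(row) for row in joint]        # marginal over true labels
    p_pred = [sum(col) for col in zip(*joint)]  # marginal over estimated labels

    h_true = entropy(p_true)
    h_pred = entropy(p_pred)
    h_joint = entropy([p for row in joint for p in row])
    h_uniform = math.log2(len(p_true)) + math.log2(len(p_pred))

    mi = h_true + h_pred - h_joint
    vi = h_joint - mi
    delta_h = h_uniform - h_true - h_pred
    return delta_h, mi, vi


# Hypothetical two-class confusion matrix, for illustration only.
delta_h, mi, vi = entropy_balance([[45, 5], [10, 40]])
balance = delta_h + 2 * mi + vi  # equals log2(2) + log2(2) up to rounding
```

Dividing the three values by the uniform-entropy total (with the mutual information counted twice) gives coordinates that sum to one, which is what allows a classifier to be plotted as a point in a triangle.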