Multi-class Classification Metric

I am working on a statistical classification project and was looking for a metric that describes the performance of my algorithm. In my case, accuracy is not a very useful metric, because the classes in my data sets are very unevenly sized. For example, I might have a data set made up of 95% class A and 5% class B. If my model never predicted class B, I might falsely conclude that my algorithm was working well, because it would still predict the correct answer 95% of the time.
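To make the 95%/5% scenario concrete, here is a minimal sketch in Java. The class name `AccuracyPitfall`, its helper methods, and the 100-sample data set are all made up for illustration; the "model" is simply one that always answers A.

```java
import java.util.Arrays;

final class AccuracyPitfall {
    /** Fraction of positions where the predicted label equals the true label. */
    static double accuracy(String[] truth, String[] predicted) {
        int correct = 0;
        for (int i = 0; i < truth.length; i++) {
            if (truth[i].equals(predicted[i])) {
                correct++;
            }
        }
        return (double) correct / truth.length;
    }

    /** Recall for one label: correctly predicted instances / actual instances. */
    static double recall(String[] truth, String[] predicted, String label) {
        int actual = 0;
        int hits = 0;
        for (int i = 0; i < truth.length; i++) {
            if (truth[i].equals(label)) {
                actual++;
                if (predicted[i].equals(label)) {
                    hits++;
                }
            }
        }
        return actual == 0 ? 0.0 : (double) hits / actual;
    }

    public static void main(String[] args) {
        // Hypothetical data: 95 samples of class A followed by 5 of class B.
        String[] truth = new String[100];
        Arrays.fill(truth, 0, 95, "A");
        Arrays.fill(truth, 95, 100, "B");

        // A degenerate model that always answers "A".
        String[] predicted = new String[100];
        Arrays.fill(predicted, "A");

        System.out.println("Acc = " + accuracy(truth, predicted));
        System.out.println("Recall(B) = " + recall(truth, predicted, "B"));
    }
}
```

Accuracy comes out at 0.95 even though the recall for class B is 0, which is exactly the failure that accuracy alone hides.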

I have read about the F1 score, which is often used in place of accuracy in binary classification problems. To adapt the F1 score to my multi-class problem, I compute the F1 score for each class and take the average across classes (commonly called the macro-averaged F1). This seems to work nicely. Here is an example:

String[] truth = new String[]     {"C", "A", "C", "C", "A", "B"};  
String[] predicted = new String[] {"A", "A", "C", "C", "A", "C"};

F1 = 0.606061  
Acc = 0.666667

B predicted as C 1 times  
C predicted as A 1 times  
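As a check on these numbers, the example can be worked by hand from the six pairs. The symbols below (TP, FP, FN for true positives, false positives, false negatives) are the standard definitions, not notation from the post.

```latex
% Per-class definitions (k ranges over the classes A, B, C)
P_k = \frac{TP_k}{TP_k + FP_k}, \qquad
R_k = \frac{TP_k}{TP_k + FN_k}, \qquad
F1_k = \frac{2 P_k R_k}{P_k + R_k}

% A: TP = 2, FP = 1, FN = 0
P_A = \tfrac{2}{3}, \quad R_A = 1, \quad F1_A = \tfrac{4}{5}

% B: TP = 0, FP = 0, FN = 1 (precision is 0/0, so F1 is taken as 0)
R_B = 0, \quad F1_B = 0

% C: TP = 2, FP = 1, FN = 1
P_C = \tfrac{2}{3}, \quad R_C = \tfrac{2}{3}, \quad F1_C = \tfrac{2}{3}

% Accuracy: 4 of the 6 predictions are correct
\mathrm{Acc} = \tfrac{4}{6} \approx 0.666667

% Mean of the per-class F1 scores
\frac{1}{3}\left(\tfrac{4}{5} + 0 + \tfrac{2}{3}\right) = \tfrac{22}{45} \approx 0.488889

% F1 of the averaged precision and recall, with B left out of the
% precision average because it is never predicted
\bar{P} = \tfrac{1}{2}\left(\tfrac{2}{3} + \tfrac{2}{3}\right) = \tfrac{2}{3}, \qquad
\bar{R} = \tfrac{1}{3}\left(1 + 0 + \tfrac{2}{3}\right) = \tfrac{5}{9}, \qquad
\frac{2 \bar{P} \bar{R}}{\bar{P} + \bar{R}} = \tfrac{20}{33} \approx 0.606061
```

The accuracy and the two confusions match the output above. The reported F1 of 0.606061 matches the last line rather than the plain mean of per-class F1 scores (about 0.488889), so the figure appears to be the F1 of the averaged precision and recall. That reading is inferred from the arithmetic; the post does not say which variant was used.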

I have implemented an Evaluation class: you feed it pairs of true and predicted outcomes, and it calculates the confusion matrix and the F1 score.
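The original class is not shown here, so the following is only a rough sketch of what such a class could look like. Everything in it is an assumption made for illustration: the method names (`add`, `accuracy`, `macroF1`, `f1OfMacroAverages`, `confusions`), the nested-map confusion matrix, and the choice to score a class with no true positives as F1 = 0.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch, not the author's actual Evaluation class. */
final class Evaluation {
    // counts.get(t).get(p) = number of samples with true label t predicted as p
    private final Map<String, Map<String, Integer>> counts = new TreeMap<>();
    private final Set<String> labels = new TreeSet<>();
    private int total = 0;

    /** Records one (true outcome, predicted outcome) pair. */
    void add(String truth, String predicted) {
        counts.computeIfAbsent(truth, k -> new TreeMap<>()).merge(predicted, 1, Integer::sum);
        labels.add(truth);
        labels.add(predicted);
        total++;
    }

    /** One cell of the confusion matrix. */
    int count(String truth, String predicted) {
        Map<String, Integer> row = counts.getOrDefault(truth, Collections.emptyMap());
        return row.getOrDefault(predicted, 0);
    }

    private int actual(String label) {        // TP + FN
        int n = 0;
        for (String p : labels) n += count(label, p);
        return n;
    }

    private int predictedAs(String label) {   // TP + FP
        int n = 0;
        for (String t : labels) n += count(t, label);
        return n;
    }

    double accuracy() {
        int correct = 0;
        for (String l : labels) correct += count(l, l);
        return total == 0 ? 0.0 : (double) correct / total;
    }

    /** Per-class F1 = 2*TP / (2*TP + FP + FN); 0 if the label never occurs. */
    double f1(String label) {
        int denom = actual(label) + predictedAs(label);
        return denom == 0 ? 0.0 : 2.0 * count(label, label) / denom;
    }

    /** Unweighted mean of the per-class F1 scores. */
    double macroF1() {
        double sum = 0.0;
        for (String l : labels) sum += f1(l);
        return labels.isEmpty() ? 0.0 : sum / labels.size();
    }

    /** F1 of mean precision and mean recall, skipping undefined (0/0) terms. */
    double f1OfMacroAverages() {
        double pSum = 0.0, rSum = 0.0;
        int pN = 0, rN = 0;
        for (String l : labels) {
            if (predictedAs(l) > 0) { pSum += (double) count(l, l) / predictedAs(l); pN++; }
            if (actual(l) > 0) { rSum += (double) count(l, l) / actual(l); rN++; }
        }
        if (pN == 0 || rN == 0) return 0.0;
        double p = pSum / pN, r = rSum / rN;
        return p + r == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }

    /** Off-diagonal cells of the confusion matrix as readable lines. */
    List<String> confusions() {
        List<String> out = new ArrayList<>();
        for (String t : labels)
            for (String p : labels)
                if (!t.equals(p) && count(t, p) > 0)
                    out.add(t + " predicted as " + p + " " + count(t, p) + " times");
        return out;
    }

    public static void main(String[] args) {
        String[] truth = {"C", "A", "C", "C", "A", "B"};
        String[] predicted = {"A", "A", "C", "C", "A", "C"};
        Evaluation eval = new Evaluation();
        for (int i = 0; i < truth.length; i++) eval.add(truth[i], predicted[i]);

        System.out.println("Mean per-class F1 = " + eval.macroF1());
        System.out.println("F1 of averaged P/R = " + eval.f1OfMacroAverages());
        System.out.println("Acc = " + eval.accuracy());
        eval.confusions().forEach(System.out::println);
    }
}
```

On the six-pair example, `accuracy()` gives about 0.666667 and `confusions()` lists the same two errors. `macroF1()` gives about 0.488889, while `f1OfMacroAverages()` gives about 0.606061, which is the variant that matches the F1 figure quoted in the example.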

Paul Soucy
