# Multi-class Classification Metric

I am working on a statistical classification project and was looking for a metric that describes the performance of my algorithm. In my case, accuracy is not very useful because the classes in my data sets are highly imbalanced. For example, a data set might consist of 95% class A and 5% class B. If my model never predicted class B, it would still give the correct answer 95% of the time, and I might falsely conclude that the algorithm was working well.
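To make the 95%/5% case concrete, here is a minimal sketch. The class name `ImbalanceDemo`, its helper methods, and the "always predict A" model are made up for illustration and are not part of my project:

```java
public class ImbalanceDemo {
    // Fraction of positions where the prediction equals the true label.
    static double accuracy(String[] truth, String[] predicted) {
        int correct = 0;
        for (int i = 0; i < truth.length; i++) {
            if (truth[i].equals(predicted[i])) correct++;
        }
        return (double) correct / truth.length;
    }

    // Recall for one class: correctly predicted members / all true members.
    // Returns 0 when the class never occurs in truth (an assumed convention).
    static double recall(String[] truth, String[] predicted, String label) {
        int truePositives = 0;
        int actual = 0;
        for (int i = 0; i < truth.length; i++) {
            if (truth[i].equals(label)) {
                actual++;
                if (predicted[i].equals(label)) truePositives++;
            }
        }
        return actual == 0 ? 0.0 : (double) truePositives / actual;
    }

    public static void main(String[] args) {
        String[] truth = new String[100];
        String[] predicted = new String[100];
        for (int i = 0; i < 100; i++) {
            truth[i] = i < 95 ? "A" : "B"; // 95% class A, 5% class B
            predicted[i] = "A";            // hypothetical model that never predicts B
        }
        System.out.printf("Acc = %f%n", accuracy(truth, predicted));
        System.out.printf("Recall(B) = %f%n", recall(truth, predicted, "B"));
    }
}
```

Accuracy comes out at 0.95 while recall for class B is 0, which is exactly the failure that accuracy alone hides.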

I have read about the F1 score, which seems to be commonly used in place of accuracy for binary classification problems. To adapt it to my multi-class problem, I compute the F1 score for each class and take the average. This seems to work nicely. Here is an example:

```
// Input
String[] truth = new String[] {"C", "A", "C", "C", "A", "B"};
String[] predicted = new String[] {"A", "A", "C", "C", "A", "C"};

// Output
F1 = 0.606061
Acc = 0.666667
B predicted as C 1 times
C predicted as A 1 times
```
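For comparison, here is a minimal sketch of averaging the per-class F1 scores directly. It is not the `Evaluation` class; the name `MacroF1Sketch`, its methods, and the choice to score a class as 0 when its F1 is undefined are assumptions made for illustration:

```java
import java.util.Set;
import java.util.TreeSet;

public class MacroF1Sketch {
    // F1 for a single class, computed as 2*TP / (2*TP + FP + FN).
    // Returns 0 when the denominator is 0 (an assumed convention).
    static double f1(String[] truth, String[] predicted, String label) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < truth.length; i++) {
            boolean isTrue = truth[i].equals(label);
            boolean isPred = predicted[i].equals(label);
            if (isTrue && isPred) tp++;
            else if (isPred) fp++;   // predicted as label, but truth differs
            else if (isTrue) fn++;   // truth is label, but prediction differs
        }
        int denom = 2 * tp + fp + fn;
        return denom == 0 ? 0.0 : 2.0 * tp / denom;
    }

    // Unweighted mean of the per-class F1 scores over every label
    // that appears in either array. Assumes non-empty input.
    static double macroF1(String[] truth, String[] predicted) {
        Set<String> labels = new TreeSet<>();
        for (String s : truth) labels.add(s);
        for (String s : predicted) labels.add(s);
        double sum = 0.0;
        for (String label : labels) sum += f1(truth, predicted, label);
        return sum / labels.size();
    }

    public static void main(String[] args) {
        String[] truth = {"C", "A", "C", "C", "A", "B"};
        String[] predicted = {"A", "A", "C", "C", "A", "C"};
        System.out.printf("Macro F1 = %f%n", macroF1(truth, predicted));
    }
}
```

On the example data the per-class scores are 0.8 for A, 0 for B, and about 0.666667 for C, so this sketch gives a mean of about 0.488889 rather than 0.606061. The 0.606061 figure is consistent with a different convention: the harmonic mean of the averaged precision (2/3, leaving out B because nothing was predicted as B) and the averaged recall (5/9). The exact value therefore depends on which averaging convention is used and on how undefined precision is handled.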

I have implemented an `Evaluation` class: you feed it pairs of true and predicted outcomes, and it computes the confusion matrix and the F1 score.