How does Keras calculate accuracy from class-wise probabilities? Say, for example, we have 100 samples in the test set, each of which can belong to one of two classes, and a list of the class-wise probabilities. What threshold does Keras use to assign a sample to either of the two classes?

pseudomonas

1 Answer

For binary classification, the code for the accuracy metric is (K is the Keras backend module, i.e. from keras import backend as K):

K.mean(K.equal(y_true, K.round(y_pred)))

which suggests that 0.5 is the threshold to distinguish between classes. y_true only needs to hold 0/1 labels in this case; one-hot encoding is not required.
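To make the thresholding concrete, here is a minimal pure-Python sketch of the same computation, assuming y_true holds 0/1 labels and y_pred holds one probability per sample. The function name and the sample values are made up for illustration; this is not the Keras source. Python's round(), like K.round, rounds ties to the nearest even number, so a prediction of exactly 0.5 is assigned to class 0.

```python
def binary_accuracy(y_true, y_pred):
    """Fraction of samples whose rounded prediction equals the 0/1 label."""
    # round() sends values above 0.5 to 1 and values below 0.5 to 0;
    # a tie at exactly 0.5 rounds to the even number 0.
    matches = [float(t == round(p)) for t, p in zip(y_true, y_pred)]
    return sum(matches) / len(matches)


# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 0]
y_pred = [0.9, 0.4, 0.3, 0.2]

print(binary_accuracy(y_true, y_pred))  # 0.75 (the third sample is misclassified)
```

Only the third sample counts as wrong here: its probability of 0.3 rounds to 0 while the label is 1.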

It's a bit different for categorical classification:

K.mean(K.equal(K.argmax(y_true, axis=-1), K.argmax(y_pred, axis=-1)))

which means "how often the predictions have their maximum in the same position as the true values".
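The argmax comparison can be sketched in plain Python as follows, assuming one-hot rows in y_true and one row of class probabilities per sample in y_pred. The helper names and the sample values are hypothetical and only mirror the formula above.

```python
def argmax(row):
    """Index of the largest value in a row (first index wins on ties)."""
    return max(range(len(row)), key=row.__getitem__)


def categorical_accuracy(y_true, y_pred):
    """Fraction of samples where the predicted class equals the true class."""
    matches = [float(argmax(t) == argmax(p)) for t, p in zip(y_true, y_pred)]
    return sum(matches) / len(matches)


# Hypothetical one-hot labels and predicted probabilities, for illustration only.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]

# The second sample predicts class 0 but the true class is 1,
# so two of the three samples match.
print(categorical_accuracy(y_true, y_pred))
```

No threshold is involved in this case: the class with the highest probability wins, even if that probability is below 0.5.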

There is also an option for top-k categorical accuracy, which is similar to the one above but calculates how often the target class is within the top-k predictions.
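A rough pure-Python sketch of the top-k idea, again assuming one-hot rows in y_true. The function name, the default k=2 and the sample values are chosen for illustration and are not taken from Keras.

```python
def top_k_categorical_accuracy(y_true, y_pred, k=2):
    """Fraction of samples whose true class is among the k highest predictions."""
    hits = []
    for t, p in zip(y_true, y_pred):
        target = max(range(len(t)), key=t.__getitem__)  # index of the true class
        # Class indices sorted by predicted probability, highest first.
        top_k = sorted(range(len(p)), key=p.__getitem__, reverse=True)[:k]
        hits.append(float(target in top_k))
    return sum(hits) / len(hits)


# Hypothetical one-hot labels and predicted probabilities, for illustration only.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.5, 0.4, 0.1]]

print(top_k_categorical_accuracy(y_true, y_pred, k=1))  # one of three samples hits
print(top_k_categorical_accuracy(y_true, y_pred, k=2))  # two of three samples hit
```

With k=1 this reduces to the plain categorical accuracy described above.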

Ethan
Mikhail Yurasov
  • Thank you for the answer. Does that mean even for binary classification, the labels need to be one hot encoded? – pseudomonas Mar 20 '17 at 05:02
  • @Raghuram No, for binary classification you just need 0 or 1 as the class; there is no need to one-hot encode the labels. K.mean(K.equal(y_true, K.round(y_pred))) compares two float values for each sample, so the label has to be 0 or 1 and not [0,1] or [1,0]. – Divyanshu Kalra Jul 04 '17 at 20:13
  • For categorical accuracy, use categorical_accuracy. – Shital Shah Dec 23 '17 at 11:12
  • For a multi-class problem (with more than two classes), is there a difference between using "accuracy" and "categorical_accuracy"? – Quetzalcoatl Nov 06 '18 at 20:03
  • And just in case: if the classes are mutually exclusive, use sparse_categorical_accuracy instead of categorical_accuracy; this usually improves the outputs. The difference is discussed here. – Noir Dec 10 '19 at 19:51
  • @mikhail - in my case my GT labels are [1 0 0 0 0 1] and the predicted values are generally [0.23 0.34 0.45 0.22 0.10 0.9]. Basically only the last one matches, and the rest are counted as matches because of the threshold, artificially inflating the results. Any suggestions on what other metric can be used here? – Vikram Murthy Apr 16 '20 at 07:04
  • What is K? Because if it's supposed to be keras, I get module 'tensorflow.keras' has no attribute 'round' – Jack M Dec 05 '20 at 19:29