
For detection, a common way to decide whether a single object proposal is correct is Intersection over Union (IoU, also written IU). It takes the set $A$ of proposed object pixels and the set $B$ of true object pixels and calculates:

$$IoU(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
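
A minimal sketch of this computation for two binary pixel masks (just an illustration; the function name and the NumPy-array assumption are mine):

import numpy as np

def iou(proposed, true):
    # IoU of two boolean masks of equal shape: proposed vs. true object pixels
    intersection = np.logical_and(proposed, true).sum()
    union = np.logical_or(proposed, true).sum()
    return float(intersection / union) if union > 0 else 0.0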

Commonly, IoU > 0.5 counts as a hit; otherwise it counts as a miss. For each class $c$, one can then calculate:

  • True Positive ($TP(c)$): a proposal was made for class $c$ and there actually was an object of class $c$
  • False Positive ($FP(c)$): a proposal was made for class $c$, but there is no object of class $c$
  • Average Precision for class $c$: $\frac{\#TP(c)}{\#TP(c) + \#FP(c)}$

The mAP (mean average precision) = $\frac{1}{|classes|}\sum_{c \in classes} \frac{\#TP(c)}{\#TP(c) + \#FP(c)}$
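
As a sketch, these two formulas translate directly into code once the per-class counts are available (the dictionaries below are hypothetical inputs keyed by class):

def precision_per_class(tp, fp):
    # #TP(c) / (#TP(c) + #FP(c)) as written above
    return tp / (tp + fp)

def mean_average_precision(tp, fp):
    # mean of the per-class ratio over all classes
    return sum(precision_per_class(tp[c], fp[c]) for c in tp) / len(tp)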

If one wants stricter proposals, one increases the IoU threshold from 0.5 to a higher value (up to 1.0, which would require a perfect match). One can denote this with mAP@p, where $p \in (0, 1)$ is the IoU threshold.
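
A simplified sketch of how the threshold $p$ enters (this ignores the greedy matching of proposals to ground-truth objects that real evaluators perform; the function name is illustrative):

def count_tp_fp(best_ious, p=0.5):
    # best_ious: for one class, the best IoU of each proposal against the
    # ground-truth objects of that class; IoU > p counts as a hit
    tp = sum(1 for iou in best_ious if iou > p)
    fp = len(best_ious) - tp
    return tp, fp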

But what does mAP@[.5:.95] (as found in this paper) mean?

Martin Thoma
  • I suspect the [.5:.95] part refers to a range of IoU values, but how that range is assessed into a single mAP I would not know. – Neil Slater Feb 07 '17 at 09:46
  • @NeilSlater But why would you want an upper boundary? Isn't a higher IoU always better? – Martin Thoma Feb 07 '17 at 10:11
  • Achieving a match with a higher IoU is better, but presumably the mAP value drops for any model if we measure how well it produces near-perfect matches, so that is not considered a useful measure. Why it is not included in the range I don't know, though; then again, I don't know how the mAP is calculated in this case - it may be a simple mean based on samples, for instance. – Neil Slater Feb 07 '17 at 10:19
  • There is a GitHub repository with an excellent explanation of IoU, precision, recall, average precision and mAP. It also has code that evaluates any object detector. It will certainly help you: https://github.com/rafaelpadilla/Object-Detection-Metrics – Rafael Padilla Jul 11 '18 at 03:29

3 Answers


mAP@[.5:.95] (sometimes written mAP@[.5,.95]) means the average mAP over different IoU thresholds, from 0.5 to 0.95 in steps of 0.05 (0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95).

There is an associated MS COCO challenge with a new evaluation metric, that averages mAP over different IoU thresholds, from 0.5 to 0.95 (written as “0.5:0.95”). [Ref]

We evaluate the mAP averaged for IoU ∈ [0.5 : 0.05 : 0.95] (COCO’s standard metric, simply denoted as mAP@[.5, .95]) and mAP@0.5 (PASCAL VOC’s metric). [Ref]

To evaluate our final detections, we use the official COCO API [20], which measures mAP averaged over IOU thresholds in [0.5 : 0.05 : 0.95], amongst other metrics. [Ref]

BTW, the COCO API source code shows exactly what mAP@[.5:.95] is doing:

# from pycocotools' cocoeval.py (Params): the ten IoU thresholds 0.5, 0.55, ..., 0.95
self.iouThrs = np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)
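
So, as a rough sketch (glossing over the per-class and per-recall-level details of the full COCO evaluation), mAP@[.5:.95] is simply the mean of the single-threshold metric over those ten thresholds; map_at_threshold below is a hypothetical callable standing in for one full evaluation pass:

import numpy as np

def map_over_thresholds(map_at_threshold, lo=0.5, hi=0.95, step=0.05):
    # map_at_threshold: hypothetical callable returning mAP at a single IoU threshold
    thresholds = np.arange(lo, hi + step / 2, step)  # 0.5, 0.55, ..., 0.95
    return float(np.mean([map_at_threshold(t) for t in thresholds]))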


Zephyr
Icyblade
  • Do you mind a question? If, for example, we have 3 instances of a certain class in the dataset and the model returns IoUs of 0.1, 0.6 and 0.9 for them, does it mean that we discard the 0.1 result, take the mean IoU of 0.75, and report the corresponding mAP? – Alex Oct 20 '17 at 09:31

You already have the answer from Icyblade. However, I want to point out that your Average Precision formula is wrong. The formula $\frac{\#TP(c)}{\#TP(c) + \#FP(c)}$ is the definition of precision, not Average Precision. For object detection, AP is defined here. Briefly, it summarises the precision/recall curve, so not only precision but also recall is taken into account (and hence false negatives are penalised too).
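
For instance, a minimal sketch of AP in that sense, using a simple all-point approximation of the area under the precision/recall curve (the detections are assumed to be already marked TP/FP and sorted by descending confidence; this glosses over the interpolation that PASCAL VOC applies):

import numpy as np

def average_precision(is_tp, num_gt):
    # is_tp: booleans for one class's detections, sorted by descending confidence
    # num_gt: number of ground-truth objects of that class (missed ones lower recall)
    is_tp = np.asarray(is_tp, dtype=bool)
    tp_cum = np.cumsum(is_tp)
    fp_cum = np.cumsum(~is_tp)
    precision = tp_cum / (tp_cum + fp_cum)
    recall = tp_cum / num_gt
    # area under the precision/recall curve via precision-weighted recall increments
    return float(np.sum(precision * np.diff(recall, prepend=0.0)))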

anhvh

AP is averaged over all categories. Traditionally, this is called "mean average precision" (mAP). We make no distinction between AP and mAP (and likewise AR and mAR) and assume the difference is clear from context.

http://cocodataset.org/#detections-eval

Mark Yang
  • I thought that mAP is the average of the per-class APs in a multi-class setting. I would like to know your (or the paper authors') definition of "category". – Cloud Cho Oct 02 '19 at 00:22