
CIFAR-10 vs. CIFAR-100 is the most popular dataset pairing for evaluating out-of-distribution (OOD) detection. On Papers with Code [1], CIFAR-10 vs. CIFAR-100 is the most used OOD detection benchmark. Google's blog post announcing Plex, their new state-of-the-art pre-trained model extensions, also uses CIFAR-10 vs. CIFAR-100 for OOD benchmarking [2]. Why is this particular pairing so popular for OOD evaluation?

[1] https://paperswithcode.com/datasets?task=out-of-distribution-detection

[2] https://ai.googleblog.com/2022/07/towards-reliability-in-deep-learning.html

1 Answer


The answer can be found in the technical report associated with the collection of CIFAR-10 and CIFAR-100 [1]. First, the data was manually annotated by paid human labelers, as described in [1]*, which gives the labels credibility. Second, and most importantly, the two datasets are mutually exclusive by construction: CIFAR-100 was deliberately collected so that its classes can serve as negative examples for CIFAR-10** (a sketch of how this is used in a typical OOD benchmark follows the footnotes below).

[1] https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf

*"We paid students to label a subset of the tiny images dataset. The labeled subset we collected consists of ten classes of objects with 6000 images in each class. The classes are airplane, automobile (but not truck or pickup truck), bird, cat, deer, dog, frog, horse, ship, and truck (but not pickup truck)."

**"We call this the CIFAR-10 dataset, after the Canadian Institute for Advanced Research, which funded the project. In addition to this dataset, we have collected another set of 600 images in each of 100 classes. This we call the CIFAR-100 dataset. The methodology for collecting this dataset was identical to that for CIFAR-10. The CIFAR-100 classes are mutually exclusive with the CIFAR-10 classes, and so they can be used as negative examples for CIFAR-10. For example, CIFAR-10 has the classes automobile and truck, but neither of these classes includes images of pickup trucks. CIFAR-100 has the class pickup truck."
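
To make the role of this mutual exclusivity concrete, here is a minimal sketch of how the CIFAR-10 vs. CIFAR-100 benchmark is commonly set up: CIFAR-10 is the in-distribution test set, CIFAR-100 is the OOD set, and a detector is scored by AUROC over the two. This is an illustration, not the exact protocol from [2]; the `model` argument is assumed to be any classifier already trained on CIFAR-10, and the maximum-softmax-probability score is used only as a simple baseline detector.

```python
# Sketch: CIFAR-10 (in-distribution) vs. CIFAR-100 (OOD) evaluation.
# Assumes `model` is any classifier trained on CIFAR-10 (hypothetical here).
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from sklearn.metrics import roc_auc_score

def msp_scores(model, loader, device="cpu"):
    """Return the max-softmax confidence for every image in the loader."""
    model.eval()
    scores = []
    with torch.no_grad():
        for images, _ in loader:
            logits = model(images.to(device))
            scores.append(F.softmax(logits, dim=1).max(dim=1).values.cpu())
    return torch.cat(scores)

def cifar10_vs_cifar100_auroc(model, device="cpu"):
    tfm = transforms.ToTensor()
    # CIFAR-10 test split is in-distribution, CIFAR-100 test split is OOD.
    in_dist = datasets.CIFAR10(root="data", train=False, download=True, transform=tfm)
    ood = datasets.CIFAR100(root="data", train=False, download=True, transform=tfm)

    in_scores = msp_scores(model, DataLoader(in_dist, batch_size=256), device)
    ood_scores = msp_scores(model, DataLoader(ood, batch_size=256), device)

    # Label in-distribution as 1 and OOD as 0; a good detector assigns
    # higher confidence to in-distribution images, giving a high AUROC.
    labels = torch.cat([torch.ones_like(in_scores), torch.zeros_like(ood_scores)])
    scores = torch.cat([in_scores, ood_scores])
    return roc_auc_score(labels.numpy(), scores.numpy())
```

Because the class sets do not overlap (e.g. pickup truck appears only in CIFAR-100), every CIFAR-100 image is a genuine negative example, which is what makes this pairing such a clean and widely reused OOD benchmark.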