Published 19 October, 2020; last updated 08 March, 2021
Progress in computer image classification performance took:
ImageNet1 is a large collection of images organized into a hierarchy of noun categories. We looked at ‘top-5 accuracy’ in categorizing images. In this task, the player is given an image, and can guess five different categories that the image might represent. It is judged as correct if the image is in fact in any of those five categories.
We used Andrej Karpathy’s interface2 for doing the ImageNet top-5 accuracy task ourselves, and asked a few friends to do it. Five people did it, with performances ranging from 74% to 89%, with a median performance of 81%.
This was not a random sample of people, and conditions for taking the test differed. Most notably, there was no time limit, so time allocated was set by patience for trying to marginally improve guesses.
ImageNet categorization is not a popular activity for humans, so we do not know what highly talented and trained human performance would look like. The best relatively high human performance measure we have comes from Russakovsky et al, who report on performance of two ‘expert annotators’, who they say learned many of the categories. 3 The better performing annotator there had a 5.1% error rate.4
The ImageNet database was released in 2009.5. An annual contest, the ImageNet Large Scale Visual Recognition Challenge, began in 2010.6
In the 2010 contest, the best top-5 classification performance had 28.2% error.7
However image classification broadly is older. Pascal VOC was a similar previous contest, which ran from 2005.8 We do not know when the first successful image classification systems were developed. In a blog post, Amidi & Amidi point to LeNet as pioneering work in image classification9, and it appears to have been developed in 1998.10
The first entrant in the ImageNet contest to perform better than our beginner level benchmark was SuperVision (commonly known as AlexNet) in 2012, with a 15.3% error rate.11
In 2015 He et al apparently achieved a 4.5% error rate, slightly better than our high human benchmark.12
According to paperswithcode.com, performance has continued to climb, to 2020, though slower than earlier.13
Given the above dates, we have:
Range | Start | End | Duration (years) |
First attempt to beginner level | <1998 | 2012 | >14 |
Beginner to superhuman | 2012 | 2015 | 3 |
Above superhuman | 2015 | >2020 | >5 |
Primary author: Rick Korzekwa
“ImageNet.” Accessed October 19, 2020. http://www.image-net.org/.
Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, et al. “ImageNet Large Scale Visual Recognition Challenge.” ArXiv:1409.0575 [Cs], January 29, 2015. http://arxiv.org/abs/1409.0575.
Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, et al. “ImageNet Large Scale Visual Recognition Challenge.” ArXiv:1409.0575 [Cs], January 29, 2015. http://arxiv.org/abs/1409.0575.
Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, et al. “ImageNet Large Scale Visual Recognition Challenge.” ArXiv:1409.0575 [Cs], January 29, 2015. http://arxiv.org/abs/1409.0575.
Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, et al. “ImageNet Large Scale Visual Recognition Challenge.” ArXiv:1409.0575 [Cs], January 29, 2015. http://arxiv.org/abs/1409.0575.
Everingham, Mark, Luc Van Gool, Christopher K. I. Williams, John Winn, and Andrew Zisserman. “The Pascal Visual Object Classes (VOC) Challenge.” International Journal of Computer Vision 88, no. 2 (June 2010): 303–38. https://doi.org/10.1007/s11263-009-0275-4.
“LeNet.” In Wikipedia, June 19, 2020. https://en.wikipedia.org/w/index.php?title=LeNet&oldid=963418885.
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E Hinton. “ImageNet Classification with Deep Convolutional Neural Networks.” In Advances in Neural Information Processing Systems 25, edited by F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, 1097–1105. Curran Associates, Inc., 2012. http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf.
Also, see Table 6 for a list of other entrants:
Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, et al. “ImageNet Large Scale Visual Recognition Challenge.” ArXiv:1409.0575 [Cs], January 29, 2015. http://arxiv.org/abs/1409.0575.
“Papers with Code – ImageNet Benchmark (Image Classification).” Accessed October 19, 2020. https://paperswithcode.com/sota/image-classification-on-imagenet.