speed_of_ai_transition:range_of_human_performance:time_for_ai_to_cross_the_human_performance_range_in_diabetic_retinopathy [2022/09/21 07:37] (current)
 +====== Time for AI to cross the human performance range in diabetic retinopathy ======
 +
 +// Published 21 November, 2018; last updated 20 January, 2021 //
 +
 +<HTML>
 +<p><span style="font-weight: 400;">In diabetic retinopathy,</span> automated systems <span style="font-weight: 400;">started out just below expert human-level performance and took around ten years to reach it.</span></p>
 +</HTML>
 +
 +
 +
 +===== Details =====
 +
 +
 +<HTML>
 +<p><span style="font-weight: 400;">Diabetic retinopathy is a complication of diabetes in which the back of the eye is damaged by high blood sugar levels.<span class="easy-footnote-margin-adjust" id="easy-footnote-1-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-1-1241" title=' See e.g. &lt;a href="https://www.nhs.uk/conditions/diabetic-retinopathy/"&gt;https://www.nhs.uk/conditions/diabetic-retinopathy/&lt;/a&gt; '><sup>1</sup></a></span> It is the most common cause of blindness among working-age adults.<span class="easy-footnote-margin-adjust" id="easy-footnote-2-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-2-1241" title='See &lt;a href="https://nei.nih.gov/health/diabetic/retinopathy"&gt;https://nei.nih.gov/health/diabetic/retinopathy&lt;/a&gt;'><sup>2</sup></a></span> The disease is diagnosed by examining images of the back of the eye.</span> <span style="font-weight: 400;">The gold standard used for diabetic retinopathy diagnosis is typically some sort of pooling mechanism over several expert opinions. Thus, in the papers below, each time expert sensitivity/specificity (Se/Sp) is considered, it is the Se/Sp of individual experts graded against aggregate expert agreement.</span></p>
 +</HTML>
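
The grading scheme described above can be sketched in code. This is a minimal illustration, not the procedure of any cited paper: it assumes the pooling mechanism is a simple majority vote over binary grades, and the grades themselves are hypothetical values made up for the example.

```python
def majority_vote(grades_per_image):
    """Pool several binary expert grades per image into one reference label.

    Assumption: the gold standard is a strict majority vote. The studies
    cited on this page may use other pooling or adjudication schemes.
    """
    return [1 if 2 * sum(grades) > len(grades) else 0 for grades in grades_per_image]


def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) of binary predictions against a reference."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # disease correctly flagged
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # disease missed
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)  # healthy correctly cleared
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # healthy wrongly flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference needs at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical grades: one tuple per image, one entry per expert (1 = disease).
grades_per_image = [(1, 1, 0), (0, 0, 0), (1, 1, 1), (0, 1, 0), (1, 0, 1), (0, 0, 1)]

reference = majority_vote(grades_per_image)
second_expert = [grades[1] for grades in grades_per_image]
se, sp = sensitivity_specificity(second_expert, reference)
print(f"Se = {se:.2f}, Sp = {sp:.2f}")
```

An automated system can be scored the same way by passing its predictions in place of an individual expert's grades.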
 +
 +
 +<HTML>
 +<p><span style="font-weight: 400;">As a rough benchmark for expert-level performance, we take the average Se/Sp of ophthalmologists from a few studies. Based on Google Brain’s work (detailed below), this paper <span class="easy-footnote-margin-adjust" id="easy-footnote-3-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-3-1241" title=' See Results section before adjudication and consensus &lt;/span&gt;&lt;a href="https://www.ncbi.nlm.nih.gov/pubmed/23494039"&gt;&lt;span style="font-weight: 400;"&gt;https://www.ncbi.nlm.nih.gov/pubmed/23494039&lt;/span&gt;&lt;/a&gt;&lt;span style="font-weight: 400;"&gt; '><sup>3</sup></a></span>, and this paper <span class="easy-footnote-margin-adjust" id="easy-footnote-4-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-4-1241" title=' See Figure 3 &lt;/span&gt;&lt;a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2911785/"&gt;&lt;span style="font-weight: 400;"&gt;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2911785/&lt;/span&gt;&lt;/a&gt;&lt;span style="font-weight: 400;"&gt; '><sup>4</sup></a></span>, the average specificity of 14 ophthalmologists, which we take to indicate expert human-level performance, is 95% and the average sensitivity is 82%.</span></p>
 +</HTML>
 +
 +
 +<HTML>
 +<p><span style="font-weight: 400;">As far as we can tell, the first algorithm for automatically detecting diabetic retinopathy was developed in 1996. When compared to ophthalmologists’ ratings, the algorithm achieved 88.4% sensitivity and 83.5% specificity.</span></p>
 +</HTML>
 +
 +
 +<HTML>
 +<p><span style="font-weight: 400;">In late 2016, Google algorithms were on par with eight ophthalmologists in diagnosing diabetic retinopathy. See Figure 1.<span class="easy-footnote-margin-adjust" id="easy-footnote-5-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-5-1241" title=' &lt;a href="https://ai.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html"&gt;https://ai.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html&lt;/a&gt; '><sup>5</sup></a></span></span><span style="font-weight: 400;"> The high-sensitivity operating point (labelled on the graph) achieved 97.5/93.4 Se/Sp.</span></p>
 +</HTML>
 +
 +
 +<HTML>
 +<figure aria-describedby="caption-attachment-1242" class="wp-caption alignnone" id="attachment_1242" style="width: 553px">
 +<a href="http://aiimpacts.org/wp-content/uploads/2018/11/eyepacs1.png"><img alt="" class="wp-image-1242 size-full" height="550" sizes="(max-width: 553px) 100vw, 553px" src="https://aiimpacts.org/wp-content/uploads/2018/11/eyepacs1.png" srcset="https://aiimpacts.org/wp-content/uploads/2018/11/eyepacs1.png 553w, https://aiimpacts.org/wp-content/uploads/2018/11/eyepacs1-150x150.png 150w, https://aiimpacts.org/wp-content/uploads/2018/11/eyepacs1-300x298.png 300w" width="553"/></a>
 +<figcaption class="wp-caption-text" id="caption-attachment-1242">
 +                  Figure 1: Performance comparison of a late 2016 Google algorithm and eight ophthalmologists, from here.<span style="font-weight: 400;"> The black curve represents the algorithm and the eight colored dots are ophthalmologists.</span>
 +</figcaption>
 +</figure>
 +</HTML>
 +
 +
 +<HTML>
 +<p><span style="font-weight: 400;">Many other papers were published between 1996 and 2016. However, none of them achieved better than expert human-level performance on both sensitivity and specificity. For instance, 86/77 Se/Sp was achieved in 2007, 97/59 in 2013, and 94/72 by another team in 2016.<span class="easy-footnote-margin-adjust" id="easy-footnote-6-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-6-1241" title=' ‘&lt;/span&gt;&lt;span style="font-weight: 400;"&gt;Automated and semi-automated diabetic retinopathy evaluation has been previously studied by other groups. Abràmoff et al&lt;/span&gt;&lt;a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r4"&gt;&lt;span style="font-weight: 400;"&gt;4&lt;/span&gt;&lt;/a&gt;&lt;span style="font-weight: 400;"&gt; reported a sensitivity of 96.8% at a specificity of 59.4% for detecting referable diabetic retinopathy on the publicly available Messidor-2 data set.&lt;/span&gt;&lt;a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r9"&gt;&lt;span style="font-weight: 400;"&gt;9&lt;/span&gt;&lt;/a&gt;&lt;span style="font-weight: 400;"&gt;Solanki et al&lt;/span&gt;&lt;a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r12"&gt;&lt;span style="font-weight: 400;"&gt;12&lt;/span&gt;&lt;/a&gt;&lt;span style="font-weight: 400;"&gt; reported a sensitivity of 93.8% at a specificity of 72.2% on the same data set. A study by Philip et al&lt;/span&gt;&lt;a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r21"&gt;&lt;span style="font-weight: 400;"&gt;21&lt;/span&gt;&lt;/a&gt;&lt;span style="font-weight: 400;"&gt; reported a sensitivity of 86.2% at a specificity of 76.8% for predicting disease vs no disease on their own data set of 14, 406 images.’ &lt;/span&gt;&lt;a href="https://jamanetwork.com/journals/jama/fullarticle/2588763"&gt;&lt;span style="font-weight: 400;"&gt;https://jamanetwork.com/journals/jama/fullarticle/2588763&lt;/span&gt;&lt;/a&gt;&lt;span style="font-weight: 400;"&gt; '><sup>6</sup></a></span></span></p>
 +</HTML>
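
The "both sensitivity and specificity" test can be made explicit with a short sketch. It uses the expert benchmark stated earlier on this page (82% sensitivity, 95% specificity) and the unrounded figures quoted in footnote 6. Pairing each author with a year is an assumption made by matching the rounded values in the text, and the function name is a hypothetical helper, not something from the cited papers.

```python
# Expert benchmark from this page, in percent: average ophthalmologist Se and Sp.
EXPERT_SENSITIVITY = 82.0
EXPERT_SPECIFICITY = 95.0

# (label, sensitivity %, specificity %) as quoted in footnote 6.
# Assumption: years are paired with authors by matching the rounded values in the text.
EARLIER_SYSTEMS = [
    ("Philip et al. (2007)", 86.2, 76.8),
    ("Abramoff et al. (2013)", 96.8, 59.4),
    ("Solanki et al. (2016)", 93.8, 72.2),
]


def exceeds_expert_on_both(sensitivity, specificity):
    """True only if a system beats the expert benchmark on both measures."""
    return sensitivity > EXPERT_SENSITIVITY and specificity > EXPERT_SPECIFICITY


results = {
    label: exceeds_expert_on_both(se, sp) for label, se, sp in EARLIER_SYSTEMS
}

for label, se, sp in EARLIER_SYSTEMS:
    se_gap = se - EXPERT_SENSITIVITY
    sp_gap = sp - EXPERT_SPECIFICITY
    print(f"{label}: Se {se_gap:+.1f} pts, Sp {sp_gap:+.1f} pts, "
          f"beats experts on both: {results[label]}")
```

Under these numbers, each of the three systems exceeds the expert sensitivity benchmark but falls short of the specificity benchmark by at least 18 percentage points, so none clears both.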
 +
 +
 +<HTML>
 +<p><span style="font-weight: 400;">Thus it took about</span> <b>ten years</b> <span style="font-weight: 400;">to go from just below expert human-level performance to slightly superhuman performance.</span></p>
 +</HTML>
 +
 +
 +===== Contributions =====
 +
 +
 +<HTML>
 +<p><em>Aysja Johnson researched and wrote this page. Justis Mills and Katja Grace contributed feedback.</em></p>
 +</HTML>
 +
 +
 +===== Footnotes =====
 +
 +
 +<HTML>
 +<ol class="easy-footnotes-wrapper">
 +<li><div class="li">
 +<span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-1-1241"></span> See e.g. <a href="https://www.nhs.uk/conditions/diabetic-retinopathy/">https://www.nhs.uk/conditions/diabetic-retinopathy/</a> <a class="easy-footnote-to-top" href="#easy-footnote-1-1241"></a>
 +</div></li>
 +<li><div class="li">
 +<span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-2-1241"></span>See <a href="https://nei.nih.gov/health/diabetic/retinopathy">https://nei.nih.gov/health/diabetic/retinopathy</a><a class="easy-footnote-to-top" href="#easy-footnote-2-1241"></a>
 +</div></li>
 +<li><div class="li">
 +<span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-3-1241"></span> See Results section before adjudication and consensus <a href="https://www.ncbi.nlm.nih.gov/pubmed/23494039"><span style="font-weight: 400">https://www.ncbi.nlm.nih.gov/pubmed/23494039</span></a> <span style="font-weight: 400"><a class="easy-footnote-to-top" href="#easy-footnote-3-1241"></a></span>
 +</div></li>
 +<li><div class="li">
 +<span style="font-weight: 400"><span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-4-1241"></span> See Figure 3</span> <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2911785/"><span style="font-weight: 400">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2911785/</span></a> <span style="font-weight: 400"><a class="easy-footnote-to-top" href="#easy-footnote-4-1241"></a></span>
 +</div></li>
 +<li><div class="li"><span style="font-weight: 400"><span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-5-1241"></span> <a href="https://ai.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html">https://ai.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html</a> <a class="easy-footnote-to-top" href="#easy-footnote-5-1241"></a></span></div></li>
 +<li><div class="li">
 +<span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-6-1241"></span> ‘<span style="font-weight: 400">Automated and semi-automated diabetic retinopathy evaluation has been previously studied by other groups. Abràmoff et al</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r4"><span style="font-weight: 400">4</span></a> <span style="font-weight: 400">reported a sensitivity of 96.8% at a specificity of 59.4% for detecting referable diabetic retinopathy on the publicly available Messidor-2 data set.</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r9"><span style="font-weight: 400">9</span></a> <span style="font-weight: 400">Solanki et al</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r12"><span style="font-weight: 400">12</span></a> <span style="font-weight: 400">reported a sensitivity of 93.8% at a specificity of 72.2% on the same data set. A study by Philip et al</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r21"><span style="font-weight: 400">21</span></a> <span style="font-weight: 400">reported a sensitivity of 86.2% at a specificity of 76.8% for predicting disease vs no disease on their own data set of 14 406 images.’</span> <a href="https://jamanetwork.com/journals/jama/fullarticle/2588763"><span style="font-weight: 400">https://jamanetwork.com/journals/jama/fullarticle/2588763</span></a> <span style="font-weight: 400"><a class="easy-footnote-to-top" href="#easy-footnote-6-1241"></a></span>
 +</div></li>
 +</ol>
 +</HTML>
 +
 +
  
speed_of_ai_transition/range_of_human_performance/time_for_ai_to_cross_the_human_performance_range_in_diabetic_retinopathy.txt · Last modified: 2022/09/21 07:37 (external edit)