| ====== Time for AI to cross the human performance range in diabetic retinopathy ====== |
| |
| // Published 21 November, 2018; last updated 20 January, 2021 // |
| |
| <HTML> |
| <p><span style="font-weight: 400;">In diabetic retinopathy,</span> automated systems <span style="font-weight: 400;">started out just below expert human-level performance, and took around ten years to reach expert human-level performance.</span></p> |
| </HTML> |
| |
| |
| |
| ===== Details ===== |
| |
| |
| <HTML> |
| <p><span style="font-weight: 400;">Diabetic retinopathy is a complication of diabetes in which the back of the eye is damaged by high blood sugar levels.<span class="easy-footnote-margin-adjust" id="easy-footnote-1-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-1-1241" title=' See e.g. <a href="https://www.nhs.uk/conditions/diabetic-retinopathy/">https://www.nhs.uk/conditions/diabetic-retinopathy/</a> '><sup>1</sup></a></span> It is the most common cause of blindness among working-age adults.<span class="easy-footnote-margin-adjust" id="easy-footnote-2-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-2-1241" title='See <a href="https://nei.nih.gov/health/diabetic/retinopathy">https://nei.nih.gov/health/diabetic/retinopathy</a>'><sup>2</sup></a></span> The disease is diagnosed by examining images of the back of the eye.</span> <span style="font-weight: 400;">The gold standard used for diabetic retinopathy diagnosis is typically some sort of pooling mechanism over several expert opinions. Thus, in the papers below, each time expert sensitivity/specificity (Se/Sp) is considered, it is the Se/Sp of individual experts graded against aggregate expert agreement.</span></p> |
| </HTML> |
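
The grading scheme described above can be made concrete with a small sketch. It assumes a strict majority vote as the pooling mechanism (the cited studies may pool opinions differently), and the three graders and their grades are invented purely for illustration; none of these numbers come from the papers.

```python
from typing import Sequence


def majority_vote(grades: Sequence[Sequence[int]]) -> list[int]:
    """Pool binary per-image grades (1 = disease present) by strict majority.

    Assumption: majority voting stands in for whatever pooling a study uses.
    """
    n_experts = len(grades)
    n_images = len(grades[0])
    return [
        1 if 2 * sum(g[i] for g in grades) > n_experts else 0
        for i in range(n_images)
    ]


def sensitivity_specificity(
    predicted: Sequence[int], reference: Sequence[int]
) -> tuple[float, float]:
    """Return (sensitivity, specificity) of `predicted` against `reference`."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # true positives
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # false negatives
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)  # true negatives
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical grades from three experts on eight images.
expert_a = [1, 1, 1, 0, 0, 0, 0, 1]
expert_b = [1, 1, 0, 0, 0, 0, 1, 1]
expert_c = [1, 0, 1, 0, 0, 1, 0, 1]

reference = majority_vote([expert_a, expert_b, expert_c])
for name, grades in [("a", expert_a), ("b", expert_b), ("c", expert_c)]:
    se, sp = sensitivity_specificity(grades, reference)
    print(f"expert {name}: Se={se:.2f} Sp={sp:.2f}")
```

Because each grader also contributes to the pooled reference, individual Se/Sp measured this way reflects agreement with the panel rather than with an independent ground truth.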
| |
| |
| <HTML> |
| <p><span style="font-weight: 400;">As a rough benchmark for expert-level performance, we take the average Se/Sp of ophthalmologists from a few studies. Based on Google Brain’s work (detailed below), this paper<span class="easy-footnote-margin-adjust" id="easy-footnote-3-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-3-1241" title=' See Results section before adjudication and consensus </span><a href="https://www.ncbi.nlm.nih.gov/pubmed/23494039"><span style="font-weight: 400;">https://www.ncbi.nlm.nih.gov/pubmed/23494039</span></a><span style="font-weight: 400;"> '><sup>3</sup></a></span>, and this paper<span class="easy-footnote-margin-adjust" id="easy-footnote-4-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-4-1241" title=' See Figure 3 </span><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2911785/"><span style="font-weight: 400;">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2911785/</span></a><span style="font-weight: 400;"> '><sup>4</sup></a></span>, the average specificity of 14 ophthalmologists is 95% and the average sensitivity is 82%. We treat these figures as expert human-level performance.</span></p> |
| </HTML> |
| |
| |
| <HTML> |
| <p><span style="font-weight: 400;">As far as we can tell, the first algorithm for automatically detecting diabetic retinopathy was developed in 1996. When compared to ophthalmologists’ ratings, the algorithm achieved 88.4% sensitivity and 83.5% specificity.</span></p> |
| </HTML> |
| |
| |
| <HTML> |
| <p><span style="font-weight: 400;">In late 2016, Google’s algorithm performed on par with eight ophthalmologists at diagnosing diabetic retinopathy. See Figure 1.<span class="easy-footnote-margin-adjust" id="easy-footnote-5-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-5-1241" title=' <a href="https://ai.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html">https://ai.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html</a> '><sup>5</sup></a></span></span><span style="font-weight: 400;"> The high-sensitivity operating point (labelled on the graph) achieved 97.5/93.4 Se/Sp. </span></p> |
| </HTML> |
| |
| |
| <HTML> |
| <figure aria-describedby="caption-attachment-1242" class="wp-caption alignnone" id="attachment_1242" style="width: 553px"> |
| <a href="http://aiimpacts.org/wp-content/uploads/2018/11/eyepacs1.png"><img alt="" class="wp-image-1242 size-full" height="550" sizes="(max-width: 553px) 100vw, 553px" src="https://aiimpacts.org/wp-content/uploads/2018/11/eyepacs1.png" srcset="https://aiimpacts.org/wp-content/uploads/2018/11/eyepacs1.png 553w, https://aiimpacts.org/wp-content/uploads/2018/11/eyepacs1-150x150.png 150w, https://aiimpacts.org/wp-content/uploads/2018/11/eyepacs1-300x298.png 300w" width="553"/></a> |
| <figcaption class="wp-caption-text" id="caption-attachment-1242"> |
| Figure 1: Performance comparison of a late 2016 Google algorithm and eight ophthalmologists, from here.<span style="font-weight: 400;"> The black curve represents the algorithm and the eight colored dots are ophthalmologists.</span> |
| </figcaption> |
| </figure> |
| </HTML> |
| |
| |
| <HTML> |
| <p><span style="font-weight: 400;">Many other papers were published between 1996 and 2016. However, none of them exceeded expert human-level performance on both specificity and sensitivity. For instance, 86/77 Se/Sp was achieved in 2007, 97/59 in 2013, and 94/72 by another team in 2016.<span class="easy-footnote-margin-adjust" id="easy-footnote-6-1241"></span><span class="easy-footnote"><a href="#easy-footnote-bottom-6-1241" title=' ‘</span><span style="font-weight: 400;">Automated and semi-automated diabetic retinopathy evaluation has been previously studied by other groups. Abràmoff et al</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r4"><span style="font-weight: 400;">4</span></a><span style="font-weight: 400;"> reported a sensitivity of 96.8% at a specificity of 59.4% for detecting referable diabetic retinopathy on the publicly available Messidor-2 data set.</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r9"><span style="font-weight: 400;">9</span></a><span style="font-weight: 400;">Solanki et al</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r12"><span style="font-weight: 400;">12</span></a><span style="font-weight: 400;"> reported a sensitivity of 93.8% at a specificity of 72.2% on the same data set. A study by Philip et al</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r21"><span style="font-weight: 400;">21</span></a><span style="font-weight: 400;"> reported a sensitivity of 86.2% at a specificity of 76.8% for predicting disease vs no disease on their own data set of 14, 406 images.’ </span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763"><span style="font-weight: 400;">https://jamanetwork.com/journals/jama/fullarticle/2588763</span></a><span style="font-weight: 400;"> '><sup>6</sup></a></span></span></p> |
| </HTML> |
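
The comparison against the expert benchmark can be laid out as simple arithmetic. The sketch below uses only figures quoted on this page (the 82/95 benchmark, the 1996 result, the three results from footnote 6, and the Google high-sensitivity operating point). The pairing of years with author names is inferred by matching the rounded numbers in the text to the footnote, and the function name is hypothetical.

```python
# Benchmark from this page: average ophthalmologist sensitivity / specificity (%).
EXPERT_SE, EXPERT_SP = 82.0, 95.0

# (label, sensitivity %, specificity %) as quoted on this page.
# Year-to-author pairing is an assumption based on matching rounded figures.
REPORTED = [
    ("1996 first algorithm", 88.4, 83.5),
    ("2007 Philip et al", 86.2, 76.8),
    ("2013 Abramoff et al", 96.8, 59.4),
    ("2016 Solanki et al", 93.8, 72.2),
    ("2016 Google, high-sensitivity point", 97.5, 93.4),
]


def gap_to_benchmark(se: float, sp: float) -> tuple[float, float]:
    """Return (sensitivity gap, specificity gap) in percentage points.

    Positive values mean the system is above the expert average on that axis.
    """
    return round(se - EXPERT_SE, 1), round(sp - EXPERT_SP, 1)


for label, se, sp in REPORTED:
    d_se, d_sp = gap_to_benchmark(se, sp)
    print(f"{label}: Se {d_se:+.1f} pts, Sp {d_sp:+.1f} pts")
```

A single operating point is only one position on a system's curve, so a gap computed this way is a rough indicator; Figure 1 compares the whole curve against individual ophthalmologists.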
| |
| |
| <HTML> |
| <p><span style="font-weight: 400;">Thus it took about</span> <b>ten years</b> <span style="font-weight: 400;">to go from just below expert human-level performance to slightly superhuman performance.</span></p> |
| </HTML> |
| |
| |
| ===== Contributions ===== |
| |
| |
| <HTML> |
| <p><em>Aysja Johnson researched and wrote this page. Justis Mills and Katja Grace contributed feedback.</em></p> |
| </HTML> |
| |
| |
| ===== Footnotes ===== |
| |
| |
| <HTML> |
| <ol class="easy-footnotes-wrapper"> |
| <li><div class="li"> |
| <span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-1-1241"></span> See e.g. <a href="https://www.nhs.uk/conditions/diabetic-retinopathy/">https://www.nhs.uk/conditions/diabetic-retinopathy/</a> <a class="easy-footnote-to-top" href="#easy-footnote-1-1241"></a> |
| </div></li> |
| <li><div class="li"> |
| <span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-2-1241"></span>See <a href="https://nei.nih.gov/health/diabetic/retinopathy">https://nei.nih.gov/health/diabetic/retinopathy</a><a class="easy-footnote-to-top" href="#easy-footnote-2-1241"></a> |
| </div></li> |
| <li><div class="li"> |
| <span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-3-1241"></span> See Results section before adjudication and consensus <a href="https://www.ncbi.nlm.nih.gov/pubmed/23494039"><span style="font-weight: 400">https://www.ncbi.nlm.nih.gov/pubmed/23494039</span></a> <span style="font-weight: 400"><a class="easy-footnote-to-top" href="#easy-footnote-3-1241"></a></span> |
| </div></li> |
| <li><div class="li"> |
| <span style="font-weight: 400"><span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-4-1241"></span> See Figure 3</span> <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2911785/"><span style="font-weight: 400">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2911785/</span></a> <span style="font-weight: 400"><a class="easy-footnote-to-top" href="#easy-footnote-4-1241"></a></span> |
| </div></li> |
| <li><div class="li"><span style="font-weight: 400"><span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-5-1241"></span> <a href="https://ai.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html">https://ai.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html</a> <a class="easy-footnote-to-top" href="#easy-footnote-5-1241"></a></span></div></li> |
| <li><div class="li"> |
| <span class="easy-footnote-margin-adjust" id="easy-footnote-bottom-6-1241"></span> ‘<span style="font-weight: 400">Automated and semi-automated diabetic retinopathy evaluation has been previously studied by other groups. Abràmoff et al</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r4"><span style="font-weight: 400">4</span></a> <span style="font-weight: 400">reported a sensitivity of 96.8% at a specificity of 59.4% for detecting referable diabetic retinopathy on the publicly available Messidor-2 data set.</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r9"><span style="font-weight: 400">9</span></a> <span style="font-weight: 400">Solanki et al</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r12"><span style="font-weight: 400">12</span></a> <span style="font-weight: 400">reported a sensitivity of 93.8% at a specificity of 72.2% on the same data set. A study by Philip et al</span><a href="https://jamanetwork.com/journals/jama/fullarticle/2588763#joi160132r21"><span style="font-weight: 400">21</span></a> <span style="font-weight: 400">reported a sensitivity of 86.2% at a specificity of 76.8% for predicting disease vs no disease on their own data set of 14,406 images.’</span> <a href="https://jamanetwork.com/journals/jama/fullarticle/2588763"><span style="font-weight: 400">https://jamanetwork.com/journals/jama/fullarticle/2588763</span></a> <span style="font-weight: 400"><a class="easy-footnote-to-top" href="#easy-footnote-6-1241"></a></span> |
| </div></li> |
| </ol> |
| </HTML> |
| |
| |
| |