arguments_for_ai_risk:quantitative_estimates_of_ai_risk [2023/06/21 21:51] jeffreyheninger |
arguments_for_ai_risk:quantitative_estimates_of_ai_risk [2023/12/01 18:15] (current) harlanstewart |
====== Quantitative Estimates of AI Risk ====== | ====== Quantitative Estimates of AI Risk ====== |
| /* |
| COMMENT: |
| Things to add to this: |
| - https://optimists.ai/2023/11/28/ai-is-easy-to-control/ |
| |
| */ |
// This page is in an early draft. It is very incomplete and may contain errors. // | // This page is in an early draft. It is very incomplete and may contain errors. // |
| |
The table below includes estimates from individuals working in AI Safety of how likely very bad outcomes due to AI are. | The table below includes estimates from individuals working in AI Safety of how likely very bad outcomes due to AI are. |
| |
Many of the individuals expressed [[https://en.wikipedia.org/wiki/Knightian_uncertainty|Knightian uncertainty]] when making their estimates, saying that their probability varies day-to-day, or that the estimate is currently in development, or that this is a very quick-and-dirty estimate. People who have explicitly said something like this include Katja Grace, Joe Carlsmith, Peter Wildeford, Nate Soares, Paul Christiano, and others. These estimates should not be treated as definitive statements of these individuals' beliefs, but rather as glimpses of their thinking at that moment. | Many of the individuals expressed [[https://en.wikipedia.org/wiki/Knightian_uncertainty|Knightian uncertainty]] when making their estimates, saying that their probability varies day-to-day, or that the estimate is currently in development, or that this is a very quick-and-dirty estimate. People who have explicitly said something like this include Katja Grace, Joseph Carlsmith, Peter Wildeford, Nate Soares, Paul Christiano, and others. These estimates should not be treated as definitive statements of these individuals' beliefs, but rather as glimpses of their thinking at that moment. |
| |
Each estimate includes: | Each estimate includes: |
* The estimate the individual gives for the probability that AI development causes a very bad outcome. | * The estimate the individual gives for the probability that AI development causes a very bad outcome. |
* The source for this estimate. | * The source for this estimate. |
| * Whether this is the person's most recent public estimate that we are aware of, and whether it is their best guess rather than a conditional estimate. |
| |
The estimates are in no particular order. The table can be sorted by clicking at the top of each column. | The estimates are in no particular order. The table can be sorted by clicking at the top of each column. |
<th onclick="sortTable(3)">Probability</th> | <th onclick="sortTable(3)">Probability</th> |
<th onclick="sortTable(4)">Source</th> | <th onclick="sortTable(4)">Source</th> |
| <th onclick="sortTable(5)">Most Recent?</th> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.19</td> | <td>0.19</td> |
<td><a href="https://youtu.be/j5Lu01pEDWA?t=2257">Will AI end everything? A guide to guessing</a></td> | <td><a href="https://youtu.be/j5Lu01pEDWA?t=2257">Will AI end everything? A guide to guessing</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>Joe Carlsmith</td> | <td>Joseph Carlsmith</td> |
<td>2021</td> | <td>2021</td> |
<td>Existential catastrophe by 2070 from advanced, planning, strategic AI</td> | <td>Existential catastrophe by 2070 from advanced, planning, strategic AI</td> |
<td>0.05</td> | <td>0.05</td> |
<td><a href="https://arxiv.org/pdf/2206.13353.pdf">Is Power-Seeking AI an Existential Risk?</a></td> | <td><a href="https://arxiv.org/pdf/2206.13353.pdf">Is Power-Seeking AI an Existential Risk?</a></td> |
| <td>No</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>Joe Carlsmith</td> | <td>Joseph Carlsmith</td> |
<td>2022</td> | <td>2022</td> |
<td>Existential catastrophe by 2070 from advanced, planning, strategic AI</td> | <td>Existential catastrophe by 2070 from advanced, planning, strategic AI</td> |
<td>0.1+</td> | <td>0.1+ (Greater than 10%)</td> |
<td>Update to: <a href="https://arxiv.org/pdf/2206.13353.pdf">Is Power-Seeking AI an Existential Risk?</a></td> | <td>Update to: <a href="https://arxiv.org/pdf/2206.13353.pdf">Is Power-Seeking AI an Existential Risk?</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.77</td> | <td>0.77</td> |
<td><a href="https://www.lesswrong.com/posts/cCMihiwtZx7kdcKgt/comments-on-carlsmith-s-is-power-seeking-ai-an-existential">Comments on Carlsmith's "Is power-seeking AI an existential risk?"</a></td> | <td><a href="https://www.lesswrong.com/posts/cCMihiwtZx7kdcKgt/comments-on-carlsmith-s-is-power-seeking-ai-an-existential">Comments on Carlsmith's "Is power-seeking AI an existential risk?"</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.1</td> | <td>0.1</td> |
<td><a href="https://theprecipice.com/">The Precipice</a></td> | <td><a href="https://theprecipice.com/">The Precipice</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.2</td> | <td>0.2</td> |
<td><a href="https://80000hours.org/podcast/episodes/toby-ord-the-precipice-existential-risk-future-humanity/#transcript">Toby Ord on the precipice and humanity's potential futures</a></td> | <td><a href="https://80000hours.org/podcast/episodes/toby-ord-the-precipice-existential-risk-future-humanity/#transcript">Toby Ord on the precipice and humanity's potential futures</a></td> |
| <td>No</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>~1</td> | <td>~1</td> |
<td><a href="https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities">AGI Ruin: A List of Lethalities</a></td> | <td><a href="https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities">AGI Ruin: A List of Lethalities</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.1</td> | <td>0.1</td> |
<td><a href="https://wiki.aiimpacts.org/doku.php?id=conversation_notes:conversation_with_rohin_shah">Conversation with Rohin Shah</a></td> | <td><a href="https://wiki.aiimpacts.org/doku.php?id=conversation_notes:conversation_with_rohin_shah">Conversation with Rohin Shah</a></td> |
| <td>No</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.1</td> | <td>0.1</td> |
<td><a href="https://wiki.aiimpacts.org/doku.php?id=conversation_notes:conversation_with_paul_christiano">Conversation with Paul Christiano</a></td> | <td><a href="https://wiki.aiimpacts.org/doku.php?id=conversation_notes:conversation_with_paul_christiano">Conversation with Paul Christiano</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.22</td> | <td>0.22</td> |
<td>Slack channel & private conversation</td> | <td>Slack channel & private conversation</td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.6 - 0.7</td> | <td>0.6 - 0.7</td> |
<td><a href="https://wiki.aiimpacts.org/doku.php?id=conversation_notes:conversation_with_adam_gleave">Conversation with Adam Gleave</a></td> | <td><a href="https://wiki.aiimpacts.org/doku.php?id=conversation_notes:conversation_with_adam_gleave">Conversation with Adam Gleave</a></td> |
| <td>No</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.3 - 0.4</td> | <td>0.3 - 0.4</td> |
<td><a href="https://wiki.aiimpacts.org/doku.php?id=conversation_notes:conversation_with_adam_gleave">Conversation with Adam Gleave</a></td> | <td><a href="https://wiki.aiimpacts.org/doku.php?id=conversation_notes:conversation_with_adam_gleave">Conversation with Adam Gleave</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.1 - 0.2</td> | <td>0.1 - 0.2</td> |
<td><a href="https://wiki.aiimpacts.org/doku.php?id=conversation_notes:conversation_with_adam_gleave">Conversation with Adam Gleave</a></td> | <td><a href="https://wiki.aiimpacts.org/doku.php?id=conversation_notes:conversation_with_adam_gleave">Conversation with Adam Gleave</a></td> |
| <td>No</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.05</td> | <td>0.05</td> |
<td><a href="https://futureoflife.org/podcast/an-overview-of-technical-ai-alignment-in-2018-and-2019-with-buck-shlegeris-and-rohin-shah/">AI Alignment Podcast: An Overview of Technical AI Alignment in 2018 and 2019 with Buck Shlegeris and Rohin Shah</a></td> | <td><a href="https://futureoflife.org/podcast/an-overview-of-technical-ai-alignment-in-2018-and-2019-with-buck-shlegeris-and-rohin-shah/">AI Alignment Podcast: An Overview of Technical AI Alignment in 2018 and 2019 with Buck Shlegeris and Rohin Shah</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.5</td> | <td>0.5</td> |
<td><a href="https://futureoflife.org/podcast/an-overview-of-technical-ai-alignment-in-2018-and-2019-with-buck-shlegeris-and-rohin-shah/">AI Alignment Podcast: An Overview of Technical AI Alignment in 2018 and 2019 with Buck Shlegeris and Rohin Shah</a></td> | <td><a href="https://futureoflife.org/podcast/an-overview-of-technical-ai-alignment-in-2018-and-2019-with-buck-shlegeris-and-rohin-shah/">AI Alignment Podcast: An Overview of Technical AI Alignment in 2018 and 2019 with Buck Shlegeris and Rohin Shah</a></td> |
| <td>No</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.0005</td> | <td>0.0005</td> |
<td><a href="https://forum.effectivealtruism.org/posts/2sMR7n32FSvLCoJLQ/critical-review-of-the-precipice-a-reassessment-of-the-risks">Critical Review of 'The Precipice': A Reassessment of the Risks of AI and Pandemics</a></td> | <td><a href="https://forum.effectivealtruism.org/posts/2sMR7n32FSvLCoJLQ/critical-review-of-the-precipice-a-reassessment-of-the-risks">Critical Review of 'The Precipice': A Reassessment of the Risks of AI and Pandemics</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.25</td> | <td>0.25</td> |
<td><a href="https://youtu.be/YTlrPeikoyw?t=1768">The current alignment plan, and how we might improve it</a></td> | <td><a href="https://youtu.be/YTlrPeikoyw?t=1768">The current alignment plan, and how we might improve it</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.33 - 0.5</td> | <td>0.33 - 0.5</td> |
<td><a href="https://youtu.be/i4LjoJGpqIY?t=2380">The future is going to be wonderful if we don't get whacked</a></td> | <td><a href="https://youtu.be/i4LjoJGpqIY?t=2380">The future is going to be wonderful if we don't get whacked</a></td> |
| <td>No</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.05 - 0.3</td> | <td>0.05 - 0.3</td> |
<td><a href="https://youtu.be/WLXuZtWoRcE?t=1229">Is AI an existential threat? We don't know, and we should work on it</a></td> | <td><a href="https://youtu.be/WLXuZtWoRcE?t=1229">Is AI an existential threat? We don't know, and we should work on it</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.33 - 0.7</td> | <td>0.33 - 0.7</td> |
<td><a href="https://futureoflife.org/podcast/rohin-shah-on-the-state-of-agi-safety-research-in-2021/">Rohin Shah on the State of AGI Safety Research in 2021</a></td> | <td><a href="https://futureoflife.org/podcast/rohin-shah-on-the-state-of-agi-safety-research-in-2021/">Rohin Shah on the State of AGI Safety Research in 2021</a></td> |
| <td>No</td> |
</tr> | </tr> |
<tr> | <tr> |
<td>0.35</td> | <td>0.35</td> |
<td><a href="https://forum.effectivealtruism.org/posts/9Y6Y6qoAigRC7A8eX/my-take-on-what-we-owe-the-future">My take on What We Owe the Future</a></td> | <td><a href="https://forum.effectivealtruism.org/posts/9Y6Y6qoAigRC7A8eX/my-take-on-what-we-owe-the-future">My take on What We Owe the Future</a></td> |
| <td>Yes</td> |
</tr> | </tr> |
| <tr> |
| <td>Katja Grace</td> |
| <td>2022</td> |
| <td>AI destroys the world</td> |
| <td>0.07</td> |
| <td><a href="https://theinsideview.ai/katja#why-katja-thinks-there-is-a-7-chance-of-ai-destroys-the-world">Katja Grace on Slowing Down AI and Surveys</a></td> |
| <td>No</td> |
| </tr> |
| <tr> |
| <td>Andrew Critch</td> |
| <td>2023</td> |
| <td>Humanity not surviving the next 50 years</td> |
| <td>0.8</td> |
| <td><a href="https://www.lesswrong.com/posts/gZkYvA6suQJthvj4E/my-may-2023-priorities-for-ai-x-safety-more-empathy-more">My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI</a></td> |
| <td>Yes</td> |
| </tr> |
| <tr> |
| <td>Andrew Critch</td> |
| <td>2023</td> |
| <td>Humanity not surviving the next 50 years, without a major international regulatory effort to control how AI is used</td> |
| <td>0.9+</td> |
| <td><a href="https://www.lesswrong.com/posts/gZkYvA6suQJthvj4E/my-may-2023-priorities-for-ai-x-safety-more-empathy-more">My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI</a></td> |
| <td>No</td> |
| </tr> |
| <tr> |
| <td>Scott Aaronson</td> |
| <td>2023</td> |
| <td>The generative AI race, which started in earnest around 2016 or 2017 with the founding of OpenAI, to play a central causal role in the extinction of humanity</td> |
| <td>0.02</td> |
| <td><a href="https://scottaaronson.blog/?p=7064">Why am I not terrified of AI?</a></td> |
| <td>Yes</td> |
| </tr> |
| |
| |
</table> | </table> |
Different people use different framings to arrive at their estimate of AI risk. The most common framing seems to be to describe a model of what the risk from advanced AI looks like, assign probabilities to various components of that model, and then calculate the existential risk from AI on the basis of this model. Another framing is to describe various scenarios for the future of AI, assign probabilities to the various scenarios, and then add together the probabilities of the different scenarios to determine the total existential risk from AI. There are also some people who give a probability without describing what framing they used to get this number. | Different people use different framings to arrive at their estimate of AI risk. The most common framing seems to be to describe a model of what the risk from advanced AI looks like, assign probabilities to various components of that model, and then calculate the existential risk from AI on the basis of this model. Another framing is to describe various scenarios for the future of AI, assign probabilities to the various scenarios, and then add together the probabilities of the different scenarios to determine the total existential risk from AI. There are also some people who give a probability without describing what framing they used to get this number. |
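
The arithmetic behind the two framings described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and all numbers are hypothetical and are not taken from any individual's estimate. The model framing is assumed here to chain conditional probabilities (each step conditional on the previous ones), so the steps multiply; the scenario framing is assumed to use mutually exclusive, exhaustive scenarios, so the probabilities of the bad scenarios add.

```python
from math import isclose, prod


def _check_probability(p):
    """Raise ValueError if p is not a valid probability."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"not a probability: {p}")


def chained_model_risk(step_probabilities):
    """Model framing (assumed form): multiply conditional step probabilities.

    Each entry is taken to be the probability of a step given that all
    earlier steps happened, so the product is the probability of the
    whole chain.
    """
    for p in step_probabilities:
        _check_probability(p)
    return prod(step_probabilities)


def scenario_sum_risk(scenarios):
    """Scenario framing (assumed form): add probabilities of bad scenarios.

    `scenarios` is a list of (probability, is_catastrophic) pairs that are
    assumed to be mutually exclusive and to cover every possible future.
    """
    for p, _ in scenarios:
        _check_probability(p)
    total = sum(p for p, _ in scenarios)
    if not isclose(total, 1.0):
        raise ValueError(f"scenario probabilities sum to {total}, not 1")
    return sum(p for p, is_catastrophic in scenarios if is_catastrophic)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to make the arithmetic easy to follow.
    hypothetical_steps = [0.5, 0.5, 0.5]
    hypothetical_scenarios = [(0.5, False), (0.25, True), (0.25, True)]
    print("model framing:", chained_model_risk(hypothetical_steps))
    print("scenario framing:", scenario_sum_risk(hypothetical_scenarios))
```

One consequence of the chained form is that adding more steps can only lower the product, so the number of components in a model affects the final estimate as well as the probabilities assigned to them.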
| |
Below is an example of each of these two framings, due to Joe Carlsmith and Peter Wildeford, respectively. Both individuals have updated their estimates since publishing their framing, so neither probability breakdown reflects the author's most recent estimate of AI risk. They are included to show how these framings work. | Below is an example of each of these two framings, due to Joseph Carlsmith and Peter Wildeford, respectively. Both individuals have updated their estimates since publishing their framing, so neither probability breakdown reflects the author's most recent estimate of AI risk. They are included to show how these framings work. |
| |
==== Model ==== | ==== Model ==== |
| |
One example of using a model to calculate the existential risk from AI is due to Joe Carlsmith. He calculates AI-risk by 2070 by breaking it down in the following way: | One example of using a model to calculate the existential risk from AI is due to Joseph Carlsmith. He calculates AI-risk by 2070 by breaking it down in the following way: |
| |
- It will become possible and financially feasible to build APS [advanced, planning, strategic] systems. **65%** | - It will become possible and financially feasible to build APS [advanced, planning, strategic] systems. **65%** |