List of arguments that AI poses an existential risk

This is a list of lines of reasoning suggesting that future progress in artificial intelligence may bring about the extinction of humankind or drastically limit human influence over the long-run future.

Arguments

Each 'argument' here is intended to be a different line of reasoning; however, the arguments often do not point to independent scenarios or use independent evidence. Some arguments attempt to reason about the same causal pathway to the same catastrophic scenarios, but rely on different concepts. Furthermore, 'lines of reasoning' are a vague construct, and different people may consider different arguments here to be equivalent, depending for instance on what other assumptions they make or on how they understand the concepts involved to relate to one another.

Competent non-aligned agents

Computer games contain competent AI agents whose goals are explicitly opposed to those of human players. Humans increasingly lose to the best AI systems. If AI systems become similarly adept at navigating the real world, will humans also lose out?¹⁾

(Main article: Argument for AI X-risk from competent non-aligned agents)

Summary:

Humans will build AI systems that are 'agents', i.e. they will autonomously pursue goals
Humans won’t figure out how to make systems with goals that are compatible with human welfare and realizing human values
Such systems will be built or selected to be highly competent, and so gain the power to achieve their goals
Thus the future will be primarily controlled by AIs, who will direct it in ways that are at odds with long-run human welfare or the realization of human values

Selected counterarguments:

It is unclear that AI will tend to have goals that are bad for humans
There are many forms of power. It is unclear that a competence advantage will ultimately trump all others in time
This argument also appears to apply to human groups such as corporations, so we need an explanation of why those are not an existential risk

People who probably endorse this argument²⁾ (specific quotes here): Paul Christiano (2021), Ajeya Cotra (2023), Eliezer Yudkowsky (2024), Nick Bostrom (2014³⁾).

Second species argument

An orangutan uses a stick to control juice, while humans use complex systems of tools, structures, and behavioral coordination to control the orangutan. Should orangutans have felt safe inventing humans, if they had had the choice?⁴⁾

(Main article: Second species argument for AI x-risk)

Summary:

Human dominance over other animal species is primarily due to humans having superior cognitive and coordination abilities
Therefore if another 'species' appears with abilities superior to those of humans, that species will become dominant over humans in the same way
AI will essentially be a 'species' with superior abilities to humans
Therefore AI will dominate humans

Selected counterarguments:

Human dominance over other species is plausibly not due to the cognitive abilities of individual humans, but rather because of human ability to communicate and store information through culture and artifacts
Intelligence in animals doesn't appear to generally relate to dominance. For instance, elephants are much more intelligent than beetles, and it is not clear that elephants have dominated beetles
Large differences in capabilities don't empirically lead to extinction. In the modern world, more powerful countries arguably control less powerful countries, but they do not wipe them out and most colonized countries have eventually gained independence

People who probably endorse this argument⁵⁾ (specific quotes here): Joe Carlsmith (2024), Richard Ngo (2020), Stuart Russell (2020⁶⁾), Nick Bostrom (2015).

Loss of control via inferiority

(Main article: Argument for AI x-risk from loss of control through inferiority)

The coronation of Henry VI⁷⁾. It can be hard for a child monarch to act in their own interests, even with official power, because they are so much less competent than their advisors. Humans surrounded by advanced AI systems may be in an analogous situation.

Summary:

AI systems will become much more competent than humans at decision-making
Thus most decisions will probably be allocated to AI systems
If AI systems make most decisions, humans will lose control of the future
If humans have no control of the future, the future will probably be bad for humans

Selected counterarguments:

Humans do not generally seem to become disempowered by possession of software that is far superior to them, even if it makes many 'decisions' in the process of carrying out their will
In the same way that humans avoid being overpowered by companies, even though companies are more competent than individual humans, humans can track AI trustworthiness and have AI systems compete for them as users. This might substantially mitigate untrustworthy AI behavior

People who probably endorse this argument⁸⁾ (specific quotes here): Paul Christiano (2014), Ajeya Cotra (2023), Richard Ngo (2024).

Loss of control via speed

Tetris is a game that speeds up over time.⁹⁾ As the time for a player to react grows shorter, the player's moves become worse, until the player loses. If advanced AI causes events to speed up, human responses might similarly become decreasingly appropriate, potentially until humans lose all relevant control.

(Main article: Argument for AI x-risk from loss of control through speed)

Summary:

Advances in AI will produce very rapid changes, in available AI technology, other technologies, and society
Faster changes reduce the ability of humans to exert meaningful control over events, because humans need time to make non-random choices
The pace of relevant events could become so fast as to allow for negligible relevant human choice
If humans are not continually involved in choosing the future, the future is likely to be bad by human lights

Selected counterarguments:

The pace at which humans can participate is not fixed. AI technologies will likely speed up processes for human participation
It is not clear that advances in AI will produce very rapid changes

People who probably endorse this argument¹⁰⁾ (specific quotes here): Joe Carlsmith (2021).

Human non-alignment

A utilitarian, a deep ecologist, and a Christian might agree on policy in the present world, but given arbitrary power: one might replace the world with efficient pleasure-producing machines, one may return it to nature, and one may invest in magnificent churches. All may consider the other futures a radical loss. This isn't a problem of AI, but AI may cause us to face it much sooner than otherwise, before we have tools to navigate this situation.

(Main article: Argument for AI x-risk from variance in human values)

Summary:

People who broadly agree on good outcomes within the current world may, given much more power, choose outcomes that others would consider catastrophic
AI may empower some humans or human groups to bring about futures closer to what they would choose
By the first premise, such futures may be catastrophic according to the values of most other humans

Selected counterarguments:

Human values might be reasonably similar (possibly after extensive reflection)
This argument applies to anything that empowers humans. So it fails to show that AI is unusually dangerous among desirable technologies and efforts

People who probably endorse this argument¹¹⁾ (specific quotes here): Joe Carlsmith (2024), Katja Grace (2022), Scott Alexander (2018).

Catastrophic tools

The BADGER nuclear explosion, April 18, 1953 at the Nevada Test Site.¹²⁾ Leo Szilard realized nuclear chain reactions might be possible in 1933¹³⁾, five years before nuclear fission was discovered in 1938. A large surge of intelligent effort might uncover more potentially world-ending technologies in quick succession.

(Main article: Argument for AI x-risk from catastrophic tools)

Summary:

There appear to be non-AI technologies that would pose a risk to humanity if developed
AI will markedly increase the speed of development of harmful non-AI technologies
AI will markedly increase the breadth of access to harmful non-AI technologies
Therefore AI development poses an existential risk to humanity

Selected counterarguments:

It is not clear that developing a potentially catastrophic technology makes its deployment highly likely
New technologies that are sufficiently catastrophic to pose an extinction risk may not be feasible soon, even with relatively advanced AI

People who probably endorse this argument¹⁴⁾ (specific quotes here): Dario Amodei (2023), Holden Karnofsky (2016), Yoshua Bengio (2024).

Powerful black boxes

(Main article: Argument for AI x-risk from powerful black boxes)

A volunteer and a nurse in a Phase 1 clinical trial.¹⁵⁾ Medicine is another area where we sometimes develop technology without understanding its mechanisms of action well. There we would expect the frequent accidental deaths of patients if we didn't proceed with caution. AI systems are arguably less well-understood and their consequences have higher stakes.

Summary:

So far, humans have developed technology largely through understanding relevant mechanisms
AI systems developed in 2024 are created via repeatedly modifying random systems in the direction of desired behaviors, rather than being manually built, so the mechanisms the systems themselves ultimately use are not understood by human developers
Systems whose mechanisms are not understood are more likely to produce undesired consequences than well-understood systems
If such systems are powerful, then the scale of undesired consequences may be catastrophic

Selected counterarguments:

It is not clear that developing technology without understanding mechanisms is so rare. We have historically incorporated many biological products into technology, and improved them, without deep understanding of all involved mechanisms
Even if this makes AI more likely to be dangerous, that doesn't mean the harms are likely to be large enough to threaten humanity

Multi-agent dynamics

Rabbits in Australia bred until the government stepped in, contrary to rabbit welfare (1938).¹⁶⁾ Groups of entities often end up in scenarios that none of the members would individually choose, for instance because of the dynamics of competition. Prevalence of powerful AI may worsen this through heightening the intensity of competition.

(Main article: Argument for AI x-risk from destructive multi-agent dynamics)

Summary:

Competition can produce outcomes undesirable to all parties, through selection pressure for the success of any behavior that survives well, or through high stakes situations where well-meaning actors' best strategies are risky to all (as with nuclear weapons in the 20th Century)
AI will increase the intensity of relevant competitions

Selected counterarguments:

It's not clear in which direction AI will affect the large number of competitive situations in the world

Large impacts

Replicas of Niña, Pinta and Santa María sail in 1893¹⁷⁾, mirroring Columbus' original transit 400 years earlier. Events with large consequences on many aspects of life are arguably more likely to have catastrophic consequences.

(Main article: Argument for AI x-risk from large impacts)

Summary:

AI development will have very large impacts, relative to the scale of human society
Large impacts generally raise the chance of large risks

Selected counterarguments:

That AI will have large impacts is a vague claim, so it is hard to tell if it is relevantly true. For instance, 'AI' is a large bundle of technologies, so it might be expected to have large impacts. Many other large bundles of things will have 'large' impacts, for instance the worldwide continued production of electricity, relative to its ceasing. However, we do not consider electricity producers to pose an existential risk for this reason
Minor changes frequently have large impacts on the world (e.g. according to the butterfly effect). By this reasoning, perhaps we should never leave the house

People who probably endorse this argument¹⁸⁾ (specific quotes here): Richard Ngo (2019).

Expert opinion

Summary:

The people best placed to judge the extent of existential risk from AI are AI researchers, forecasting experts, experts on AI risk, relevant social scientists, and some others
Median members of these groups frequently put substantial credence (e.g. 5%) on human extinction or similar disempowerment from AI

Selected counterarguments:

Most of these groups do not have demonstrated skill at forecasting, and to our knowledge none have demonstrated skill at forecasting speculative events more than 5 years into the future

800 randomly selected responses from our 2023 Expert Survey on Progress in AI on how good or bad they expect the long-run impacts of 'high level machine intelligence' to be on the future of humanity. Each vertical bar represents one participant's guess. The black section of each bar is the probability that participant put on 'extremely bad (e.g. human extinction)'.

Notes

¹⁾

Image from Midjourney

²⁾ , ¹¹⁾ , ¹⁴⁾ , ¹⁸⁾

Nathan Young puts 80% that at the time of the quote the individual would have endorsed the respective argument.

³⁾

Superintelligence, Chapter 8

⁴⁾

William H. Calvin, CC BY-SA 4.0, https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Orangutan_using_precision_grip.jpg

⁵⁾ , ⁸⁾ , ¹⁰⁾

Nathan Young puts 80% that any individual person will endorse their quote applying to this argument

⁶⁾

Human Compatible: Artificial Intelligence and the Problem of Control

⁷⁾

Opie, John (R. A.), 1797, https://catalogue.etoncollege.com/object-fda-e-2166-2015

⁹⁾

Cezary Tomczak, Maxime Lorant, BSD http://opensource.org/licenses/bsd-license.php, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:TetrisJS-GameOver.png

¹²⁾

Photo courtesy of National Nuclear Security Administration / Nevada Site Office, Public domain, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Operation_Upshot-Knothole_-_Badger_001.jpg

¹³⁾

“When Hitler rose to power in 1933, Szilard moved to England. He developed the idea of the nuclear chain reaction in 1933.” Atomic Heritage Foundation, https://ahf.nuclearmuseum.org/ahf/profile/leo-szilard/

¹⁵⁾

From Wikimedia, https://commons.wikimedia.org/wiki/File:Clinical_trial_for_malaria_treatment_(49450846413).jpg

¹⁶⁾

'Research - Myxomatosis - Rabbits around a waterhole during myxomatosis trials at Wardang Island, South Australia', National Archives of Australia, https://www.naa.gov.au/students-and-teachers/learning-resources/learning-resource-themes/environment-and-nature/conservation/rabbits-around-waterhole-during-myxomatosis-trial

¹⁷⁾

Wikimedia Commons, https://commons.wikimedia.org/wiki/File:1893_Nina_Pinta_Santa_Maria_replicas.jpg

AI Impacts Wiki
