List of arguments that AI poses an existential risk

This is a list of lines of reasoning suggesting that future progress in artificial intelligence may bring about the extinction of humankind or drastically limit human influence over the long-run future.

Arguments

Each 'argument' here is intended to be a distinct line of reasoning; however, the arguments often do not point to independent scenarios or rely on independent evidence. Some arguments attempt to reason about the same causal pathway to the same catastrophic scenarios, but rely on different concepts. Furthermore, 'lines of reasoning' are a vague construct, and different people may consider different arguments here to be equivalent, depending for instance on what other assumptions they make or on how they understand the relevant concepts.

Competent non-aligned agents

Computer games contain competent AI agents whose goals are explicitly opposed to those of human players. Humans increasingly lose to the best AI systems. If AI systems become similarly adept at navigating the real world, will humans also lose out?1)

(Main article: Argument for AI X-risk from competent non-aligned agents)

Summary:

  1. Humans will build AI systems that are 'agents', i.e. they will autonomously pursue goals
  2. Humans won’t figure out how to make systems with goals that are compatible with human welfare and realizing human values
  3. Such systems will be built or selected to be highly competent, and so gain the power to achieve their goals
  4. Thus the future will be primarily controlled by AIs, who will direct it in ways that are at odds with long-run human welfare or the realization of human values

Selected counterarguments:

  • It is unclear that AI will tend to have goals that are bad for humans
  • There are many forms of power. It is unclear that a competence advantage will ultimately trump all others in time
  • This argument also appears to apply to human groups such as corporations, so we need an explanation of why those are not an existential risk

People who probably endorse this argument2) (specific quotes here): Paul Christiano (2021), Ajeya Cotra (2023), Eliezer Yudkowsky (2024), Nick Bostrom (2014)3).


Second species argument

An orangutan uses a stick to control juice, while humans use complex systems of tools, structures, and behavioral coordination to control the orangutan. Should orangutans have felt safe inventing humans, if they had had the choice?4)

(Main article: Second species argument for AI x-risk)

Summary:

  1. Human dominance over other animal species is primarily due to humans having superior cognitive and coordination abilities
  2. Therefore if another 'species' appears with abilities superior to those of humans, that species will become dominant over humans in the same way
  3. AI will essentially be a 'species' with superior abilities to humans
  4. Therefore AI will dominate humans

Selected counterarguments:

  • Human dominance over other species is plausibly not due to the cognitive abilities of individual humans, but rather because of human ability to communicate and store information through culture and artifacts
  • Intelligence in animals doesn't appear to generally relate to dominance. For instance, elephants are much more intelligent than beetles, and it is not clear that elephants have dominated beetles
  • Large differences in capabilities don't empirically lead to extinction. In the modern world, more powerful countries arguably control less powerful countries, but they do not wipe them out and most colonized countries have eventually gained independence

People who probably endorse this argument 5) (specific quotes here): Joe Carlsmith (2024), Richard Ngo (2020), Stuart Russell (2020)6), Nick Bostrom (2015).


Loss of control via inferiority

(Main article: Argument for AI x-risk from loss of control through inferiority)

The coronation of Henry VI7). It can be hard for a child monarch to act in their own interests, even with official power, because they are so much less competent than their advisors. Humans surrounded by advanced AI systems may be in an analogous situation.

Summary:

  1. AI systems will become much more competent than humans at decision-making
  2. Thus most decisions will probably be allocated to AI systems
  3. If AI systems make most decisions, humans will lose control of the future
  4. If humans have no control of the future, the future will probably be bad for humans

Selected counterarguments:

  • Humans do not generally seem to become disempowered by possession of software that is far superior to them, even if it makes many 'decisions' in the process of carrying out their will
  • In the same way that humans avoid being overpowered by companies, even though companies are more competent than individual humans, humans can track AI trustworthiness and have AI systems compete for them as users. This might substantially mitigate untrustworthy AI behavior

People who probably endorse this argument 8) (specific quotes here): Paul Christiano (2014), Ajeya Cotra (2023), Richard Ngo (2024).


Loss of control via speed

Tetris is a game that speeds up over time.9) As the time for a player to react grows shorter, the player's moves become worse, until the player loses. If advanced AI causes events to speed up, human responses might similarly become decreasingly appropriate, potentially until humans lose all relevant control.

(Main article: Argument for AI x-risk from loss of control through speed)

Summary:

  1. Advances in AI will produce very rapid changes in available AI technology, other technologies, and society
  2. Faster changes reduce the ability of humans to exert meaningful control over events, because humans need time to make non-random choices
  3. The pace of relevant events could become so fast as to allow for negligible relevant human choice
  4. If humans are not continually involved in choosing the future, the future is likely to be bad by human lights

Selected counterarguments:

  • The pace at which humans can participate is not fixed. AI technologies will likely speed up processes for human participation
  • It is not clear that advances in AI will produce very rapid changes

People who probably endorse this argument 10) (specific quotes here): Joe Carlsmith (2021)


Human non-alignment

A utilitarian, a deep ecologist, and a Christian might agree on policy in the present world, but given arbitrary power, one might replace the world with efficient pleasure-producing machines, another might return it to nature, and a third might invest in magnificent churches. Each may consider the others' futures a radical loss. This isn't a problem with AI as such, but AI may cause us to face it much sooner than we otherwise would, before we have the tools to navigate it.

(Main article: Argument for AI x-risk from variance in human values)

Summary:

  1. People who broadly agree on good outcomes within the current world may, given much more power, choose outcomes that others would consider catastrophic
  2. AI may empower some humans or human groups to bring about futures closer to what they would choose
  3. From 1, that may be catastrophic according to the values of most other humans

Selected counterarguments:

  • Human values might be reasonably similar (possibly after extensive reflection)
  • This argument applies to anything that empowers humans. So it fails to show that AI is unusually dangerous among desirable technologies and efforts

People who probably endorse this argument11) (specific quotes here): Joe Carlsmith (2024), Katja Grace (2022), Scott Alexander (2018)


Catastrophic tools

The BADGER nuclear explosion, April 18, 1953 at the Nevada Test Site.12) Leo Szilard realized nuclear chain reactions might be possible in 193313), five years before nuclear fission was discovered in 1938. A large surge of intelligent effort might uncover more potentially world-ending technologies in quick succession.

(Main article: Argument for AI x-risk from catastrophic tools)

Summary:

  1. There appear to be non-AI technologies that would pose a risk to humanity if developed
  2. AI will markedly increase the speed of development of harmful non-AI technologies
  3. AI will markedly increase the breadth of access to harmful non-AI technologies
  4. Therefore AI development poses an existential risk to humanity

Selected counterarguments:

  • It is not clear that developing a potentially catastrophic technology makes its deployment highly likely
  • New technologies that are sufficiently catastrophic to pose an extinction risk may not be feasible soon, even with relatively advanced AI

People who probably endorse this argument14) (specific quotes here): Dario Amodei (2023), Holden Karnofsky (2016), Yoshua Bengio (2024).


Powerful black boxes

(Main article: Argument for AI x-risk from powerful black boxes)

A volunteer and a nurse in a Phase 1 clinical trial.15) Medicine is another area where we sometimes develop technology without understanding its mechanisms of action well. There, we would expect frequent accidental deaths of patients if we did not proceed with caution. AI systems are arguably less well understood, and their consequences have higher stakes.

Summary:

  1. So far, humans have developed technology largely through understanding relevant mechanisms
  2. AI systems developed as of 2024 are created by repeatedly modifying randomly initialized systems in the direction of desired behaviors, rather than by being built manually, so the mechanisms the resulting systems use are not understood by their human developers
  3. Systems whose mechanisms are not understood are more likely to produce undesired consequences than well-understood systems
  4. If such systems are powerful, then the scale of undesired consequences may be catastrophic

Selected counterarguments:

  • It is not clear that developing technology without understanding mechanisms is so rare. We have historically incorporated many biological products into technology, and improved them, without deep understanding of all involved mechanisms
  • Even if this makes AI more likely to be dangerous, that doesn't mean the harms are likely to be large enough to threaten humanity

Multi-agent dynamics

Rabbits in Australia bred until the government stepped in, contrary to rabbit welfare (1938).16) Groups of entities often end up in scenarios that none of the members would individually choose, for instance because of the dynamics of competition. Prevalence of powerful AI may worsen this through heightening the intensity of competition.

(Main article: Argument for AI x-risk from destructive multi-agent dynamics)

Summary:

  1. Competition can produce outcomes undesirable to all parties, through selection pressure favoring whatever behaviors survive and spread best, or through high-stakes situations in which even well-meaning actors' best strategies are risky to all (as with nuclear weapons in the 20th century)
  2. AI will increase the intensity of relevant competitions

Selected counterarguments:

  • It is not clear in which direction AI will affect the large number of competitive situations in the world

Large impacts

Replicas of Niña, Pinta and Santa María sail in 189317), mirroring Columbus' original transit 400 years earlier. Events with large consequences on many aspects of life are arguably more likely to have catastrophic consequences.

(Main article: Argument for AI x-risk from large impacts)

Summary:

  1. AI development will have very large impacts, relative to the scale of human society
  2. Large impacts generally raise the chance of large risks

Selected counterarguments:

  • That AI will have large impacts is a vague claim, so it is hard to tell whether it is relevantly true. For instance, 'AI' is a large bundle of technologies, so it might be expected to have large impacts. Many other large bundles of things will have 'large' impacts, for instance the worldwide continued production of electricity, relative to its ceasing. However, we do not consider electricity producers to pose an existential risk for this reason
  • Minor changes frequently have large impacts on the world (e.g. the butterfly effect). By this reasoning, perhaps we should never leave the house

People who probably endorse this argument18) (specific quotes here): Richard Ngo (2019)


Expert opinion

Summary:

  1. The people best placed to judge the extent of existential risk from AI are AI researchers, forecasting experts, experts on AI risk, relevant social scientists, and some others
  2. Median members of these groups frequently put substantial credence (e.g. 5%) on human extinction or similar disempowerment from AI

Selected counterarguments:

  • Most of these groups do not have demonstrated skill at forecasting, and to our knowledge none have demonstrated skill at forecasting speculative events more than 5 years into the future

800 randomly selected responses from our 2023 Expert Survey on Progress in AI, showing how good or bad respondents expect the long-run impacts of 'high-level machine intelligence' to be for the future of humanity. Each vertical bar represents one participant's guess. The black section of each bar is the probability that participant put on 'extremely bad (e.g. human extinction)'.

See also

Contributors

Primary author: Katja Grace

Other authors: Nathan Young, Josh Hart

Suggested citation:

Grace, K., Young, N., Hart, J., (2024), List of arguments that AI poses an existential risk, AI Impacts Wiki, https://wiki.aiimpacts.org/arguments_for_ai_risk:list_of_arguments_that_ai_poses_an_xrisk:start

Notes

1)
Image from Midjourney
2), 11), 14), 18)
Nathan Young puts 80% that at the time of the quote the individual would have endorsed the respective argument.
3)
Superintelligence, Chapter 8
5), 8), 10)
Nathan Young puts 80% that any individual person would endorse their quote as applying to this argument.
6)
Human Compatible: Artificial Intelligence and the Problem of Control
12)
Photo courtesy of National Nuclear Security Administration / Nevada Site Office, Public domain, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Operation_Upshot-Knothole_-_Badger_001.jpg
13)
“When Hitler rose to power in 1933, Szilard moved to England. He developed the idea of the nuclear chain reaction in 1933.” Atomic Heritage Foundation, https://ahf.nuclearmuseum.org/ahf/profile/leo-szilard/
16)
'Research - Myxomatosis - Rabbits around a waterhole during myxomatosis trials at Wardang Island, South Australia', National Archives of Australia, https://www.naa.gov.au/students-and-teachers/learning-resources/learning-resource-themes/environment-and-nature/conservation/rabbits-around-waterhole-during-myxomatosis-trial