Is AI an existential risk to humanity?

This page is under active work and may currently be incoherent or inaccurate.

The balance of evidence appears to suggest that AI poses a substantial existential risk, though none of the arguments that we know of appear to be conclusive.


Many thinkers believe advanced artificial intelligence (AI) poses a large threat to humanity's long-term survival or flourishing. Here we review the evidence.

For views of specific people working on AI, see this page.

Note that arguments included here are not intended to be straightforwardly independent lines of evidence. They may instead represent different ways of conceptualizing and reasoning about the same underlying situation.


(Main article: Will malign AI agents control the future?)

Several arguments have been made for expecting artificial intelligence to pose an existential risk. The most prominent argument for AI posing a severe threat to humanity is for a scenario where competent, malign agents control the future, and can be summarized briefly as follows:

  1. Some advanced AI systems will very likely be 'goal-oriented'
  2. The aggregate goals of these systems may be bad. (There are reasons to think this.)
  3. Such systems will likely have the power to achieve their goals even against the will of humans
  4. Thus, there is some chance that the future will proceed in opposition to long-run human welfare, because these advanced AI systems will succeed in their (bad) goals

Other arguments

Further arguments for AI posing an existential risk to humanity can be categorized by scenario.

Scenario: malign AI agents control the future

(Main article: Will malign AI agents control the future?)

Arguments that this scenario will occur include:

  • AI will replace humans as most intelligent 'species'

    (Main article: Argument for AI x-risk from most intelligent species)

    Summary: Humans' dominance over other species in controlling the world is due primarily to our superior cognitive abilities. If another 'species' with better cognitive abilities appeared, we should then expect humans to lose control over the future, and therefore expect the future to lose its value.

    Apparent status: Somewhat suggestive, though doesn't appear to be valid, since intelligence in animals doesn't appear to generally relate to dominance. A valid version may be possible to construct.
  • AI agents will cause humans to 'lose control'

    Summary: AI will ultimately be much faster and more competent than humans, so either a) it must make most decisions, because waiting for humans will be so costly, or b) it will make decisions if it wants to, since humans will be relatively powerless due to their intellectual inferiority. Losing control of the future isn't necessarily bad, but is prima facie a very bad sign.

    Apparent status: Suggestive, but as stated does not appear to be valid. For instance, humans do not generally seem to become disempowered by possession of software that is far superior to them.
  • Argument for loss of control from extreme speed

    Summary: Advancing AI will tend to produce very rapid changes, either because of feedback loops in automation of automation processes, or because automation tends to be faster than the human activity it replaces. Faster change reduces human ability to steer a situation, e.g. by reviewing and understanding it, responding to problems as they appear, and preparing. In the extreme, the pace of socially relevant events could become so fast as to exclude human participation.

    Apparent status: Heuristically suggestive; however, the burden of proof should arguably be high for an implausible event such as the destruction of humanity. This argument also seems to support concern about a wide range of technologies, which may be correct.

In light of these arguments, this scenario seems to us plausible but not guaranteed. Its likelihood appears to depend strongly on one's prior probability that an arbitrary risk is sufficient to destroy the world.

Scenario: AI empowers bad human actors

Some people and collectives have goals whose fulfillment would be considered bad by most people. If advanced AI empowered those people disproportionately, this could be destructive. This could happen by bad luck, or because the situation systematically advantages unpopular values.

Arguments for this type of scenario occurring:

  • Bad actors are usually constrained by a lack of human cognitive labor available for widely disliked projects (e.g. terrorist groups only have access to the minuscule fraction of talent that is aligned with their ideology and in favor of terrorism). If money can more reliably buy cognitive labor for any purpose, these agendas may be less disadvantaged.
  • Perhaps virtually all people at present, if suddenly empowered to unprecedented levels, would bring about regrettable outcomes, such that this scenario would result from almost any person or group being suddenly empowered a lot.

These arguments appear to raise the chance of such a scenario, but not massively.

Scenario: new AI cognitive labor is misdirected, causing destruction contrary to all actors' goals

(Main article: Argument for AI x-risk from potential for accidents and misuse)

Advanced AI could yield powerful destructive capabilities such as new weapons or hazardous technologies. As well as being used maliciously (see previous section) or forcing well-meaning actors into situations where unfortunate risks are hard to avoid, as with nuclear weapons (see next section), these raise the risk of cataclysmic accidents, just by being used in error.

Scenario: destructive multi-agent dynamics are accelerated by new AI cognitive labor

(Main article: Argument for AI x-risk from destructive competition)

Competition can produce outcomes undesirable to all parties, through selection pressure for the success of any behavior that survives well. AI may increase the intensity of relevant competitions.

General evidence

This is evidence for existential risk from AI which doesn't point to specific scenarios:

  1. Expert opinion expects non-negligible extinction risk: in a large survey run by AI Impacts, the median machine learning researcher appeared to put a 5-10% chance on extinction from human-level artificial intelligence, across different question framings.
  2. AI will have large impacts, which is heuristically indicative of risk. (Main article: Argument for AI x-risk from large impacts)

General counterarguments

Counterarguments to specific arguments above are listed in their main articles. These are general counterarguments to concern about AI risk, not directed at specific arguments.

  1. In the absence of clearly reliable reasoning, we should not put high probability on outcomes which initially had low likelihood. In more detail:
    1. None of this reasoning is rigorous
    2. Reasoning is fallible, and these styles of reasoning have not been tested empirically
    3. Human extinction is a rare event, so should have a very low prior likelihood
    4. Human extinction specifically from uncontrolled artificial intelligence seems intuitively absurd to many, which heuristically suggests it should have a very low prior likelihood
  2. There is a dearth of examples of concrete scenarios which appear to be plausible and realistic

Arguments for risk being higher, if it exists

These are arguments that, supposing there is some other reason to expect a risk at all, the risk may be larger or worse than expected.

  1. AI performance may increase very fast due to inherent propensities to discontinuity
  2. AI performance may increase very fast once AI contributes to AI progress, due to a feedback dynamic ('intelligence explosion' from 'recursive self improvement')

There are various other scenarios and supporting arguments


In light of these arguments, this scenario seems plausible but not guaranteed.


arguments_for_ai_risk/is_ai_an_existential_threat_to_humanity/start.txt · Last modified: 2024/05/28 19:23 by katjagrace