Is AI an existential threat to humanity?

This page is under active work and may be updated soon.

The balance of evidence appears to suggest that AI poses a substantial existential risk, though none of the arguments that we know of appears to constitute strongly compelling evidence.

Background

Many thinkers believe advanced artificial intelligence (AI) poses a large threat to humanity's long-term survival or flourishing. Here we review the evidence.

Note that arguments included here are not intended to be straightforwardly independent lines of evidence. They may instead represent different ways of conceptualizing and reasoning about the same underlying situation.

Scenarios and supporting arguments

Advanced AI could conceivably threaten humanity's future via several different disaster scenarios, each of which is suggested by different arguments. We consider these scenarios and arguments here, one at a time.

Scenario: amoral AI agents control the future

(Main article: Will amoral AI agents control the future?)

This appears to be the most discussed extinction scenario. In it:

  1. AI systems are created which, a) have goals, and b) are each more capable than a human at many economically valuable tasks, including strategic decision making.
  2. These AI systems' superior performance allows them to take control of the future, for instance through accruing social and economic power, or through immediately devising a plan for destroying humanity.
  3. The AI systems do not want the same things as humans, so they will bring about a future that humans would disprefer.

This scenario includes sub-scenarios where the above process happens fast or slow, or involves different kinds of agents, or different specific routes, etc.

Arguments for this scenario occurring include:

  • AI developments will produce powerful agents with undesirable goals

    (Main article: Argument for AI X-risk from competent malign agents)

    Summary: At least some advanced AI systems will probably be 'goal-oriented', will probably be a powerful force in the world, and their goals will probably be bad by human lights. Powerful goal-oriented agents tend to achieve their goals.

    Apparent status: This seems to us the most suggestive argument, though not watertight. It is prima facie plausible, but the destruction of everything is a very implausible event, so the burden of proof is high.
  • AI will replace humans as most intelligent 'species'

    (Main article: Argument for AI x-risk from most intelligent species)

    Summary: Humans' dominance over other species in controlling the world is due primarily to our superior cognitive abilities. If another 'species' with better cognitive abilities appeared, we should then expect humans to lose control over the future and therefore for the future to lose its value.

    Apparent status: Somewhat suggestive, though it doesn't appear to be valid, since intelligence in animals doesn't generally appear to relate to dominance. A valid version may be possible to construct.
  • AI agents will cause humans to 'lose control'

    Summary: AI will ultimately be much faster and more competent than humans, so either a) AI must make most decisions, because waiting for humans would be too costly, or b) AI will make the decisions if it wants to, since humans will be relatively powerless due to their intellectual inferiority. Losing control of the future isn't necessarily bad, but it is prima facie a very bad sign.

    Apparent status: Suggestive, but as stated does not appear to be valid. For instance, humans do not generally seem to become disempowered by possessing software that is far superior to them.
  • Argument for loss of control from extreme speed

    Summary: Advancing AI will tend to produce very rapid changes, either because of feedback loops when the process of automation is itself automated, or because automation tends to be faster than the human activity it replaces. Faster change reduces human ability to steer a situation, e.g. by reviewing and understanding it, responding to problems as they appear, and preparing in advance. In the extreme, the pace of socially relevant events could become so fast as to exclude human participation.

    Apparent status: Heuristically suggestive; however, the burden of proof should arguably be high for an implausible event such as the destruction of humanity. This argument also seems to support concern about a wide range of technologies, which may be correct.

In light of these arguments, this scenario seems to us plausible but not guaranteed. Its likelihood appears to depend strongly on one's prior probability that arbitrary risks of this kind could be sufficient to destroy the world.

Scenario: bad human actors are empowered by cheap AI cognitive labor

Some people and collectives have goals whose fulfillment would be considered bad by most people. If advanced AI empowered those people disproportionately, this could be destructive. This could happen by bad luck, or because the situation systematically advantages unpopular values.

Arguments for this type of scenario occurring:

  • Bad actors are usually constrained by the lack of human cognitive labor available for widely disliked projects (e.g. terrorist groups only have access to the minuscule fraction of talent that is aligned with their ideology and in favor of terrorism). If money can more reliably buy cognitive labor for any purpose, these agendas may be less disadvantaged.
  • Perhaps virtually all people at present, if suddenly empowered to unprecedented levels, would bring about regrettable outcomes, such that this scenario would result from almost any person or group being suddenly empowered a great deal.

These arguments appear to raise the chance of such a scenario, but not massively.

Scenario: new AI cognitive labor is misdirected, causing destruction contrary to all actors' goals

(Main article: Argument for AI x-risk from potential for accidents and misuse)

Advanced AI could yield powerful destructive capabilities, such as new weapons or hazardous technologies. As well as being used maliciously (see previous section) or forcing well-meaning actors into situations where unfortunate risks are hard to avoid, as with nuclear weapons (see next section), such capabilities raise the risk of cataclysmic accidents, simply by being used in error.

Scenario: destructive multi-agent dynamics are accelerated by new AI cognitive labor

(Main article: Argument for AI x-risk from destructive competition)

Competition can produce outcomes undesirable to all parties, through selection pressure favoring whatever behavior survives well. AI may increase the intensity of relevant competitions.

General evidence

This is evidence for existential risk from AI which doesn't point to specific scenarios:

  1. Expert opinion expects non-negligible extinction risk: in a large survey run by AI Impacts, the median machine learning researcher appeared to put a 5-10% chance on human extinction resulting from human-level artificial intelligence, across different question framings.
  2. AI will have large impacts, which is heuristically indicative of risk. (Main article: Argument for AI x-risk from large impacts)

General counterarguments

Counterarguments to specific arguments above are listed in their main articles. These are general counterarguments to concern about AI risk, not directed at specific arguments.

  1. In the absence of clearly reliable reasoning, we should not put high probability on outcomes which initially had low likelihood. In more detail:
    1. None of this reasoning is rigorous
    2. Reasoning is fallible, and these styles of reasoning have not been tested empirically
    3. Human extinction is a rare event, so should have a very low prior likelihood
    4. Human extinction specifically from uncontrolled artificial intelligence seems intuitively absurd to many, which heuristically suggests it should have a very low prior likelihood
  2. There is a dearth of concrete example scenarios which appear to be plausible and realistic

Arguments for risk being higher, if it exists

These are arguments that, supposing there is some other reason to expect a risk at all, the risk may be larger or worse than expected.

  1. AI performance may increase very fast due to inherent propensities to discontinuity
  2. AI performance may increase very fast once AI contributes to AI progress, due to a feedback dynamic ('intelligence explosion' from 'recursive self improvement')

See also

Notes
