Is AI an existential risk to humanity?
This page is under active work and may be updated soon.
The balance of evidence appears to suggest that AI poses a substantial existential risk, though none of the arguments that we know of appears to be conclusive.
Many thinkers believe advanced artificial intelligence (AI) poses a large threat to humanity's long-term survival or flourishing. Here we review the evidence.
For views of specific people working on AI, see this page.
Note that arguments included here are not intended to be straightforwardly independent lines of evidence. They may instead represent different ways of conceptualizing and reasoning about the same underlying situation.
Scenarios and supporting arguments
Advanced AI could conceivably threaten humanity's future via several different disaster scenarios, each of which is suggested by different arguments. We consider these scenarios and arguments here, one at a time.
Scenario: malign AI agents control the future
(Main article: Will malign AI agents control the future?)
This appears to be the most discussed extinction scenario. In it:
AI systems are created which a) have goals, and b) are each more capable than a human at many economically valuable tasks, including strategic decision making.
These AI systems' superior performance allows them to take control of the future, for instance through accruing social and economic power, or through immediately devising a plan for destroying humanity.
The AI systems do not want the same things as humans, so they will bring about a future that humans would disprefer.
This scenario includes sub-scenarios in which the above process happens quickly or slowly, involves different kinds of agents, or takes different specific routes.
Various arguments are made for this scenario. The most prominent appears to be:
AI developments will produce powerful agents with undesirable goals
(Main article: Argument for AI X-risk from competent malign agents)
: At least some advanced AI systems will probably be 'goal-oriented' and a powerful force in the world, and their goals will probably be bad by human lights. Powerful goal-oriented agents tend to achieve their goals.
: This seems to us the most suggestive argument, though it is not watertight. It is prima facie plausible, but destroying everything is a very implausible event, so the burden of proof is high.
In light of these arguments, this scenario seems plausible but not guaranteed.
Scenario: AI empowers bad human actors
Some people and collectives have goals whose fulfillment would be considered bad by most people. If advanced AI empowered those people disproportionately, this could be destructive. This could happen by bad luck, or because the situation systematically advantages unpopular values.
Arguments for this type of scenario occurring:
Bad actors are usually constrained by a lack of human cognitive labor available for widely disliked projects (e.g. terrorist groups only have access to the minuscule fraction of talent that is aligned with their ideology and in favor of terrorism). If money can more reliably buy cognitive labor for any purpose, these agendas may be less disadvantaged.
Perhaps virtually all people at present, if suddenly empowered to unprecedented levels, would bring about regrettable outcomes, such that this scenario would result from almost any person or group suddenly being empowered a great deal.
These arguments appear to raise the chance of such a scenario, but not massively.
Scenario: new AI cognitive labor is misdirected, causing destruction contrary to all actors' goals
(Main article: Argument for AI x-risk from potential for accidents and misuse)
Advanced AI could yield powerful destructive capabilities, such as new weapons or hazardous technologies. These could be used maliciously (see previous section), or could force well-meaning actors into situations where unfortunate risks are hard to avoid, as with nuclear weapons (see next section). They also raise the risk of cataclysmic accidents simply through being used in error.
Scenario: destructive multi-agent dynamics are accelerated by new AI cognitive labor
(Main article: Argument for AI x-risk from destructive competition)
Competition can produce outcomes undesirable to all parties, through selection pressure that favors whatever behavior survives well, regardless of whether anyone wants it. AI may increase the intensity of relevant competitions.
The following is evidence for existential risk from AI that does not point to a specific scenario:
Expert opinion expects non-negligible extinction risk
: In a large survey run by AI Impacts, the median machine learning researcher appeared to put a 5-10% chance on extinction from human-level artificial intelligence, across different question framings.
Counterarguments to the specific arguments above are listed in their main articles. The following are general counterarguments to concern about AI risk, not directed at specific arguments.
In the absence of clearly reliable reasoning, we should not put high probability on outcomes which initially had low likelihood. In more detail:
None of this reasoning is rigorous
Reasoning is fallible, and these styles of reasoning have not been tested empirically
Human extinction is a rare event, so should have a very low prior likelihood
Human extinction specifically from uncontrolled artificial intelligence seems intuitively absurd to many, which heuristically suggests it should have a very low prior likelihood
There is a dearth of examples of concrete scenarios which appear to be plausible and realistic
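The reasoning behind this counterargument can be written in odds form. This is a rough sketch only: the symbols and numbers are illustrative assumptions, not estimates made on this page. Let H be the hypothesis that AI causes an existential catastrophe, and E the body of arguments reviewed above.

```latex
% Bayes' rule in odds form: posterior odds = prior odds x likelihood ratio
\[
\frac{P(H \mid E)}{P(\neg H \mid E)}
= \frac{P(H)}{P(\neg H)}
\times
\frac{P(E \mid H)}{P(E \mid \neg H)}
\]
% Purely illustrative numbers (assumptions, not estimates):
% prior P(H) = 0.001, likelihood ratio = 10
\[
\frac{0.001}{0.999} \times 10 \approx 0.0100
\quad\Longrightarrow\quad
P(H \mid E) \approx \frac{0.0100}{1.0100} \approx 0.0099
\]
```

With these assumed numbers, evidence ten times likelier under H than under its negation moves a 0.1% prior to only about 1%. The less rigorous and less tested the reasoning, the closer the likelihood ratio is to 1 and the less the prior moves. The sketch does not settle how low the prior should be or how strong the evidence is.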
Arguments for risk being higher, if it exists
These are arguments that, supposing there is some other reason to expect a risk at all, the risk may be larger or worse than expected.
AI performance may increase very fast due to inherent propensities to discontinuity
AI performance may increase very fast once AI contributes to AI progress, due to a feedback dynamic ('intelligence explosion' from 'recursive self-improvement')
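The feedback dynamic in the last item can be illustrated with a toy model. The sketch below is an illustration under strong assumptions, not a forecast: the function name `simulate`, the parameters `base_rate` and `feedback`, and all numeric values are hypothetical, chosen only to contrast steady progress with progress that partly scales with current capability.

```python
def simulate(steps, base_rate, feedback):
    """Return capability levels over time under a toy growth rule.

    Each step adds a fixed amount of progress (base_rate, standing in for
    human research effort) plus an amount proportional to the current
    capability (feedback, standing in for AI contributing to AI progress).
    All quantities are unitless and hypothetical.
    """
    capability = 1.0
    history = [capability]
    for _ in range(steps):
        capability += base_rate + feedback * capability
        history.append(capability)
    return history


# Without feedback, capability grows linearly.
no_feedback = simulate(steps=50, base_rate=0.1, feedback=0.0)

# With feedback, each gain increases the size of later gains,
# so growth is roughly exponential.
with_feedback = simulate(steps=50, base_rate=0.1, feedback=0.1)

print(f"no feedback:   {no_feedback[-1]:.1f}")
print(f"with feedback: {with_feedback[-1]:.1f}")
```

The model says nothing about whether real AI progress has this form. It only shows that if gains feed back into the rate of progress, the two trajectories diverge sharply over the same number of steps.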