AI Timeline Surveys

Published 10 January 2015; last updated 9 January 2024

This page is incomplete, under active work and may be updated soon.

We know of 20 surveys on the predicted timing of human-level AI. If we collapse a few slightly different meanings of ‘human-level AI’, then:

  • Median estimates made before 2016 for when there will be a 10% chance of human-level AI are all in the 2020s; median estimates made after 2016 are in the 2030s, except for one with a median of 2046.
  • Median estimates for when there will be a 50% chance of human-level AI ranged between 2035 and 2074 (from twelve surveys).
  • Of three surveys in recent decades asking for predictions but not probabilities, two produced median estimates of when human-level AI will arrive in the 2050s, and one in 2085.

Participants appear to mostly be experts in AI or related areas, but with a large contingent of others. Several groups of survey participants seem likely to over-represent people who are especially optimistic about human-level AI being achieved soon.


List of surveys

These are the surveys that we know of on timelines to human-level AI:


Results summary

| Year | Survey | # participants | 10% chance | 50% chance | 90% chance | Other key ‘Predictions’ | Participants | Response rate | Link to original document |
|---|---|---|---|---|---|---|---|---|---|
| 1972 | Michie | 67 | | | | Median 50y (2022) (vs 20 or >50) | AI, CS | | link |
| 2005 | Bainbridge | 26 | | | | Median 2085 | Tech | | link |
| 2006 | AI@50 | | | | | median >50y (2056) | AI conf | | link |
| 2007 | Klein | 888 | | | | median 2030-2050 | Futurism? | | link and link |
| 2009 | AGI-09 | 21 | 2020 | 2040 | 2075 | | AGI conf; AI | | link |
| 2011 | AGI-11 | 60 | | | | | AGI-11 | | link |
| 2011 | FHI Winter Intelligence | 35 | 2028 | 2050 | 2150 | | AGI impacts conf; 44% related technical | 41% | link |
| 2011-2012 | Kruel interviews | 37 | 2025 | 2035 | 2070 | | AGI, AI | | link |
| 2012 | FHI: AGI-12 | 72 | 2022 | 2040 | 2065 | | AGI & AGI impacts conf; AGI, technical work | 65% | link |
| 2012 | FHI: PT-AI | 43 | 2023 | 2048 | 2080 | | Philosophy & theory of AI conf; not technical AI | 49% | link |
| 2012-? | Hanson | ~10 | | | | ≤ 10% progress to human level in past 20y | AI | | link |
| 2013 | FHI: TOP100 | 29 | 2022 | 2040 | 2075 | | Top AI | 29% | link |
| 2013 | FHI: EETN | 26 | 2020 | 2050 | 2093 | | Greek assoc. for AI; AI | 10% | |
| 2015 | ESPAI 2016 | 352 | 2025 | 2061 | | | NIPS and ICML 2015 | 21% | link |
| 2016 | Etzioni | 80 | | | | | | 41% | link |
| 2017 | Walsh | 849 | AI experts: 2035; Robotics experts: 2033; Non-experts: 2026 | AI experts: 2061; Robotics experts: 2065; Non-experts: 2039 | AI experts: 2109; Robotics experts: 2118; Non-experts: 2060 | | 200 AI experts; 101 Robotics experts; 548 Non-experts | | |
| 2018 | Gruetzemacher | 164 | 2046 | 2074 | 2117 | | ICML, IJCAI, and HLAI 2018 | | link |
| 2019 | GovAI | 296 | 2039 | 2059 | 2119 | | NIPS and ICML 2018 | 20% | link |
| 2022 | ESPAI 2022 | 461 | 2029 | 2060 | 2288 | | NIPS and ICML 2021 | 17% | link |
| 2023 | ESPAI 2023 | 1714 | 2027 | 2047 | 2186 | | NeurIPS, ICML, ICLR, AAAI, JMLR, and IJCAI 2022 | 15% | link |

Time to a 10% chance and a 50% chance of human-level AI

The FHI Winter Intelligence, Müller and Bostrom, AGI-09, Kruel, and 2016 ESPAI surveys asked for years when participants expected 10%, 50% and 90% probabilities of human-level AI (or a similar concept). All of these surveys were taken between 2009 and 2012, except the 2016 ESPAI.

Survey participants’ median estimates for when there will be a 10% chance of human-level AI are all in the 2020s or 2030s. Until the 2016 ESPAI survey, median estimates for when there will be a 50% chance of human-level AI ranged between 2035 and 2050. The 2016 ESPAI asked about human-level AI using both questions very similar to those in previous surveys and a different style of question based on automation of specific human occupations. The former questions found median dates of at least 2056, and the latter prompted median dates of at least 2106.
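The ranges quoted for these surveys can be checked against the results summary table. The following is a minimal sketch, assuming the median years are transcribed correctly from that table and reading ‘Müller and Bostrom’ as the four rows labelled FHI: AGI-12, PT-AI, TOP100 and EETN. The names `MEDIANS` and `median_range` are illustrative only and do not come from any survey's materials.

```python
# Median years transcribed from the results summary table for the surveys
# named in this section: (survey, year conducted, 10% median, 50% median).
# Assumption: the four "FHI:" rows are the Müller and Bostrom surveys.
MEDIANS = [
    ("AGI-09", 2009, 2020, 2040),
    ("FHI Winter Intelligence", 2011, 2028, 2050),
    ("Kruel interviews", 2011, 2025, 2035),
    ("FHI: AGI-12", 2012, 2022, 2040),
    ("FHI: PT-AI", 2012, 2023, 2048),
    ("FHI: TOP100", 2013, 2022, 2040),
    ("FHI: EETN", 2013, 2020, 2050),
    ("ESPAI 2016", 2015, 2025, 2061),
]

TEN_PCT, FIFTY_PCT = 2, 3  # column indices into each row


def median_range(rows, column, before_year=None):
    """Return (min, max) of one median column, optionally restricted
    to surveys conducted before `before_year`."""
    values = [
        row[column]
        for row in rows
        if before_year is None or row[1] < before_year
    ]
    return min(values), max(values)


print(median_range(MEDIANS, TEN_PCT))                      # (2020, 2028)
print(median_range(MEDIANS, FIFTY_PCT, before_year=2015))  # (2035, 2050)
```

The second call excludes the 2016 ESPAI row (conducted in 2015), matching the ‘until the 2016 ESPAI survey’ range for the 50% estimates.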

Non-probabilistic predictions

Three surveys (Bainbridge, Klein, and AI@50) asked about predictions, rather than confidence levels. These produced median predictions of >2056 (AI@50), 2030-2050 (Klein), and 2085 (Bainbridge). It is unclear how participants interpret the request to estimate when a thing will happen; these responses may mean the same as the 50% confidence estimate discussed above. These surveys together appear to contain a high density of people who don’t work in AI, compared to the other surveys.

Michie survey

Michie’s survey is unusual in being much earlier than the others (1972). In it, fewer than a third of participants expected human-level AI by 1992, almost another third estimated 2022, and the rest expected it later. Note that the participants’ median expectation (50 years away) was further from their present time than that of participants in more recent surveys. This point conflicts with a common perception that early AI predictions were shockingly optimistic, and quickly undermined.

Hanson survey

Hanson’s survey is unusual in its methodology. Hanson informally asked some AI experts what fraction of the way to human-level capabilities we had come in 20 years, in their subfield. He also asked about apparent acceleration. Around half of answers were in the 5-10% range, and all except one which hadn’t passed human-level already were less than 10%. Of six who reported on acceleration, only one saw positive acceleration.

These estimates suggest human-level capabilities in most fields will take more than 200 years, if progress proceeds as it has (i.e. if we progress at 10% per twenty years, it will take 200 years to get to 100%). This estimate is quite different from those obtained from most of the other surveys.
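The parenthetical above is a simple linear extrapolation. Here is a minimal sketch of that arithmetic, assuming progress continues at a constant rate and is measured from zero, as the text does. The function name and parameters are illustrative and are not taken from Hanson's survey.

```python
def years_to_human_level(fraction_per_period, period_years=20.0):
    """Linearly extrapolate the total years needed to cover 100% of the
    distance to human-level capability.

    fraction_per_period: fraction of the distance covered in one period
        (e.g. 0.10 for 10%).
    period_years: length of that period in years (20 in Hanson's question).
    """
    if fraction_per_period <= 0:
        raise ValueError("fraction_per_period must be positive")
    return period_years / fraction_per_period


# 10% per twenty years implies 200 years in total; 5% implies 400.
for fraction in (0.10, 0.05):
    total = years_to_human_level(fraction)
    print(f"{fraction:.0%} per 20 years -> {total:.0f} years")
```

Because most reported answers were at or below 10%, the implied totals under this constant-rate assumption are 200 years or more.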

The 2016 ESPAI attempted to replicate this methodology and did not appear to find similarly long implied timelines; however, little attention has been paid to analyzing that data.

This methodology is discussed more in the methods section below.


Survey participants

In assessing the quality of predictions, we are interested in the expertise of the participants, the potential for biases in selecting them, and the degree to which a group of well-selected experts generally tend to make good predictions. We will leave the third issue to be addressed elsewhere, and here describe the participants’ expertise and the surveys’ biases. We will see that the participants have much expertise relevant to AI, but – relatedly – their views are probably biased toward optimism because of selection effects as well as normal human optimism about projects.

Summary of participant backgrounds

The FHI (2011), AGI-09, and one of the four FHI collection surveys are from AGI (artificial general intelligence) conferences, so will tend to include a lot of people who work directly on trying to create human-level intelligence, and others who are enthusiastic or concerned about that project. At least two of the aforementioned surveys draw some participants from the ‘impacts’ section of the AGI conference, which is likely to select for people who think the effects of human-level intelligence are worth thinking about now.

Kruel’s participants are not from the AGI conferences, but around half work in AGI. Klein’s participants are not known, except they are acquaintances of a person who is enthusiastic about AGI (his site is called ‘AGI-world’). Thus many participants either do AGI research, or think about the topic a lot.

Many more participants are AI researchers from outside AGI. Hanson’s participants are experts in narrow AI fields. Michie’s participants are computer scientists working close to AI. Müller and Bostrom’s surveys of the top 100 artificial intelligence researchers, and of members of the Greek Association for Artificial Intelligence, would be almost entirely AI researchers, and there is little reason to expect them to be in AGI. AI@50 seems to include a variety of academics interested in AI rather than those in the narrow field of AGI, though it also includes others, such as several dozen graduate and post-doctoral students. The 2016 ESPAI surveyed everyone publishing in two top machine learning conferences, so its participants are largely machine learning researchers.

The remaining participants appear to be mostly highly educated people from academia and other intellectual areas. The attendees at the 2011 Conference on Philosophy and Theory of AI appear to be a mixture of philosophers, AI researchers, and academics from related fields such as brain sciences. Bainbridge’s participants are contributors to ‘converging technology’ reports, on topics of nanotechnology, biotechnology, information technology, and cognitive science. From looking at what appears to be one of these reports, these seem to be mostly experts from government and national laboratories, academia, and the private sector. Few work in AI in particular. An arbitrary sample includes the Director of the Division of Behavioral and Cognitive Sciences at NSF, a person from the Defense Threat Reduction Agency, and a person from HP laboratories.

AGI researchers

As noted above, many survey participants work in AGI – the project to create general intelligent agents, as opposed to narrow AI applications. In general, we might expect people working on a given project to be unusually optimistic about its success, for two reasons. First, those who are most optimistic initially will more likely find the project worth investing in. Secondly, people are generally observed to be especially optimistic about the time needed for their own projects to succeed. So we might expect AGI researchers to be biased toward optimism, for these reasons.

On the other hand, AGI researchers are working on projects most closely related to human-level AI, so probably have the most relevant expertise.

Other AI researchers

Just as AGI researchers work on topics closer to human-level AI than other AI researchers – and so may be more biased but also more knowledgeable – AI researchers work on more relevant topics than everyone else. Similarly, we might expect them to be both more accurate due to their additional expertise and more biased due to selection effects and optimism about personal projects.

Hanson’s participants are experts in narrow AI fields, but are also reporting on progress in their own fields of narrow AI (rather than on general intelligence), so we might expect them to be more like the AGI researchers – especially expert and especially biased. On the other hand, Hanson asks about past progress rather than future expectations, which should diminish both the selection effect and the effect from the planning fallacy, so we might expect the bias to be weaker.

Definitions of human-level AI

A few different definitions of human-level AI are combined in this analysis.

The AGI-09 survey asked about four benchmarks; the one reported here is the Turing-test capable AI. Note that ‘Turing test capable’ seems to sometimes be interpreted as merely capable of holding a normal human discussion. It isn’t clear that the participants had the same definition in mind.

Kruel only asked that the AI be as good as humans at science, mathematics, engineering and programming, and asked conditional on favorable conditions continuing (e.g. no global catastrophes). Such AI might be expected to arrive before fully human-level AI.

Even where people talk about ‘human-level’ AI, they can mean a variety of different things. For instance, it is not clear whether a machine must operate at human cost to be ‘human-level’, or to what extent it must resemble a human.

At least three surveys use the acronym ‘HLMI’, but it can stand for either ‘human-level machine intelligence’ or ‘high level machine intelligence’ and is defined differently in different surveys.

Here is a full list of exact descriptions of something like ‘human-level’ used in the surveys:

  • Michie: ‘computing system exhibiting intelligence at adult human level’
  • Bainbridge: ‘The computing power and scientific knowledge will exist to build machines that are functionally equivalent to the human brain’
  • Klein: ‘When will AI surpass human-level intelligence?’
  • AI@50: ‘When will computers be able to simulate every aspect of human intelligence?’
  • FHI 2011: ‘Assuming no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of human-level machine intelligence? Feel free to answer ‘never’ if you believe such a milestone will never be reached.’
  • Müller and Bostrom: ‘[machine intelligence] that can carry out most human professions at least as well as a typical human’
  • Hanson: ‘human level abilities’ in a subfield (wording is probably not consistent, given the long term and informal nature of the poll)
  • AGI-09: ‘Passing the Turing test’
  • Kruel: Variants on, ‘Assuming beneficial political and economic development and that no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of artificial intelligence that is roughly as good as humans (or better, perhaps unevenly) at science, mathematics, engineering and programming?’
  • 2016 ESPAI (our emboldening)
    • Say we have ‘high level machine intelligence’ when unaided machines can accomplish every task better and more cheaply than human workers. Ignore aspects of tasks for which being a human is intrinsically advantageous, e.g. being accepted as a jury member. Think feasibility, not adoption.
    • Say an occupation becomes fully automatable when unaided machines can accomplish it better and more cheaply than human workers. Ignore aspects of occupations for which being a human is intrinsically advantageous, e.g. being accepted as a jury member. Think feasibility, not adoption.
    • Say we have reached ‘full automation of labor’ “when all occupations are fully automatable. That is, when for any occupation, machines could be built to carry out the task better and more cheaply than human workers.”

Inside vs. outside view methods

Hanson’s survey was unusual in that it asked participants for their impressions of past rates of progress, from which extrapolation could be made (an ‘outside view’ estimate), rather than asking directly about expected future rates of progress (an ‘inside view’ estimate). It also produced much later median dates for human-level AI, which may suggest that this outside view methodology in general produces much later estimates (rather than, for instance, Hanson’s small sample size and casual format just producing a noisy or biased estimate that happened to be late).

If so, this would be important because outside view estimates in general are often informative.

However, the 2016 ESPAI included a set of questions similar to Hanson’s, and did not at a glance find similarly long implied timelines, though the data has not been carefully analyzed. This is some evidence against the outside view style methodology systematically producing longer timelines, though arguably not enough to overturn the hypothesis.

We might expect Hanson’s outside view method to be especially useful in AI forecasting, because one of its key merits is that asking people about the past means asking questions more closely related to their expertise, and the future of AI is arguably especially far from anyone’s expertise (relative to, say, asking a dam designer how long it will take for their dam to be constructed). On the other hand, AI researchers’ expertise may include a lot of information about AI other than how far we have come, and translating what they have seen into what fraction of the way we have come may be difficult and thus introduce additional error.

Last modified: 2024/01/09 20:12 by harlanstewart