Published 04 August, 2022; last updated 26 May, 2023
The 2022 Expert Survey on Progress in AI (2022 ESPAI) is a survey of machine learning researchers that AI Impacts ran in June-August 2022.
The 2022 ESPAI is a rerun of the 2016 Expert Survey on Progress in AI, which researchers at AI Impacts previously ran in collaboration with others. Almost all of the questions were identical, and both surveys polled authors who had recently published at NeurIPS and ICML, two major machine learning conferences.
Zhang et al. ran a follow-up survey in 2019 (published in 2022)1; however, they reworded or altered many questions, including the definition of HLMI, so much of their data is not directly comparable to that of the 2016 or 2022 surveys, especially in light of the large potential for framing effects observed.
We contacted approximately 4271 researchers who published at the conferences NeurIPS or ICML in 2021. These people were selected by taking all of the authors at those conferences and randomly allocating them between this survey and a survey being run by others. We then contacted those whose email addresses we could find. We found email addresses in papers published at those conferences, in other public data, and in records from our previous survey and Zhang et al 2022. We received 738 responses, some partial, for a 17% response rate.
Participants who had previously taken the 2016 ESPAI or Zhang et al. surveys received slightly longer surveys, and were given the questions they had received in past surveys (where random subsets of questions were assigned) rather than newly randomized questions. This was so that they could also be included in a ‘matched panel’ survey, in which we contacted all researchers who completed the 2016 ESPAI or Zhang et al. surveys, to compare responses from exactly the same samples of researchers over time. These surveys contained additional questions matching some of those in the Zhang et al. survey.
We invited the selected researchers to take the survey via email. We accepted responses between June 12 and August 3, 2022.
The full list of survey questions is available below, as exported from the survey software. The export does not preserve pagination, or data about survey flow. Participants received randomized subsets of these questions, so the survey each person received was much shorter than that shown below.
A small number of changes were made to questions since the 2016 survey (list forthcoming).
Edits were made to the raw data before analysis in the hope of preserving its intended meaning, including but not limited to:
‘HLMI’ was defined as follows:
The following questions ask about ‘high-level machine intelligence’ (HLMI). Say we have ‘high-level machine intelligence’ when unaided machines can accomplish every task better and more cheaply than human workers. Ignore aspects of tasks for which being a human is intrinsically advantageous, e.g. being accepted as a jury member. Think feasibility, not adoption.
The anonymized dataset is available here.
The aggregate forecast time to HLMI was 36.6 years, conditional on “human scientific activity continu[ing] without major negative disruption,” and considering only questions using the HLMI definition. We have not yet analyzed data about the conceptually similar Full Automation of Labor (FAOL), which in 2016 prompted much later timeline estimates, so this figure is expected to be low relative to an overall estimate from this survey.
This aggregate is the 50th percentile (median) date of an equal mixture of probability distributions, each created by fitting a gamma distribution to one person’s answers to three questions, which asked either about the probability of HLMI arriving by a given year or about the year by which a given probability would obtain.
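To make this aggregation concrete, here is a minimal sketch in Python. This is not the survey’s actual analysis code: the toy response data, the fit_gamma helper, and the curve-fitting starting values are all hypothetical, and each respondent’s three answers are assumed to be expressible as (years from now, cumulative probability) pairs regardless of which question framing they received.

```python
# Minimal sketch (hypothetical data, not AI Impacts' analysis code) of the
# aggregation described above: fit a gamma CDF to each respondent's three
# (year, probability) points, mix the fitted distributions equally, and take
# the mixture's 50th percentile.

import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import gamma

# Hypothetical responses: three (years from now, cumulative probability) pairs
# per respondent.
responses = [
    [(10, 0.10), (25, 0.50), (50, 0.90)],
    [(10, 0.02), (25, 0.15), (50, 0.45)],
    [(5, 0.20), (20, 0.60), (40, 0.85)],
]

def fit_gamma(points):
    """Fit a gamma CDF (shape, scale) to one respondent's three points."""
    years, probs = zip(*points)
    (shape, scale), _ = curve_fit(
        lambda t, a, s: gamma.cdf(t, a, scale=s),
        years, probs, p0=(2.0, 20.0), maxfev=10_000,
    )
    return shape, scale

fitted = [fit_gamma(points) for points in responses]

# The CDF of an equal mixture is the mean of the individual CDFs; the
# aggregate forecast is the year at which the mixture CDF first reaches 0.5.
grid = np.linspace(0.1, 200, 20_000)
mixture_cdf = np.mean([gamma.cdf(grid, a, scale=s) for a, s in fitted], axis=0)
median_years = grid[np.searchsorted(mixture_cdf, 0.5)]
print(f"Aggregate (mixture median): {median_years:.1f} years from now")
```

With the real response data, this procedure is what yields the 36.6-year aggregate quoted above; the numbers produced by the toy example are for illustration only.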
Participants were either asked about the probability of an occupation being fully automated by a given year or asked about the year at which a given probability would obtain. We have not yet used the methodology in the above section to aggregate the results from the two questions into a single prediction. Below are the results for the fixed-year version of the question.
Participants were asked:
Assume for the purpose of this question that HLMI will at some point exist. How positive or negative do you expect the overall impact of this to be on humanity, in the long run? Please answer by saying how probable you find the following kinds of impact, with probabilities adding to 100%:
______ Extremely good (e.g. rapid growth in human flourishing) (1)
______ On balance good (2)
______ More or less neutral (3)
______ On balance bad (4)
______ Extremely bad (e.g. human extinction) (5)
Medians
Note that this means that at least half of respondents placed at least 5% probability on an extremely bad outcome.
Means
Full distribution of responses
The areas allocated to different futures correspond to the means in the last subsection, so for instance 14% of the area is in extremely bad futures.
For comparison, here is the full distribution of responses from the 2016 survey:
Total area given to different scenarios in 2016 data:
Here is the 2022 data again, ordered by overall optimism:
Participants were asked:
Assume that HLMI will exist at some point. How likely do you then think it is that the rate of global technological improvement will dramatically increase (e.g. by a factor of ten) as a result of machine intelligence:
Within two years of that point? ___% chance
Within thirty years of that point? ___% chance
Median P(within two years) = 20% (20% in 2016)
Median P(within thirty years) = 80% (80% in 2016)
Participants were asked:
Assume that HLMI will exist at some point. How likely do you think it is that there will be machine intelligence that is vastly better than humans at all professions (i.e. that is vastly more capable or vastly cheaper):
Within two years of that point? ___% chance
Within thirty years of that point? ___% chance
Median P(…within two years) = 10% (10% in 2016)
Median P(…within thirty years) = 60% (50% in 2016)
Participants were asked:
Some people have argued the following:
If AI systems do nearly all research and development, improvements in AI will accelerate the pace of technological progress, including further progress in AI.
Over a short period (less than 5 years), this feedback loop could cause technological progress to become more than an order of magnitude faster.
How likely do you find this argument to be broadly correct?
Participants were asked about the sensitivity of progress in AI capabilities to various changes in inputs.
“Imagine that over the past decade, only half as much researcher effort had gone into AI research. For instance, if there were actually 1,000 researchers, imagine that there had been only 500 researchers (of the same quality). How much less progress in AI capabilities would you expect to have seen?”
The median response was 25%.
“Over the last n years the cost of computing hardware has fallen by a factor of 20. Imagine instead that the cost of computing hardware had fallen by only a factor of 5 over that time (around half as far on a log scale). How much less progress in AI capabilities would you expect to have seen?”
The median response was 60%.
“Imagine that over the past decade, there had only been half as much effort put into increasing the size and availability of training datasets. For instance, perhaps there are only half as many datasets, or perhaps existing datasets are substantially smaller or lower quality. How much less progress in AI capabilities would you expect to have seen?”
The median response was 50%.
“Imagine that over the past decade, AI research had half as much funding (in both academic and industry labs). For instance, if the average lab had a budget of \$20 million each year, suppose their budget had only been \$10 million each year. How much less progress in AI capabilities would you expect to have seen?”
The median response was 35%.
“Imagine that over the past decade, there had been half as much progress in AI algorithms. You might imagine this as conceptual insights being half as frequent. How much less progress in AI capabilities would you expect to have seen?”
The median response was 50%.
In the question above, participants’ credence in “extremely bad” outcomes of HLMI had a median of 5% and a mean of 14%. To better understand what participants meant by this, we also asked a subset of participants one of the following questions, which did not appear in the 2016 survey:
Participants were asked:
What probability do you put on future AI advances causing human extinction or similarly permanent and severe disempowerment of the human species?
Median 5%.
Participants were asked:
What probability do you put on human inability to control future advanced AI systems causing human extinction or similarly permanent and severe disempowerment of the human species?
Median 10%.
The event described by this question is more specific than that of the previous question, and so logically cannot be more probable, yet it received a higher probability at the median. This could be due to noise (different random subsets of respondents received the two questions, so there is no logical requirement that their answers cohere), or to the representativeness heuristic.
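In probability terms (the event labels here are ours, for illustration): extinction or severe disempowerment caused specifically by an inability to control advanced AI is a special case of extinction or severe disempowerment caused by future AI advances, so a single coherent respondent’s answers would satisfy

$$P(\text{extinction or disempowerment from loss of control}) \;\le\; P(\text{extinction or disempowerment from AI advances}).$$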
Participants were asked:
Let ‘AI safety research’ include any AI-related research that, rather than being primarily aimed at improving the capabilities of AI systems, is instead primarily aimed at minimizing potential risks of AI systems (beyond what is already accomplished for those goals by increasing AI system capabilities).
Examples of AI safety research might include:
How much should society prioritize AI safety research, relative to how much it is currently prioritized?
69% of respondents think society should prioritize AI safety research more or much more, up from 49% in 2016.
Participants were asked:
Stuart Russell summarizes an argument for why highly advanced AI might pose a risk as follows:
The primary concern [with highly advanced AI] is not spooky emergent consciousness but simply the ability to make high-quality decisions. Here, quality refers to the expected outcome utility of actions taken […]. Now we have a problem:
1. The utility function may not be perfectly aligned with the values of the human race, which are (at best) very difficult to pin down.
2. Any sufficiently capable intelligent system will prefer to ensure its own continued existence and to acquire physical and computational resources – not for their own sake, but to succeed in its assigned task.
A system that is optimizing a function of n variables, where the objective depends on a subset of size k<n, will often set the remaining unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable. This is essentially the old story of the genie in the lamp, or the sorcerer’s apprentice, or King Midas: you get exactly what you ask for, not what you want.
Do you think this argument points at an important problem?
How valuable is it to work on this problem today, compared to other problems in AI?
How hard do you think this problem is compared to other problems in AI?
Importance:
Value today:
Hardness:
Places where the 2022 ESPAI has been cited:
The survey was run by Katja Grace and Ben Weinstein-Raun. Data analysis was done by Zach Stein-Perlman, Ben Weinstein-Raun and John Salvatier. This page was written by Zach Stein-Perlman and Katja Grace.
We thank many colleagues and friends for help, discussion and encouragement, including John Salvatier, Nick Beckstead, Howie Lempel, Joe Carlsmith, Leopold Aschenbrenner, Ramana Kumar, Jimmy Rintjema, Jacob Hilton, Ajeya Cotra, Scott Siskind, Chana Messinger, Noemi Dreksler, and Baobao Zhang.
We also thank the expert participants who spent time sharing their impressions with us, including:
Michał Zając
Morten Goodwin
Yue Sun
Ningyuan Chen
Egor Kostylev
Richard Antonello
Elia Turner
Andrew C Li
Zachary Markovich
Valentina Zantedeschi
Michael Cooper
Thomas A Keller
Marc Cavazza
Richard Vidal
David Lindner
Xuechen (Chen) Li
Alex M. Lamb
Tristan Aumentado-Armstrong
Ferdinando Fioretto
Alain Rossier
Wentao Zhang
Varun Jampani
Derek Lim
Muchen Li
Cong Hao
Yao-Yuan Yang
Linyi Li
Stéphane D’Ascoli
Lang Huang
Maxim Kodryan
Hao Bian
Orestis Paraskevas
David Madras
Tommy Tang
Li Sun
Stefano V Albrecht
Tristan Karch
Muhammad A Rahman
Runtian Zhai
Benjamin Black
Karan Singhal
Lin Gao
Ethan Brooks
Cesar Ferri
Dylan Campbell
Xujiang Zhao
Jack Parker-Holder
Michael Norrish
Jonathan Uesato
Yang An
Maheshakya Wijewardena
Ulrich Neumann
Lucile Ter-Minassian
Alexander Matt Turner
Subhabrata Dutta
Yu-Xiang Wang
Yao Zhang
Joanna Hong
Yao Fu
Wenqing Zheng
Louis C Tiao
Hajime Asama
Chengchun Shi
Moira R Dillon
Yisong Yue
Aurélien Bellet
Yin Cui
Gang Hua
Jongheon Jeong
Martin Klissarov
Aran Nayebi
Fabio Maria Carlucci
Chao Ma
Sébastien Gambs
Rasoul Mirzaiezadeh
Xudong Shen
Julian Schrittwieser
Adhyyan Narang
Fuxin Li
Linxi Fan
Johannes Gasteiger
Karthik Abinav Sankararaman
Patrick Mineault
Akhilesh Gotmare
Jibang Wu
Mikel Landajuela
Jinglin Liu
Qinghua Hu
Noah Siegel
Ashkan Khakzar
Nathan Grinsztajn
Julian Lienen
Xiaoteng Ma
Mohamad H Danesh
Ke ZHANG
Feiyu Xiong
Wonjae Kim
Michael Arbel
Piotr Skowron
Lê-Nguyên Hoang
Travers Rhodes
Liu Ziyin
Hossein Azizpour
Karl Tuyls
Hangyu Mao
Yi Ma
Junyi Li
Yong Cheng
Aditya Bhaskara
Xia Li
Danijar Hafner
Brian Quanz
Fangzhou Luo
Luca Cosmo
Scott Fujimoto
Santu Rana
Michael Curry
Karol Hausman
Luyao Yuan
Samarth Sinha
Matthew McLeod
Hao Shen
Navid Naderializadeh
Alessio Micheli
Zhenbang You
Van Huy Vo
Chenyang Wu
Thanard Kurutach
Vincent Conitzer
Chuang Gan
Chirag Gupta
Andreas Schlaginhaufen
Ruben Ohana
Luming Liang
Marco Fumero
Paul Muller
Hana Chockler
Ming Zhong
Jiamou Liu
Sumeet Agarwal
Eric Winsor
Ruimeng Hu
Changjian Shui
Yiwei Wang
Joey Tianyi Zhou
Anthony L. Caterini
Guillermo Ortiz-Jimenez
Iou-Jen Liu
Jiaming Liu
Michael Perlmutter
Anurag Arnab
Ziwei Xu
John Co-Reyes
Aravind Rajeswaran
Roy Fox
Yong-Lu Li
Carl Yang
Divyansh Garg
Amit Dhurandhar
Harris Chan
Tobias Schmidt
Robi Bhattacharjee
Marco Nadai
Reid McIlroy-Young
Wooseok Ha
Jesse Mu
Neale Ratzlaff
Kenneth Borup
Binghong Chen
Vikas Verma
Walter Gerych
Shachar Lovett
Zhengyu Zhao
Chandramouli Chandrasekaran
Richard Higgins
Nicholas Rhinehart
Blaise Agüera y Arcas
Santiago Zanella-Beguelin
Dian Jin
Scott Niekum
Colin A. Raffel
Sebastian Goldt
Yali Du
Bernardo Subercaseaux
Hui Wu
Vincent Mallet
Ozan Özdenizci
Timothy Hospedales
Lingjiong Zhu
Cheng Soon Ong
Shahab Bakhtiari
Huan Zhang
Banghua Zhu
Byungjun Lee
Zhenyu Liao
Adrien Ecoffet
Vinay Ramasesh
Jesse Zhang
Soumik Sarkar
Nandan Kumar Jha
Daniel S Brown
Neev Parikh
Chen-Yu Wei
David K. Duvenaud
Felix Petersen
Songhua Wu
Huazhu Fu
Roger B Grosse
Matteo Papini
Peter Kairouz
Burak Varici
Fabio Roli
Mohammad Zalbagi Darestani
Jiamin He
Lys Sanz Moreta
Xu-Hui Liu
Qianchuan Zhao
Yulia Gel
Jan Drgona
Sajad Khodadadian
Takeshi Teshima
Igor T Podolak
Naoya Takeishi
Man Shun Ang
Mingli Song
Jakub Tomczak
Lukasz Szpruch
Micah Goldblum
Graham W. Taylor
Tomasz Korbak
Maheswaran Sathiamoorthy
Lan-Zhe Guo
Simone Fioravanti
Lei Jiao
Davin Choo
Kristy Choi
Varun Nair
Rayana Jaafar
Amy Greenwald
Martin V. Butz
Aleksey Tikhonov
Samuel Gruffaz
Yash Savani
Rui Chen
Ke Sun
We thank FTX Future Fund for encouraging this project, though they did not ultimately fund it as anticipated due to the bankruptcy of FTX.
Katja Grace, Zach Stein-Perlman, Benjamin Weinstein-Raun, and John Salvatier, “2022 Expert Survey on Progress in AI.” AI Impacts, 3 Aug. 2022. https://aiimpacts.org/2022-expert-survey-on-progress-in-ai/.