Interview on the strength of the evidence for AI risk claims with an anonymous AI alignment researcher
About anonymous AI alignment researcher
This researcher is an associate professor in computer science and works on the alignment team of a major AI lab.
They think there’s a 25% chance that AI causes human extinction by 2100, but an 85% chance that AI causes permanent human disempowerment by 2100.
Anonymous AI alignment researcher’s overall assessment of the evidence
The background model that makes this researcher believe that AI risk is so high is:
This researcher thinks that there are two stable states: runaway competition leading to extinction, or totalitarianism leading to value lock-in.
They also think we should expect values to get worse (relative to our current values) rather than to continue to improve.
Empirical evidence about AI capabilities is not very important to this researcher’s beliefs about AI risk.
This researcher thinks that the evidence for AI risk is weaker and more speculative than would usually be the case to motivate expensive policy interventions.
Reasons that this researcher is convinced of AI risk anyway include:
A track record of being well-calibrated
The social proof of accomplished people believing these claims
The weakness of counterarguments
Anonymous AI alignment researcher’s assessment of the evidence on particular AI capabilities
This researcher thinks that only small capability improvements (combined with a continuing drop in the price of compute) are required to cause a large economic shock.
AI systems can already be copied cheaply and learn in a distributed way.
This researcher is 95% confident that AI systems will improve in planning sufficiently to seriously marginalize humans.
Many similar problems have already been solved.
This researcher expects the amount of compute used to increase by a factor of a hundred or a thousand in the next few years.
So even without any algorithmic improvements, this researcher is 80% confident that these planning abilities will be achieved.
This researcher expects that the level of robotics required for humans to be seriously marginalized might take another 10 or 20 years to develop.