Published 06 August, 2022
This page is incomplete and under active work, and may be updated soon.
This is a bibliography of pieces arguing that AI poses an existential risk.
Adamczewski, Tom. “A Shift in Arguments for AI Risk.” Fragile Credences. Accessed October 20, 2020. https://fragile-credences.github.io/prioritising-ai/.
Amodei, Dario, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. “Concrete Problems in AI Safety.” ArXiv:1606.06565 [Cs], July 25, 2016. http://arxiv.org/abs/1606.06565.
Bensinger, Rob, Eliezer Yudkowsky, Richard Ngo, Nate Soares, Holden Karnofsky, Ajeya Cotra, Carl Shulman, and Rohin Shah. “2021 MIRI Conversations.” LessWrong. Accessed August 6, 2022. https://www.lesswrong.com/s/n945eovrA3oDueqtq.
Bostrom, Nick. Superintelligence: Paths, Dangers, Strategies. Oxford University Press, 2014.
Carlsmith, Joseph. “Is Power-Seeking AI an Existential Risk? [Draft].” Open Philanthropy Project, April 2021. https://docs.google.com/document/d/1smaI1lagHHcrhoi6ohdq3TYIZv0eNWWZMPEy8C8byYg/edit?usp=embed_facebook.
Christian, Brian. The Alignment Problem: Machine Learning and Human Values. W. W. Norton & Company, 2021.
Christiano, Paul. “What Failure Looks Like.” AI Alignment Forum (blog), March 17, 2019. https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like.
Dai, Wei. Comment on “Disentangling Arguments for the Importance of AI Safety.” LessWrong. Accessed December 9, 2021. https://www.lesswrong.com/posts/JbcWQCxKWn3y49bNB/disentangling-arguments-for-the-importance-of-ai-safety.
Hubinger, Evan, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, and Scott Garrabrant. “Risks from Learned Optimization in Advanced Machine Learning Systems,” June 5, 2019. https://arxiv.org/abs/1906.01820v3.
Ngo, Richard. “Disentangling Arguments for the Importance of AI Safety.” Thinking Complete (blog), January 21, 2019. http://thinkingcomplete.blogspot.com/2019/01/disentangling-arguments-for-importance.html. (Also on LessWrong and the Alignment Forum, with relevant comment threads.)
Ngo, Richard. “AGI Safety from First Principles,” September 28, 2020. https://www.lesswrong.com/s/mzgtmmTKKn5MuCzFJ.
Ord, Toby. The Precipice: Existential Risk and the Future of Humanity. Illustrated Edition. New York: Hachette Books, 2020.
Piper, Kelsey. “The Case for Taking AI Seriously as a Threat to Humanity.” Vox, December 21, 2018. https://www.vox.com/future-perfect/2018/12/21/18126576/ai-artificial-intelligence-machine-learning-safety-alignment.
Russell, Stuart. Human Compatible: Artificial Intelligence and the Problem of Control. Viking, 2019.
Turner, Alexander Matt, Logan Smith, Rohin Shah, Andrew Critch, and Prasad Tadepalli. “Optimal Policies Tend to Seek Power.” ArXiv:1912.01683 [Cs], December 3, 2021. http://arxiv.org/abs/1912.01683.
Yudkowsky, Eliezer. “Artificial Intelligence as a Positive and Negative Factor in Global Risk.” In Global Catastrophic Risks, edited by Nick Bostrom and Milan M. Ćirković, 308–345. New York: Oxford University Press, 2008. https://intelligence.org/files/AIPosNegFactor.pdf.
Yudkowsky, Eliezer, Rob Bensinger, and Nate Soares. “2022 MIRI Alignment Discussion.” LessWrong. Accessed August 6, 2022. https://www.lesswrong.com/s/v55BhXbpJuaExkpcD.
Yudkowsky, Eliezer, and Robin Hanson. “The Hanson-Yudkowsky AI-Foom Debate.” LessWrong. Accessed August 6, 2022. https://www.lesswrong.com/tag/the-hanson-yudkowsky-ai-foom-debate.
Garfinkel, Ben, Miles Brundage, Daniel Filan, Carrick Flynn, Jelena Luketina, Michael Page, Anders Sandberg, Andrew Snyder-Beattie, and Max Tegmark. “On the Impossibility of Supersized Machines.” ArXiv:1703.10987 [Physics], March 31, 2017. http://arxiv.org/abs/1703.10987.
Primary author: Katja Grace