Summary:
This argument may work with different forms of AI superiority, such as:
People who probably endorse this argument include:
Paul Christiano (2014):
it becomes increasingly difficult for humans to directly control what happens in a world where nearly all productive work, including management, investment, and the design of new machines, is being done by machines.
Ajeya Cotra (2023):
This is an analogy of the position that humans might be in when they’re trying to train systems that are much smarter than them and more powerful than them, based on basically giving them rewards, or based on the human’s understanding of what’s going on.
Imagine an eight-year-old that has inherited a large fortune from his parents, like a trillion-dollar company or something. And he doesn’t really have any adult allies in his life that are really looking out for his best interests. He needs to find someone to manage all his affairs and run this company and keep it safe for him until he grows up and can take on leadership himself.
Because this is such a large prize, a whole bunch of adults might apply for this role. But if he has no existing adult allies, then it can be very difficult for him to tell — based on performance in things like interviews or work trials or even references — who actually is going to have his best interests at heart in the long run, versus is just totally capable of appearing reasonable in an interview process.
Richard Ngo (2024):
“One salient possibility is that AGIs use the types of deception described in the previous section to convince humans that it’s safe to deploy them widely, then leverage their positions to disempower humans…
AGIs deployed as personal assistants could emotionally manipulate human users, provide biased information to them, and be delegated responsibility for increasingly important tasks and decisions (including the design and implementation of more advanced AGIs), until they’re effectively in control of large corporations or other influential organizations. As an early example of AI persuasive capabilities, many users feel romantic attachments towards chatbots like Replika.”
Primary author: Katja Grace
Other authors: Nathan Young, Josh Hart
Suggested citation:
Grace, K., Young, N., Hart, J. (2024), Argument for AI x-risk from loss of control through inferiority, AI Impacts Wiki, https://wiki.aiimpacts.org/arguments_for_ai_risk/list_of_arguments_that_ai_poses_an_xrisk/argument_for_ai_x-risk_from_loss_of_control_through_inferiority