====== Affordances for AI labs ======

//Published 25 January 2023//

This is a list of actions AI labs could take that may be strategically relevant, along with consequences or characteristics of possible actions.

===== List =====

  * Deploy an AI system
  * Pursue AI capabilities
    * Pursue risky (and more or less alignable) systems
    * Pursue systems that enable risky (and more or less alignable) systems
    * Pursue weak AI that's mostly orthogonal to progress on risky systems, for a specific (strategically significant) task or goal
      * This could enable or abate catastrophic risks besides unaligned AI
  * Do alignment (and related) research (or: decrease the [[https://forum.effectivealtruism.org/posts/63stBTw3WAW6k45dY/paul-christiano-current-work-in-ai-alignment|alignment tax]] by doing technical research)
    * Including interpretability and work on solving or avoiding alignment-adjacent problems like [[https://www.lesswrong.com/posts/brXr7PJ2W4Na2EW2q/the-commitment-races-problem|decision theory and strategic interaction]] and maybe [[http://acritch.com/arches/|delegation involving multiple humans or multiple AI systems]]
  * Advance global capabilities
    * Publish capabilities research
    * Cause investment or spending in big AI projects to increase
  * Advance alignment (or: decrease the alignment tax) in ways other than doing technical research
    * Support and coordinate with external alignment researchers
  * Attempt to align a particular system (or: try to pay the alignment tax)
  * Interact with other labs
    * Coordinate with other labs (notably including coordinating to avoid risky systems)
      * Make themselves transparent to each other
      * Make themselves transparent to an external auditor
      * Merge
      * Effectively commit to share upsides
      * Effectively commit to [[https://openai.com/charter/|stop and assist]]
    * Affect what other labs believe on the object level (about AI capabilities or risk in general, or regarding particular memes)
      * Practice [[https://www.lesswrong.com/posts/vZzg8NS7wBtqcwhoJ/nearcast-based-deployment-problem-analysis#SelectiveInformationSharing|selective information sharing]]
      * Demonstrate AI risk (or provide evidence about it)
    * Negotiate with other labs, or affect other labs' incentives or meta-level beliefs
  * Affect public opinion, media, and politics
    * Publish research
    * Make demos or public statements
    * Release or deploy AI systems
  * Improve their culture or operations
    * Improve operational security
    * Affect attitudes of effective leadership
    * Affect attitudes of researchers
  * Make a plan for alignment (e.g., [[https://openai.com/blog/our-approach-to-alignment-research/|OpenAI's]]); share it; update and improve it; and coordinate with capabilities researchers, alignment researchers, or other labs if relevant
  * Make plans for what to do with powerful AI (e.g., a process for producing powerful aligned AI given some type of advanced AI system, or a specification for parties interacting peacefully)
  * Improve their ability to make themselves (selectively) transparent
  * Try to better understand the future, the strategic landscape, risks, and possible actions
  * Acquire resources (money, hardware, talent, influence over states, status/prestige/trust, etc.)
  * Affect other actors' resources
    * Affect the flow of talent between labs or between projects
  * Plan, execute, or participate in [[https://arbital.com/p/pivotal/|pivotal acts]] or [[https://www.lesswrong.com/posts/etNJcXCsKC6izQQZj/pivotal-outcomes-and-pivotal-processes|processes]]
  * Capture scarce resources
    * E.g., language data from language model users

//Primary author: Zach Stein-Perlman//