
Interviews on the strength of the evidence for AI risk claims

This is a collection of interviews with AI risk researchers on the strength of the evidence for AI risk.

Notes on interviewee selection

  • We contacted AI researchers we knew (or knew of) at prominent labs, at AI safety organizations, and in academia
    • This was not a systematic process, and we expect there to be some substantive bias introduced both by who we reached out to and who agreed to be interviewed
  • All of the people we interviewed are concerned about AI risk and spend some or all of their work time working to reduce it
    • On the one hand, this means that they are experts in the topic we’re interested in (the evidence for AI risk claims)
    • On the other hand, it also means that they have an incentive to interpret the evidence as stronger rather than weaker
    • The information we’ve gathered from the interview series should therefore be taken as a collection of evidence for AI risk claims, rather than as an unbiased review of the evidence on AI risk overall

The questions we asked

The core questions we asked were:

  1. Can you introduce yourself and the work you do?
  2. What’s your probability that AI causes human extinction by 2100?
  3. What evidence do you find most convincing for existential risk from AI?
  4. How would you summarize the state of the evidence for AI risk overall?

Most of the interview time was spent on question 3. We sought to ask follow-up questions covering misalignment (including specification gaming, goal misgeneralization, and deceptive alignment) and power-seeking (including self-preservation and self-improvement).

A list of the interviews

Note that these interviews represent researchers’ personal views and not the views of their employers.

(We are still waiting for permission to publish more interview summaries.)
