====== Glossary of AI Risk Terminology and common AI terms ======

// Published 30 October, 2015; last updated 21 January, 2022 //

===== Terms =====

==== A ====

=== AI timeline ===

An expectation about how much time will lapse before important AI events, especially the advent of human-level AI.

=== Artificial General Intelligence (also, AGI) ===

An AI that can perform well across a wide range of cognitive domains, rather than only on narrow tasks.

=== Artificial Intelligence (also, AI) ===

The endeavor of building machines that perform tasks normally thought to require intelligence; also, such a machine itself.

=== Associative value accretion ===

A hypothesized approach to value learning in which the AI acquires values using some machinery for synthesizing appropriate new values as it interacts with its environment, similar to that which humans appear to use (Bostrom 2014).

=== Anthropic capture ===

A hypothesized control method in which the AI thinks it might be in a simulation, and so tries to behave in ways that will be rewarded by its simulators (Bostrom 2014, p134).

=== Anthropic reasoning ===

Reasoning that takes into account observation selection effects: the fact that one’s evidence about the world is conditioned on one’s existing to observe it.

=== Augmentation ===

An approach to obtaining a superintelligence with desirable motives that consists of beginning with a creature with desirable motives (eg, a human), then making it smarter, instead of designing good motives from scratch (Bostrom 2014, p142).

==== B ====

=== Backpropagation ===

A fast method of computing the derivative of cost with respect to different parameters in a network, allowing for training neural nets through gradient descent.

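For illustration, a minimal sketch of backpropagation and gradient descent on a hypothetical one-hidden-layer network (the data, architecture, and learning rate here are invented for the example):

<code python>
import numpy as np

# Toy one-hidden-layer network trained by gradient descent.
# Backpropagation computes the derivative of the cost with respect
# to each weight by applying the chain rule backwards through the layers.

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # 100 examples, 3 inputs
y = X @ np.array([[1.0], [-2.0], [0.5]])       # target outputs

W1 = rng.normal(scale=0.1, size=(3, 8))        # input -> hidden weights
W2 = rng.normal(scale=0.1, size=(8, 1))        # hidden -> output weights

for step in range(500):
    # Forward pass
    h = np.tanh(X @ W1)                        # hidden activations
    pred = h @ W2                              # network output
    err = pred - y                             # d(cost)/d(pred) for 0.5*err^2 cost

    # Backward pass (the chain rule, output layer first)
    dW2 = h.T @ err / len(X)
    dh = err @ W2.T                            # gradient at the hidden layer
    dW1 = X.T @ (dh * (1 - h ** 2)) / len(X)   # tanh'(a) = 1 - tanh(a)^2

    # Gradient descent step
    W1 -= 0.5 * dW1
    W2 -= 0.5 * dW2

print("mean squared error:", float(np.mean(err ** 2)))
</code>
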
+ | |||
+ | |||
+ | === Boxing === | ||
+ | |||
+ | |||
+ | < | ||
+ | <p>A control method that consists of constructing the AI’s environment so as to minimize interaction between the AI and the outside world. (Bostrom 2014, p129).</ | ||
+ | </ | ||
+ | |||
+ | |||
==== C ====

=== Capability control methods ===

Strategies for avoiding undesirable outcomes by limiting what an AI can do, rather than by shaping what it wants to do (Bostrom 2014).

=== Cognitive enhancement ===

Improvement of a mind’s cognitive capacities; for instance, an increase in a human’s intelligence.

=== Collective superintelligence ===

A system composed of a large number of smaller intellects such that the system’s overall performance across many very general domains vastly outstrips that of any current cognitive system (Bostrom 2014).

=== Computation ===

A sequence of mechanical operations intended to shed light on something other than this mechanical process itself, through an established relationship between the process and the object of interest.

=== The common good principle ===

The principle that superintelligence should be developed only for the benefit of all of humanity and in the service of widely shared ethical ideals (Bostrom 2014).

=== Crucial consideration ===

An idea with the potential to change our views substantially, for instance by reversing the sign of the desirability of important interventions.

==== D ====

=== Decisive strategic advantage ===

A level of technological and other advantages sufficient to enable complete world domination (Bostrom 2014).

=== Direct specification ===

An approach to the control problem in which the programmers figure out what humans value, and code it into the AI (Bostrom 2014, p139-40).

=== Domesticity ===

An approach to the control problem in which the AI is given goals that limit the range of things it wants to interfere with (Bostrom 2014, p140-1).

==== E ====

=== Emulation modulation ===

A hypothesized approach to the value loading problem that begins with whole brain emulations with roughly human motivations, then modifies those motivations, for instance with digital analogs of drugs (Bostrom 2014).

=== Evolutionary selection approach to value learning ===

A hypothesized approach to the value learning problem which obtains an AI with desirable values by iterative selection, the same way evolutionary selection produced humans (Bostrom 2014, p187-8).

=== Existential risk ===

A risk that threatens the premature extinction of Earth-originating intelligent life, or the permanent and drastic destruction of its potential for desirable future development (Bostrom 2014).

==== F ====

=== Feature ===

A dimension in the vector space of activations in a single layer of a neural network (i.e. a neuron activation or a linear combination of activations of different neurons).

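A small sketch of the idea (the activations and the direction here are made-up numbers):

<code python>
import numpy as np

# Activations of one layer of a network on a single input
# (hypothetical values for illustration).
layer = np.array([0.2, 1.5, -0.3, 0.8])

# A single neuron is the special case of a one-hot direction:
neuron_feature = layer @ np.array([0.0, 1.0, 0.0, 0.0])   # = layer[1]

# More generally, a feature is any direction in activation space,
# i.e. a linear combination of neuron activations:
direction = np.array([0.5, 0.5, -0.7, 0.1])
feature_activation = layer @ direction

print(neuron_feature, feature_activation)   # 1.5 1.14
</code>
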
+ | |||
+ | |||
+ | === First principal-agent problem === | ||
+ | |||
+ | |||
+ | < | ||
+ | < | ||
+ | </ | ||
+ | |||
+ | |||
==== G ====

=== Genie ===

An AI that carries out a high-level command, then waits for another (Bostrom 2014, p148).

==== H ====

=== Hardware overhang ===

A situation where large amounts of hardware being used for other purposes become available for AI, usually posited to occur when AI reaches human-level capabilities.

=== Human-level AI ===

An AI that matches human capabilities in virtually every domain of interest. Note that this term is used ambiguously (cf. human-level hardware and human-level software below).

=== Human-level hardware ===

Hardware whose computing capacity is roughly equivalent to that of a human brain.

=== Human-level software ===

Software whose algorithms match the efficiency of the human brain’s, so that running it on human-level hardware would produce human-level AI.

==== I ====

=== Impersonal perspective ===

The perspective from which we care about everyone who might exist, including possible future people, rather than only those who already exist (cf. the person-affecting perspective) (Bostrom 2014).

=== Incentive methods ===

A control method that consists of shaping the AI’s environment such that it is in the AI’s interests to behave well, for instance because it can be rewarded or punished by other powerful agents (Bostrom 2014).

=== Incentive wrapping ===

Provisions added to an AI’s goal content that allocate extra rewards to those who contributed to bringing the AI about (Bostrom 2014).

=== Indirect normativity ===

An approach to the control problem in which we specify a way to specify what we value, instead of specifying what we value directly (Bostrom 2014, p141-2).

=== Instrumental convergence thesis ===

The thesis that we can identify ‘convergent instrumental values’: subgoals that are useful for a wide range of more fundamental goals, and in a wide range of situations (Bostrom 2014, p109).

=== Intelligence explosion ===

A hypothesized event in which an AI rapidly improves from ‘relatively modest’ to superhuman level (usually imagined to be as a result of recursive self-improvement).

==== M ====

=== Macrostructural development accelerator ===

An imagined lever used in thought experiments which speeds up or slows down the large-scale features of history (e.g. technological change, geopolitical dynamics) while leaving the small-scale features the same.

=== Mind crime ===

The mistreatment of morally relevant computational processes, for instance running conscious simulations of people and then harming or deleting them (Bostrom 2014).

=== Moore’s Law ===

The observation that the number of transistors that can be fit on a chip (or, loosely, some related measure of computing performance) has doubled roughly every eighteen months to two years for several decades.

=== Moral rightness (MR) AI ===

An AI which seeks to do what is morally right.

=== Motivational scaffolding ===

A hypothesized approach to value learning in which the seed AI is given simple goals, and these goals are replaced with more complex ones once it has developed sufficiently sophisticated representational structure (Bostrom 2014, p191-192).

=== Multipolar outcome ===

A situation after the arrival of superintelligence in which no single agent controls most of the resources.

==== O ====

=== Optimization power ===

The quantity of effort being applied to improving a system. In Bostrom’s model of an intelligence explosion, the rate of change of an AI’s intelligence equals the optimization power applied to it divided by its recalcitrance (Bostrom 2014).

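A minimal sketch of this growth model (an illustration under strong assumptions, not a prediction): if a system’s own intelligence supplies the optimization power and recalcitrance stays constant, intelligence grows exponentially.

<code python>
# Euler integration of Bostrom-style growth:
#   dI/dt = optimization_power / recalcitrance
# Here the optimization power comes from the system itself and is
# assumed proportional to its intelligence (hypothetical units).

R = 5.0      # constant recalcitrance
I = 1.0      # initial intelligence
dt = 0.01

for _ in range(1000):                 # simulate 10 time units
    optimization_power = I            # self-improvement effort scales with I
    I += dt * optimization_power / R

print(I)     # ~7.4, i.e. roughly e**(10/5): exponential growth
</code>
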
+ | |||
+ | |||
+ | === Oracle === | ||
+ | |||
+ | |||
+ | < | ||
+ | <p>An AI that only answers questions (Bostrom 2014, p145).</ | ||
+ | </ | ||
+ | |||
+ | |||
=== Orthogonality thesis ===

The thesis that intelligence and final goals are orthogonal axes: more or less any level of intelligence could in principle be combined with more or less any final goal (Bostrom 2014).

==== P ====

=== Person-affecting perspective ===

The perspective from which we care only about people who already exist, or who will exist independently of our choices (cf. the impersonal perspective) (Bostrom 2014).

=== Perverse instantiation ===

A solution to a posed goal (eg, make humans smile) that is destructive in unforeseen ways (eg, paralyzing face muscles in the smiling position).

=== Price-Performance Moore’s Law ===

A variant of Moore’s Law concerning cost rather than transistor density: the amount of computing power available at a given price has grown exponentially, doubling roughly every few years.

=== Principle of differential technological development ===

The principle that we should retard the development of dangerous and harmful technologies, especially ones that raise the level of existential risk, and accelerate the development of beneficial technologies, especially those that reduce the existential risks posed by nature or by other technologies (Bostrom 2014).

=== Principle of epistemic deference ===

The principle that a future superintelligence occupies an epistemically superior vantage point: its beliefs are (probably, on most topics) more likely than ours to be true, so we should defer to its opinion whenever feasible (Bostrom 2014).

==== Q ====

=== Quality superintelligence ===

A system that is at least as fast as a human mind and vastly qualitatively smarter (Bostrom 2014).

==== R ====

=== Recalcitrance ===

The resistance of a system to being improved: how much optimization power is needed to produce a given increase in the system’s intelligence (Bostrom 2014).

=== Recursive self-improvement ===

The process of an AI improving its own intelligence, thereby becoming better at making further improvements; often posited as the mechanism behind an intelligence explosion.

=== Reinforcement learning approach to value learning ===

A hypothesized approach to value learning in which the AI is rewarded for behaviors that more closely approximate human values (Bostrom 2014, p188-9).

==== S ====

=== Second principal-agent problem ===

The problem faced by an AI project (the principal) of ensuring that the superintelligent AI it creates (the agent) acts in its interests, in contrast with the first principal-agent problem, which holds between humans (Bostrom 2014).

=== Seed AI ===

A modest AI which can bootstrap into an impressive AI by improving its own architecture.

=== Singleton ===

An agent that is internally coordinated and has no opponents.

=== Sovereign ===

An AI that acts autonomously in the world, in pursuit of potentially long-range objectives (Bostrom 2014, p148).

=== Speed superintelligence ===

A system that can do all that a human intellect can do, but much faster (Bostrom 2014).

=== State risk ===

A risk that comes from being in a certain state, such that the amount of risk is a function of the time spent there. For example, the state of not having the technology to defend from asteroid impacts carries risk proportional to the time we spend in it.

=== Step risk ===

A risk that comes from making a transition. Here the amount of risk is not a simple function of how long the transition takes. For example, traversing a minefield is not safer if done more quickly.

=== Stunting ===

A control method that consists of limiting the AI’s capabilities, for instance by limiting its access to information or its hardware (Bostrom 2014).

=== Superintelligence ===

Any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest (Bostrom 2014).

==== T ====

=== Takeoff ===

The transition from human-level intelligence to superintelligence. Takeoffs are often classed by speed: slow (decades or centuries), moderate (months or years), or fast (minutes to days) (Bostrom 2014).

=== Technological completion conjecture ===

If scientific and technological development efforts do not cease, then all important basic capabilities that could be obtained through some possible technology will be obtained (Bostrom 2014, p127).

=== Technology coupling ===

A predictable timing relationship between two technologies, such that hastening the development of one will predictably hasten the development of the other (Bostrom 2014).

=== Tool AI ===

An AI that is not ‘like an agent’, but like a more flexible and capable version of contemporary software. Most notably perhaps, it is not goal-directed (Bostrom 2014, p151).

==== U ====

=== Utility function ===

A mapping from states of the world to real numbers (‘utilities’), representing how desirable each state is; an agent with a utility function chooses actions so as to maximize its expected utility.

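A minimal sketch of the concept (the states, probabilities, and utilities are invented for the example):

<code python>
# A utility function maps world states to real numbers; an
# expected-utility maximizer picks the action whose possible
# outcomes have the highest probability-weighted utility.

utility = {
    "sunny_picnic": 10.0,
    "rainy_picnic": -5.0,
    "sunny_home": 2.0,
    "rainy_home": 1.0,
}

# P(state | action), with a 30% chance of rain:
outcomes = {
    "go_on_picnic": {"sunny_picnic": 0.7, "rainy_picnic": 0.3},
    "stay_home": {"sunny_home": 0.7, "rainy_home": 0.3},
}

def expected_utility(action):
    return sum(p * utility[s] for s, p in outcomes[action].items())

best = max(outcomes, key=expected_utility)
print(best, expected_utility(best))   # go_on_picnic 5.5
</code>
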
+ | |||
+ | |||
+ | ==== V ==== | ||
+ | |||
+ | |||
=== Value learning ===

An approach to the value loading problem in which the AI learns the values that humans want it to pursue (Bostrom 2014, p207).

=== Value loading problem ===

The problem of getting an AI to pursue the values we want it to pursue; that is, of installing the desired values into the AI (Bostrom 2014).

==== W ====

=== Wise-Singleton Sustainability Threshold ===

A capability set exceeds the wise-singleton threshold if and only if a patient and existential risk-savvy system with that capability set would, if it faced no intelligent opposition or competition, be able to colonize and re-engineer a large part of the accessible universe (Bostrom 2014).

=== Whole-brain emulation ===

Machine intelligence created by scanning a biological brain in fine detail and replicating its structure and computational behavior in software or hardware (Bostrom 2014).

=== Word embedding ===

A mapping of words to high-dimensional vectors that has been trained to be useful in a language task, such that the arrangement of words in the vector space is meaningful. For instance, words near one another in the vector space are related, and similar relationships between different pairs of words correspond to similar vector offsets between them, so that e.g. if E(x) is the vector for the word ‘x’, then E(king) − E(queen) ≈ E(man) − E(woman).

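A toy illustration of this vector arithmetic (these two-dimensional vectors are invented; real embeddings are learned from text and have hundreds of dimensions):

<code python>
import numpy as np

E = {
    "king":  np.array([0.9, 0.8]),   # 'royal' + 'male' (made-up axes)
    "queen": np.array([0.9, 0.2]),   # 'royal' + 'female'
    "man":   np.array([0.1, 0.8]),   # 'common' + 'male'
    "woman": np.array([0.1, 0.2]),   # 'common' + 'female'
}

# The same offset separates both gendered pairs:
print(E["king"] - E["queen"])        # [0.  0.6]
print(E["man"] - E["woman"])         # [0.  0.6]

# Analogy solving: which word is to 'woman' as 'king' is to 'queen'?
target = E["king"] - E["queen"] + E["woman"]
closest = min(E, key=lambda w: np.linalg.norm(E[w] - target))
print(closest)                       # man
</code>
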
+ | |||
+ | |||
+ | ===== Notes ===== | ||
+ | |||
+ | |||
+ | < | ||
+ | <ol class=" | ||
+ | < | ||
+ | <span class=" | ||
+ | </ | ||
+ | < | ||
+ | <span class=" | ||
+ | </ | ||
+ | </ol> | ||
+ | </ | ||
+ | |||
+ | |||