//This page is a stub. It is likely to be expanded upon soon.//

Reasons to expect that some superhuman AI systems will be goal-directed include:

  - **Some goal-directed behavior is likely to be [[arguments_for_ai_risk:is_ai_an_existential_threat_to_humanity:will_malign_ai_agents_control_the_future:argument_for_ai_x-risk_from_competent_malign_agents:will_advanced_ai_be_agentic:how_large_are_economic_incentives_for_agentic_ai:start|economically valuable to create]]** (i.e. it cannot be replaced using only non-goal-directed systems). This appears to be true even for [[arguments_for_ai_risk:incentives_to_create_ai_systems_known_to_pose_extinction_risks|apparently x-risky systems]], and will likely [[arguments_for_ai_risk:is_ai_an_existential_threat_to_humanity:will_dangerous_ai_systems_appear_safe|appear true]] more often than it is.
  - **Goal-directed entities [[arguments_for_ai_risk:is_ai_an_existential_threat_to_humanity:will_malign_ai_agents_control_the_future:argument_for_ai_x-risk_from_competent_malign_agents:will_advanced_ai_be_agentic:will_mesaoptimization_produce_misalignment|may tend to arise]]** from machine learning training processes that were not intended to create them.
  - **‘[[agency:what_do_coherence_arguments_imply_about_the_behavior_of_advanced_ai|Coherence arguments]]’** may imply that systems with some goal-directedness will **become more strongly goal-directed over time**.