This shows you the differences between two versions of the page.
uncategorized:capabilities_of_sota_ai [2024/01/24 22:12] harlanstewart |
uncategorized:capabilities_of_sota_ai [2024/12/10 21:50] (current) harlanstewart |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | /* | ||
+ | Some things to add to this page, if someone wants to update it at some point: | ||
+ | -GPT-4o advanced voice mode | ||
+ | -GDM GenCast SOTA weather forecasting | ||
+ | -Sora | ||
+ | -o1 reasoning abilities | ||
+ | -Genie 2 and GameNGen | ||
+ | -Hacking milestone from Google' | ||
+ | -Forecasting capabilities https:// | ||
+ | -METR' | ||
+ | -Evaluating Neuroscience results https:// | ||
+ | -Math https:// | ||
+ | */ | ||
+ | |||
====== Capabilities of state-of-the-art AI, 2024 ====== | ====== Capabilities of state-of-the-art AI, 2024 ====== | ||
- | This is a list of some noteworthy capabilities of current state-of-the-art AI in various categories. Last updated 1/3/2024 | + | This is a list of some noteworthy capabilities of current state-of-the-art AI in various categories. Last updated 1/24/2024 |
==== Games ==== | ==== Games ==== | ||
Line 16: | Line 30: | ||
* In 2019, AlphaStar reached Grandmaster level in StarCraft II, playing with the same constraints as a human player (viewing the world through a camera, restricted click rate).((Alphastar: | * In 2019, AlphaStar reached Grandmaster level in StarCraft II, playing with the same constraints as a human player (viewing the world through a camera, restricted click rate).((Alphastar: | ||
* DreamerV3 is a general algorithm from 2023 that can learn to play a variety of games without human data, and is able to collect diamonds in Minecraft.(( Hafner, D., Pasukonis, J., Ba, J., & Lillicrap, T. (2023). Mastering Diverse Domains through World Models. arXiv. https:// | * DreamerV3 is a general algorithm from 2023 that can learn to play a variety of games without human data, and is able to collect diamonds in Minecraft.(( Hafner, D., Pasukonis, J., Ba, J., & Lillicrap, T. (2023). Mastering Diverse Domains through World Models. arXiv. https:// | ||
- | * CICERO, from 2022, can play Diplomacy, a game that involves communicating and coordinating with other players. Cicero ranked in the top 10% of players who had played more than one game on webDiplomacy.net.(( Cicero. Meta AI. (n.d.). Retrieved November 23, 2022, from https:// | ||
* In 2022, DeepNash won 84% of Stratego games against the top expert human players on Gravon games.((Mastering Stratego, the Classic Game of Imperfect Information. DeepMind blog. (2022, December 1). Retrieved December 2, 2022, from https:// | * In 2022, DeepNash won 84% of Stratego games against the top expert human players on Gravon games.((Mastering Stratego, the Classic Game of Imperfect Information. DeepMind blog. (2022, December 1). Retrieved December 2, 2022, from https:// | ||
+ | * CICERO, from 2022, can play Diplomacy, a game that involves communicating and coordinating with other players. Cicero ranked in the top 10% of players who had played more than one game on webDiplomacy.net.(( Cicero. Meta AI. (n.d.). Retrieved November 23, 2022, from https:// | ||
+ | |||
+ | //Examples and discussion of Diplomacy gameplay with Cicero// | ||
====Language==== | ====Language==== | ||
* GPT-4, a large language model from 2023, can write poetry, answer questions, reason about the world, have conversations, | * GPT-4, a large language model from 2023, can write poetry, answer questions, reason about the world, have conversations, | ||
+ | |||
+ | |||
* Large language models such as GPT-4 can also write code. GPT-4 correctly solved programming problems in the HumanEval dataset 67% of the time. | * Large language models such as GPT-4 can also write code. GPT-4 correctly solved programming problems in the HumanEval dataset 67% of the time. | ||
* GPT-4 achieved human-level performance on various professional and academic exams, including SATs, AP exams, and the Uniform Bar Exam. | * GPT-4 achieved human-level performance on various professional and academic exams, including SATs, AP exams, and the Uniform Bar Exam. | ||
Line 85: | Line 107: | ||
* Although they are prone to occasional mistakes, self-driving cars are able to drive with human supervision.((Metz, | * Although they are prone to occasional mistakes, self-driving cars are able to drive with human supervision.((Metz, | ||
* In 2022, an AI-piloted drone won multiple races against three world-champion human drone pilots.((Edwards, Benj. (2023, August 31). High-speed AI drone beats world-champion racers for the first time. Ars Technica. Retrieved October 31, 2023, from https:// | * In 2022, an AI-piloted drone won multiple races against three world-champion human drone pilots.((Edwards, Benj. (2023, August 31). High-speed AI drone beats world-champion racers for the first time. Ars Technica. Retrieved October 31, 2023, from https:// | ||
- | * Atlas, a humanoid robot, can walk, run, and perform parkour moves such as backflips.((Atlas™. Boston Dynamics. (n.d.). Retrieved November 22, 2022, from https:// | ||
* A robot made by OpenAI in 2019 can solve a Rubik’s Cube with one human-like hand.((Akkaya, | * A robot made by OpenAI in 2019 can solve a Rubik’s Cube with one human-like hand.((Akkaya, | ||
* In 2022, a robot successfully performed laparoscopic surgery on four pigs, without human assistance.((Gregory, | * In 2022, a robot successfully performed laparoscopic surgery on four pigs, without human assistance.((Gregory, | ||
+ | * Atlas, a humanoid robot, can walk, run, and perform parkour moves such as backflips.((Atlas™. Boston Dynamics. (n.d.). Retrieved November 22, 2022, from https:// | ||
+ | |||
+ | //A demo of the robot Atlas performing parkour.// | ||
====Biology==== | ====Biology==== |