This shows you the differences between two versions of the page.
uncategorized:capabilities_of_sota_ai [2024/01/03 23:11] harlanstewart |
uncategorized:capabilities_of_sota_ai [2024/12/10 21:50] (current) harlanstewart |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | / | ||
+ | Some things to add to this page, if someone wants to update it at some point: | ||
+ | -GPT-4o advanced voice mode | ||
+ | -GDM GenCast SOTA weather forecasting | ||
+ | -Sora | ||
+ | -o1 reasoning abilities | ||
+ | -Genie 2 and GameNGen | ||
+ | -Hacking milestone from Google' | ||
+ | -Forecasting capabilities https:// | ||
+ | -METR' | ||
+ | -Evaluating Neuroscience results https:// | ||
+ | -Math https:// | ||
+ | */ | ||
+ | |||
====== Capabilities of state-of-the-art AI, 2024 ====== | ====== Capabilities of state-of-the-art AI, 2024 ====== | ||
- | This is a list of some noteworthy capabilities of current state-of-the-art AI in various categories. Last updated 1/3/2024 | + | This is a list of some noteworthy capabilities of current state-of-the-art AI in various categories. Last updated 1/24/2024 |
==== Games ==== | ==== Games ==== | ||
Line 16: | Line 30: | ||
* In 2019, AlphaStar reached Grandmaster level in StarCraft II, playing with the same constraints as a human player (viewing the world through a camera, with a restricted click rate).((Alphastar: | * In 2019, AlphaStar reached Grandmaster level in StarCraft II, playing with the same constraints as a human player (viewing the world through a camera, with a restricted click rate).((Alphastar: | ||
* DreamerV3 is a general algorithm from 2023 that can learn to play a variety of games without human data, and is able to collect diamonds in Minecraft.(( Hafner, D., Pasukonis, J., Ba, J., & Lillicrap, T. (2023). Mastering Diverse Domains through World Models. arXiv. https:// | * DreamerV3 is a general algorithm from 2023 that can learn to play a variety of games without human data, and is able to collect diamonds in Minecraft.(( Hafner, D., Pasukonis, J., Ba, J., & Lillicrap, T. (2023). Mastering Diverse Domains through World Models. arXiv. https:// | ||
- | * CICERO, from 2022, can play Diplomacy, a game that involves communicating and coordinating with other players. Cicero ranked in the top 10% of players who had played more than one game on webDiplomacy.net.(( Cicero. Meta AI. (n.d.). Retrieved November 23, 2022, from https:// | ||
* In 2022, DeepNash won 84% of Stratego games against the top expert human players on Gravon games.((Mastering Stratego, the Classic Game of Imperfect Information. DeepMind blog. (2022, December 1). Retrieved December 2, 2022, from https:// | * In 2022, DeepNash won 84% of Stratego games against the top expert human players on Gravon games.((Mastering Stratego, the Classic Game of Imperfect Information. DeepMind blog. (2022, December 1). Retrieved December 2, 2022, from https:// | ||
+ | * CICERO, from 2022, can play Diplomacy, a game that involves communicating and coordinating with other players. Cicero ranked in the top 10% of players who had played more than one game on webDiplomacy.net.(( Cicero. Meta AI. (n.d.). Retrieved November 23, 2022, from https:// | ||
+ | |||
+ | < | ||
+ | <iframe width=" | ||
+ | </ | ||
+ | //Examples and discussion of Diplomacy gameplay with Cicero// | ||
====Language==== | ====Language==== | ||
- | * GPT-4, a large language model from 2023, can write poetry, answer questions, have conversations, | + | * GPT-4, a large language model from 2023, can write poetry, answer questions, reason about the world, have conversations, |
+ | |||
+ | [{{: | ||
* Large language models such as GPT-4 can also write code. GPT-4 correctly solved programming problems in the HumanEval dataset 67% of the time. | * Large language models such as GPT-4 can also write code. GPT-4 correctly solved programming problems in the HumanEval dataset 67% of the time. | ||
* GPT-4 achieved human-level performance on various professional and academic exams, including SATs, AP exams, and the Uniform Bar Exam. | * GPT-4 achieved human-level performance on various professional and academic exams, including SATs, AP exams, and the Uniform Bar Exam. | ||
Line 59: | Line 81: | ||
[{{: | [{{: | ||
- | * PaLI, released in 2022, can answer questions about images, caption images, detect objects in images, and classify images.((Chen, X., & Wang, X. (2022, September 15). PaLI: Scaling Language-Image Learning in 100+ Languages – Google AI Blog. Google AI Blog. Retrieved April 27, 2023, from https://ai.googleblog.com/2022/09/pali-scaling-language-image-learning-in.html)) | + | * An AI system from 2023 can convincingly copy someone' |
- | * Make-a-Video, released in 2022, can generate | + | * AI systems such as VideoPoet, from 2023, can generate |
+ | |||
+ | < | ||
+ | <iframe width=" | ||
+ | </ | ||
+ | //A movie composed of several individual video clips produced by VideoPoet// | ||
====Audio==== | ====Audio==== | ||
Line 69: | Line 96: | ||
)) | )) | ||
* AudioLM, from 2022, creates predicted “continuations” of an audio input.((AudioLM. Retrieved February 27, 2023, from https:// | * AudioLM, from 2022, creates predicted “continuations” of an audio input.((AudioLM. Retrieved February 27, 2023, from https:// | ||
- | * Suno.ai, from 2023, can create songs with lyrics and instrumentation based on a text description of the song's style and subject. ((Suno.ai. Retrieved January 3, 2024, from https:// | ||
* Models such as Deep Voice 3, from 2018, can imitate a human voice based on a few samples of recorded speech.((Arik, | * Models such as Deep Voice 3, from 2018, can imitate a human voice based on a few samples of recorded speech.((Arik, | ||
* Recent models such as Koe can take a recorded voice sample and change it into another voice.((Koe: | * Recent models such as Koe can take a recorded voice sample and change it into another voice.((Koe: | ||
+ | * Suno.ai, from 2023, can create songs with lyrics and instrumentation based on a text description of the song's style and subject. ((Suno.ai. Retrieved January 3, 2024, from https:// | ||
+ | |||
+ | {{ : | ||
+ | //Output from Suno.ai, given the prompt "A soulful R&B song that is self-referentially about how the song is an example of AI-generated audio output on a wiki page about the capabilities of state-of-the-art AI systems"// | ||
====Robotics==== | ====Robotics==== | ||
Line 77: | Line 107: | ||
* Although they are prone to occasional mistakes, self-driving cars are able to drive with human supervision.((Metz, | * Although they are prone to occasional mistakes, self-driving cars are able to drive with human supervision.((Metz, | ||
* In 2022, an AI-piloted drone won multiple races against three world-champion human drone pilots. ((Edwards, Benj. (2023, August 31). High-speed AI drone beats world-champion racers for the first time. Ars Technica. Retrieved October 31, 2023, from https:// | * In 2022, an AI-piloted drone won multiple races against three world-champion human drone pilots. ((Edwards, Benj. (2023, August 31). High-speed AI drone beats world-champion racers for the first time. Ars Technica. Retrieved October 31, 2023, from https:// | ||
- | * Atlas, a humanoid robot, can walk, run, and perform parkour moves such as backflips.((Atlas™. Boston Dynamics. (n.d.). Retrieved November 22, 2022, from https:// | ||
* A robot made by OpenAI in 2019 can solve a Rubik’s Cube with one human-like hand.((Akkaya, | * A robot made by OpenAI in 2019 can solve a Rubik’s Cube with one human-like hand.((Akkaya, | ||
* In 2022, a robot successfully performed laparoscopic surgery on four pigs, without human assistance.((Gregory, | * In 2022, a robot successfully performed laparoscopic surgery on four pigs, without human assistance.((Gregory, | ||
+ | * Atlas, a humanoid robot, can walk, run, and perform parkour moves such as backflips.((Atlas™. Boston Dynamics. (n.d.). Retrieved November 22, 2022, from https:// | ||
+ | |||
+ | < | ||
+ | <iframe width=" | ||
+ | </ | ||
+ | //A demo of the robot Atlas performing parkour.// | ||
====Biology==== | ====Biology==== | ||
Line 86: | Line 121: | ||
* In 2022, a model was able to predict the effect of a molecule on levels of an enzyme in humans and find molecules that inhibit a particular enzyme.((Urbina, | * In 2022, a model was able to predict the effect of a molecule on levels of an enzyme in humans and find molecules that inhibit a particular enzyme.((Urbina, | ||
* MinD-Vis, from 2022, can decode a subject’s brain activity to reconstruct an image that has some of the details and features of the image the subject is looking at.((Seeing beyond the brain: Conditional diffusion model with sparse masked modeling for vision decoding submitted to Anonymous Conference. MinD-Vis. (n.d.). Retrieved November 22, 2022, from https:// | * MinD-Vis, from 2022, can decode a subject’s brain activity to reconstruct an image that has some of the details and features of the image the subject is looking at.((Seeing beyond the brain: Conditional diffusion model with sparse masked modeling for vision decoding submitted to Anonymous Conference. MinD-Vis. (n.d.). Retrieved November 22, 2022, from https:// | ||
+ | |||
+ | ====Mathematics==== | ||
+ | * In 2022, AlphaTensor discovered efficient new algorithms for matrix multiplication, | ||
+ | * In 2024, AlphaGeometry solved 25 out of 30 Olympiad-level geometry problems, approaching the level of an Olympiad gold medalist.((Trinh, |