This shows you the differences between two versions of the page.
uncategorized:capabilities_of_sota_ai [2023/10/31 20:04] harlanstewart |
uncategorized:capabilities_of_sota_ai [2024/12/10 21:50] (current) harlanstewart |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Capabilities of state-of-the-art AI, 2023 ====== | + | / |
+ | Some things to add to this page, if someone wants to update it at some point: | ||
+ | -GPT-4o advanced voice mode | ||
+ | -GDM GenCast SOTA weather forecasting | ||
+ | -Sora | ||
+ | -o1 reasoning abilities | ||
+ | -Genie 2 and GameNGen | ||
+ | -Hacking milestone from Google' | ||
+ | -Forecasting capabilities https:// | ||
+ | -METR' | ||
+ | -Evaluating Neuroscience results https:// | ||
+ | -Math https:// | ||
+ | */ | ||
- | This is a list of some noteworthy capabilities of current state-of-the-art AI in various categories. Last major update 2/27/2023, last updated | + | ====== Capabilities of state-of-the-art AI, 2024 ====== |
+ | |||
+ | This is a list of some noteworthy capabilities of current state-of-the-art AI in various categories. Last updated | ||
==== Games ==== | ==== Games ==== | ||
Line 16: | Line 30: | ||
  * In 2019, AlphaStar reached Grandmaster level in StarCraft II, playing with the same constraints as a human player (viewing the world through a camera, restricted click rate).((Alphastar: | * In 2019, AlphaStar reached Grandmaster level in StarCraft II, playing with the same constraints as a human player (viewing the world through a camera, restricted click rate).((Alphastar: | ||
* DreamerV3 is a general algorithm from 2023 that can learn to play a variety of games without human data, and is able to collect diamonds in Minecraft.(( Hafner, D., Pasukonis, J., Ba, J., & Lillicrap, T. (2023). Mastering Diverse Domains through World Models. arXiv. https:// | * DreamerV3 is a general algorithm from 2023 that can learn to play a variety of games without human data, and is able to collect diamonds in Minecraft.(( Hafner, D., Pasukonis, J., Ba, J., & Lillicrap, T. (2023). Mastering Diverse Domains through World Models. arXiv. https:// | ||
- | * CICERO, from 2022, can play Diplomacy, a game that involves communicating and coordinating with other players. Cicero ranked in the top 10% of players who had played more than one game on webDiplomacy.net.(( Cicero. Meta AI. (n.d.). Retrieved November 23, 2022, from https:// | ||
* In 2022, DeepNash won 84% of Stratego games against the top expert human players on Gravon games.((Mastering Stratego, the Classic Game of Imperfect Information. DeepMind blog. (2022, December 1). Retrieved December 2, 2022, from https:// | * In 2022, DeepNash won 84% of Stratego games against the top expert human players on Gravon games.((Mastering Stratego, the Classic Game of Imperfect Information. DeepMind blog. (2022, December 1). Retrieved December 2, 2022, from https:// | ||
+ | * CICERO, from 2022, can play Diplomacy, a game that involves communicating and coordinating with other players. Cicero ranked in the top 10% of players who had played more than one game on webDiplomacy.net.(( Cicero. Meta AI. (n.d.). Retrieved November 23, 2022, from https:// | ||
- | ====Language==== | + | < |
+ | <iframe width=" | ||
+ | </ | ||
+ | //Examples and discussion of Diplomacy gameplay with Cicero// | ||
- | | + | ====Language==== |
+ | | ||
- | [{{: | + | [{{: |
* Large language models such as GPT-4 can also write code. GPT-4 correctly solved programming problems in the HumanEval dataset 67% of the time. | * Large language models such as GPT-4 can also write code. GPT-4 correctly solved programming problems in the HumanEval dataset 67% of the time. | ||
Line 43: | Line 61: | ||
====Images==== | ====Images==== | ||
- | * Image classification systems | + | * GPT-4 can recognize |
- | Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, et al. “ImageNet Large Scale Visual Recognition Challenge.” ArXiv:1409.0575 [Cs], January 29, 2015. http:// | + | |
- | * DenseCap, from 2015, can identify and describe multiple objects within an image.((Johnson, | + | |
- | [{{: | + | [{{: |
  * SenseTime is a facial recognition system from 2014 that surpassed average human performance in accurately labeling faces in a large dataset of images.((Lu, | * SenseTime is a facial recognition system from 2014 that surpassed average human performance in accurately labeling faces in a large dataset of images.((Lu, | ||
Line 65: | Line 81: | ||
[{{: | [{{: | ||
- | * PaLI, released in 2022, can answer questions about images, caption images, detect objects in images, and classify images.((Chen, X., & Wang, X. (2022, September 15). PaLI: Scaling Language-Image Learning in 100+ Languages – Google AI Blog. Google AI Blog. Retrieved April 27, 2023, from https://ai.googleblog.com/2022/09/pali-scaling-language-image-learning-in.html)) | + | * An AI system from 2023 can convincingly copy someone' |
- | * Make-a-Video, released in 2022, can generate | + | * AI systems such as VideoPoet, from 2023, can generate |
+ | |||
+ | < | ||
+ | <iframe width=" | ||
+ | </ | ||
+ | //A movie composed of several individual video clips produced by VideoPoet// | ||
====Audio==== | ====Audio==== | ||
Line 74: | Line 95: | ||
* Automatic speech recognition systems can transcribe recordings of human speech. Whisper, from 2022, is able to transcribe recordings with an accuracy close to that of professional human transcribers.((Radford, | * Automatic speech recognition systems can transcribe recordings of human speech. Whisper, from 2022, is able to transcribe recordings with an accuracy close to that of professional human transcribers.((Radford, | ||
)) | )) | ||
- | * Jukebox, from 2020, can generate samples of music with a provided genre, artist, and lyrics as input.((Dhariwal, | ||
* AudioLM, from 2022, creates predicted “continuations” of an audio input.((AudioLM. Retrieved February 27, 2023, from https:// | * AudioLM, from 2022, creates predicted “continuations” of an audio input.((AudioLM. Retrieved February 27, 2023, from https:// | ||
- | * MusicLM, from 2022, creates samples of music based on a text caption.((MusicLM: | ||
- | |||
- | {{ : | ||
- | //Sample output from MusicLM, generated with the prompt "Lofi beats to listen to while reading a wiki page about AI capabilities, | ||
- | |||
* Models such as Deep Voice 3, from 2018, can imitate a human voice based on a few samples of recorded speech.((Arik, | * Models such as Deep Voice 3, from 2018, can imitate a human voice based on a few samples of recorded speech.((Arik, | ||
* Recent models such as Koe can take a recorded voice sample and change it into another voice.((Koe: | * Recent models such as Koe can take a recorded voice sample and change it into another voice.((Koe: | ||
+ | * Suno.ai, from 2023, can create songs with lyrics and instrumentation based on a text description of the song's style and subject. ((Suno.ai. Retrieved January 3, 2024, from https:// | ||
+ | |||
+ | {{ : | ||
+ | //Output from Suno.ai, given the prompt "A soulful R&B song that is self-referentially about how the song is an example of AI-generated audio output on a wiki page about the capabilities of state-of-the-art AI systems"// | ||
====Robotics==== | ====Robotics==== | ||
* Although they are prone to occasional mistakes, self-driving cars are able to drive with human supervision.((Metz, | * Although they are prone to occasional mistakes, self-driving cars are able to drive with human supervision.((Metz, | ||
- | * In 2021, an AI-piloted drone won a race against | + | * In 2022, an AI-piloted drone won multiple races against |
- | * Atlas, a humanoid robot, can walk, run, and perform parkour moves such as backflips.((Atlas™. Boston Dynamics. (n.d.). Retrieved November 22, 2022, from https:// | + | |
  * A robot made by OpenAI in 2019 can solve a Rubik’s Cube with one human-like hand.((Akkaya, | * A robot made by OpenAI in 2019 can solve a Rubik’s Cube with one human-like hand.((Akkaya, | ||
* In 2022, a robot successfully performed laparoscopic surgery on four pigs, without human assistance.((Gregory, | * In 2022, a robot successfully performed laparoscopic surgery on four pigs, without human assistance.((Gregory, | ||
+ | * Atlas, a humanoid robot, can walk, run, and perform parkour moves such as backflips.((Atlas™. Boston Dynamics. (n.d.). Retrieved November 22, 2022, from https:// | ||
+ | |||
+ | < | ||
+ | <iframe width=" | ||
+ | </ | ||
+ | //A demo of the robot Atlas performing parkour.// | ||
====Biology==== | ====Biology==== | ||
Line 97: | Line 121: | ||
  * In 2022, a model was able to predict the effect of a molecule on levels of an enzyme in humans and find molecules that inhibit a particular enzyme.((Urbina, | * In 2022, a model was able to predict the effect of a molecule on levels of an enzyme in humans and find molecules that inhibit a particular enzyme.((Urbina, | ||
* MinD-Vis, from 2022, can decode a subject’s brain activity to reconstruct an image that has some of the details and features of the image the subject is looking at.((Seeing beyond the brain: Conditional diffusion model with sparse masked modeling for vision decoding submitted to Anonymous Conference. MinD-Vis. (n.d.). Retrieved November 22, 2022, from https:// | * MinD-Vis, from 2022, can decode a subject’s brain activity to reconstruct an image that has some of the details and features of the image the subject is looking at.((Seeing beyond the brain: Conditional diffusion model with sparse masked modeling for vision decoding submitted to Anonymous Conference. MinD-Vis. (n.d.). Retrieved November 22, 2022, from https:// | ||
+ | |||
+ | ====Mathematics==== | ||
+ | * In 2022, AlphaTensor discovered efficient new algorithms for matrix multiplication, | ||
+ | * In 2024, AlphaGeometry solved 25 out of 30 Olympiad-level geometry problems, approaching the level of an Olympiad gold medalist.((Trinh, |