This shows you the differences between two versions of the page.
| | uncategorized:capabilities_of_sota_ai [2023/10/31 20:24] harlanstewart | | uncategorized:capabilities_of_sota_ai [2024/12/10 21:50] (current) harlanstewart |
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ====== Capabilities of state-of-the-art AI, 2023 ====== | + | /* | ||
| + | Some things to add to this page, if someone wants to update it at some point: | ||
| + | -GPT-4o advanced voice mode | ||
| + | -GDM GenCast SOTA weather forecasting | ||
| + | -Sora | ||
| + | -o1 reasoning abilities | ||
| + | -Genie 2 and GameNGen | ||
| + | -Hacking milestone from Google' | ||
| + | -Forecasting capabilities https:// | ||
| + | -METR' | ||
| + | -Evaluating Neuroscience results https:// | ||
| + | -Math https:// | ||
| + | */ | ||
| - | This is a list of some noteworthy capabilities of current state-of-the-art AI in various categories. Last major update 2/27/2023, last updated | + | ====== Capabilities of state-of-the-art AI, 2024 ====== |
| + | |||
| + | This is a list of some noteworthy capabilities of current state-of-the-art AI in various categories. Last updated | ||
| ==== Games ==== | ==== Games ==== | ||
| Line 16: | Line 30: | ||
| * In 2019, AlphaStar reached Grandmaster level in StarCraft II, playing with the same constraints as a human player (viewing the world through a camera, restricted click rate).((Alphastar: | * In 2019, AlphaStar reached Grandmaster level in StarCraft II, playing with the same constraints as a human player (viewing the world through a camera, restricted click rate).((Alphastar: | ||
| * DreamerV3 is a general algorithm from 2023 that can learn to play a variety of games without human data, and is able to collect diamonds in Minecraft.(( Hafner, D., Pasukonis, J., Ba, J., & Lillicrap, T. (2023). Mastering Diverse Domains through World Models. arXiv. https:// | * DreamerV3 is a general algorithm from 2023 that can learn to play a variety of games without human data, and is able to collect diamonds in Minecraft.(( Hafner, D., Pasukonis, J., Ba, J., & Lillicrap, T. (2023). Mastering Diverse Domains through World Models. arXiv. https:// | ||
| - | * CICERO, from 2022, can play Diplomacy, a game that involves communicating and coordinating with other players. CICERO ranked in the top 10% of players who had played more than one game on webDiplomacy.net.(( Cicero. Meta AI. (n.d.). Retrieved November 23, 2022, from https:// | ||
| * In 2022, DeepNash won 84% of Stratego games against the top expert human players on Gravon games.((Mastering Stratego, the Classic Game of Imperfect Information. DeepMind blog. (2022, December 1). Retrieved December 2, 2022, from https:// | * In 2022, DeepNash won 84% of Stratego games against the top expert human players on Gravon games.((Mastering Stratego, the Classic Game of Imperfect Information. DeepMind blog. (2022, December 1). Retrieved December 2, 2022, from https:// | ||
| + | * CICERO, from 2022, can play Diplomacy, a game that involves communicating and coordinating with other players. CICERO ranked in the top 10% of players who had played more than one game on webDiplomacy.net.(( Cicero. Meta AI. (n.d.). Retrieved November 23, 2022, from https:// | ||
| - | ====Language==== | + | < |
| + | <iframe width=" | ||
| + | </ | ||
| + | //Examples and discussion of Diplomacy gameplay with Cicero// | ||
| - | | + | ====Language==== |
| + | | ||
| - | [{{: | + | [{{: |
| * Large language models such as GPT-4 can also write code. GPT-4 correctly solved programming problems in the HumanEval dataset 67% of the time. | * Large language models such as GPT-4 can also write code. GPT-4 correctly solved programming problems in the HumanEval dataset 67% of the time. | ||
| Line 43: | Line 61: | ||
| ====Images==== | ====Images==== | ||
| - | * Image classification systems | + | * GPT-4 can recognize |
| - | Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, et al. “ImageNet Large Scale Visual Recognition Challenge.” ArXiv:1409.0575 [Cs], January 29, 2015. http:// | + | |
| - | * DenseCap, from 2015, can identify and describe multiple objects within an image.((Johnson, | + | |
| - | [{{: | + | [{{: |
| * Sensetime is a facial recognition system from 2014 that surpassed average human performance in accurately labeling faces in a large dataset of images.((Lu, | * Sensetime is a facial recognition system from 2014 that surpassed average human performance in accurately labeling faces in a large dataset of images.((Lu, | ||
| Line 65: | Line 81: | ||
| [{{: | [{{: | ||
| - | * PaLI, released in 2022, can answer questions about images, caption images, detect objects in images, and classify images.((Chen, X., & Wang, X. (2022, September 15). PaLI: Scaling Language-Image Learning in 100+ Languages – Google AI Blog. Google AI Blog. Retrieved April 27, 2023, from https://ai.googleblog.com/2022/09/pali-scaling-language-image-learning-in.html)) | + | * An AI system from 2023 can convincingly copy someone' |
| - | * Make-a-Video, released in 2022, can generate | + | * AI systems such as VideoPoet, from 2023, can generate |
| + | |||
| + | < | ||
| + | <iframe width=" | ||
| + | </ | ||
| + | //A movie composed of several individual video clips produced by VideoPoet// | ||
| ====Audio==== | ====Audio==== | ||
| Line 74: | Line 95: | ||
| * Automatic speech recognition systems can transcribe recordings of human speech. Whisper, from 2022, is able to transcribe recordings with an accuracy close to that of professional human transcribers.((Radford, | * Automatic speech recognition systems can transcribe recordings of human speech. Whisper, from 2022, is able to transcribe recordings with an accuracy close to that of professional human transcribers.((Radford, | ||
| )) | )) | ||
| - | * Jukebox, from 2020, can generate samples of music with a provided genre, artist, and lyrics as input.((Dhariwal, | ||
| * AudioLM, from 2022, creates predicted “continuations” of an audio input.((AudioLM. Retrieved February 27, 2023, from https:// | * AudioLM, from 2022, creates predicted “continuations” of an audio input.((AudioLM. Retrieved February 27, 2023, from https:// | ||
| - | * MusicLM, from 2022, creates samples of music based on a text caption.((MusicLM: | ||
| - | |||
| - | {{ : | ||
| - | //Sample output from MusicLM, generated with the prompt "Lofi beats to listen to while reading a wiki page about AI capabilities, | ||
| - | |||
| * Models such as Deep Voice 3, from 2018, can imitate a human voice based on a few samples of recorded speech.((Arik, | * Models such as Deep Voice 3, from 2018, can imitate a human voice based on a few samples of recorded speech.((Arik, | ||
| * Recent models such as Koe can take a recorded voice sample and change it into another voice.((Koe: | * Recent models such as Koe can take a recorded voice sample and change it into another voice.((Koe: | ||
| + | * Suno.ai, from 2023, can create songs with lyrics and instrumentation based on a text description of the song's style and subject. ((Suno.ai. Retrieved January 3, 2024, from https:// | ||
| + | |||
| + | {{ : | ||
| + | //Output from Suno.ai, given the prompt "A soulful R&B song that is self-referentially about how the song is an example of AI-generated audio output on a wiki page about the capabilities of state-of-the-art AI systems"// | ||
| ====Robotics==== | ====Robotics==== | ||
| Line 88: | Line 107: | ||
| * Although they are prone to occasional mistakes, self-driving cars are able to drive with human supervision.((Metz, | * Although they are prone to occasional mistakes, self-driving cars are able to drive with human supervision.((Metz, | ||
| * In 2022, an AI-piloted drone won multiple races against three world-champion human drone pilots. ((Edwards, Benj. (2023, August 31). High-speed AI drone beats world-champion racers for the first time. Ars Technica. Retrieved October 31, 2023, from https:// | * In 2022, an AI-piloted drone won multiple races against three world-champion human drone pilots. ((Edwards, Benj. (2023, August 31). High-speed AI drone beats world-champion racers for the first time. Ars Technica. Retrieved October 31, 2023, from https:// | ||
| - | * Atlas, a humanoid robot, can walk, run, and perform parkour moves such as backflips.((Atlas™. Boston Dynamics. (n.d.). Retrieved November 22, 2022, from https:// | ||
| * A robot made by OpenAI in 2019 can solve a Rubik’s Cube with one human-like hand.((Akkaya, | * A robot made by OpenAI in 2019 can solve a Rubik’s Cube with one human-like hand.((Akkaya, | ||
| * In 2022, a robot successfully performed laparoscopic surgery on four pigs, without human assistance.((Gregory, | * In 2022, a robot successfully performed laparoscopic surgery on four pigs, without human assistance.((Gregory, | ||
| + | * Atlas, a humanoid robot, can walk, run, and perform parkour moves such as backflips.((Atlas™. Boston Dynamics. (n.d.). Retrieved November 22, 2022, from https:// | ||
| + | |||
| + | < | ||
| + | <iframe width=" | ||
| + | </ | ||
| + | //A demo of the robot Atlas performing parkour.// | ||
| ====Biology==== | ====Biology==== | ||
| Line 97: | Line 121: | ||
| * In 2022, a model was able to predict the effect of a molecule on levels of an enzyme in humans and find molecules that inhibit a particular enzyme.((Urbina, | * In 2022, a model was able to predict the effect of a molecule on levels of an enzyme in humans and find molecules that inhibit a particular enzyme.((Urbina, | ||
| * MinD-Vis, from 2022, can decode a subject’s brain activity to reconstruct an image that has some of the details and features of the image the subject is looking at.((Seeing beyond the brain: Conditional diffusion model with sparse masked modeling for vision decoding submitted to Anonymous Conference. MinD-Vis. (n.d.). Retrieved November 22, 2022, from https:// | * MinD-Vis, from 2022, can decode a subject’s brain activity to reconstruct an image that has some of the details and features of the image the subject is looking at.((Seeing beyond the brain: Conditional diffusion model with sparse masked modeling for vision decoding submitted to Anonymous Conference. MinD-Vis. (n.d.). Retrieved November 22, 2022, from https:// | ||
| + | |||
| + | ====Mathematics==== | ||
| + | * In 2022, AlphaTensor discovered efficient new algorithms for matrix multiplication, | ||
| + | * In 2024, AlphaGeometry solved 25 out of 30 Olympiad-level geometry problems, approaching the level of an Olympiad gold medalist.((Trinh, | ||