/*
EDITOR NOTES (publicly accessible)
- Harlan: I plan to update this page once global market size estimates for GPUs become available for 2023
*/
====== How much computing capacity exists in GPUs and TPUs in Q1 2023? ======

//Published 3 April 2023, last updated 3 April 2023//
  
A back-of-the-envelope calculation based on market size, price-performance, hardware lifespan estimates, and the sizes of Google’s data centers suggests that there is around 3.98 * 10^21 FLOP/s of computing capacity on GPUs and TPUs as of Q1 2023.
Consulting firms reported their estimates of the world-wide GPU market size for 2017, 2018, 2019, 2020, 2021, and 2022.((Bhutani, Ankita, and Preeti Wadhwani. 2019. “Graphics Processing Unit (GPU) Market Size By Component, Hardware, Device Type, Service, By Deployment Model, By Application, Industry Analysis Report, Regional Outlook, Growth Potential, Competitive Market Share & Forecast, 2018 - 2024.” Global Market Insights. https://web.archive.org/web/20220303120312/https://www.gminsights.com/industry-analysis/gpu-market.))((“GPU Market Size, Share, Trends | Forecast 2026.” n.d. Acumen Research and Consulting. Accessed March 31, 2023. https://www.acumenresearchandconsulting.com/gpu-market.))((R, Rachita. 2020. “GPU Market Size, Share & Forecast by 2027: Graphics Processing Unit.” Allied Market Research. https://www.alliedmarketresearch.com/graphic-processing-unit-market.))((Alsop, Thomas. 2021. “Graphics processing unit (GPU) market size worldwide in 2020 and 2028.” Statista. https://web.archive.org/web/20220901103010/https://www.statista.com/statistics/1166028/gpu-market-size-worldwide/))((“Graphic Processing Unit (GPU) Market Size, Share, Trends & Forecast.” 2022. Verified Market Research. https://www.verifiedmarketresearch.com/product/graphic-processing-unit-gpu-market/.))((Wadhwani, Preeti, and Aayush Jain. 2023. “Graphics Processing Unit (GPU) Market Size, Share Report – 2032.” Global Market Insights. https://www.gminsights.com/industry-analysis/gpu-market.)) Unfortunately, the details of these reports are paywalled, so the methodology and scope of these estimates are not completely clear. A sample of GMI’s report on 2022 GPU market size seems to indicate that the estimate is reasonably comprehensive.((The report includes revenue related to integrated GPUs, including from Apple. Because Apple does not appear to sell their integrated GPUs directly to consumers, this likely means that the report is including GPUs that come installed in Apple devices rather than only stand-alone GPUs. If this is not the case, and the estimate only includes sales of stand-alone GPUs, then this page might significantly underestimate the total compute of all GPUs and TPUs.))

  * Our best guess estimate models market size growth as a piecewise linear function, with slow growth before 2017 and fast growth after 2017 (a rough sketch of this kind of fit follows the figure below).
  * Our lower bound estimate fits the years before 2017 to an exponential curve and assumes that the market size will not change between 2022 and 2023.
  * Our upper bound estimate assumes that there was no change in market size between 2010 and 2017 and fits 2023 to an exponential curve.
  
{{:ai_timelines:hardware_and_ai_timelines:estimated_gpu_market_size_2010-2023.png?600|}}
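
The fitting approach can be illustrated with a short Python sketch. This is a minimal example with placeholder numbers: the values in ''market_size_bn'' are invented stand-ins (the real report figures are paywalled), and only the post-2017 “fast growth” segment of the piecewise fit is shown.

<code python>
import numpy as np

# Placeholder annual market-size figures in billions of USD. These are illustrative
# stand-ins, NOT the paywalled figures from the cited reports.
years = np.array([2017, 2018, 2019, 2020, 2021, 2022])
market_size_bn = np.array([20.0, 25.0, 27.0, 33.0, 42.0, 44.0])

# Best-guess style: fit the post-2017 "fast growth" segment linearly, extrapolate to 2023.
slope, intercept = np.polyfit(years, market_size_bn, 1)
linear_2023 = slope * 2023 + intercept

# Upper-bound style: fit an exponential (linear in log space) and extrapolate to 2023.
log_slope, log_intercept = np.polyfit(years, np.log(market_size_bn), 1)
exponential_2023 = np.exp(log_slope * 2023 + log_intercept)

print(f"Linear extrapolation to 2023:      ${linear_2023:.1f}B")
print(f"Exponential extrapolation to 2023: ${exponential_2023:.1f}B")
</code>
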
==== Estimating FLOP/s per dollar GPUs had per year ====
  
Hobbhahn & Besiroglu (2022) used a dataset of GPUs to find trends in FLOP/s per dollar over time. This provides a simple way to convert the above estimates of total spending on GPUs per year into estimates of total GPU computing capacity produced per year. Hobbhahn & Besiroglu found no statistically significant difference in the trends of FP16 and FP32 precision compute, and we follow their example by focusing on FP32 precision for the computing capacity of GPUs.((Hobbhahn, Marius. 2022. “Trends in GPU price-performance.” Epoch AI. https://epochai.org/blog/trends-in-gpu-price-performance.))
  
  * Our best guess estimate for FLOP/s per dollar over time is based on the trendline over all of the data in the dataset.
  * Our lower bound estimate is based on the trendline for machine learning GPUs.
  * Our upper bound estimate is based on the trendline for the most price-performance efficient GPUs.
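
A minimal sketch of how a FLOP/s-per-dollar trend converts annual spending into annual compute produced is shown below. The doubling time and the 2020 anchor value are illustrative assumptions, not the trendlines fitted by Hobbhahn & Besiroglu.

<code python>
# Illustrative exponential price-performance trend. DOUBLING_TIME_YEARS and
# FLOPS_PER_DOLLAR_2020 are assumed placeholder values, not the fitted trendlines.
DOUBLING_TIME_YEARS = 2.5
FLOPS_PER_DOLLAR_2020 = 5e9  # FP32 FLOP/s per dollar, placeholder anchor

def flops_per_dollar(year: float) -> float:
    """FP32 FLOP/s per dollar in a given year, extrapolated along an exponential trend."""
    return FLOPS_PER_DOLLAR_2020 * 2 ** ((year - 2020) / DOUBLING_TIME_YEARS)

def compute_produced(year: int, market_size_usd: float) -> float:
    """FLOP/s of GPU capacity produced in a year = spending that year * price-performance."""
    return market_size_usd * flops_per_dollar(year)

print(f"{compute_produced(2022, 44e9):.2e} FLOP/s produced in 2022 (illustrative)")
</code>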
  
==== Estimating proportion of GPUs still functional per year ====
  
Because GPUs eventually fail, the computing capacity produced in previous years does not all still exist. We did not find much empirical data about the usual lifespan of GPUs, so the estimates in this section are highly speculative.
  
  * Based on anecdotal claims that GPUs usually last 5 years with heavy use or 7 or more years with moderate use, our best guess estimate is that GPU lifespan is normally distributed with a mean of 6 years and a standard deviation of 1.5 years.((“How Long Do GPUs Last? (Average Lifespan & Effectiveness).” n.d. Cybersided. Accessed March 31, 2023. https://cybersided.com/how-long-do-gpus-last/.))
  * Because Ostrouchov et al. (2020) show that GPUs in datacenters usually last around 3 years, our lower bound estimate has a mean of 3 years and a standard deviation of 1.5 years.((G. Ostrouchov, D. Maxwell, R. A. Ashraf, C. Engelmann, M. Shankar and J. H. Rogers, “GPU Lifetimes on Titan Supercomputer: Survival Analysis and Reliability,” SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, Atlanta, GA, USA, 2020, pp. 1-14, doi: 10.1109/SC41405.2020.00045.))
  * For our upper bound estimate, we assumed that all GPUs since 2010 still exist.
  
{{:ai_timelines:hardware_and_ai_timelines:estimated_proportion_of_gpus_that_are_functional.png?600|}}
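
Under the best-guess parameters (lifespan normally distributed with mean 6 years and standard deviation 1.5 years), the fraction of GPUs from a given year that is still functional is the survival function of that normal distribution. A minimal sketch using only the standard library:

<code python>
from math import erfc, sqrt

def surviving_fraction(age_years: float, mean: float = 6.0, sd: float = 1.5) -> float:
    """P(lifespan > age) for a normally distributed GPU lifespan (best-guess parameters)."""
    return 0.5 * erfc((age_years - mean) / (sd * sqrt(2)))

# Fraction of GPUs sold n years before Q1 2023 that are expected to still work.
for age in range(14):
    print(f"{2023 - age}: {surviving_fraction(age):.2f}")
</code>
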
==== Estimating FLOP/s of all GPUs ====
  
To estimate total GPU computing capacity, we summed the products of each year’s estimated market size, price-performance, and proportion of surviving GPUs.
  
  * Our best guess estimate for total GPU computing capacity is 3.95 * 10^21 FLOP/s (FP32).
  * Our lower bound estimate for total GPU computing capacity is 1.40 * 10^21 FLOP/s (FP32).
  * Our upper bound estimate for total GPU computing capacity is 7.71 * 10^21 FLOP/s (FP32).
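
The aggregation itself is just a sum of per-year products. The sketch below wires the three ingredients together with crude placeholder inputs (constant spending, an assumed price-performance trend, and the survival function above), so the printed total is illustrative only.

<code python>
from math import erfc, sqrt

def surviving_fraction(age_years, mean=6.0, sd=1.5):
    return 0.5 * erfc((age_years - mean) / (sd * sqrt(2)))

def flops_per_dollar(year, anchor=5e9, doubling_time=2.5):
    # Assumed exponential trend anchored at a placeholder 2020 value.
    return anchor * 2 ** ((year - 2020) / doubling_time)

# Placeholder: constant $30B/year of GPU spending from 2010 through 2022.
market_size_usd = {year: 30e9 for year in range(2010, 2023)}

total_flops = sum(
    market_size_usd[year] * flops_per_dollar(year) * surviving_fraction(2023 - year)
    for year in market_size_usd
)
print(f"Illustrative total GPU capacity: {total_flops:.2e} FLOP/s")
</code>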
  
==== Estimating FLOP/s of all TPUs ====
TPUs are specialized chips designed by Google for machine learning. These chips can be rented through the cloud, and seem to be located in 7 Google data centers.((“TPU regions and zones.” n.d. Google Cloud. Accessed March 31, 2023. https://cloud.google.com/tpu/docs/regions-zones.)) There are 3 versions of the TPU available: V2, which has 45 * 10^12 FLOP/s per chip; V3, which has 123 * 10^12 FLOP/s per chip; and V4, which has 275 * 10^12 FLOP/s per chip.((Kennedy, Patrick. 2017. “Case Study on the Google TPU and GDDR5 from Hot Chips 29.” ServeTheHome. https://www.servethehome.com/case-study-google-tpu-gddr5-hot-chips-29/.))((“System Architecture | Cloud TPU.” n.d. Google Cloud. Accessed March 31, 2023. https://cloud.google.com/tpu/docs/system-architecture-tpu-vm.)) Google does not say exactly how many TPUs it owns, but the company did report that its data center in Oklahoma (the only one with V4 chips available) had 9 * 10^18 FLOP/s of computing capacity.((Lardinois, Frederic. 2022. “Google launches a 9 exaflop cluster of Cloud TPU v4 pods into public preview.” TechCrunch. https://techcrunch.com/2022/05/11/google-launches-a-9-exaflop-cluster-of-cloud-tpu-v4-pods-into-public-preview/.))
  
  * Based on assumptions that all of Google’s data centers that have TPUs have the same number of TPUs and that if a data center has two types of TPUs it has equal numbers of each, our best guess estimate for computing capacity of all TPUs is 2.93 * 10^19 FLOP/s (bf16).
  * Because the only computing capacity publicly reported by Google is that of the Oklahoma data center, our lower bound estimate assumes that the 9 * 10^18 FLOP/s (bf16) is the total for all TPUs.
  * Because Google claimed that their Oklahoma data center had more computing capacity than any other publicly available compute cluster, our upper bound estimate assumes that all 7 data centers have 9 * 10^18 FLOP/s of compute, for a total of 6.3 * 10^19 FLOP/s (bf16), as sketched below.
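
The lower and upper TPU bounds follow directly from the Oklahoma figure and the number of data centers with TPUs. The sketch below reproduces that arithmetic; the best guess of 2.93 * 10^19 FLOP/s additionally depends on assumptions about which TPU versions each data center hosts, which are not reproduced here.

<code python>
OKLAHOMA_FLOPS = 9e18        # bf16 FLOP/s reported for Google's Oklahoma (TPU v4) data center
NUM_TPU_DATA_CENTERS = 7     # data centers that appear to host Cloud TPUs

lower_bound = OKLAHOMA_FLOPS                          # only the publicly reported cluster
upper_bound = NUM_TPU_DATA_CENTERS * OKLAHOMA_FLOPS   # every data center as large as Oklahoma

print(f"TPU lower bound: {lower_bound:.1e} FLOP/s (bf16)")   # 9.0e18
print(f"TPU upper bound: {upper_bound:.1e} FLOP/s (bf16)")   # 6.3e19
</code>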
  
==== Estimated FLOP/s of all GPUs and TPUs ====
Wang, Shibo, and Pankaj Kanwar. 2019. “BFloat16: The secret to high performance on Cloud TPUs.” Google Cloud. https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus.)) Adding the estimates together gives the totals below.
  
  * Our best guess estimate for computing capacity of all GPUs and TPUs is 3.98 * 10^21 FLOP/s.
  * Our lower bound estimate is 1.41 * 10^21 FLOP/s.
  * Our upper bound estimate is 7.77 * 10^21 FLOP/s.
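
A quick check that the combined bounds are the sums of the GPU (FP32) and TPU (bf16) estimates above:

<code python>
gpu = {"best guess": 3.95e21, "lower bound": 1.40e21, "upper bound": 7.71e21}  # FP32
tpu = {"best guess": 2.93e19, "lower bound": 9.0e18,  "upper bound": 6.3e19}   # bf16

for key in gpu:
    print(f"{key}: {gpu[key] + tpu[key]:.2e} FLOP/s")  # 3.98e21, 1.41e21, 7.77e21
</code>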
  
===== Discussion =====
This calculation is rough, and the estimates could be wrong. In particular, some uncertainty remains about the methods used by the cited consulting firms to estimate GPU market size, and the details of those methods might substantially change the estimates in either direction.
  
Google’s Pathways Language Model (PaLM) was trained using 2.56 * 10^24 FLOP over 64 days.((Chowdhery, A., Narang, S., Devlin, J., Bosma, M., Mishra, G., Roberts, A., Barham, P., Chung, H. W., Sutton, C., Gehrmann, S., Schuh, P., Shi, K., Tsvyashchenko, S., Maynez, J., Rao, A., Barnes, P., Tay, Y., Shazeer, N., Prabhakaran, V., … Fiedel, N. (2022, October 5). PaLM: Scaling language modeling with Pathways. arXiv.org. Retrieved March 31, 2023, from https://arxiv.org/abs/2204.02311)) Our best guess estimate would indicate that training PaLM used about 0.01% of the world’s current GPU and TPU computing capacity, which seems plausible.(((2.56 * 10^24 / 64 / 24 / 60 / 60) / (3.98 * 10^21) ≈ 1.16 * 10^-4))
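
The footnote’s arithmetic can be checked directly:

<code python>
palm_training_flop = 2.56e24    # total FLOP used to train PaLM
training_seconds = 64 * 24 * 60 * 60
world_capacity = 3.98e21        # best guess total GPU + TPU FLOP/s

palm_average_flops = palm_training_flop / training_seconds
print(f"Fraction of world GPU/TPU capacity: {palm_average_flops / world_capacity:.2e}")  # ~1.16e-04
</code>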
  
There are likely other, possibly better, ways to estimate the computing capacity of GPUs and TPUs. Estimates calculated in different ways could be useful for reducing uncertainty. Other methods might involve researching the manufacturing capacity of semiconductor fabrication plants. Other methods may also make it feasible to estimate the computing capacity of all microprocessors, including CPUs, rather than just GPUs and TPUs.