Published 3 April 2023, last updated 3 April 2023
A back-of-the-envelope calculation based on market size, price-performance, hardware lifespan estimates, and the sizes of Google’s data centers suggests that there is around 3.98 * 10^21 FLOP/s of computing capacity on GPUs and TPUs as of Q1 2023.
Consulting firms reported their estimates of the worldwide GPU market size for 2017, 2018, 2019, 2020, 2021, and 2022.1)2)3)4)5)6) Unfortunately, the details of these reports are paywalled, so the methodology and scope of these estimates are not completely clear. A sample of GMI’s report on 2022 GPU market size seems to indicate that the estimate is reasonably comprehensive.7)
Hobbhahn & Besiroglu (2022) used a dataset of GPUs to find trends in FLOP/s per dollar over time. This provides a simple way to convert the above estimates of total spending on GPUs per year into estimates of total GPU computing capacity produced per year. Hobbhahn & Besiroglu found no statistically significant difference between the FP16 and FP32 price-performance trends, and we follow their example by focusing on FP32 precision for the computing capacity of GPUs.8)
Because GPUs eventually fail, the computing capacity produced in previous years does not all still exist. We did not find much empirical data about the usual lifespan of GPUs, so the estimates in this section are highly speculative.
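One simple way to model this attrition is an exponential survival curve. The ~5-year average lifespan below is an illustrative assumption of ours, not a figure from the sources cited above:

```python
import math

def surviving_fraction(age_years: float, avg_lifespan_years: float = 5.0) -> float:
    """Fraction of GPUs produced `age_years` ago that still work,
    assuming failures follow an exponential distribution with the
    given (assumed) average lifespan."""
    return math.exp(-age_years / avg_lifespan_years)
```

Under this assumption, GPUs sold five years ago would contribute only about a third of their original capacity today; other survival curves (e.g. a hard cutoff after N years) would shift the totals somewhat.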
To estimate total GPU computing capacity, we summed the products of each year’s estimated market size, price-performance, and surviving fraction of GPUs.
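The aggregation step can be sketched as follows. All of the per-year numbers here are placeholders chosen only to show the shape of the calculation; they are not the market-size, price-performance, or survival figures actually used in the estimate:

```python
# Placeholder inputs (NOT the article's actual data):
# yearly GPU market size in dollars, FP32 FLOP/s per dollar for that
# year's hardware, and the fraction of that year's GPUs still working.
market_size_usd = {2017: 2.0e10, 2018: 2.5e10, 2019: 2.2e10,
                   2020: 2.5e10, 2021: 3.3e10, 2022: 4.0e10}
flops_per_dollar = {2017: 2.0e8, 2018: 2.6e8, 2019: 3.4e8,
                    2020: 4.5e8, 2021: 5.9e8, 2022: 7.7e8}
surviving = {2017: 0.30, 2018: 0.45, 2019: 0.55,
             2020: 0.70, 2021: 0.85, 2022: 1.00}

# Sum of (dollars spent) * (FLOP/s per dollar) * (fraction surviving)
total_gpu_flops = sum(market_size_usd[y] * flops_per_dollar[y] * surviving[y]
                      for y in market_size_usd)
print(f"Estimated surviving GPU capacity: {total_gpu_flops:.2e} FLOP/s")
```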
TPUs are specialized chips designed by Google for machine learning. These chips can be rented through the cloud, and seem to be located in 7 Google data centers.11) Three versions of the TPU are available: V2, which has 45 * 10^12 FLOP/s per chip; V3, which has 123 * 10^12 FLOP/s per chip; and V4, which has 275 * 10^12 FLOP/s per chip.12)13) Google does not say exactly how many TPUs it owns, but the company did report that its data center in Oklahoma (the only one with V4 chips available) had 9 * 10^18 FLOP/s of computing capacity.14)
Although the above estimates for the computing capacity of all TPUs and of all GPUs are measured in different precisions, bf16 and FP32 seem similar enough that computing capacities measured in each can reasonably be added together for a rough calculation.15) Adding the estimates together gives the estimates below.
This calculation is rough, and the estimates could be wrong. In particular, some uncertainty remains about the methods used by the cited consulting firms to estimate GPU market size, and the details of those methods might substantially change the estimates in either direction.
Google’s Pathways Language Model (PaLM) was trained using 2.56 * 10^24 FLOPs over 64 days.16) Our best-guess estimate implies that training PaLM used about 0.01% of the world’s current GPU and TPU computing capacity, which seems plausible.17)
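This sanity check can be reproduced by converting PaLM's total training compute into an average FLOP/s draw over the 64-day run and comparing it with the capacity estimate above (this ignores hardware utilization below peak, so it is only a rough check):

```python
# Figures from the text.
palm_total_flop = 2.56e24            # total training compute, FLOPs
training_days = 64
world_capacity_flops = 3.98e21       # estimated world GPU+TPU capacity, FLOP/s

seconds = training_days * 86_400     # 86,400 seconds per day
avg_training_flops = palm_total_flop / seconds   # average FLOP/s during training
share = avg_training_flops / world_capacity_flops
print(f"PaLM's average draw: {avg_training_flops:.2e} FLOP/s ({share:.2%} of capacity)")
# share comes out to about 0.01%, matching the figure in the text
```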
There are likely other, possibly better, ways to estimate the computing capacity of GPUs and TPUs. Estimates calculated in different ways could be useful for reducing uncertainty. Other methods might involve researching the manufacturing capacity of semiconductor fabrication plants. Such methods might also make it feasible to estimate the computing capacity of all microprocessors, including CPUs, rather than just GPUs and TPUs.