Published 25 March, 2020; last updated 09 June, 2020
We estimate that in recent years, GPU prices have fallen at rates that would yield an order of magnitude over roughly:
GPUs (graphics processing units) are specialized electronic circuits originally used for computer graphics.1 In recent years, they have been popularly used for machine learning applications.2 One measure of GPU performance is FLOPS, the number of operations on floating-point numbers a GPU can perform in a second.3 This page looks at the trends in GPU price / FLOPS of theoretical peak performance over the past 13 years. It does not include the cost of operating the GPUs, and it does not consider GPUs rented through cloud computing.
‘Theoretical peak performance’ numbers appear to be determined by adding together the theoretical performances of the processing components of the GPU, which are calculated by multiplying the clock speed of the component by the number of instructions it can perform per cycle.4 These numbers are given by the developer and may not reflect actual performance on a given application.5
We collected data on multiple slightly different measures of GPU price and FLOPS performance.
GPU prices are divided into release prices, which reflect the manufacturer suggested retail prices that GPUs are originally sold at, and active prices, which are the prices at which GPUs are actually sold at over time, often by resellers.
We expect that active prices better represent prices available to hardware users, but collect release prices also, as supporting evidence.
Several varieties of ‘FLOPS’ can be distinguished based on the specifics of the operations they involve. Here we are interested in single-precision FLOPS, half-precision FLOPS, and half-precision fused-multiply add FLOPS.
‘Single-precision’ and ‘half-precision’ refer to the number of bits used to specify a floating point number.6 Using more bits to specify a number achieves greater precision at the cost of more computational steps per calculation. Our data suggests that GPUs have largely been improving in single-precision performance in recent decades,7 and half-precision performance appears to be increasingly popular because it is adequate for deep learning.8
Nvidia, the main provider of chips for machine learning applications,9 recently released a series of GPUs featuring Tensor Cores,10 which claim to deliver “groundbreaking AI performance”. Tensor Core performance is measured in FLOPS, but they perform exclusively certain kinds of floating-point operations known as fused multiply-adds (FMAs).11 Performance on these operations is important for certain kinds of deep learning performance,12 so we track ‘GPU price / FMA FLOPS’ as well as ‘GPU price / FLOPS’.
In addition to purely half-precision computations, Tensor Cores are capable of performing mixed-precision computations, where part of the computation is done in half-precision and part in single-precision.13 Since explicitly mixed-precision-optimized hardware is quite recent, we don’t look at the trend in mixed-precision price performance, and only look at the trend in half-precision price performance.
Any GPU that performs multiple kinds of computations (single-precision, half-precision, half-precision fused multiply add) trades off performance on one for performance on the other, because there is limited space on the chip, and transistors must be allocated to either one type of computation or the other.14 All current GPUs that perform half-precision or TensorCore fused-multiply-add computations also do single-precision computations, so they are splitting their transistor budget. For this reason, our impression is that half-precision FLOPS could be much cheaper now if entire GPUs were allocated to each one alone, rather than split between them.
We collected data on theoretical peak performance (FLOPS), release date, and price from several sources, including Wikipedia.15 (Data is available in this spreadsheet). We found GPUs by looking at Wikipedia’s existing large lists16 and by Googling “popular GPUs” and “popular deep learning GPUs”. We included any hardware that was labeled as a ‘GPU’. We adjusted prices for inflation based on the consumer price index.17
We were unable to find price and performance data for many popular GPUs and suspect that we are missing many from our list. In our search, we did not find any GPUs that beat our 2017 minimum of
Figure 1 shows our collected dataset for GPU price / single-precision FLOPS over time.19
To find a clear trend for the prices of the cheapest GPUs / FLOPS, we looked at the running minimum prices every 10 days.20
The cheapest GPU price / FLOPS hardware using release date pricing has not decreased since 2017. However there was a similar period of stagnation between early 2009 and 2011, so this may not represent a slowing of the trend in the long run.
Based on the figures above, the running minimums seem to follow a roughly exponential trend. If we do not include the initial point in 2007, (which we suspect is not in fact the cheapest hardware at the time), we get that the cheapest GPU price / single-precision FLOPS fell by around 17% per year, for a factor of ten in ~12.5 years.21
Figure 3 shows GPU price / half-precision FLOPS for all the GPUs in our search above for which we could find half-precision theoretical performance.22
Again, we looked at the running minimums of this graph every 10 days, shown in Figure 4 below.23
If we assume an exponential trend with noise,24 cheapest GPU price / half-precision FLOPS fell by around 26% per year, which would yield a factor of ten after ~8 years.25
Figure 5 shows GPU price / half-precision FMA FLOPS for all the GPUs in our search above for which we could find half-precision FMA theoretical performance.26 (Note that this includes all of our half-precision data above, since those FLOPS could be used for fused-multiply adds in particular). GPUs with TensorCores are marked in red.
Figure 6 shows the running minimums of GPU price / HP FMA FLOPS.27
GPU price / Half-Precision FMA FLOPS appears to be following an exponential trend over the last four years, falling by around 46% per year, for a factor of ten in ~4 years.28
GPU prices often go down from the time of release, and some popular GPUs are older ones that have gone down in price.29 Given this, it makes sense to look at active price data for the same GPU over time.
We collected data on peak theoretical performance in FLOPS from TechPowerUp30 and combined it with active GPU price data to get GPU price / FLOPS over time.31 Our primary source of historical pricing data was Passmark, though we also found a less trustworthy dataset on Kaggle which we used to check our analysis. We adjusted prices for inflation based on the consumer price index.32
We scraped pricing data33 on GPUs between 2011 and early 2020 from Passmark.34 Where necessary, we renamed GPUs from Passmark to be consistent with TechPowerUp.35 The Passmark data consists of 38,138 price points for 352 GPUs. We guess that these represent most popular GPUs.
Looking at the ‘current prices’ listed on individual Passmark GPU pages, prices appear to be sourced from Amazon, Newegg, and Ebay. Passmark’s listed pricing data does not correspond to regular intervals. We don’t know if prices were pulled at irregular intervals, or if Passmark pulls prices regularly and then only lists major changes as price points. When we see a price point, we treat it as though the GPU is that price only at that time point, not indefinitely into the future.
The data contains several blips where a GPU is briefly sold very unusually cheaply. A random checking of some of these suggests to us that these correspond to single or small numbers of GPUs for sale, which we are not interested in tracking, because we are trying to predict AI progress, which presumably isn’t influenced by temporary discounts on tiny batches of GPUs.
This Kaggle dataset contains scraped data of GPU prices from price comparison sites PriceSpy.co.uk, PCPartPicker.com, Geizhals.eu from the years 2013 – 2018. The Kaggle dataset has 319,147 price points for 284 GPUs. Unfortunately, at least some of the data is clearly wrong, potentially because price comparison sites include pricing data from untrustworthy merchants.36 As such, we don’t use the Kaggle data directly in our analysis, but do use it as a check on our Passmark data. The data that we get from Passmark roughly appears to be a subset of the Kaggle data from 2013 – 2018,37 which is what we would expect if the price comparison engines picked up prices from the merchants Passmark looks at.
There are a number of reasons why we think this analysis may in fact not reflect GPU price trends:
A better version of this analysis might start with more complete data from price comparison engines (along the lines of the Kaggle dataset) and then filter out clearly erroneous pricing information in some principled way.
The original scraped datasets with cards renamed to match TechPowerUp can be found here. GPU price / FLOPS data is graphed on a log scale in the figures below. Price points for the same GPU are marked in the same color. We adjusted prices for inflation using the consumer price index.38 All points below are in 2019 dollars.
To try to filter out noisy prices that didn’t last or were only available in small numbers, we took out the lowest 5% of data in every several day period39 to get the 95th percentile cheapest hardware. We then found linear and exponential trendlines of best fit through the available hardware with the lowest GPU price / FLOPS every several days.40
Figures 7-10 show the raw data, 95th percentile data, and trendlines for single-precision GPU price / FLOPS for the Passmark dataset. This folder contains plots of all our datasets, including the Kaggle dataset and combined Passmark + Kaggle dataset.41
The cheapest 95th percentile data every 10 days appears to fit relatively well to both a linear and exponential trendline. However we assume that progress will follow an exponential, because previous progress has followed an exponential.
In the Passmark dataset, the exponential trendline suggested that from 2011 to 2020, 95th-percentile GPU price / single-precision FLOPS fell by around 13% per year, for a factor of ten in ~17 years,45 bootstrap46 95% confidence interval 16.3 to 18.1 years.47 We believe the rise in price / FLOPS in 2017 corresponds to a rise in GPU prices due to increased demand from cryptocurrency miners.48 If we instead look at the trend from 2011 through 2016, before the cryptocurrency rise, we instead get that 95th-percentile GPU price / single-precision FLOPS price fell by around 13% per year, for a factor of ten in ~16 years.49
This is slower than the order of magnitude every ~12.5 years we found when looking at release prices. If we restrict the release price data to 2011 – 2019, we get an order of magnitude decrease every ~13.5 years instead,50 so part of the discrepancy can be explained because of the different start times of the datasets. To get some assurance that our active price data wasn’t erroneous, we spot checked the best active price at the start of 2011, which was somewhat lower than the best release price at the same time, and confirmed that its given price was consistent with surrounding pricing data.51 We think active prices are likely to be closer to the prices at which people actually bought GPUs, so we guess that ~17 years / order of magnitude decrease is a more accurate estimate of the trend we care about.
Figures 11-14 show the raw data, 95th percentile data, and trendlines for half-precision GPU price / FLOPS for the Passmark dataset. This folder contains plots of the Kaggle dataset and combined Passmark + Kaggle dataset.
If we assume the trend is exponential, the Passmark trend seems to suggest that from 2015 to 2020, 95th-percentile GPU price / half-precision FLOPS of GPUs has fallen by around 21% per year, for a factor of ten over ~10 years,55 bootstrap56 95% confidence interval 8.8 to 11 years.57 This is fairly close to the ~8 years / order of magnitude decrease we found when looking at release price data, but we treat active prices as a more accurate estimate of the actual prices at which people bought GPUs. As in our previous dataset, there is a noticeable rise in 2017, which we think is due to GPU prices increasing as a result of cryptocurrency miners. If we look at the trend from 2015 through 2016, before this rise, we get that 95th-percentile GPU price / half-precision FLOPS has fallen by around 14% per year, which would yield a factor of ten over ~8 years.58
Figures 15-18 show the raw data, 95th percentile data, and trendlines for half-precision GPU price / FMA FLOPS for the Passmark dataset. GPUs with Tensor Cores are marked in black. This folder contains plots of the Kaggle dataset and combined Passmark + Kaggle dataset.
If we assume the trend is exponential, the Passmark trend seems to suggest the 95th-percentile GPU price / half-precision FMA FLOPS of GPUs has fallen by around 40% per year, which would yield a factor of ten in ~4.5 years,62 with a bootstrap63 95% confidence interval 4 to 5.2 years.64 This is fairly close to the ~4 years / order of magnitude decrease we found when looking at release price data, but we think active prices are a more accurate estimate of the actual prices at which people bought GPUs.
The figures above suggest that certain GPUs with Tensor Cores were a significant (~half an order of magnitude) improvement over existing GPU price / half-precision FMA FLOPS.
We summarize our results in the table below.
Release Prices | 95th-percentile Active Prices | 95th-percentile Active Prices (pre-crypto price rise) | |
11/2007 – 1/2020 | 3/2011 – 1/2020 | 3/2011 – 12/2016 | |
$ / single-precision FLOPS | 12.5 | 17 | 16 |
9/2014 – 1/2020 | 1/2015 – 1/2020 | 1/2015 – 12/2016 | |
$ / half-precision FLOPS | 8 | 10 | 8 |
$ / half-precision FMA FLOPS | 4 | 4.5 | — |
Release price data seems to generally support the trends we found in active prices, with the notable exception of trends in GPU price / single-precision FLOPS, which cannot be explained solely by the different start dates.65 We think the best estimate of the overall trend for prices at which people recently bought GPUs is the 95th-percentile active price data from 2011 – 2020, since release price data does not account for existing GPUs becoming cheaper over time. The pre-crypto trends are similar to the overall trends, suggesting that the trends we are seeing are not anomalous due to cryptocurrency.
Given that, we guess that GPU prices as a whole have fallen at rates that would yield an order of magnitude over roughly:
Half-precision FLOPS seem to have become cheaper substantially faster than single-precision in recent years. This may be a “catching up” effect as more of the space on GPUs was allocated to half-precision computing, rather than reflecting more fundamental technological progress.
Primary author: Asya Bergal
there are 15 SMs (2880/192), each with 64 DP ALUs that are capable of retiring one DP FMA instruction per cycle (== 2 DP Flops per cycle).
15 x 64 x 2 * 745MHz = 1.43 TFlops/sec
which is the stated perf:
http://www.nvidia.com/content/tesla/pdf/NVIDIA-Tesla-Kepler-Family-Datasheet.pdf “
Person. “Comparing CPU and GPU Theoretical GFLOPS.” NVIDIA Developer Forums, May 21, 2014. https://forums.developer.nvidia.com/t/comparing-cpu-and-gpu-theoretical-gflops/33335.