Data, not guessing: Looking at Nvidia's past GPUs to predict the specs for its future RTX 60-series graphics cards

This year will be the tenth anniversary of the GeForce GTX 10-series, and since then Nvidia's gaming GPUs have undergone some fundamental changes to bring ray tracing and AI to the PC gaming masses. While compute performance, cache levels, and VRAM bandwidth are still key to getting high frame rates in games, today's GeForce graphics cards are far more versatile, capable, and complex than those from 2016.

But what of the future? What will Nvidia's next generation of gaming GPUs look like? With the chance of a Super refresh of Blackwell chips looking increasingly less likely, due to supply pressures on affordable VRAM, I've been spending some time mulling over what's next for Team Green.

To that end, I've looked back over 10 years' worth of GeForce cards, collated all the key information, and compared four tiers of models: 60-class, 70-class, 80-class, and the one at the very top of the chain. The latter is currently the 90-class, but with the GTX 10-series and RTX 20-series, it was known as the Titan.

I've got a few charts for you to peruse, and I'll discuss what each one can potentially tell us about the future. And then from all of this, a table of specs for the four primary tiers of RTX 60-series graphics cards I think we'll see in the near future.

Die size and process node

(Image credit: Taiwan Semiconductor Manufacturing Co., Ltd.)

As I'm sure you already know, Nvidia doesn't manufacture the GPUs and other products that it designs. For that, it hires the services of TSMC (Taiwan Semiconductor Manufacturing Company), partly because it has a long history with this company, but mostly because it's the world's biggest and best when it comes to churning out hulking chips at the cutting edge of processor technology.

Just like all chip makers do, TSMC uses the simple phrase 'process node' to describe the hugely complex sequence of steps it carries out to manufacture hundreds of thousands of silicon wafers, coated with all kinds of materials and etched with light, that eventually get sliced up into individual processor dies.

Nvidia employs one of TSMC's most advanced process nodes, N3, to make its Rubin AI behemoths, but the Blackwell gaming chips that power the RTX 50-series graphics cards are made via a custom version of the previous N5 node, called 4N. The RTX 40-series GPUs were also made on this node, but for the 30-series, Nvidia used Samsung's 8LPH.

(Image credit: Locuza / Fritzchens Fritz)

That was a bit of a surprise when it was announced, because before the Ampere generation of GPUs, Nvidia's GTX 10- and RTX 20-series were both made on a custom TSMC N16 node, and TSMC's N28 process was used for the three GTX generations before that.

I mention all of this because it's the first port of call when making an educated guess as to what the RTX 60-series chips will be like. With Nvidia so heavily invested in AI now, I suspect that it won't use TSMC's most cutting-edge node, N2, but will stick with N3 for cost reasons.

This is important to understand because it will determine the approximate die density of these future GPUs, i.e. the number of transistors per square millimetre of die area. Nvidia's Blackwell and Ada Lovelace gaming chips have roughly the same density, as they're made on the same node: around 120 million transistors/mm2.

Relative die density for the last five generations of Nvidia gaming GPUs

(Image credit: Future)

TSMC's N3 is reported to be in the region of 200 million transistors/mm² or so (higher with certain variants), and if that's what we can expect for the next round of RTX GPUs, then we're looking at a 66% increase in density. However, this doesn't automatically mean we'll see chips with 66% more shaders and cache.

That's because the die density figure is for logic only, the stuff that makes up the shader cores and other processing elements. For GPU parts such as cache and PCIe/VRAM circuitry, the increase in density is much smaller, around 5% at best. So while Nvidia can jam lots more CUDA cores into its next-gen GPUs, it's quite limited as to what it can do with cache and analogue systems.
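As a back-of-the-envelope illustration of that uneven scaling, here's a quick sketch. The area split between logic, SRAM, and analogue circuitry, and the exact scaling factors, are my own illustrative assumptions, not Nvidia's or TSMC's figures:

```python
# Rough sketch of why a 66% logic density gain doesn't shrink the whole
# die by 66%. Area fractions and scaling factors are illustrative
# assumptions, not published figures.

def scaled_die_area(logic_frac, sram_frac, analog_frac,
                    logic_scale=1.66, sram_scale=1.05, analog_scale=1.00):
    """Relative area of a die after a node shrink, where each region
    scales by its own density improvement factor."""
    assert abs(logic_frac + sram_frac + analog_frac - 1.0) < 1e-9
    return (logic_frac / logic_scale +
            sram_frac / sram_scale +
            analog_frac / analog_scale)

# Assume a die that is 60% logic, 25% SRAM (caches), 15% analogue/IO
relative_area = scaled_die_area(0.60, 0.25, 0.15)
print(f"New die is {relative_area:.0%} of the old area")
print(f"Effective whole-die density gain: {1 / relative_area - 1:.0%}")
```

Under those assumptions, the whole-die gain works out to roughly half the headline logic figure, which is exactly why the cache and analogue parts end up dictating so much of the design.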

Nvidia has also steadily favoured using small dies for the majority of its gaming products, helping to improve wafer yields (the percentage of dies from a wafer that can be used) and profit margins (smaller dies mean more dies per wafer).

The exception to this has been at the very top end of the GPU scale, with the RTX 5090's chip being fairly close to the maximum size that TSMC's equipment can make. The reason for this isn't about making the 'ultimate' gaming GPU: it's all about having a product for the prosumer AI market.
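The yield and dies-per-wafer effect of die size can be sketched with two standard approximations. The 300 mm wafer diameter is real; the defect density is an assumed illustrative value, not TSMC's actual figure:

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """Common approximation for gross dies per wafer, with a correction
    term for partial dies lost around the wafer's edge."""
    r = wafer_diameter_mm / 2
    return int(math.pi * r**2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

def die_yield(die_area_mm2, defects_per_cm2=0.1):
    """Poisson yield model: fraction of dies free of killer defects."""
    return math.exp(-defects_per_cm2 * die_area_mm2 / 100)

for area in (180, 260, 750):  # mid-range-sized dies vs a flagship-sized die
    good = dies_per_wafer(area) * die_yield(area)
    print(f"{area} mm2: ~{dies_per_wafer(area)} gross dies, "
          f"~{good:.0f} good dies per 300 mm wafer")
```

The numbers are only ballpark, but the trend is the point: a flagship-sized die gives you an order of magnitude fewer good dies per wafer than a small one, which is why huge chips are reserved for products with huge margins.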

Relative die sizes for a selection of Nvidia GeForce RTX graphics cards

(Image credit: Future)

If you're wondering why the GTX 10-series chips were so small, it's because they were made on a heavily refined TSMC N16 process node that Nvidia had been working with for quite some time.

Anyway, as so much has changed since the GTX 10-series era of GPUs, it's hard to ascertain whether this trend will continue or if Nvidia's next chips will be substantially larger or smaller. My gut feeling is that, having refined its designs with Blackwell and Ada Lovelace, Nvidia will probably stick to using similar-sized dies for the RTX 60-series.

Putting all of this together suggests that we're going to see RTX 60-series GPUs with around 60-70% more transistors than Blackwell chips, but at the same die size. The next question to ponder is how Nvidia will spend that transistor budget.

CUDA cores and cache

With a stack more transistors to play around with, you'd think the first thing Nvidia would do would be to ramp up the number of CUDA cores (i.e. the 'shader' units), but historically that's not always been the case.

Relative number of CUDA cores in selected tiers of Nvidia GeForce GTX/RTX graphics cards

(Image credit: Future)

In the above chart, you can see that the CUDA count jumped up significantly with the switch to Samsung for the RTX 30-series, but other than the top-end models, the number of shaders in 60-, 70-, and 80-class graphics cards has barely changed. However, this chart is also rather misleading, and for two reasons.

First, not all shaders are equal, and second, not all shaders are clocked the same. Since the Pascal era of the GTX 10-series, CUDA cores have become increasingly capable and flexible, and better process nodes and chip designs have substantially lifted clock speeds.

A more appropriate chart, though still a touch limited, is one that shows the peak FP32 throughput for each GPU. This is a measure of how many 32-bit floating-point operations the chip can handle per second, one of the most common types of operation in 3D rendering.
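That peak figure comes straight from a simple formula: each CUDA core can execute one fused multiply-add, which counts as two FP32 operations, per clock. A quick sketch using the RTX 5090's published core count and approximate boost clock:

```python
def peak_fp32_tflops(cuda_cores, boost_clock_ghz):
    """Peak FP32 = cores x 2 ops per clock (one fused multiply-add)
    x boost clock in GHz, expressed in TFLOPS."""
    return cuda_cores * 2 * boost_clock_ghz / 1000

# RTX 5090: 21,760 CUDA cores at a boost clock of roughly 2.41 GHz
print(f"{peak_fp32_tflops(21760, 2.41):.1f} TFLOPS")
```

That lands at around 105 TFLOPS, in line with Nvidia's quoted figure for the card. The same arithmetic applies to every GPU in these charts; only the cores-per-die and clocks change.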

Relative peak FP32 throughput for classes of GeForce graphics cards

(Image credit: Future)

At first glance, this chart might seem no different to the previous one, but if you look closely, you can see that there is a noticeable gap between the RTX 20 and 30-series, and again between the RTX 30 and 40/50-series. All because of big increases in clock speeds and changes to the CUDA cores themselves.

Generally speaking, Nvidia has squeezed out roughly similar levels of FP32 performance for a given die density across the past five generations of GPUs, with the exception of the RTX 30-series, which was noticeably higher. If you're wondering why the switch to TSMC N5 didn't make much difference in that aspect, it's because RTX 40/50-series GPUs have vastly more L2 cache than all previous chips.

So much so that, for the chart below, I've had to use a logarithmic y-axis scale (base 2) in order to separate out the various GPUs enough for viewing.

Relative L2 cache levels in various GeForce RTX graphics cards

(Image credit: Future)

Where Pascal, Turing, and Ampere GPUs had to make do with a handful of megabytes of Level 2 cache, Nvidia took a leaf from AMD's RDNA 2 book and significantly increased the amount of last-level cache. Such large amounts of cache can be tricky to get right, in terms of capacity versus latencies, but the huge slices of SRAM go a long way in reducing the pressure on the VRAM bandwidth, as well as helping overall compute and ray tracing performance.

As already mentioned, due to how poorly SRAM scales with process node shrinks, Nvidia can't lob in a pile more cache without significantly increasing the die size. So it will probably stick to very similar amounts of L2 cache as used in Blackwell.

We should still see a healthy jump in the number of shaders, and thus FP32 throughput, but it's unlikely to be on the same scale as the increase in die density. For example, Ampere chips have an average density 81% higher than Turing chips, and, on average, 180% more CUDA cores per unit of die area.

However, while Ada Lovelace GPUs are 173% more dense than Ampere, in terms of transistors per square millimetre, the shader units per unit of die area figure is only 57% larger on average.

I feel that Nvidia will err on the side of caution with its RTX 60-series, motivated by a desire to keep profit margins as high as possible, and that we'll see something like a 30 to 50% increase in the shader count, compared to Blackwell. Before I go all crystal-ball and attempt to predict the specs of the main RTX 60-series cards, though, there are a couple more factors to consider: ray tracing and AI.

Tensor and Ray Tracing cores

In the chart below, I've plotted Nvidia's quoted 'AI TOPS' figures for each GPU. This is a measure of the absolute peak throughput for the GPU's Tensor cores, as measured in trillions of operations per second, and on first impression, it would seem that RTX GPUs (GTX chips don't have these matrix/tensor units) are almost nothing but Tensor cores.

Relative peak AI TOPS for specific classes of GeForce graphics cards

(Image credit: Future)

However, the chart is rather misleading, because for each successive generation of Tensor cores, Nvidia has upgraded them to not only carry out more operations per second, but also expanded the data formats they support. The respective AI TOPS figures are for the smallest, and thus quickest, format each GPU can handle. So for the RTX 20-series figures, they're all in INT4, whereas for RTX 50, it's FP4 with FP32 accumulate, using sparsity.

To address this, I spent some time calculating the relative figures for when INT8 is used, if only to have an even playing field. This particular data format isn't used in DLSS or gaming in general, but it's the one for which the data is most readily available.
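For anyone wanting to replicate that normalisation, here's the gist of it as a sketch. The assumptions (throughput doubles with each halving of format width, and sparsity doubles the quoted figure) are a common rule of thumb, not Nvidia's published methodology:

```python
# Normalise quoted 'AI TOPS' figures to dense INT8, assuming throughput
# doubles with each halving of format width, and that structured
# sparsity doubles the quoted number again. Rule-of-thumb assumptions.

FORMAT_BITS = {"INT4": 4, "FP4": 4, "FP8": 8, "INT8": 8, "FP16": 16}

def to_dense_int8_tops(quoted_tops, quoted_format, sparse=False):
    """Convert a vendor-quoted TOPS figure to an estimated dense INT8 figure."""
    tops = quoted_tops / 2 if sparse else quoted_tops  # remove sparsity uplift
    return tops * FORMAT_BITS[quoted_format] / 8       # rescale to 8-bit width

# e.g. a hypothetical card quoted at 1000 AI TOPS in sparse FP4:
print(to_dense_int8_tops(1000, "FP4", sparse=True))  # 250.0
```

The 1000 TOPS figure above is purely hypothetical, but it shows why the headline numbers flatter newer generations: a quarter of that quoted figure is format and sparsity accounting, not extra silicon.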

Relative peak INT8 performance of the tensor cores in Nvidia GPUs

(Image credit: Future)

Yes, the number of Tensor cores has increased in the top-end chips, but the units themselves have become more capable, especially after the RTX 30-series. How far Nvidia will be able to push this with the RTX 60-series is anyone's guess, but I suspect that they won't be any better in terms of operations per cycle; there will just be more of them, thanks to the increased die density.

Not that this is a problem, as such. Believe it or not, the tiny RTX 5060 has an INT8 figure that's only 6% lower than an RTX 2080's, so getting more of those cores will certainly benefit the performance of DLSS, even though it mostly uses FP16 and FP8 for upscaling (the data format used in frame generation isn't clear, unfortunately).

(Image credit: Nvidia)

The one thing Nvidia can't do is throw a massive pile of Tensor cores into the RTX 60-series GPUs, even though they take up relatively little die space by themselves, compared to the entire Streaming Multiprocessor (SM) structure that houses the CUDA cores.

At least, not without increasing the size of the register file in the SM. From Pascal through to Blackwell, each one has 64 kB worth of SRAM that stores the data the CUDA cores process while grinding through an operation. The Tensor cores also use that register file, so adding more to an SM could potentially cause problems with running out of registers.

For its AI data center Blackwell chips, Nvidia solved this by adding a dedicated 256 kB cache in the SM, purely for the Tensor cores, and it's possible that it could do something similar for its RTX 60-series chips. DLSS 4.5 doesn't really load up the Tensor cores all that much, but DLSS 5 (whatever its final form looks like) may well be the polar opposite.

This annotated die shot of the RTX 4090's AD102 GPU shows just how much space the L2 cache takes up (Image credit: Nemez / Fritzchens Fritz)

And it's a similar story when it comes to ray tracing. Each SM in every RTX GPU is home to a single 'RT core', but what's there and what it's capable of has significantly changed over the generations. I don't have any charts to show you for this one, but one only needs to compare the RT cores in Blackwell to the first iteration units in Turing to see that there will be more to come.

Fortunately, all that kind of stuff is pure logic, rather than a big slab of SRAM, so although the next-gen RT cores will be even more potent, I don't expect them to take up any more space than they currently do, relative to the rest of the SM.

Putting it all together

                  RTX 6090    RTX 6080    RTX 6070    RTX 6060
Die size          750 mm²     370 mm²     260 mm²     180 mm²
CUDA cores        32,768      14,734      7,896       5,346
FP32 TFLOPS       157         74          39          26
vs RTX 50 FP32    +50%        +31%        +25%        +37%
L2 cache (MB)     120         72          48          32
VRAM (GB)         32          24          16          12

Taking everything into account, and churning through various calculations with Excel, I've put down some ballpark figures for die size, CUDA count, L2 cache size, and VRAM capacity for the primary RTX 60-series models. I'm not suggesting that these are absolute figures or even targets; just think of them as zones where I suspect the cards will fall.
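The FP32 figures in the table fall out directly from the CUDA counts, assuming boost clocks in the region of the RTX 40-series' (the 2.4 GHz used here is my assumption, not a leak):

```python
def tflops(cores, clock_ghz):
    # 2 FP32 ops (one fused multiply-add) per core per clock
    return cores * 2 * clock_ghz / 1000

# The predicted CUDA counts from the table above
predicted = {"RTX 6090": 32768, "RTX 6080": 14734,
             "RTX 6070": 7896, "RTX 6060": 5346}

for name, cores in predicted.items():
    print(f"{name}: ~{tflops(cores, 2.4):.0f} TFLOPS at an assumed 2.4 GHz")
```

Run at 2.4 to 2.5 GHz, those counts land within a few percent of the table's TFLOPS figures, which is all they're meant to do: the table is zones, not spec sheets.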

I think Nvidia will want to stick with similar die sizes to those it already uses with Blackwell, but there's a good chance it could go smaller, especially if TSMC ends up charging a small fortune to use its N3 process node. If that turns out to be the case, then the above CUDA core counts are clearly going to be maximums, and the final numbers could be a good deal lower.

Due to the SRAM's poor scaling with node shrinks, I don't think next-gen Nvidia GPUs for consumers will be packing much more, if any, L2 cache than RTX 50-series chips currently do. Perhaps a little more for the higher-class models, but nothing outlandishly big.

(Image credit: Future)

So far, I've not said a word about VRAM, and that's because the global memory supply crisis has made that very difficult to judge. I do think that Nvidia will want to raise the capacities in some tiers, by using 3 GB GDDR7 modules instead of the usual 2 GB chips, if only to stave off some of the flak it got with the RTX 5060 and 5060 Ti.

However, I don't foresee any changes of note to the aggregated memory bus widths, i.e. the RTX 6060 will still be 128 bits, the RTX 6070 will be 192 bits, and so on. The reason for this is that it will help to keep the number of VRAM modules required to the bare minimum, and in turn, help profit margins.
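Because each GDDR7 module presents a 32-bit interface, the bus width pins down the module count, and with it the capacity options and bandwidth. A sketch of the arithmetic (the 28 Gbps per-pin data rate is an assumed typical GDDR7 speed, not a confirmed spec):

```python
def vram_config(bus_width_bits, gbps_per_pin=28, gb_per_module=2):
    """Module count, total capacity (GB), and bandwidth (GB/s) for a
    given bus width, assuming 32-bit GDDR7 modules."""
    modules = bus_width_bits // 32
    bandwidth_gbs = bus_width_bits * gbps_per_pin / 8  # bits -> bytes
    return modules, modules * gb_per_module, bandwidth_gbs

for bus in (128, 192, 256):
    for density in (2, 3):  # 2 GB vs 3 GB GDDR7 modules
        n, cap, bw = vram_config(bus, gb_per_module=density)
        print(f"{bus}-bit: {n} x {density} GB = {cap} GB @ {bw:.0f} GB/s")
```

This is exactly why 3 GB modules are the attractive lever: a 128-bit card goes from 8 GB to 12 GB with no change to the bus, the module count, or the bandwidth.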

With memory prices still sky-high and showing no signs of drastically reducing anytime soon, we could even have another generation of RTX graphics cards where there's no increase in VRAM capacities at all.

(Image credit: Micron)

The biggest unknown, though, is how high the GPUs will be clocked. Although the RTX 40 and 50-series graphics cards enjoyed a substantial boost over the previous generations, I'm not certain that TSMC's N3 will afford the same luxury, so I wouldn't be surprised if the RTX 60-series launches with clocks similar to those in the RTX 40-series.

If that's the case, then we're potentially looking at a 25 to 50% increase in compute performance compared to the 50-series. That might seem like a modest suggestion, especially given that the RTX 40-series saw far bigger increases over the RTX 30-series, the exception being the RTX 4060, which was merely 19% higher in terms of peak FP32 performance.

Of course, I could be wildly wrong here, and the bump in the number of CUDA cores could be much bigger than I think it's going to be. But I'm not totally convinced it will be an almighty jump. The only product where Nvidia really needs to push the boat out is the RTX 6090, to keep on top of prosumer AI demand. For the rest of the range, Team Green has little in the way of competition snapping at its ankles, forcing it to stay well ahead.

(Image credit: Nvidia)

That said, Nvidia can't rely on another DLSS moment to gift the RTX 60-series with seemingly outrageous levels of performance. Multi Frame Generation appears to be as performant as it's going to get, and Super Resolution upscaling certainly is. DLSS 5 is about changing the appearance of graphics, not outright boosting frame rates, so it's unlikely that Nvidia can lean on it to help out.

To push neural rendering and path tracing into the gaming masses, Nvidia will need to raise every aspect of its RTX GPUs, from CUDA count and cache levels to Tensor core performance and data bandwidth. Countering this are factors such as profit margin targets, process node costs, supply constraints for GDDR7, and a near-total lack of competition.

I've based my predictions on past data and trends, but with the semiconductor and PC markets being somewhat uncertain right now, I've also been deliberately cautious with my figures. All we need to do now is wait for the inevitable 'leaks' and 'rumours' to see just how close to the mark I've been.
