New details continue to emerge about the Radeon RX 7900 XT, based on the new RDNA 3 architecture, and the GeForce RTX 4090, based on the Ada Lovelace architecture. These GPUs are expected to bring major performance improvements and are also expected to be the most power-hungry chips ever produced.

While NVIDIA is sticking with a monolithic approach for its Ada Lovelace architecture, AMD is expected to use a multi-chip module (MCM) design similar to its CDNA 2 architecture. AMD will now bring the same MCM technology to consumer and gaming GPUs.

AMD Radeon RX 7900 XT: RDNA 3 Architecture and Navi 31

The flagship RDNA 3 chip, the AMD Navi 31 GPU, will power the next-generation Radeon RX 7900 XT graphics card. AMD’s new-generation RDNA 3 chips will be organized around WGPs (Workgroup Processors) instead of CUs (Compute Units). The GPU is said to be built from two base IPs: a GCD (Graphics Core Die) based on TSMC’s 5nm process and an MCD (Multi-Cache Die) based on TSMC’s 6nm process.

The rumored Navi 31 configuration includes two GCDs and a single MCD. Each GCD has 3 Shader Engines (6 total) and each Shader Engine has 2 Shader Arrays (2 per SE / 6 per GCD / 12 total).

Each Shader Array consists of 5 WGPs (10 per SE / 30 per GCD / 60 total), and each WGP has 8 SIMD32 units with 32 ALUs each (40 SIMD32 per SA / 80 per SE / 240 per GCD / 480 total). Together, these SIMD32 units add up to 7,680 cores per GCD and 15,360 cores in total.
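The unit counts above follow from multiplying down the hierarchy. The short Python sketch below reproduces that arithmetic; the constant and function names are illustrative only, and the figures are the rumored ones quoted in this article, not confirmed specifications.

```python
# Rumored Navi 31 hierarchy, using the figures quoted in this article.
GCDS = 2                    # Graphics Core Dies per GPU
SHADER_ENGINES_PER_GCD = 3
SHADER_ARRAYS_PER_SE = 2
WGPS_PER_SA = 5
SIMD32_PER_WGP = 8
ALUS_PER_SIMD32 = 32


def navi31_counts():
    """Return per-GCD and total unit counts for the rumored configuration."""
    wgps_per_gcd = SHADER_ENGINES_PER_GCD * SHADER_ARRAYS_PER_SE * WGPS_PER_SA
    simd32_per_gcd = wgps_per_gcd * SIMD32_PER_WGP
    cores_per_gcd = simd32_per_gcd * ALUS_PER_SIMD32
    return {
        "wgps_per_gcd": wgps_per_gcd,
        "wgps_total": wgps_per_gcd * GCDS,
        "simd32_per_gcd": simd32_per_gcd,
        "simd32_total": simd32_per_gcd * GCDS,
        "cores_per_gcd": cores_per_gcd,
        "cores_total": cores_per_gcd * GCDS,
    }


if __name__ == "__main__":
    for name, value in navi31_counts().items():
        print(f"{name}: {value}")
```

Running it gives 30 WGPs, 240 SIMD32 units and 7,680 cores per GCD, which doubles to 60, 480 and 15,360 across both dies.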

The Navi 31 MCD will be connected to the two GCDs via the next-generation Infinity Fabric interconnect and will carry 256-512 MB of Infinity Cache. Each GPU should also have 4 memory links (32-bit each). This means that there will be a total of 8 32-bit memory controllers for the 256-bit bus interface.

Another recent rumor suggested that AMD will use 3D Infinity Cache technology in the RDNA 3 family. Just like the L3 cache of Milan-X chips, an additional cache die would be stacked vertically on top of the existing cache on the GPU.

AMD RDNA GPU Comparison

| GPU Name | Navi 10 | Navi 21 | Navi 31 |
| --- | --- | --- | --- |
| GPU Manufacturing Technology | 7nm | 7nm | 5nm (6nm?) |
| GPU Package | Monolithic | Monolithic | MCM (Multi-Chip Module) |
| Shader Engines | 2 | 4 | 6 |
| GPU WGPs | 20 | 40 | 30 (per GCD), 60 (total) |
| SPs per WGP | 128 | 128 | 256 |
| Compute Units | 40 | 80 | 120 (per GCD), 240 (total) |
| Cores (Per Die) | 2560 | 5120 | 7680 |
| Cores (Total) | 2560 | 5120 | 15360 (2 x GCD) |
| Memory Bus | 256-bit | 256-bit | 256-bit |
| Memory Type | GDDR6 | GDDR6 | GDDR6 |
| Memory Capacity | 8GB | 16GB | 32GB |
| Infinity Cache | N/A | 128 MB | 256-512 MB |
| Flagship SKU | Radeon RX 5700 XT | Radeon RX 6900 XTX | Radeon RX 7900 XT |
| TBP | 225W | 330W | 350-550W |
| Release Date | Q3 2019 | Q4 2020 | Q4 2022 |

NVIDIA GeForce RTX 4090: Ada Lovelace Architecture and AD102 GPU

According to current information, NVIDIA’s Ada Lovelace GPUs will be built on TSMC’s N5 (5nm) process. Unlike AMD, the green team will stick with a monolithic design for its new graphics cards. The flagship RTX 4090 model will use the AD102 GPU.

The AD102 GPU is said to be clocked as high as 2.5 GHz (2.3 GHz average boost). According to preliminary specifications, which may still change, the NVIDIA AD102 has 18,432 CUDA cores housed in 144 SM units. That’s an almost two-fold increase in core count compared to Ampere’s GA102 (10,752 cores). A 2.3-2.5 GHz clock speed would give up to 85 to 92 TFLOPs of compute performance (FP32). This is more than double the FP32 performance of the current RTX 3090, which delivers 36 TFLOPs of FP32 computing power.
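The 85 to 92 TFLOPs range can be reproduced with the usual peak-throughput estimate. The sketch below assumes each CUDA core retires one fused multiply-add (counted as 2 floating-point operations) per clock, which is the common convention for quoted FP32 figures; the function name is illustrative.

```python
def fp32_tflops(cuda_cores, clock_ghz, flops_per_core_per_clock=2):
    """Peak FP32 throughput in TFLOPs.

    Assumption: one fused multiply-add (2 FLOPs) per core per clock.
    cores * FLOPs/clock * GHz gives GFLOPs, so divide by 1000 for TFLOPs.
    """
    return cuda_cores * flops_per_core_per_clock * clock_ghz / 1000.0


if __name__ == "__main__":
    AD102_CORES = 18432  # rumored, preliminary
    for clock in (2.3, 2.5):
        print(f"{clock} GHz: {fp32_tflops(AD102_CORES, clock):.1f} TFLOPs")
```

With 18,432 cores this works out to roughly 84.8 TFLOPs at 2.3 GHz and 92.2 TFLOPs at 2.5 GHz, matching the rumored range.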

The 150% performance gain seems huge, but it’s important to note that NVIDIA already made a huge leap in FP32 figures this generation with Ampere. The Ampere GA102 GPU (RTX 3090) offers 36 TFLOPs, while the Turing TU102 GPU (RTX 2080 Ti) offers 13 TFLOPs of raw power. In other words, there was an increase of over 150% in FP32 figures. However, the RTX 3090 was only around 50-60% faster than the RTX 2080 Ti in real-world gaming performance.
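The percentages in this comparison are simple relative gains. As a quick check, the sketch below computes them from the TFLOPs figures quoted in the text, using about 90 TFLOPs as an assumed midpoint for the rumored AD102 range; the helper name is illustrative.

```python
def percent_gain(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Turing (RTX 2080 Ti) -> Ampere (RTX 3090), figures from the text.
    print(f"Turing -> Ampere: {percent_gain(13, 36):.0f}%")
    # Ampere (RTX 3090) -> rumored Ada (assumed ~90 TFLOPs midpoint).
    print(f"Ampere -> Ada:    {percent_gain(36, 90):.0f}%")
```

Both generational jumps land at or above 150% on paper, which is why the theoretical figure says little by itself about real-world gaming gains.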

In addition, the NVIDIA GeForce RTX 40 flagship is reported to have a 384-bit bus interface similar to the RTX 3090. The cards will still use GDDR6X memory, but we will see higher bandwidth compared to current models. The RTX 4090 will have 24GB of memory, so we can expect either single-sided 16Gb DRAM modules or double-sided 8Gb DRAM modules.
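The single-sided versus double-sided distinction follows from how many memory chips are needed to reach 24GB on a 384-bit bus. The sketch below works this out under two assumptions that are not stated in the article: the module densities are read as 16 Gbit and 8 Gbit, and each chip position occupies one 32-bit channel, with two chips sharing a channel when modules are mounted on both sides of the board.

```python
BUS_WIDTH_BITS = 384      # rumored RTX 4090 bus width
CAPACITY_GB = 24          # rumored memory capacity
CHANNEL_WIDTH_BITS = 32   # assumption: one 32-bit channel per chip position


def memory_layout(chip_density_gbit):
    """Return (chip_count, chips_per_channel) for a chip density in gigabits."""
    chip_count = CAPACITY_GB * 8 // chip_density_gbit  # GB -> Gbit, then divide
    channels = BUS_WIDTH_BITS // CHANNEL_WIDTH_BITS
    return chip_count, chip_count // channels


if __name__ == "__main__":
    for density in (16, 8):
        chips, per_channel = memory_layout(density)
        side = "single-sided" if per_channel == 1 else "double-sided"
        print(f"{density} Gbit chips: {chips} modules, "
              f"{per_channel} per channel ({side})")
```

Under these assumptions, 16 Gbit chips need 12 modules (one per channel), while 8 Gbit chips need 24 modules (two per channel), which is the double-sided layout.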

NVIDIA GPU Comparison

| GPU Name | TU102 | GA102 | AD102 |
| --- | --- | --- | --- |
| GPU Architecture | Turing | Ampere | Ada Lovelace |
| GPU Manufacturing Technology | TSMC 12nm FFN | Samsung 8nm | 5nm |
| Graphics Processing Clusters (GPC) | 6 | 7 | 12 |
| Texture Processing Clusters (TPC) | 36 | 42 | 72 |
| Streaming Multiprocessors (SM) | 72 | 84 | 144 |
| CUDA Cores | 4608 | 10752 | 18432 |
| Theoretical TFLOPs | 16.1 | 37.6 | ~90? |
| Memory Bus | 384-bit | 384-bit | 384-bit |
| Memory Capacity | 11GB (2080 Ti) | 24GB (3090) | 24GB (4090?) |
| Flagship SKU | RTX 2080 Ti | RTX 3090 | RTX 4090? |
| TGP | 250W | 350W | 450-650W? |
| Release Date | September 2018 | September 2020 | 2022? |

