We owe the rapid growth in GPU memory bandwidth to HBM. It debuted on the AMD Radeon R9 Fury X graphics card back in 2015, with a first generation that ran at 128 GB/s per stack but was very limited. Now it is time to talk about the future: HBM4 memory, which is expected to jump to a 2,048-bit interface and could double the bandwidth compared to the current generation.
When talking about graphics cards, VRAM type and bandwidth have always been quite important factors. It is not the same to have GDDR5 memory as top-of-the-range GDDR6X, going from a 128-bit to a 384-bit bus if we compare, for example, the RX 550 and the RTX 4090. It is an extreme example, but it helps us appreciate the enormous difference in bandwidth between the two. However, when AMD used HBM (High Bandwidth Memory) in its graphics cards and we expected a great leap in performance, it did not turn out that way.
HBM4 memory would arrive with a 2,048-bit interface
AMD’s first GPUs with HBM1 were limited to only 4 GB, but fortunately new generations followed. With HBM2, the speed doubled to 256 GB/s per stack and the maximum capacity rose to 8 GB. After this, in 2018, came an update called HBM2E, which improved the speed to 460 GB/s and raised the capacity limit to 24 GB.
From there came the third generation, HBM3, which reached 819 GB/s per stack with a capacity of up to 64 GB. Finally, where we currently stand is HBM3E memory, which reaches a speed of 1.2 TB/s per stack over a 1,024-bit bus. This is already very fast and is used in the most powerful GPUs, but now it is revealed that HBM4 would arrive with a 2,048-bit interface, which could double the bandwidth in the best case.
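The per-stack figures above follow from a simple relationship: bandwidth equals bus width times per-pin data rate. As a minimal sketch (the per-pin rates of 6.4 and 9.6 Gb/s are illustrative values chosen to match the per-stack numbers quoted here; real parts vary by vendor and speed bin):

```python
def stack_bandwidth_gbps(bus_bits: int, pin_rate_gbps: float) -> float:
    """Per-stack bandwidth in GB/s: bus width (bits) * per-pin rate (Gb/s) / 8."""
    return bus_bits * pin_rate_gbps / 8

# HBM3: 1,024-bit bus at ~6.4 Gb/s per pin -> ~819 GB/s per stack
print(stack_bandwidth_gbps(1024, 6.4))   # 819.2
# HBM3E: 1,024-bit bus at ~9.6 Gb/s per pin -> ~1.2 TB/s per stack
print(stack_bandwidth_gbps(1024, 9.6))   # 1228.8
```

This is also why the 2,048-bit interface matters: doubling the bus width doubles the result even if the per-pin rate stays the same.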
With 2,048 bits, half as many stacks will be required as with HBM3
To reach that ideal scenario, HBM4 would have to maintain the same per-pin transfer speed, something that may not be achieved. Even so, the main advantage lies in its 2,048-bit interface, since it would directly reduce the number of stacks needed or increase the GB on the card. To put it in perspective, the NVIDIA H100, one of the most powerful GPUs for AI, employs a total of six HBM3 stacks of 1,024 bits each, resulting in a 6,144-bit interface.
If HBM4 memory delivers the promised 2,048-bit interface, only 3 stacks (half as many) will be needed to reach the same 6,144 bits while maintaining the same performance. It is too early to say for sure, as the most modern current standard, HBM3E, was revealed in May 2023 and so far we have only seen one GPU set to use it, the NVIDIA GH200. This will become NVIDIA’s most powerful AI product, with a total of 282 GB of HBM3E memory.
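The stack arithmetic above can be sketched in a few lines (a simple back-of-the-envelope check, using only the interface widths quoted in the article):

```python
def stacks_needed(total_bus_bits: int, per_stack_bits: int) -> int:
    """How many memory stacks are needed to reach a target total bus width."""
    return total_bus_bits // per_stack_bits

# H100-style target: six 1,024-bit HBM3 stacks give a 6,144-bit interface
total_bits = 6 * 1024
print(stacks_needed(total_bits, 1024))  # 6 stacks with HBM3 (1,024 bits each)
print(stacks_needed(total_bits, 2048))  # 3 stacks with 2,048-bit HBM4
```

Halving the stack count for the same total width is exactly the packaging advantage the 2,048-bit interface promises, provided per-stack speed holds up.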