Recently we came to know that AMD and Nvidia are planning to adopt High Bandwidth Memory (HBM) for their next generation of high-end GPUs. AMD has taken the lead, announcing that it'll bring the first of these modules to consumers in 2016.
This makes it a very good time for me to introduce you to the three new standards in the memory world: Wide I/O, HBM, and HMC. But why do we need these technologies over the existing ones? That's the big question.
DDR4 and LPDDR4 are both incremental, evolutionary improvements to existing DRAM designs. As we’ll explore in this story, both standards improve power consumption and performance relative to DDR3/LPDDR3, but they’re not a huge leap forward.
While the standard has evolved considerably from where it began, it’s worth remembering that the first modern SDRAM DIMMs debuted on a 66MHz interface and provided 533MB/s of bandwidth. DDR4-3200, in contrast, is clocked at up to 1600MHz and offers up to 25.6GB/s of memory bandwidth. That’s an increase of 48x over nearly 20 years, but it also means that we’ve pushed the standard a very long way. While you can expect a newer standard, perhaps something like DDR5, to come along, the fact remains that the interface can only be stretched so far, and new standards will need to take its place in the near future. I'll give you a peek into that future, as briefly as possible so that you guys don't fall asleep midway.
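If you're wondering where those figures come from, the arithmetic is simple: peak bandwidth is just transfers per second times the width of the bus in bytes. Here's a quick Python sketch of that math using the standard 64-bit DIMM width; the numbers are the ones quoted above.

```python
def peak_bandwidth_gbs(transfers_per_sec, bus_width_bits):
    """Peak theoretical bandwidth in GB/s: transfers/s times bytes per transfer."""
    return transfers_per_sec * (bus_width_bits / 8) / 1e9

# Original SDRAM: ~66MHz bus, one transfer per clock, 64-bit DIMM.
sdram = peak_bandwidth_gbs(66.6e6, 64)       # ~0.533 GB/s (533MB/s)

# DDR4-3200: 1600MHz I/O clock, two transfers per clock (DDR), 64-bit DIMM.
ddr4 = peak_bandwidth_gbs(1600e6 * 2, 64)    # 25.6 GB/s

print(f"SDRAM-66:  {sdram:.3f} GB/s")
print(f"DDR4-3200: {ddr4:.1f} GB/s")
print(f"Improvement: {ddr4 / sdram:.0f}x")   # ~48x over nearly 20 years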
Wide I/O
Wide I/O (and Wide I/O 2) is a high-bandwidth, low-power standard designed for, and most useful in, mobile SoCs. The standard has been backed by Samsung and other smartphone manufacturers, since high-res handheld displays require lots of bandwidth while using as little power as possible is critical to battery life. Wide I/O is the first version of the standard, but it's likely that Wide I/O 2 or 3 is the version that actually makes it to market. No major devices are expected to ship with Wide I/O in the first half of 2015, but late 2015 may see the standard gain some limited ground. According to Crucial, DDR4 bandwidth maxes out at about 25.6GB/s. Wide I/O, on the other hand, has a bandwidth of 12.8GB/s.
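Running the same arithmetic with Wide I/O's parameters shows the design philosophy: instead of a narrow bus pushed to extreme clocks, you get a very wide bus at a leisurely clock, and the slow clock is exactly what keeps power draw down in a phone SoC. The figures below assume the first-generation spec's four 128-bit channels at a 200MHz single-data-rate clock; treat them as illustrative.

```python
def peak_bandwidth_gbs(transfers_per_sec, bus_width_bits):
    """Peak theoretical bandwidth in GB/s: transfers/s times bytes per transfer."""
    return transfers_per_sec * (bus_width_bits / 8) / 1e9

# Assumed first-gen Wide I/O parameters: 4 channels x 128 bits, 200MHz SDR.
wide_io = peak_bandwidth_gbs(200e6, 4 * 128)   # 12.8 GB/s from a slow, wide bus

# DDR4-3200 for comparison: a narrow 64-bit bus at 3200 MT/s.
ddr4 = peak_bandwidth_gbs(3200e6, 64)          # 25.6 GB/s from a fast, narrow bus

print(f"Wide I/O: {wide_io:.1f} GB/s at 200MHz")
print(f"DDR4:     {ddr4:.1f} GB/s at 1600MHz (DDR)")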
Wide I/O 2 or 3 may offer significantly more bandwidth, and keep in mind that this is a technology designed for power efficiency first and foremost. It could eventually be a big deal for gaming laptops and other portable hardware.
Hybrid Memory Cube (HMC)
Hybrid Memory Cube (HMC) is a joint standard from Intel and Micron that offers significantly more bandwidth than Wide I/O 2, but at the cost of higher power consumption and price. HMC is a forward-looking architecture designed for multi-core systems, and is expected to deliver bandwidths of up to 400GB/s, according to Intel and Micron. Production could begin next year, with HMC commercially available in 2017.
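Unlike the parallel buses above, HMC talks to the host over a handful of very fast serial links. As a rough, hedged sketch, the numbers below assume first-generation figures (4 links of 16 full-duplex lanes at up to 15Gbps per lane); the 400GB/s quoted above belongs to later, faster revisions.

```python
def hmc_aggregate_gbs(links, lanes_per_link, gbps_per_lane):
    """Aggregate HMC link bandwidth in GB/s, counting both directions
    of the full-duplex serial lanes."""
    total_gbps = links * lanes_per_link * gbps_per_lane * 2   # x2: full duplex
    return total_gbps / 8                                     # bits -> bytes

# Assumed first-generation configuration: 4 links x 16 lanes, 15 Gbps/lane.
print(hmc_aggregate_gbs(4, 16, 15))   # 240.0 GB/s
# Faster lanes in later revisions (~30 Gbps) push toward the quoted figures.
print(hmc_aggregate_gbs(4, 16, 30))   # 480.0 GB/s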
High Bandwidth Memory (HBM)
Finally, High Bandwidth Memory is a specialized application of Wide I/O 2, but explicitly designed for graphics. (Both AMD and Nvidia plan to adopt it for their next-generation GPUs.) HBM can stack up to eight 128-bit wide channels for a 1024-bit interface, allowing for total bandwidth in the 128-256GB/s range. In other words, it's not as cheap or power efficient as Wide I/O, but it should be cheaper than HMC. Also, since it's designed explicitly for high-performance graphics situations, future GPUs built with HBM might reach 512GB/s to 1TB/s of main memory bandwidth.
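The parallel-bus arithmetic from the Wide I/O section scales directly to HBM. Assuming first-generation HBM figures (a 1024-bit interface per stack clocked at 500MHz with a double data rate, i.e. 1Gbps per pin), you can see how a GPU carrying four or eight stacks lands in the 512GB/s to 1TB/s range.

```python
def hbm_stack_gbs(interface_bits, clock_hz, transfers_per_clock=2):
    """Peak bandwidth of one HBM stack in GB/s (double data rate by default)."""
    return clock_hz * transfers_per_clock * (interface_bits / 8) / 1e9

# Assumed first-gen HBM: 8 channels x 128 bits = 1024-bit interface,
# 500MHz clock at double data rate, i.e. 1 Gbps per pin.
per_stack = hbm_stack_gbs(1024, 500e6)   # 128 GB/s per stack

for stacks in (1, 2, 4, 8):
    print(f"{stacks} stack(s): {per_stack * stacks:.0f} GB/s")
# 1 -> 128, 2 -> 256, 4 -> 512, 8 -> 1024 GB/s: the ranges quoted above.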
Conclusion
When these new technologies ship, they could collectively revolutionize memory access speeds and overall performance. Many of the new standards explicitly allow for multi-threading and simultaneous accesses to different banks of memory, which could drastically cut latency on common operations. Meanwhile, in mobile, tablet GPUs should see a profound performance kick. One reason why tablet games continue to lag behind their desktop counterparts is that mobile parts simply lack the memory bandwidth for operations that desktop GPUs can handle.

These collective improvements could also obviate the critical bandwidth shortage that has hurt the GPU performance of both AMD’s APUs and, to a lesser extent, Intel’s CPUs as well. Both companies have designed solutions that make maximum use of a scarce resource, but a dual-channel memory architecture puts severe limits on how much GPU horsepower can fit into a CPU socket effectively. HMC and HBM could blow the doors off that problem, far beyond what AMD’s hypothetical GDDR5-equipped Kaveri might have offered.
AMD has talked about building an APU with HBM, but it’s not clear if we’ll see that chip in 2016 or at a later date. When it does come, the advantage could be profound. While AMD can’t pack a 300W GPU into a CPU socket for thermal reasons, the company could improve integrated GPU performance by 40-50% over today’s figures and, in the process, finally offer an integrated GPU that would hit “good enough” for the low end of the enthusiast market.