When IBM unveiled the world’s first 2nm chipset in May 2021, capable of fitting up to 50 billion transistors onto a chip the size of your fingernail, a brave new world seemed on the horizon. However, it could be some years yet before 2nm benefits cascade down to datacentre power densities, efficiencies and sustainability.
Rumour has it that a 3nm Intel chip will not appear until Lunar Lake in 2024. A 17th-generation chip for client devices, Nova Lake, is expected to follow, with a hoped-for 50% boost to CPU performance and the largest change to the architecture since Core in 2006. That would be roughly contemporaneous with Diamond Rapids for servers, in 2025 or thereabouts.
The dominant chipmaking giant continues to play its cards slowly, and close to its chest.
Intel declined to speak to Computer Weekly for this feature, but Uptime research director Daniel Bizo suggests that performance improvements will be gradual rather than revolutionary.
“Really, what it comes down to is the very nature of semiconductor physics,” says Bizo. “So this is not a new phenomenon. It improves at a slower pace than the physical density of it, across the span of a decade.”
From 2022, Intel's Sapphire Rapids Xeon chips are expected: a 10nm, multi-tile chiplet design with a higher power envelope. They are set to incorporate integrated, dynamic, high-bandwidth memory for enhanced capacity and data storage closer to the processor, which will be helpful for certain workloads, such as those that require lower latencies, says Bizo.
The Intel roadmap unofficially posted on Reddit in mid-2021 suggested a 10% CPU performance boost in 2022 with Raptor Lake followed by a “true chiplet or tile design” in Meteor Lake, more or less keeping pace with AMD and Apple.
Xeon's Sapphire Rapids has been tipped to offer 56 cores, 112 threads and thermal design power (TDP) of up to 350W. Key rival AMD is expected to offer up to 96 cores and 192 threads at up to 400W TDP with its EPYC Genoa processors as well as improved cache, IO, PCIe lanes and DDR5 capabilities.
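To put those tipped figures side by side, here is a rough back-of-the-envelope sketch of TDP per core and per thread. It uses only the rumoured numbers quoted above, which are not confirmed specifications, and TDP is a thermal design figure rather than measured power draw, so the results are indicative at best. The helper name `watts_per_core` is purely illustrative.

```python
# Tipped figures quoted in the article; treat them as rumours, not specs.
chips = {
    "Intel Sapphire Rapids": {"cores": 56, "threads": 112, "tdp_watts": 350},
    "AMD EPYC Genoa": {"cores": 96, "threads": 192, "tdp_watts": 400},
}


def watts_per_core(spec):
    """Divide the top TDP by the core count (illustrative helper)."""
    return spec["tdp_watts"] / spec["cores"]


for name, spec in chips.items():
    per_core = watts_per_core(spec)
    per_thread = spec["tdp_watts"] / spec["threads"]
    print(f"{name}: {per_core:.2f} W per core, {per_thread:.2f} W per thread")
```

On these numbers, the Intel part works out at 6.25W of TDP per core and the AMD part at roughly 4.17W, although per-core performance is not captured by this comparison.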
So it might be some time before chipset innovations on their own can assist with datacentre pressures.
Bizo adds: “When you keep integrating more cores, it’s difficult to keep pace with memory. You can add memory channels, which we’ve done, but it becomes costly after a certain point. You need logic boards with much more wiring, or you’ll run out of pins.”
Revising the software stack
Feeding processors with data without adding even more memory channels and modules should prove a more effective way of serving extreme-bandwidth applications. However, Uptime also considers that examining and revising the software stack is becoming crucial to driving performance and efficiency from the latest chips.
“Towards the mid-2020s, upcoming chips are not really going to offer that much excitement if people are not willing really to shake up the application stack and the way they run the infrastructure,” says Bizo.
“You can gain efficiencies if you change your practices around things like workload consolidation and software virtualisation – with many more virtual machines on the same server or perhaps consider software containers.”
That can be the bottom line, says Bizo, noting that the Skylake generation of scalable server chips emerging from 2017, at 14nm, consumed less power idling than the latest chips do today.
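The consolidation Bizo describes can be thought of as a packing problem: fit as many virtual machines as possible onto as few servers as possible. The sketch below uses a simple first-fit decreasing heuristic, which is a swapped-in textbook technique and not something Uptime or any vendor prescribes. The VM sizes, the server capacity and the function name `consolidate` are all hypothetical.

```python
def consolidate(vm_loads, server_capacity):
    """First-fit decreasing: place each VM on the first server with room."""
    servers = []  # each entry is the list of VM loads placed on one server
    for load in sorted(vm_loads, reverse=True):
        if load > server_capacity:
            raise ValueError(f"VM needing {load} exceeds server capacity")
        for placed in servers:
            if sum(placed) + load <= server_capacity:
                placed.append(load)
                break
        else:
            # No existing server had room, so bring another one into use.
            servers.append([load])
    return servers


# Hypothetical vCPU demand for eight VMs and a hypothetical 12-vCPU server.
vm_loads = [8, 4, 4, 2, 2, 2, 1, 1]
layout = consolidate(vm_loads, server_capacity=12)

print(f"One VM per server: {len(vm_loads)} servers")
print(f"Consolidated:      {len(layout)} servers -> {layout}")
```

In this toy case, eight lightly loaded servers collapse to two fully used ones. Real placement also has to account for memory, I/O and failure domains, which this sketch ignores.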
Anthony Milovantsev, partner at tech consultancy Altman Solon, says the reality is that we will firmly be in the standard paradigm of silicon substrate, CMOS transistors and von Neumann architecture for the foreseeable future.
He adds that while quantum computing is generating activity, use cases are a small subset of what is required – although datacentres to house a quantum machine will eventually look very different, perhaps with cryogenic cooling, for example.
“If they do need quantum capacity at all, normal enterprises will almost surely consume it as a service, rather than own their own,” says Milovantsev.
“More near-term, compound semiconductors have interesting properties allowing for higher clock speed operations, but they have been around a while and there are significant drawbacks versus silicon dioxide. So this will continue to be niche.”
So Milovantsev agrees with Bizo that chip innovation is likely to rely on continued incremental improvements in transistor process nodes like 3nm, as well as innovations such as gate-all-around RibbonFETs, or usage of innovative die packaging, such as 2.5D with silicon interposers or true 3D die stacking.
However, he points to Arm/RISC datacentre chip developments aimed at improved price-performance or niche high-performance computing (HPC) workloads. Examples include hyperscalers such as Amazon Web Services (AWS) moving into Arm/RISC with Graviton, and Nvidia’s announced Grace CPU for HPC.
“The net result of all this, though, is only marginal like-for-like power reduction at the chip level,” says Milovantsev. “In fact, the main result is rather higher power densities as you cram more transistors into small form factors to serve the ever-growing need for compute power. The power density – and therefore the datacentre cooling – problem will only become more important over time.”
Once, unless you were a hyperscaler, a datacentre hosting infrastructure-as-a-service (IaaS) companies or a cryptomining operation, you probably didn’t need high power densities or the robust cooling to support them. Of course, things are changing as enterprises more broadly make use of analytics, big data and machine learning.
“High-end datacentre CPUs from Intel and AMD have historically had TDPs in the 100-200W range,” says Milovantsev. “Current top-end AMD EPYC or Intel Ice Lake are already above 250W, and Intel Sapphire Rapids in late 2022 will be 350W.”
He advises explicitly linking the right applications to the right hardware with the right cooling and power systems in the right type of hall or facility, although business units will increasingly be asking for, and buying, chips with higher TDP envelopes.
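To see how the TDP figures Milovantsev cites add up at rack level, here is a minimal sketch of CPU-only power per rack. The rack layout (20 dual-socket servers) and the function name `rack_cpu_power_kw` are assumptions made for illustration, and the sum ignores memory, storage, networking, fans and power supply losses, so real rack draw would be higher.

```python
def rack_cpu_power_kw(tdp_watts, sockets_per_server=2, servers_per_rack=20):
    """CPU-only rack power in kW; the default layout is a hypothetical example."""
    return tdp_watts * sockets_per_server * servers_per_rack / 1000


# TDP points drawn from the ranges quoted in the article.
for tdp in (150, 250, 350):
    print(f"{tdp} W per CPU -> {rack_cpu_power_kw(tdp):.1f} kW of CPU TDP per rack")
```

Under these assumptions, moving from 150W to 350W parts takes the CPU share of a rack from 6kW to 14kW, which is the kind of jump that forces a rethink of cooling and power distribution.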
Datacentres should devise a menu of vetted cooling options to work with and plan how to route higher-amperage power using modern busbars, in addition to using the right server monitoring tools, says Milovantsev. ...