Can Nvidia’s AI Platform Make Data Centres More Sustainable?

Nvidia’s newest AI platform, the GB300 NVL72, introduces a set of onboard energy storage and power management tools designed to reduce the strain AI workloads place on electricity grids.
By using both hardware and software to limit energy spikes, the system offers a way to improve grid stability during power-intensive training sessions and to reduce the need for over-provisioned infrastructure in data centres.
Addressing instability from synchronised GPU workloads
In AI training, thousands of Graphics Processing Units (GPUs) often perform the same operations simultaneously but on different data.
This creates synchronised power demands that differ sharply from traditional data centre workloads, which are typically more staggered.
When these GPUs ramp up together at the start of a training job and wind down in unison at the end, the grid faces sudden shifts in demand that can take traditional generation systems up to 90 minutes to respond to.
Nvidia engineers illustrate these patterns using heatmaps and time-series data.
These show GPUs spiking in power usage during job start-up, fluctuating during the workload and dropping sharply at the end.
These shifts put stress on transformers, create electrical resonance and voltage instability and affect other grid users.
In July, CoreWeave became the first cloud provider to deploy the platform.
"CoreWeave is constantly working to push the boundaries of AI development further, deploying the bleeding-edge cloud capabilities required to train the next generation of AI models," says Peter Salanki, Co-Founder and Chief Technology Officer at CoreWeave.
"We're proud to be the first to stand up this transformative platform and help innovators prepare for the next exciting wave of AI."
To address these synchronised power swings, Nvidia has introduced a coordinated system of energy management across three phases of GPU operation: ramp-up, steady state and ramp-down.
During ramp-up, a new power cap function raises the limit on GPU power draw gradually, avoiding a sudden surge.
At the end of the job, a GPU burn mechanism briefly holds power consumption up, easing the transition instead of letting demand drop immediately.
This lets the system taper off in a controlled manner and aligns better with the grid’s operating limits.
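The ramp-up cap and ramp-down burn described above both amount to limiting how quickly power draw may change. The following is a minimal sketch of that idea, not Nvidia's implementation: the function name `shape_power`, the per-step ramp rates and the wattages are all hypothetical values chosen for illustration.

```python
def shape_power(requested_w, ramp_up_w_per_step, ramp_down_w_per_step, start_w):
    """Limit how fast the power level may rise or fall between time steps.

    Hypothetical model: `requested_w` is what the workload would draw if
    left unmanaged; the returned list is the shaped draw.
    """
    shaped = []
    level = start_w
    for target in requested_w:
        if target > level:
            # Ramp-up: a rising power cap lets draw increase only gradually.
            level = min(target, level + ramp_up_w_per_step)
        else:
            # Ramp-down: extra "burn" load holds draw up so it falls gradually.
            level = max(target, level - ramp_down_w_per_step)
        shaped.append(level)
    return shaped


# Hypothetical job: idle at 200 W, jump to 1000 W for five steps, back to idle.
requested = [200.0] + [1000.0] * 5 + [200.0] * 5
shaped = shape_power(requested, ramp_up_w_per_step=200.0,
                     ramp_down_w_per_step=200.0, start_w=200.0)

# Largest step-to-step change before and after shaping.
largest_raw_step = max(abs(b - a) for a, b in zip(requested, requested[1:]))
largest_shaped_step = max(abs(b - a) for a, b in zip(shaped, shaped[1:]))
print(shaped)
print(largest_raw_step, largest_shaped_step)
```

In this toy profile the shaped draw sits below the requested level during ramp-up (the cap) and above it during ramp-down (the burn), so the largest single step the grid sees is bounded by the configured ramp rate.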
Energy storage smooths the curve within each rack
The GB300 NVL72 platform also integrates updated hardware that includes energy storage at the shelf level.
Each shelf now contains electrolytic capacitors, electrical storage devices that charge during low-demand periods and discharge during high demand.
This flattens the power usage curve at the grid input, even while keeping output to GPUs unchanged.
Nvidia’s internal testing compares identical training workloads on the older GB200 racks and the new GB300 racks.
The older systems pass GPU power spikes directly through to the grid, creating instability.
The updated GB300 shelf reduces these peaks by 30% while maintaining GPU performance. The results show a marked improvement in alternating current (AC) power stability.
Working with LITEON Technology, Nvidia has re-engineered the power shelf to devote half its volume to energy storage.
Each GPU receives 65 joules of storage capacity, and a dedicated controller monitors and manages charge and discharge cycles in real time to match fluctuations in demand.
This control ensures the shelf responds dynamically to the workload, delivering power smoothing that is completely local to each rack.
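The charge and discharge behaviour can be illustrated with a simple buffer model. This is a rough sketch under stated assumptions, not the shelf controller's actual logic. The function `smooth_with_storage`, the 1000 W grid-side cap, the 10 ms time step and the load values are hypothetical. Only the 65 joules per GPU comes from the article, and the 72 GPUs per rack is taken from the NVL72 name.

```python
def smooth_with_storage(load_w, grid_cap_w, capacity_j, dt_s):
    """Return grid-side draw when a local energy buffer absorbs load peaks.

    The GPUs always receive `load_w`; only the grid-side draw changes.
    The buffer is assumed to start fully charged.
    """
    stored_j = capacity_j
    grid_w = []
    for load in load_w:
        if load > grid_cap_w:
            # Discharge: storage covers as much of the excess as it still holds.
            from_storage_w = min(load - grid_cap_w, stored_j / dt_s)
            stored_j -= from_storage_w * dt_s
            grid_w.append(load - from_storage_w)
        else:
            # Charge: use spare headroom under the cap to refill the storage.
            to_storage_w = min(grid_cap_w - load, (capacity_j - stored_j) / dt_s)
            stored_j += to_storage_w * dt_s
            grid_w.append(load + to_storage_w)
    return grid_w


JOULES_PER_GPU = 65      # figure from the article
GPUS_PER_RACK = 72       # per the NVL72 designation
rack_capacity_j = JOULES_PER_GPU * GPUS_PER_RACK  # total storage per rack

# Hypothetical single-GPU load swinging between 600 W and 1400 W in 10 ms steps.
load_w = [600.0, 1400.0, 1400.0, 600.0, 600.0, 1400.0]
grid_w = smooth_with_storage(load_w, grid_cap_w=1000.0,
                             capacity_j=JOULES_PER_GPU, dt_s=0.01)

print(rack_capacity_j)
print([round(g) for g in grid_w])
print(max(load_w), round(max(grid_w)))
```

With these illustrative numbers the buffer discharges during the 1400 W steps and refills during the 600 W steps, so the grid-side peak stays at the cap while the GPU-side load is unchanged. Once the buffer is empty, any remaining excess passes through to the grid, which is why the achievable smoothing depends on storage capacity and on how long a peak lasts.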
The same energy smoothing techniques also appear in Nvidia’s GB200 NVL72 platform.
Smoothing is managed at both shelf and rack levels using multiple shelves to distribute load and limit variation.
Fine-tuning of this setup is possible through Nvidia’s System Management Interface (SMI) tool or the Redfish protocol.
These allow operators to configure parameters like idle time before ramp-down and the rate at which power levels increase or decrease.
Lowering infrastructure cost for data centre operators
Data centres have traditionally overbuilt power infrastructure to accommodate peak loads.
This often leads to inefficient use of resources since those peak levels are infrequent.
By levelling the power demand curve, Nvidia’s new systems let operators size infrastructure more closely to average usage rather than maximum demand.
This brings two main benefits.
Operators can either fit more racks into the same power budget or reduce the total power allocation required for a deployment.
Either way, energy smoothing offers a route to better efficiency and potentially lower costs.
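The two options can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes that infrastructure is provisioned against each rack's peak draw and that the 30% peak reduction cited earlier applies directly to that provisioning. The 10,000 kW budget and 140 kW per-rack peak are hypothetical figures, not published specifications.

```python
def racks_within_budget(budget_kw, provisioned_kw_per_rack):
    """Number of whole racks that fit when each is provisioned at the given level."""
    return budget_kw // provisioned_kw_per_rack


budget_kw = 10_000          # hypothetical facility power budget
peak_kw_per_rack = 140      # hypothetical unsmoothed peak per rack
peak_reduction_pct = 30     # peak reduction figure cited in the article

# Provisioning level per rack once peaks are smoothed (integer kW arithmetic).
smoothed_kw_per_rack = peak_kw_per_rack * (100 - peak_reduction_pct) // 100

# Option 1: fit more racks into the same power budget.
racks_before = racks_within_budget(budget_kw, peak_kw_per_rack)
racks_after = racks_within_budget(budget_kw, smoothed_kw_per_rack)

# Option 2: keep the same number of racks and reduce the power allocation.
allocation_before_kw = racks_before * peak_kw_per_rack
allocation_after_kw = racks_before * smoothed_kw_per_rack

print(smoothed_kw_per_rack)
print(racks_before, racks_after)
print(allocation_before_kw, allocation_after_kw)
```

Under these assumptions the same budget holds 102 racks instead of 71, or the original 71 racks need 6,958 kW instead of 9,940 kW. Real deployments face other constraints, such as cooling and floor space, that this arithmetic ignores.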
Importantly, the system does not return energy to the utility; all smoothing happens within the data centre rack.
The energy management features in the GB300 NVL72 arrive as AI workloads grow larger and more complex.
As training models scale, the need for better infrastructure support becomes more pressing.
Nvidia’s approach offers an integrated solution that doesn’t rely on external grid coordination and operates entirely at rack level.
Cloud provider CoreWeave is the first to deploy the GB300 NVL72 platform.

