In 2024, single-phase direct-to-chip (D2C) cooling is the leading approach to thermal management for high-end GPUs. Nevertheless, rising thermal design power (TDP) will make two-phase D2C cooling necessary, and it is expected to be deployed at scale by 2026 or 2027 at the latest.
“IDTechEx has interviewed a large number of players in the data center value chain, ranging from chip makers and cold plate suppliers to system integrators. Despite different opinions on the exact timeline, the consensus is that around 1500W is the TDP where single-phase D2C starts to struggle, and 2000W might be the limit of single-phase D2C. According to IDTechEx's analysis of the historic trend in GPU thermal design power, the take-off of two-phase direct-to-chip will happen soon. IDTechEx also projects the future trend of GPU TDP, based on the historic trend and the roadmaps of leading chip suppliers interviewed by IDTechEx, such as Nvidia,” said Yulin Wang, Senior Technology Analyst at IDTechEx.
D2C cooling challenges: Single and two-phase
IDTechEx anticipates both technical and business-related benefits and challenges with the possible adoption of two-phase liquid cooling. Single-phase direct-to-chip (D2C) cooling is a popular and relatively simple approach: without undergoing a phase change, a liquid coolant, usually a water-glycol mixture, absorbs heat from the chips through convection.
Single-phase cooling must nonetheless overcome significant technical obstacles, including the possibility of coolant leaks that could endanger IT systems and the mechanical strain brought on by high flow rates. Cooling a chip with a TDP of 1000W takes a flow rate of about 1.5L/min, which is considerable. The higher flow rate calls for quick disconnects (QDs) with larger diameters and increases the risk of erosion and corrosion, which quickly raises the overall cost.
“The complexity of plumbing in data centers, especially around tight spaces, adds to the maintenance burden. Additionally, the high capital expenditure (CAPEX) (e.g., US$200-US$400 for a cold plate system including QDs, fluid distribution manifold inside servers, hoses, etc.) required for installation, particularly in retrofitting older data centers, makes cold plate cooling a costly option upfront, even though it will be more energy efficient over the long run and thereby save costs,” Wang added.
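To put the 1.5L/min figure in perspective, a simple sensible-heat balance (heat = mass flow × specific heat × temperature rise) gives the coolant temperature rise that such a flow implies at 1000W. The sketch below is illustrative only: the function names are hypothetical, and the fluid properties are approximate values for plain water, an assumption made here because the article does not specify the water-glycol mixture. A glycol mixture has a lower specific heat, so its temperature rise would be somewhat higher.

```python
# Sensible-heat balance for a single-phase cold plate:
#   heat_w = mass_flow * cp * delta_t
# Assumed fluid properties (plain water, approximate; not from the article).
RHO_KG_PER_L = 1.0       # density, kg per litre
CP_J_PER_KG_K = 4186.0   # specific heat, J/(kg*K)


def coolant_temp_rise(heat_w: float, flow_l_per_min: float) -> float:
    """Temperature rise (K) of the coolant absorbing heat_w at the given flow."""
    mass_flow_kg_s = RHO_KG_PER_L * flow_l_per_min / 60.0
    return heat_w / (mass_flow_kg_s * CP_J_PER_KG_K)


def required_flow(heat_w: float, delta_t_k: float) -> float:
    """Flow (L/min) needed to absorb heat_w with a coolant rise of delta_t_k."""
    mass_flow_kg_s = heat_w / (CP_J_PER_KG_K * delta_t_k)
    return mass_flow_kg_s * 60.0 / RHO_KG_PER_L


if __name__ == "__main__":
    rise = coolant_temp_rise(1000.0, 1.5)
    print(f"Coolant temperature rise at 1000 W and 1.5 L/min: {rise:.1f} K")
```

Under these assumptions the coolant warms by roughly 9.6 K across the cold plate. Holding that rise constant, the required flow grows in proportion to the chip's power, which is why higher TDPs push single-phase systems toward larger connectors and higher mechanical stress.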
On the other hand, two-phase D2C cooling offers higher efficiency by using the phase change of the coolant, which allows for better heat dissipation and lower cooling costs per watt. It also reduces mechanical stress because it operates at lower flow rates than single-phase systems. For instance, the flow rate for a two-phase cold plate is around 0.3L/min to cool down a chip with a TDP of 1000W.
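As a rough illustration of what these figures mean at the TDP levels cited earlier (1500W and 2000W), the sketch below scales the article's per-1000W flow rates linearly with chip power. Linear scaling is an assumption made here for illustration, not an IDTechEx projection; real cold plates are constrained by pressure drop, fluid properties, and allowable temperatures, so actual flow requirements may differ. The names `FLOW_L_PER_MIN_AT_1KW` and `scaled_flow` are hypothetical.

```python
# Flow rates quoted in the article for a chip with a TDP of 1000 W.
FLOW_L_PER_MIN_AT_1KW = {
    "single-phase": 1.5,
    "two-phase": 0.3,
}


def scaled_flow(kind: str, tdp_w: float) -> float:
    """Estimate flow (L/min) at tdp_w, ASSUMING flow scales linearly with TDP."""
    return FLOW_L_PER_MIN_AT_1KW[kind] * tdp_w / 1000.0


if __name__ == "__main__":
    for tdp in (1000, 1500, 2000):
        single = scaled_flow("single-phase", tdp)
        two = scaled_flow("two-phase", tdp)
        print(f"{tdp} W: single-phase {single:.2f} L/min, "
              f"two-phase {two:.2f} L/min (ratio {single / two:.0f}x)")
```

Under this assumption the two-phase cold plate needs about one fifth of the single-phase flow at every power level, for example 0.60L/min versus 3.00L/min at 2000W.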
However, two-phase systems come with their own challenges. The use of fluorinated liquids can lead to environmental hazards if these fluids escape and form aerosols, raising concerns about safety and their global warming potential (GWP). Additionally, these systems are expensive to implement, with higher CAPEX for cold plate setups and additional fluid recycling and disposal costs.
Despite its efficiency, the environmental and commercial hurdles make two-phase cooling a more complex choice. Careful design can mitigate some of these challenges, however. IDTechEx’s “Thermal Management for Data Centers 2025-2035: Technologies, Markets, and Opportunities” report also quantifies the CAPEX of single-phase and two-phase cooling technologies, with costs broken down per component.