OCP Global Summit started this week in San Jose and, as expected, two topics are getting all the attention: AI compute and liquid cooling. Apart from these two, co-packaged optics (CPO) is another topic that has garnered much interest at this year's summit.
As for liquid cooling, market adoption has primarily been in using facility water via coolant distribution units (CDUs) and, to a lesser extent, immersion cooling. However, immersion cooling is garnering more attention at the summit this year than in past years.
Liquid cooling is becoming a necessity as CPU and GPU power dissipations are crossing the threshold of 500W TDP (thermal design power), a generic demarcation signifying the limit of fan-based cooling in servers and switches.
Moreover, the need for liquid cooling is power consumption-driven, as articulated in "It’s All About PUE", where power usage effectiveness (PUE) measures the datacenter power efficiency.
Ideally PUE is 1.0, meaning every watt consumed powers the servers, storage and networking gear in the datacenter. Readers may refer to articles on PUE published here, such as "Of Deepmind, DCIM, and Data Center Cooling", "DCIM And Deepmind, Take 2: AI to Control Google’s Data Center Cooling", and "Human-Centric Value for Smart Data Centers".
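To make the ratio concrete, here is a minimal sketch of the PUE calculation, using the standard definition of total facility power divided by IT equipment power. The function name `pue` and the load figures are hypothetical, chosen only for illustration; they do not come from any real facility.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Return power usage effectiveness: total facility power / IT power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("total facility power cannot be less than IT power")
    return total_facility_kw / it_equipment_kw


# Hypothetical figures (assumptions, not measurements):
it_load_kw = 1000.0               # servers, storage and networking gear
cooling_and_overhead_kw = 500.0   # cooling, power conversion losses, lighting

ratio = pue(it_load_kw + cooling_and_overhead_kw, it_load_kw)
print(f"PUE = {ratio:.2f}")  # prints "PUE = 1.50"
```

In this made-up case, every watt delivered to IT gear costs an extra half watt of overhead. Reducing the cooling share of that overhead is how more efficient cooling moves PUE toward the ideal of 1.0.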
At Electronics Cooling, we have extensively covered the topic of liquid cooling. Articles like "Updates in Liquid Cooling Research", "Liquid Cooling Systems in Data Centers Help Companies Achieve Sustainability Goals" and many others are worth reading.
Coming back to the OCP Global Summit, at last count there were more than 55 topics covering liquid cooling. Most of the presentation sessions are on Wednesday, 16-Oct-2024.
Here is a short list of a dozen sessions and their brief descriptions that are not to be missed.
- Wed, October 16, 8:00am – 8:20am | SJCC – Lower Level – LL20A
Best Practices for Liquid & Air Cooling of a 51.2Tbps Switch for High-Density AI Clusters
Track: Networking
Session Summary: The rapid evolution of AI training and inference is driving up compute and interconnect density, leading to higher power density per rack and increasing the challenge of managing heat dissipation.
This presentation will explore the efforts to address these challenges from a networking perspective, ranging from the switch silicon component to the system level.
- Wed, October 16, 9:15am – 9:30am | SJCC – Concourse Level – 220B
Open Systems for Density and AI
Track: Special Focus: Artificial Intelligence (AI)
Session Summary: AI is driving an unparalleled need for velocity and innovation across the industry. Dramatic gen-over-gen improvement in each new release and rising powers make it critical for the industry to rapidly deploy new infrastructure with innovative liquid cooling solutions at scale.
OCP standards provide industry standards for innovations to be adopted for today’s most demanding applications. This talk will introduce new OCP standards-based solutions for high-performance and AI workloads.
- Wed, October 16, 9:45am – 10:15am | SJCC – Concourse Level – 210AE
OCP Immersion Fluids Guidelines, Specifications and Lifecycle Management
Track: Cooling Environments: Immersion
Session Summary: The OCP immersion fluid community has grown exponentially over the last year, resulting in a host of new efforts aimed at solving fluid-related challenges faced by adopters of immersion cooling technology.
This session will provide updates on the latest community-driven developments including the latest guidelines in safe fluid handling, oxidation stability, cleaning procedures alongside sustainable practices for fluid selection, maintenance, and disposal.
The scope of the projects includes hydrocarbons, esters and fluorinated fluids, and both single-phase and two-phase applications.
The session will include practical examples of how these recommendations have been adopted by end users to improve the safe and sustainable operation of immersion systems.
- Wed, October 16, 10:10am – 10:25am | SJCC – Lower Level – LL20BC
Techno-economic Analysis of Data Center Waste Heat Recovery
Track: FTS: Data Center Sustainability
Session Summary: The exponential growth of Data Centres (DCs) has necessitated the exploration of sustainable solutions to mitigate their substantial energy consumption and environmental impact.
This study presents a techno-economic analysis of Waste Heat Recovery (WHR) in DCs, focusing on the utilization of waste heat from HVAC systems.
The recovered heat is employed for power generation using the Phasic Heat Engine (HE). The analysis evaluates the technical feasibility, economic viability, and environmental benefits of this approach.
- Wed, October 16, 10:25am – 10:40am | SJCC – Lower Level – LL20BC
A Flexible and Scalable Thermal Test Vehicle Design for Electronics Cooling Solutions
Track: FTS: Data Center Sustainability
Session Summary: With recent advances in machine learning and artificial intelligence, the desire for high-performance computing has never been greater. Datacenters already represent two percent of global energy usage, and forecasts predict that this may double in the next few years.
The density and power consumption of modern graphics processing units (GPUs) and central processing units (CPUs) are growing rapidly, which necessitates the design of advanced liquid cooling systems. Existing solutions for characterizing and validating these coolers are inadequate.
In this work, the design of a flexible, scalable thermal test vehicle (TTV) is presented, which is based on an array of power transistors.
- Wed, October 16, 10:35am – 10:45am | SJCC – Concourse Level – 210AE
Safe Handling Guidelines for Immersion Fluids
Track: Cooling Environments: Immersion
Session Summary: The fluid safe handling subgroup has drafted a whitepaper to provide safety recommendations for immersion cooling end users, fluid manufacturers, and information technology equipment manufacturers that may work directly with immersion cooling fluids.
The document outlines recommended safety practices and resources to consider when employing single-phase and two-phase immersion cooling fluids.
The whitepaper highlights the hazards associated with different fluid classes, recommendations for fluid handling, storage, and disposal, and methods to monitor and mitigate exposure.
- Wed, October 16, 10:45am – 11:10am | SJCC – Concourse Level – 210AE
Exploring the potential of single-phase immersion cooling through quantitative thermal performance testing
Track: Cooling Environments: Immersion
Session Summary: As the landscape of single-phase immersion expands, gaps have emerged in understanding the thermal performance expectations of the various cooling solutions in the market.
Specific to immersion fluids, previous work within Open Compute Project (OCP) has benefited from an analytic figures-of-merit (FOM) approach where fluids could be compared using fluid thermophysical properties.
In this talk we present experimental data detailing the thermal performance of a range of fluids in both natural and forced convection flow regimes. Key performance attributes such as case-to-fluid thermal resistance, fluid pumping power, and maximum thermal design power will be shared.
This presentation intends to prompt broader discussion within OCP on how a consistent, total system approach can drive the optimization of thermal management solutions and continue to push the boundaries of single-phase immersion to meet the future thermal demands of compute.
- Wed, October 16, 11:10am – 11:30am | SJCC – Concourse Level – 210AE
OCP Immersion Cooling Reliability
Track: Cooling Environments: Immersion
Session Summary: The OCP Immersion Reliability Committee addresses the broader compatibility and reliability issues of ITE hardware deployed in immersion cooling through the service lifetime.
This session will introduce the community contributions to provide guidance to understand the collective behavior of ITE hardware in contact with dielectric fluids, focusing on the material compatibility and performance stability of components and servers in the aging immersion fluid contaminated by ITE hardware materials.
The goal is to develop comprehensive guidelines that ensure the reliability and efficiency of complete immersion cooling systems.
- Wed, October 16, 12:30pm – 12:50pm | SJCC – Concourse Level – 210AE
Thermochemical Reliability of Components in Immersion Solutions
Track: Cooling Environments: Immersion
Session Summary: Immersion reliability testing is required to understand and mitigate thermochemical failure modes of components in immersion solutions since the immersion fluid is in direct contact with the IT equipment.
This session will introduce the immersion reliability test methodology Intel uses to build comprehensive thermochemical reliability risk assessments for Intel products using material compatibility and thermochemical reliability data collected on 3rd Gen Xeon CPUs.
- Wed, October 16, 12:50pm – 1:10pm | SJCC – Concourse Level – 210AE
Systems-Level Integration of Immersion Cooling: An Overview of the OCP Solutions Technical Committee
Track: Cooling Environments: Immersion
Session Summary: The “Immersion Solutions” Technical Committee includes work streams dedicated to the holistic integration of cooling solutions at the system level.
This community seeks to provide guidance for requirements, best practices, and safety protocols associated with system elements like fluids, IT equipment, tank/rack configurations, power distribution and systems monitoring.
The committee is also responsible for approving and maintaining Specifications and Product Acceptance within the OCP Marketplace related to immersion solutions. This session will provide an overview of current progress and priorities in advancing these systems-level insights.
Attendees will learn about the various work streams under the committee’s oversight, including Immersion Requirements, Power Distribution, Hardware Management, Failure Mode & Effect Analysis (FMEA), and Total Cost of Ownership (TCO).
- Wed, October 16, 1:10pm – 1:30pm | SJCC – Concourse Level – 210AE
Power Distribution in Immersion Environments: Challenges and Opportunities
Track: Cooling Environments: Immersion
Session Summary: The OCP Immersion Power Distribution work stream is innovating the integration of bus bars in immersion tanks, a crucial step in optimizing power delivery in liquid cooling systems.
This presentation will delve into a new base specification and explore both the technical challenges and the unique opportunities this integration presents.
Emphasis will be on the design considerations, thermal management implications, and the potential for enhancing power efficiency in data center operations through this novel approach.
- Wed, October 16, 1:30pm – 1:55pm | SJCC – Concourse Level – 210AE
Balancing Innovation, Performance and Safety: the Evolution of Immersion Requirements
Track: Cooling Environments: Immersion
Session Summary: The OCP Immersion Requirements work stream is dedicated to maintaining and evolving community guidance in the form of an Immersion Requirements document.
This document is a critical resource for the industry, providing essential specifications and standards for the implementation of immersion cooling technology.
Importantly, it serves as a foundation for all Base, Design and Product Specifications intended for use in an immersion environment.