Phil Tuma
3M Company, St. Paul, Minnesota
The inefficiencies of legacy datacenter air cooling schemes are by now well known. New “Free Air” cooling technologies in which air is introduced to the racks in a facility or container and confined in hot and cold plena probably represent the pinnacle of air cooling efficiency, at least where climate permits them. However, development of an air cooled server requires significant engineering resources. The resultant solution is hardware intensive and the power density and energy efficiency are ultimately limited by airflow paths, fans, blowers, filters and the inherent inability to capture and utilize the waste heat. This suggests an inefficient use of engineering and natural resources, electrical power, and real estate [1].
It is generally recognized that liquid cooling can dramatically increase efficiency, power density and the thermodynamic availability of the heat removed [2]. However, implementation of traditional pumped liquid cooling schemes, be they single- or two-phase, is also hardware intensive. These systems bear the inherent costs of cold plates, manifolds, redundant pumps, plumbing, controls, heat exchangers, quick disconnects (QDs), etc. Controlling the flow and mitigating the loss of water or refrigerant through this myriad of components within a rack or container is an engineering challenge exacerbated by the number and variety of heat generating devices on a server and the requirement that each server within a rack be “hot swappable.” It is for these reasons that liquid cooling has been relegated to the world of mainframes and supercomputers, the cost barriers too high for the commodity datacom environment to bear.
Passive two-phase immersion cooling is arguably one of the most elegant ways to capture all of the heat generated by a complex electronic assembly. By immersing electronics in a bath of volatile dielectric coolant that boils on the heat generating devices, much of the aforementioned hardware can be eliminated. The heat is captured efficiently as saturated vapor and can be transferred efficiently by condensation to an external heat sink like air or water. This technique has been used for decades in countless transformers, klystrons and traction inverters, some of which are still in production today, favored for their simplicity, reliability, power density and performance. However, these systems use sealed pressure vessels with hermetic electrical connections. Since they are evacuated and filled like refrigeration systems, they are not easily serviced. Creating and maintaining such an enclosure for commodity computational or communications hardware that must be field serviceable is challenging. It is for these reasons that two-phase immersion cooling, too, has been relegated to the world of supercomputers, and most engineers dismiss the idea of immersion cooling within a data center.
However, immersion cooling can be applied without the aforementioned complexities, resulting in a system that is not only elegant but also simpler, denser, much less expensive and at least as efficient as any other liquid cooling technique. This new twist on passive two-phase immersion cooling was detailed in a recent publication [3].
Open Bath Immersion Cooling Concept (OBI)
In this concept, servers are immersed side-by-side in simple modular semi-open baths of a volatile dielectric fluid (Figure 1). The term “semi” denotes a bath that is closed when access is not needed much like a chest-type food freezer. Unlike more traditional immersion systems, these baths operate at atmospheric pressure and have no specialized hermetic connections for electrical inputs and outputs. Instead, electrical connections from a submerged backplane enter a simple conduit beneath the liquid level and exit the top of the tank. The only other opening is through a vapor trap as will be discussed. The vapor generated by boiling rises to a condenser integrated into the tank and cooled by tower water or water used at some distance for comfort heating. Alternatively, the vapor can flow passively to an outdoor natural draft or forced air condenser to transfer its heat to outdoor air without water as an intermediate. Condensed vapor simply falls back to the bath. Servers can be hot swapped without disturbing their neighbors by simply removing them from the bath. They exit the bath dry resulting in minimal and easily quantified fluid losses.
Among the advantages of OBI compared to more traditional liquid cooling schemes is the fact that all server-level and most rack-level cooling hardware is eliminated, along with considerations relating to its integration, reliability and power consumption. Isothermal operation and fire protection are intrinsic to the technology. Of course, there are other considerations.
Chip-to-Fluid Performance
The system thermal performance has two components. The first is quantified by considering the junction-to-fluid temperature difference of the central processing unit (CPU). A typical CPU package is almost ideally suited for passive two-phase immersion cooling. It requires only the addition of a 100μm thick porous metallic boiling enhancement coating (BEC) to the integrated heat spreader (IHS). These coatings produce boiling heat transfer coefficients h>10 W/cm²-K at heat fluxes exceeding 30W/cm². Incorporating this technology directly onto the IHS during package assembly eliminates the secondary thermal interface common to many liquid cooling schemes without altering the package assembly process. The IHS not only spreads the heat, reducing the flux to a manageable level, but also protects the chip beneath from fluid-borne contaminants.
The resultant sink-to-fluid resistance, Rs-f, is dependent on the chip size (Figure 2). For a typical 20x20mm chip with a 30x30x3mm IHS, Rs-f =0.03°C/W. The additional resistances from sink-to-junction based on a 20x20mm thinned die and solder interface total 0.015°C/W [4]. With Rj-f=0.045°C/W, a 200W processor has an average junction-to-fluid temperature difference,
$\Delta T_{j\text{-}f} = R_{j\text{-}f}\, Q_{\mathrm{CPU}}$, (1)
of about 9°C. The fluid temperature, Tf, is the fluid’s atmospheric boiling point, and it remains constant.
$T_f = T_b = T_{\mathrm{sat}}(P_{\mathrm{atm}})$ (2)
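The arithmetic behind Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration of the resistance chain described above: the function and constant names are hypothetical, and the 49°C boiling point in the usage lines is the fluoroketone value quoted later in this article.

```python
# Resistances from the worked example in the text (20x20mm die, 30x30x3mm IHS).
R_SINK_TO_FLUID = 0.03      # °C/W, Rs-f with a boiling enhancement coating
R_JUNCTION_TO_SINK = 0.015  # °C/W, thinned die plus solder interface [4]


def junction_to_fluid_rise(q_cpu_w, r_sf=R_SINK_TO_FLUID, r_js=R_JUNCTION_TO_SINK):
    """Equation (1): average junction-to-fluid temperature difference in °C."""
    return (r_sf + r_js) * q_cpu_w


def junction_temperature(q_cpu_w, t_boil_c):
    """Equation (2) pins the fluid at its boiling point, so Tj = Tb + ΔTj-f."""
    return t_boil_c + junction_to_fluid_rise(q_cpu_w)


if __name__ == "__main__":
    print(junction_to_fluid_rise(200.0))      # rise for a 200W processor
    print(junction_temperature(200.0, 49.0))  # junction temperature at Tb = 49°C
```

Because the fluid temperature is fixed by the boiling point, the junction temperature in this sketch depends only on the device power and the choice of fluid, not on the water temperature.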
It should be recognized that even though this is a passive technique with a thermal interface (TIM1), the resultant chip-to-fluid thermal resistance is lower than that achievable with direct-die-contact spray or jet impingement schemes based on dielectric coolants. These active techniques generally result in Rj-f >0.10°C/W for the 20x20mm die discussed earlier. The chip-to-fluid thermal resistance of passive immersion is about 0.015°C/W higher than that achievable with a water microchannel cooler [4]. However, with immersion there is no advective resistance associated with temperature glide. The result is a similar average resistance but with all devices at the same temperature.
Fluid-to-Water Performance
The second component of thermal performance, in the case of a water-cooled bath, is the temperature difference from the fluid to the facility water. An efficient water-cooled condenser can achieve a volume-specific fluid-to-water-inlet resistance of Rf-w=1.4°C-cm³/W under isothermal conditions [5]. When the condenser volume is known, this number can be used in a log mean temperature difference (LMTD) analysis to predict condenser performance when the water temperature is allowed to rise.
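One way such an LMTD analysis might be set up is sketched below in Python. It rests on several assumptions that are not stated in the text: the volume-specific resistance is read as giving a conductance UA = V/Rf-w, the vapor side is treated as isothermal at the boiling point, and water is assigned a density of 1.0kg/liter and a specific heat of 4180J/kg-K. The condenser volume in the usage lines is purely hypothetical.

```python
import math

WATER_DENSITY_KG_PER_L = 1.0   # assumed water density
WATER_CP_J_PER_KG_K = 4180.0   # assumed water specific heat


def condenser_performance(volume_cm3, t_sat_c, t_w_in_c, flow_l_per_min, r_fw=1.4):
    """Return (heat_w, t_w_out_c) for a condenser with an isothermal vapor side.

    Assumes UA = volume / Rf-w. With the vapor at constant temperature, the
    LMTD relation Q = UA * LMTD reduces to an exponential decay of the
    vapor-to-water temperature difference along the water path.
    """
    ua_w_per_k = volume_cm3 / r_fw
    c_w = flow_l_per_min / 60.0 * WATER_DENSITY_KG_PER_L * WATER_CP_J_PER_KG_K
    approach_out = (t_sat_c - t_w_in_c) * math.exp(-ua_w_per_k / c_w)
    t_w_out_c = t_sat_c - approach_out
    heat_w = c_w * (t_w_out_c - t_w_in_c)
    return heat_w, t_w_out_c


def lmtd(t_sat_c, t_w_in_c, t_w_out_c):
    """Log mean temperature difference for a constant-temperature hot side."""
    d_in = t_sat_c - t_w_in_c
    d_out = t_sat_c - t_w_out_c
    return (d_in - d_out) / math.log(d_in / d_out)


if __name__ == "__main__":
    # Hypothetical 17 liter condenser with a 49°C fluid and 30°C water at 65 liters/min.
    heat, t_out = condenser_performance(17000.0, 49.0, 30.0, 65.0)
    print(heat, t_out, lmtd(49.0, 30.0, t_out))
```

At very high water flow rates the outlet temperature approaches the inlet temperature and the sketch recovers the isothermal case, in which the heat removed is simply the condenser volume times the fluid-to-inlet temperature difference divided by Rf-w.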
The results depend upon the fluid selected since the fluid boiling point is fixed. A commercially available fluoroketone (FK) fluid with Tb=49°C results in an average junction temperature Tj~58°C. An 80kW bath roughly the size of that depicted in Figure 1 could use water at Tw,i=30°C at 65 liters/min, returning it at a temperature very close to Tw,o=49°C. If Tj is allowed to reach 83°C as has been proposed [6], then an FK with Tb=74°C could be used. When transferring heat to ambient air, such a bath could accept 60°C water and return it at 73°C, temperatures high enough to eliminate the need for cooling tower water even in the hottest climates. If instead the water is used for comfort heating, the water flow rate could be reduced to about 26 liters/min. The bath would accept 30°C water and return it to the heating system at ~74°C (Figure 2).
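The water temperatures quoted above can be checked with a simple energy balance, sketched here in Python. The water density (1.0kg/liter) and specific heat (4180J/kg-K) are assumed values, and the function names are hypothetical. With those assumptions, the two 30°C cases return water at roughly 48°C and 74°C, in line with the figures in the text.

```python
WATER_DENSITY_KG_PER_L = 1.0   # assumed water density
WATER_CP_J_PER_KG_K = 4180.0   # assumed water specific heat


def water_outlet_temp(heat_w, t_in_c, flow_l_per_min):
    """Outlet temperature from Q = m_dot * cp * (T_out - T_in)."""
    m_dot = flow_l_per_min / 60.0 * WATER_DENSITY_KG_PER_L  # kg/s
    return t_in_c + heat_w / (m_dot * WATER_CP_J_PER_KG_K)


def required_flow_l_per_min(heat_w, t_in_c, t_out_c):
    """Water flow needed to carry heat_w across a given temperature rise."""
    m_dot = heat_w / (WATER_CP_J_PER_KG_K * (t_out_c - t_in_c))  # kg/s
    return m_dot / WATER_DENSITY_KG_PER_L * 60.0


if __name__ == "__main__":
    bath_w = 80_000.0
    print(water_outlet_temp(bath_w, 30.0, 65.0))        # Tb = 49°C fluid case
    print(water_outlet_temp(bath_w, 30.0, 26.0))        # comfort heating case
    print(required_flow_l_per_min(bath_w, 60.0, 73.0))  # flow for 60°C to 73°C
```

The energy balance only confirms that the quoted flows and temperatures are mutually consistent. Whether a given condenser can actually bring the water that close to the boiling point depends on its size.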
A Note on Power Density
Power density within a server chassis has historically been limited by airflow and plumbing considerations. The density of a typical air-cooled server chassis is 0.04kW/liter versus 0.16kW/liter for a hybrid air/water supercomputer node [2]. An OBI-cooled server has no cooling hardware. Determining how densely the remaining electronics could be packaged is beyond the scope of this work, but it is worthwhile exploring what power density could be cooled by immersion. The simulated printed circuit board (PCB) shown in Figure 3 holds 20 heater assemblies, each composed of a 19x19mm 200W ceramic heater epoxy bonded on one side to a 30x30x3mm BEC copper heat spreader that simulates a modern IHS. A thermocouple in the fluid, Tf, and one within each heat spreader, Ts, permit calculation of the individual thermal resistances.
This PCB was immersed in a confined vertical channel of the same area as the board with 4 and 7mm gaps between the boiling surface and the adjacent wall. This assembly was able to dissipate 4kW (200W per heater assembly) for a 4mm gap when the bath was filled with C3F7OCH3, a hydrofluoroether working fluid. The average Rs-f are shown in Figure 3, bracketed by ½ standard deviation on each side. 4kW equates to a PCB level heat flux of 11.7W/cm² versus 1.7W/cm² for the Cray X1E spray-cooled supercomputer [7]. These data suggest that 4kW/liter with <100cc of fluid per kW are certainly attainable, if only from a thermal point of view. This fluid fill cost is less than the cost of copper used in two 2U heat sinks and the potential for reduction of materials and waste associated with PCB manufacture is significant.
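The fluid inventory figure can be roughly cross-checked from the numbers above. The Python sketch below back-calculates the board area from the quoted power and heat flux, then estimates the fluid held in the channel gap. It assumes the only fluid counted is the layer between the boiling surface and the adjacent wall, which is a simplification introduced here and not a statement about the actual test apparatus. The function names are hypothetical.

```python
def board_area_cm2(power_w, flux_w_per_cm2):
    """Board area implied by the total power and the PCB-level heat flux."""
    return power_w / flux_w_per_cm2


def gap_fluid_cc_per_kw(power_w, flux_w_per_cm2, gap_mm):
    """Fluid volume in the channel gap per kW, assuming fluid fills only the gap."""
    gap_volume_cm3 = board_area_cm2(power_w, flux_w_per_cm2) * gap_mm / 10.0
    return gap_volume_cm3 / (power_w / 1000.0)


if __name__ == "__main__":
    print(board_area_cm2(4000.0, 11.7))            # implied board area, cm²
    print(gap_fluid_cc_per_kw(4000.0, 11.7, 4.0))  # 4mm gap
    print(gap_fluid_cc_per_kw(4000.0, 11.7, 7.0))  # 7mm gap
```

Under this simplification both gap widths fall below 100cc of fluid per kW.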
Fluid Losses
Some fluid loss is intrinsic to the technology and this affects not only the cost of ownership but also the greenhouse gas emissions and the likely human exposure in the datacenter. However, the loss mechanisms are well understood and easily mitigated because they act at one point above the vapor zone of a bath which is at atmospheric pressure. One can contrast this with a conventional pumped loop with its myriad of braze joints, connectors and seals that are wetted and under positive pressure.
Losses are most pronounced during commissioning and startup of a new bath, when air dissolved in the fluid and present in the head space is purged, bringing vapor with it. Losses during operation are limited to diffusion through the IO conduit and those associated with daily power fluctuations, which cause the vapor zone to rise (expelling more air/vapor) and fall. By venting the exiting air/vapor stream through an on-demand, thermoelectrically cooled condenser or “trap,” most of the vapor within it can be condensed and returned to the system.
The resultant annual losses depend on this trap temperature, Tt, and can be easily calculated. Assuming Tt=10°C and one 25% load fluctuation per day, the annual fluid consumption cost for the 80kW bath mentioned earlier is $123/yr at the typical fluid list price. This compares favorably with the $184/yr cost of operating rack level pumps in a traditional liquid cooling system at $0.05/kWh and the $2,800/yr cost of operating just the server fans in a typical air-cooled rack, assuming they use 80W per kW of server power. If a fluoroketone working fluid with a global warming potential (GWP) of 1 is used, the greenhouse gas emissions resulting from annual fluid loss are 0.2% of those associated with operating the rack pumps, assuming the electricity that powers them produces 7.18×10⁻⁴ metric tons CO2/kWh.
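The operating cost comparison can be reproduced with a few lines of Python. The sketch below assumes continuous operation (8760 hours per year) and applies the fan estimate to the same 80kW load as the bath. Both are assumptions made here for illustration. The pump energy is back-calculated from the quoted $184/yr, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 8760.0          # assumes continuous operation
ELECTRICITY_USD_PER_KWH = 0.05   # from the text
CO2_TONNES_PER_KWH = 7.18e-4     # from the text


def annual_energy_cost_usd(power_w):
    """Annual electricity cost of a constant load."""
    return power_w / 1000.0 * HOURS_PER_YEAR * ELECTRICITY_USD_PER_KWH


def annual_kwh_from_cost(cost_usd):
    """Back-calculate annual energy use from an annual electricity cost."""
    return cost_usd / ELECTRICITY_USD_PER_KWH


if __name__ == "__main__":
    fan_power_w = 80.0 * 80.0  # 80W of fan power per kW of server power, 80kW load
    print(annual_energy_cost_usd(fan_power_w))     # server fan cost, $/yr

    pump_kwh = annual_kwh_from_cost(184.0)         # energy implied by $184/yr
    pump_co2_tonnes = pump_kwh * CO2_TONNES_PER_KWH
    print(pump_kwh, pump_co2_tonnes)

    # Fluid-loss emissions quoted as 0.2% of the pump emissions.
    print(0.002 * pump_co2_tonnes * 1000.0)        # kg CO2-equivalent per year
```

With a GWP of 1, the last figure is also the approximate mass of fluid lost per year implied by the rounded 0.2% ratio.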
Conclusions
It has been said that datacenter operations are all about cost, cost, cost. It costs money to develop, manufacture, house and operate a thermal management solution. Less tangible environmental costs like greenhouse gas emissions, natural resource consumption and e-waste are becoming increasingly important to society. Open bath immersion (OBI) cooling technology appears able to have a dramatic effect on both financial and environmental costs in all stages of a server’s life cycle (Figure 4). Its investigation continues through long term demonstrations that will track thermal performance, serviceability, fluid health, fluid loss, etc.
References
[1] Shah, A., Christian, T., Patel, C., Bash, C., Sharma, R., “Assessing ICT’s Environmental Impact,” Computer, 42(7), pp. 91-93, July 2009.
[2] Ellsworth, M., and Iyengar, M., “Energy Efficiency Analyses and Comparison of Air and Water Cooled High Performance Servers,” IPACK2009-89248, July 19-23, San Francisco, CA.
[3] Tuma, P.E., “The Merits of Open Bath Immersion Cooling of Datacom Equipment,” Proc. 26th IEEE Semi-Therm Symposium, Santa Clara, CA, Feb. 21-25, 2010.
[4] Colgan, E.G., et al., “A Practical Implementation of Silicon Microchannel Coolers for High Power Chips,” 21st IEEE Semi-Therm Symposium, San Francisco, California, March 2005, pp. 1-7.
[5] Barnes, C.M. and Tuma, P.E., “Practical Considerations Relating to Immersion Cooling of Power Electronics in Traction Systems,” Proc. 2009 IEEE Vehicle Power and Propulsion Conference (VPPC’09), Sept. 7-10, pp. 614-621.
[6] http://www-03.ibm.com/press/us/en/pressrelease/32049.wss
[7] Pautsch, G., “Thermal Challenges in the Next Generation of Supercomputers,” Presentation, CoolCon 2005, May 16-17.
Phil Tuma can be reached at petuma@mmm.com