Introduction
The computer industry is exploring various options for reducing the cost and energy consumption associated with cooling data centers. Liquid cooling has long been exploited in the data center as a more energy-efficient means of heat removal than forced air cooling. More recently, however, there has been considerable interest in cooling data centers by circulating outside air through the data center to cool the computer hardware directly and then exhausting it to the outdoors. This achieves energy efficiency by eliminating active refrigeration for at least part of the year. This cooling method is commonly called “free air cooling.”
Recent articles in this publication reflect the increased attention being directed at free air cooling. In the December 2012 issue, an article described the design, energy efficiency, and bring-up challenges of a predominantly free-air-cooled data center in the Pacific Northwest region of the US [1].
In that same issue, a Technical Brief article provided an update of the activities of the ASHRAE Technical Committee 9.9, devoted to data center cooling technologies and the development of best practices in that regard [2]. The article described the evolution of the maximum allowable data center temperature from 25˚C in 2004 to the creation of additional environmental classes in 2011 that allow temperatures up to 40˚C and 45˚C, at maximum values of relative humidity (RH) of 40% and 32%, respectively. However, early experience in the deployment of free-air-cooled data centers indicates that, under more extreme weather conditions, the humidity can rise to the point where the dew point is reached [1, 3]. One expects that, over time, the technology will mature to the point that these episodes will occur less frequently. However, one would anticipate that it will always be more difficult to control RH using free air cooling than with the more traditional, and energy intensive, vapor-compression cooled air conditioners. Nonetheless, the expectation of this ASHRAE committee is that, over time, data centers will be migrating to higher temperature and relative humidity conditions than is now customary.
For reasons of economy and ease in manufacture, many of the components currently used in electronics systems employ organic materials. As is well known, organic materials are, in general, permeable to moisture and will, over time, absorb moisture until a concentration level is reached that is in equilibrium with that of the ambient air. The precise value of the moisture concentration will depend on the ambient air temperature and RH, the temperature of the component in question, and its material composition.
It has long been known that excessive levels of moisture in organic materials used in electronics can lead to reliability problems. This phenomenon has been studied in particular for materials that are suddenly heated to a high temperature such as during the solder reflow process. So-called “popcorn” cracking is a dramatic and typical failure mode in this situation. More subtle moisture-induced failures can occur once an electronic system is in the field. Examples are those caused by stress resulting from the swelling of polymers due to moisture intrusion or to electrochemical migration in the presence of electrical bias and moisture. These effects have also been widely studied. However, there has been much less study on the effect of moisture on the performance of passive components and active subsystems.
Case Study Analysis
Moisture Concentration in the Package Laminate Material — BT
A recent Calculation Corner column was devoted to the calculation methodology for modeling moisture diffusion [4]. It dealt with BT (bismaleimide triazine), a fiber-reinforced polymer commonly used in BGA (Ball Grid Array) packages. The article demonstrated a method for calculating the equilibrium concentration of moisture at values of ambient temperature and RH within and slightly beyond the current ASHRAE limits. Table 1 provides values of saturated moisture concentration in a BT substrate, assumed to be in an operating system, such that its temperature is 60˚C. [Note that, as shown in the article, the elevated temperature of the BT leads to a lower moisture concentration than had it been at ambient temperature.] All of the assumed values of ambient temperature and RH are within the current allowable ASHRAE range except for the 40˚C/60%RH value. Using the concentration at 20˚C/40%RH, namely 0.2 mg/cm3, as a baseline value, one sees that, at the top end of the allowable range, the moisture concentration is 4 times that, at 0.8 mg/cm3. We will return to these values in the discussion in the following section.
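One way to see how the ambient condition drives the saturated concentration is a simple solubility argument. The sketch below assumes that the saturated concentration is proportional to the partial pressure of water vapor in the ambient air, with a solubility set only by the component temperature. Under that assumption, for a component held at a fixed temperature, the ratio of concentrations at two ambient conditions reduces to a ratio of vapor pressures. Both this proportionality and the Magnus approximation used for the saturation vapor pressure are assumptions made here for illustration, not the method of the referenced column, so the resulting ratios need not reproduce the values in Table 1.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Saturation vapor pressure of water (hPa), Magnus approximation."""
    return 6.1094 * math.exp(17.625 * temp_c / (temp_c + 243.04))


def vapor_pressure_hpa(temp_c, rh_percent):
    """Partial pressure of water vapor (hPa) in air at a given temperature and RH."""
    return rh_percent / 100.0 * saturation_vapor_pressure_hpa(temp_c)


def concentration_ratio(ambient, baseline):
    """Ratio of saturated concentrations for two (temp_c, rh_percent) ambients.

    Assumes concentration = solubility * vapor pressure, where the solubility
    depends only on the (unchanged) component temperature and so cancels out.
    """
    return vapor_pressure_hpa(*ambient) / vapor_pressure_hpa(*baseline)


baseline = (20.0, 40.0)  # 20 C / 40% RH, the baseline condition in the text
humid = (40.0, 60.0)     # 40 C / 60% RH, the condition outside the allowable range
print(f"ratio vs. baseline: {concentration_ratio(humid, baseline):.2f}")
```

Because the solubility cancels, this estimate requires no material data for the laminate; predicting absolute concentrations, by contrast, would require the measured solubility of the material at the component temperature.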
The referenced analysis also shows that the time for the moisture to reach equilibrium in BT under these ambient conditions is on the order of only a few weeks.
Effect of Moisture Concentration on High-Speed Signal Propagation
A recent study measured the high-speed signal propagation along a copper trace, 21 µm wide, 15 µm thick, and 50 mm long, in a stripline configuration typical of what would be used in a package substrate [5]. The trace was sandwiched between two layers of a low-loss, FR-4 type dielectric, each 130 µm thick, 1.4 mm wide, and 50 mm long. Each dielectric layer had a copper plane bonded to its outer surface. The signal propagation was measured using a vector network analyzer at frequencies between 2 and 16 GHz at two different moisture concentration levels in the dielectric. Tests were performed at each of the specified frequencies over a range of temperature values from 20 to 80˚C.
The first moisture level was achieved by a “soak” exposure at 30˚C/60%RH for one week. The calculated moisture concentration at the stripline location (along the centerline of the dielectric) is 1.8 mg/cm3. This value is comparable to that calculated for the BT at the upper end of the range.
After the completion of the first set of tests, the sample was “baked out” and then retested over the same frequency and temperature ranges. The bake-out condition was 125˚C for 1 week. The calculated moisture concentration at the stripline location following the bake-out is 0.3 mg/cm3. This is comparable to the baseline moisture concentration in the preceding case study.
The results are plotted in Figure 1. The graph compares the signal loss (attenuation) at a given temperature minus the loss measured at 20˚C. There are two families of curves plotted, representing the sample in the “after soak” and the “after bake” conditions. We see that the soaked sample shows more signal loss than the baked one. The effect is more pronounced with increasing temperature and frequency. In the worst case reported here, the loss at 16 GHz and 80˚C is 36% greater for the soaked sample, compared to the baked one. In general, at these very high frequencies, the noise margins are tighter. The increased attenuation measured here could potentially lead to increased bit error rates unless it was anticipated in the design phase and effectively accounted for.
Effect of Moisture Exposure on a High-speed Network Switch
Another recent study measured the data throughput of two different populations of 3 identical network switches over an 8-week period [6]. One population was maintained under environmental conditions representative of a benign air conditioned data center environment: 20˚C/50%RH. [Note that these conditions are close to the baseline values in the first case study.]
The second environment was chosen to be representative of conditions experienced in a free-air-cooled environment. This test was conducted in an environmental chamber in which the temperature/RH setting was varied between 10˚C/85%RH and 50˚C/15%RH. A complete cycle took 16 hours: a 4-hour hold at the lower temperature, a 4-hour ramp to the higher temperature, a 4-hour hold there, and a 4-hour ramp back to the lower temperature.
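The chamber cycle just described can be written down as a simple setpoint function. The sketch below is illustrative only: it assumes that the cycle starts with the hold at the lower temperature and that both temperature and RH ramp linearly between the two hold conditions, neither of which is stated explicitly in the study. The function name and constants are hypothetical.

```python
LOW = (10.0, 85.0)    # (temperature in C, RH in %) during the cold, humid hold
HIGH = (50.0, 15.0)   # (temperature in C, RH in %) during the hot, dry hold
SEGMENT_HOURS = 4.0   # each hold and each ramp lasts 4 hours
CYCLE_HOURS = 4 * SEGMENT_HOURS  # 16-hour cycle


def chamber_setpoint(t_hours):
    """Return (temp_c, rh_percent) at elapsed time t_hours.

    Assumption: linear ramps, with the cycle starting at the low-temperature hold.
    """
    t = t_hours % CYCLE_HOURS
    segment, offset = divmod(t, SEGMENT_HOURS)
    frac = offset / SEGMENT_HOURS
    if segment == 0:   # hold at the low-temperature condition
        return LOW
    if segment == 1:   # ramp from the low to the high condition
        return tuple(lo + frac * (hi - lo) for lo, hi in zip(LOW, HIGH))
    if segment == 2:   # hold at the high-temperature condition
        return HIGH
    # ramp from the high condition back to the low condition
    return tuple(hi + frac * (lo - hi) for lo, hi in zip(LOW, HIGH))


# Number of complete cycles in the 8-week test
cycles_in_test = 8 * 7 * 24 / CYCLE_HOURS

for hour in (0, 6, 10, 14):
    print(hour, chamber_setpoint(hour))
print("cycles in 8 weeks:", cycles_in_test)
```

With a 16-hour period, the hardware sees 84 complete cycles over the 8 weeks, so each switch is repeatedly exposed to both the high-humidity and the high-temperature extremes.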
A baseline throughput rate was established for each population by averaging the rate over the first 10,000 data packets sent. The duration of this part of the test was approximately 1 day. The baseline was 93.7 Mbps for the air conditioned environment and slightly less, at 92.4 Mbps, for the temperature-cycled environment.
For each population, the authors parsed the data acquired over the remainder of the 8-week period into 5 groups representing a 1%, 2%, 5%, 10%, and 20% throughput dropoff compared to the baseline.
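The grouping by dropoff level can be illustrated with a short sketch. The study does not say whether the groups are cumulative or exclusive; the code below assumes that each packet is assigned to the largest dropoff level it reaches. The function names and the sample throughput values are hypothetical; only the levels and the 93.7 Mbps baseline come from the text.

```python
from collections import Counter

DROPOFF_LEVELS = (1, 2, 5, 10, 20)  # percent below baseline, from the text


def dropoff_level(throughput_mbps, baseline_mbps):
    """Return the largest dropoff level (%) reached, or None if below 1%."""
    drop_pct = 100.0 * (baseline_mbps - throughput_mbps) / baseline_mbps
    reached = [level for level in DROPOFF_LEVELS if drop_pct >= level]
    return max(reached) if reached else None


def count_by_level(samples_mbps, baseline_mbps):
    """Count the samples falling into each dropoff level (assumed exclusive bins)."""
    counts = Counter()
    for sample in samples_mbps:
        level = dropoff_level(sample, baseline_mbps)
        if level is not None:
            counts[level] += 1
    return counts


baseline = 93.7  # Mbps, baseline for the air conditioned population
samples = [93.7, 92.0, 91.0, 88.0, 80.0, 70.0]  # made-up throughput values
counts = count_by_level(samples, baseline)
for level in DROPOFF_LEVELS:
    print(f"{level:>2}% dropoff: {counts[level]} packet(s)")
```

Counting this way for each week and each population would yield the kind of per-level tallies that the study compares between the two environments.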
Most of the results are plotted in Figure 2. The number of packets at a specified level of dropoff was higher for the population with the free-air-cooled condition. The ratios of these values (averaged over the 8-week period) at the 1%, 2%, and 5% levels were 2.5:1, 7.3:1, and 14.3:1, respectively. Furthermore, there was a significant increase in the number of packets demonstrating dropoffs of 1% and 2% in the final week of the tests for the temperature-cycled population.
There were no packets in the air conditioned environment at the 10% and 20% dropoff levels. In the harsher environment, however, there were on average 105 and 55 packets per week at these levels, respectively.
The authors concluded that the level of performance variation of the switch in the simulated free-air-cooled environment might well be unacceptable to many data center customers.
Conclusions
This article highlights a number of published studies that address performance problems related to moisture absorption in individual electronic components and in subsystems. In the experience of this author, the majority of moisture-related studies have to do with reliability, not performance. Indications are that the relaxed temperature and relative humidity ranges approved by ASHRAE will someday become the norm in the data center. There is a risk that electronics companies will not anticipate the effect this change may have on the high-speed performance of their products. This situation could be exacerbated by the fact that performance degradation might well occur as soon as the moisture concentration reaches a critical value, without the need for a secondary process to be triggered by the moisture absorption, as is usually the case with failure mechanisms.
It is hoped that this article will help to increase awareness of these issues in our industry and promote early action to effectively manage the risks detailed here.
References
[1] V. Mulay, “Humidity Excursions in Facebook Prineville Data Center,” ElectronicsCooling, Vol. 18, No. 4, December 2012.
[2] R. Schmidt, “A History of ASHRAE Technical Committee TC9.9 and its Impact on Data Center Design and Operation,” ElectronicsCooling, Vol. 18, No. 4, December 2012.
[3] D. Atwood and J. G. Miner, “Reducing Data Center Cost with an Air Economizer,” IT@Intel Brief, Intel Information Technology, Computer Manufacturing, Energy Efficiency, August 2008.
[4] B. Guenin, “Calculation Corner – Application of Transient Thermal Methods to Moisture Diffusion Calculations, Part 2,” ElectronicsCooling, Vol. 19, No. 1, March 2013.
[5] J. Miller, Y. Li, K. Hinckley, G. Blando, B. Guenin, and I. Novak, “Temperature and Moisture Dependence of PCB and Package Traces and the Impact on Signal Performance,” Proceedings, DesignCon Conference, Santa Clara, January 30 – February 2, 2012.
[6] J. Dai, D. Das, M. Pecht, and M. Ohadi, “A Case Study on the Impact of Free Air Cooling on Telecom Equipment Performance,” Proceedings, SEMI-THERM XVIII Semiconductor Thermal Measurement, Modeling And Management Symposium, San Jose, CA, March 18-22, 2012, pp. 82-86.
Figure Captions
Figure 1: Graph of signal loss for a stripline structure versus temperature, frequency, and moisture content resulting from 1) soak process (1.76 mg/cm3) and 2) bake process (0.30 mg/cm3).
Figure 2: Number of data packets per week experiencing indicated levels of dropoff in data throughput (1%, 2%, and 5%).