Researchers at the University of Toronto, who studied equipment failure data from data centers operated by Google, Los Alamos National Laboratory, and Canada’s SciNet HPC consortium, conclude that high data center temperatures affect system reliability less than is often assumed.
For DRAM failures and node outages, the researchers found no evidence of a correlation with higher temperatures. For the error conditions that did show a correlation (latent sector errors in disks and disk failures), the correlation was much weaker than expected. At device internal temperatures below 50°C, errors grew linearly with temperature rather than exponentially.
These findings suggest that many organizations could run their data centers hotter than they currently do without significantly sacrificing system reliability.
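The distinction between linear and exponential growth can be made concrete with a small sketch. The Python below fits both a straight line and an exponential curve to a series of (temperature, error rate) pairs and reports which fits better. It is an illustration only, not the researchers' method: the function names `fit_line` and `compare_growth` are hypothetical, the comparison by sum of squared errors is an assumed choice, and the sample values are synthetic rather than taken from the study.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x. Returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def compare_growth(temps_c, error_rates):
    """Return "linear" or "exponential", whichever model has the smaller
    sum of squared errors. Assumes all error rates are positive."""
    # Linear model: rate = a + b * T
    a, b = fit_line(temps_c, error_rates)
    sse_linear = sum(
        (r - (a + b * t)) ** 2 for t, r in zip(temps_c, error_rates)
    )

    # Exponential model: rate = c * exp(k * T), fitted as a line in log space
    log_c, k = fit_line(temps_c, [math.log(r) for r in error_rates])
    sse_exp = sum(
        (r - math.exp(log_c + k * t)) ** 2 for t, r in zip(temps_c, error_rates)
    )

    return "linear" if sse_linear <= sse_exp else "exponential"


# Synthetic example values (not data from the study).
temps = [30, 35, 40, 45, 50]
rates = [4.0, 4.5, 5.0, 5.5, 6.0]
print(compare_growth(temps, rates))  # prints "linear"
```

Under an exponential model, each additional degree multiplies the error rate by a constant factor, so the penalty for running hotter compounds. Under a linear model, each degree adds a fixed increment, which is why the observed linear trend supports the case for higher operating temperatures.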