Figure 1: Telecom Equipment Operating Environments
There are two main reasons for doing thermal design in the telecommunication industry. The first is to ensure functionality of the equipment and system when subjected to extreme environments. The second is to ensure high dependability of the network. While doing thermal design, many challenges must be kept in mind: the telecommunication system’s dependability requirements, expected product lifetime, the product’s environmental challenges, component power and power density challenges, as well as the need for fast design cycles in a highly competitive market.
Dependability
The largest difference between the telecom industry and the consumer electronics industry is telecom’s need for extremely high reliability and dependability. Poor reliability in a consumer item leads to poor customer loyalty and lost repeat business, but in telecom it means that and much more: when a 911 call is made, equipment reliability can literally be a matter of life and death. Traditionally, telecom products are extremely dependable; users have come to expect an immediate dial tone when they lift the receiver, even during a power failure.
Equipment reliability is also important to the telephone and datacom service providers in the modern, intensely price-competitive telecom business. Downtime of only a few seconds can cause carriers to lose millions of dollars in revenue. Also, operating costs for maintenance and replacement are higher unless the equipment is highly dependable throughout its lifetime. This is particularly true for wireless base stations, which are often in remote, unmanned locations.
Telecom network dependability is a function of many things including architectural redundancy, software robustness and manufacturing process control, but the dependability of the actual electronic hardware on the circuit boards is greatly affected by the temperature of the equipment. This makes proper thermal design a vital part of designing a dependable network. In a poorly designed product, brief excursions to extreme hot or cold can cause “soft failures” where a system drops traffic or stops operating entirely, often due to chip-to-chip timing problems. Furthermore, operating electronic components long term at or above the manufacturer’s temperature limit will over-stress them, causing non-recoverable “hard failures” that require hardware replacement.
In telecom, there are three main challenges to achieving dependability through good thermal design. They are:
- The widening range of environments which telecom products must withstand. (Table 1 lists some of these environments.)
- Developing and deploying new products with new technologies in accelerating design cycles (“short design cycle/short product life”).
- Increasing component power and increasing power density on circuit cards.
Product Environment
Modern telecom products are found in all kinds of indoor and outdoor environments. Many large telcos still maintain traditional “central offices” with racks of equipment in well-controlled, air-conditioned rooms with trained maintenance staff. This is the most benign of all the environments, although air conditioner failure still needs to be considered in the design.
Slightly more extreme is the “customer premises” environment where private companies or small operators have routers, telephone switches and telephones in occupied work areas or equipment closets. These locations are dirtier; fan noise is often not welcome; and, particularly in closets, temperature and humidity can be higher than in central offices. It is also possible that untrained installation and maintenance staff in these customer premises locations can make thermal problems worse.
The outdoor or “outside plant” equipment sees the most extreme environments. An optimum thermal design can be deployed anywhere in the world without modification. This means a single “world design” can withstand the heat of the Arizona desert, the cold of the Canadian north, the humidity of Thailand, the altitude of Colombia and the rapid temperature and humidity changes of St. Louis.
Sometimes it is not economically sound to include measures for all of these conditions in systems that will only see limited deployment, but the best designs at least have “hooks” in them, allowing these features to be added later with minimal design impact. Outside plant equipment must also withstand “attacks” from molds, insects, local fauna and, perhaps, the occasional shotgun blast. Figure 1 illustrates the various operating environments.
Table 1: Design Environments
Fast Design Cycles
Making a single “world design” leads us into our next challenge: that of developing and deploying new products with new technologies during accelerating design cycles with “short design cycle/short product life”. This challenge has become more important as the distinction between telecom products and network products has blurred. A single “world design” avoids the extra time and cost of redesign, retesting and requalification.
This philosophy allows quick sale of an existing product to new and unexpected customers; there is no customizing or requalifying to do. To be effective, however, this “world design” philosophy must be applied to more than just thermal considerations. World safety compliance, electromagnetic compatibility (EMC), and telecom compliance, among other things, must also be designed into the product “up front” to achieve fast deployment.
Traditional EMC and thermal solutions are often at odds with one another. For example, for high-frequency processor signals, EMC requirements call for some kind of enclosure, which limits the cooling air reaching hot devices. Since EMC is a regulatory requirement necessary to sell a product and thermal is not, it is most often the thermal solutions that must be innovative (e.g., using enclosure surfaces as heat sinks or shielding layers in the PCB as heat spreaders), especially in the accelerating design cycle.
This can only be achieved with a parallel design process involving the electrical designer, the EMC engineer, the mechanical designer and the thermal engineer. Thermal design must therefore be part of the product design process from concept to testing.
Also key to providing thermal solutions in the fast design cycle is recognizing future technical challenges and having the solution ready to deploy. Although this is an overhead cost, it pays off when a thermal solution is required immediately to get a product out the door. Thermal underfill (patent # 5467251) played this role in one of Nortel Networks’ transport products a few years ago, allowing for a sealed-module EMC solution that would otherwise have raised component temperatures above their operating range.
In the future we will need to have thermal solutions for chip-on-board, integrated passives, and buried passive devices, although at present these technologies are not being widely used in telecom products. These thermal solutions can be developed in-house or co-developed with vendors. Keeping up to date with developments presented at conferences and in the literature may also provide solutions.
Component Power and Power Density
Component powers and power densities are constantly increasing. This is a problem that all electronic packaging designers are facing, not just those in telecom. Although the power and power density values in the telecom industry are not as high as in the supercomputer industry, it is not uncommon to have to cool 7W to 11W logic devices, 45W power amplifier devices and a total of 60W to 100W on a single PCB. Often the PCBs will be in sealed modules, making cooling more difficult. This necessitates cooling solutions that will not only cool the hot component but also keep its heat from affecting the surrounding active and passive devices.
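As a rough illustration of the kind of check this implies, the sketch below estimates a device's steady-state junction temperature from a single junction-to-ambient thermal resistance, T_j = T_a + P × θ_ja. This lumped model is a common first-order simplification, not a method described in this article. The function names, the 4.0 °C/W resistance, the 50 °C local air temperature and the 110 °C junction limit are all hypothetical values chosen for illustration; only the 11 W device power comes from the text above.

```python
def junction_temperature(ambient_c, power_w, theta_ja_c_per_w):
    """Estimate steady-state junction temperature in deg C.

    Uses a single lumped junction-to-ambient resistance (theta_ja),
    which is a first-order simplification of the real heat paths.
    """
    return ambient_c + power_w * theta_ja_c_per_w


def thermal_margin(ambient_c, power_w, theta_ja_c_per_w, tj_max_c):
    """Return the margin (deg C) between the junction limit and the estimate.

    A negative result means the estimate exceeds the limit.
    """
    tj = junction_temperature(ambient_c, power_w, theta_ja_c_per_w)
    return tj_max_c - tj


# Hypothetical example: an 11 W logic device with an assumed heat sink
# and airflow giving theta_ja = 4.0 C/W, in 50 C local air,
# with an assumed 110 C junction limit.
tj = junction_temperature(ambient_c=50.0, power_w=11.0, theta_ja_c_per_w=4.0)
margin = thermal_margin(50.0, 11.0, 4.0, tj_max_c=110.0)
print(f"Estimated junction temperature: {tj:.1f} C, margin: {margin:.1f} C")
```

With these assumed numbers the estimate is 94 °C, leaving 16 °C of margin. In a sealed module the local air around the device is hotter than the air outside the module, so the same device would have less margin.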
The introduction of 3.3V and 2.5V technologies, along with higher allowable silicon operating temperatures through better manufacturing process control, has recently helped delay the full impact of the power problem. Interestingly, the limiting factor on board power these days is, in many cases, interconnect related, both in breaking out and routing the traces on the PCBs and in connector and backplane limitations. As these issues are solved in the next few years, power densities will resume their upward march. This will force a departure from traditional fan-cooled frames in air-conditioned rooms, leading to the introduction of liquid cooling.
Meeting the Challenges: The Nortel Networks Thermal Design Process
At Nortel Networks a four stage thermal design process is used to meet the thermal design challenges:
1. Concept
Thermal engineering staff must be involved in a multi-disciplinary team at the concept stage of the system packaging design to ensure the right architecture is chosen for the product and its environment. This team includes representatives from thermal, safety, EMC, interconnect, manufacturing, human interface, end users and others as required.
The thermal engineer determines, for example, whether natural convection is enough to cool the system or whether fans are required, what board pitch is sufficient, and so on. Hand calculations, technology demonstration vehicles, network models and very simple CFD (computational fluid dynamics) models are all used here. Acceptable accuracy of temperature estimates is 25%. Power dissipation estimates will often increase as the design evolves, so margin is required in the cooling strategy.
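A hand calculation of the kind used at this stage might look like the following sketch. It applies a steady-state energy balance, Q = P / (ρ · c_p · ΔT), to estimate the airflow needed to carry away a board's heat for an allowed air temperature rise. The function names, the 25% power margin, the 10 °C allowable rise and the air properties (ρ ≈ 1.2 kg/m³, c_p ≈ 1005 J/(kg·K), roughly sea-level air near room temperature) are illustrative assumptions, not values from the Nortel Networks process.

```python
AIR_DENSITY_KG_M3 = 1.2      # assumed: approximate sea-level air near 20 C
AIR_CP_J_PER_KG_K = 1005.0   # assumed: specific heat of air at constant pressure
M3_PER_S_TO_CFM = 2118.88    # 1 m^3/s expressed in cubic feet per minute


def required_airflow_m3_per_s(power_w, air_rise_c, power_margin=0.0,
                              density=AIR_DENSITY_KG_M3, cp=AIR_CP_J_PER_KG_K):
    """Volumetric airflow needed to remove power_w with a given air temperature rise.

    power_margin is a fractional allowance for power growth as the design
    evolves (0.25 means the estimate is scaled up by 25%).
    """
    if air_rise_c <= 0:
        raise ValueError("air_rise_c must be positive")
    design_power_w = power_w * (1.0 + power_margin)
    return design_power_w / (density * cp * air_rise_c)


def m3_per_s_to_cfm(flow_m3_per_s):
    """Convert m^3/s to cubic feet per minute."""
    return flow_m3_per_s * M3_PER_S_TO_CFM


# Hypothetical example: a 100 W board, 25% power margin, 10 C allowed air rise.
flow = required_airflow_m3_per_s(100.0, 10.0, power_margin=0.25)
print(f"Required airflow: {flow:.4f} m^3/s ({m3_per_s_to_cfm(flow):.1f} CFM)")
```

Under these assumptions the board needs roughly 0.010 m³/s (about 22 CFM) of air. Because air density falls with altitude, passing a lower `density` shows how much more volumetric flow a high-altitude site would need for the same heat load.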
2. Alpha Hardware Design
Detailed thermal analysis is required as more mechanical and electrical details become available. System level CFD tools, PCB level tools and the occasional mock-up build are all used at this stage. Acceptable accuracy on the temperature estimates is 10%. Note that the accuracy of the analysis is only as good as the input data. This data includes device power dissipation and component thermal properties, both sources of error.
To help reduce these uncertainties, prototype temperature and airflow testing is done as soon as hardware is available. A portable thermal camera, thermocouple datalogger and airflow test equipment are available to allow testing in the development labs, where it is possible to exercise the prototype PCBs with test software. Since device power estimates can be one of the greatest sources of error in the thermal analysis, the power estimates can be verified at this stage by comparing infrared images or thermocouple measurements to predicted values. The results of the detailed analysis are used in the design process to help the mechanical and electrical designers make key decisions and trade-offs.
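The comparison of measured and predicted temperatures could be organized along the lines of the sketch below. It assumes the 10% accuracy target is judged on temperature rise above ambient, relative to the predicted rise; the article does not say what the percentage refers to, so this is an interpretation. The function name, device labels and temperatures are hypothetical.

```python
def compare_to_prediction(predicted_c, measured_c, ambient_c, tolerance=0.10):
    """Compare measured device temperatures with predicted ones.

    predicted_c and measured_c map a device label to a temperature in deg C.
    Returns {label: (relative_error, within_tolerance)}, where relative_error
    is (measured rise - predicted rise) / predicted rise above ambient.
    """
    results = {}
    for label, predicted in predicted_c.items():
        predicted_rise = predicted - ambient_c
        if predicted_rise <= 0:
            raise ValueError(f"predicted rise for {label} must be positive")
        measured_rise = measured_c[label] - ambient_c
        rel_error = (measured_rise - predicted_rise) / predicted_rise
        results[label] = (rel_error, abs(rel_error) <= tolerance)
    return results


# Hypothetical thermocouple readings against analysis predictions at 25 C ambient.
predicted = {"U1": 75.0, "U2": 65.0}
measured = {"U1": 78.0, "U2": 72.0}
for label, (err, ok) in compare_to_prediction(predicted, measured, 25.0).items():
    status = "within tolerance" if ok else "check power estimate"
    print(f"{label}: {err:+.1%} ({status})")
```

A device that runs well above its predicted rise, such as the hypothetical U2 here, is a candidate for a revised power estimate before the analysis is rerun.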
3. Alpha Hardware Build and Test
De-risk and compliance testing is performed in environmental chambers as soon as hardware and software are available. The thermal engineer needs to design the test suite and oversee testing. The thermal engineer should analyze test results to ensure that the cooling design is performing as intended, and should also compare test results to thermal tool predictions as a check on the tool accuracy and analysis approach.
4. Validation and Compliance Test of Finalized Design
There should be very little thermal engineering left to do at this stage. Lab technicians can perform the compliance test suite. The thermal engineer must be aware of detailed test results to gain insight about how the design performs and feedback about how the tools perform.
Conclusions
As the telecom and IP industries converge, the design environment is becoming faster and more market-driven. In this intensely competitive environment, product dependability, the ability to produce designs quickly, and customer service are key differentiating features that will make or break a sale or a company. The thermal designer must meet these challenges and do so as an integral member of the design team.