Thermal analysis: It’s a field that every mechanical engineer is exposed to during their undergraduate studies and many dabble in at some point during their professional careers. It’s also a field that some devote their full-time careers to as dedicated thermal engineers (or thermal analysts). Regardless of where in the broad spectrum of mechanical engineering work scope you may fall, whether you typically perform a single brief thermal analysis per year or focus daily on it, you may find this manuscript regarding thermal analysis methodology valuable and edifying. The objective of this manuscript is to clearly and systematically outline best practices with regard to thermal analysis methodology that enable an accurate, well-executed analysis. In the context of this manuscript, the primary focus is on electronics systems for ground-based and airborne applications in the aerospace/defense industry.
First, it is important to define what thermal analysis is. Thermal analysis broadly encompasses the task of solving the temperature and flow fields of an electronics system in a given application and environment; it solves the conjugate heat transfer problem, which couples the conservation of energy, conservation of mass, and conservation of momentum equations. The conservation of momentum equations are more famously known by mechanical engineers as the Navier-Stokes equations from their undergraduate fluid dynamics courses. The Navier-Stokes equations are, in essence, the conservation of linear momentum equations with the constitutive equation for a fluid continuum substituted in for the Cauchy stress tensor term. Standard computational fluid dynamics (CFD) software has the capability of solving the conjugate thermal-flow problem. Among the biggest players in the industry are Fluent (ANSYS) and Simcenter STAR-CCM+ (Siemens).
Next, it’s important to define what thermal analysis is not. Thermal analysis is not pushing a button, running a solution, getting a pretty picture of temperature or flow contours, and reporting the results. To their detriment, many computer-aided design (CAD) engineering software tools, which are used primarily by the mechanical engineer (ME) with the ‘ME designer’ role rather than the ‘thermal engineer’ role, now offer a one-stop shop for mechanical design and thermal simulation. Mechanical designers can quickly get a thermal solution without necessarily having the experiential or analytical background to verify the accuracy of the solution. Rather, thermal analysis entails understanding and ensuring that the boundary conditions, thermophysical material properties, and modeling assumptions used are reasonably accurate. It entails reviewing modeling results with a healthy degree of suspicion (i.e., “guilty until proven otherwise” philosophy), sanity-checking them with simplified hand calculations, and convincing yourself that the simulation is indeed accurate. It entails using the analysis as a tool to guide the mechanical design of a system from a concept to a feasible design.
Now that we’ve outlined a broad definition of thermal analysis, let’s explore critical methodology steps in the thermal analysis process. Following each of these steps will help to ensure an accurate analysis.
Best Practice #1: Fundamentals
Nothing beats having a deep, thorough, and technical understanding of the relevant, underlying physics. Textbook knowledge is a requisite for analysis—here, primarily thermodynamics, fluid dynamics, and heat transfer (all modes) [1]. Because thermal/fluids phenomena may not be obvious or intuitive, a fundamental technical understanding can only be achieved by studying the relevant textbooks from a standard undergraduate curriculum in mechanical engineering. In fact, as I often advise undergraduate students, the most successful engineers in industry are those who have a well-developed and intuitive understanding of the relevant physics and, more importantly, can communicate that to their peers and leadership. Understanding the fundamentals enables the engineer to think critically when faced with challenging, complex, and non-textbook problems that are commonplace in industry. The following brief outline of fundamental topics for each mode of heat transfer will help prepare the thermal engineer in an electronics cooling framework.
From a conduction standpoint, understanding conductive resistances due to one-dimensional conduction through multi-layered materials is necessary to calculate expected temperature rises or the composite material’s effective thermal resistance [°C/W] or thermal conductivity [W/(m·K)]. Calculating fin (i.e., extended surface) parameters is useful for quantifying effective fin performance and simplifying modeling efforts. Understanding energy storage (i.e., adiabatic heating of a solid or incompressible liquid) based on a material’s thermal capacitance, which is inherently a transient/diffusion problem, will help calculate the expected temperature rise or time duration.
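The three conduction calculations above can be sketched in a few lines of Python. This is a minimal illustration, not a design tool: the layer stack, the 10 W load, the fin dimensions, and the function names are all hypothetical, and the fin formula assumes a thin straight rectangular fin with an adiabatic tip.

```python
import math

def slab_resistance(thickness_m, k_w_per_mk, area_m2):
    """Conductive resistance of one plane layer, R = L / (k A), in degC/W."""
    return thickness_m / (k_w_per_mk * area_m2)

def fin_efficiency(h, k_fin, thickness_m, length_m):
    """Efficiency of a thin straight rectangular fin with an adiabatic tip."""
    m = math.sqrt(2.0 * h / (k_fin * thickness_m))  # fin parameter [1/m]
    return math.tanh(m * length_m) / (m * length_m)

def adiabatic_heating_time(mass_kg, cp_j_per_kgk, power_w, allowed_rise_c):
    """Time for a lumped mass to rise by allowed_rise_c with no heat loss."""
    return mass_kg * cp_j_per_kgk * allowed_rise_c / power_w

# Hypothetical stack under a 20 mm x 20 mm component dissipating 10 W.
area = 0.020 * 0.020  # footprint [m^2]
layers = [  # (name, thickness [m], conductivity [W/(m.K)]), illustrative values
    ("gap pad", 0.5e-3, 3.0),
    ("aluminum plate", 3.0e-3, 167.0),
    ("bond line", 0.1e-3, 1.0),
]
r_layers = [slab_resistance(t, k, area) for _, t, k in layers]
r_total = sum(r_layers)                       # series resistances add [degC/W]
delta_t = 10.0 * r_total                      # temperature rise across the stack
k_eff = sum(t for _, t, _ in layers) / (r_total * area)  # effective conductivity

eta = fin_efficiency(h=50.0, k_fin=167.0, thickness_m=1.0e-3, length_m=0.025)
t_heat = adiabatic_heating_time(0.5, 900.0, 10.0, 20.0)

print(f"stack resistance {r_total:.3f} degC/W, rise {delta_t:.1f} degC")
print(f"effective conductivity {k_eff:.1f} W/(m.K), fin efficiency {eta:.2f}")
print(f"adiabatic time to a 20 degC rise: {t_heat:.0f} s")
```

Because the layers act in series, the effective conductivity always lands between the lowest and highest layer conductivities, which is itself a quick check on the arithmetic.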
From a convection standpoint, understanding convective resistances due to free (natural) and forced convection based on a convective heat transfer coefficient (h value) is important for estimating surface temperatures and the temperature rise through the boundary (film) layer. The lumped capacitance approach facilitates calculating transient responses of a solid in a convective environment if the Biot number can be shown to be much smaller than unity. Understanding dimensionless numbers (e.g., Re, Pr, Ra, Gr, Nu, etc.) and their relevance to certain applications will help characterize the flow regime or relevance of empirical correlations. In general, the field of convection is a patchwork of various empirical correlations for a range of geometries under specific flow conditions, owing to the non-analytical nature of solutions to the Navier-Stokes equations. Therefore, having a reference repository of the relevant correlations for internal and external flow for flat plate or finned geometries will enable accurate thermal characterization efforts [1][2][3][4]. In fact, to many non-thermal engineers’ surprise, translating a fluid mass flow rate to a convective heat transfer coefficient is not necessarily a trivial exercise; it entails coupling the caloric rise in the fluid from a first law energy balance ($Q = \dot{m} c_P (T_{OUT} - T_{IN})$) to Newton’s law of cooling ($Q = hA(T_{SURFACE} - T_{FLUID})$), which can be coupled only via an empirical, geometry-specific Nusselt ($Nu = hL/k_{FLUID}$) or Colburn ($J = St\,Pr^{0.67}$) correlation [5]. Other relevant convection fundamentals include pressure drop (head losses) of fluid flow through an orifice, duct, and fittings [6]. As a side note, Ref. [7] provides an enjoyable collection of anecdotes illustrating conductive and convective heat transfer principles.
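The coupling of mass flow rate to a heat transfer coefficient can be sketched as follows. The Dittus-Boelter correlation is chosen here only as a familiar example for turbulent, fully developed flow in a smooth duct (roughly Re above 10^4); the appropriate correlation is geometry-specific, as noted above. The duct size, flow rate, heat load, air properties, and plate dimensions are all hypothetical.

```python
import math

def dittus_boelter_nu(re, pr):
    """Nu = 0.023 Re^0.8 Pr^0.4 for a fluid being heated in turbulent duct flow."""
    return 0.023 * re**0.8 * pr**0.4

def lumped_temperature(t_s, t_init, t_fluid, tau_s):
    """Lumped-capacitance response: exponential approach to the fluid temperature."""
    return t_fluid + (t_init - t_fluid) * math.exp(-t_s / tau_s)

# Hypothetical case: air near room temperature in a 20 mm round duct,
# 0.3 m long, absorbing 50 W from the duct wall.
mu, k_air, cp, pr = 1.85e-5, 0.0263, 1007.0, 0.707   # approximate air properties
m_dot, d_h, length, q = 0.010, 0.020, 0.30, 50.0

flow_area = math.pi * d_h**2 / 4.0
re = m_dot * d_h / (mu * flow_area)        # Reynolds number from mass flow rate
nu = dittus_boelter_nu(re, pr)
h = nu * k_air / d_h                       # Nu = h L / k_fluid, with L = d_h

air_rise = q / (m_dot * cp)                        # first-law caloric rise
film_rise = q / (h * math.pi * d_h * length)       # Newton's law of cooling

# Lumped-capacitance check for a hypothetical 2 mm aluminum plate cooled on
# both faces, so the characteristic length V / A_s is the half thickness.
rho_s, cp_s, k_s, half_thickness = 2700.0, 900.0, 167.0, 1.0e-3
biot = h * half_thickness / k_s
tau = rho_s * cp_s * half_thickness / h    # time constant rho V c / (h A_s)

print(f"Re = {re:.0f}, Nu = {nu:.1f}, h = {h:.0f} W/(m^2.K)")
print(f"air rise {air_rise:.1f} degC, film rise {film_rise:.1f} degC")
print(f"Bi = {biot:.1e}, time constant = {tau:.1f} s")
```

If the computed Reynolds number falls outside the range of the chosen correlation, or the Biot number is not much smaller than unity, the corresponding result should not be trusted and a different correlation or a spatially resolved model is needed.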
Finally, from a radiation standpoint, radiative cooling is generally not a primary cooling mechanism for most ground-based and airborne defense applications; the exception is hardware that is primarily cooled via free convection. Radiation does, however, represent a critical cooling effect in rarefied gas and space applications. Therefore, it is important to understand surface emissivity, shape factors, and the temperature dependence relationship of radiation exchange between surfaces. Environmental heating, in contrast, whether from aerodynamic air friction due to high-speed convective flows (colloquially, “aeroheating”) or incident solar radiation, can represent critical boundary conditions for airborne applications (and the latter for ground-based and space applications too). Aeroheating entails understanding the total air temperature of a certain flow, which is approximated by the recovery temperature (i.e., Eckert or adiabatic wall temperature). In aeroheating scenarios, a heat shield may be necessary to protect electronics from undesirable heating and may be implemented with a low-emissivity (high reflectivity) or a low-thermal conductivity material depending on the driving mechanism of heat infiltration. In short, unless specifically dealing with solar radiation, supersonic speeds, or free convection designs, radiative cooling is typically not a primary cooling mechanism for electronics hardware in most ground-based and airborne applications.
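As a rough sketch of the two radiation-related estimates mentioned above, the snippet below computes a recovery temperature and the net radiation from a small gray surface to large surroundings. The flight condition, surface values, and function names are hypothetical; the recovery factor of Pr^(1/3) is the usual turbulent boundary layer approximation, and a recovery factor of 1 gives the total temperature.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant [W/(m^2.K^4)]

def recovery_temperature(t_static_k, mach, recovery_factor, gamma=1.4):
    """Adiabatic wall temperature T_r = T (1 + r (gamma - 1)/2 M^2), in kelvin."""
    return t_static_k * (1.0 + recovery_factor * 0.5 * (gamma - 1.0) * mach**2)

def radiation_to_surroundings(emissivity, area_m2, t_surface_k, t_surround_k):
    """Net radiation from a small gray surface to large surroundings, in watts."""
    return emissivity * SIGMA * area_m2 * (t_surface_k**4 - t_surround_k**4)

# Hypothetical flight condition: Mach 0.8 at a static air temperature of 223.15 K.
t_static, mach, pr = 223.15, 0.8, 0.71
t_total = recovery_temperature(t_static, mach, 1.0)
t_recovery = recovery_temperature(t_static, mach, pr ** (1.0 / 3.0))

# Hypothetical chassis face: emissivity 0.9, 0.05 m^2, 350 K, facing 300 K walls.
q_rad = radiation_to_surroundings(0.9, 0.05, 350.0, 300.0)

print(f"total temperature {t_total:.1f} K, recovery temperature {t_recovery:.1f} K")
print(f"net radiated power {q_rad:.1f} W")
```

The fourth-power dependence means absolute temperatures in kelvin must be used; working in degrees Celsius is a common source of error in this hand calculation.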
Best Practice #2: Peer Review
As project deadlines loom or organizations grow, it’s natural to circumvent a critical peer review or minimize its value. This is particularly true for fast-paced analyses supporting projects driven by cost and schedule more than technical rigor. However, it is important to fight the urge to skip a thorough and comprehensive peer review and instead hold one at the appropriate time. The appropriate time for a peer review is (1) late enough that sufficient progress in the analysis has been made for the review to be a reasonable and value-added task, but (2) not so late that any major analysis errors or blind spots identified in the review cannot be corrected in time for the analysis deadline (typically a design review, customer presentation, or deliverable report). Therefore, holding desk checks (i.e., informal or mini peer reviews) early and often is a good practice.
Peer review, by definition, is intended to have a fresh pair (or pairs) of eyes review the model and analysis in detail—ideally from experienced, senior engineers who are subject matter experts (SMEs). Peer review entails a technical deep dive into the analysis, including reviewing the model simulation and all relevant aspects (boundary conditions, geometry, assumptions, material thermophysical properties, etc.). It’s an unreasonable expectation for an engineer to be 100% right 100% of the time; everybody has blind spots or misses, and it’s natural for something to “fall through the cracks” for even the most experienced engineers. Although difficult at times, peer reviews require a degree of humility by the analyst and ultimately make an organization stronger by ensuring that analyses have been verified to be accurate. It’s difficult to overemphasize the value of peer review.
Best Practice #3: Sanity Checks
Sanity checks are a critical tool in the thermal engineer’s analysis armamentarium [8][9]. Sanity checks serve to verify the accuracy of a computational simulation and can be done in piecewise or aggregate approaches. A sanity check is simply a hand calculation—whether on paper, spreadsheet, or analysis tool like Mathcad (PTC) or MATLAB (MathWorks)—with simplifying assumptions to make the problem more feasible for an analytical solution. For example, the temperature rise due to heat conduction through a material (or material stack) with a uniform heat flux assumption would not account for localized heat fluxes nor lateral heat spreading effects but would provide a minimum ΔT value expected. Similarly, the transient temperature rise due to adiabatic heating (energy storage) of a lumped continuum (fluid or solid) in a closed system would not account for any convective or radiative cooling effects but would provide a maximum ΔT value expected. Similarly, the pressure drop due to fluid flow through a fin core (fin stock) or orifice plates would neglect minor head losses due to fittings and ducting bends but would provide a minimum ΔP value expected [2][3]. If constructed correctly, a sanity check could estimate the primary driving effects reasonably well while leaving secondary effects for the computational simulation to resolve in detail.
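The bracketing idea can be sketched as a small helper that compares a simulated value against hand-calculated bounds. The function names, hardware values, and the "simulated" numbers below are hypothetical placeholders, not results from any real model.

```python
def conduction_floor(power_w, thickness_m, k_w_per_mk, area_m2):
    """Uniform-flux 1-D conduction rise: a lower bound on the real delta-T."""
    return power_w * thickness_m / (k_w_per_mk * area_m2)

def adiabatic_ceiling(power_w, duration_s, mass_kg, cp_j_per_kgk):
    """Rise of a lumped mass with no heat loss: an upper bound on the real rise."""
    return power_w * duration_s / (mass_kg * cp_j_per_kgk)

def sanity_check(simulated, floor=None, ceiling=None):
    """Return a list of findings; an empty list means the value is plausible."""
    findings = []
    if floor is not None and simulated < floor:
        findings.append(f"{simulated:.2f} is below the hand-calc floor {floor:.2f}")
    if ceiling is not None and simulated > ceiling:
        findings.append(f"{simulated:.2f} is above the hand-calc ceiling {ceiling:.2f}")
    return findings

# Steady-state check: 10 W through a 3 mm aluminum plate, 20 mm x 20 mm footprint.
floor = conduction_floor(10.0, 3.0e-3, 167.0, 0.020 * 0.020)
print(sanity_check(simulated=1.2, floor=floor))      # placeholder model result

# Transient check: 10 W into a 0.5 kg aluminum mass for 60 s.
ceiling = adiabatic_ceiling(10.0, 60.0, 0.5, 900.0)
print(sanity_check(simulated=2.0, ceiling=ceiling))  # placeholder model result
```

A finding does not by itself say whether the model or the hand calculation is wrong; it only identifies where the two disagree and where to look first.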
The truth is, nearly anybody can learn how to build and solve a CFD model. But without a technical background in the relevant thermal/fluids physics that the thermal engineer has learned both academically and experientially, the risk of a “garbage in, garbage out” exercise is high. The “garbage in, garbage out” caveat simply refers to the situation where erroneous model inputs (boundary conditions, properties, or assumptions) yield erroneous model results. The practice of sanity-checking a simulation is what sets the experienced thermal engineer apart. The key to a sanity check exercise is knowing what formulas and equations are relevant to the specific application. Performing an energy balance (conservation of energy) on a control volume ($E_{IN} - E_{OUT} = E_{STOR} - E_{GEN}$) or going straight to the heat conduction equation and simplifying ($k \nabla^2 T = \rho c_P \, \partial T / \partial t - Q_V$) would be a good starting point. A sanity check is ideally done for every analysis performed by the thermal engineer. Doing so also helps commit relevant formulas and equations to memory, since they are all too easy to forget amid the daily, non-analysis tasks engineers encounter in industry.
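As an illustration of how the heat conduction equation simplifies into the hand calculations used for sanity checks, two limiting cases are worked below. The slab thickness $L$, cross-sectional area $A$, heat flow $Q$, volume $V$, mass $m$, and elapsed time $t$ are symbols introduced here for the sketch.

```latex
% Case 1: steady state, no internal generation, 1-D conduction through a slab
k \frac{d^2 T}{dx^2} = 0
\;\Rightarrow\; T(x) \text{ is linear in } x
\;\Rightarrow\; \Delta T = \frac{Q L}{k A}

% Case 2: spatially uniform temperature (\nabla^2 T = 0) with uniform generation Q_V
\rho c_P \frac{dT}{dt} = Q_V
\;\Rightarrow\; \Delta T = \frac{Q_V \, t}{\rho c_P} = \frac{Q \, t}{m c_P},
\qquad Q = Q_V V, \quad m = \rho V
```

The first case is the one-dimensional conduction rise and the second is the adiabatic heating rise, the two bounding estimates described earlier in this section.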
Furthermore, a related best practice that is important and necessary in any thermal analysis is model validation with test data. Simply put, models are validated with empirical data once the analyzed hardware is available to test. To quote Prof. Richard P. Feynman, “If it disagrees with experiment, it’s wrong.” Test validation is especially critical for deliverable products. The reader is referred to the literature for further discussion of the importance of model validation [10].
Best Practice #4: Documentation
The goals in a documentation effort, whether in report or presentation format, are primarily twofold: (1) present the results of the analysis in a clear way, and (2) outline the methodology taken such that a fellow expert in the field can reasonably recreate the analysis and results. If the analysis or experiment is documented in a way that precludes verifiability or repeatability, then the documentation effort failed to accomplish its purpose.
Ideally, a thoroughly documented thermal analysis encompasses the following sections:
- Executive summary, which states the bottom line up front (BLUF) in a short and concise paragraph
- System Overview
– System-level overview summarizing the mechanical architecture of the electronics hardware analyzed with key CAD graphics
– Historical survey of prior analyses (if relevant)
– Requirements [11]
- Thermal analysis
– Objective(s)
– Geometry, which summarizes the thermal/cooling architecture of the electronics hardware analyzed with key thermal model graphics
– Methodology, including control volume assumptions, thermophysical material properties, boundary conditions, modes of heat transfer accounted for in the analysis, numerical solver settings, and mesh parameters/resolution
– Results in tabular and graphical formats, including temperatures of electronic components and boards; contours (temperature, velocity, pressure)
- Conclusions, including any recommendations and pending or future work
- Backup content, including data reference sources, numerical convergence plots, and any ancillary information relevant to the analysis
The above outline is not intended to be a strict, comprehensive template but rather a guide for what content to include in an analysis report. It excludes necessary sections such as References, Abbreviations, and Appendices. Ideally, all the sections include key graphics as figures illustrating the text.
For most engineers, it is difficult to over-document an analysis. In fact, most engineers are arguably culpable of under-documenting, owing to the stereotype that engineers prefer numbers and equations over words and prose. Nevertheless, documentation is essential because it retains a detailed historical record of the analysis, which will facilitate any future effort recreating the analysis and, more importantly, follow-on efforts expanding on the documented analysis [12]. Additional information regarding the analyzed system, as trivial or obvious as it may seem, will almost certainly prove useful to subsequent generations of engineers reading the artifact for their analysis tasks.
Best Practice #5: Cost, Schedule, and Scope
Understanding cost, schedule, and scope is critical to performing a successful thermal analysis. Cost refers to the budget of labor hours (units of hours) and material (units of dollars) allotted by the project to perform the thermal analysis; labor hours are ultimately converted into dollars via the engineer’s fully-burdened labor rate. Schedule refers to the time duration and deadline set by the project to perform the thermal analysis; a schedule is monetized into pseudo-dollars by setting a timeline of expected tasks’ durations and completion dates. Work scope encompasses the requirements and analysis objectives outlined in the statement of work (formal or informal) regarding the thermal analysis [11].
It is good practice to clearly understand and adhere to these three elements while performing the thermal analysis, and for the thermal engineer to pace themselves accordingly. For example, if a project has only a small budget of 40 labor hours and a short schedule of 2 weeks available to perform a new analysis, then the thermal engineer can outline how much (or little) thermal analysis that will buy the project: perhaps spreadsheet-level calculations and a simple conduction model, or updates to an existing model, with a short slide deck for documentation. In contrast, if a project has a large budget of 960 labor hours over a long schedule of 6 months, effectively translating to full-time support over the 6-month timeline, then the thermal engineer can provide a detailed, comprehensive, system- and subsystem-level CFD analysis with a thoroughly documented written report. If an analysis is projected to overrun on cost or schedule, early communication to project management can oftentimes buy the engineer an increase in budget or schedule. Additionally, (1) “scope creep,” i.e., the increase of work scope over time without a corresponding increase in cost and schedule, and (2) “requirements swirl,” i.e., changing requirements over time, which may be due to under-defined or non-firm requirements, each have the potential to financially sink an entire project.
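The budget arithmetic in the two examples above can be checked with a short sketch. The 40-hour work week, the four-week month, the labor rate, and the function names are assumptions made here for illustration; actual rates and calendars vary by organization.

```python
HOURS_PER_WEEK = 40.0    # assumed full-time work week
WEEKS_PER_MONTH = 4.0    # assumed, for rough planning only

def staffing_level(budget_hours, schedule_weeks):
    """Fraction of one full-time engineer that the budget supports."""
    return budget_hours / (schedule_weeks * HOURS_PER_WEEK)

def labor_cost(budget_hours, burdened_rate_per_hour):
    """Labor budget in dollars at a fully-burdened hourly rate."""
    return budget_hours * burdened_rate_per_hour

small = staffing_level(40.0, 2.0)                      # 40 hours over 2 weeks
large = staffing_level(960.0, 6.0 * WEEKS_PER_MONTH)   # 960 hours over 6 months

hypothetical_rate = 150.0  # dollars per hour, a placeholder value
print(f"small task: {small:.1f} of full time, ${labor_cost(40.0, hypothetical_rate):,.0f}")
print(f"large task: {large:.1f} of full time, ${labor_cost(960.0, hypothetical_rate):,.0f}")
```

Under these assumptions the small task amounts to half-time support for two weeks, while the large task is one engineer working full time for the entire schedule.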
For most engineers, it is important to remain cognizant of the fact that we are employed by for-profit organizations. Cost and schedule are metrics vital to the company’s financial performance, so maintaining cost and schedule constraints while performing assigned analysis tasks is critical for the company’s success and should not be disregarded by the engineer. The thermal engineer has the responsibility of ensuring technical accuracy and soundness in the analysis while adhering to cost, schedule, and scope constraints, which is not a trivial task.
Conclusion
This manuscript outlined five best practices in thermal analysis methodology:
(1) Understanding the relevant fundamental physics;
(2) Having a peer review of the analysis by an experienced, senior engineer;
(3) Performing sanity checks to verify the accuracy of a computational simulation, along with collecting test data to validate the model;
(4) Documenting the analysis clearly and thoroughly;
(5) Adhering to cost, schedule, and scope constraints during the analysis while ensuring technical accuracy.
Following these best practices will promote a successful, well-executed analysis.
Acknowledgements
The author gratefully acknowledges Mr. Greg P. Schaefer, Mr. Dan E. Giles, and Dr. Jim S. Wilson, who have served as mentors to him at Raytheon Technologies Corp.
References
[1] Incropera, F. P. et al., Fundamentals of Heat and Mass Transfer, 6th ed., John Wiley & Sons, 2007.
[2] Steinberg, D.S., Cooling Techniques for Electronic Equipment, 2nd ed., John Wiley & Sons, 1991.
[3] Kays, W. M. and London, A.L., Compact Heat Exchangers, 2nd ed., McGraw-Hill, 1964.
[4] Guyer, E.C., Handbook of Applied Thermal Design, 1st ed., Taylor & Francis, 1999.
[5] Marthinuss, J.E., “Air Cooled Compact Heat Exchanger Design For Electronics Cooling,” Electronics Cooling, February 2004, https://www.electronics-cooling.com/2004/02/air-cooled-compact-heat-exchanger-design-for-electronics-cooling/
[6] Lindeburg, M.R., Mechanical Engineering Reference Manual (MERM), 13th ed., Professional Publications, 2013.
[7] Kordyban, T., Hot Air Rises and Heat Sinks: Everything You Know About Cooling Electronics is Wrong, 1st ed., The American Society of Mechanical Engineers (ASME), 1998.
[8] Electronics Cooling Editorial Board, “Thermal Facts & Fairy Tales: 7 years of college down the drain…,” Electronics Cooling, January 2017, https://www.electronics-cooling.com/2017/01/thermal-facts-fairy-tales-7-years-college-drain/
[9] Wilson, J., “Providing More Value Than Playing Video Games,” Electronics Cooling, November 2007, https://www.electronics-cooling.com/2007/11/providing-more-value-than-playing-video-games/
[10] Mohammed, R., “Experimental Methodologies for Thermal Design in Silicon Validation Platforms,” Electronics Cooling, July 2010, https://www.electronics-cooling.com/2010/07/experimental-methodologies-for-thermal-design-in-silicon-validation-platforms/
[11] Wilson, J., “Thermal Facts and Fairy Tales: Understanding and Defining Electronics Cooling Requirements,” Electronics Cooling, July 2016, https://www.electronics-cooling.com/2016/07/thermal-facts-and-fairy-tales-understanding-and-defining-electronics-cooling-requirements/
[12] Wilson, J., “Editorial: Archival Value,” Electronics Cooling, March 2011, https://www.electronics-cooling.com/2011/03/archival-value/