EE241 - Spring 2005
Advanced Digital Integrated Circuits
Lecture 20: Thermal design
Guest Lecturer: Prof. Mircea Stan, ECE Dept., University of Virginia

Thermal Design
- Why should you care about thermals?
- What do we mean by thermals?
- How do you model thermals?
- What can you do about thermals?
- Temperature-aware circuit design
- Thermal sensors
References:
- Intel® Technology Journal: http://developer.intel.com/technology/itj/
- IBM Journal of Research and Development: http://www.research.ibm.com/journal/rd/
- IEEE Transactions on Components and Packaging Technologies
- IEEE Transactions on VLSI Systems
- IEEE Journal of Solid-State Circuits

Why should you care about thermals?
Temperature affects:
- Circuit performance
- Circuit power (especially leakage)
- System reliability
- IC and system packaging cost
- "Environment"

Circuit Performance vs. Temperature
- Temperature => Transistor threshold and carrier mobility
- Temperature => Performance?
$I_{DS} = \frac{W}{2L}\,\mu C_{ox}\,(V_{GS} - V_{Th})^{\alpha}$
Source: E. Long, W. R. Daasch, R. Madge, B. Benware, "Detection of Temperature Sensitive Defects Using ZTC", VLSI Test Symposium, 2004

Leakage vs. Temperature
- k = 1.38x10^-23 J/K, q = 1.6x10^-19 C
- kT/q = 25.9mV at 27C (300K), 23.5mV at 0C (273K), 32mV at 100C (373K)
- Subthreshold slope: $S = \frac{kT}{q}\ln 10\,\left(1 + \frac{C_d}{C_i}\right)$, so $S > \ln 10 \cdot \frac{kT}{q}$
[Figure: log I_DS (log A) vs. V_GS (V), subthreshold characteristic]
$I_{ds} = \mu\,\frac{W}{L}\left(\frac{kT}{q}\right)^{2} e^{\frac{V_g - V_{Th}}{m\,kT/q}}\left(1 - e^{-\frac{V_{ds}}{kT/q}}\right)$
Source: [Taur, Ning], EECS241 Lecture 3

Leakage Power
- Fraction of leakage power increasing:
  - exponentially with each generation
  - exponentially dependent on temperature
- Increasing ratio for new technology nodes
[Figure: static power / dynamic power (percentage) vs. temperature (298 K to 373 K) for the 180nm, 130nm, 100nm, 90nm, 80nm and 70nm nodes]
Source: Sankaranarayanan et al., University of Virginia

Reliability
The Arrhenius Equation: $MTF = A \exp\left(\frac{E_a}{kT}\right)$
- MTF: mean time to failure at T
- A: empirical constant
- Ea: activation energy
- k: Boltzmann's constant
- T: absolute temperature
Failure mechanisms:
- Die metallization (corrosion, electromigration, contact spiking)
- Oxide (charge trapping, oxide breakdown, hot electrons)
- Device (ionic contamination, second breakdown, surface-charge)
- Die attach (fracture, thermal breakdown, adhesion fatigue)
- Interconnect (wirebond failure, flip-chip joint failure)
- Package (cracking, whisker and dendritic growth, lid seal failure)

System Packaging Cost
- Today…
- Grid computing: power plants co-located near compute farms
- IBM S/390: refrigeration
Source: R. R. Schmidt, B. D. Notohardjono, "High-end server low temperature cooling", IBM Journal of R&D

IC Packaging Cost
- IBM S/390 processor subassembly: complex!
- C4: Controlled Collapse Chip Connection (flip-chip)
Source: R. R. Schmidt, B. D. Notohardjono, "High-end server low temperature cooling", IBM Journal of R&D
- Desktop processor, simpler, but still… (Pentium 4, Itanium)
Source: Intel web site

"Environment"
- Environmental Protection Agency (EPA): computers account for 10% of commercial electricity consumption
  - This includes peripherals, possibly also manufacturing
  - A DOE report suggested this percentage is much lower
  - No consensus, but it's still a lot
- Equivalent power (with only 30% efficiency) for AC
- CFCs used for refrigeration
- Lap burn
- Fan noise

Ultimate Effect: Thermal Runaway
- Temperature => Leakage power => Temperature …
- "Loop gain" > 1: trouble!
Source: Tom's Hardware Guide http://www6.tomshardware.com/cpu/01q3/010917/heatvideo-01.html

What do we mean by thermals?
- Anything that has to do with heat/temperature
- Heat is a form of energy transfer
- Temperature is a measure of entropy and determines heat flow
Source: http://www.iun.edu/~cpanhd/C101webnotes/matter-and-energy/specificheat.html

Heat mechanisms
- Heat Conduction: phonons, vibrations
- Heat Convection: fluid molecules movement
- Heat Radiation: photons, EM waves
- Phase change: boiling, sublimation, condensation, etc.
- Heat storage: specific heat
- Refrigeration: move heat "backwards"
- Many other mechanisms…

Conduction
- "Similar" to electrical conduction (e.g. metals are good conductors)
- Heat flow from high temperature to low temperature
- Microscopic (vibration, adjacent molecules, electron transport)
- In a material: typically in solids (fluids: distance between molecules)
- Typical example: thermal "slug", spreader, heatsink
Source: CRC Press, R. Remsburg Ed., "Thermal Design of Electronic Equipment", 2001

Convection
- Macroscopic (bulk transport, mix of hot and cold, energy storage)
- Need material (typically in fluids: liquid, gas)
- Natural vs. forced (air or liquid)
- Typical example: heatsink (fan), liquid cooling
Source: CRC Press, R. Remsburg Ed., "Thermal Design of Electronic Equipment", 2001

Simplistic Thermal Model
- Most thermal transfers: R = k/A (thermal resistance is inversely proportional to area A)
- Power density matters!
- Ohm's law for thermals (steady-state): ∆V = I · R -> ∆T = P · R
- T_hot = P · Rth + T_amb
- Ways to reduce T_hot:
  - reduce P (power-aware)
  - reduce Rth (packaging)
  - reduce T_amb (move to Alaska?)
  - maybe also take advantage of transients (Cth)

Simplistic Dynamic Model
Electrical-thermal duality:
- V ≅ temp (T)
- I ≅ power (P)
- R ≅ thermal resistance (Rth)
- C ≅ thermal capacitance (Cth)
- RC ≅ time constant
- KCL differential eq.: I = C · dV/dt + V/R
- difference eq.: ∆V = I/C · ∆t − V/(RC) · ∆t
- thermal domain: ∆T = P/C · ∆t − T/(RC) · ∆t (T = T_hot – T_amb)
One can compute stepwise changes in temperature for any granularity at which one can get P, T, R, C

IC with die, package, heatsink
- R = T/Q (analogous to R = V/I)
- Rja = Rjc + Rcs + Rsa = (Tj - Ta)/Q
- Rsa = ((Tj - Ta)/Q) - Rjc - Rcs

Hot spots in Power4
- Temperature "landscape": space and time
- How to estimate early in the design cycle?

Trends in Power Density
[Figure: power density (Watts/cm2) vs. process generation (1.5µ to 0.07µ) for i386, i486, Pentium®, Pentium® Pro, Pentium® II, Pentium® III and Pentium® 4, against reference levels for a hot plate, a nuclear reactor and a rocket nozzle]
Source: "New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies", Fred Pollack, Intel Corp.

Thermals for low-power ICs
- Different: little self-generated heat
- But…
  - Cheaper packaging (higher Rth): challenge
  - More extreme ambient (freezing to hot)
  - Temporal thermal effects more important than spatial

How do you model thermals?
Source: S. Wunsche, C. Clauss, P. Schwarz, F. Winkler, "Electro-thermal circuit simulation using simulator coupling", IEEE Transactions on VLSI Systems, Sep 1997

Why do we need to model thermals?
- Power metrics are not an acceptable proxy
- Chip-wide average will not capture hot spots
- Localized average will not capture lateral coupling
- Different units have different power densities

Power electronics: long time ago!
Source: R. Castello, P. Antognetti, "Integrated-circuit thermal modeling", IEEE Journal of Solid-State Circuits, Jun 1978

Model (package)
- "Vertical" heat flow

Model (die)
- Block granularity (architecture)
- Grid (circuits)
- Also lateral flow

Spatial behavior - Hot Spots
Source: W. Huang, S. Ghosh, K. Sankaranarayanan, K. Skadron, and M. R. Stan, "Compact Thermal Modeling for Temperature-Aware Design", 41st ACM/IEEE Design Automation Conference (DAC), June 2004

Time-Varying Behavior – Hot Spots
[Figure label: mesa]

Tool validation: on-chip measurements
Source: M. R. Stan, K. Skadron, M. Barcella, W. Huang, K. Sankaranarayanan, and S. Velusamy, "HotSpot: A Dynamic Compact Thermal Model at the Processor-Architecture Level", Microelectronics Journal: Circuits and Systems, Dec. 2003

Dynamic validation: measurements
- Micred test chip, transient vs. HotSpot

What can you do about thermals?
- Better estimates of performance, power, reliability
- Optimize at design time (e.g. package co-design)
- Adapt at run-time

The Role of a Thermal Model
- Helps close the loop for accurate design estimations: static or dynamic
[Figure: Power Model, Thermal Model, Performance Model, Reliability Model]

Self-consistent leakage

Design flow: still work in progress!
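
The "Self-consistent leakage" idea and the thermal-runaway loop (temperature raises leakage power, which raises temperature) can be illustrated with the steady-state relation T_hot = P · Rth + T_amb. The following is a minimal Python sketch, not the lecture's method: the exponential leakage fit and every parameter value (p_leak_ref, t_ref, beta, p_dyn, r_th, t_limit) are hypothetical assumptions chosen only for illustration.

```python
import math


def leakage_power(t_kelvin, p_leak_ref=10.0, t_ref=300.0, beta=0.02):
    """Hypothetical fit: leakage power (W) grows exponentially with temperature."""
    return p_leak_ref * math.exp(beta * (t_kelvin - t_ref))


def self_consistent_temperature(p_dyn, r_th, t_amb,
                                max_iter=200, tol=1e-6, t_limit=500.0):
    """Fixed-point iteration on T = T_amb + (P_dyn + P_leak(T)) * Rth.

    Returns (temperature, converged). converged is False when the estimate
    exceeds t_limit or the iteration budget runs out, which this sketch
    treats as thermal runaway.
    """
    t = t_amb
    for _ in range(max_iter):
        # Steady-state "Ohm's law for thermals" with temperature-dependent power.
        t_new = t_amb + (p_dyn + leakage_power(t)) * r_th
        if t_new > t_limit:
            return t_new, False
        if abs(t_new - t) < tol:
            return t_new, True
        t = t_new
    return t, False


# Good package (low Rth): the loop settles at a self-consistent temperature.
t_ok, converged_ok = self_consistent_temperature(p_dyn=50.0, r_th=0.5, t_amb=300.0)

# Poor package (high Rth): "loop gain" exceeds 1 and the estimate diverges.
t_bad, converged_bad = self_consistent_temperature(p_dyn=50.0, r_th=2.0, t_amb=300.0)
```

With these assumed numbers the first call settles at about 335 K, while the second never finds a fixed point. The local "loop gain" in this sketch is Rth · dP_leak/dT, so lowering Rth (packaging) or leakage sensitivity is what restores convergence.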
Package co-design
- For 200 traces (TPC-C, SPEC, Microsoft)
- Thermal design point can be reduced to 75% of true "max power" with minimal performance loss
  - Aggressive clock gating
  - Application variations
  - Underutilized resources
Source: Intel

Thermal Performance Graph
- How to select a heat sink
Source: Seri Lee, Aavid Thermal Technologies, http://www.electronics-cooling.com/Resources/EC_Articles/JUN95/jun95_01.htm

Adapt at run-time
[Figure: temperature vs. time, showing the cooling capacity designed for without DTM, the cooling capacity designed for with DTM, the system cost savings between them, the DTM trigger level, and the DTM disabled vs. DTM/response engaged regions]
Source: David Brooks 2002

Temperature-Aware circuit design
- Power: first-order design constraint
  - max power consumption: limits power delivery
  - sustained power dissipation: limits thermal design/packaging
  - average active power and idle power consumption: limit battery life, etc.
  - fallacy: instantaneous power ≠ temperature
- Power-aware design: maximize performance for given power
- Low-power design: minimize power for required performance
- Temperature-aware design:
  - performance, power, reliability: function of T
  - T: function of power density, ambient T
  - maximize performance for given thermal envelope
  - related to Power Density

Performance and Leakage
- Temperature (Berkeley PTM 70nm CMOS):
  - Transistor threshold and mobility
  - Subthreshold leakage, gate leakage
  - Ion, Ioff, delay

Temperature-aware circuits
- Robustness constraint: sets Ion/Ioff ratio
- Robustness and reliability: Ion/Igate ratio
- 70nm CMOS, 1.2V, 110°C: Ion/Ioff ~ 1000, Ion/Igate ~ 10000
- Idea: keep ratio constant with T
- Trade leakage for performance
Ref: Ghoshal et al. "Refrigeration Technologies…", ISSCC 2000; Garrett et al. "T3…", ISCAS 2001

Adaptive Ion/Ix control
- Ion/Ioff = B/A = const. through ABB (adaptive body biasing)
- Temperature-aware circuits (TAC) patent (2004)

Resulting voltages
- Wide range: -0.4V < Vbb < 0.4V; 1.2V < Vdd < 1.3V
- Almost linear
- Robust to inter-die parameter variations
  - Needs trimming for setpoint
- Margin for intra-die parameter variations
- Active cooling or natural thermal landscape

Resulting performance
- 25% extra performance (110°C to 0°C) – only NMOS
- 13% from low temperature alone

Temperature-Aware SRAM
[Figure: SRAM array with bit decoders, pre-charge, bit cells with access transistors (N1), wordlines (number of entries), bitlines (data width of entries), sense amps, and number of ports]
- Worst-case bitline leakage limits performance

SRAM Read time
- Same circuit, different application
- 6T SRAM memory: "reverse application" (heating)
- 70nm process (200mV threshold)
- Zero biasing at low temperature

SRAM bit-line sensing
- Differential sensing (100mV bitline difference)
- 128 cells per bit line
- Faster read even with higher RBB and smaller Ion

Electro-thermal simulations
Source: Jia Tzer Hsu, L. Vu-Quoc, "A rational formulation of thermal circuit models for electrothermal simulation. I. Finite element method [power electronic systems]", IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications

Also need electro-thermal models
Source: S. Wunsche, C. Clauss, P. Schwarz, F. Winkler, "Electro-thermal circuit simulation using simulator coupling", IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Sep 1997

SOI circuits
Source: Wei Jin, Weidong Liu, S.K.H. Fung, P.C.H. Chan, Chenming Hu, "SOI thermal impedance extraction methodology and its significance for circuit simulation", IEEE Transactions on Electron Devices, Apr 2001

Refrigeration
- "conventional" vs. thermo-electric (TEC)
- Can get T < T_amb (Rth < 0!)
- TEC: Peltier effect (can use for local cooling)

TEC electro-thermal model

Sensors needed for run-time
- Thermocouples – voltage output
  - Junction between wires of different materials; voltage at terminals is proportional to Tref – Tjunction
  - Often used for external measurements
- Thermal diodes – voltage output
  - Biased p-n junction; voltage drop for a known current is temperature-dependent
- Biased resistors (thermistors) – voltage output
  - Voltage drop for a known current is temperature-dependent
  - You can also think of this as varying R
  - Example: 1 K metal "snake"
- BiCMOS, CMOS – voltage or current output
  - Rely on reference voltage or current generated from a reference band-gap circuit; or simple ring oscillators with no reference
- Relative (just need to adapt) vs. Absolute sensors (need actual T)
- May need a Reference – typically a Bandgap circuit

Typical Sensor Configuration
- PTAT – Proportional to Absolute Temperature

Absolute Sensor
- Delta Vgs Current Reference Generator and Delay Cell
Source: Syal, Lee, Ivanov, Altet, Online Testing Workshop, 2001

Sensors: Problem Issues
- Poor control of CMOS transistor parameters
- Noisy environment
  - Cross talk
  - Ground noise
  - Power supply noise
- These can be reduced by making the sensor larger
  - This increases power dissipation
  - But we may want many sensors

Calibration
- Accuracy vs. Precision
  - Analogous to mean vs. stdev
- Calibration deals with accuracy
  - The main issue is to reduce inter-die variations in offset
  - Typically requires per-part testing and configuration
  - Basic idea: measure offset, store it, then subtract this from dynamic measurements

Recap: Thermal Design
- Why should you care about thermals?
- What do we mean by thermals?
- How do you model thermals?
- What can you do about thermals?
- Temperature-aware circuit design
- Thermal sensors
Questions?
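
The basic idea from the Calibration slide (measure the offset, store it, then subtract it from dynamic measurements) and the accuracy vs. precision distinction (mean vs. stdev) can be sketched in a few lines of Python. The function names and the sample readings below are illustrative assumptions, not data or code from the lecture.

```python
from statistics import mean, stdev


def measure_offset(readings, true_temperature):
    """Per-part offset: mean sensor error at a known reference temperature."""
    return mean(readings) - true_temperature


def calibrate(reading, stored_offset):
    """Subtract the stored per-part offset from a run-time reading."""
    return reading - stored_offset


# Hypothetical test-time readings (deg C) taken at a known 50.0 C reference.
test_readings = [53.1, 52.9, 53.0, 53.2, 52.8]

offset = measure_offset(test_readings, 50.0)  # accuracy error (mean)
spread = stdev(test_readings)                 # precision (stdev)

# Calibration recentres the readings but leaves their spread unchanged.
corrected = [calibrate(r, offset) for r in test_readings]
```

In this sketch the stored offset removes the systematic (inter-die) error, while the spread of the readings is untouched, which is why calibration addresses accuracy and not precision.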