Principles of Computer Architecture, Dr. Mike Frank
Semiconductor Technology Basics

Why Semiconductors?
• Conductors always have a high concentration of electrons in conduction bands
  – states that are free to move through the material
• Insulators always have virtually zero electrons in such bands
  – conduction band energy is too high
  – all the electrons are stuck in valence bands
    • localized to particular atoms/molecules in the material
• Semiconductors have a conduction band whose electron population is easily manipulated
  – Sensitive to dopants, applied potentials, temperature

Electronic Structure of Silicon
• Silicon, atomic number: 14
  – s+p orbitals of shell 3 are (together) half full
  [Figure: orbital filling diagram for 1s, 2s, 2p, 3s, 3p]
  – Like in Carbon (element 6), s,p orbitals can rearrange to form four sp3 hybrid orbitals w. tetrahedral symmetry
  – Each Si can share electrons with 4 neighboring Si’s to fill all the shell-3 sp3 orbitals, giving a stable tetrahedral lattice, like diamond

Electrons & Holes
• At normal temperatures,
  – a small percentage of shell-3 electrons will be free of the bond orbitals
    • wandering thru the lattice…
  – leaving a “hole” in the lattice point they left
    • a hole acts like a positively charged particle
• Once created, holes can “move,” too…
  – by a nearby electron hopping over to fill them
  – however, hole mobility is usually lower than that of electrons

Donor & Acceptor Dopants
• Boron (element 5) is one electron shy of having a half-empty shell 2 that would fit Si lattice
  – Boron atoms readily accept extra mobile electrons and lock them in place, forming a negative B- ion
    • Reduces free-electron concentration, increases hole concentration when implanted into silicon
• Phosphorus (element 15) has one too many shell-3 electrons to fit in Si lattice
  – Donates the extra electron readily to conduction band, forming a positive P+ ion
    • Increases free-electron conc., decreases hole conc.
  [Figure: orbital filling diagrams for B and P]

p-type vs. n-type Silicon
• Pure silicon:
  – Has an equal number of positive & negative charge carriers (holes & electrons, resp.)
• Acceptor-doped (e.g., boron-doped) silicon:
  – Has a charge-carrier concentration heavily dominated by positive charge carriers (holes, h+)
    • Balanced by negative, immobile ions of acceptor atom
  – We call it a “p-type” semiconductor.
• Donor-doped (e.g., phosphorus-doped) silicon:
  – Has charge-carrier concentration heavily dominated by negative charge carriers (electrons, e-)
    • Balanced by positive, immobile ions of donor atom
  – Call it “n-type” semiconductor

pn junctions
• What happens when you put p-type and n-type silicon in direct contact with each other?
  – Near the junction, electrons from the n and holes from the p diffuse into & annihilate each other!
  – Forms a depletion region free of charge carriers
[Figure: p-type region (mobile h+, fixed B- ions) and n-type region (mobile e-, fixed P+ ions) separated by a depletion region containing only the fixed ions]

pn junction electrostatics
[Figure: the same junction, with plots of charge density, electric field, and electrostatic potential across the depletion region, showing the built-in voltage; cf. Pierret ‘96]

npn MOSFET (n-FET)
Metal-Oxide-Semiconductor Field-Effect Transistor
[Figure: n-p-n structure under a gate electrode at Vbias, with a plot of electron potential energy (negative of electric potential) as seen by electrons; the barrier in the p region drops when Vbias > 0 and the gate voltage > Vt]

CMOS Inverters
[Figure: (a) CMOS inverter structure. (b) Transition curves.]

Semiconductor Technology Scaling

Technology Scaling: Notation
• Historically, device feature length scales have decreased by ~12%/year.
  – So: feature length $\ell \propto 0.88^{\text{year}}$
  – $1/\ell \propto (1/0.88)^{\text{year}} \approx 1.14^{\text{year}}$
    • up 14%/year
• Meanwhile, typical CPU die diameters have increased by ~2.3%/year.
(Less stable trend.)
  – Diameter $D \propto 1.023^{\text{year}}$
  – $1/D \propto 0.978^{\text{year}}$
• Quantities that are constant over time are written as 1.

Resistance Scaling
• Fixed-shape wire (any shape): $R \propto \ell/wt \propto \ell/\ell^2 = \ell^{-1}$
  – All dimensions scaling equally.
  – E.g. a local interconnect in a small scaled logic block / functional unit
  [Figure: wire of length $\ell$, width w, thickness t, with current flow along its length]
• Constant-length thin wire: $R \propto 1/\ell^2 = \ell^{-2}$
• Thin cross-chip wire: $R \propto D/\ell^2 = D\,\ell^{-2}$!
  – Up 33%/year!
  – Long-distance wires have to be extra thick to be fast
    • But, fewer thick wires can fit!

Capacitance Scaling
• Fixed-shape structure (any): $C \propto \ell w/s \propto \ell\cdot\ell/\ell = \ell$
  – E.g. scaled devices/wires
• Per unit wire length: $C \propto w/s \propto \ell/\ell = 1$ (constant)
• Cross-chip thin wire: $C \propto D$
• Per unit area: $C \propto 1/s \propto 1/\ell$
  – E.g., total on-chip cap./cm2
  [Figure: conductor of width w separated from its neighbor by spacing s]

Some 1st-order Semiconductor Scaling Laws
• Voltages $V \propto \ell$ (due to e.g. punch-through)
• Long-term: temperature $T \propto \ell$ (prevents leakage)
• Resistance:
  – Fixed-shape wire: $R \propto \ell/wt \propto \ell/\ell^2 = \ell^{-1}$
  – Thin cross-chip wire: $R \propto D/\ell^2 = D\,\ell^{-2}$
• Capacitance:
  – Fixed-shape structure: $C \propto \ell w/s \propto \ell\cdot\ell/\ell = \ell$
  – Per unit wire length: $C \propto 1$ (constant)
  – Cross-chip wire: $C \propto D$
  – Per unit area: $C \propto 1/s \propto 1/\ell$

Why Voltage Scaling?
• For many years, logic voltages were maintained at fairly constant levels as transistors shrank
  – TTL 5 V logic was standard for many years
  – later 3.3 V, now: ~1 V within leading-edge CPUs
• Further shrinkage w/o voltage scaling is no longer possible, due to various effects:
  – Punch-through
  – Device degradation from hot carriers
  – Gate-insulator failure
  – Carrier velocity saturation
• In general, things break down at high field strengths
  – constant-field voltage scaling may be preferred

Punch-Through
[Figure: n-p-n structure under a gate electrode at Vbias, with the electron potential profile shown at zero bias, moderate bias, strong bias, and very strong bias]

Need for Voltage Scaling
[Figure: a larger and a smaller n-p-n device at the same Vbias]
Smaller size & same voltage → higher electric field strengths → easier punch-through

Long-term Temperature Scaling?
• May be needed in the long term.
• Sub-threshold power dissipation across “off” transistors is based on the leakage current density $\propto \exp(-V_t/\phi_T)$
  – $V_t$ is the threshold voltage
    • Must scale down with Vdd, or else transistor can’t turn on!
  – $\phi_T$ is the thermal voltage at temperature T
    • Equal to $k_B T/q$, where q is electron charge magnitude
    • Voltage spread of individual electrons from thermal noise
• As voltages decrease,
  – leakage power will dominate
  – devices will become unable to store charge
    • Unless (eventually), $T \propto V$
• Only alternative to low T: Scaling halts!
  – Probably what must happen, because low temps. imply slow rate of quantum evolution.
Unfortunately, lower T → fewer charge carriers!

Delay Scaling
• Charging time delay $t \propto RC$:
  – Through fixed-shape conductor: $RC \propto \ell^{-1}\cdot\ell = 1$
  – Thin constant-length wire: $RC \propto \ell^{-2}$
  – Via cross-die thin wire: $RC \propto D\,\ell^{-2}\cdot D = D^2\ell^{-2}$, up 36%/yr!
  – Through a transistor: $RC \propto 1\cdot\ell = \ell$
• Implications:
  – Transistors increasingly faster than long thin wires.
  – Even becoming faster than fixed-shape wires!
  – Local communication among chip elements is becoming increasingly favored!

Performance scaling
• Performance characteristics:
  – Clock frequency for small, transistor-delay-dominated local structures: $f \propto 1/t \propto \ell^{-1}$ (up 14%/yr)
  – Transistor density (per area): $d = 1/\ell^2 = \ell^{-2}$
  – Perf. density $R_A = fd = \ell^{-3}$; chip area: $A \propto D^2$
  – Total raw performance (local transitions / chip / time): $R = fdA = \ell^{-3}D^2 \propto 1.55^{\text{year}}$
    • Increases 55% each year!
    • Nearly doubles every 18 months (like Moore’s Law).
• Raw performance has (in the past) been harnessed for improvements in serial microprocessor performance.
• Future architectures will need to move to more parallel programming models to fully use further improvements.
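
The yearly rates quoted on the delay and performance slides all follow from the two base trends (feature length × 0.88 per year, die diameter × 1.023 per year). The sketch below recomputes them. The helper name `annual_factor` and the variable names are illustrative, not from the slides, and the two trends are assumed to hold exactly. Because the slides round 1/0.88 up to 1.14, direct computation gives slightly smaller figures than the quoted 36% and 55% (roughly 35% and 54%).

```python
import math

# Base trends quoted in the slides, taken as exact for this sketch.
FEATURE_RATE = 0.88    # feature length l is multiplied by 0.88 each year
DIAMETER_RATE = 1.023  # die diameter D is multiplied by 1.023 each year


def annual_factor(length_exp, diameter_exp=0.0):
    """Yearly multiplier of a quantity scaling as l**length_exp * D**diameter_exp."""
    return FEATURE_RATE ** length_exp * DIAMETER_RATE ** diameter_exp


transistor_delay = annual_factor(1)           # RC ~ l
cross_die_wire_delay = annual_factor(-2, 2)   # RC ~ D^2 / l^2
clock_frequency = annual_factor(-1)           # f ~ 1/l
raw_performance = annual_factor(-3, 2)        # R = f*d*A ~ l^-3 * D^2

# Time for raw performance to double at that yearly multiplier.
doubling_years = math.log(2) / math.log(raw_performance)

for name, factor in [
    ("transistor delay", transistor_delay),
    ("cross-die wire delay", cross_die_wire_delay),
    ("clock frequency", clock_frequency),
    ("raw performance", raw_performance),
]:
    print(f"{name}: {100 * (factor - 1):+.1f}%/year")
print(f"raw performance doubles every {12 * doubling_years:.0f} months")
```

The exponents passed to `annual_factor` are the powers of feature length and die diameter read off the slides, so any other scaling law from this section can be checked the same way.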
Charges & Currents
• Charges & fields:
  – Charge on a structure: $Q = CV \propto \ell\cdot\ell = \ell^2$
  – Surface charge density: $Q/A \propto \ell^2/\ell^2 = 1$
  – Electric field strengths: $E = V/\ell \propto 1$
• Currents:
  – Resistivity $\rho$: constant
  – Peak current densities: $J = E/\rho \propto 1$
  – Peak current in a wire: $I = JA \propto \ell^2$
  – Channel-crossing times: $t = \ell/v \propto \ell$
    • Due to constant electron saturation velocity $v \approx$ 200 kmph (read here as 200,000 mph, i.e. on the order of 100 km/s)
  – Current in an on-transistor: $I = Q/t \propto \ell^2/\ell = \ell$
  – Effective trans. on-resistance: $R = V/I \propto \ell/\ell = 1$
    • ~4-20 kΩ is typical for a min-sized transistor

Interconnect Scaling
• Since transistor delay $d_t$ scales as $\ell$,
• And wire delay $d_w$ (w. scaled cross-section size) for a wire of length $L$ scales as $RC \propto (L/wt)(Lw/s) = L^2/st \propto L^2/\ell^2 = L^2\ell^{-2}$,
• Then to keep $d_w < d_t$ (1-cycle access) requires: $L^2\ell^{-2} < \ell$, so $L^2 < \ell/\ell^{-2} = \ell^3$, so $L < \ell^{3/2}$
• So wire length in units of transistor length $\ell$ is $L/\ell < \ell^{3/2}/\ell = \ell^{1/2}$ (down 6%/year)
• So number of devices accessible within a constant × $d_t$ in 2-D goes as $(\ell^{1/2})^2 = \ell$, in 3-D as $(\ell^{1/2})^3 = \ell^{3/2}$.
  – Circuits must be increasingly local.

Energy and Power
• Energy:
  – Energy on a structure: $E \propto QV \propto CV^2 \propto \ell\cdot\ell^2 = \ell^3$
  – Energy per-area: $E_A \propto CV^2/A \propto \ell^3/\ell^2 = \ell$
  – Energy densities: $E/\ell^3 \propto \ell^3/\ell^3 = 1$ (not a problem)
• Power levels:
  – Per-area power: $P_A = E_A f \propto \ell\cdot\ell^{-1} = 1$ (not a problem)
  – Power per die: $P = P_A A \propto D^2$ (up ~5%/year)
• Power-per-performance: $P_A/R_A \propto 1/\ell^{-3} = \ell^3$
• But, if constant-field scaling is not used (and it has not been, very much, and cannot be much further), all the above scaling rates get increased by the square of the field strength (F) scaling rate.
  – Because $V \propto F\cdot\ell$, and E and P scale with $V^2$.

3-D Scalability?
• Consider stacking circuits in 3-D within a constant volume.
• # of layers n: $\propto 1/\text{thickness} \propto 1/\ell$
• Total power: $P_T = P(\text{flat chip}) \times n \propto \ell^{-1}$
• Enclosing surface area $A_E$: $\propto 1$
• Power flux (if not recycled): $P_T/A_E \propto \ell^{-1}/1 = \ell^{-1}$
  – For this to be possible, coolant velocity &/or thermal conductivity must also increase as $\ell^{-1}$!
    • Probably not feasible.
• Power recycling is needed to scale in 3-D!
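
As a rough numerical check of the interconnect and power slides, the sketch below turns the exponents on feature length and die diameter into yearly multipliers. It assumes the same two base trends as the notation slide (feature length × 0.88 per year, die diameter × 1.023 per year) and constant-field scaling throughout. All variable names are illustrative.

```python
# Base trends from the notation slide, taken as exact for this sketch.
FEATURE_RATE = 0.88    # feature length l, yearly multiplier
DIAMETER_RATE = 1.023  # die diameter D, yearly multiplier

# Single-cycle wire reach, measured in transistor lengths: L/l < l^(1/2)
reach_in_transistor_lengths = FEATURE_RATE ** 0.5

# Devices reachable within a fixed number of transistor delays
devices_reachable_2d = reach_in_transistor_lengths ** 2   # ~ l
devices_reachable_3d = reach_in_transistor_lengths ** 3   # ~ l^(3/2)

# Power per die ~ D^2 (per-area power is constant under constant-field scaling)
power_per_die = DIAMETER_RATE ** 2

# Power per unit of raw performance ~ l^3
power_per_performance = FEATURE_RATE ** 3


def percent_per_year(factor):
    """Convert a yearly multiplier into a signed percentage change."""
    return 100.0 * (factor - 1.0)


for name, factor in [
    ("1-cycle reach (transistor lengths)", reach_in_transistor_lengths),
    ("devices reachable, 2-D", devices_reachable_2d),
    ("devices reachable, 3-D", devices_reachable_3d),
    ("power per die", power_per_die),
    ("power per performance", power_per_performance),
]:
    print(f"{name}: {percent_per_year(factor):+.1f}%/year")
```

The reach comes out at about 6% less per year and the die power at about 5% more, matching the slide figures. The 2-D and 3-D device counts shrink faster still, which is the quantitative form of "circuits must be increasingly local".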
Semiconductor Technology Limits

Types of Limits
• Meindl ‘95 identifies several kinds of limits on VLSI (from most to least fundamental):
  – Theoretical limits (focus on energy & delay)
    • Fundamental limits (such as we already discussed)
    • Material limits (dependent on materials used)
    • Device limits (dependent on structure & geometry)
    • Circuit limits (dependent on circuit styles used)
    • System limits (dependent on architecture & packaging)
  – Practical limits
    • Design limits
    • Manufacturing limits

Fundamental Limits
• Thermodynamic limits
  – Minimum dissipation per bit erasure
    • kT ln 2 limit. More stringent limits for reliability coming up.
  – Subthreshold conduction leakage currents
    • $I_{on}/I_{off} \sim \exp(V_{dd}/\phi_T)$
• Quantum mechanical limits
  – Tunneling leakage currents (cf. Mead ’94, next slide)
  – Energy-time uncertainty principle $\Delta E \geq h/\Delta t$
    • Related to Margolus-Levitin bound $t_{nop} \geq \tfrac{1}{2}h/(E - E_0)$
• Electromagnetic (relativistic) limits
  – Speed-of-light lower bound on delay for an interconnect of a given length $\ell$: $t \geq \ell/c$.

Tunneling Limit on Device Size
• This graph plots the de Broglie wavelength $\lambda = h\,(2mE)^{-1/2}$ of electrons of effective mass m having kinetic energy equal to a given barrier height E.
• This is also the min. barrier width needed to prevent electrons from tunneling with probability greater than $3.5\times10^{-6}$.
[Figure: de Broglie wavelength vs. barrier height]

Material Limits
• Carrier mobility (carrier velocity/field strength)
  – Affects carrier velocity, on-current, transition time
  – 6x higher in GaAs than in Si, but only at low field
• Carrier saturation velocity (max velocity)
  – Nearly equal for Si and GaAs.
  – Velocity maxes out @ ~100 nm/ps
  – Occurs @ ~1-10 V/μm in Si (depends on doping)
• Breakdown field strength $E_c$
  – 33% higher in GaAs than Si
• Thermal conductivity – next slide
• Dielectric constants – slide after

Thermal Conductivity
• For a given (device+heat-sink) structure, $P \propto K\,\Delta T$
  – P - rate of heat removal (power)
  – K - thermal conductivity of materials used
  – ΔT - how much hotter is device than its surroundings
• K is 3x lower in GaAs than in Si
  – Implies that GaAs is 3x slower than Si when speed is limited by conductive cooling through substrate (often true)!
• Highest known K: Diamond!
  – K = 2 kW/m·K, 14 times higher than Silicon!
  – Can be a semiconductor if Boron-doped, or an insulator if not.
    • Also has high mobility, high breakdown voltage, & good tolerance for high-temperature operation.
  – NTT recently demonstrated a diamond semiconductor capable of 81 GHz frequencies in analog applications.
    • Apollo Diamond in Massachusetts is developing a cheap manufacturing capability for single-crystal diamond wafers using CVD.

Dielectric Constants
• Dielectric constants $\kappa = \varepsilon/\varepsilon_0 = C/C_0$. For SiO2, $\kappa \approx 4$.
  – Want high $\kappa$ in thin gate dielectrics,
    • To maximize channel surface-charge density, & thus on-current, for given $V_{G,on}$,
    • But avoid very low thickness w. high tunneling leakage.
    • But, material must also be an insulator! (SrTiO3: $\kappa$ = 310!)
  – Want low $\kappa$ for thick interconnect (“field”) insulators
    • To minimize parasitic C and delay of interconnects
    • Lowest possible $\kappa$ is that of vacuum (1). Air is close.

Some Device Limits
• MOSFET channel length
  – Generally, the lower, the better!
    • Reduces load capacitance & thus load charging time.
  – But, lengths are lower-bounded by the following:
    • Manufacturing limits, such as lithography wavelengths.
    • Supply voltage lower-limits to keep a decent Ion/Ioff.
    • Depletion region thickness due to dopant density limits.
    • Yield, in the face of threshold variation due to statistical fluctuation in dopant concentrations.
    • Source-to-drain tunneling.
• Distributed RC network response time
  – Limited by:
    • $\rho$ of wires (e.g. the recent shift from Al to Cu)
    • $\kappa$ of insulators (at most, 4x less than SiO2 is possible)
    • Widths, lengths of wires: limited by basic geometry

Circuit Limits
• Power supply voltage limits (later)
• Switching energy limits (later)
• Gate delays:
  – Fundamentally limited by transistor characteristics, RC network charging times
    • each of which is limited as per previous slide
  – There is a fastest possible logic gate in any given device technology
    • esp. considering it has to be switched by similar gates
  – Static CMOS & its close relatives (precharged domino, NORA) are probably close to the fastest-possible gates using CMOS transistors in a given tech. generation.

System Limits
We’ll discuss these more later in the course…
• Architectural limits
• Power dissipation
• Heat removal capability of packaging
• Cycle time requirements
• Physical size

Design & Design-Verification Limits
• Increasing complexity (# of devices/chip) leads to continual new challenges in:
  – Design organization
    • modularity vs. efficiency
  – Automatic circuit synthesis & layout
    • circuit optimization
  – Design verification
    • layout-vs-schematic
    • logic-level simulation
    • analog (e.g. SPICE) modeling
  – Testing and design-for-testability
    • test coverage

Manufacturing Limits
See the ITRS ‘01 roadmap for these.
• Lithography resolution, tools
• Dopant implantation techniques
• Process changes for new device structures
• Assembly & packaging
• Yield enhancement
• Environmental / safety / health considerations
• Metrology (measurement)
• Product cost & factory cost
“Red brick wall” could be reached as early as 2006! -- ITRS ‘03

Possible Endpoints for Electronics
• Merkle’s minimal “quantum FET”
• Mesoscale nanoelectronic devices based on metal or semiconductor “islands”
  – E.g. Single-electron transistors, quantum dots, resonant tunneling transistors.
• Various organic molecular electronic devices
  – diodes, transistors
• Inorganic atomic-scale devices
  – 1-atom-wide chains of conductor/semiconductor atoms precisely positioned on/in substrates
• Also discuss: Superconducting devices

Energy Limits in Electronics
• Origin of CV²/2 switching energy dissipation
• Thermal reliability bounds on CV² scaling
  – Voltage limits
  – Capacitance limits
• Leakage trends in MOSFETs

Limit on Switching Energy
• Consider temporarily connecting a single unknown bit to ground.
  – Average dissipation is $\tfrac{1}{4}CV^2$.
  – At least T log 2 average dissipation is required to erase a bit by Landauer’s principle.
  – Therefore, $CV^2 \geq 4T\log 2 = 4k_BT\ln 2$.
[Figure: a node holding an unknown bit (0/1?, entropy log 2) is grounded, dissipating CV²/4 on average, and ends at 0 (entropy log 1 = 0)]

Reliability w. Thermal Noise
• Consider N logic nodes, 1 of which is high.
  – Don’t know which: Entropy = log N.
• Then, connect them all to ground temporarily.
  – Want them all to be 0, with high probability.
  – Logical entropy is now 0.
    • Log N entropy must be exported elsewhere.
    • Requires T log N expenditure of energy.
  – But, only ½CV² energy was dissipated!
• So, to reliably do N arbitrary irreversible bit operations requires at least $\tfrac{1}{2}CV^2 \geq T\log N = k_BT\ln N$ energy per logic node.

Illustration of Scenario
[Figure: N nodes, one of them at 1 and the rest at 0 (entropy log N), are all grounded, dissipating CV²/2, and end with entropy 0; hence ½CV² ≥ T log N]

Thermal Capacitance
• What is the minimum entropy generation for a structure of given capacitance C?
  – Consider minimal node voltage $V = (\ln R)\,\phi_T$
    • Needed to get desired on/off ratio of R.
• Let the thermal capacitance $C_T :\equiv q_e/\phi_T$.
  – At room temperature $C_T$ = 6 aF.
• Then we can derive an expression for minimum entropy generation for our structure: S  ½(log N) C/CT
• This implies that $C \geq 2(\ln N)\,C_T$ at minimum V.

Voltage Bounds for Reliability
• Suppose we are stuck with a given C. Then the minimum voltage that we can tolerate is $V \geq \phi_T\sqrt{2\ln N\cdot C_T/C}$
  – One implication: If some nodes have C less than thermal capacitance, then voltages cannot actually approach the thermal voltage.
• Other lower bounds on node voltages:
  – $V \gg \phi_T$ - to switch FETs strongly on & off
  – $V \gg V_T$ - to avoid defects due to threshold variation

In Particular Generations
• Year 2001 technology, aggressive low-power:
  – 9 knats per transistor-switching op
• Year 2012 projection:
  – 2 knats
  – 30x what’s needed for $10^{27}$ reliability (ln N = 60)
    • 1e9 nodes lasting 1e9 seconds at 1e9 hertz w/o error
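
The thermal-reliability bounds above are easy to evaluate numerically. The sketch below is illustrative only: the function names, the 300 K temperature, and the 1 fF example node are assumptions, not values from the slides. It uses the exact SI values of the Boltzmann constant and the elementary charge, and implements ½CV² ≥ k_B·T·ln N together with the thermal capacitance C_T = q/φ_T.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K (exact SI value)
Q_E = 1.602176634e-19   # elementary charge, C (exact SI value)


def thermal_voltage(temp_k):
    """phi_T = k_B * T / q, in volts."""
    return K_B * temp_k / Q_E


def thermal_capacitance(temp_k):
    """C_T = q / phi_T, in farads."""
    return Q_E / thermal_voltage(temp_k)


def min_switching_energy(n_ops, temp_k):
    """Lower bound on (1/2) C V^2 for n_ops reliable bit operations: k_B T ln N."""
    return K_B * temp_k * math.log(n_ops)


def min_node_voltage(cap_f, n_ops, temp_k):
    """V >= phi_T * sqrt(2 ln N * C_T / C) for a node of capacitance cap_f."""
    ratio = thermal_capacitance(temp_k) / cap_f
    return thermal_voltage(temp_k) * math.sqrt(2.0 * math.log(n_ops) * ratio)


ROOM_TEMP_K = 300.0          # assumed room temperature
N_OPS = 1e9 * 1e9 * 1e9      # 1e9 nodes * 1e9 seconds * 1e9 hertz
EXAMPLE_CAP_F = 1e-15        # hypothetical 1 fF node, for illustration only

nats_needed = math.log(N_OPS)      # entropy to export per operation, in nats
margin_2012 = 2000.0 / nats_needed # 2 knats projected vs. what is needed

print(f"thermal voltage: {1e3 * thermal_voltage(ROOM_TEMP_K):.1f} mV")
print(f"thermal capacitance: {1e18 * thermal_capacitance(ROOM_TEMP_K):.1f} aF")
print(f"nats needed for N = 1e27: {nats_needed:.0f}")
print(f"2 knats is {margin_2012:.0f}x the requirement")
print(f"min voltage on a 1 fF node: "
      f"{1e3 * min_node_voltage(EXAMPLE_CAP_F, N_OPS, ROOM_TEMP_K):.1f} mV")
```

At 300 K this gives a thermal capacitance of about 6 aF and a requirement of about 62 nats, close to the slide's ln N = 60, so the projected 2 knats sits roughly 30x above the bound.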