ENGS 116 Lecture 1

ENGS 116 / COSC 107 Computer Architecture
Introduction
Vincent H. Berk
September 24th, 2008
Reading for Friday: Chapter 1.1 – 1.4, Amdahl article
Reading for Monday: 1.5 – 1.11

Prerequisite Knowledge
• Assembly language programming
• Fundamentals of logic design
  Combinational and sequential components (e.g., gates, multiplexers, decoders, ROMs, flip-flops, registers, RAMs)
• Processor Design
  Instruction cycle, pipelining, branch prediction, exceptions
• Memory Hierarchy
  Caches (direct-mapped, fully-associative, 2-way set associative), spatial locality, temporal locality, virtual memory, translation lookaside buffer (TLB)
• Input and Output
  Polling, interrupts
• Multiprocessors

What is Computer Architecture?
Two viewpoints:
• Hardware designer’s viewpoint: CPUs, caches, buses, pipelines, physical memory, etc.
• Programmer’s viewpoint: instruction set – opcodes, addressing modes, registers, virtual memory, etc.
Study of architecture covers both instruction-set architectures and machine implementation organizations.

Computer Architecture Is ...
The attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation.
Amdahl, Blaauw, and Brooks, 1964

Computer Architecture’s Changing Definition
• 1950s to 1960s: Computer Architecture Course = Computer Arithmetic.
• 1970s to 1980s: Computer Architecture Course = Instruction Set Design, especially ISA appropriate for compilers.
• 1990s to 2000s: Computer Architecture Course = Design of CPU, memory system, I/O
• 2000 to now: Computer Architecture Course = ILP, DLP, TLP, storage

5 Generations of Electronic Computers (Hwang)
First (1945-54)
  Technology and architecture: Vacuum tubes and relay memories, CPU driven by PC and accumulator, fixed-point arithmetic.
  Software and applications: Machine/assembly languages, single user, no subroutine linkage, programmed I/O using CPU.
  Representative systems: ENIAC, Princeton IAS, IBM 701.
Second (1955-64)
  Technology and architecture: Discrete transistors and core memories, floating-point arithmetic, I/O processors, multiplexed memory access.
  Software and applications: HLL used with compilers, subroutine libraries, batch processing monitor.
  Representative systems: IBM 7030, CDC 1604, Univac LARC.
Third (1965-74)
  Technology and architecture: Integrated circuits (SSI/MSI), microprogramming, pipelining, cache and lookahead.
  Software and applications: Multiprogramming and timesharing OS, multiuser applications.
  Representative systems: IBM 360/370, CDC 6600, TI-ASC, PDP-8.
Fourth (1975-90)
  Technology and architecture: LSI/VLSI and semiconductor memory, multiprocessors, vector supercomputers, multicomputers.
  Software and applications: Multiprocessor OS, languages, compilers, and environments for parallel processing.
  Representative systems: VAX 9000, Cray X/MP, IBM 3090, BBN TC2000.
Fifth (1991-present)
  Technology and architecture: ULSI/VHSIC processors, memory, and switches, high-density packaging, scalable architectures.
  Software and applications: Massively parallel processing, grand challenge applications, heterogeneous processing.
  Representative systems: IBM/MPP, Cray/MPP, TMC/CM-5, Intel Paragon.

Computer Tasks
• Desktop Computing, Lightweight Servers, Laptops
  Price-performance (low cost)
  Communication, Graphics
• Server Computing, Mainframe Systems
  Specific performance, processing power, storage
  Availability, Reliability
• Embedded Computers and DSPs
  Power and Memory requirements
  Lowest cost for required performance
  Real-time or soft-real-time performance

Task of Computer Designer
• Determine which attributes are important for a new machine.
• Design a machine to meet functional requirements, price, power and performance goals.

Basic Computer Organization
[Figure: Processor (Control, Datapath), Memory, Input, Output]

Computer Architecture Topics
[Figure: topic map with the following labels]
• Input/Output and Storage: Disks, WORM, Tape; RAID
• Emerging Technologies; Interleaving; Bus protocols
• Memory Hierarchy: SDRAM, L2/L3 Cache, L1 Cache; Coherence, Bandwidth, Latency
• VLSI
• Multi-Core
• Instruction Set Architecture: Addressing, Protection, Exception Handling
• Pipelining, Instruction-Level Parallelism, Thread-Level Parallelism: Pipelining, Hazard Resolution, Superscalar, Reordering, Prediction, Speculation

Computer Architecture Topics (continued)
[Figure: processor-memory (P, M) nodes and a switch (S) attached to an Interconnection Network; labels as follows]
• Processor-Memory-Switch
• Multiprocessors, Networks and Interconnections
• Shared Memory, Message Passing, Data Parallelism
• Network Interfaces
• Topologies, Routing, Bandwidth, Latency, Reliability

Course Focus
Understanding the design techniques, machine structures, technology factors, and evaluation methods that will determine the form of computers in the 21st Century
[Figure: Computer Architecture (Instruction Set Design, Organization, Hardware) surrounded by Technology, Parallelism, Programming Languages, Applications, Operating Systems, Measurement & Evaluation, Interface Design (ISA), History]

Technology Trends
• Integrated circuit logic technology
  transistor density (feature size)
  transistor count
  cycle speed
  multiple cores
• Semiconductor DRAM
  density
  latency and bandwidth
• Magnetic disk technology
  density
  access time
• Network technology
  bandwidth
  latency

Scaling in ICs
• Feature size: minimum size of a single distinguishable/producible item on a chip die
  1971 – 10 microns
  2001 – 0.18 microns
  2003 – 0.06 microns
  2006 – 5 nanometers (0.005 microns)
• Complex relationships:
  Transistor density increases quadratically with decrease in feature size
  Reduction in feature size requires voltage reduction to maintain correct operation and reasonable reliability
• Scaling IC wiring:
  Signal delay increases with the product of resistance and capacitance
  Shorter wires can be smaller
  Smaller features have higher current leakage

Power Consumption of ICs
• Power requirements per transistor are proportional to load capacitance, frequency of switching, and the square of the voltage.
  Power = ½ × Capacitance × Voltage² × Frequency switched
• Switching frequency and density of transistors increase faster than capacitance and voltage decrease, leading to increased power consumption (= generated heat)
• The Pentium 4 consumes 135 Watts of power, while the 8086 through i386 did not even feature a heat sink

Cost and Price
• Cost of manufacturing decreases over time: learning curve
• Learning curve is measured as an increase in yield
• Volume doubling leads to 10% reduction in cost
• Commodity products tend to decrease cost:
  Volume
  Competition
  Efficiency

[Figure: Difference between Cost and Price]

Wafers and Dies
• Chips are produced on round silicon disks
• Dies are the actual chips, cut out from the wafer
• Testing occurs before cutting and after packaging

Yield and Cost
• However:
  Wafers do not contain only chip dies; usually a large area, including several chip dies, is dedicated to test-equipment hook-up
  Actual yield in mass-production chip fabs varies from 98% for DRAMs to 1% for new processors

Yield and Cost (continued)
• Switch from 200mm to 300mm wafers:
  Although 300mm wafers have lower yield than 200mm wafers, the overhead processing costs per wafer are high enough to make 300mm wafers more cost effective.
• Redundancy in dies:
  Single transistors do fail during production, causing memory cells, pipeline stages, or control logic sections to fail
  Redundancy is built into each die by introducing backup units
  After testing, failed units can be disabled by laser and backup units enabled
  This decreases the chance that a small flaw fails an entire die
  Few companies give insight into their redundant circuitry numbers

Performance
Hwang: “The ideal performance of a computer system demands a perfect match between machine capability and program behavior.”
Machine capability – enhanced with better hardware technology, innovative architectural features, efficient resource management.
Program behavior – affected by algorithm design, data structures, language efficiency, programmer skill, compiler technology.
To improve software performance, we need to understand how various hardware factors affect overall system performance!

Measuring Performance
• Key measure is time.
• Response time (execution time): time between start and completion of a task.
• Throughput: total amount of work completed in a given time.
Seconds / Program = (Instr count / Program) × (Clock cycles / Instr count) × (Seconds / Clock cycle)

Comparing Design Alternatives
“X is n times faster than Y” means
Execution time of Y / Execution time of X = n

Benchmarking
• Real programs; e.g., compilers, photo editing
• Modified or scripted real programs; e.g., compression algorithms
• Kernels – small, key pieces from real programs; e.g., Livermore Loops, Linpack.
• Toy benchmarks – typically 10 to 100 lines of code, useful primarily for intro programming assignments; e.g., quicksort, prime numbers, encryption
• Synthetic benchmarks – try to match average frequency of operations and operands for a set of programs; e.g., Whetstone, Dhrystone.
• Benchmark suites – collections of programs; e.g., SPEC CPU2000
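
Worked example: the execution-time relation from the Measuring Performance slide, combined with the "n times faster" comparison, can be sketched in a few lines of Python. This is only an illustration. The function names and the two machines (instruction count, CPI, clock rate) are hypothetical and not from the lecture, and the comparison assumes the usual convention that n is the execution time of Y divided by the execution time of X.

```python
def cpu_time(instr_count: float, cpi: float, clock_rate_hz: float) -> float:
    """Seconds per program = instructions x (cycles / instruction) x (seconds / cycle).

    Dividing by the clock rate is the same as multiplying by seconds per cycle.
    """
    return instr_count * cpi / clock_rate_hz


def times_faster(time_x: float, time_y: float) -> float:
    """Return n such that 'X is n times faster than Y' (assumed: n = time_Y / time_X)."""
    return time_y / time_x


# Hypothetical machines running the same 2-billion-instruction program.
time_x = cpu_time(instr_count=2e9, cpi=1.5, clock_rate_hz=3e9)  # 3 GHz, CPI 1.5
time_y = cpu_time(instr_count=2e9, cpi=2.0, clock_rate_hz=2e9)  # 2 GHz, CPI 2.0
n = times_faster(time_x, time_y)

print(f"X: {time_x:.2f} s, Y: {time_y:.2f} s, X is {n:.2f} times faster than Y")
```

With these made-up numbers, X needs about 1 second and Y about 2 seconds, so X is about 2 times faster than Y. Note that X wins on both CPI and clock rate here; in a real comparison a higher clock rate can be offset by a worse CPI or a larger instruction count, which is why all three factors have to be considered together.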