Computer Organization
CS224
Fall 2011
“Welcome to my course”
Will Sawyer
With thanks to M.J. Irwin, D. Patterson, and J. Hennessy for some lecture slide contents
CS224 Fall 2011 Chapter 1
CS224 Course Contents
Overview of computer technologies, instruction set architecture
(ISA), ISA design considerations, RISC vs. CISC, assembly and
machine language, translation and program start-up. Computer
arithmetic, arithmetic logic unit, floating-point numbers and their
arithmetic implementations. Processor design, data path and control
implementation, pipelining, hazards, pipelined processor design,
hazard detection and forwarding, branch prediction and exception
handling. Memory hierarchy, principles, structure, and performance
of caches, virtual memory, segmentation and paging. I/O devices,
I/O performance, interfacing I/O. Intro to multiprocessors, multicores,
and cluster computing.
From the Bilkent University Catalog: “Course Descriptions”
CS224 Fall 2011 Chapter 1
CS224 Policies
Everything is on the Web site:
found @ CS Dept > Course Home Pages > CS224
http://www.cs.bilkent.edu.tr/~will/courses/CS224/
Numerical average will be calculated from:
• 4-5 homeworks: 10%
• X pop quizzes: 10%
• 2 projects: 30%
• Midterm: 20%
• Final exam: 30%
TO PASS, you must:
• have exam average >= 35% (weighted average)
• have overall course performance that is passing
CS224 Fall 2011 Chapter 1
CS224 Introduction
• This course is all about how computers work
• But what do we mean by a computer?
– Different types: desktop, servers, embedded devices
– Different uses: automobiles, graphics, finance, genomics…
– Different manufacturers: Intel, Apple, IBM, Microsoft, Sun…
– Different underlying technologies and different costs
• Best way to learn:
– Focus on a specific instance and learn how it works
– While learning general principles and historical perspectives
CS224 Fall 2011 Chapter 1
Why learn this stuff?
• You want to call yourself a “computer engineer”
• You want to build software people use (need performance)
• You need to make a purchasing decision or offer “expert” advice
• Both Hardware and Software affect performance:
– Algorithm determines number of source-level statements
– Language/Compiler/Architecture determine number of machine
instructions (Chapters 2 and 3)
– Processor/Memory determine how fast instructions are executed
(Chapters 4 and 5)
– I/O and Number_of_Cores determine overall system performance
(Chapters 6 and 7)
CS224 Fall 2011 Chapter 1
Organization of a Computer
• Five classic components of a computer – input, output, memory,
datapath, and control
• datapath + control = processor
CS224 Fall 2011 Chapter 1
What is a computer?
• Components:
– input (mouse, keyboard, camera, microphone...)
– output (display, printer, speakers....)
– memory (caches, DRAM, SRAM, hard disk drives, Flash....)
– network (both input and output)
• Our primary focus: the processor (datapath and control)
– implemented using billions of transistors
– Impossible to understand by looking at each transistor
– We need...abstraction!
An abstraction omits unneeded detail and helps us cope with complexity.
CS224 Fall 2011 Chapter 1
How do computers work?
• Each of the following abstracts everything below it:
– Applications software
– Systems software
– Assembly Language
– Machine Language
– Architectural Approaches: Caches, Virtual Memory, Pipelining
– Sequential logic, finite state machines
– Combinational logic, arithmetic circuits
– Boolean logic, 1s and 0s
– Transistors used to build logic gates (e.g. CMOS)
– Semiconductors/Silicon used to build transistors
– Properties of atoms, electrons, and quantum dynamics
• Notice how abstraction hides the detail of lower levels, yet gives a
useful view for a given purpose
CS224 Fall 2011 Chapter 1
Computer Architecture
[Figure: the layers of a computer system]
Application
Operating System
Compiler
Firmware
Instruction Set Architecture
Instr. Set Proc. / I/O system (the implementation)
Logic Design
Circuit Design
Layout
In the figure, "Computer Architecture" spans the instruction set architecture and its implementation.
CS224 Fall 2011 Chapter 1
Instruction Set Architecture
• A very important abstraction
– interface between hardware and low-level software
– standardizes instructions, machine language bit patterns, etc.
– advantage: different implementations of the same architecture
– disadvantage: sometimes prevents using new innovations
• Common instruction set architectures:
– IA-64, IA-32, PowerPC, MIPS, SPARC, ARM, and others
– All are multi-sourced, with different implementations for the same
ISA
CS224 Fall 2011 Chapter 1
Instruction Set Architecture (ISA)
• ISA, or simply architecture: the abstract interface between hardware and the
lowest level of software that encompasses all the information necessary to
write a machine language program, including instructions, registers,
memory access, I/O, …
• The ISA includes:
– Organization of storage
– Data types
– Encoding and representing instructions
– Instruction set (i.e. opcodes)
– Modes of addressing data items/instructions
– Program-visible exception handling
• The ISA together with the OS interface specifies the requirements for binary
compatibility across implementations (ABI: application binary interface)
CS224 Fall 2011 Chapter 1
Case Study: MIPS ISA
• Instruction Categories
– Load/Store
– Computational
– Jump and Branch
– Floating Point
– Memory Management
– Special
• Registers: R0 - R31, PC, HI, LO
• 3 Instruction Formats, 32 bits wide:
– R-format: OP | rs | rt | rd | sa | funct
– I-format: OP | rs | rt | immediate
– J-format: OP | jump target
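To make the R-format concrete, here is a minimal C sketch (not from the slides; the helper name encode_r is made up for illustration) that packs the six R-format fields into a 32-bit word, using add $t0, $s1, $s2 as the example:

#include <stdio.h>
#include <stdint.h>

/* Pack the six R-format fields of a MIPS instruction into one 32-bit word.
   Field widths follow the standard MIPS layout:
   op(6) rs(5) rt(5) rd(5) sa(5) funct(6). */
static uint32_t encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                         uint32_t rd, uint32_t sa, uint32_t funct)
{
    return (op << 26) | (rs << 21) | (rt << 16) |
           (rd << 11) | (sa << 6)  | funct;
}

int main(void)
{
    /* add $t0, $s1, $s2  ->  op = 0, rs = $s1 (17), rt = $s2 (18),
       rd = $t0 (8), sa = 0, funct = 0x20 (add) */
    uint32_t word = encode_r(0, 17, 18, 8, 0, 0x20);
    printf("add $t0, $s1, $s2 encodes as 0x%08x\n", (unsigned) word);  /* 0x02324020 */
    return 0;
}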
CS224 Fall 2011 Chapter 1
Function Units in a Computer [figure]
Classes of Computers
• Desktop computers: Designed to deliver good performance to a
single user at low cost, usually running third-party software and
usually incorporating a graphics display, a keyboard, and a mouse
• Servers: Used to run larger programs for multiple, simultaneous
users, typically accessed only via a network; they place a greater
emphasis on dependability and (often) security
• Supercomputers: A high performance, high cost class of servers
with hundreds to thousands of processors, terabytes of memory
and petabytes of storage that are used for high-end scientific and
engineering applications
• Embedded computers (processors): A computer inside another
device, used for running one predetermined application
CS224 Fall 2011 Chapter 1
Embedded Processor Characteristics
The largest class of computers, spanning the widest range of
applications and performance
• Often have minimum performance requirements.
• Often have stringent limitations on cost.
• Often have stringent limitations on power consumption.
• Often have low tolerance for failure.
In all these ways, embedded processors are very different from
supercomputers, servers, or desktops/laptops.
CS224 Fall 2011 Chapter 1
Below the Program
(Layers: Applications software, Systems software, Hardware)
• System software
– Operating system – supervising program that interfaces the user’s
program with the hardware (e.g., Linux, MacOS, Windows)
• Handles basic input and output operations
• Allocates storage and memory
• Provides for protected sharing among multiple applications
– Compiler – translates programs written in a high-level language (e.g., C,
Java) into instructions that the hardware can execute
CS224 Fall 2011 Chapter 1
Advantages of HLLs
• Higher-level languages (HLLs):
– Allow the programmer to think in a more natural language, one tailored
to the intended use (Fortran for scientific computation, Cobol for
business programming, Lisp for symbol manipulation, Java for web
programming, …)
– Improve programmer productivity – more understandable code that is
easier to debug and validate
– Improve program maintainability
– Allow programs to be independent of the computer on which they are
developed (compilers and assemblers can translate high-level language
programs to the binary instructions of any machine)
– Enable optimizing compilers that produce very efficient assembly code
tuned for the target machine
• As a result, very little programming is done today at the assembler level.
CS224 Fall 2011 Chapter 1
Compiler Basics
• High-level languages
– Programmers do not think in 0s and 1s
• Languages can also be specific to target applications, such
as Cobol (business) or Fortran (scientific)
– Applications are more concise ⇒ fewer bugs
– Programs can be independent of the system on which they are
developed
• Compilers convert source code to object code
• Libraries simplify common tasks
CS224 Fall 2011 Chapter 1
Levels of Representation
High Level Language Program:
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
        ↓ Compiler
Assembly Language Program:
    lw   $15, 0($2)
    lw   $16, 4($2)
    sw   $16, 0($2)
    sw   $15, 4($2)
        ↓ Assembler
Machine Language Program:
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
        ↓ Machine Interpretation
Control Signal Specification:
    ALUOP[0:3] <= InstReg[9:11] & MASK
    [i.e. high/low on control lines]
CS224 Fall 2011 Chapter 1
Execution Cycle
• Instruction Fetch – obtain instruction from program storage
• Instruction Decode – determine required actions and instruction size
• Operand Fetch – locate and obtain operand data
• Execute – compute result value or status
• Result Store – deposit results in storage for later use
• Next Instruction – determine successor instruction
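One way to picture this cycle is as the main loop of an instruction-set simulator. Below is a minimal C sketch of such a loop for a made-up one-byte toy ISA (the opcodes, encoding, and register count are invented for illustration; this is not MIPS):

#include <stdio.h>
#include <stdint.h>

/* Toy ISA: one byte per instruction, opcode in the high nibble,
   a single register number in the low nibble. */
enum { OP_HALT = 0x0, OP_INC = 0x1, OP_PRINT = 0x2 };

int main(void)
{
    uint8_t  mem[] = { 0x11, 0x11, 0x21, 0x00 };  /* inc r1; inc r1; print r1; halt */
    uint32_t reg[16] = { 0 };
    uint32_t pc = 0;

    for (;;) {
        uint8_t inst = mem[pc];       /* Instruction Fetch               */
        uint8_t op   = inst >> 4;     /* Instruction Decode              */
        uint8_t r    = inst & 0x0F;   /* Operand Fetch (register number) */

        if (op == OP_HALT) break;     /* Execute and Result Store        */
        else if (op == OP_INC)   reg[r] += 1;
        else if (op == OP_PRINT) printf("r%u = %u\n", (unsigned) r, (unsigned) reg[r]);

        pc += 1;                      /* Next Instruction                */
    }
    return 0;
}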
CS224 Fall 2011 Chapter 1
Communication
• The Information Age
• The Internet’s changes to communication are unlike any past
medium (printing press, radio, television)
– Estimated 400 million users in 2005
– An astounding 1.2 billion wireless users!!
[Chart: Residential Internet Subscribers (millions), 2000-2005:
approximately 145, 160, 180, 205, 240, 270. Source: Ovum]
CS224 Fall 2011 Chapter 1
Moore’s Law
• In 1965, Intel’s Gordon Moore predicted that the number of
transistors that can be integrated on a single chip would
double about every two years
[Figure: transistor counts per chip over time; Dual Core Itanium with
1.7B transistors; labels: feature size & die size. Courtesy, Intel ®]
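As a back-of-the-envelope illustration of that doubling rate, here is a small C sketch (not from the slides) that projects transistor counts forward from roughly 1.7 billion in 2006, the Itanium figure quoted above:

#include <stdio.h>

int main(void)
{
    /* Assumed starting point: ~1.7 billion transistors in 2006 (the dual-core
       Itanium above); Moore's prediction: double about every two years. */
    double count = 1.7e9;
    for (int year = 2006; year <= 2012; year += 2) {
        printf("%d: ~%.1f billion transistors\n", year, count / 1e9);
        count *= 2.0;
    }
    return 0;
}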
Moore’s Law for CPUs and DRAMs
From: “Facing the Hot Chips Challenge Again”, Bill Holt, Intel, presented at Hot Chips 17, 2005.
CS224 Fall 2011 Chapter 1
Technology Scaling Road Map
Year                  2004   2006   2008   2010   2012
Feature size (nm)       90     65     45     32     22
Intg. Capacity (BT)      2      4      6     16     32
• Fun facts about 45nm transistors
– 30 million can fit on the head of a pin
– You could fit more than 2,000 across the width of a human
hair
– If car prices had fallen at the same rate as the price of a
single transistor has since 1968, a new car today would cost
about 1 cent
Semiconductors
• 50-year-old industry
– Still has continuous improvements
– New generation every 2-3 years
• 30% reduction in dimension ⇒ 50% reduction in area
(scaling each linear dimension to 0.7x gives 0.7^2 ≈ 0.49, about half the area)
• 30% reduction in delay ⇒ 50% speed increase
• Current generation: reduces cost and increases performance
– Processors are fabricated on ingots cut into wafers, which
are then etched to create transistors
– Wafers are then diced to form chips, some of which have defects
– Yield is the fraction of good chips
• Next generation: larger, with more functions
– Each generation is an incremental improvement
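To see why yield matters for cost, here is a small C sketch using the standard first-order relationship cost per good die = wafer cost / (dies per wafer x yield); the formula is textbook background rather than something stated on this slide, and all of the numbers are made up for illustration:

#include <stdio.h>

int main(void)
{
    /* All values below are invented for illustration. */
    double wafer_cost     = 5000.0;  /* dollars per processed wafer      */
    double dies_per_wafer = 200.0;   /* candidate dies cut from a wafer  */
    double yield          = 0.80;    /* fraction of those dies that work */

    double cost_per_good_die = wafer_cost / (dies_per_wafer * yield);
    printf("Cost per good die: $%.2f\n", cost_per_good_die);  /* $31.25 */
    return 0;
}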
CS224 Fall 2011 Chapter 1
Semiconductor Manufacturing Process
for Silicon ICs
CS224 Fall 2011 Chapter 1
Main driver: device scaling ...
From: “Facing the Hot Chips Challenge Again”, Bill Holt, Intel, presented at Hot Chips 17, 2005.
CS224 Fall 2011 Chapter 1
Hitting the Power Wall
“For the P6, success criteria included performance above a certain
level and failure criteria included power dissipation above some
threshold.”
Bob Colwell, Pentium Chronicles
CS224 Fall 2011 Chapter 1
Processor performance growth flattens!
CS224 Fall 2011 Chapter 1
The Latest Revolution: Multicores
The power challenge has forced a change in the design of
microprocessors
--Since 2002 the rate of improvement in the response time of programs
on desktop computers has slowed from a factor of 1.5 per year to less
than a factor of 1.2 per year
--In 2011 all desktop and server companies are shipping
microprocessors with multiple processors – cores – per chip
Product          Cores per chip   Clock rate    Power
AMD Barcelona          4            2.5 GHz     120 W
Intel Nehalem          4           ~2.5 GHz?   ~100 W?
IBM Power 6            2            4.7 GHz    ~100 W?
Sun Niagara 2          8            1.4 GHz      94 W
• The plan is to double the number of cores per chip per
generation (about every two years)
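To see how much the growth rates quoted earlier on this slide differ, here is a tiny C sketch (not from the slides) that compounds the two per-year improvement factors over ten years:

#include <stdio.h>

int main(void)
{
    /* Pre-2002 trend: ~1.5x per year; post-2002 trend: <1.2x per year. */
    double fast = 1.0, slow = 1.0;
    for (int year = 0; year < 10; year++) {
        fast *= 1.5;
        slow *= 1.2;
    }
    printf("10 years at 1.5x/year: ~%.0fx faster\n", fast);  /* ~58x */
    printf("10 years at 1.2x/year: ~%.0fx faster\n", slow);  /* ~6x  */
    return 0;
}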
CS224 Fall 2011 Chapter 1
Workloads and Benchmarks
• Benchmarks – a set of programs that form a “workload” specifically
chosen to measure performance. With standard inputs, the
benchmarks are run and execution time is measured.
• SPEC (System Performance Evaluation Cooperative) creates
standard sets of benchmarks starting with SPEC89. The latest is
SPEC CPU2006 which consists of 12 integer benchmarks
(CINT2006) and 17 floating-point benchmarks (CFP2006).
www.spec.org
• There are also benchmark collections for power workloads
(SPECpower_ssj2008), for email workloads (SPECmail2008), for
multimedia workloads (mediabench), …
CS224 Fall 2011 Chapter 1
2002 SPEC Benchmarks
Integer benchmarks                      FP benchmarks
gzip     Compression                    wupwise   Quantum chromodynamics
vpr      FPGA place & route             swim      Shallow water model
gcc      GNU C compiler                 mgrid     Multigrid solver in 3D fields
mcf      Combinatorial optimization     applu     Parabolic/elliptic PDE
crafty   Chess program                  mesa      3D graphics library
parser   Word processing program        galgel    Computational fluid dynamics
eon      Computer visualization         art       Image recognition (NN)
perlbmk  perl application               equake    Seismic wave propagation simulation
gap      Group theory interpreter       facerec   Facial image recognition
vortex   Object oriented database       ammp      Computational chemistry
bzip2    Compression                    lucas     Primality testing
twolf    Circuit place & route          fma3d     Crash simulation (FEM)
                                        sixtrack  Nuclear physics accelerator
                                        apsi      Pollutant distribution
SPEC CINT2006 on Barcelona (2.5 GHz)
Name          IC x 10^9    CPI    ExTime (sec)   RefTime (sec)   SPECratio
perl             2,118     0.75        637           9,770          15.3
bzip2            2,389     0.85        817           9,650          11.8
gcc              1,050     1.72        724           8,050          11.1
mcf                336    10.00      1,345           9,120           6.8
go               1,658     1.09        721          10,490          14.6
hmmer            2,783     0.80        890           9,330          10.5
sjeng            2,176     0.96        837          12,100          14.5
libquantum       1,623     1.61      1,047          20,720          19.8
h264avc          3,102     0.80        993          22,130          22.3
omnetpp            587     2.94        690           6,250           9.1
astar            1,082     1.79        773           7,020           9.1
xalancbmk        1,058     2.70      1,143           6,900           6.0
Geometric Mean                                                       11.7
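The columns fit together as follows: ExTime is consistent with IC x CPI / clock rate (for perl, 2,118 x 10^9 x 0.75 / 2.5 GHz ≈ 635 s, close to the 637 s shown), each SPECratio is RefTime / ExTime (9,770 / 637 ≈ 15.3), and the summary number is the geometric mean of the twelve ratios. A small C sketch of that last step, with the ratios copied from the table above:

#include <stdio.h>
#include <math.h>

int main(void)
{
    /* SPECratios (RefTime / ExTime) from the table above. */
    double ratio[] = { 15.3, 11.8, 11.1, 6.8, 14.6, 10.5,
                       14.5, 19.8, 22.3, 9.1, 9.1, 6.0 };
    int n = sizeof ratio / sizeof ratio[0];

    /* Geometric mean = (r1 * r2 * ... * rn)^(1/n), computed via logarithms. */
    double log_sum = 0.0;
    for (int i = 0; i < n; i++)
        log_sum += log(ratio[i]);

    printf("Geometric mean = %.1f\n", exp(log_sum / n));  /* ~11.7 */
    return 0;
}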
CS224 Fall 2011 Chapter 1
CS224: Course Content
Computer Architecture and Engineering
• Instruction Set Design – Interfaces – Compiler/System View – “Building Architect”
• Computer Organization – Hardware Components – Logic Designer’s View – “Construction Engineer”
CS224 Fall 2011 Chapter 1
CS224: So what's in it for me?
• In-depth understanding of the inner workings of modern computers,
their evolution, and trade-offs present at the hardware/software
boundary.
– Insight into fast/slow operations that are easy/hard to implement
in hardware
• Experience with the design process in the context of a large complex
(hardware) design.
– Functional Spec --> Control & Datapath --> Simulation -->
Physical implementation
– Modern CAD tools
• Designer's "Conceptual" toolbox
CS224 Fall 2011 Chapter 1
Conceptual tool box
• Evaluation Techniques
• Levels of Translation (e.g. Compilation, Assembly)
• Hierarchy (e.g. registers, cache, memory, disk)
• Pipelining and Parallelism
• Static / Dynamic Scheduling
• Indirection and Address Translation
• Timing, Clocking, and Latching
• CAD Programs, Hardware Description Languages, Simulation
• Physical Building Blocks (e.g. CLA)
• Understanding Technology Trends
CS224 Fall 2011 Chapter 1