Central Processing Unit Architecture
Semester A, 5763 (36113741)
March 2007
Hugo Guterman ([email protected])
Web site: http://www.ee.bgu.ac.il/~cpuarch
Arch. CPU L3 ISA. 1
Guterman March 2007 ©BGU
What is “Computer Architecture”
Computer Architecture =
Instruction Set Architecture +
Machine Organization
Outline
° ISA and Assembly Language
° Instruction Set Definition (MIPS)
° Registers and Memory
° Arithmetic Instructions
° Load/store Instructions
° Instruction Formats
° DLX Architecture and ISA
Instruction Set Architecture (ISA)
Software Layers
Levels of Representation (Intr. to Comp. Review)
High Level Language Program
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  Compiler
Assembly Language Program
    lw $15, 0($2)
    lw $16, 4($2)
    sw $16, 0($2)
    sw $15, 4($2)
  Assembler
Machine Language Program
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  Machine Interpretation
Control Signal Specification
    ALUOP[0:3] <= InstReg[9:11] & MASK
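The four-instruction assembly sequence above can be traced with a minimal Python sketch. It assumes that register $2 holds the byte address of v[k] and that words are 4 bytes wide. The function name `run_swap`, the dictionary-based memory, and the sample addresses are illustrative choices, not part of the slides.

```python
def run_swap(memory, base):
    """Trace the swap sequence, assuming $2 = base (byte address of v[k])."""
    regs = {2: base, 15: 0, 16: 0}
    regs[15] = memory[regs[2] + 0]   # lw $15, 0($2)   ; $15 <- v[k]
    regs[16] = memory[regs[2] + 4]   # lw $16, 4($2)   ; $16 <- v[k+1]
    memory[regs[2] + 0] = regs[16]   # sw $16, 0($2)   ; v[k]   <- old v[k+1]
    memory[regs[2] + 4] = regs[15]   # sw $15, 4($2)   ; v[k+1] <- old v[k]
    return memory


# Hypothetical memory image: v[k] at address 100, v[k+1] at address 104.
mem = {100: 7, 104: 9}
run_swap(mem, 100)
print(mem)  # {100: 9, 104: 7}
```

Note that the compiled code needs no explicit `temp` location in memory: the two registers play that role.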
Basic ISA Classes
° Memory to Memory Machines
• But we need storage for temporaries
• Memory is slow
• Memory is big (lots of address bits)
° Architectural Registers
• registers can hold temporary variables
• registers are faster than memory
• memory traffic is reduced, so program is sped up (since registers are faster than memory)
• code density improves (since a register is named with fewer bits than a memory location)
Basic ISA Classes (cont.)
Accumulator:
• 1 address      add A          acc ← acc + mem[A]
• 1+x address    addx A         acc ← acc + mem[A + x]
General Purpose Register File (Register-Memory):
• 2 address      add A B        EA(A) ← EA(A) + EA(B)
• 3 address      add A B C      EA(A) ← EA(B) + EA(C)
General Purpose Register File (Load/Store):
• 3 address      add Ra Rb Rc   Ra ← Rb + Rc
•                load Ra Rb     Ra ← mem[Rb]
•                store Ra Rb    mem[Rb] ← Ra
Stack (not a register file but an operand stack):
• 0 address      add            tos ← tos + next
Comparison:
• Bytes per instruction? Number of instructions? Cycles per instruction?
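To make the "number of instructions" question concrete, here is a small Python sketch that evaluates C = A + B on three of the classes above and counts instructions. Only the `add` forms come from the slide. The `load`/`store` mnemonics for the accumulator machine, `push`/`pop` for the stack machine, and direct symbolic addresses in the load/store machine (the slide uses a register-held address) are simplifying assumptions.

```python
def run(machine, program, mem):
    """Interpret `program` on one machine class; return the instruction count."""
    acc, stack, regs = 0, [], {}
    for op, *args in program:
        if machine == "accumulator":
            if op == "load":
                acc = mem[args[0]]
            elif op == "add":
                acc = acc + mem[args[0]]                  # acc <- acc + mem[A]
            elif op == "store":
                mem[args[0]] = acc
        elif machine == "stack":
            if op == "push":
                stack.append(mem[args[0]])
            elif op == "add":
                stack.append(stack.pop() + stack.pop())   # tos <- tos + next
            elif op == "pop":
                mem[args[0]] = stack.pop()
        elif machine == "load/store":
            if op == "load":
                regs[args[0]] = mem[args[1]]              # Ra <- mem[address]
            elif op == "add":
                regs[args[0]] = regs[args[1]] + regs[args[2]]  # Ra <- Rb + Rc
            elif op == "store":
                mem[args[1]] = regs[args[0]]              # mem[address] <- Ra
    return len(program)


# Hypothetical instruction sequences for C = A + B.
PROGRAMS = {
    "accumulator": [("load", "A"), ("add", "B"), ("store", "C")],
    "stack": [("push", "A"), ("push", "B"), ("add",), ("pop", "C")],
    "load/store": [("load", "R1", "A"), ("load", "R2", "B"),
                   ("add", "R3", "R1", "R2"), ("store", "R3", "C")],
}


def compare():
    """Run each program on fresh memory and collect instruction counts."""
    counts = {}
    for machine, program in PROGRAMS.items():
        mem = {"A": 2, "B": 3, "C": 0}
        counts[machine] = run(machine, program, mem)
        assert mem["C"] == 5   # every machine must compute the same result
    return counts


print(compare())  # {'accumulator': 3, 'stack': 4, 'load/store': 4}
```

Instruction count alone does not decide the comparison: the load/store sequence uses more instructions, but each one is simpler to encode and execute, which is why the slide also asks about bytes and cycles per instruction.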
Comparing Number of Instructions
Generic Examples of Instruction Format Widths
[Figure: instruction layouts for Variable, Fixed, and Hybrid format widths]
Top 10 80x86 Instructions
Rank  Instruction             Integer average (% of total executed)
1     load                    22%
2     conditional branch      20%
3     compare                 16%
4     store                   12%
5     add                     8%
6     and                     6%
7     sub                     5%
8     move register-register  4%
9     call                    1%
10    return                  1%
      Total                   96%
° Simple instructions dominate instruction frequency
Typical Operations (little change since 1960)
Data Movement           Load (from memory)
                        Store (to memory)
                        memory-to-memory move
                        register-to-register move
                        input (from I/O device)
                        output (to I/O device)
                        push, pop (to/from stack)
Arithmetic              integer (binary + decimal) or FP
                        Add, Subtract, Multiply, Divide
Shift                   shift left/right, rotate left/right
Logical                 not, and, or, set, clear
Control (Jump/Branch)   unconditional, conditional
Subroutine Linkage      call, return
Interrupt               trap, return
Synchronization         test & set (atomic r-m-w)
String                  search, translate
Graphics (MMX)          parallel subword ops (4 16-bit add)
Compilers and Instruction Set Architectures
• Ease of compilation
° orthogonality: no special registers, few special cases, all operand modes available with any data type or instruction type
° completeness: support for a wide range of operations and target applications
° regularity: no overloading for the meanings of instruction fields
° streamlined: resource needs easily determined
• Register Assignment is critical too
° Easier if lots of registers
Addressing Mode Usage? (ignore register mode)
3 programs measured on machine with all address modes (VAX)
--- Displacement:                  42% avg, 32% to 55%
--- Immediate:                     33% avg, 17% to 43%
--- Register deferred (indirect):  13% avg, 3% to 24%
--- Scaled:                        7% avg, 0% to 16%
--- Memory indirect:               3% avg, 1% to 6%
--- Misc:                          2% avg, 0% to 3%
75% displacement & immediate
88% displacement, immediate & register indirect
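The 75% and 88% figures are running totals of the averages listed above. A short Python check, using only the slide's numbers (the names `USAGE` and `cumulative` are illustrative):

```python
# Average usage per addressing mode, in percent, as listed on the slide.
USAGE = [
    ("displacement", 42),
    ("immediate", 33),
    ("register deferred", 13),
    ("scaled", 7),
    ("memory indirect", 3),
    ("misc", 2),
]


def cumulative(usage):
    """Return (mode, running total) pairs in the order given."""
    total, out = 0, []
    for mode, pct in usage:
        total += pct
        out.append((mode, total))
    return out


for mode, total in cumulative(USAGE):
    print(f"up to and including {mode}: {total}%")
```

The first two modes cover 42 + 33 = 75% of uses, and adding register deferred gives 88%, which is the case for supporting only those few modes in a simple ISA.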
Instruction Format
• If there are many memory operands per instruction and many addressing modes,
=> use an address specifier per operand
• If it is a load-store machine with 1 address per instruction and one or two addressing modes,
=> encode the addressing mode in the opcode
MIPS R3000 Instruction Set Architecture (Summary)
° Instruction Categories
• Load/Store
• Computational
• Jump and Branch
• Floating Point
  - coprocessor
• Memory Management
• Special
Registers: R0 - R31, PC, HI, LO
3 Instruction Formats: all 32 bits wide
OP | rs | rt | rd | sa | funct
OP | rs | rt | immediate
OP | jump target
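The three formats can be illustrated by packing fields into 32-bit words. The sketch below assumes the standard MIPS field widths, which the transcript does not preserve: 6-bit OP, 5-bit rs/rt/rd/sa, 6-bit funct, 16-bit immediate, 26-bit jump target. The function names are illustrative.

```python
def encode_r(rs, rt, rd, sa, funct, op=0):
    """R-type: OP(6) | rs(5) | rt(5) | rd(5) | sa(5) | funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | funct


def encode_i(op, rs, rt, immediate):
    """I-type: OP(6) | rs(5) | rt(5) | immediate(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def encode_j(op, target):
    """J-type: OP(6) | jump target(26)."""
    return (op << 26) | (target & 0x3FFFFFF)


# add $8, $17, $18  (OP = 0, funct = 32)
print(hex(encode_r(rs=17, rt=18, rd=8, sa=0, funct=32)))   # 0x2324020
# lw $8, 32($19)    (OP = 35)
print(hex(encode_i(op=35, rs=19, rt=8, immediate=32)))     # 0x8e680020
# j with word target 0x100 (OP = 2)
print(hex(encode_j(op=2, target=0x100)))                   # 0x8000100
```

Because every format keeps OP in the top 6 bits and rs/rt in the same positions, the decoder can read the register fields before it knows which format it is looking at.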
MIPS I Registers
° Programmable storage
• 2^32 x bytes of memory
• 31 x 32-bit GPRs (R0 = 0)
• 32 x 32-bit FP regs (paired DP)
• HI, LO, PC
[Figure: register file r0 (always 0), r1 ... r31, plus PC, lo, hi]
MIPS Addressing Modes/Instruction Formats
• All instructions 32 bits wide
Register (direct):  op | rs | rt | rd       operand is in a register
Immediate:          op | rs | rt | immed    operand is immed
Base+index:         op | rs | rt | immed    operand is Memory[register + immed]
PC-relative:        op | rs | rt | immed    target is Memory[PC + immed]
• Register Indirect?
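The address arithmetic behind the last two modes can be sketched as follows. The sketch assumes a 16-bit sign-extended immediate, and for the PC-relative case it uses the MIPS branch convention of adding the word offset (shifted left by 2) to PC + 4, a detail the slide's diagram simplifies to PC + immed. The function names are illustrative.

```python
def sign_extend16(immed):
    """Interpret the low 16 bits of `immed` as a two's-complement value."""
    immed &= 0xFFFF
    return immed - 0x10000 if immed & 0x8000 else immed


def base_address(regs, rs, immed):
    """Base + displacement: address = register[rs] + sign-extended immed."""
    return regs[rs] + sign_extend16(immed)


def branch_target(pc, immed):
    """PC-relative: target = (PC + 4) + (sign-extended immed << 2)."""
    return pc + 4 + (sign_extend16(immed) << 2)


regs = {2: 100}
print(base_address(regs, 2, 4))        # 104
print(base_address(regs, 2, 0xFFFC))   # 96  (immed = -4)
print(base_address(regs, 2, 0))        # 100
print(hex(branch_target(0x1000, 3)))   # 0x1010
```

The third call suggests an answer to the slide's closing question: register indirect needs no separate mode, because base addressing with a zero immediate, as in `lw $8, 0($2)`, behaves the same way.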
Example: MIPS Assembly Language Notation
Instruction Set Definition (programming model)
Registers and Memory (MIPS)
Memory Organization
Memory Organization
Addressing Objects: Endianness and Alignment
° Big Endian: address of most significant byte = word address (xx00 = Big End of word)
• IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA
° Little Endian: address of least significant byte = word address (xx00 = Little End of word)
• Intel 80x86, DEC Vax, DEC Alpha (Windows NT)
[Figure: byte numbering within a word from msb to lsb: little endian 3 2 1 0, big endian 0 1 2 3]
° Alignment: require that objects fall on an address that is a multiple of their size.
[Figure: aligned and not aligned objects at byte offsets 0 1 2 3]
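Both ideas can be checked with Python's built-in `int.to_bytes`. This is a minimal sketch: the helper names `word_bytes` and `is_aligned` and the sample value 0x12345678 are illustrative.

```python
def word_bytes(value, order):
    """Bytes of a 32-bit word in increasing address order ('big' or 'little')."""
    return list(value.to_bytes(4, order))


def is_aligned(address, size):
    """An object is aligned when its address is a multiple of its size."""
    return address % size == 0


word = 0x12345678
print([hex(b) for b in word_bytes(word, "big")])     # ['0x12', '0x34', '0x56', '0x78']
print([hex(b) for b in word_bytes(word, "little")])  # ['0x78', '0x56', '0x34', '0x12']

print(is_aligned(8, 4))   # True:  a 4-byte word at address 8
print(is_aligned(6, 4))   # False: a 4-byte word at address 6
print(is_aligned(6, 2))   # True:  a 2-byte halfword at address 6
```

In the big-endian list the most significant byte (0x12) sits at the lowest address, which is the word address. In the little-endian list the least significant byte (0x78) does.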
Instruction Cycle (execution model)
Instruction Cycle (execution model)
Executing an Assembly Instruction
Register File Program Execution
Register File Program Execution
Another Example
Accessing Data
Memory Operation - Loads
Memory Operations - Store
Memory Operation - Loads
Memory Operation – Loads – cont’
Instruction Format
Instruction Formats
Constants
Loading Immediate Values
MIPS Machine Language
Summary
° If code size is most important, use variable-length instructions
° If performance is most important, use fixed-length instructions
° Recent embedded machines (ARM, MIPS) have an optional mode to execute a subset of 16-bit-wide instructions (Thumb, MIPS16); per procedure, decide which one of performance or density is more important
Summary (cont’)
° “Simple” computations, movements of data, etc., are not “simple” in terms of a single, obvious assembly instruction
• Often requires a sequence of even more primitive instructions
• One option is to try to “anticipate” every such computation, and try to provide an assembly instruction for it
- PRO: assembly programs are easier to write by hand
- CON: hardware gets really, really complicated by instructions used very rarely. Compilers might be harder to write
• Other option is to provide a small set of essential primitive instructions
- CON: anything in a high level language turns into LOTS of instructions in assembly language
- PRO: hardware and compiler become easier to design, cleaner, easier to optimize for speed, performance
DLX (“Deluxe”) Architecture
IF: Instruction fetch
ID: Instruction decode/register file read
EX: Execute/address calculation
MEM: Memory access
WB: Write back
[Figure: datapath with PC, instruction memory, register file, sign extend (16 to 32), shift left 2, adders, ALU, data memory, and multiplexers]
Multicycle Approach
° Break up the instructions into steps, each step takes a cycle
• balance the amount of work to be done
• restrict each cycle to use only one major functional unit
° At the end of a cycle
• store values for use in later cycles (easiest thing to do)
• introduce additional “internal” registers
[Figure: multicycle datapath with PC, memory, instruction register, memory data register, register file, A and B registers, sign extend (16 to 32), shift left 2, ALU, ALUOut, and multiplexers]
Five Execution Steps
° Instruction Fetch
° Instruction Decode and Register Fetch
° Execution, Memory Address Computation, or
Branch Completion
° Memory Access or R-type instruction completion
° Write-back step
INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!
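The 3-to-5-cycle range translates into an average CPI once an instruction mix is known. The sketch below assumes the usual step counts for this multicycle design (branches and jumps finish in step 3, R-type and stores in step 4, loads in step 5). The instruction mix is purely hypothetical and chosen only to illustrate the arithmetic.

```python
# Cycles per instruction class, following the five execution steps above.
CYCLES = {"load": 5, "store": 4, "r-type": 4, "branch": 3, "jump": 3}

# Hypothetical instruction mix (fractions sum to 1.0); not measured data.
MIX = {"load": 0.25, "store": 0.10, "r-type": 0.50, "branch": 0.13, "jump": 0.02}


def average_cpi(mix, cycles=CYCLES):
    """Weighted average of cycles per instruction for the given mix."""
    return sum(cycles[kind] * fraction for kind, fraction in mix.items())


print(round(average_cpi(MIX), 2))  # 4.1
```

With this mix the average is 0.25*5 + 0.10*4 + 0.50*4 + 0.13*3 + 0.02*3 = 4.10 cycles, better than the 5 cycles every instruction would need if all of them were forced through the longest path.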
DLX Instruction Execution
° Every DLX instruction can be implemented in at most 5 CC!!
• Instruction Fetch (and PC Increment) cycle (IF)
- IR ← Mem[PC]
- NPC ← PC + 4
• Instruction Decode / Register Fetch cycle (ID)
- A ← Regs[IR6..10];
- B ← Regs[IR11..15];
- Imm ← ((IR16)^16 ## IR16..31) (bit IR16 replicated 16 times, concatenated with the 16-bit immediate, i.e. sign extension)
• Execution/effective address cycle (EX)
Performs one of the four possible operations (depending on the DLX instruction type)
1. Memory Reference
- ALUOutput ← A + Imm;
2. Register-Register ALU Instruction
- ALUOutput ← A func B;
DLX Instruction Execution (cont’)
3. Register-Immediate ALU instruction
- ALUOutput ← A op Imm;
4. Branch
- ALUOutput ← NPC + Imm;
- Cond ← (A op 0)
• Memory access/branch completion cycle (MEM)
1. Memory Reference
- LMD ← Mem[ALUOutput] (load) or Mem[ALUOutput] ← B (store)
2. Branch
- if (Cond) PC ← ALUOutput else PC ← NPC
• Write-Back cycle (WB)
1. Register-Register ALU Instruction
– Regs[IR16..20] ← ALUOutput;
2. Register-Immediate ALU instruction
– Regs[IR11..15] ← ALUOutput;
3. Load instruction
– Regs[IR11..15] ← LMD;
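The register transfers above can be followed in a small Python sketch. It is a simplification under stated assumptions: instructions arrive already decoded as dictionaries (so IR bit fields become the hypothetical keys `rs1`, `rs2`, `rd`, `imm`), the ALU function is fixed to addition, the branch condition is fixed to "A equals 0", and memory is a dictionary keyed by byte address.

```python
def sign_extend16(x):
    """Imm <- (IR16)^16 ## IR16..31, i.e. sign-extend a 16-bit field."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def execute(instr, state):
    """Carry one decoded instruction through IF, ID, EX, MEM, and WB."""
    regs, mem = state["regs"], state["mem"]
    alu, cond, lmd = 0, False, None
    kind = instr["kind"]
    # IF: the instruction is assumed to be fetched already; NPC <- PC + 4
    npc = state["pc"] + 4
    # ID: A <- Regs[rs1]; B <- Regs[rs2]; Imm <- sign-extended immediate
    a = regs[instr.get("rs1", 0)]
    b = regs[instr.get("rs2", 0)]
    imm = sign_extend16(instr.get("imm", 0))
    # EX: one of the four operations
    if kind in ("load", "store"):
        alu = a + imm              # ALUOutput <- A + Imm
    elif kind == "alu_rr":
        alu = a + b                # ALUOutput <- A func B (func assumed: add)
    elif kind == "alu_ri":
        alu = a + imm              # ALUOutput <- A op Imm (op assumed: add)
    elif kind == "branch":
        alu = npc + imm            # ALUOutput <- NPC + Imm
        cond = (a == 0)            # Cond <- (A op 0) (op assumed: ==)
    # MEM: memory access or branch completion
    state["pc"] = npc
    if kind == "load":
        lmd = mem[alu]             # LMD <- Mem[ALUOutput]
    elif kind == "store":
        mem[alu] = b               # Mem[ALUOutput] <- B
    elif kind == "branch" and cond:
        state["pc"] = alu          # PC <- ALUOutput
    # WB: write the result back to the register file
    if kind in ("alu_rr", "alu_ri"):
        regs[instr["rd"]] = alu
    elif kind == "load":
        regs[instr["rd"]] = lmd
    regs[0] = 0                    # R0 always reads as zero
    return state


def run_demo():
    """Run a short hypothetical program and return the final state."""
    state = {"pc": 0, "regs": [0] * 32, "mem": {104: 7}}
    state["regs"][1] = 100
    program = [
        {"kind": "load", "rd": 2, "rs1": 1, "imm": 4},        # R2 <- Mem[R1 + 4]
        {"kind": "alu_ri", "rd": 3, "rs1": 2, "imm": 0xFFFF}, # R3 <- R2 + (-1)
        {"kind": "alu_rr", "rd": 4, "rs1": 2, "rs2": 3},      # R4 <- R2 + R3
        {"kind": "store", "rs1": 1, "rs2": 4, "imm": 8},      # Mem[R1 + 8] <- R4
        {"kind": "branch", "rs1": 0, "imm": 8},               # taken: R0 == 0
    ]
    for instr in program:
        execute(instr, state)
    return state


final = run_demo()
print(final["regs"][2:5], final["mem"][108], final["pc"])  # [7, 6, 13] 13 28
```

Each instruction kind touches only some of the five steps (a store writes nothing back, a branch finishes in MEM), which is why "at most 5" clock cycles is the right bound.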