From C to Assembly Language
Jen-Chang Liu, Spring 2006
Adapted from
http://www-inst.eecs.berkeley.edu/~cs61c/
Hierarchy of Computer Organization
High level
° Application (programs)
° Software: Compiler, Assembler, Operating System (Windows 98)
° Instruction Set Architecture (between software and hardware)
° Hardware: Processor, Memory, I/O system
° Datapath & Control
° Digital Design
° Circuit Design
° transistors
Low level
Hierarchy of Programming Languages
Degree of abstraction, from high-level to low-level
(language: operand, operation):
° Java, C++: objects, method
° C: variables, arithmetic op. and functions
° Assembly: registers, instruction set
° Machine: registers, binary operation code
Assembly Language
° Basic job of a CPU
• execute lots of instructions.
° Instructions are the primitive
operations that the CPU may execute.
° Different CPUs implement different
sets of instructions.
° The set of instructions a particular
CPU implements is an Instruction Set
Architecture (ISA).
• Examples: Intel 80x86 (Pentium 4),
IBM/Motorola PowerPC (Macintosh),
MIPS, Intel IA64, ...
Purpose of learning assembly
° Understanding of the underlying
hardware
° Hand-written assembly can be smaller and
faster than compiled code
• Ex. Embedded applications
Book: Programming From the Ground Up
“A new book was just released which is
based on a new concept - teaching
computer science through assembly
language (Linux x86 assembly language,
to be exact). This book teaches how the
machine itself operates, rather than just
the language. I've found that the key
difference between mediocre and excellent
programmers is whether or not they know assembly
language. Those that do tend to understand
computers themselves at a much deeper level.
Although [almost!] unheard of today, this concept isn't
really all that new -- there used to not be much choice
in years past. Apple computers came with only BASIC
and assembly language, and there were books
available on assembly language for kids. This is why
the old-timers are often viewed as 'wizards': they had
to know assembly language programming.”
-- slashdot.org comment, 2004-02-05
We are going from C to assembly
° Target machine architecture for
assembly code: MIPS
• Designed in early 1980s
• Used by NEC, Nintendo, Silicon
Graphics, Sony, etc.
• RISC architecture (Reduced Instruction
Set)
- vs. CISC (Complex Instruction Set)
° Why MIPS instead of Intel 80x86?
• MIPS is simple and elegant. We don’t want to
get bogged down in gritty details.
• MIPS is widely used in embedded apps,
x86 is little used in embedded, and there are
more embedded computers than PCs
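MIPS CPU
[Figure: the MIPS CPU and its connection to memory. The CPU has registers $0 to $31, an arithmetic unit, and a multiply/divide unit with Hi and Lo registers. Coprocessor 1 (FPU, floating point) has its own registers $0 to $31 and arithmetic unit. Coprocessor 0 (traps and memory) holds the BadVAddr, Cause, Status, and EPC registers.]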
[Figure: MIPS opcode map. The table lists field values 0 to 63 in decimal (00 to 3f in hexadecimal) and the instruction each value selects:
• op(31:26): j, jal, beq, bne, blez, bgtz, addi, addiu, slti, sltiu, andi, ori, xori, lui, z = 0 to 3 (coprocessor operations), lb, lh, lwl, lw, lbu, lhu, lwr, sb, sh, swl, sw, swr, lwc0 to lwc3, swc0 to swc3
• rs(25:21), coprocessor instructions: mfcz, cfcz, mtcz, ctcz, bczf and bczt (selected by bit 16), copz
• funct(4:0), if z = 0: tlbr, tlbwi, tlbwr, tlbp, rte
• rt(20:16): bltz, bgez, bltzal, bgezal
• funct(5:0), integer: sll, srl, sra, srlv, srav, jr, jalr, syscall, break, mfhi, mthi, mflo, mtlo, mult, multu, div, divu, add, addu, sub, subu, and, or, xor, nor, slt, sltu
• funct(5:0), if z = 1 (f = s or f = d): add.f, sub.f, mul.f, div.f, abs.f, mov.f, neg.f, cvt.s.f, cvt.d.f, cvt.w.f, c.f.f, c.un.f, c.eq.f, c.ueq.f, c.olt.f, c.ult.f, c.ole.f, c.ule.f, c.st.f, c.ngle.f, c.seq.f, c.ngl.f, c.lt.f, c.nge.f, c.le.f, c.ngt.f]
Laboratory #1: next Monday (3/13)
° MIPS simulator – SPIM
° Download the SPIM
• CD in the textbook
• http://www.cs.wisc.edu/~larus/spim.html
° Read SPIM document
• On-line document
• Textbook Appendix A
° Try a simple program
Outline
°C operators, operands
°Variables in Assembly: Registers
°Comments in Assembly
°Addition and Subtraction in Assembly
°Memory Access in Assembly
Review C Operators/Operands (1/2)
°Operators: +, -, *, /, % (mod);
•7/4==1, 7%4==3
°Operands:
• Variables: lower, upper, fahr, celsius
• Constants: 0, 1000, -17, 15.4
°Assignment Statement:
Variable = expression
• Examples:
celsius = 5*(fahr-32)/9;
a = b+c+d-e;
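The integer operators above can be checked with a small C sketch. The function names `fahr_to_celsius`, `quotient`, and `remainder_of` are assumptions introduced here for illustration and do not appear in the slides; the formula is the one from the example.

```c
#include <assert.h>

/* Hypothetical helper (not in the original slides): integer
 * Fahrenheit-to-Celsius conversion using the slide's formula. */
int fahr_to_celsius(int fahr)
{
    /* Multiply before dividing: 5/9 on its own would truncate to 0. */
    return 5 * (fahr - 32) / 9;
}

/* Integer division truncates toward zero. */
int quotient(int a, int b)
{
    assert(b != 0);
    return a / b;
}

/* % yields the remainder of the integer division. */
int remainder_of(int a, int b)
{
    assert(b != 0);
    return a % b;
}
```

Because `/` on `int` operands discards the fractional part, the order of the multiplication and the division in the formula matters.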
C Operators/Operands (2/2)
°In C (and most High Level Languages)
variables declared first and given a
type
• Example:
int fahr, celsius;
char a, b, c, d, e;
°Each variable can ONLY represent a
value of the type it was declared as
(cannot mix and match int and char
variables).
Assembly Design: Key Concepts
°Keep it simple!
• Limit what can be a variable and what
can’t
• Limit types of operations that can be
done to absolute minimum
- if an operation can be decomposed into a
simpler operation, don’t include it
Assembly Variables: Registers (1/4)
°Unlike HLL, assembly cannot use
variables
• Why not? Keep Hardware Simple
°Assembly Operands are registers
• limited number of special locations built
directly into the hardware
• operations can only be performed on
these!
°Benefit: Since registers are directly in
hardware, they are very fast
Assembly Variables: Registers (2/4)
°Drawback: Since registers are in
hardware, there are a predetermined
number of them
• Solution: MIPS code must be very
carefully put together to efficiently use
registers
°32 registers in MIPS
• Why 32? Smaller is faster
°Each MIPS register is 32 bits wide
• Groups of 32 bits called a word in MIPS
Assembly Variables: Registers (3/4)
°Registers are numbered from 0 to 31
°Each register can be referred to by
number or name
°Number references:
$0, $1, $2, … $30, $31
Assembly Variables: Registers (4/4)
°By convention, each register also has
a name to make it easier to code
°For now:
$16 - $23 → $s0 - $s7
(correspond to C variables)
$8 - $15 → $t0 - $t7
(correspond to temporary variables)
°In general, use names to make your
code more readable
PCspim simulator
Comments in Assembly
°Another way to make your code more
readable: comments!
°Hash (#) is used for MIPS comments
• anything from hash mark to end of line is
a comment and will be ignored
°Note: Different from C.
• C comments have format /* comment */ ,
so they can span many lines
Assembly Instructions
In C: a = b+c+d+e;
In assembly: one addition per instruction
°In assembly language, each statement
(called an Instruction), executes
exactly one of a short list of simple
commands
°Unlike in C (and most other High Level
Languages), each line of assembly
code contains at most 1 instruction
Addition and Subtraction (1/4)
°Syntax of Instructions:
1 2,3,4
°Example: add $s0,$s1,$s2
where:
1) operation by name
2) operand getting result (“destination”)
3) 1st operand for operation (“source1”)
4) 2nd operand for operation (“source2”)
°Syntax is rigid:
• 1 operator, 3 operands
• Why? Keep Hardware simple via regularity
Addition and Subtraction (2/4)
°Addition in Assembly
• Example:
add $s0,$s1,$s2 (in MIPS)
Equivalent to:
a = b + c
(in C)
where registers $s0,$s1,$s2 are
associated with variables a, b, c
°Subtraction in Assembly
• Example:
sub $s3,$s4,$s5 (in MIPS)
Equivalent to:
d = e - f
(in C)
where registers $s3,$s4,$s5 are
associated with variables d, e, f
Addition and Subtraction (3/4)
°How to do the following C statement?
a = b + c + d - e;
°Break into multiple instructions
add $s0, $s1, $s2 # a = b + c
add $s0, $s0, $s3 # a = a + d
sub $s0, $s0, $s4 # a = a - e
°Notice: A single line of C may break up
into several lines of MIPS.
°Notice: Everything after the hash mark
on each line is ignored (comments)
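To see the three instructions run, here is a minimal C sketch that models the register file as an array and `add`/`sub` as functions. The array `reg`, the register-number constants, and `compute_a` are hypothetical names chosen for this illustration; overflow behaviour of the real instructions is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Register numbers for $s0..$s4 ($16..$20). */
enum { S0 = 16, S1 = 17, S2 = 18, S3 = 19, S4 = 20 };

/* Hypothetical model of the 32 MIPS registers. */
static int32_t reg[32];

/* add rd,rs,rt : rd = rs + rt (overflow is ignored in this sketch) */
static void add(int rd, int rs, int rt)
{
    assert(rd >= 0 && rd < 32);
    reg[rd] = reg[rs] + reg[rt];
}

/* sub rd,rs,rt : rd = rs - rt */
static void sub(int rd, int rs, int rt)
{
    assert(rd >= 0 && rd < 32);
    reg[rd] = reg[rs] - reg[rt];
}

/* a = b + c + d - e, one operation per step as in the slide. */
int32_t compute_a(int32_t b, int32_t c, int32_t d, int32_t e)
{
    reg[S1] = b; reg[S2] = c; reg[S3] = d; reg[S4] = e;
    add(S0, S1, S2);   /* a = b + c */
    add(S0, S0, S3);   /* a = a + d */
    sub(S0, S0, S4);   /* a = a - e */
    return reg[S0];
}
```

Each call touches exactly three registers, which mirrors the rigid one-operator, three-operand syntax.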
Addition and Subtraction (4/4)
°How do we do this?
•f = (g + h) - (i + j);
• ? operations, ? registers
°Use intermediate temporary register
add $s0,$s1,$s2 # f = g + h
add $t0,$s3,$s4 # t0 = i + j
# need to save i+j, but can’t use
# f, so use t0
sub $s0,$s0,$t0 # f=(g+h)-(i+j)
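The same decomposition can be written in C as three single-operation statements, which makes the count visible: three operations and six registers (five for f, g, h, i, j plus one temporary). The function name `f_of` is an assumption for this sketch.

```c
#include <assert.h>

/* f = (g + h) - (i + j), split into one operation per statement.
 * The local t0 plays the role of the temporary register $t0. */
int f_of(int g, int h, int i, int j)
{
    int f, t0;
    f  = g + h;    /* add $s0,$s1,$s2 */
    t0 = i + j;    /* add $t0,$s3,$s4 */
    f  = f - t0;   /* sub $s0,$s0,$t0 */
    assert(f == (g + h) - (i + j));
    return f;
}
```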
Immediates (constants) (1/2)
°Immediates are numerical constants.
°They appear often in code, so there
are special instructions for them.
• In the C compiler gcc, 52% of operations
involve constants
°Add Immediate:
addi $s0,$s1,10 (in MIPS)
f = g + 10
(in C)
where registers $s0,$s1 are associated
with variables f, g
° Where is the immediate stored in the
real machine?
Immediates (2/2)
°There is no Subtract Immediate in
MIPS: Why?
addi $s0,$s1,-10 (in MIPS)
f = g - 10 (in C)
where registers $s0,$s1 are associated
with variables f, g
°Limit types of operations that can be
done to absolute minimum
• if an operation can be decomposed into a
simpler operation, don’t include it
addi …, -X = subi …, X => so no subi
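One way to picture where the immediate lives is to build the instruction word by hand. The sketch below assumes the standard I-format layout, op(31:26) rs(25:21) rt(20:16) imm(15:0), with opcode 8 for addi; the function names `encode_addi` and `sign_extend16` are hypothetical. It also shows why a negative immediate makes a separate subtract-immediate unnecessary: the 16-bit field is sign-extended before the addition.

```c
#include <assert.h>
#include <stdint.h>

/* Build the 32-bit word for "addi rt, rs, imm".
 * Assumed layout: op(31:26)=8, rs(25:21), rt(20:16), imm(15:0). */
uint32_t encode_addi(unsigned rt, unsigned rs, int32_t imm)
{
    assert(rt < 32 && rs < 32);
    assert(imm >= -32768 && imm <= 32767);   /* must fit in 16 bits */
    return (8u << 26)
         | ((uint32_t)rs << 21)
         | ((uint32_t)rt << 16)
         | ((uint32_t)imm & 0xFFFFu);
}

/* Recover the signed value from the 16-bit immediate field. */
int32_t sign_extend16(uint32_t field)
{
    int32_t v = (int32_t)(field & 0xFFFFu);
    return (v & 0x8000) ? v - 0x10000 : v;
}
```

With $s0 = 16 and $s1 = 17, `encode_addi(16, 17, 10)` gives the word for `addi $s0,$s1,10`, and passing -10 gives the word for the subtraction example.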
Register Zero (the register holding constant 0)
°One particular immediate, the number
zero (0), appears very often in code.
°So we define register zero ($0 or
$zero) to always have the value 0; eg
add $s0,$s1,$zero (in MIPS)
f = g (in C)
where registers $s0,$s1 are associated
with variables f, g
°defined in hardware, so an instruction
addi $0,$0,5
will not do anything!
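The behaviour of register zero can be sketched in C by discarding every write to register 0. The names `reg`, `write_reg`, and `read_reg` are assumptions made for this illustration, not part of any real simulator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file; reg[0] plays the role of $zero. */
static int32_t reg[32];

static void write_reg(int rd, int32_t value)
{
    assert(rd >= 0 && rd < 32);
    if (rd != 0)          /* writes to $zero are silently discarded */
        reg[rd] = value;
}

int32_t read_reg(int r)
{
    assert(r >= 0 && r < 32);
    return reg[r];
}

/* addi rt,rs,imm and add rd,rs,rt, both routed through write_reg. */
void addi(int rt, int rs, int32_t imm) { write_reg(rt, reg[rs] + imm); }
void add(int rd, int rs, int rt)       { write_reg(rd, reg[rs] + reg[rt]); }
```

Calling `addi(0, 0, 5)` leaves register 0 at 0, and `add(16, 17, 0)` copies register 17 into register 16, matching `f = g`.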
Brief summary
° In MIPS Assembly Language:
• Registers replace C variables
• One Instruction (simple operation) per line
• Simpler is Better
• Smaller is Faster
° New Instructions:
add, addi, sub
° New Registers:
C Variables: $s0 - $s7
Temporary Variables: $t0 - $t9
Zero: $zero
Assembly Operands: Memory
°C variables map onto registers; what
about large data structures like arrays?
°1 of 5 components of a computer:
memory contains such data structures
°But MIPS arithmetic instructions only
operate on registers, never directly on
memory.
°Data transfer instructions transfer data
between registers and memory:
• Memory to register: load
• Register to memory: store
Data Transfer: Memory to Reg (1/4)
°To transfer a word of data, we need to
specify two things:
• Register: specify this by number (0 - 31)
• Memory address: more difficult
- Think of memory as a single one-dimensional
array, so we can address it
simply by supplying a pointer to a
memory address.
- Other times, we want to be able to offset
from this pointer.
Recall: C pointer
° How to access C variables through
pointers?
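#include <stdio.h>
int main(void)
{
    char str[] = "Hello, World!";
    char a;
    a = *str;        /* a = 'H', the char that str points to */
    a = *(str + 1);  /* a = 'e', one char past str */
    printf("%c\n", a);
    return 0;
}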
Memory layout: str → 'H', str+1 → 'e', followed by 'l', 'l', 'o', ',', ' ', 'W', …
Data Transfer: Memory to Reg (2/4)
°To specify a memory address to copy
from, specify two things:
• A register which contains a pointer to
memory
• A numerical offset (in bytes)
°The desired memory address is the
sum of these two values.
°Example:
8($t0)
• specifies the memory address pointed to
by the value in $t0, plus 8 bytes
Memory access: 8($t0)
Two ways to read it:
• Pointer $t0 (register) + offset 8 (constant)
• Start of an array 8 (constant) + offset $t0 (register)
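In C terms, base-plus-offset addressing is pointer arithmetic measured in bytes. The sketch below is an illustration only: `load_word` is a hypothetical helper, and `memcpy` is used so that the read is well defined for any pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read the 32-bit word found offset_bytes past base,
 * like the address computation in "offset(base)". */
int32_t load_word(const void *base, size_t offset_bytes)
{
    int32_t word;
    assert(base != NULL);
    memcpy(&word, (const unsigned char *)base + offset_bytes, sizeof word);
    return word;
}
```

For an array `a` of 32-bit integers, `load_word(a, 8)` returns the same value as `a[2]`, because element 2 starts 8 bytes past the start of the array.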
Data Transfer: Memory to Reg (3/4)
°Load Instruction Syntax:
1 2,3(4)
Example: lw $t0,12($s0)
• where
1) operation name
2) register that will receive value
3) numerical offset in bytes
4) register containing pointer to memory
°Instruction Name:
•lw (meaning Load Word, so 32 bits (4
bytes) or one word are loaded at a time)
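Example: load word
# hello.s
.data
str: .asciiz "Hello, World!" # stored at 0x10010000
.text
.globl main
main:
addi $gp, $gp, 0x7fff # $gp = 0x10008000 + 0x7fff
lw $t0, 0x1($gp) # loads the word at 0x10010000: "Hell"
Memory layout: static data starts at 0x10000000 and $gp = 0x10008000.
The string is stored one word at a time from 0x10010000:
“Hell”, “o, W”, “orld”, …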
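The address arithmetic in this example can be checked with a short C sketch. It assumes that the offset field of lw and the immediate of addi are 16-bit signed values, which is why the string at 0x10010000 cannot be reached from $gp = 0x10008000 in a single step. The function name `effective_address` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Address formed by a base register plus a 16-bit signed offset. */
uint32_t effective_address(uint32_t base, int32_t offset)
{
    /* A 16-bit signed field can only hold -32768..32767. */
    assert(offset >= -32768 && offset <= 32767);
    return base + (uint32_t)offset;   /* wraps modulo 2^32 */
}
```

Adding 0x7fff to 0x10008000 and then using offset 0x1 reaches 0x10010000, while the most negative offset from the original $gp reaches the start of static data at 0x10000000.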
Data Transfer: Reg to Memory (1/2)
°Also want to store value from a register
into memory
°Store instruction syntax is identical to
Load instruction syntax
°Instruction Name:
sw (meaning Store Word, so 32 bits
or one word are stored at a time)
MIPS CPU
[Figure: the MIPS CPU diagram with two arrows added between memory and the CPU registers: load moves data from memory into a register, store moves data from a register into memory.]
Data Transfer: Reg to Memory (2/2)
°Example:
sw $t0,12($s0)
This instruction will take the pointer in
$s0, add 12 bytes to it, and then store the
value from register $t0 into the memory
address pointed to by the calculated sum
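A load and a store can be sketched together in C with a small byte array standing in for memory. Everything here is an assumption for illustration: the 64-byte `memory`, the `reg` array, and the function names `lw` and `sw`. Bytes are copied in the host's byte order, which may differ from a real MIPS machine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_BYTES = 64 };

/* Hypothetical byte-addressed memory and register file. */
static unsigned char memory[MEM_BYTES];
uint32_t reg[32];

/* Compute base + offset and check it names a whole, aligned word. */
static uint32_t word_address(int32_t offset, int base)
{
    uint32_t addr;
    assert(base >= 0 && base < 32);
    addr = reg[base] + (uint32_t)offset;
    assert(addr % 4 == 0 && addr <= MEM_BYTES - 4);
    return addr;
}

/* lw rt, offset(base): copy one word from memory into a register. */
void lw(int rt, int32_t offset, int base)
{
    assert(rt >= 0 && rt < 32);
    memcpy(&reg[rt], &memory[word_address(offset, base)], 4);
}

/* sw rt, offset(base): copy one word from a register into memory. */
void sw(int rt, int32_t offset, int base)
{
    assert(rt >= 0 && rt < 32);
    memcpy(&memory[word_address(offset, base)], &reg[rt], 4);
}
```

Storing `$t0` (register 8) with `sw(8, 12, 16)` and loading it back with `lw(8, 12, 16)` returns the same value, since both use the address in `$s0` (register 16) plus 12 bytes.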
Pointers vs. Values
°Key Concept: A register can hold any
32-bit value. (typeless)
• That value can be a (signed) int, an
unsigned int, a pointer (memory
address), etc.
°If you write add $t2,$t1,$t0
then $t0 and $t1
better contain values
°If you write lw $t2,0($t0)
then $t0 better contain a pointer
°Don’t mix these up!
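The typeless nature of a register can be illustrated in C by reading the same 32 bits as a signed and as an unsigned integer. The helper names `as_signed` and `as_unsigned` are assumptions for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a signed integer.
 * The bits do not change; only the interpretation does. */
int32_t as_signed(uint32_t bits)
{
    int32_t value;
    assert(sizeof value == sizeof bits);
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Reinterpret a signed integer as an unsigned bit pattern. */
uint32_t as_unsigned(int32_t value)
{
    return (uint32_t)value;
}
```

The pattern 0xFFFFFFFF is -1 when read as signed and 4294967295 when read as unsigned; the hardware does not record which one the programmer meant.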
Addressing: Byte vs. Word
°Every word in memory has an address,
similar to an index in an array
°Early computers numbered words like
C numbers elements of an array:
•Memory[0], Memory[1], Memory[2], …
• The index is called the “address” of a word
°Computers needed to access 8-bit
bytes as well as words (4 bytes/word)
°Today machines address memory as
bytes, hence word addresses differ by 4
•Memory[0], Memory[4], Memory[8], …
C: word address => assembly: byte address
°What offset in lw to select A[8] in C?
• A is an array of a 4-byte type (e.g. a 32-bit int)
°Compile by hand using registers:
g = h + A[8];
• g: $s1, h: $s2, $s3:base address of A
°4x8=32 bytes offset to select A[8]
°1st transfer from memory to register:
lw $t0,32($s3) # $t0 gets A[8]
add $s1,$s2,$t0 # $s1 = h+A[8]
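The offset calculation can be written out in C. This is a sketch under the assumption that A holds 32-bit elements; `byte_offset` and `g_of` are hypothetical names, and the caller must pass an array with at least nine elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of element `index` in an array of 4-byte elements. */
size_t byte_offset(size_t index)
{
    return index * sizeof(int32_t);
}

/* g = h + A[8], written as the two steps from the slide. */
int32_t g_of(int32_t h, const int32_t *A)
{
    assert(A != NULL);
    int32_t t0 = A[8];   /* lw  $t0,32($s3) */
    return h + t0;       /* add $s1,$s2,$t0 */
}
```

In C the compiler scales the index by the element size automatically; in assembly the programmer writes the scaled byte offset, 32, directly.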
More Notes about Memory: Alignment
°MIPS requires that all words start at
addresses that are multiples of 4 bytes
[Figure: a word starting at byte address 0 is aligned; words starting at byte address 1, 2, or 3 are not aligned.]
°Called Alignment: objects must fall on an
address that is a multiple of their size.
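The alignment rule reduces to a remainder test, sketched below in C. The function name `is_aligned` is an assumption for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* An object of `size` bytes is aligned when its address
 * is a multiple of that size. Returns 1 if aligned, else 0. */
int is_aligned(uint32_t addr, uint32_t size)
{
    assert(size != 0);
    return addr % size == 0;
}
```

For words, `size` is 4, so addresses 0, 4, 8, … pass the test and addresses 1, 2, and 3 do not.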
Role of Registers vs. Memory
°What if more variables than registers?
• Compiler tries to keep the most frequently
used variables in registers
°Why not keep all variables in memory?
• Smaller is faster:
registers are faster than memory
• Registers more versatile:
- MIPS arithmetic instructions can read 2 registers,
operate on them, and write 1 register per instruction
- MIPS data transfer instructions only read or write 1
operand per instruction, and perform no operation
Brief summary (1/2)
°In MIPS Assembly Language:
• Registers replace C variables
• One Instruction (simple operation) per line
• Simpler is Better
• Smaller is Faster
°Memory is byte-addressable, but lw and
sw access one word at a time.
°A pointer (used by lw and sw) is just a
memory address, so we can add to it or
subtract from it (using offset).
Brief summary (2/2)
°New Instructions:
add, addi,
sub
lw, sw
°New Registers:
C Variables: $s0 - $s7
Temporary Variables: $t0 - $t9
Zero: $zero