Download Finishing code generation

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Recursion (computer science) wikipedia , lookup

C Sharp (programming language) wikipedia , lookup

Forth (programming language) wikipedia , lookup

C syntax wikipedia , lookup

Compiler wikipedia , lookup

Buffer overflow wikipedia , lookup

Program optimization wikipedia , lookup

C++ wikipedia , lookup

Corecursion wikipedia , lookup

One-pass compiler wikipedia , lookup

Interpreter (computing) wikipedia , lookup

Assembly language wikipedia , lookup

Subroutine wikipedia , lookup

Tail call wikipedia , lookup

Name mangling wikipedia , lookup

Call stack wikipedia , lookup

Transcript
Outline
•
•
•
•
Implementing function calls
Implementing functions
Optimizing away the frame pointer
Dynamically-allocated structures: strings
and arrays
• Register allocation the easy way
CS 4120
Introduction to Compilers
Andrew Myers
Cornell University
Lecture 18: Finishing code generation
7 Oct 09
2
CS 4120 Introduction to Compilers
Function calls
Stack layout
• How to generate code for function calls?
• Two kinds of IR statements in canonical form:
MOVE
f
e1 … en
fp
!
old pc
old fp
arg 2
arg 1
current
stack
frame
fp
sp
e1 … en
old pc
old fp
sp
en
:
:
sp
CS 4120 Introduction to Compilers
3
current
stack
frame
locals
locals
CALL
f
f(e1, …, en)
arg 2
arg 1
EXP
CALL
dest
!
CS 4120 Introduction to Compilers
e1
new stack
frame
4
MOVE
Two translations
bp
CALL
dest
f
old pc
old ebp
locals
e1 … en
fp
en
MOVE
RISC
push en
…
push e1
call f
mov dest, eax
add esp, 4*n
sp
e1
sub esp, 4*n
mov [esp + 4], e1
...
mov [esp + 4*n], en
call f
mov dest, eax
add esp, 4*n
CS 4120 Introduction to Compilers
push en
…
push e1
call f
mov dest, eax
add esp, 4*n
dest
CALL
:
:
non-RISC
Tiling a call
f
e1 … en
Try to match ei against r/m32, tile ei separately if failure.
Push is non-RISC, but still fast on IA-32.
5
Compiling function bodies
Abstract assembly for f
• How we compiled function body:
S! f (…) : T = e " = SEQ(MOVE(RV, E! e "),
LABEL(epilogue))
S! f (…) = e " = SEQ(EXP(E! e "),
LABEL(epilogue))
• Variables:
f(x: int, y:int) = x + y
MOVE
RV
#E! v " = MEM(TEMP(FP) + (4*i+4)) for arg. i
#E! v " = MEM(NAME(v)) for globals
• Try it out:
f(x: int, y:int) = x + y
t2
t1 +
MEM
MEM
+
+
#E! v " = TEMP(tv) for locals
CS 4120 Introduction to Compilers
6
CS 4120 Introduction to Compilers
FP
8
mov
mov
add
mov
t1, [ebp+8]
t2, t1
t2, [ebp+12]
eax, t2
FP 12
What’s missing here?
7
CS 4120 Introduction to Compilers
8
Stack frame setup
Function code
• Need code to set up stack frame on entry
fp
prev. stack
frame
locals
push ebp
mov ebp,esp
sub esp, 4*l
fp
en
:
:
sp
e1
prev pc
mov esp, ebp
pop ebp
new fp
new sp
prev
locals
en
prev. stack
frame
:
:
e1
prev. pc new stack
frame
prev. fp
new
locals
CS 4120 Introduction to Compilers
l
9
f: push ebp
mov ebp,esp
function prologue
sub esp, 4*l
mov t1, [ebp+8]
mov t2, t1
add t2, [ebp+12]
mov eax, t2
f_epilogue:
function epilogue
mov esp, ebp
pop ebp
ret
CS 4120 Introduction to Compilers
The Glory of CISC
Calling conventions
f: push ebp
mov ebp,esp
enter 4*l, 0
sub esp, 4*l
mov t1, [ebp+8]
mov t2, t1
add t2, [ebp+12]
mov eax, t2
f_epilogue:
f_epilogue:
mov esp, ebp
leave
pop ebp
ret
ret
CS 4120 Introduction to Compilers
10
11
• Have described the C calling convention:
– arguments pushed on stack, in reverse order.
– caller is responsible for cleaning up arguments.
• Others:
–
–
–
–
stdcall: callee cleans up.
MIPS/Alpha: first 4/6 arguments passed in registers.
fastcall: first 2 arguments passed in ecx, edx, rest on stack
See website for more details.
• Choice of caller- vs. callee-save:
– caller-save: caller must save registers if needed after call;
usable freely by callee (e[acd]x)
– callee-save: callee not allowed to change; to use, must be
saved (e[sd]i, e[sb]p, ebx)
CS 4120 Introduction to Compilers
12
Optimizing away ebp
Stack frame setup
• Really no need for frame pointer register!
• Idea: maintain constant offset k between
frame pointer and stack pointer
– Use RISC-style argument passing rather than
pushing arguments on stack
– All references to MEM(FP+n) translated to
operand [esp+(n+k)] instead of to [ebp+n]
• Advantage: whole extra register to use
when allocating registers (7!)
CS 4120 Introduction to Compilers
fp
prev. fp
prev. stack
frame
locals
sub esp, 4*l
prev
locals
en
en
:
:
e1
sp
prev pc
:
:
add esp, 4*l
e1
prev pc
sp + 4*l
sp
13
prev. stack
frame
new
locals
max arg
space
new stack
frame (size 4*l)
CS 4120 Introduction to Compilers
14
Caveats
Dynamic structures
• Get even faster (and RISC-core) prologue and
epilogue than with enter/leave but:
• To avoid breaking callers, must save ebp
register if we want to use it (ebp is callee-save)
• Doesn’t work if stack frame is truly variablesized
• Modern programming languages allow
dynamically allocated data structures: strings,
arrays, objects:
– alloca(n) call in C allocates n bytes on the
stack – compiler cannot predict n
– not a problem in Iota9: arrays heap-allocated, stack
frame has constant size
CS 4120 Introduction to Compilers
15
C: char *x = (char *)malloc(strlen(s) + 1);
C++: Foo *f = new Foo(…);
Java: Foo f = new Foo(…);
String s = s1 + s2;
Iota: x: int[5];
y: int[] = “hello”
CS 4120 Introduction to Compilers
16
Program Heap
• Program has 4 memory areas: code segment, stack
segment, static data, heap
• Two typical virtual memory layouts (depends on OS):
static data .data
pc
code
– global variables
– string constants
– assembler has data
segment declaration (.data)
static data
stack
sp
pc
code
17
CS 4120 Introduction to Compilers
– array creation, string concatenation
• Globals statically
allocated in data segment
heap
.text
• Dynamic objects allocated in the heap
– malloc(n) returns new chunk of n bytes,
free(x) releases memory starting at x
stack
sp
heap
Object allocation
stack
sp
heap
static data
pc
code
CS 4120 Introduction to Compilers
18
Iota dynamic structures
Trivial register allocation
• Can convert abstract assembly to real
assembly easily (but generate bad code)
• Allocate every temporary to location in the
current stack frame rather than to a register
a: t [n]
a: t[]
!
a
a(2)
a(1)
a(0)
a.length
a = CALL(malloc, 4*n + 4);
MEM(a) = n;
a + 4;
• PA4: We give you malloc call with garbage collection
support (no free call needed)
CS 4120 Introduction to Compilers
19
– t1 = [ebp-4], t2=[ebp-8], t3 = [ebp-12], ...
• Every temporary stored in different place -no possibility of conflict
• Three registers needed to shuttle data in and
out of stack frame (max. # registers used by
one instruction) : e.g, eax, ecx, edx
CS 4120 Introduction to Compilers
20
Rewriting abstract code
Result
• Given instruction, replace every temporary
in instruction with one of three registers
e[acd]x
• Add mov instructions before instruction
to load registers properly
• Add mov instructions after instruction to
put data back onto stack (if necessary)
• Simple way to get working code
• Code is longer than necessary, slower
• Also can allocate temporaries to registers
until registers run out (3 temporaries on
Pentium, 20+ on MIPS, Alpha)
• Code generation technique actually used
by some compilers when all optimization
turned off (-O0)
• Will use for Programming Assignment 4
push t1
! mov eax, [ebp-4]; push eax
mov [fp+4], t3 ! ?
CS 4120 Introduction to Compilers
21
Summary
22
Where we are
•
•
•
•
Complete code generation technique
Use tiling to perform instruction selection
Arguments mapped to stack locations, locals to temporaries
Function code generated by gluing prologue, epilogue onto
body
• Dynamic structure allocation handled by relying on heap
allocation routines (malloc)
• Static structures allocated by data segment assembler
declarations
• Allocate temporaries to stack locations to eliminate use of
unbounded # of registers
• Shuttle temporaries in and out using e[acd]x regs
CS 4120 Introduction to Compilers
CS 4120 Introduction to Compilers
23
High-level source code
!
Assembly code
CS 4120 Introduction to Compilers
24
Tiling, formally
Expression tile as rule
• Each tile is really an inference rule!
tf
• Write e ! c (reg t)
“c is valid assembly for IR e, puts result into t”
ADD
– Like E$e%=c, but not a function
t1
• Write s ! c
“c is valid code for IR statement s”
e1 ! c1 (reg t1)
e2 ! c2 (reg t2)
• Translation is not a function of e/s : many possible
translations (unlike E$e%, S$e%, T$e%, etc.)
– translation relation generalizes translation function
– can apply idea to earlier translations too – but less payoff
25
CS 4120 Introduction to Compilers
Statement tiles as rules
e1 ! c1 (reg t1)
e2 ! c2 (reg t2)
MOVE(MEM(e1), e2) !
c1 ; c2 ; mov [t1], t2
MOVE
MEM
t2
t1
e1,2 ! op1
(op1 is r/m32)
MOVE
+
MOVE(e1, e2 + CONST(k)) ! add op1, k
r/m32
CONST(k)
These rules are not syntax-directed – need heuristic or dynamic
programming to choose
CS 4120 Introduction to Compilers
t2
mov tf, t1
add tf, t2
27
ADD(e1, e2) ! c1; c2; mov tf, t1; add tf, t2 (reg tf)
CS 4120 Introduction to Compilers
26