Starting a Program
Four stages take a C++ program (or a program in any
other high-level language) and execute it in memory:
• Compiler - C++ -> Assembly code
• Assembler - Assembly code -> Machine code (object file)
• Linker - Object files -> Executable
• Loader - Executable -> Execution in memory
Translation Hierarchy
C program
  -> Compiler -> Assembly language program
  -> Assembler -> Object: Machine language module
  -> Linker (also takes Object: Library routine (machine language))
  -> Executable: Machine language program
  -> Loader -> Memory
The Compiler
• The compiler transforms the C++ program into an
assembly language program, a symbolic form of
the machine language.
• High-level language programs take far fewer lines
than assembly language programs, so programmer
productivity is high.
• In 1975 many operating systems, compilers and
assemblers were written in assembly language because
compilers were inefficient and memories were small.
• Growth in memory capacity has reduced concern
about program size, and optimizing compilers now
produce assembly code as good as hand-written code.
The Assembler
• Assembly language is the interface between high-level
programming languages (PLs) and machine code.
• The assembler accepts instructions that aren't
implemented in hardware. These are called
pseudoinstructions. Their use simplifies
translation and programming.
• The pseudoinstruction move $t0,$t1 is converted
by the assembler into the true machine instruction
add $t0,$zero,$t1.
• The assembler converts a branch to a faraway
location into a branch and a jump.
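The two conversions above can be sketched in a few lines of Python. This is only a toy model, not how a real assembler is written: the function name expand, the far flag, and the generated label prefix skip_ are all hypothetical, and only move, beq and bne are handled.

```python
def expand(line, far=False):
    """Expand one assembly line into true machine instructions (toy model)."""
    op, _, rest = line.strip().partition(" ")
    args = [a.strip() for a in rest.split(",")] if rest else []
    if op == "move":
        # move rd,rs  ->  add rd,$zero,rs  (the example from the slide)
        rd, rs = args
        return [f"add {rd},$zero,{rs}"]
    if op in ("beq", "bne") and far:
        # Target too far for a branch: invert the condition so the branch
        # skips over an unconditional jump to the real target.
        rs, rt, target = args
        inverse = "bne" if op == "beq" else "beq"
        skip = "skip_" + target  # hypothetical fresh label
        return [f"{inverse} {rs},{rt},{skip}", f"j {target}", f"{skip}:"]
    # Anything else is assumed to be a true machine instruction already.
    return [line.strip()]
```

For example, expand("move $t0,$t1") gives ["add $t0,$zero,$t1"], and expand("beq $s0,$s1,L1", far=True) gives an inverted branch followed by j L1.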
The Object File
• The assembler turns the assembly code into an
object file, which contains machine code, data, and
information needed to place instructions in memory.
• The assembler must map the labels in assembly
code to addresses in machine code. This
information is kept in the symbol table.
• Once all local labels have been converted to
addresses, the symbol table holds the labels that remain
undefined, such as external data or procedures.
• Each C++ source file is translated into one
assembly code file which is then translated to one
object file.
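A minimal sketch of the label bookkeeping described above, assuming a simplified input format in which a label is a line ending in ":" and every other line is one 4-byte instruction. The function name build_symbol_table is hypothetical, and only j and jal targets are checked.

```python
def build_symbol_table(lines):
    """Return (defined labels -> address, labels left undefined)."""
    defined, code, address = {}, [], 0
    # Pass 1: a label names the address of the next instruction.
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            defined[line[:-1]] = address
        else:
            code.append(line)
            address += 4  # each MIPS instruction occupies 4 bytes
    # Pass 2: jump targets with no local definition stay in the
    # symbol table as external references for the linker.
    undefined = []
    for instr in code:
        op, _, target = instr.partition(" ")
        target = target.strip()
        if op in ("j", "jal") and target not in defined and target not in undefined:
            undefined.append(target)
    return defined, undefined
```

With the lines main:, jal sub, jal printf, sub:, jr $ra, the label sub is resolved locally while printf is left for the linker.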
Object File Structure
The object file for Unix systems contains six parts:
• Object file header - size and position of the other
parts of the file.
• Text segment - the machine code.
• Data segment - static data that comes with the
program.
• Relocation information - identifies instructions and
data that depend on absolute addresses when the
program is loaded into memory.
• Symbol table - the labels that remain undefined, such as
external references.
• Debugging information - links machine instructions
to C++ statements.
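The six parts can be pictured as one record. The sketch below is an illustrative Python model, not a real file format: the class name ObjectFile and its field layout are assumptions, and the sample values are borrowed from the Procedure A object file that appears in this transcript.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectFile:
    """Toy model of the six parts of a Unix object file."""
    header: dict  # name plus text and data sizes
    text: dict    # address -> instruction
    data: dict    # address -> static data item
    relocation: list = field(default_factory=list)  # (address, instruction type, dependency)
    symbols: dict = field(default_factory=dict)     # label -> address, None while undefined
    debug: dict = field(default_factory=dict)       # address -> source statement

    def undefined_labels(self):
        """Labels the linker still has to resolve."""
        return [label for label, addr in self.symbols.items() if addr is None]

proc_a = ObjectFile(
    header={"name": "Procedure A", "text_size": 0x100, "data_size": 0x20},
    text={0: "lw $a0, 0($gp)", 4: "jal 0"},
    data={0: "X"},
    relocation=[(0, "lw", "X"), (4, "jal", "B")],
    symbols={"X": None, "B": None},
)
```

Calling proc_a.undefined_labels() lists X and B, the two references that only the linker can fill in.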
The Linker
• If the whole program is translated as one unit, a
change to a single line requires compiling and
assembling the whole program. This is wasteful: most
of the code is untouched by the programmer, and even
code the programmer didn't write, such as the
standard libraries, is recompiled.
• An alternative is to compile and assemble each
procedure independently. A change to a procedure
then requires recompiling only that procedure.
• The link editor or linker takes all the independent
object files and links them together.
• The output of the linker is the executable file or
executable.
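Pictorial Description
[Figure: two object files and the C library feed the linker, which produces one executable file.]
Object file (main) - instructions: main: jal ??? ... jal ???
    relocation records: call, sub and call, printf
Object file (sub) - sub: ...
C library - printf: ...
Executable file - main: jal printf ... jal sub, followed by printf: ... and sub: ...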
Linking Steps
There are 3 steps for the linker:
• Place code and data symbolically in memory.
• Determine the addresses of data and instruction
labels.
• Patch both the internal and external references.
The linker uses the relocation information and symbol
table in each object module to find all undefined
labels. These labels are found in branch and jump
instructions and in data addresses. It finds the old
addresses and replaces them with new addresses.
It is faster to "patch" the code
than recompile.
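The three steps can be sketched as follows. This is a toy model under stated assumptions: modules are lists of text lines, a label is a line ending in ":", only jal instructions are patched, and the function name link and the module names are hypothetical. The text base address is the one used for MIPS program code.

```python
def link(modules, text_base=0x00400000):
    """modules: list of (name, lines). Returns (memory image, symbol addresses)."""
    symbols, placed, address = {}, [], text_base
    # Steps 1 and 2: place the modules one after another and record
    # the final address of every label.
    for _name, lines in modules:
        for line in lines:
            if line.endswith(":"):
                symbols[line[:-1]] = address
            else:
                placed.append((address, line))
                address += 4
    # Step 3: patch each jal with the final address of its target.
    image = {}
    for address, instr in placed:
        op, _, target = instr.partition(" ")
        if op == "jal":
            if target not in symbols:
                raise KeyError(f"undefined reference to {target}")
            instr = f"jal {symbols[target]:#x}"
        image[address] = instr
    return image, symbols
```

Linking main (which calls printf and sub) with a printf module and a sub module replaces both symbolic calls with absolute addresses; a call to a label that no module defines raises an error, which mirrors an unresolved external reference.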
Memory Locations
• Once all the external references are resolved, the
linker determines the memory locations of all
procedures and data.
• Since the files were assembled separately, the
assembler can't know where a module's code and
data will reside in memory relative to other
modules.
• When the linker places a module in memory, all
absolute references (memory addresses that are not
relative to a register) must be relocated to their true
addresses.
MIPS Memory Allocation
• The stack starts at the top of memory ($sp) and
grows down towards the data segment.
• The program code starts at 0x00400000.
• The static data starts at 0x10000000. Dynamic data
(data allocated by new) starts right after it.
• $gp is positioned to make it easy to access the
static data.
Address (hex)   Contents
7fff ffff       Stack ($sp points here; grows down)
                Dynamic data (grows up)
1000 8000       $gp points here
1000 0000       Static data
0040 0000       Text (pc starts here)
0               Reserved
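A small sketch of why $gp sits at 1000 8000 hex: a load or store carries a signed 16-bit offset, so a base register in the middle of a 64 KB window reaches the whole window. The constants come from the memory map above; the function names segment and gp_reach are hypothetical, and the coarse "data or stack" label is an assumption, because the boundary between dynamic data and stack moves at run time.

```python
TEXT_BASE = 0x00400000    # program code starts here
STATIC_BASE = 0x10000000  # static data starts here
GP = 0x10008000           # value of $gp
STACK_TOP = 0x7FFFFFFF    # top of the stack region in the map

def segment(address):
    """Coarse classification of an address against the memory map."""
    if address < TEXT_BASE:
        return "reserved"
    if address < STATIC_BASE:
        return "text"
    if address <= STACK_TOP:
        return "data or stack"
    return "outside the map shown"

def gp_reach():
    """Lowest and highest address reachable as offset($gp), offset in [-32768, 32767]."""
    return GP - 0x8000, GP + 0x7FFF
```

gp_reach() returns the pair (0x10000000, 0x1000FFFF), so the first 64 KB of static data can be accessed with a single instruction relative to $gp.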
Object File 1

Object file header
  Name        Procedure A
  Text size   0x100
  Data size   0x20

Text segment
  Address   Instruction
  0         lw $a0, 0($gp)
  4         jal 0
  …         …

Data segment
  Address   Contents
  0         (X)
  …         …

Relocation info
  Address   Instruction type   Dependency
  0         lw                 X
  4         jal                B

Symbol table
  Label   Address
  X       -
  B       -
Object File 2

Object file header
  Name        Procedure B
  Text size   0x200
  Data size   0x30

Text segment
  Address   Instruction
  0         sw $a1, 0($gp)
  4         jal 0
  …         …

Data segment
  Address   Contents
  0         (Y)
  …         …

Relocation info
  Address   Instruction type   Dependency
  0         sw                 Y
  4         jal                A

Symbol table
  Label   Address
  Y       -
  A       -
Executable File

Executable file header
  Text size   0x300
  Data size   0x50

Text segment
  Address      Instruction
  0x00400000   lw $a0, 0x8000($gp)
  0x00400004   jal 0x400100
  …            …
  0x00400100   sw $a1, 0x8020($gp)
  0x00400104   jal 0x400000
  …            …

Data segment
  Address      Contents
  0x10000000   (X)
  …            …
  0x10000020   (Y)
  …            …
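The numbers in the executable can be re-derived from the two object file headers. The sketch below assumes the modules are laid out back to back in the order A, B, that $gp holds 0x10008000, and that each module has one data label and one call; the dictionary keys and the function name link_example are hypothetical.

```python
TEXT_BASE, DATA_BASE, GP = 0x00400000, 0x10000000, 0x10008000

objects = [
    {"name": "A", "text_size": 0x100, "data_size": 0x20, "data_label": "X", "calls": "B"},
    {"name": "B", "text_size": 0x200, "data_size": 0x30, "data_label": "Y", "calls": "A"},
]

def link_example(objects):
    """Return ({module: (gp offset, jal target)}, (text size, data size))."""
    proc, data = {}, {}
    text_addr, data_addr = TEXT_BASE, DATA_BASE
    for obj in objects:  # place code and data, module after module
        proc[obj["name"]] = text_addr
        data[obj["data_label"]] = data_addr
        text_addr += obj["text_size"]
        data_addr += obj["data_size"]
    patched = {}
    for obj in objects:
        # Offset from $gp, written as a 16-bit two's complement field.
        offset = (data[obj["data_label"]] - GP) & 0xFFFF
        patched[obj["name"]] = (offset, proc[obj["calls"]])
    return patched, (text_addr - TEXT_BASE, data_addr - DATA_BASE)
```

X lands at 0x10000000, which is 0x8000 below $gp, and as a 16-bit field -0x8000 is written 0x8000. Y lands at 0x10000020, giving 0x8020. The summed sizes reproduce the 0x300 and 0x50 in the executable file header.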
The Executable File
• Contains a header, the text segment and the data
segment.
• The separate modules now reside together in the
text and data segments. All the unresolved
addresses in the link stage are now resolved.
• This file can now be run on the computer.
• In the debug stage of development the executable
will contain debug information. After development is
finished the file is stripped of debug information.
The Loader
The loader performs the following steps (Unix):
• Reads the executable to find out the size of the text
and data.
• Creates an address space large enough.
• Copies the instructions and data into memory.
• Copies parameters to the main program onto the
stack.
• Initializes the machine's registers and sets the stack
pointer.
• Jumps to a start-up procedure that copies the
parameters into the argument registers and calls
the main procedure (main() in C++).
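The loader steps can be walked through with a toy simulation. Everything here is an assumption made for illustration: the executable is a dictionary of word lists, memory is a sparse dictionary, the stack starts at the word-aligned address 0x7FFFFFFC, each parameter occupies one word, and the start-up procedure is folded into the last two lines.

```python
TEXT_BASE, DATA_BASE, STACK_TOP = 0x00400000, 0x10000000, 0x7FFFFFFC

def load(executable, argv):
    """Return (memory image, initial registers, (text size, data size))."""
    text, data = executable["text"], executable["data"]
    sizes = (4 * len(text), 4 * len(data))  # read the sizes of text and data
    memory = {}                             # create the address space
    for i, word in enumerate(text):         # copy the instructions
        memory[TEXT_BASE + 4 * i] = word
    for i, word in enumerate(data):         # copy the data
        memory[DATA_BASE + 4 * i] = word
    sp = STACK_TOP
    for arg in reversed(argv):              # copy the parameters onto the stack
        sp -= 4
        memory[sp] = arg
    # Initialize the registers and set the stack pointer.
    regs = {"$sp": sp, "$a0": 0, "$a1": 0, "pc": TEXT_BASE}
    # Start-up procedure: parameters go into the argument registers
    # before control passes to main.
    regs["$a0"], regs["$a1"] = len(argv), sp
    return memory, regs, sizes
```

Loading a two-instruction program with the parameters prog and x leaves $sp eight bytes below the stack top, with prog at the address $sp points to.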