Starting a Program

The 4 stages that take a C++ program (or a program in any other high-level language) and execute it in internal memory are:
• Compiler - C++ -> Assembly code
• Assembler - Assembly code -> Machine code (object file)
• Linker - Object files -> Executable
• Loader - Executable -> Execution in memory

Translation Hierarchy

C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, in machine language) -> Linker -> Executable: machine language program -> Loader -> Memory

The Compiler
• The compiler transforms the C++ program into an assembly language program, a symbolic form of the machine language.
• High-level language programs can be written in far fewer lines than assembly language programs, so programmer productivity is high.
• In 1975 many operating systems, compilers and assemblers were written in assembly language because compilers were inefficient and memories were small.
• The increase in memory capacity has reduced the concern about program size, and optimizing compilers now produce assembly code as good as that written by programmers.

The Assembler
• Assembly language is the interface between high-level programming languages (PLs) and machine code.
• The assembler can accept instructions that aren't implemented in hardware. These are called pseudoinstructions. Their use simplifies translation and programming.
• The pseudoinstruction move $t0,$t1 is converted by the assembler into the true machine instruction add $t0,$zero,$t1.
• The assembler converts a branch to a faraway location into a branch and a jump.

The Object File
• The assembler turns the assembly code into an object file, which contains machine code, data, and the information needed to place instructions in memory.
• The assembler must map the labels in the assembly code to addresses in machine code. This information is kept in the symbol table.
• After all labels have been converted to addresses, the symbol table contains the remaining labels that aren't defined, such as external data or procedures.
• Each C++ source file is translated into one assembly code file, which is then translated into one object file.

Object File Structure

The object file for Unix systems contains six parts:
• Object file header - size and position of the other parts of the file.
• Text segment - the machine code.
• Data segment - static data that comes with the program.
• Relocation information - identifies instructions and data that depend on absolute addresses when the program is loaded into memory.
• Symbol table - the remaining labels that are not defined, such as external references.
• Debugging information - links machine instructions to C++ statements.

The Linker
• If a program is translated as a single unit, a change to one line requires compiling and assembling the whole program. This is wasteful: most of the code was not touched by the programmer, and even code the programmer didn't write, such as the standard libraries, is recompiled.
• An alternative is to compile and assemble each procedure independently. A change to a procedure then requires compiling only that procedure.
• The link editor, or linker, takes all the independent object files and links them together.
• The output of the linker is the executable file, or executable.

Pictorial Description

[Figure: the linker takes three inputs and produces one executable file. The inputs are an object file defining sub, an object file whose main contains two unresolved calls (jal ???) with relocation records "call, sub" and "call, printf", and the C library containing printf. In the executable file the calls in main are patched to jal printf and jal sub, and main, printf and sub reside together.]

Linking Steps

There are 3 steps for the linker:
• Place code and data symbolically in memory.
• Determine the addresses of data and instruction labels.
• Patch both the internal and external references.

The linker uses the relocation information and symbol table in each object module to find all undefined labels. These labels occur in branch and jump instructions and in data addresses. The linker finds the old addresses and replaces them with the new addresses. It is faster to "patch" the code than to recompile it.
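The three linking steps can be sketched as a toy linker in C++. This is a minimal illustration under stated assumptions, not a real object-file format: the types ObjectModule, Relocation and Executable and the function linkModules are hypothetical names, each "instruction" is a bare 32-bit word, and patching overwrites the whole word with the absolute address instead of encoding it into a jal or lw instruction field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified object module (not a real object-file layout).
struct Relocation {
    std::size_t offset;  // index of the word to patch within the module's code
    std::string symbol;  // label that the word depends on
};

struct ObjectModule {
    std::string name;
    std::vector<std::uint32_t> code;             // placeholder instruction words
    std::map<std::string, std::size_t> defined;  // label -> word index in this module
    std::vector<Relocation> relocations;         // words that need patching
};

struct Executable {
    std::vector<std::uint32_t> code;               // all modules, concatenated
    std::map<std::string, std::uint32_t> symbols;  // label -> absolute address
};

Executable linkModules(const std::vector<ObjectModule>& modules,
                       std::uint32_t textBase) {
    Executable exe;

    // Steps 1 and 2: place the modules one after another starting at textBase
    // and record the absolute address of every defined label (4 bytes per word).
    std::uint32_t next = textBase;
    for (const ObjectModule& m : modules) {
        for (const auto& def : m.defined) {
            exe.symbols[def.first] =
                next + 4 * static_cast<std::uint32_t>(def.second);
        }
        next += 4 * static_cast<std::uint32_t>(m.code.size());
    }

    // Step 3: copy each module's code and patch every relocated word.
    for (const ObjectModule& m : modules) {
        const std::size_t start = exe.code.size();
        exe.code.insert(exe.code.end(), m.code.begin(), m.code.end());
        for (const Relocation& r : m.relocations) {
            const auto it = exe.symbols.find(r.symbol);
            if (it == exe.symbols.end()) {
                throw std::runtime_error("undefined reference to " + r.symbol);
            }
            // Simplification: the whole word becomes the target address.
            exe.code.at(start + r.offset) = it->second;
        }
    }
    return exe;
}
```

Label addresses are computed for all modules before any patching begins, so a module can refer to a label defined in a module that is placed later. A label that no module defines is reported as an undefined reference, which is the situation in which a real linker also fails.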

Memory Locations
• Once all the external references are resolved, the linker determines the memory locations of all procedures and data.
• Since the files were assembled separately, the assembler can't know where a module's code and data will reside in memory relative to other modules.
• When the linker places a module in memory, all absolute references (memory addresses that are not relative to a register) must be relocated to their true addresses.

MIPS Memory Allocation
• The stack starts at the top of memory, pointed to by $sp, and grows down towards the data segment.
• The program code starts at 0x00400000.
• The static data starts at 0x10000000. Dynamic data (data allocated by new) starts right after it.
• $gp is positioned to make it easy to access the static data.

  Address (hex)   Region                      Register
  7fff ffff       Stack (grows down)          $sp
                  Dynamic data (grows up)
  1000 8000                                   $gp
  1000 0000       Static data
  0040 0000       Text                        pc
  0               Reserved

Object File 1

  Object file header
    Name: Procedure A    Text size: 0x100    Data size: 0x20
  Text segment
    Address   Instruction
    0         lw $a0, 0($gp)
    4         jal 0
    …         …
  Data segment
    Address   Contents
    0         (X)
    …         …
  Relocation information
    Address   Instruction type   Dependency
    0         lw                 X
    4         jal                B
  Symbol table
    Label   Address
    X       -
    B       -

Object File 2

  Object file header
    Name: Procedure B    Text size: 0x200    Data size: 0x30
  Text segment
    Address   Instruction
    0         sw $a1, 0($gp)
    4         jal 0
    …         …
  Data segment
    Address   Contents
    0         (Y)
    …         …
  Relocation information
    Address   Instruction type   Dependency
    0         sw                 Y
    4         jal                A
  Symbol table
    Label   Address
    Y       -
    A       -

Executable File

  Executable file header
    Text size: 0x300    Data size: 0x50
  Text segment
    Address      Instruction
    0x00400000   lw $a0, 0x8000($gp)
    0x00400004   jal 0x400100
    …            …
    0x00400100   sw $a1, 0x8020($gp)
    0x00400104   jal 0x400000
    …            …
  Data segment
    Address      Contents
    0x10000000   (X)
    …            …
    0x10000020   (Y)
    …            …

The Executable File
• Contains a header, the text segment and the data segment.
• The separate modules now reside together in the text and data segments. All the addresses left unresolved before the link stage are now resolved.
• This file can now be run on the computer.
• During the debug stage of development the executable contains debug information. After development is finished the file is stripped of debug information.

The Loader

The loader performs the following steps (Unix):
• Reads the executable to find out the size of the text and data.
• Creates an address space large enough for them.
• Copies the instructions and data into memory.
• Copies the parameters of the main program onto the stack.
• Initializes the machine's registers and sets the stack pointer.
• Jumps to a start-up procedure that copies the parameters into the argument registers and calls the main procedure (main() in C++).
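The loader steps above can be sketched as a small simulation in C++. It is a rough model under stated assumptions, not how a Unix loader is implemented: ExecutableImage, Process and load are hypothetical names, the address space is a flat byte vector that holds text, data and stack back to back, and the final jump to the start-up procedure is only represented by setting pc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical executable image: just the two segments, no real header.
struct ExecutableImage {
    std::vector<std::uint8_t> text;  // machine code
    std::vector<std::uint8_t> data;  // static data
};

// Hypothetical process state produced by the toy loader.
struct Process {
    std::vector<std::uint8_t> memory;  // flat toy address space
    std::size_t dataStart = 0;         // where the data segment was copied
    std::size_t sp = 0;                // stack pointer (index into memory)
    std::size_t pc = 0;                // where execution would begin
    int argc = 0;                      // number of parameters on the stack
};

Process load(const ExecutableImage& exe,
             const std::vector<std::string>& args,
             std::size_t stackSize) {
    Process p;

    // 1. Read the executable to find the sizes of text and data.
    const std::size_t textSize = exe.text.size();
    const std::size_t dataSize = exe.data.size();

    // 2. Create an address space large enough (zero-filled).
    p.memory.assign(textSize + dataSize + stackSize, 0);

    // 3. Copy the instructions and data into memory.
    p.dataStart = textSize;
    std::copy(exe.text.begin(), exe.text.end(), p.memory.begin());
    std::copy(exe.data.begin(), exe.data.end(),
              p.memory.begin() + static_cast<std::ptrdiff_t>(p.dataStart));

    // 4. Copy the parameters onto the stack, which grows down from the top.
    p.sp = p.memory.size();
    for (const std::string& a : args) {
        const std::size_t bytes = a.size() + 1;  // include the terminating NUL
        if (bytes > p.sp - (textSize + dataSize)) {
            throw std::length_error("stack too small for parameters");
        }
        p.sp -= bytes;
        std::copy(a.begin(), a.end(),
                  p.memory.begin() + static_cast<std::ptrdiff_t>(p.sp));
        p.memory[p.sp + a.size()] = 0;
    }

    // 5. Initialize the "registers"; sp was already set while pushing.
    p.argc = static_cast<int>(args.size());

    // 6. A real loader would now jump to the start-up procedure; here the
    //    entry point is simply recorded as the start of the text segment.
    p.pc = 0;
    return p;
}
```

Calling load with a four-byte text segment, a two-byte data segment, the parameters "prog" and "x", and a 16-byte stack yields a 22-byte address space in which the text occupies the lowest addresses and the parameter strings sit at the top of the stack region.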