Assignment # 1: Student Documentation
Exception Handling and System Calls

Getting Started

Before getting started, let's see how user programs work in Nachos. (Nachos, as you all know, is an operating system simulator rather than a real operating system.) First, user programs are compiled and linked with start.s. When you invoke nachos with the "-x" flag, the MIPS simulator begins executing the specified user program, instruction by instruction (i.e., the MIPS simulator is an interpreter that reads in binary instructions and simulates their effect). Whenever an event occurs that the OS needs to know about (e.g., a system call), the simulator makes a procedure call into the Nachos kernel, to the function ExceptionHandler in the file exception.cc. Everything you need to know about the event (e.g., which system call, its parameters, etc.) will be in registers.

What Are System Calls?

System calls are extended instructions or APIs that enable you (users or user programs) to interact with the operating system kernel. Examples are the create, open, and close functions you use in C or C++ to create, open, or close a file. These functions are actually system calls. A system call causes a trap into the operating system kernel (in the example above the kernel is Unix, but in your case it will be Nachos), and the operating system then performs the requested task. Thus, executing a system call involves a switch from user mode to kernel mode.

In Nachos you have to implement the following system calls:

o Create
o Open
o Read
o Write
o Close

You will implement the system calls in the file exception.cc, which is in the userprog directory. For this project you will mainly make modifications in the userprog directory and add C programs as test files in the test directory. These test cases are actually user programs. You will write these user programs in order to check your system call implementations.
Following the Test Case Path

By following a test case, you will learn how a process (an executable file or executable program) is executed in an operating system in general, and how it is executed in Nachos specifically. Normally you would run a test case, say 'halt', from the userprog directory by typing

    nachos -x ../test/halt

at the prompt. To follow the test case path, run DDD and in the run command option type in

    -x ../test/halt -d

Here "-x" stands for execution, and "../test/halt" is the path name of the halt executable file (the process or executable program). main.cc should be displayed on the DDD screen after running the above command. Set a breakpoint at the Initialize function in main.cc.

The Initialize Function

Step into the Initialize function to get an idea of which data structures are initialized. Besides the initialization of the scheduler, interrupt, and stats, when currentThread gets initialized you will notice that space is also initialized to NULL in the Thread constructor. This corresponds to the address space associated with currentThread, which we will allocate later (this will be the address space of the process associated with this thread). The most important initialization is that of the machine. Step into the Machine constructor (machine/machine.cc). All its registers are initialized to 0. Main memory, which is accessed through a character pointer, is also set to 0. The memory size is:

    MemorySize     NumPhysPages * PageSize   (defined in machine/machine.h)
    PageSize       SectorSize                (defined in machine/machine.h)
    SectorSize     128                       (defined in machine/disk.h)
    NumPhysPages   32                        (defined in machine/machine.h)

So the total memory size is 128 * 32 = 4096 bytes: the number of physical pages is 32 and the PageSize is 128 bytes. Also observe that the tlb and pageTable are NULL. pageTable is a data structure of type TranslationEntry.
("TranslationEntry" is defined in translate.h.) We will talk about the page table later. Step out of the Initialize function and back to main.cc. Set a breakpoint at the StartProcess function in main.cc. StartProcess is inside the USER_PROGRAM macro guard. This function is defined in the file progtest.cc in the userprog directory, and its main purpose is to run user processes. Step into the StartProcess function.

The StartProcess Function

You should look at the StartProcess function very carefully and understand it completely. This function takes as input a filename (the file contains a binary program, such as "halt", to run). First it opens the file. Then it creates a new address space. The parameter to the constructor for AddrSpace is a file pointer; the constructor, among other things, will read the contents of that file (i.e., the code and data) into the address space. Next, it sets the currentThread's address space pointer. (Look back in thread.h and thread.cc: when USER_PROGRAM is defined, as it is for this assignment, the Thread class has an additional field, namely a pointer to an address space.) Whereas previously threads only executed in the kernel, now a thread may be running a user program. That is why you need an address space pointer (which is NULL if the thread only executes kernel code). Remember, a process is a thread + an address space. Here, the address space pointer for the current thread is set equal to the just-created address space. Then it initializes the address space's registers.

NOTE: If you look at the code/test directory you will find the files named halt.c and halt.coff, and the executable file "halt". We will see later how these are produced, but for now consider that halt is a user process and your operating system has to execute it. If you look at the file halt.c, you will see that it calls the Halt system call. This Halt system call is already implemented for you in the file exception.cc in the userprog directory.
To start the execution, we need to open the executable file halt. Step into the Open function, which is defined in code/filesys/filesys.h. Also look at the files filesys.h and openfile.h carefully. The file is opened using fileSystem->Open(....), which returns an OpenFile pointer. Just look at the FILESYS_STUB portion in filesys.h. The other classes within "#else" will be used in the filesystem project (Assignment # 4), so you don't need to worry about them.

PLEASE AVOID CONFUSION (EXTREMELY IMPORTANT, READ CAREFULLY)

DO NOT CONFUSE YOURSELF BY LOOKING AT THE FILES FILESYS.CC OR OPENFILE.CC. ALSO, YOU WILL SEE TWO TYPES OF DECLARATIONS OF THE OPENFILE AND FILESYS CLASSES IN FILESYS.H AND OPENFILE.H, BUT YOU ARE ONLY REQUIRED TO LOOK AT THE ONES UNDER THE MACRO GUARD FILESYS_STUB. THESE HAVE THEIR MEMBER FUNCTIONS DEFINED IN FILESYS.H AND OPENFILE.H.

For system calls like Create, Open, Read, Write and Close, these member functions of the OpenFile and FileSystem classes directly call Unix system calls to actually perform the tasks. You will be required to use the global fileSystem object to call these member functions (Create and Open). Read and Write will be explained later.

Following the execution path, we see that an address space is allocated for this executable using the AddrSpace constructor, which is defined in addrspace.cc in the userprog directory. The argument of the AddrSpace constructor is an OpenFile pointer. Step into the AddrSpace constructor.

Note that a file noff.h is defined in the bin directory. Nachos has its own executable format, "noff", which is different from the Unix format, "coff". Please read the Nachos Road Map, page 15, article "User-Level Processes", for details. To make an executable for Nachos, the coff file has to be converted to the noff format; this is already done when the files in the test directory are compiled. See the Makefile in the test directory.
The line

    executable->ReadAt((char *)&noffH, sizeof(noffH), 0);

(the ReadAt function is in openfile.h in the filesys directory) does the following: it reads from the file starting at position 0 (the third argument of ReadAt). The second argument says how many bytes to read, here sizeof(noffH), which is the size of the structure NoffHeader. The bytes are stored in the buffer at &noffH (cast to char*). So essentially it reads the file header from the executable file.

What information does the file header contain? It gives the addresses and sizes of the code, initialized data, and uninitialized data segments, and all of this is contained in the first 40 bytes of the executable file. You can verify that by looking at the structures in the file noff.h in the bin directory.

    if ((noffH.noffMagic != NOFFMAGIC) &&
        (WordToHost(noffH.noffMagic) == NOFFMAGIC))
        SwapHeader(&noffH);
    ASSERT(noffH.noffMagic == NOFFMAGIC);

This checks the byte order of the header. If the magic number does not match as read but does match after byte swapping, the executable was written with the opposite endianness from the host machine, so SwapHeader converts every field of the header. This is rarely needed; it depends on the machine.

    size = noffH.code.size + noffH.initData.size + noffH.uninitData.size;
    // The size of the program is calculated as code + initialized data + uninitialized data.

    numPages = divRoundUp(size, PageSize) + divRoundUp(UserStackSize, PageSize);
    // We need to increase the size to leave room for the stack.

Since the number of pages must be an integer, each quotient is rounded up. divRoundUp is defined in utility.h. PageSize is defined in machine.h as SectorSize, which in turn is defined in disk.h as 128 bytes. UserStackSize is defined in addrspace.h and is 1024 bytes. This is the execution stack. The first stack is located at the top.

    stackpage = divRoundUp(size, PageSize) + 1;
    size = numPages * PageSize;    // The size is recalculated due to the round-up.
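The page-count arithmetic above can be checked with a small standalone sketch. It is not the Nachos source: the constants repeat the values quoted in the text, divRoundUp is written here as a plain function standing in for the one in utility.h, and pagesNeeded is a hypothetical helper name introduced only for illustration.

```cpp
#include <cassert>

// Values quoted in the text (machine/disk.h, machine/machine.h, userprog/addrspace.h).
constexpr int SectorSize = 128;
constexpr int PageSize = SectorSize;
constexpr int NumPhysPages = 32;
constexpr int MemorySize = NumPhysPages * PageSize;  // 128 * 32 = 4096 bytes
constexpr int UserStackSize = 1024;

// Stand-in for divRoundUp: integer division that rounds up (n >= 0, s > 0).
constexpr int divRoundUp(int n, int s) { return (n + s - 1) / s; }

// Hypothetical helper: number of pages for a program whose code and data
// segments total segmentBytes, using the formula quoted in the text.
int pagesNeeded(int segmentBytes) {
    assert(segmentBytes >= 0);
    return divRoundUp(segmentBytes, PageSize) + divRoundUp(UserStackSize, PageSize);
}
```

For example, a program with 300 bytes of code and data needs 3 pages for its segments plus 8 pages for the stack, 11 pages in total, which fits comfortably in the 32 physical pages.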
The user page table is initialized as:

    pageTable = new TranslationEntry[numPages];
    for (i = 0; i < numPages; i++) {
        pageTable[i].virtualPage = i;    // for now, virtual page # = phys page #
        pageTable[i].physicalPage = i;
        pageTable[i].valid = TRUE;
        pageTable[i].use = FALSE;
        pageTable[i].dirty = FALSE;
        pageTable[i].readOnly = FALSE;   // if the code segment was entirely on
                                         // a separate page, we could set its
                                         // pages to be read-only
    }

Have a look at translate.cc and machine.cc in the machine directory and see how the TranslationEntry data structure is designed. We see that each virtual page number is translated to the same physical page number in physical memory. This will no longer be the case when you implement the Exec system call (for multiprogramming).

First the machine's main memory is set to zero:

    bzero(machine->mainMemory, size);    // Check this with "man bzero"

Now the code and initialized data are copied from the file into mainMemory. mainMemory is actually a char* buffer that gets initialized when the global machine object is created in the Initialize function in system.cc.

    if (noffH.code.size > 0) {
        DEBUG('a', "Initializing code segment, at 0x%x, size %d\n",
              noffH.code.virtualAddr, noffH.code.size);
        executable->ReadAt(&(machine->mainMemory[noffH.code.virtualAddr]),
                           noffH.code.size, noffH.code.inFileAddr);
    }
    if (noffH.initData.size > 0) {
        DEBUG('a', "Initializing data segment, at 0x%x, size %d\n",
              noffH.initData.virtualAddr, noffH.initData.size);
        executable->ReadAt(&(machine->mainMemory[noffH.initData.virtualAddr]),
                           noffH.initData.size, noffH.initData.inFileAddr);
    }

The above code uses the following values:

o noffH.code.inFileAddr = the starting address of the code segment in the executable file. (This is relative to the file.) The starting point will be 40, right after the header.
o noffH.code.size = the size of the code segment.
o &(machine->mainMemory[noffH.code.virtualAddr]) = the address in main memory to which this data has to be copied. Since for now the physical page number is the same as the virtual page number, mainMemory is indexed by noffH.code.virtualAddr.
o noffH.code.virtualAddr = the virtual address of the code segment (relative to the virtual address space, which starts from 0).

The same is the case for noffH.initData, but since noffH.initData.size is 0 in our case, there is no initialized data in our executable. The address space is now created.

After the user process address space has been constructed, control moves to the next instruction in the StartProcess function, in which the process address space gets associated with the currentThread. Then the machine registers are initialized. These registers are the general purpose registers, the program counter register, the next program counter register and the stack register. The PC register is loaded with the address 0 (the starting virtual address). Since each instruction is 4 bytes long, the NextPC register is loaded with 4. The stack register is loaded with the top address of the stack, less a small allowance at the end.

In RestoreState the machine page table is loaded with the user page table. Remember that the machine's pageTable was set to NULL in the Machine constructor; pageTableSize is now given the size of the user page table.

Then the machine->Run() function is called. In this function the mode is first switched from kernel mode to user mode, and then the function OneInstruction is called. After each instruction executes, OneTick() advances the simulated timer by one tick. The infinite loop is enforced through "for(;;)". OneInstruction is defined in the machine directory in the file mipssim.cc. It has a switch statement with a lot of cases. Each case corresponds to a MIPS instruction. Since Nachos simulates the MIPS architecture, each MIPS (assembly) instruction is decoded. You will now see how an instruction gets executed.
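Because the page table built above maps every virtual page to the physical page with the same number, address translation at this stage is simple arithmetic. The following is a minimal sketch under stated assumptions, not the code in translate.cc: Entry is a reduced stand-in for TranslationEntry, makeIdentityTable and translate are hypothetical names, and returning -1 stands in for raising an exception.

```cpp
#include <cassert>
#include <vector>

constexpr int PageSize = 128;  // value quoted in the text

// Reduced stand-in for TranslationEntry: only the fields used below.
struct Entry {
    int virtualPage;
    int physicalPage;
    bool valid;
};

// Build the identity-mapped page table described in the text.
std::vector<Entry> makeIdentityTable(int numPages) {
    std::vector<Entry> table(numPages);
    for (int i = 0; i < numPages; i++) {
        table[i] = Entry{i, i, true};
    }
    return table;
}

// Split a virtual address into page number and offset, look the page up,
// and rebuild the physical address. Returns -1 when the address is outside
// the table or the page is invalid (an assumption made for this sketch).
int translate(const std::vector<Entry>& table, int vaddr) {
    if (vaddr < 0) return -1;
    int vpn = vaddr / PageSize;
    int offset = vaddr % PageSize;
    if (vpn >= static_cast<int>(table.size()) || !table[vpn].valid) return -1;
    return table[vpn].physicalPage * PageSize + offset;
}
```

With the identity table, the first instruction at virtual address 0 is fetched from physical address 0. Once Exec is implemented and a page is mapped elsewhere, the same arithmetic yields a different physical address.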
If you look closely at the OneInstruction function, you will see that it calls machine->ReadMem to get the instruction from physical memory. It passes the virtual address of the instruction, which is stored in the register PCReg. For the first instruction this is 0, and it increases by 4 for each subsequent instruction. The second argument of ReadMem is the size in bytes to be read; since an instruction consists of 4 bytes, 4 is passed. The third argument is a pointer to an integer, the variable raw, in which the contents of the instruction are stored. The machine->ReadMem function, which is defined in the machine directory in the file translate.cc, translates this virtual address to a physical address by passing it to the Translate function. The Translate function takes the virtual address and translates it to the physical address by doing calculations on the pageTable or the TLB. In your case it will be the pageTable. Translate returns an exception type; if there is no error it returns NoException. After decoding, OneInstruction enters the switch statement and executes the instruction.

Exception Handling and System Calls

Before starting the implementation of the system calls, you need to understand some concepts related to exceptions and system calls.

Exception and Exception Handling

An exception occurs when something unusual happens in an operating system. There are various types of exceptions. These are defined in machine/machine.h.

    enum ExceptionType {
        NoException,           // Everything ok!
        SyscallException,      // A program executed a system call.
        PageFaultException,    // No valid translation found
        ReadOnlyException,     // Write attempted to a page marked "read-only"
        BusErrorException,     // Translation resulted in an invalid physical address
        AddressErrorException, // Unaligned reference or one that was beyond the end of the address space
        OverflowException,     // Integer overflow in add or sub.
        IllegalInstrException, // Unimplemented or reserved instr.
        NumExceptionTypes
    };

Mainly we will be dealing with SyscallException. A SyscallException occurs when the user program/process wants the operating system to do something. Nachos defines the following 11 system calls. Their prototypes are defined in userprog/syscall.h and the implementation is to be done in userprog/exception.cc.

o Halt -- Simply halts the system
o Exit -- Exits the process
o Exec -- Executes a process
o Join -- Waits for another process to finish
o Create -- Creates a file
o Open -- Opens a file
o Read -- Reads from a file or the console (keyboard)
o Write -- Writes to a file or the console (monitor)
o Close -- Closes an opened file
o Fork -- Forks a thread within a process address space
o Yield -- Yields the current thread

Proceeding along the path, put a breakpoint at OP_SYSCALL. While stepping through the OneInstruction function (machine/mipssim.cc), you will see that when the Halt() system call (called by the user process in halt.c) is encountered, the switch statement of OneInstruction switches to case OP_SYSCALL. There it calls the RaiseException function defined in machine/machine.cc. It passes SyscallException as the first parameter and 0 as the second parameter. This 0 will be saved as the bad virtual address in the machine's BadVAddrReg. The mode changes from user to kernel, since the operating system has to catch this exception and do what the user process asks, which is to halt the machine. It thus calls ExceptionHandler. This is known as a trap to the operating system, since the mode has been changed from user to kernel.

ExceptionHandler is defined in exception.cc. It takes the code of the system call from register 2. (All the codes are defined in syscall.h.) It then executes the code corresponding to that system call. For example, in the case of Halt, it simply halts the machine.
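The dispatch step described above, reading the system call code from register 2 and branching on it, can be sketched outside Nachos as follows. This is an illustration under assumptions: MockMachine is a hypothetical stand-in for the global machine object, ExceptionType is shortened to three values, handlerFor is an invented name that only reports which branch would run, and the SC_ numbers are the ones I recall from the stock syscall.h, so check them against your copy.

```cpp
#include <cassert>
#include <string>

// Shortened stand-in for the enum in machine/machine.h.
enum ExceptionType { NoException, SyscallException, PageFaultException };

// System call codes; assumed to match the stock userprog/syscall.h.
enum SyscallCode {
    SC_Halt = 0, SC_Exit = 1, SC_Exec = 2, SC_Join = 3, SC_Create = 4,
    SC_Open = 5, SC_Read = 6, SC_Write = 7, SC_Close = 8, SC_Fork = 9,
    SC_Yield = 10
};

// Hypothetical mock of the machine: just a register file.
struct MockMachine {
    int registers[40] = {0};
    int ReadRegister(int num) const {
        assert(num >= 0 && num < 40);
        return registers[num];
    }
};

// Mirrors the if / else-if chain in ExceptionHandler, but returns the name
// of the branch taken instead of performing the call.
std::string handlerFor(const MockMachine& machine, ExceptionType which) {
    int type = machine.ReadRegister(2);  // the stub put the call code in r2
    if ((which == SyscallException) && (type == SC_Halt)) return "Halt";
    else if ((which == SyscallException) && (type == SC_Create)) return "Create";
    else if ((which == SyscallException) && (type == SC_Open)) return "Open";
    else if ((which == SyscallException) && (type == SC_Read)) return "Read";
    else if ((which == SyscallException) && (type == SC_Write)) return "Write";
    else if ((which == SyscallException) && (type == SC_Close)) return "Close";
    return "unexpected";
}
```

In the real handler each branch would carry out the call rather than return its name; the sketch only shows how the exception type and register 2 together select the branch.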
NOTE: IMPORTANT

After the execution of every system call, the program counter and next program counter should be incremented. This can be done once, at the end of the exception handler function in exception.cc. When control leaves the exception handler function, the mode is set back to user mode in the RaiseException function, and control then returns to case OP_SYSCALL (in the OneInstruction function). The next statement there is "return", so it returns without incrementing the program counter for the next instruction. In contrast, for all other instructions the OneInstruction function increments the program counter before it exits. The code for incrementing the program counter is given in the Nachos Road Map. You can use that.

SYSTEM CALLS IMPLEMENTATION

Before we go to the implementation, let's look at a few things.

What are Stubs for?

Stubs are defined in the test directory in the file "start.s". These stubs are short assembly routines that let user programs issue system calls to the Nachos kernel. Each system call has a stub associated with it. If we look carefully, we see that each stub copies the code number of its system call into register r2. Moreover, if the system call has arguments, those arguments go into registers r4, r5, r6 and r7 respectively. That is, we have argument1 = r4, argument2 = r5, argument3 = r6 and argument4 = r7. Also, the return value of the system call should be stored in register r2. (In future, if you need to add a new system call, you will have to add a stub here, similar to the ones already implemented for this project. You don't need to add anything here for your project.)

What is syscall.h?

syscall.h resides in the userprog directory. It contains the macros defining the code numbers of the system calls and the prototypes of the system calls. The user programs which you will write for your test cases will follow the format of the system calls defined in syscall.h.
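The program counter update mentioned in the note can be sketched on a mock register file. This is a hedged illustration, not the Road Map listing: MockMachine and advancePC are hypothetical names, and the register indices are the ones I recall from machine/machine.h, so verify them there before relying on them.

```cpp
#include <cassert>

// Register indices assumed to match machine/machine.h.
constexpr int PCReg = 34;
constexpr int NextPCReg = 35;
constexpr int PrevPCReg = 36;
constexpr int NumTotalRegs = 40;

// Hypothetical mock of the machine's register file.
struct MockMachine {
    int registers[NumTotalRegs] = {0};
    int ReadRegister(int num) const {
        assert(num >= 0 && num < NumTotalRegs);
        return registers[num];
    }
    void WriteRegister(int num, int value) {
        assert(num >= 0 && num < NumTotalRegs);
        registers[num] = value;
    }
};

// Move past the syscall instruction: remember the old PC, make the next PC
// current, and point the next PC one 4-byte instruction further on.
void advancePC(MockMachine& machine) {
    machine.WriteRegister(PrevPCReg, machine.ReadRegister(PCReg));
    machine.WriteRegister(PCReg, machine.ReadRegister(NextPCReg));
    machine.WriteRegister(NextPCReg, machine.ReadRegister(NextPCReg) + 4);
}
```

If this step is skipped, the PC still points at the syscall instruction when control returns to user mode, so the same system call would be executed again and again.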
The implementations of all system calls go in exception.cc. Following is an explanation of each system call and a suggested way of implementing it. Remember that this is not the only way of implementing these calls.

void Create (char* filename)

The Create system call creates a file with the name "filename" given as the parameter. Here is how you can implement this call.

o Add your code in exception.cc in the userprog directory.
o Add the Create system call code in the following else if condition:

    else if ((which == SyscallException) && (type == SC_Create)) { }

o You CAN copy and paste the Translate() and ReadMem() functions into exception.cc and change the copies directly, but you CAN'T make changes to these functions in the machine directory.
o Define the linked list globally.

int Open (char* filename)

The Open system call opens a file with the name "filename" given as the parameter. Here is how you can implement this call.

o Add your code in exception.cc in the userprog directory.
o Add the Open system call code in the following else if condition:

    else if ((which == SyscallException) && (type == SC_Open)) { }

o DO NOT allocate 0 or 1 as the open file ID, as these two values are used for console input (keyboard) and console output (monitor) respectively.
o The unique file ID written to register r2 is returned to the user program as the return value and is later used for the Read() and Write() system calls.

int Close (int fileId)

The Close system call closes the file with the id "fileId" given as the parameter. Here is how you can implement this call.

o Add your code in exception.cc in the userprog directory.
o Add the Close system call code in the following else if condition:

    else if ((which == SyscallException) && (type == SC_Close)) { }

o When you close a file, remove its corresponding OpenFile pointer from the linked list.
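One possible shape for the global list of open files, and for copying the filename argument out of simulated memory, is sketched below. Everything here is an assumption made for illustration: OpenEntry stores a name where the real list would hold an OpenFile pointer, OpenFileList and readUserString are hypothetical names, ids are never reused, and the string copy relies on the identity mapping (virtual address equals physical address) described earlier.

```cpp
#include <cassert>
#include <list>
#include <string>

// Stand-in list node: the real one would hold an OpenFile* instead of a name.
struct OpenEntry {
    int id;
    std::string name;
};

// Hypothetical open-file list. Ids start at 2 because 0 and 1 are reserved
// for console input and console output.
class OpenFileList {
public:
    int add(const std::string& name) {
        int id = nextId_++;
        entries_.push_back(OpenEntry{id, name});
        return id;  // the handler would write this value to register r2
    }
    const OpenEntry* find(int id) const {
        for (const OpenEntry& e : entries_) {
            if (e.id == id) return &e;
        }
        return nullptr;
    }
    bool remove(int id) {  // used by Close
        for (auto it = entries_.begin(); it != entries_.end(); ++it) {
            if (it->id == id) {
                entries_.erase(it);
                return true;
            }
        }
        return false;
    }
private:
    std::list<OpenEntry> entries_;
    int nextId_ = 2;
};

// Copy a NUL-terminated string starting at addr in simulated memory.
// Fails if it runs off the end of memory or finds no NUL within maxLen bytes.
bool readUserString(const char* memory, int memorySize, int addr,
                    int maxLen, std::string& out) {
    out.clear();
    for (int i = 0; i < maxLen; i++) {
        int a = addr + i;
        if (a < 0 || a >= memorySize) return false;
        if (memory[a] == '\0') return true;
        out.push_back(memory[a]);
    }
    return false;
}
```

In this sketch, Create and Open would call readUserString with the address found in register r4, Open would call add and return the id through r2, and Close would call remove with the id from r4.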
int Read (char* buffer, int numbytes, int fileid)

The Read system call reads "numbytes" bytes from the file with the id "fileid" given as the parameter (if fileid is not 0), or from the console (if fileid is 0). Here is how you can implement this call.

o Add your code in exception.cc in the userprog directory.
o Add the Read system call code in the following else if condition:

    else if ((which == SyscallException) && (type == SC_Read)) { }

int Write (char* buffer, int numbytes, int fileid)

The Write system call writes "numbytes" bytes to the file with the id "fileid" given as the parameter (if fileid is not 1), or to the console (if fileid is 1). Here is how you can implement this call.

o Add your code in exception.cc in the userprog directory.
o Add the Write system call code in the following else if condition:

    else if ((which == SyscallException) && (type == SC_Write)) { }
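The routing rule for Read and Write (id 0 is the keyboard, id 1 is the monitor, anything else is an open file) can be sketched with in-memory stand-ins. This is only an illustration: MockFile, MockIO, doRead and doWrite are hypothetical names, strings stand in for real files and the console, and returning -1 on an invalid id is an assumption rather than something the assignment specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

constexpr int ConsoleInput = 0;   // keyboard
constexpr int ConsoleOutput = 1;  // monitor

// In-memory stand-in for a file with a read position.
struct MockFile {
    std::string data;
    std::size_t pos = 0;
};

// Hypothetical I/O state: a keyboard buffer, a screen buffer, open files by id.
struct MockIO {
    MockFile keyboard;
    std::string screen;
    std::map<int, MockFile> files;
};

// Write: id 1 goes to the console, other ids must name an open file.
int doWrite(MockIO& io, const std::string& data, int fileId) {
    if (fileId == ConsoleOutput) {
        io.screen += data;
        return static_cast<int>(data.size());
    }
    if (fileId == ConsoleInput) return -1;  // cannot write to the keyboard
    auto it = io.files.find(fileId);
    if (it == io.files.end()) return -1;    // no such open file
    it->second.data += data;
    return static_cast<int>(data.size());
}

// Read: id 0 comes from the console, other ids must name an open file.
// Returns the number of bytes actually read.
int doRead(MockIO& io, std::string& out, int numBytes, int fileId) {
    if (numBytes < 0 || fileId == ConsoleOutput) return -1;
    MockFile* src = nullptr;
    if (fileId == ConsoleInput) {
        src = &io.keyboard;
    } else {
        auto it = io.files.find(fileId);
        if (it == io.files.end()) return -1;
        src = &it->second;
    }
    out = src->data.substr(src->pos, static_cast<std::size_t>(numBytes));
    src->pos += out.size();
    return static_cast<int>(out.size());
}
```

In the real handler the buffer address, byte count and file id would come from registers r4, r5 and r6, and the data would be copied between the user buffer in simulated memory and the file or console.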