
Toward a Self-Stabilizing Operating System

Introduction

For a system to be self-stabilizing, it is necessary and sufficient that every part of it is self-stabilizing [Dolev]. In a real computing system this property is required of the hardware and of every software component, up to the algorithms that determine or implement the system's behavior. One essential part of most computer systems is the operating system (OS). Its main goals are managing the hardware resources and presenting higher-level software with an abstract (virtual) machine [Tanenbaum].

This article describes the steps of the research toward a Self-Stabilizing Operating System (SSOS). To build an SSOS we could take the top-down direction and start by developing self-stabilizing versions of standard OS algorithms such as process scheduling or memory management. We could even apply those changes to an open-source OS. This approach, while very practical, makes correctness very difficult to prove, since the resulting system is already composed of various quite complicated components. Instead we took the bottom-up approach: we looked for the most elementary part of an OS and tried to stabilize it.

Usually, right after a computer is booted, the core of the OS is loaded into the computer's Random Access Memory (RAM). This is done by procedures residing in non-volatile Read-Only Memory (ROM), usually referred to as the Basic Input/Output System (BIOS). Thus, once the OS is loaded into memory, the minimum we would like to guarantee is that the code that was loaded is correct. Even assuming that the OS itself is self-stabilizing, the memory holding the OS code may be corrupted (e.g. by soft errors), leading to a situation in which the OS does not converge to a valid state. To deal with this problem we would like our OS to have an essential procedure that resides in ROM and periodically reloads the OS and starts it again. The Intel Pentium processor was chosen for implementing the SSOS.
In the coming sections we describe the solution in detail. We pick a platform for implementing the solution and prove that the solution is correct. …….

The OS loader

We lay out a few general assumptions about the system. The underlying hardware is stabilized and contains a CPU that keeps fetching, decoding and executing instructions from memory. Additionally, there is a mechanism that is guaranteed, from time to time, to make the CPU stop its regular cycle, jump to a procedure residing in ROM and start fetching instructions from there. This procedure reloads the OS from ROM (e.g. CD-ROM) into memory in a way that does not depend at all on the previous machine state. Toward its end, the procedure sets the CPU to start executing the OS code from its beginning. This loading procedure ensures that, once in a while and in a predefined way that does not depend on any memory-resident software, the OS starts again and thus enters a valid state.

Hardware Platform - Model

In this section we describe the hardware platform that was chosen, concentrating on the information needed for describing the solution.

The processor

We chose the Intel Pentium processor for implementing the SSOS [Intel 2003]. It is a ubiquitous processor which we already had at hand. Additionally, its manuals are published and available, and we used them to write the code and verify its correctness. The processor is built from millions of transistors and is a commercial product, so we cannot guarantee that we know its full state; instead we describe its state in terms of its documented registers.

Interrupts {16.1.4}

When an interrupt occurs, the processor does the following:

1. Pushes the values of the CS and IP registers onto the stack.
2. Pushes the FLAGS register onto the stack.
3. Clears the IF flag in the FLAGS register.
4. Transfers program control (i.e. loads CS and IP) to the location specified in the interrupt table.
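The final step above, the lookup in the interrupt table, can be sketched in a few lines of Python. This is only an illustrative model, not processor documentation: the helper names `read_vector` and `write_vector` and the flat `bytearray` memory are assumptions made here, and the entry layout used (4-byte entries starting at address 0, a 16-bit offset followed by a 16-bit segment, little-endian) is the standard real-mode one. The handler address 0xF000:0x0100 is an arbitrary sample value.

```python
import struct

ENTRY_SIZE = 4  # one real-mode vector: 16-bit offset (IP), then 16-bit segment (CS)


def read_vector(memory, vector, table_base=0):
    """Return the (CS, IP) pair stored in the table entry for `vector`."""
    ip, cs = struct.unpack_from("<HH", memory, table_base + vector * ENTRY_SIZE)
    return cs, ip


def write_vector(memory, vector, cs, ip, table_base=0):
    """Store a handler address in the table (hypothetical helper for this example)."""
    struct.pack_into("<HH", memory, table_base + vector * ENTRY_SIZE, ip, cs)


# Model 1 MByte of zeroed memory and install a sample handler for vector 2.
memory = bytearray(1 << 20)
write_vector(memory, 2, cs=0xF000, ip=0x0100)

# On an interrupt with vector 2 the CPU would load CS:IP from bytes 8..11.
handler_cs, handler_ip = read_vector(memory, 2)
```

In this model a table held in ROM simply corresponds to a region of `memory` that is never written after initialization.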
The IRET instruction, usually executed at the end of an interrupt procedure, reverses these steps to resume the interrupted program.

NMI

A Non-Maskable Interrupt (NMI) is generated by external hardware that asserts the NMI pin. (Starting from the Pentium 4 it can also be generated by an instruction.) When this happens the processor immediately transfers control to the handler pointed to by vector 2. This interrupt cannot be masked by the IF flag. While an NMI is being handled, additional NMIs are disabled. NMIs can be disabled with external circuitry. {3/9.9.1}

Timer

As said, we need a mechanism for starting the loading procedure once in a while. The natural candidate is the interrupt mechanism built into the processor. In particular, the standard PC timer interrupt ticks every 55 milliseconds. However, most interrupts, including the timer interrupt, can be disabled as a result of instructions executed by the CPU that change its state. The CPU has an external pin for the Non-Maskable Interrupt (NMI), which cannot be disabled in this way. We can connect this input pin to a hardware device called a watchdog, which can be configured to assert the pin once every specified period. [Liskov] also assumes such a mechanism for a fault-tolerant system.

In the event of an interrupt, the CPU jumps to a memory address stored in another area of memory called the Interrupt Vector Table. According to our assumptions this table, and in particular the entry for the NMI, must also reside in ROM. This assumption, while not implemented in standard computers, is not unrealistic, since occasionally there are suggestions to implement full OSs in the BIOS. [see…]

Processor states - Basic Program execution

Real mode {Intel 3/16.1}

The IA-32 architecture has three operation modes (protected mode, real-address mode and system management mode) and a fourth, virtual-8086 mode, that is provided within protected mode. In real-address mode the processor executes programs written for the Intel 8086 processor.
However, programs running in this mode can explicitly use processor resources added in later generations. A second mode, virtual-8086 mode, is part of the regular mode (protected mode) and enables concurrent execution of real-mode programs. We chose to work in real mode because of its relative simplicity and the availability of documentation. When the processor is booted it is placed in real mode, probably for backward-compatibility reasons.

Mode switching: the PE flag (bit 0 of control register CR0) controls the processor mode, and it is changed with MOV instructions to and from CR0. The process of switching back to real mode is {3/9.9.2}:

1. Disable interrupts (CLI).
2. If paging is enabled, disable it.
3. Transfer control to a readable segment.
4. Load the segment registers with values appropriate for real mode.
5. Execute LIDT to point to an interrupt vector table.
6. Clear the PE flag in CR0: MOV EBX, CR0; AND EBX, 0xFFFFFFFE; MOV CR0, EBX
7. Enable interrupts (STI + external hardware).

The CPU state is assumed to be totally determined by the values of its registers. We therefore list all of the published Pentium registers and their functions, based on Chapter 3, 'Basic Execution Environment', of the 'IA-32 Intel Architecture Software Developer's Manual, Volume 1: Basic Architecture', and on {Intel 3/16.1}.

Address Space

The processor supports addressing of 1 MByte of physical address space. This space is divided into segments of 64 KBytes. The base of a segment is specified by a 16-bit selector (stored in one of the segment registers), which is extended with four zero bits on the right (i.e. multiplied by 16) to form a 20-bit offset from address 0 in the address space. To address a specific location, a 16-bit operand is added to the segment base to form the actual address.

Registers

There are 8 general-purpose registers, 6 segment registers, the EFLAGS register (program status word) and the EIP register (program counter). Among other tasks, these registers are used for program flow control and for addressing memory, both of which our code relies on.
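The segment arithmetic described under Address Space can be written out as a small worked example. This is a sketch under stated assumptions: the function name `physical_address` is introduced here for illustration, and the result is truncated to 20 bits to match the 1-MByte address space described above (8086-style wraparound; whether a given machine wraps at 1 MByte depends on its configuration). The segment values 0x1000 and 0x3000 are just sample inputs.

```python
def physical_address(segment, offset):
    """Real-mode translation: segment * 16 + offset, truncated to 20 bits.

    The 20-bit truncation is an assumption of this sketch, matching the
    1-MByte address space of the model described in the text.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF


# Sample translations (segment:offset -> physical address)
base_of_0x1000 = physical_address(0x1000, 0x0000)   # 0x10000
last_byte = physical_address(0x3000, 0x00FF)        # 0x300FF
wrapped = physical_address(0xFFFF, 0x0010)          # wraps around to 0x00000
```

Note that different segment:offset pairs can name the same physical byte, e.g. 0x1000:0x0010 and 0x1001:0x0000 both translate to 0x10010.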
(Other uses are integer arithmetic and stack operations, which we do not use.) Out of these registers we will consider only…16 bit…

Control registers: 5 control registers determine the operation mode of the processor. {part 3, chap 2, }

We briefly mention here the other registers, which are not assumed to influence our code:

x87 FPU registers: floating-point operations.
MMX and XMM registers: SIMD operations.
Memory management registers: 4 registers for specifying the location of data structures in protected mode, which we do not use.

The I/O ports are also part of the basic execution environment and are not used either. There are also performance-monitoring counters and internal caches and buffers. (3/2.1.7)

Stack: single, 16 bits wide, in memory; we do not use it yet.

Interrupt table: single, called 'the interrupt vector table' or just 'interrupt table'. It is an array of 4-byte entries that resides by default at address 0; each entry is a pointer (or vector) to the procedure that handles the interrupt. On the 8086 the table is not relocatable; on IA-32 processors the base of the table can be changed through the IDTR register, using the LIDT instruction.

Instructions

In real mode the processor has a set of instructions that can be executed, as specified by Intel's manual. {3/16.1.3}

Loader procedure

The loading procedure is presented in this processor's assembly language.

    OS_SEGMENT      equ 0x1000      ; RAM segment the OS is loaded to
    OS_ROM_SEGMENT  equ 0x3000      ; segment of the OS image in ROM

     0.  enter realmode             ; pseudo-step: switch the CPU to real mode
     1.  mov ax, OS_ROM_SEGMENT
     2.  mov ds, ax                 ; DS = source segment
     3.  mov ax, OS_SEGMENT
     4.  mov es, ax                 ; ES = destination segment
     5.  mov si, 0x00               ; source index
     6.  mov di, 0x00               ; destination index
     7.  mov cx, 0x100              ; number of bytes to copy
     8.  cld                        ; copy upward
     9.  rep movsb                  ; copy CX bytes from DS:SI to ES:DI
    10.  mov ax, OS_SEGMENT
    11.  mov es, ax
    12.  jmp OS_SEGMENT:0           ; far jump to the start of the OS

Code explanation:

Line 1-2: Load the segment address of the ROM OS code into register DS. This is the source of the copy operation during the loading procedure.
Line 3-4: Load the segment address of the OS code in RAM into register ES. This is the destination of the copy.
Line 5-6: Initialize the index registers SI and DI to zero; they point to the current addresses within the segments during the copy.
Line 7: Set register CX to 0x100 (256); this determines how many bytes are copied.
Line 8: Clear the direction flag in the FLAGS register, so that the index registers are incremented after each copy step.
Line 9: Perform the actual copy. In each step a byte is copied from DS:SI to ES:DI, then SI and DI are incremented and CX is decremented. The copy stops when CX reaches 0.
Line 10-11: The ES register is loaded with the segment address of the OS code.
Line 12: A far jump is made to offset 0 of the OS segment. This loads the CS register with the segment address and the IP register with 0, and the CPU starts fetching the first OS instruction from memory.

Proof:

OS Stabilizer

Future Work

Reference

More

Compiled with nasm
Soft errors?
Warm boot
Disabling interrupts in the code.
REP
Formal verification methods and tools, or a program that models it and checks all possibilities.
Mention that not all the commands were explored
3/18.22.2 After an NMI happens, further NMIs are masked until the handler returns (IRET).
We assume the processor cannot get stuck! (2/3.2 Instruction reference: even after a HLT instruction, an NMI wakes it up)
Program is 1 segment (64K)
Watchdog: Liskov paper assumes it too: http://www.pmg.lcs.mit.edu/~castro/osdi2000/node2.html
IA32 explanation
Part of the proof is that the NMI procedure terminates.
EPROM instead of ROM
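One of the notes above suggests a program that models the loader and checks possibilities. The following Python sketch is one way such a model might start; it is not a proof and not part of the implementation. It assumes a flat 1-MByte `bytearray` as memory, an incorruptible ROM region, and the segment values and copy length from the listing. The names `run_loader`, `random_bytes` and `check_convergence` are hypothetical. Registers do not appear as inputs because every register the copy depends on is overwritten by the loader before it is used.

```python
import random

OS_SEGMENT = 0x1000      # RAM segment the OS runs from (value from the listing)
OS_ROM_SEGMENT = 0x3000  # segment holding the OS image in ROM
COPY_LENGTH = 0x100      # value loaded into CX


def run_loader(memory):
    """Model lines 1-12 of the loader; return the (CS, IP) pair after the jump."""
    ds, es = OS_ROM_SEGMENT, OS_SEGMENT          # lines 1-4
    si = di = 0                                  # lines 5-6
    cx = COPY_LENGTH                             # line 7
    while cx != 0:                               # lines 8-9: cld / rep movsb
        memory[(es << 4) + di] = memory[(ds << 4) + si]
        si = (si + 1) & 0xFFFF
        di = (di + 1) & 0xFFFF
        cx -= 1
    return OS_SEGMENT, 0                         # lines 10-12: far jump target


def random_bytes(rng, count):
    """Return `count` pseudo-random bytes drawn from `rng`."""
    return rng.getrandbits(8 * count).to_bytes(count, "little")


def check_convergence(trials=20, seed=0):
    """Sample arbitrary memory states and check that the loader restores the OS."""
    rng = random.Random(seed)
    image = random_bytes(rng, COPY_LENGTH)       # stand-in for the OS image
    rom, ram = OS_ROM_SEGMENT << 4, OS_SEGMENT << 4
    for _ in range(trials):
        memory = bytearray(random_bytes(rng, 1 << 20))   # arbitrary (corrupted) state
        memory[rom:rom + COPY_LENGTH] = image            # assumption: ROM is intact
        target = run_loader(memory)
        if target != (OS_SEGMENT, 0) or memory[ram:ram + COPY_LENGTH] != image:
            return False
    return True
```

Calling `check_convergence()` samples only a handful of starting states; an exhaustive check or a formal argument would still be needed for the proof section.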
