Toward a Self-Stabilizing Operating System
Introduction
For a system to be self-stabilizing, it is necessary and sufficient that
every part of it is self-stabilizing [Dolev].
For a real computing system this means that the hardware must have this
property, and so must the software components, up to the algorithms that
determine or carry out the system's behavior.
One such essential part of most computer systems is the operating system
(OS). Its main goals are to manage the hardware resources and to present
higher-level software with an abstract (virtual) machine [Tanenbaum].
This article describes the steps of the research toward a Self-Stabilizing
Operating System (SSOS).
To build an SSOS we could take the top-down direction and start by
developing self-stabilizing versions of standard OS algorithms such as
process scheduling or memory management. We could even apply those changes
to an open-source OS.
This approach, while very practical, makes correctness very difficult to
prove, since the system is already composed of various quite complicated
components.
Instead we took the bottom-up approach: we looked for the most elementary
part of an OS and tried to stabilize it.
Usually, right after a computer is booted, the core of the OS is loaded
into the computer’s Random Access Memory (RAM). This is done by procedures
residing in non-volatile Read-Only Memory (ROM), usually referred to as the
Basic Input/Output System (BIOS).
Thus, after the OS is loaded into memory, we would like to guarantee, as a
minimum, that the code that was loaded is correct.
Even assuming that the OS itself is self-stabilizing, the memory holding
the OS code may be corrupted (e.g. by soft errors), leading to a situation
in which the OS does not converge to a valid state.
To deal with this problem we would like our OS to have an essential
procedure, residing in ROM, that periodically reloads the OS and starts it
again.
The Intel Pentium processor was chosen for implementing the SSOS.
In the coming sections we describe the solution in detail: we pick a
platform for implementing the solution and prove that the solution is
correct.
…….
The OS loader
We first lay out a few general assumptions about the system.
We have underlying stabilized hardware, which contains a CPU that keeps
fetching, decoding and executing instructions from memory.
Additionally, we have a mechanism that is guaranteed, from time to time, to
make the CPU stop its regular cycle, “jump” to a procedure residing in ROM
and start fetching instructions from there.
This procedure reloads the OS from ROM (e.g. CD-ROM) into memory in a way
that does not depend at all on the previous machine state.
Toward its end, the procedure sets the CPU to start executing the OS code
from its beginning.
This loading procedure ensures that, once in a while, the OS is restarted
in a predefined way that does not depend on any memory-resident software,
and thus enters a valid state.
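The assumptions above can be illustrated with a toy model. This is only a sketch: the image contents, the period and the names OS_IMAGE, WATCHDOG_PERIOD, reload_os and run are hypothetical and not part of the SSOS code. It shows that, whatever RAM holds initially, it equals the ROM image after at most one watchdog period.

```python
# Toy model: RAM is periodically overwritten from an immutable ROM image.
OS_IMAGE = bytes(range(16))   # hypothetical OS image held in ROM
WATCHDOG_PERIOD = 5           # hypothetical number of ticks between reloads


def reload_os(ram: bytearray) -> None:
    """Copy the whole ROM image into RAM, ignoring whatever RAM held before."""
    ram[:len(OS_IMAGE)] = OS_IMAGE


def run(ram: bytearray, ticks: int) -> int:
    """Advance the model by `ticks` ticks; return the number of reloads."""
    reloads = 0
    for tick in range(1, ticks + 1):
        if tick % WATCHDOG_PERIOD == 0:   # the watchdog fires
            reload_os(ram)
            reloads += 1
    return reloads


ram = bytearray(b"\xff" * len(OS_IMAGE))  # arbitrary (corrupted) initial state
run(ram, WATCHDOG_PERIOD)
print(bytes(ram) == OS_IMAGE)
```

The reload never reads RAM, which is the property the argument relies on: the state after a reload is a function of the ROM contents only.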
Hardware Platform - Model
In this section we describe the hardware platform that was chosen. We
concentrate on the information needed to describe the solution.
The processor
We chose the Intel Pentium processor for implementing the SSOS.
This is a ubiquitous processor which we already had at hand [Intel 2003].
Additionally, its manuals are published and available; we used them to
write the code and verify its correctness.
This processor is built from millions of transistors and is a commercial
product, so we can’t guarantee that we know its full state. Instead we
describe its state in terms of its documented registers.
Interrupts {16.1.4}
When an interrupt occurs the processor does the following:
Pushes the FLAGS register onto the stack
Clears the IF flag in the FLAGS register
Pushes the values of the CS and IP registers onto the stack
Transfers program control (i.e. loads CS and IP) to the location specified
in the interrupt table.
The IRET instruction, usually executed at the end of the interrupt
procedure, reverses these steps to resume the interrupted program.
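The entry and return sequence can be modelled in a few lines. This is a simplified sketch, not processor documentation: the register dictionary, the list used as a stack and the handler address are hypothetical, and the trap flag is ignored. The model pushes FLAGS first, then CS, then IP, which is the order implied by IRET popping IP, CS and FLAGS.

```python
IF_MASK = 0x0200  # bit 9 of FLAGS is the interrupt-enable flag (IF)


def enter_interrupt(cpu, stack, ivt, vector):
    """Save FLAGS, CS and IP, clear IF, then load the handler address."""
    stack.append(cpu["flags"])
    cpu["flags"] &= ~IF_MASK & 0xFFFF      # maskable interrupts now disabled
    stack.append(cpu["cs"])
    stack.append(cpu["ip"])
    cpu["ip"], cpu["cs"] = ivt[vector]     # each entry is (offset, segment)


def iret(cpu, stack):
    """Undo enter_interrupt: pop IP, CS and FLAGS in that order."""
    cpu["ip"] = stack.pop()
    cpu["cs"] = stack.pop()
    cpu["flags"] = stack.pop()


cpu = {"cs": 0x1000, "ip": 0x0042, "flags": 0x0202}   # IF is set
stack = []
ivt = {2: (0x0000, 0xF000)}   # hypothetical handler address for vector 2
saved = dict(cpu)

enter_interrupt(cpu, stack, ivt, 2)
in_handler = dict(cpu)        # state seen by the first handler instruction
iret(cpu, stack)
print(cpu == saved)
```

Because FLAGS is saved before IF is cleared, IRET restores the interrupted program's original IF value.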
NMI
A non-maskable interrupt (NMI) is generated by external hardware that
asserts the NMI pin. (Starting from Pentium 4 it can be generated by
instruction also.)
When this happens the processor immediately moves to the handler pointed
to by vector 2.
This interrupt cannot be masked by the IF flag.
While an NMI is being handled, additional NMIs are disabled.
NMIs can be disabled with external circuitry. {3/9.9.1}
Timer
As said, we need a mechanism for starting the loading procedure once in a
while.
The natural candidate is the interrupt mechanism built into the processor.
In particular, a standard PC has a timer (external to the processor) that
raises an interrupt roughly every 55 milliseconds.
Most interrupts, including the timer interrupt, can be masked: instructions
executed by the CPU can change its state so that these interrupts are
ignored.
The CPU has an external pin for the Non-Maskable Interrupt (NMI), which
cannot be masked.
We can connect this input pin to a hardware device called a watchdog, which
can be configured to assert the pin once every specified period.
[Liskov] also assumes this mechanism for a fault-tolerant system.
In the event of an interrupt, the CPU jumps to a memory address stored in
another area of memory called the Interrupt Vector Table.
According to our assumptions this table, and in particular the entry for
the NMI, must also reside in ROM.
This assumption, while not implemented in standard computers, is not so
unrealistic, since there are occasional suggestions to implement full OSs
in the BIOS. [see…]
Processor states - Basic Program execution
Real mode {Intel 3/16.1}
The IA-32 architecture has three operating modes: protected mode,
real-address mode and system management mode. A fourth, virtual-8086
mode, is a sub-mode of protected mode.
In real-address mode (or simply real mode) the processor executes programs
written for the Intel 8086 processor.
However, programs running in this mode can explicitly use processor
resources added in later generations.
Virtual-8086 mode is part of the regular mode (protected mode) and enables
real-mode programs to run concurrently.
We chose to work in real mode because of its relative simplicity and the
availability of documentation.
When the processor is powered up or reset it is put in real mode, probably
for backward-compatibility reasons.
Mode switching:
The PE flag (bit 0 of control register CR0) controls the processor mode.
It is changed with MOV CR0 instructions.
The process for switching back to real mode is: {3/9.9.2}
Disable interrupts (CLI)
If paging is enabled, disable it (clear the PG flag in CR0)
Transfer control to a readable segment
Load the segment registers with values appropriate for real mode
Execute LIDT to point to an interrupt vector table
Clear the PE flag in CR0: MOV EBX, CR0; AND EBX, 0xFFFFFFFE; MOV CR0, EBX
Enable interrupts (STI + external hardware)
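The bit manipulation in the PE-clearing step can be checked with a short sketch. CR0 is modelled as a plain 32-bit integer; the sample value and the helper names clear_pe and in_protected_mode are hypothetical. The sketch also shows why the mask should be a full 32 bits wide: a 16-bit mask such as 0xFFFE would clear bits 16 to 31 of CR0 as well.

```python
CR0_PE = 1 << 0    # protection enable: 1 = protected mode, 0 = real mode
CR0_PG = 1 << 31   # paging enable


def clear_pe(cr0: int) -> int:
    """Return CR0 with only the PE bit cleared (all other bits preserved)."""
    return cr0 & ~CR0_PE & 0xFFFFFFFF


def in_protected_mode(cr0: int) -> bool:
    return bool(cr0 & CR0_PE)


cr0 = 0x60000011                # hypothetical value: bits 30, 29, 4 and 0 set
assert not cr0 & CR0_PG         # paging must already be off at this step

new_cr0 = clear_pe(cr0)
narrow = cr0 & 0xFFFE           # what a 16-bit-wide mask would leave
print(hex(new_cr0), in_protected_mode(new_cr0), hex(narrow))
```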
The CPU state is assumed to be totally determined by the values of its
different registers.
So next we list all of the published Pentium registers and their function.
This is based on Chapter 3, Basic Execution Environment, of the ‘IA-32
Intel Architecture Software Developer’s Manual, Volume 1: Basic
Architecture’ and on {Intel 3/16.1}.
Address Space
The processor supports addressing of 1 MByte of physical address space.
This space is divided into segments of 64 KBytes.
The base of a segment is specified by a 16-bit selector (stored in one of
the segment registers), which is extended with four zero bits on the right
(i.e. shifted left by 4 bits) to form a 20-bit offset from address 0 in the
address space.
To address a specific location, a 16-bit operand is added to the segment
base to form the actual address.
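As a worked example of this calculation, the following sketch computes the physical address for a segment:offset pair. The function name linear_address is hypothetical; the sample segments 0x1000 and 0x3000 are the ones used by the loader procedure.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address: the 16-bit segment shifted left 4 bits, plus offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both fit in 16 bits")
    return (segment << 4) + offset


print(hex(linear_address(0x1000, 0x0000)))   # start of the RAM copy of the OS
print(hex(linear_address(0x3000, 0x00FF)))   # a byte inside the ROM image
# Different segment:offset pairs can name the same physical byte.
print(linear_address(0x1000, 0x0010) == linear_address(0x1001, 0x0000))
```

Note that the sum can slightly exceed 1 MByte (for segment 0xFFFF); how the processor treats such addresses is outside this sketch.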
Registers –
8 general-purpose registers, 6 segment registers, the EFLAGS register
(processor status word) and the EIP register (program counter).
These registers, among other tasks, are used for program flow control and
for addressing memory, both of which are used by our code.
(Other uses are integer arithmetic and stack operations, which we will not
use.)
Out of these registers we will consider only…16 bit…
Control registers – 5 control registers, which determine the operation
mode of the processor.
{part 3, chap 2, }
We briefly mention here the other registers, which are not assumed to
influence our code.
x87 FPU registers – floating point operations.
MMX and XMM registers – SIMD operations.
Memory management registers – 4 registers for specifying the location of
data in protected mode, which we don’t use.
Also part of the basic execution environment are the I/O ports, which will
not be used either.
There are also performance-monitoring counters and internal caches and
buffers. (3/2.1.7)
Stack – a single stack in memory, 16 bits wide, which we don't use yet.
Interrupt table – a single table, called 'the interrupt vector table' or
just 'interrupt table'. It is an array of 4-byte entries which by default
resides at address 0; each entry is a pointer (or vector) to the procedure
that handles the corresponding interrupt.
(On the 8086 the table is not relocatable. On this processor the base of
the table can actually be changed through the IDTR register, using the
LIDT instruction.)
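The table layout can be made concrete with a small sketch. Each 4-byte entry holds a 16-bit offset followed by a 16-bit segment, both little-endian, so the entry for vector n starts at base + 4n and the NMI entry (vector 2) starts at address 8 when the base is 0. The memory contents and the handler address below are hypothetical.

```python
import struct


def vector_address(vector: int, base: int = 0) -> int:
    """Address of the 4-byte table entry for the given vector number."""
    return base + 4 * vector


def read_vector(memory, vector: int, base: int = 0):
    """Return (segment, offset) of the handler stored for `vector`."""
    offset, segment = struct.unpack_from("<HH", memory, vector_address(vector, base))
    return segment, offset


memory = bytearray(1024)    # room for 256 entries, with the table at address 0
# Hypothetical NMI handler at F000:0000, stored as offset first, then segment.
struct.pack_into("<HH", memory, vector_address(2), 0x0000, 0xF000)

print(vector_address(2), read_vector(memory, 2))
```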
Instructions
In real mode the processor has a set of instructions that can be executed,
as specified by Intel's manual. {3/16.1.3}
Loader procedure
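The loading procedure is presented in this processor’s assembly language.

OS_SEGMENT      equ 0x1000      ; RAM segment the OS is loaded into
OS_ROM_SEGMENT  equ 0x3000      ; ROM segment holding the OS code

 0.  enter realmode
 1.  mov  ax, OS_ROM_SEGMENT
 2.  mov  ds, ax                ; DS = source segment
 3.  mov  ax, OS_SEGMENT
 4.  mov  es, ax                ; ES = destination segment
 5.  mov  si, 0x00
 6.  mov  di, 0x00
 7.  mov  cx, 0x100             ; number of bytes to copy
 8.  cld                        ; copy in ascending address order
 9.  rep  movsb                 ; copy CX bytes from DS:SI to ES:DI
10.  mov  ax, OS_SEGMENT
11.  mov  es, ax
12.  jmp  OS_SEGMENT:0          ; far jump: CS = OS_SEGMENT, IP = 0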
Code explanation:
Line 1-2: Load the segment address of the ROM OS code into register DS.
This will be the source of the copy operation during the loading procedure.
Line 3-4: Load the segment address of the RAM OS code into register ES.
This will be the destination of the copy.
Line 5-6: Initialize the index registers SI and DI to zero. They will point
to the actual addresses within the segments during the copy process.
Line 7: Set register CX to 0x100 (256). This determines how many bytes
will be copied.
Line 8: Clear the direction flag in the FLAGS register, so the index
registers will be incremented after each copy.
Line 9: Perform the actual copy. In each step a byte is copied from DS:SI
to ES:DI, then SI and DI are incremented and CX is decremented. The copy
stops when CX reaches 0.
Line 10-11: The ES register is loaded with the segment address of the OS
code.
Line 12: A far jump is made. This loads the CS register with the segment
address and the IP register with 0, and the CPU starts fetching the first
OS instruction from memory.
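The copy performed by lines 1 to 9 can be simulated to check the register values it leaves behind. This is a sketch under simplifying assumptions: ROM and RAM are modelled as one flat 1 MByte bytearray, the ROM contents are made-up test data, and rep_movsb is a hypothetical helper that mirrors the step-by-step description above.

```python
OS_SEGMENT = 0x1000
OS_ROM_SEGMENT = 0x3000


def rep_movsb(mem, ds, es, si, di, cx, df=0):
    """Model REP MOVSB: copy CX bytes from DS:SI to ES:DI, one byte per step."""
    step = -1 if df else 1                    # DF = 0 (after CLD) counts up
    while cx != 0:
        mem[(es << 4) + di] = mem[(ds << 4) + si]
        si = (si + step) & 0xFFFF             # index registers are 16 bits wide
        di = (di + step) & 0xFFFF
        cx -= 1
    return si, di, cx


mem = bytearray(1 << 20)                      # 1 MByte physical address space
rom_base = OS_ROM_SEGMENT << 4
ram_base = OS_SEGMENT << 4
mem[rom_base:rom_base + 0x100] = bytes(range(256))    # made-up "OS code"
mem[ram_base:ram_base + 0x100] = b"\xcc" * 0x100      # corrupted RAM copy

si, di, cx = rep_movsb(mem, OS_ROM_SEGMENT, OS_SEGMENT, 0x00, 0x00, 0x100)
print(mem[ram_base:ram_base + 0x100] == mem[rom_base:rom_base + 0x100])
print(hex(si), hex(di), cx)
```

The result depends only on the ROM bytes and on the constants loaded in lines 1 to 8, not on the previous contents of the destination.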
Proof:
OS Stabilizer
Future Work
Reference
More
Compiled with nasm
Soft errors?
Warm boot
Disabling interrupts in the code.
REP
Formal verification methods and tools, or a program that models it and
checks all possibilities.
Mention that not all the instructions were explored
3/18.22.2 After an NMI occurs, further NMIs are masked until the next IRET.
We assume the processor cannot get stuck!
(2/3.2 Instruction reference: even after the HLT instruction, an NMI wakes
it up)
Program is 1 segment (64K)
Watchdog: Liskov paper assumes it too:
http://www.pmg.lcs.mit.edu/~castro/osdi2000/node2.html
IA32 explanation
Part of the proof is that the NMI procedure terminates.
EPROM instead of ROM