Divi Pruthvi. et, al., International Journal of Technology and Engineering Science [IJTES]TM
Vol 1 (4), pp 236 – 240, July 2013
RTOS Acceleration by Reducing Overhead due to
Context-Switching
Divi Pruthvi1, Narayanaraju Samunuri2, Deevi Swathi3
1 Divi Pruthvi, [email protected]
2 Narayanaraju Samunuri, [email protected]
3 Deevi Swathi, [email protected]
Abstract— Embedded real-time applications use
multi-threading, a key concept of any conventional OS. The
advantages of multi-threading include greater throughput,
more efficient CPU use, better system reliability, and
improved performance on multiprocessor computers.
Real-time systems are all about meeting deadlines, i.e.,
delivering the output when required, without any delay.
This factor becomes more prominent in hard real-time
systems, where failure to meet a deadline leads to disaster.
However, multi-threading (or multitasking) itself introduces
a significant delay due to context-switching in a real-time
operating system, and this delay has to be reduced so that
the system meets its deadlines and delivers the required
output on time.
In this paper, the main focus is on reducing the time
taken for context-switching and improving the overall
performance of the system. This is achieved by a hardware
approach to context-switching that eliminates the need to
use external memory for saving the context: the context is
moved into the processor hardware itself while the system
continues to use multi-threading.
This approach considerably reduces the number of clock
cycles required for task execution, thereby improving the
overall execution time.
Index Terms — Context-switching, Hard RTOS, MIPS
Processor, Multi-threading.
I. INTRODUCTION
One of the key characteristics of an operating system
(OS) is its ability to handle multiple tasks at a time on a
time-sharing basis, commonly referred to as multi-tasking
[13]. It is also responsible for managing the hardware
resources of a computer and hosting the applications that
execute on it. A real-time operating system is a specialized
type of operating system, intended for real-time systems, in
which tasks must be executed without exceeding their
deadlines. Real-time systems (RTS) are those in which the
correctness of a result depends not only on its logical
behaviour but also on the time at which it is produced [1].
An OS meant for an RTS is referred to as an RTOS, where
the time at which results are produced is of major concern.
Real-time systems are classified into two types: hard and
soft real-time systems. Soft real-time systems are those
where failure to meet a deadline does not cause serious
harm. In a hard real-time (or immediate real-time) system,
the completion of an operation after its deadline is
considered useless, and this may cause a critical failure of
the complete system and can lead to an accident (e.g., the
Engine Control Unit of a car, Computer Numeric Control
machines).
An application comprises many tasks; the kernel divides
the application into logical pieces commonly called threads
and co-ordinates their execution. Each thread has its own
context at every instant, which includes the processor
registers, program status word, and program counter [8].
The scheduler, which is part of the kernel, schedules the
threads according to the scheduling mechanism chosen; by
default, threads are scheduled in a round-robin fashion,
with a prescribed time slice allotted to every thread. The
CPU is transferred from one thread to another as follows:
1. Suspend the execution of the current thread, save the
context of the current thread onto the stack, and load the
program counter with the address of the thread to which
control has to be transferred.
2. Before handing CPU control to the new thread, pop the
context of the new thread from the stack.
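The two steps above can be sketched in C as a small simulation. This is only an illustrative model, not the authors' kernel code: the names Cpu, SavedContext, save_context, restore_context and switch_thread are hypothetical, the context is assumed to be 12 words, and a per-thread save area in memory stands in for the thread's stack.

```c
#include <assert.h>
#include <stdint.h>

#define CTX_REGS 12  /* assumed context size, in registers */

/* Simulated CPU register set (hypothetical model, not real hardware). */
typedef struct { uint32_t regs[CTX_REGS]; } Cpu;

/* Per-thread save area in memory, standing in for the thread's stack. */
typedef struct { uint32_t slot[CTX_REGS]; } SavedContext;

/* Step 1: copy every context register to memory, one at a time. */
static void save_context(const Cpu *cpu, SavedContext *mem)
{
    for (int i = 0; i < CTX_REGS; i++)
        mem->slot[i] = cpu->regs[i];
}

/* Step 2: reload the registers of the thread about to run. */
static void restore_context(Cpu *cpu, const SavedContext *mem)
{
    for (int i = 0; i < CTX_REGS; i++)
        cpu->regs[i] = mem->slot[i];
}

/* Round-robin switch: save the current thread, restore the next one,
   and return the index of the thread that now owns the CPU. */
static int switch_thread(Cpu *cpu, SavedContext ctx[], int nthreads, int current)
{
    assert(nthreads > 0 && current >= 0 && current < nthreads);
    int next = (current + 1) % nthreads;
    save_context(cpu, &ctx[current]);
    restore_context(cpu, &ctx[next]);
    return next;
}
```

Every register is copied individually in both directions, which is why the cost of a memory-based switch grows with the size of the context.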
A context switch also occurs whenever an interrupt is
generated, and the interrupt can be internal or external
[10]. Function calls also cause context switching to take
place. Storing and restoring context to and from memory is
a time-consuming process and may take 50 to 80 processor
clock cycles, depending on the context size and RTOS
design. If several events happen in quick succession, the
overall performance of the system may be degraded, as most
of the time is consumed in saving and restoring the context
of different threads. To improve the responsiveness of the
system, the overhead imposed by context-switching needs to
be reduced. In general, two factors affect the cost of
context switching: the direct cost of moving the
processor’s registers to and from
ISSN: 2320 – 8007
external memory or cache, and the indirect cost of
perturbing the cache, CPU, pipeline, etc. This makes the
total cost of context switching difficult to estimate [2];
several algorithms have been developed and implemented to
reduce the direct cost of context switching [3][4][5]. The
context registers have to be saved in external memory one
at a time. We have considered the MIPS processor, where 12
registers (9 temporary registers, stack pointer, global
pointer, and program counter) have to be saved during
context-switching, which consumes 2 × 2 × 12 = 48 clock
cycles to switch the context. With the approach suggested
in this paper, the switch time can be reduced to 4 clock
cycles, independent of the number of context registers. The
suggested approach is to modify the architecture of the
MIPS processor [11] by adding register files to the
existing register bank. These register files are
implemented in the processor hardware itself as part of the
processor’s register bank module. In addition, two new
instructions are introduced into the MIPS assembler to
access the register files. To test the design and measure
its performance, a co-operative operating system involving
thread switching is defined and run on an FPGA along with
the modified MIPS architecture.
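The cycle counts quoted above can be reproduced with a few lines of C. The paper does not say what each factor in 2 × 2 × 12 stands for. This sketch assumes one factor of 2 covers the two directions (save and restore) and the other covers two clock cycles per memory access. The function and constant names are hypothetical.

```c
#include <assert.h>

/* Assumed reading of the paper's 2 x 2 x 12 product. */
enum {
    DIRECTIONS        = 2,   /* save + restore (assumption)          */
    CYCLES_PER_ACCESS = 2,   /* cycles per memory access (assumption) */
    CONTEXT_REGS      = 12,  /* 9 temporaries + sp + gp + pc          */
    HW_SWITCH_CYCLES  = 4    /* figure stated for the register files  */
};

/* Cycles for a memory-based switch; grows with the context size. */
static unsigned memory_switch_cycles(unsigned regs)
{
    return DIRECTIONS * CYCLES_PER_ACCESS * regs;
}

/* Cycles saved per switch when the register files are used instead. */
static unsigned cycles_saved_per_switch(unsigned regs)
{
    unsigned mem = memory_switch_cycles(regs);
    assert(mem >= HW_SWITCH_CYCLES);
    return mem - HW_SWITCH_CYCLES;
}
```

With these assumptions, memory_switch_cycles(CONTEXT_REGS) gives 48 and cycles_saved_per_switch(CONTEXT_REGS) gives 44. The saving grows with the context size because the hardware path is stated to be independent of the number of registers.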
II. IMPLEMENTATION
The implementation of this concept involves both
hardware and software approaches.
A. Hardware Approach
In this approach, the existing MIPS processor architecture
is taken and the suggested modification is implemented on
it. The register bank module is modified by adding 4
register files, each consisting of 12 registers, to the
existing register bank, as shown in Figure 1. These files
save and restore the context, making context-switching
part of the processor itself and avoiding the use of
external memory for storing and restoring, thereby reducing
the processing time. The modification is made in VHDL. The
number of register files can be extended, depending on the
available FPGA resources, to accommodate more threads. The
Plasma MIPS processor used for this work implements the
“reg_bank” module in the FPGA’s block RAM [9].
Figure 1: Modified MIPS Processor Architecture
To access these register files, two new instructions (scxt,
rcxt) are added to the existing instruction set, making the
process much quicker. The index of the register file is
stored in a CPU temporary register, and this register is
used as the operand of these instructions.
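The behaviour of the added register files can be modelled in software as follows. This is a behavioural sketch in C, not the authors' VHDL: the RegBank type is hypothetical, and the scxt and rcxt functions only imitate what the instructions are described as doing, assuming the operand is the index of one of the 4 files of 12 registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_REG_FILES 4   /* register files added to the bank */
#define REGS_PER_FILE 12  /* registers per file (one context)  */

/* Hypothetical software model of the modified register bank. */
typedef struct {
    uint32_t active[REGS_PER_FILE];               /* context of the running thread */
    uint32_t files[NUM_REG_FILES][REGS_PER_FILE]; /* added context register files  */
} RegBank;

/* Model of scxt: save the active context into the file selected by index. */
static void scxt(RegBank *rb, unsigned index)
{
    assert(index < NUM_REG_FILES);
    memcpy(rb->files[index], rb->active, sizeof rb->active);
}

/* Model of rcxt: restore the active context from the file selected by index. */
static void rcxt(RegBank *rb, unsigned index)
{
    assert(index < NUM_REG_FILES);
    memcpy(rb->active, rb->files[index], sizeof rb->active);
}
```

In the model a whole context moves in one call, mirroring the single-instruction save or restore, and no external memory is touched.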
B. Software Approach
The software implementation involves two major
operations. The first is defining four threads with simple
switching between them. These threads are initialized by
the application, which needs to call “InitOS” to initialize
each thread’s “Task” structure, as shown in figure 2. The
FastCtxtSwitch member of Task identifies the
context-switching property: if its value is greater than 0,
the register files are used to save and restore the
context; otherwise external memory is used. This lets the
application decide whether a thread needs fast
context-switching or not.
The second operation is the assembler modification. The
GNU tool chain for the MIPS processor is used to compile
the co-operative OS (thread switching) and the test
applications.
#include "plasma.h"
#define CONTXT_SIZE 15
typedef void (*TaskFunc)(void);
typedef struct Task
{
    TaskFunc TaskPtr;              // Pointer to thread starting function
    int *State;                    // Saved context
    unsigned char Executed;        // 1 - thread has started, 0 otherwise
    unsigned char TaskID;          // Task ID
    unsigned char FastCtxtSwitch;  // 1 - requires fast context switch,
                                   // 0 otherwise
} Task;
Figure 2: Task Structure
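A minimal sketch of how an application might fill in the Task structure and how the OS could pick the save path is given below. The paper names “InitOS” but does not give its signature, so the signature here is an assumption. The tasks and contexts arrays, NUM_TASKS, SavePath and select_save_path are hypothetical helpers. The structure is repeated so that the example compiles without the Plasma headers.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_TASKS   4   /* the paper's test applications use four threads */
#define CONTXT_SIZE 15

typedef void (*TaskFunc)(void);

typedef struct Task {
    TaskFunc TaskPtr;              /* thread starting function           */
    int *State;                    /* saved context                      */
    unsigned char Executed;        /* 1 - thread has started             */
    unsigned char TaskID;          /* task ID                            */
    unsigned char FastCtxtSwitch;  /* > 0 - use the register files       */
} Task;

typedef enum { SAVE_TO_MEMORY, SAVE_TO_REG_FILE } SavePath;

static Task tasks[NUM_TASKS];
static int  contexts[NUM_TASKS][CONTXT_SIZE];  /* memory save areas */

/* Assumed signature: one entry point and one fast-switch flag per task. */
static void InitOS(const TaskFunc entry[NUM_TASKS],
                   const unsigned char fast[NUM_TASKS])
{
    for (int i = 0; i < NUM_TASKS; i++) {
        tasks[i].TaskPtr        = entry[i];
        tasks[i].State          = contexts[i];
        tasks[i].Executed       = 0;
        tasks[i].TaskID         = (unsigned char)i;
        tasks[i].FastCtxtSwitch = fast[i];
    }
}

/* Register files if FastCtxtSwitch > 0, external memory otherwise. */
static SavePath select_save_path(const Task *t)
{
    assert(t != NULL);
    return (t->FastCtxtSwitch > 0) ? SAVE_TO_REG_FILE : SAVE_TO_MEMORY;
}
```

Passing the flags {1, 1, 0, 0} would mark two threads for fast switching and two for memory-based switching, which is the mix the second test application is described as using.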
To automate the build process, the newly implemented
context-switch instructions (scxt, rcxt) are added to the
GNU MIPS assembler. These instructions are added to GNU
“binutils” version 2.19 [9]. The “binutils-2.19/gas” (GNU
assembler) folder contains the source code for the MIPS
assembler. The file “mips-opc.c” in “binutils-2.19/opcodes”
contains all the instructions supported by the MIPS
processor, and the new instructions have been added to this
file [12]. The MIPS instruction format in the assembler is
as shown:
const struct mips_opcode
{
name, args, match, mask, pinfo, pinfo2, membership
}
Figure 3: MIPS Instruction Structure Format in GNU Assembler
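To illustrate how the match and mask fields of such a table are used, the following self-contained C sketch defines a simplified stand-in for the opcode structure and two table entries. The field names follow the listing above, but the field types are simplified and the encodings, the "t" operand string and the membership value are hypothetical. The paper does not give the actual bit patterns chosen for scxt and rcxt. The sketch assumes the register operand sits in bits 20..16, so the mask leaves those bits clear.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the assembler's opcode entry (types assumed). */
typedef struct {
    const char   *name;
    const char   *args;
    unsigned long match;       /* fixed bits of the instruction word   */
    unsigned long mask;        /* which bits of the word are fixed     */
    unsigned long pinfo;
    unsigned long pinfo2;
    unsigned long membership;
} DemoOpcode;

/* HYPOTHETICAL encodings, chosen only for this illustration. */
static const DemoOpcode demo_opcodes[] = {
    { "scxt", "t", 0x70000030UL, 0xffe0ffffUL, 0, 0, 1 },
    { "rcxt", "t", 0x70000031UL, 0xffe0ffffUL, 0, 0, 1 },
};

/* Build an instruction word: fixed bits plus register number in bits 20..16. */
static unsigned long encode(const DemoOpcode *op, unsigned rt)
{
    assert(rt < 32);
    return op->match | ((unsigned long)rt << 16);
}

/* An entry matches a word when the masked word equals its match value. */
static const DemoOpcode *decode(unsigned long word)
{
    for (size_t i = 0; i < sizeof demo_opcodes / sizeof demo_opcodes[0]; i++)
        if ((word & demo_opcodes[i].mask) == demo_opcodes[i].match)
            return &demo_opcodes[i];
    return NULL;
}
```

For example, encode(&demo_opcodes[0], 8) yields a word that decode maps back to the "scxt" entry whatever register number is used, because the operand bits are excluded by the mask.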
A Fastctxtswitch variable is defined for every thread and
reset to 0 at the beginning. For a thread that requires
fast context switching, the value of the variable is
greater than 0 and the internal register files are used to
save the context of the thread; otherwise external memory
is used. When fast context switching is needed, the context
is saved and restored using the scxt and rcxt instructions,
and a 2-bit signal (cnxt_switch[0:1]) is defined to select
the required instruction: 01 for scxt and 11 for rcxt.
Saving or restoring the context thus completes in a single
instruction, considerably reducing the overhead.
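The selection made by the 2-bit signal can be written as a small decoder. This C sketch only mirrors the two codes given in the text (01 for scxt, 11 for rcxt). Treating the remaining codes 00 and 10 as "no context operation" is an assumption, as is reading the two bits as an ordinary binary value. The enum and function names are hypothetical.

```c
#include <assert.h>

typedef enum { CTX_NONE, CTX_SAVE, CTX_RESTORE } CtxOp;

/* Decode the 2-bit cnxt_switch value into the operation it selects. */
static CtxOp decode_cnxt_switch(unsigned signal)
{
    assert(signal <= 0x3u);          /* only two bits are defined       */
    switch (signal) {
    case 0x1u: return CTX_SAVE;      /* 01 selects scxt                 */
    case 0x3u: return CTX_RESTORE;   /* 11 selects rcxt                 */
    default:   return CTX_NONE;      /* 00 and 10: assumed to be no-ops */
    }
}
```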
As additional work, this approach can be combined with
live-register analysis: examining the state of the
registers at each instruction while tracing the execution
of a process or task gives the live range for each task.
Only the registers that are live need to be saved, rather
than all the registers [6]. This technique can be applied
to the threads that do not require fast context-switching,
while the register files are used to save the context of
the threads that do, so that the overall execution time is
reduced further, giving higher performance [7].
III. EXPERIMENTATION
A SPARTAN 3E 1600 FPGA [9] development board was used to
run the architecture with the co-operative OS and test the
application. Before the architecture is loaded onto the
FPGA, the corresponding .bin file is generated and
downloaded to the board using the iMPACT software, which is
part of XILINX ISE. GCC is used to compile the co-operative
OS and generate the hex file for the thread structure, and
the GNU assembler is rebuilt with the changed opcode file,
i.e., with the two new instructions (scxt, rcxt) added. The
operating system and application executable were loaded
into the FPGA's block RAM and executed from there. To
verify the correct operation of the context-switch
instructions, software using the “scxt” and “rcxt”
instructions was developed in MIPS assembly language and
executed on the modified processor in a simulation
environment. An application is then run on the
architecture, and the time stamps for the corresponding
threads are observed in HyperTerminal over the UART.
IV. SIMULATION RESULTS
As mentioned earlier, test applications using four threads
were developed and run on the FPGA loaded with the modified
MIPS architecture, built with the modified MIPS assembler.
The first application tests the successful operation of the
proposed approach by switching four threads using the
internal register files. This test is used to ensure that
data is not corrupted between threads and that each
thread’s context switching is correct. The time stamps for
each thread appear as shown in figure 4.
The second application is designed to measure the
performance improvement in clock cycles. It creates four
threads that execute in never-ending loops. For two of the
threads the Fastctxtswitch variable is set to 1, indicating
fast context switching (i.e., using the register files);
the remaining two use external memory for saving the
context. The time stamps appear as shown in figure 5.
Figure 4: Serial Debug Log for Test Application – 1
Figure 5: Serial Debug Log for Test Application – 2
V. CONCLUSION
This paper introduces an architectural modification of the
MIPS processor for reducing the overhead due to context
switching in a hard real-time OS, where meeting deadlines
is of major concern, thereby accelerating RTOS performance.
The proposed approach allowed the RTOS to perform context
switching with reduced overhead in overall execution time,
thereby meeting the deadlines. This improves the ability of
hard RTOSs to meet their basic requirements.
This paper can be extended by tracing the execution of a
process or task to determine the number of live registers
for that task, thereby reducing the number of registers to
be saved and restored during context-switching.
REFERENCES
[1] Krithi Ramamritham, John A. Stankovic, Scheduling
Algorithms and Operating Systems Support for Real-Time
Systems, in Proceedings of the IEEE, Vol. 82, No. 1,
January 1994.
[2] Michele Co and K. Skadron, The Effects of Context
Switching on Branch Predictor Performance, 2001 IEEE
International Symposium on Performance Analysis of Systems
and Software, pages 77–84, Nov 2001.
[3] L4 Performance,
http://ertos.nicta.com.au/research/l4/performance.pml
[4] L. W. McVoy and C. Staelin, lmbench: Portable Tools for
Performance Analysis. In USENIX Annual Technical
Conference, pages 279–294, 1996.
[5] J. C. Mogul and A. Borg, The Effect of Context Switches
on Cache Performance. In Proceedings of the Fourth
International Conference on Architectural Support for
Programming Languages and Operating Systems, pages 75–84,
New York, NY, USA, 1991, ACM Press.
[6] Jeffrey S. Snyder, David B. Whalley, Theodore P. Baker,
Fast Context Switches: Compiler and Architectural Support
for Pre-emptive Scheduling, Department of Computer Science,
Florida State University.
[7] Pekka Jaaskelainen, Pertti Kellomaki, Jarmo Takala,
Heikki Kultala, Mikael Lepisto, Reducing Context-switch
Overhead with Compiler-Assisted Threading, Department of
Computer Systems, Tampere University of Technology.
[8] Francis M. David, Jeffrey C. Carlyle, Roy H. Campbell,
Context Switch Overheads for Linux on ARM Platforms,
Department of Computer Science, University of Illinois at
Urbana-Champaign.
[9] Xilinx Corp, “Spartan 3E Starter Kit Board User Guide”,
March 9, 2006.
[10] Dan Tsafrir, The Context-Switch Overhead Inflicted by
Hardware Interrupts, IBM T.J. Watson Research Center.
[11] Plasma MIPS Processor Design,
http://www.opencores.org/project,plasma.
[12] GNU compiler and assembler for MIPS,
http://ftp.gnu.org/gnu/binutils/
[13] Abraham Silberschatz, Peter Baer Galvin, Operating
System Concepts, Fifth Edition, Wiley, Singapore, 1997.
AUTHORS BIOGRAPHY
Divi Pruthvi, pursuing M. Tech in VLSI & Embedded
Systems at Sreenidhi Institute of Science & Technology.
Received a silver medal in B. Tech, with ECE as
specialization. Areas of interest include Embedded
Systems, Soft-core processors, RTOS.
Narayanaraju Samunuri, M. Tech in Embedded
Systems, with 11 years of experience in the Core Embedded
System domain, published an international paper on FlexRay
Communication.
Deevi Swathi, M. Tech in Embedded Systems, working as
Assistant Professor, published a paper on “Remote Data
Acquisition on Embedded ARM9 Platform” in
“International Journal of Electronics, Computing &
Engineering Education”, July-Dec 2011 issue.