Today's Operating Systems and their Future with Reconfigurable Architecture

Jesper Melin
IDT, Mälardalen University, Västerås, Sweden
Kopparbergsvägen 27B, 72213 Västerås, SWEDEN
+46 730620814
[email protected]

Daniel Boström
IDT, Mälardalen University, Västerås, Sweden
Kopparbergsvägen 27B, 72213 Västerås, SWEDEN
+46 730640260
[email protected]

ABSTRACT

This paper is meant as an introduction to operating systems, their main functions and the types that exist, as well as the possible benefits gained by placing typical software functionality in hardware. By introducing hardware support in the form of reconfigurable architecture, such as programmable logic devices, not only new functionality and better performance can be achieved, but also improved determinism and predictability. These properties make it easier for developers of real-time systems to calculate and predict timing constraints. Programmable logic devices will be presented, along with what makes them reconfigurable. Operating systems in hardware are still in the development phase, and one version of what an operating system in hardware could look like will be presented, including its layers and message passing.

Keywords

Operating system, programmable logic device, FPGA, parallelism, hardware, software.

1. INTRODUCTION

Demands on computer performance have increased rapidly over the years. Video processing and gaming are among the fields leading the way in pushing computers towards the breaking point. Computer scientists have put a lot of focus on improving computer hardware, for example by introducing multi-core central processing units (CPUs) and improved caches. In order to take the next step towards better performance it is not enough to improve hardware; software improvements are needed to fully take advantage of new technology. With the introduction of reconfigurable architecture (RA), proven software-based algorithms known to be resource demanding can be moved down to hardware with enhanced performance.
This migration should not be limited to algorithms; parts of an operating system (OS) can be moved as well. Section 2 presents what an OS is, followed by section 3, which introduces different types of OSes. Section 4 gives a short summary of the differences between a high-level programming language (HLPL) and a hardware description language (HDL). Types of programmable logic devices (PLDs), including an example, are covered in section 5, followed by section 6, which briefly discusses the search for increased computer performance. Section 7 presents the benefits achieved by introducing hardware support for OSes. An example of what a hardware operating system (HOS) could look like is presented in section 8. Finally, the paper ends with acknowledgements in section 9, a summary with conclusions in section 10 and future work in section 11.

2. WHAT IS AN OPERATING SYSTEM?

A lot of people have come in contact with an OS, some without even knowing it. But even if you are aware of the fact that you are using an OS, it can be difficult to specify exactly what it is. An OS is said to take care of two somewhat unrelated tasks: managing resources and extending the machine [1]. These tasks are very important, so let us look at them in more detail.

2.1 Managing Resources

Nowadays computers consist of processors, memory, hard drives, printers etc., and we need a way to manage them. In a computer running multiple applications at the same time, we could easily end up with a mess if every application was granted access to a resource at any given time. This is where the OS lends a helping hand, by only allowing an application to access a resource if it is not already in use. The OS usually solves problems like this by implementing queues, in the case of printers, or by locking resources, in the case of optical drives. A perfect example of where at most one application can have access to a resource is when using a printer.
Imagine having a paper printed with two completely different documents merged together into one. Or imagine burning two image files onto the same digital versatile disc (DVD).

How resources in a computer are shared among users or applications can be divided into two groups: sharing in time and sharing in space. When a resource is shared in time, several applications or users want to use the resource at the same time, and the only way to solve this is to let them take turns using it. The order in which this happens is determined by the OS; this is called scheduling. Shortest job first, priority based and round robin are just a few of the scheduling algorithms available. There are also situations where applications or users can share the same resource at the same time. For example, as soon as an application starts executing, it is given a part of the memory. This means you can have multiple applications in memory at the same time. As most applications today only use a fraction of the memory, it would be a waste of resources to keep only one in memory at a time. This kind of sharing is called sharing in space.

2.2 Extending the Machine

The architecture of a computer can be very scary to look at. There are instruction sets, memory, registers, buses etc. For a programmer this can be too much to comprehend, but such knowledge is required if you are told to do some low-level programming. This is where the OS comes in. The OS hides all the low-level bits and pieces from the programmer (see Figure 1) and introduces a much simpler view. Access to low-level functionality is obtained through system calls provided by the OS.

Figure 1. General system layout (user application on top of the operating system, on top of the hardware)

3. TYPES OF OPERATING SYSTEMS

It is easy to think of OSes in personal computers as the one and only type, especially if you are not in the field of computer science. But the truth is, they are almost everywhere.
Some are big, some are small, but they all serve a specific purpose. The following sections on OS types are based on [1].

3.1 Mainframe Operating Systems

It is not uncommon for a mainframe computer to take up an entire room by itself. The storage capacity of a mainframe computer surpasses that of personal computers by a wide margin. A mainframe OS focuses on processing as many tasks as possible at once. These properties make a mainframe computer suitable for web server duties. Typically three types of services are offered by a mainframe OS: batching, transaction processing and time-sharing. A system which does not require user interaction while processing jobs is referred to as a batch system; as soon as one job completes, another is started. Processing checks at a bank is an example of transaction processing: thousands of small jobs are carried out each second. Time-sharing systems are exactly what they sound like; multiple users remotely execute jobs on the same computer at the same time.

3.2 Server Operating Systems

Server OSes run on servers based on either a personal computer, a workstation or a mainframe. Users share software and hardware by connecting over a network. Services users are usually interested in are print services, file services and web services.

3.3 Multiprocessor Operating Systems

To increase the performance of a computer, it is becoming more and more common to use more than one processor. This hardware setup needs a special OS. Usually a modified version of a server OS is used, with a particular set of features handling the extra complexity involved in communication.

3.4 Personal Computer Operating Systems

If you have ever come in contact with an OS, it is most likely of this type. This OS only has to take one user into consideration. They are normally installed on computers located at home, at work, at school etc. You typically use them for writing documents, chatting, checking your e-mail and browsing the Internet.
A well-known personal computer OS is Microsoft Windows XP.

3.5 Real-Time Operating Systems

In a real-time operating system (RTOS), time is a key factor. Computing data correctly is not enough; it also has to be delivered in time. Sometimes it is also important not to deliver data too early. An airbag system is an example of a system which needs to respond within a certain time span: if an airbag is inflated too early, it will be flat by the time the driver's head hits it, and if the system is too slow the airbag will explode in the driver's face instead. Input is typically received from a sensor, computed, and later sent to an actuator. There are two types of real-time systems; the difference between them is primarily whether or not they accept deadlines occasionally being missed. The airbag example described above is a good example of a hard real-time system: a single miss can be the difference between life and death. A system where it is okay to miss deadlines every once in a while is called a soft real-time system. Audio and video processing is an example of where deadlines can be missed without serious consequences; the service might suffer a drop in quality, but the system will not fail.

3.6 Embedded Operating Systems

A personal digital assistant (PDA) uses this type of OS. Embedded OSes are generally small, and the devices using them only provide a limited set of functionality; after all, this is not a general purpose OS. They are designed not to be resource-demanding with respect to memory usage and power consumption. It is not uncommon to find an embedded OS in kitchen appliances like microwave ovens.

3.7 Smart Card Operating Systems

The smallest operating systems on the market are used in smart cards. A smart card is a small device with a built-in chip; some of them can handle only a single function while others are a little more advanced.

4.
PROGRAMMING LANGUAGES

When developing software today, an HLPL is usually used to implement functionality; C, C++ and Java are just a small selection of HLPLs. Hardware used to be static and hard to reconfigure, but with the introduction of programmable hardware a new family of programming languages has emerged: HDLs, such as VHDL and Verilog. Instructions written in an HLPL are translated into binary code before they can be understood by a central processing unit (CPU). An HDL is not translated in the same way as an HLPL; instead of machine code, it is translated into gates (AND, NOT…), latches (D, JK…) etc. and the connections between these components [2]. A clear difference between HLPLs and HDLs is that HLPL instructions are executed serially, one after another, in a CPU. HDL instructions can instead be executed in parallel thanks to the hardware (see Figure 2) [2].

Example HLPL (C/C++):

X = I + 1;
Y = J - 5;
Z = K + 2;

This example will be executed serially as binary code in a CPU.

Example HDL (VHDL):

X <= I + 1;
Y <= J - 5;
Z <= K + 2;

Since these instructions are translated into gates, instead of binary code, it is possible to execute them in parallel.

Figure 2. Instructions processed in software (left) and hardware (right)

Another good example of why functionality implemented in hardware is in general faster than in software is an output that is set to 1 only if both of two inputs are 1, a simple AND operation (Table 1).

Table 1. Behaviour of an AND-gate

Input 1  Input 2  Output
   0        0       0
   0        1       0
   1        0       0
   1        1       1

In hardware this function only needs a simple AND-gate to generate the correct behaviour, but in software it must be implemented as a function, stored in some memory and translated into binary code to be executed in a CPU. There is a long chain of procedures before the output is set to 1 in software. Hardware, on the other hand, has one link in the chain, the AND-gate, which results in much better performance than the software version (assuming that a short chain is faster than a long one). Hardware is not always faster than software, but in general it is [2].

5. PROGRAMMABLE LOGIC DEVICES

Today many types of programmable logic devices (PLDs) are available for use in different fields. In [2] there is a list of devices that hardware and software can be downloaded to:

• Programmable Logic Device (PLD)
• Simple Programmable Logic Device (SPLD)
• Complex Programmable Logic Device (CPLD)
• Field Programmable Gate Array (FPGA)
• Field Programmable InterConnect (FPIC)

These PLDs can be configured by downloading new software and hardware before run-time; the configuration is then used when the device is started. A computing paradigm that tries to combine the flexibility of software with the performance of hardware is called Reconfigurable Computing (RC) [3].

5.1 FPGA

As mentioned in the previous section, FPGA stands for Field Programmable Gate Array, and an FPGA can be programmed using an HDL. FPGAs are made of many configurable logic blocks (CLBs), and these blocks are in turn made of gates, latches, multipliers etc. FPGAs with over one million gates are not unusual today [4]. A reconfigurable matrix of wires and connections links these CLBs together with each other and with the I/O ports (see Figure 3) [2].

Figure 3. General FPGA architecture (courtesy of embedded.com)

A deeper understanding of how FPGAs work is, in our opinion, not required for this paper, since the focus is on OSes in hardware. If you find FPGAs fascinating, feel free to search the Internet for further information.

6. SEARCH FOR BETTER PERFORMANCE

To enhance computer performance, new technologies and techniques have been introduced over the years. Without cache memory and pipeline support in the CPU, computers would run much slower [2].
A floating point unit (FPU) is one example of hardware that helps the CPU make mathematical calculations faster. Instead of iterating calculations in the CPU with software, the CPU can hand the calculation over to the FPU. The FPU is specially constructed to handle such calculations, and nothing else. By introducing caches, pipelines, FPUs and other components, computer performance is raised to new levels, but the price tag rises too: higher performance needs more silicon area to fit the components, which leads to a higher price for the product [2].

7. BENEFITS OF HARDWARE SUPPORT FOR OPERATING SYSTEMS

In section 4, an example of why functionality in hardware is faster than in software was presented. It was only a tiny and easy function, but it led to a great improvement in performance. Would it be possible to scale this simple example up to a more complex one, and even move parts of the OS down to hardware? The answer is yes, so the following sections present the benefits that result from moving parts of the OS down to hardware.

7.1 Speed

One major benefit of hardware is that it is much faster than software. Better performance means that operations take less time to execute. Create task, suspend task etc. are a few examples of operations that can be improved. Algorithms that can be executed in parallel are the ones that gain the most performance when placed in hardware. Depending on the operation that is moved to hardware, a speed-up of five to 1000 times is not impossible [2].

7.2 Real parallelism

In a CPU, instructions are executed one after another, in serial, as presented before. In a system with several processes it looks like the CPU is able to execute several processes at the same time, but this is unfortunately not true. At any time, only one process is able to execute in the CPU.
Thanks to the scheduler, which decides what process is granted access to the CPU, it looks like real parallelism; this is usually called pseudo-parallelism. Real parallelism is achieved thanks to the introduction of PLD technology, such as the FPGA. Algorithms that used to be implemented in software can be translated into hardware using an HDL and thereby utilize parallelism [5][1]. This characteristic of hardware helps the system reach higher performance.

7.3 Determinism

In philosophy, determinism is the standpoint that everything is determined by earlier causes or other already given conditions [6]. Laplace once said [7]:

"We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes."
- Pierre Simon Laplace, A Philosophical Essay on Probabilities

Computer science is far from philosophy, but the meaning of determinism is translatable into the digital world of ones and zeros. Hardware is by nature deterministic with regard to time, and as a result it is easier to predict.

7.4 Predictability

In an RTOS it is important to make correct predictions of the worst case execution time (WCET) of a program, to make sure that all tasks meet their deadlines when scheduled. These predictions are made easier if the system has deterministic characteristics (which hardware has). To make this a little clearer, here are two examples:

Example 1: CPU. A CPU is running a small program; there are not many other programs or interrupts. On this small scale, the CPU is predictable. When the CPU is scheduled with many programs, interrupts etc., it gets harder to predict.

Example 2: Hardware. In hardware (gates, latches…) it is possible to construct state machines where instructions are stored and all information is sent from point to point. These state machines work in parallel with each other, since they are in hardware. So even if the system is big, it is made of small state machines that execute in parallel, which keeps it predictable. This arrangement is suitable for both small and large systems.

7.5 New functionality

This advantage is the result of the other benefits: hardware offers a new way of implementing algorithms thanks to its speed. Some algorithms that used to take too long to execute in software can utilize the hardware to lower their execution time.

Example: Mathematical calculation. Imagine a hard real-time system with very short deadlines. Input is taken from the environment in the form of a temperature sensor; the input is used in a long and complicated mathematical formula before the result is used to actuate an engine. In software (see Figure 4) this calculation takes a long time to perform, leading to missed deadlines in the system. If the formula is instead translated into hardware, parts of it may be calculated in parallel (see Figure 5); the time for the calculation is lowered so that no deadlines are missed.

Figure 4. Computation in software
Figure 5. Computation in hardware

This is, as said, only an example, but it hopefully gives you an idea of why new functionality is achieved thanks to the benefits presented earlier.

8. A HARDWARE OPERATING SYSTEM EXAMPLE

An HOS still in the research phase, presented by [8], will be covered in this section. RC, as discussed earlier (see Section 5), is a paradigm gaining more and more acceptance in the field of computer science. An HOS, or run-time partial reconfiguration environment, is a vital part of RC.
Instead of executing tasks sequentially like a software OS, an HOS executes hardware blocks (HBs) in parallel. HBs will be described in more detail later.

8.1 The layers

An HOS consists of the following four distinct layers (see Figure 6): application, architectural, bridging and physical.

Figure 6. Layered structure of an HOS (application layer, architectural layer, bridging layer, physical layer)

8.1.1 The Application Layer

The application layer shields the user from the internal structure of the system. This is achieved by providing a graphical user interface (GUI) which interacts with the user when developing and executing applications. In order for independent developers to program applications for hardware operating systems (HOSes), a standard way of designing applications needs to be defined. For this purpose, a set of standard rules called the application modeling language (AML) was developed. AML consists of two types of components: HBs and connections. HBs are used to describe hardware functionality; together with connections they form hardware applications. A connection connects HBs and shows where inputs and outputs go between them (see Figure 7). For a deeper discussion about AML and a more comprehensive example of a hardware application, we refer to [8] and [9] respectively.

Figure 7. Example of connections connecting HBs (timeslot 0: HB1 = MUX, connection C1, HB2 = FF, connection C2, saved value V)

8.1.2 The Architectural Layer

The architectural layer is in charge of providing a development environment. The environment uses an abstract model of the RA and supplies the application layer with an abstract view of a hardware application which can then be modeled. Two sub-layers form what we call the architectural layer: the application design interface (ADI) and the core system (CS). The two sub-layers are linked together using an object framework.

When designing the architectural layer and the application layer, three requirements were kept in mind. Performance: tasks have to be executed at the highest speed possible. Modifiability: changes cannot be an expensive process with respect to time and money. And finally portability: you should be able to port an HOS to multiple platforms.

In short, everything needed when developing hardware applications for an HOS is provided as an abstraction by the ADI. The ADI is the crucial part of the system when trying to port an HOS to another platform. If problems arise here, the whole porting process will suffer; for the sake of portability, the ADI should be kept as simple as possible. But a simple ADI has to rely more on the CS to realize its functionality. As a result, performance takes a hit as traffic between the two sub-layers increases.

The CS is also an abstraction, but on a lower level; more specifically, everything needed to manage and execute HBs on a hardware architecture. The main task of the CS is to break down hardware applications into messages, which are then passed down to the lower layers in the layer structure.

8.1.3 The Bridging Layer

The purpose of the bridging layer is to tie the abstraction of the RA together with the hardware in the system. A protocol was designed to handle communication between the architectural layer and the bridging layer. This communication line, the bridge, uses the hardware request and response message structure (HRRMS) to allow the architectural layer to pass execution timeslots forward to the bridging layer. Timeslots carry HBs which will later be executed by the RA. An HRRMS message looks different depending on whether it is a request or a response message. Figure 8 illustrates a request message for timeslot 0 shown in Figure 7.

Figure 8. HRRMS request message for Figure 7

        |    HB1           |    HB2
MT | MI | BT  | BR | O  | s/n | BT | BR | I  | s/n | O  | s/n
 0 |  0 | MUX |  0 | C1 |  -  | FF |  0 | C1 |  -  | C2 |  V

Figure 8 needs some explaining:
MT = 0, request message.
MI = 0, first message sent in this data stream transmission.
BT (HB1) = MUX, HB1 is of type multiplexer.
BR (HB1) = 0, first multiplexer in this message.
O (HB1) = C1, HB1's output goes out on connector C1.
s/n (HB1) = -, HB1's output is not saved.
BT (HB2) = FF, HB2 is of type flip-flop.
BR (HB2) = 0, first flip-flop in this message.
I (HB2) = C1, HB2 receives input from connector C1.
s/n (HB2) = -, input is received directly from connector C1.
O (HB2) = C2, HB2's output goes out on connector C2.
s/n (HB2) = V, HB2's output will be saved in the global variable V.

As can be seen in Figure 8, a request message is dynamic in length depending on how many HBs and connectors a timeslot consists of. When the bridging layer receives a request message from the architectural layer, it decodes and executes all the HBs. At the end of an execution timeslot, a response message is created and sent back to the architectural layer. Figure 9 shows the response message created by the bridging layer. A timeslot only handles one message of each type.

Figure 9. HRRMS response message for Figure 7

MT | MI | O  | RE
 1 |  1 | C2 | gV

This is how to read Figure 9:
MT = 1, response message.
MI = 1, second message transmitted, as the request message was the first one.
O = C2, connector containing data gathered for saving.
RE = gV, a reference to the global variable in which data from connector C2 was saved.

8.1.4 The Physical Layer

The most advanced layer, and the one that manages the hardware in the system, is the physical layer. It provides an application programming interface (API) which the bridging layer can use in order to access various hardware platforms. The physical layer can be seen as three main components:

• HOSes_HwBlk
• HOSes_FPGAResource
• HOSes_PartialRTR

An HOSes_HwBlk holds information about a HB, for example its size, type and name. An HOSes_FPGAResource manages the resources of the RA, while information about its run-time environment is provided by the HOSes_PartialRTR.

8.2 Implementations

Since this HOS is still in the research phase, you cannot find it on the market. However, some implementation attempts deserve to be mentioned.
Starting from the top, the application layer has been realized to a large extent: the AML has been designed, as well as a pre-production model of a GUI. Linux and C++ were used when realizing the architectural layer. With the JBits API [10] as a base, the Java programming language was used to implement the physical layer.

9. ACKNOWLEDGMENTS

We want to thank Lennart Lindh for inspiring the subject and explaining its benefits.

10. SUMMARY AND CONCLUSIONS

The future of OSes and higher performance might lie in the integration of software and hardware. There are a number of benefits that hardware can offer developers thanks to the introduction of PLDs and HDLs. Similarities between HLPLs and HDLs make it even easier to translate software into hardware. Developers of real-time systems can use the determinism that is natural in hardware to make better WCET predictions, and the speed of hardware to perform complex mathematical calculations in less time, where it used to be impossible within the timing constraints. Parts of OSes, or algorithms, can be placed in hardware; depending on how much of the functionality is parallelizable, performance increases to different degrees. It is not enough to increase the performance of hardware; to be capable of utilizing new technology it is important not to leave software behind. Thanks to OSes, developers do not need to concern themselves with how the hardware actually works, but in the future, with HOSes, developers might be able to write code for software and support it by writing code for hardware. This is an interesting field and much has happened in the last decades; we believe the mixture of software with hardware support is the future.

11. FUTURE WORK

Research should continue into how an HOS architecture could be implemented and what parts of an OS could be moved from software into hardware.
Today HOSes are still in their research phase, but future work could help them leave the research level and get out into the field. Hopefully we will come to a point in the near future when software developers can use HOSes to develop and combine software with hardware in new ways. A question that would be interesting to research is whether it is harder for developers to write algorithms in a parallel form than in a serial one. Is the brain naturally better at understanding serial commands or parallel commands, depending on how the human brain thinks? This might be outside the context of this paper and field, but it is an interesting thought.

12. REFERENCES

[1] Tanenbaum, Andrew. Modern Operating Systems (second edition). Prentice Hall, 2001.
[2] Lindh, Lennart & Klevin, Tommy. Programmerbara kretsar: Utveckling av inbyggda system. Studentlitteratur, 2005.
[3] Wikipedia. Reconfigurable computing. <http://en.wikipedia.org/w/index.php?title=Reconfigurable_computing&oldid=318577595> (accessed October 14, 2009).
[4] Wikipedia. Field-programmable gate array. <http://en.wikipedia.org/w/index.php?title=Field-programmable_gate_array&oldid=319854435> (accessed October 12, 2009).
[5] Nordström, Susanna. Configurable Hardware Support for Single Processor Real-Time Systems. Mälardalen University Press Licentiate Theses, 2008.
[6] Nationalencyklopedin. Determinism. <http://www.ne.se/lang/determinism> (accessed October 13, 2009).
[7] Wikipedia. Pierre-Simon Laplace. <http://en.wikipedia.org/w/index.php?title=Pierre-Simon_Laplace&oldid=316059480> (accessed October 15, 2009).
[8] Grozal, Voicu & Villalobos, Rami. What next? A hardware operating system?. Instrumentation and Measurement Technology Conference, 2004.
[9] Villalobos, Ricardo & Abielmona, Rami & Grozal, Voicu. A Bridging Layer for Run-Time Reconfigurable Hardware Operating Systems. IEEE International Instrumentation and Measurement Technology Conference, 2008.
[10] Xilinx. JBits SDK.
<http://www.xilinx.com/products/jbits/index.htm> (accessed October 13, 2009).