Amoeba Distributed Operating System
James Schultz
CPSC 550
Spring 2007

The History of Amoeba

The current computing environment began to take shape in the early 1980s. At that time, technology allowed computers to become smaller and cheaper, so that each individual could have a personal computer. These personal computers began replacing the bulky shared machines on which employees had to take turns. Personal computers were then networked together, allowing communication and the sharing of resources. The foundations of current networking were laid in this period, and out of it came the need for multiple connected computers to act as one. This concept of multiple computers acting as one is the basis of distributed and parallel computing, and it is one of the things Amoeba was designed to handle.

Amoeba was first developed in the early 1980s at the Vrije Universiteit in Amsterdam, Netherlands, under the guidance of Andrew Tanenbaum and with the cooperation of Centrum voor Wiskunde en Informatica. The purpose of Amoeba was to create a simple way to handle the needs of distributed systems and parallelism while maintaining transparency for its users. Transparency here means that the user does not need to know how the underlying system works, where files are stored or on which machine a process is running. Traditional distributed systems have workstations that users log into; on these workstations users can tell that the processes they run execute only on the local machine, and that separate calls are needed to run a process on a different machine. In this way, users can tell that there are different machines. With Amoeba the user does not know where a process is running, where files are stored or even whether particular files are stored in the same physical location.

The last official release of Amoeba came in 1996 with version 5.3. It came 13 years after the prototype was released in 1983.
This does not mean that others have not done their own work, only that Tanenbaum and the Vrije Universiteit have not released further versions. One example is Fireball Amoeba by Fireball Software Distribution. [4]

Goals of Amoeba

Amoeba was designed with four main goals:
- Distribution: connecting together many machines
- Parallelism: allowing individual jobs to use multiple CPUs easily
- Transparency: having the collection of computers act like a single system
- Performance: achieving all of the above in an efficient manner

Distribution: Amoeba works by connecting a number of different computers together into a distributed network. The network allows a user to store data and run processes on any of these computers. Amoeba uses the Fast Local Internet Protocol (FLIP) over Ethernet to communicate with other computers on the same LAN (local area network). If there are multiple LANs, the outgoing router can act as a FLIP router, converting FLIP to standard TCP/IP and allowing the LANs to exist all over the world.

Parallelism: Parallelism is the concept of using multiple computers to work on the same process, allowing for an increase in speed. Essentially, it takes the collection of processors and makes them into one super processor. This can be extremely useful in situations where the work consists of many iterations that can be carried out separately, the two biggest examples being “The Traveling Salesman” problem and factoring.

Transparency: The key concept with transparency is how the user views the system. In traditional distributed networks, the user logs into a machine. He knows which machine he is logged into and knows what other machines exist in the network, logging into them remotely if needed. Processes run on the local machine unless they are explicitly directed to run somewhere else. In Amoeba and other systems with user transparency, the user simply logs onto a workstation and into the system as a whole.
The user never knows where any files are located, which CPUs are running the processes or, for that matter, how many CPUs are running them. Effectively, the main difference is that Amoeba looks like a single computer, whereas in a traditional distributed network every computer is visible as a unique, individual machine that happens to be connected to the others.

Performance: Performance is a key goal for any system, be it a computer operating system or even a checkout system at the local grocery store. To achieve the performance required, Tanenbaum and his colleagues developed a micro-kernel that is installed on every computer. These kernels handle the basic communication, allowing for uniform and efficient communication among all computers in the network. [1]

Definitions of Amoeba

Microkernel: The microkernel is the most fundamental part of Amoeba. This is where all the communication, memory management, I/O operations, object primitives and basic process handling are performed. All other services are built upon this kernel, and it is this kernel that allows for uniform communication between different machines. [1]

Client: A process that is started by the user, such as a running application.

Server: A process that is not started by the user; examples are the Bullet file server and the directory server.

FLIP: Fast Local Internet Protocol (FLIP) is the communication protocol used. It was developed by Tanenbaum and his colleagues for Amoeba. FLIP is a connectionless protocol, functionally equivalent to IP, that sends packets of data known as datagrams. It was designed to support high-performance remote procedure calls (RPCs). [5]

RPC: Remote Procedure Call (RPC) is the communication mechanism that clients use to communicate with servers. In Amoeba these calls are accessed through stubs, which are generated by AIL.

Stubs: The procedures generated by AIL that call the RPCs. They are used so that the user does not have to access RPCs directly.
They also marshal the parameters of the RPCs.

AIL: Amoeba Interface Language (AIL) is the language used to create stubs.

Objects: Objects are an abstract data type. They can be either software or hardware. Each object has a list of operations that can be performed on it and a 128-bit value known as a capability.

Capability: A capability is a 128-bit value that is associated with every object. This value is created when the object is created and is sent to the user. Whenever the user wishes to perform an operation on an object, it sends the capability to the server as a way of identifying the object and verifying that the user has permission to access it. To prevent tampering, capabilities are protected cryptographically.

I/O: Input/Output (I/O) is needed to read or write data. The user sends an RPC to the disk I/O service in the kernel; after the authorization checks are done, the kernel sends the requested information back to the caller.

Bullet File Server: The Bullet file server is the basic file server for Amoeba. It was designed for high performance and stores files contiguously on disk. Typically an entire file is sent with a single RPC, but some larger files may require multiple RPCs. It was designed to run on a dedicated machine. Also, no file can be larger than the amount of physical memory that exists. [1]

Directory Server: The Bullet server handles only one aspect of file handling: storing files. The directory server handles naming them and linking them to one another. A separate server is needed for this task because of the way Amoeba distributes files: they can be stored anywhere in the distributed network. Therefore a server is needed to keep track of where the directories are located and where the files that reside in those directories are located. The directory server works hand in hand with the Bullet server to handle file storage.
[1]

Features of Amoeba

Memory Management: Possibly the largest drawback of Amoeba is how memory is handled. Amoeba does not support virtual memory, and when an object is being processed all of its data must be loaded into memory. Therefore no file can be larger than the amount of physical memory available. In my opinion, this drawback is the leading cause of the system's decline.

Micro-Kernel: Because of the simple nature of the micro-kernel, other servers can be built upon it. This allows many user-created servers to be developed, including new file systems if desired. As a drawback, though, this large amount of freedom allows inexperienced users to damage the system.

Group Communication: Amoeba has a built-in mechanism for the one-to-many communication that many applications need. It guarantees delivery to every member of the group, in the order in which the messages were sent. This feature eliminates many of the problems that occur when dealing with distributed and parallel systems.

Compilers: A number of compilers are included in the Amoeba system for application development. A number of third-party compilers are also supported, one of which is the GNU C compiler.

Utilities: A number of the utilities packaged with Amoeba are based upon UNIX. There are also a number of newer utilities, such as amake, which is used to compile in parallel.

UNIX Emulation: Amoeba includes a UNIX emulation known as Ajax. It is compatible with POSIX P1003.1.

TCP/IP: Amoeba uses FLIP instead of TCP/IP, but it does have a server that can be used for TCP/IP connections. This allows users to access the Internet, and it is the main server for handling WAN connections.

X Windows: The X Window System is the standard system used for the user interface. A workstation can run a special version of X if it is running an X server.
Connection to UNIX: Amoeba has a special driver for connecting with the SunOS 4.1.1 (or higher) UNIX kernel. This UNIX driver allows for faster communication with SunOS computers, though it is still possible for Amoeba to communicate with non-UNIX systems by using TCP/IP. Transfers between UNIX and the Bullet server are handled by utilities provided with Amoeba.

Structure of Amoeba

There are four main parts to Amoeba: the workstations, a pool of processors, the specialized servers and the WAN routers. The workstations are the user interface for the system. The pool of processors is the backbone of Amoeba. The specialized servers include the Bullet file server and the directory server. Lastly, the WAN routers are used to connect distant Amoeba LANs together.

Workstation: There are three types of workstations: X window system workstations, engineering workstations and specialized workstations. The X window system workstation is the typical user's interface; it is from here that users access the system and run processes on it. Engineering workstations are used to access the system for management purposes. This is the interface that the system administrator uses to maintain the Amoeba system.

Processor Pool: The second part is the pool of processors. Processors are allocated to processes, run those processes, return the results and are then placed back in the pool. Each time a process runs, it is given a number of processors. The processors chosen are those without current jobs or, if all of them have jobs, those with the lightest load, which allows for an even distribution of the workload. This is where the speed increase of parallelism comes from: more CPUs are given the same job. The increase is not linear in the number of CPUs, however. The reasons are the overhead needed to start all the CPUs, the communication time between them and the fact that most processes cannot be divided evenly.
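The allocation rule and the non-linear speedup described above can be sketched in a few lines of Python. This is only an illustration, not Amoeba code: the processor names, the load values, the function names and the cost model (evenly divided work plus a fixed start-up and communication cost per CPU) are all assumptions made for the example.

```python
import heapq


def pick_processors(loads, count):
    """Pick `count` processors, preferring idle ones, then the least loaded.

    `loads` maps a processor name to its current number of jobs.
    The names and the load metric are hypothetical.
    """
    if count > len(loads):
        raise ValueError("not enough processors in the pool")
    # Idle processors have load 0, so ordering by load puts them first.
    chosen = heapq.nsmallest(count, loads.items(), key=lambda item: item[1])
    return [name for name, _ in chosen]


def estimated_time(work, cpus, overhead_per_cpu):
    """Toy cost model (an assumption): the work is divided evenly, and every
    CPU involved adds a fixed start-up and communication cost."""
    return work / cpus + overhead_per_cpu * cpus


if __name__ == "__main__":
    pool = {"cpu0": 3, "cpu1": 0, "cpu2": 1, "cpu3": 0, "cpu4": 5}
    print(pick_processors(pool, 3))
    # Speedup relative to one CPU grows more slowly than the CPU count.
    for cpus in (1, 2, 4, 8):
        speedup = estimated_time(100.0, 1, 0.5) / estimated_time(100.0, cpus, 0.5)
        print(cpus, round(speedup, 2))
```

Under this toy model the per-CPU overhead eventually outweighs the benefit of dividing the work further, which is one way to picture why adding processors gives diminishing returns.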
Specialized Servers: The third group is the specialized servers. This category includes the Bullet file server, the directory server and any other servers that might be introduced to the system. Details of the Bullet server and the directory server are in the definitions section.

WAN Router: The last group is the wide-area network (WAN) routers, or gateways. Their job is to link distant Amoeba local-area networks (LANs) together, allowing the Amoeba system as a whole to be spread anywhere in the world if so desired. [2]

Figure 1: A diagram showing a design for an Amoeba system. To the far left is the processor pool waiting for requests. At the top is a collection of workstations, at the bottom are the specialized servers and to the right is the WAN gateway. [2]

How to Use Amoeba

The first step in using Amoeba is to download the operating system from the Vrije Universiteit. Once it is set up, the user can log in from a workstation and begin computing. For creating programs there are a number of text editors available: elvis, which is based on the Berkeley vi editor; jove, which is based on Emacs; and ed, which is based on the UNIX ed editor. The programming languages supported by Amoeba are C, Pascal, Fortran 77, Basic and Modula 2. The del command is used to remove the entry of any object. The -f flag is needed to destroy the object completely.

Applications of Amoeba

The main use of Amoeba is to speed up processing for large computations, such as the traveling salesman problem, factoring, and building software from make files. All of these problems can potentially take a long time. Factoring is notoriously slow for large inputs. This is why factoring underlies current encryption techniques: it takes too long to brute-force. With Amoeba, a person could split up the process, giving each processor only part of the problem.
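A minimal sketch of this kind of split is shown below, in Python rather than on Amoeba itself. The function names are hypothetical, and a thread pool on one machine stands in for Amoeba's processor pool purely to show how the candidate range is divided and the partial results combined; it does not provide the real multi-CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n, workers):
    """Split 1..n into at most `workers` contiguous (start, end) chunks, inclusive."""
    size, extra = divmod(n, workers)
    chunks, start = [], 1
    for i in range(workers):
        # The first `extra` chunks take one more value so the range is covered.
        end = start + size - 1 + (1 if i < extra else 0)
        if end >= start:
            chunks.append((start, end))
        start = end + 1
    return chunks


def divisors_in(n, start, end):
    """Return the values in start..end that divide n evenly."""
    return [d for d in range(start, end + 1) if n % d == 0]


def find_divisors(n, workers=4):
    """Give each worker one chunk of candidates and merge the results in order."""
    chunks = split_range(n, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda chunk: divisors_in(n, *chunk), chunks)
        return [d for part in parts for d in part]


if __name__ == "__main__":
    print(split_range(1000, 4))
    print(find_divisors(1000, workers=4))
```

Each worker only examines its own chunk, so no coordination is needed until the partial lists are merged at the end.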
For example, to find the factors of 1000 with 4 processors, the problem could be broken down so that each processor checks whether the values in one range divide 1000: 1-250, 251-500, 501-750 and 751-1000. Each processor then searches only part of the problem, shortening the total time.

The traveling salesman problem is a long-standing problem. It is set up as follows: there are a number of cities that a salesman must visit. He may visit each city only once and must return to the starting city at the end. The problem is to find the shortest route. The trick to solving it is to split the processor pool: allocate a few processors to each city, and then each group of processors can subdivide the problem further. So one group of processors explores the routes that start by going to city A, another explores the routes that start by going to city B, and so on. [3]

Significant Points

- The system is free
- It has not had an official update in over 10 years
- Can use older/slower CPUs to create a powerful system
- Micro-kernel allows for other file systems to be created
- Has four main goals
  o Distribution
  o Parallelism
  o Transparency
  o Performance
- Has many UNIX-like commands and programs
- Can only hold programs as large as its physical memory

Summary

Amoeba was a distributed parallel system developed at the Vrije Universiteit in Amsterdam, Netherlands. It worked by using a micro-kernel and RPCs to help speed up the performance of the system as a whole, allowing slower, less powerful CPUs to be used. Though it has not been updated in over 10 years, it still contains fundamental ideas that are useful in understanding and developing today’s parallel systems.

References

[1] Tanenbaum, A.S., Sharp, G.J. “The Amoeba Distributed Operating System” Online: 2006 http://www.cs.vu.nl/pub/amoeba/Intro.pdf
[2] Ramsay, M., Keigel, T., Memmer, H. “Amoeba Distributed Operating System” Online http://csserver.evansville.edu/~mr56/CS470/Final_Draft.pdf
[3] Baggett, Ken “The Amoeba Distributed Operating System” Online: 2006 http://www.pcs.cnu.edu/~mzhang/CPSC450_550/Case Study/2006/KenBaggett_Amoeba.doc
[4] Distributed Operating Systems Amoeba, Fireball Software Distributions Online: 2006 http://fsd-amoeba.sourceforge.net/
[5] Kaashoek, M. Frans, Renesse, Robbert van, Staveren, Hans van, Tanenbaum, Andrew S. “FLIP: an Internetwork Protocol for Supporting Distributed Systems” Online: 2006 http://www.mcs.drexel.edu/~bmitchel/course/mcs721/papers/flip.pdf