M301: Software Systems & their Development
Block 3: Concurrency 2
Unit 2: IPC in Non-shared Memory (II)

In the last two units of Block 2 and the first unit of Block 3, you studied IPC in both a shared-memory and non-shared-memory setting. In particular you studied the conventional client–server technology, which is based on passing messages between host machines using sockets and data streams. In this unit you will study a different mechanism, known as remote procedure calls (RPC). The RPC mechanism is based on synchronous procedure calls. In this unit we shall first explore the RPC mechanism and then study Java’s version of RPC known as remote method invocation, or RMI for short. You will see that RMI is at a higher level of abstraction than client–server technology: it is easier to use and hence less error prone.

Section 1: RPC and Java's RMI mechanism

In Unit 3.1 we studied the message passing mechanism for IPC in non-shared memory. In this unit we explore the other mechanism available for distributed IPC, which is a procedural form of communication. This mechanism is known as remote procedure call (RPC) and its implementation in Java as remote method invocation (RMI). As you learned in Unit 3.1, the object model of IPC exploits the object-oriented paradigm in which objects invoke the methods of other objects, including those residing on remote machines. Java provides the facilities to implement the object model through the RMI mechanism. In this section you will first learn about the general principles of RPC. You will then learn more about the object model by being introduced in a practical way to how the RMI facilities (contained mainly in the java.rmi package) can be used to develop a simple system based on the object model.
Objectives: On completing this section you should be able to:
• describe how issues related to concurrency can be separated from issues related to distribution;
• distinguish between message passing and RPCs;
• describe the RRA protocol for an RPC implemented on an unreliable transport service;
• describe the problems that arise when a client, server or network crashes and when the network becomes congested, and explain how the RPC protocol deals with the problems;
• explain the difference between a transparent and non-transparent approach to integrating RPCs into a programming language;
• explain why an RPC is not a sensible mechanism for all types of communication;
• explain the issues related to the naming of objects in a distributed IPC system;
• explain Java’s RMI mechanism for implementing an RPC system of distributed IPC;
• describe the important classes in Java’s RMI packages;
• compare the RMI model of distributed IPC with the client–server model.

Procedures and procedure calls

Recall that an application can consist of several procedures (or methods, in OO terminology) and that there is a single thread of control as one procedure calls another. In effect, a procedure call is an instruction that enables data to be exchanged between the calling procedure and the invoked procedure. The execution of the calling procedure is stopped while the invoked procedure executes, and the calling procedure resumes execution when the invoked procedure returns its results. When a procedure (method) is invoked (called), one or more arguments may have to be passed to the procedure. In essence, there are two ways to do this: objects are passed by reference, while the values of primitive data types are passed by value. What this means is that:
• for objects, the reference to the object held in the actual argument is passed to the formal argument, so that the formal argument references the same object as the actual argument (reference semantics);
• for values, a copy of the value in the actual argument is passed to the formal argument (copy semantics).

As you will learn in this section, the mechanism with regard to objects is sometimes different for remote procedure calls (i.e. calls to a procedure on a remote node). In such cases, the object referenced by the actual argument of the call may itself be copied across the network to the node where the procedure will be executed. In other words, the argument in this case is being passed by value and not by reference.

Summary of Chapter 16 of Bacon and Harris

16.6 Distributed programming paradigms

16.6.1 Synchronous and asynchronous communication

In synchronous communication, the sender (the process that makes a remote invocation) waits (it is blocked) until it receives a response from the receiver. In asynchronous communication, the sender continues to perform some operations after the message has been sent and before it may have to wait for a result to come back. In general, synchronous communication is considered easier to use than asynchronous communication because it is at a higher level of abstraction (similar to method invocation in a programming language).

We can have the best of both synchronous and asynchronous communication if we have a FORK primitive available. A FORK operation creates a new process, a child process, dynamically; the child process can wait for the result of the remote invocation while the parent process continues to execute. In due course, the child and its parent may synchronize, using the JOIN primitive, so that the result of the remote invocation can be passed from the child process to the parent process (see Figure 16.12).
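The FORK and JOIN pattern described above can be sketched in Java with threads. This is only an illustration under stated assumptions: slowRemoteSum is a hypothetical stand-in for a blocking remote invocation (it merely sleeps and adds), and the class and method names are invented for this sketch rather than taken from Bacon and Harris.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ForkJoinSketch {

    // Hypothetical stand-in for a blocking remote invocation.
    public static int slowRemoteSum(int a, int b) {
        try {
            Thread.sleep(50); // simulate network and server delay
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return a + b;
    }

    // FORK a child that blocks on the call, let the parent carry on, then JOIN.
    public static int forkAndJoin(int a, int b) {
        AtomicInteger result = new AtomicInteger();
        Thread child = new Thread(() -> result.set(slowRemoteSum(a, b)));
        child.start(); // FORK: the child waits for the "remote" reply

        // The parent is free to do other work here while the child is blocked.

        try {
            child.join(); // JOIN: synchronize and pick up the child's result
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println("1 + 2 = " + forkAndJoin(1, 2));
    }
}
```

The child thread plays the role of the distributed sequential process: it makes one synchronous call and does nothing else, while concurrency is handled entirely by the parent's decision to fork.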
For the above model to work, it is necessary to have a kernel that supports multi-threaded processes. If the kernel does not support multi-threaded processes, and one of the threads (i.e. a child process) belonging to a parent process blocks for remote communication, the operating system will consider the parent process (including all its threads) to be blocked as well. In other words, the kernel must support multi-threaded processes or else a parent process that creates a child process that blocks will also become blocked.

A way of viewing this model (when processes can be created dynamically) is that concurrency and distribution issues have been separated. Concurrency is achieved by having two processes: the parent and the child it creates using the FORK primitive. Distributed programming is achieved through the child process that invokes the remote operation. Such a child process is called a distributed sequential process. The child process is said to be ‘sequential’ because the communication between it and the remote process is synchronous (i.e. the child waits for a reply).

16.6.2 Procedural versus message passing style

A procedure is a programming construct consisting of a body, containing a set of sequential instructions, and a heading, consisting of a name and a set of arguments (by which the caller can exchange data with the procedure when the procedure is invoked). The sequence of actions in invoking a procedure is:
• the caller passes data to the procedure via the arguments;
• the procedure is executed;
• the results are passed back to the caller, which then continues its execution.

In an OO language like Java, a procedure is implemented as a method. A remote procedure call (RPC) is a call to a procedure located on a machine that is different (remote) from the machine from which the call was made.

Message passing scheme vs. RPC

A message passing scheme may be asynchronous. An RPC is always synchronous, in the sense that the calling procedure, having invoked a procedure on a remote machine, waits for the result before continuing with its processing. In short, compared with the use of sockets and unstructured data streams, or even asynchronous message passing, a call to a method or procedure which happens to be remote (i.e. RPC) is easier and less problematic.

16.7 Remote procedure call (RPC)

An RPC system consists of a communications protocol, which typically sits on top of a transport-level service (e.g. TCP/IP or UDP/IP), and language-level routines concerned with assembling the data to be passed by the protocol. It is also necessary to provide a mechanism for binding remote procedure names to network addresses.

16.7.1 An RPC system

Figure 16.13 outlines the system components that are involved when a remote procedure call is made. A request-reply-acknowledge (RRA) protocol is assumed. In RRA, the calling system (client) sends a request to the called system (server); the server sends a reply, which also acts as an acknowledgement that the request was received; and the client, on receipt of the reply, sends an acknowledgement to confirm receipt of the reply. An alternative is request-acknowledge-reply-acknowledge (RARA), which is similar to RRA except that the called system also sends an explicit acknowledgement of the receipt of the original request. In RRA, it is assumed that the reply is likely to come back sufficiently quickly to function as an acknowledgement of the request.

How a typical RPC system works

First the operation of the RPC protocol when client, server and network are performing well will be described. In this case a procedure call is made which is detected (by some means, which will be discussed in Section 16.8) to be a remote procedure call. Arguments are specified as usual. Control passes to point A in the diagram.
At point A:
• the arguments are packed into a data structure suitable for transfer across the network as a message or packet;
• an RPC identifier is generated for this call;
• a timer is set.

The data is then passed to lower protocol levels for transportation across the network. Typically, this is done by making a system call into the kernel, as described in Section 16.4. At the called system the lower levels deliver it up to the RPC service level.

At point B:
• the arguments are unpacked from the network buffer data structure in a form suitable for making a local procedure call;
• the RPC identifier is noted.

The call is then made to the required remote procedure, which is executed at C. The return from the procedure is to the calling environment in the RPC system, point D.

At point D:
• the return arguments are packed into a network buffer;
• another timer is set.

On arrival at the calling system's RPC service level, point E:
• the return arguments are unpacked;
• the timer set at point A is disabled;
• an acknowledgement is sent for this RPC ID (the timer at D can then be disabled).

16.7.2 The RPC protocol with network or server congestion

The timer is used by the RPC service of the client to detect the possibility of network congestion or of a network or receiver failure.
1. If a reply is not received within a set time, the client assumes that a problem has arisen: either that the original request failed to get through or that the reply was lost. In such circumstances, the client may re-send the request.
2. If the original request did get through to the server, which did respond, the RPC identifier can be used by the RPC service of the server to detect that it has already responded.
3. In principle, in the face of failure or congestion, the client and server RPC services can repeat the sending of messages, replies and acknowledgements to ensure that the client does receive a reply to its request. This is known as exactly once RPC semantics.
4. Alternatively, if no attempt is made to re-send a request, the RPC protocol is said to have at most once RPC semantics.

It is important to recognize that even if an RPC service uses at most once RPC semantics (that is, does not retry), an application using the RPC service may itself ask for a request to be resent.

16.7.3 The RPC protocol with node crash and restart

In a local procedure call, caller and called procedures crash together. In a remote procedure call the following possibilities must be considered (the node containing the call is referred to as the client, while the server is the node containing the called procedure).

Client failure (see Figure 16.14)

An orphan is a remote procedure call whose client (i.e. the node making the call) crashes after the request has been sent. That is, there has been client failure with the result that the client is not able to deal with any reply generated by the orphan. The major problem is that there may have been state changes in the server (i.e. the node containing the called procedure) or other nodes as the result of acting on the original request. On restart, the client is likely to retry the RPC with the state of the server different from its state when the original request was made. When the client restarts, it will be the application that decides whether or not to retry the RPC. The RPC service sees only a request for a remote procedure call to be made and does not know that this is a repeat request. Therefore, it will view this request as a new one and generate a new RPC identifier.

Server failure (see Figure 16.15)

The server may fail before the call is received or at some point during the call (in all cases the client timeout at A will expire):
• after the RPC service receives the call but before the call to the remote procedure is made, point B in the figure;
• during the remote procedure invocation, C;
• after the remote procedure invocation but before the result is sent, D.
In all cases the client might repeat the call when the server restarts. In cases C and D this could cause problems, since the server could have made permanent state changes before crashing, resulting in an inconsistent state.

16.7.5 RPC and the ISO reference model

Figure 16.16 shows an RPC protocol in relation to the ISO layers.

16.8 RPC-language integration

The advantage of an RPC system is that, when integrated into a conventional programming language that uses procedure calls, the system developer can use the familiar procedural calling mechanisms to program in a distributed setting. In other words, the system developer does not need to know the details of how the (remote) call being made in the local application is sent to a process on another machine, carried out on that machine, and the result returned to the local application. All this is automatically carried out by the underlying implementation.

The first issue concerns whether an RPC mechanism should exhibit distribution transparency; that is, whether it should be indistinguishable from a local call as far as the application program is concerned. If it is to be indistinguishable, the compiler needs to detect that the call is remote and produce the code that will find the target of the call, and then use the underlying mechanisms that carry out the communication with the remote machine. These underlying mechanisms usually involve the use of a stub on both the local (client) and remote (server) machines. The role of stubs will be described in more detail in Bacon, Section 16.9. Basically, the two stubs, one on the local machine (from which the call is made) and the other on the remote machine (where the procedure is executed), are responsible for effecting the communication between the calling and called processes. This involves the marshalling and unmarshalling of arguments and return values.
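As a rough illustration of what the stubs and the RPC service do at points A and B of Section 16.7.1, the sketch below flattens a call into a stream of bytes, rebuilds it on the receiving side, and uses the RPC identifier to recognize a re-sent request (item 2 of Section 16.7.2). The wire layout (identifier, procedure name, then two integer arguments) and the names MarshalSketch, marshalCall and Server are assumptions made for this example; real RPC systems define their own formats.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class MarshalSketch {

    // Client side (point A): pack the RPC identifier, procedure name and arguments.
    public static byte[] marshalCall(long rpcId, String procedure, int a, int b) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeLong(rpcId);
            out.writeUTF(procedure);
            out.writeInt(a);
            out.writeInt(b);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Server side (point B): unpack requests and remember replies by RPC identifier.
    public static class Server {
        private final Map<Long, Integer> repliesById = new HashMap<>();
        private int executions = 0;

        public int handle(byte[] message) {
            try {
                DataInputStream in = new DataInputStream(new ByteArrayInputStream(message));
                long rpcId = in.readLong();
                String procedure = in.readUTF();
                int a = in.readInt();
                int b = in.readInt();

                Integer earlierReply = repliesById.get(rpcId);
                if (earlierReply != null) {
                    return earlierReply; // duplicate request: reply again, do not re-execute
                }
                if (!"sum".equals(procedure)) {
                    throw new IllegalArgumentException("unknown procedure: " + procedure);
                }
                executions++;
                int result = a + b; // the local procedure call, point C
                repliesById.put(rpcId, result);
                return result;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        public int executions() {
            return executions;
        }
    }

    public static void main(String[] args) {
        Server server = new Server();
        byte[] request = marshalCall(1L, "sum", 1, 2);
        System.out.println("first reply:  " + server.handle(request));
        System.out.println("re-sent reply: " + server.handle(request));
        System.out.println("executions:   " + server.executions());
    }
}
```

Sending the same request twice produces two replies but only one execution of the procedure, which is the property the RPC identifier is there to provide.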
Marshalling arguments is the activity of transforming the arguments of a procedure call into a form suitable for transmission via a network. The arguments of a procedure call are structured within the main memory and can consist of data values of various types and references to them. Since network communication is usually the transmission of a stream (sequence) of bytes, the procedure’s arguments have to be ‘flattened’ or marshalled into a stream of bytes. On arrival, the stream of bytes must be reformed (i.e. unmarshalled) into its original structure, and, to do this, information about the original structure must be transmitted in addition to the data.

The alternative to distribution transparency recognizes that remote calls are likely to take longer than local calls and also involve additional types of failure (e.g. network failure). Some languages, therefore, provide special syntax for a remote call so that the application programmer can take such eventualities into account.

16.9 Java's RMI: RPC in the general object model

The object model of IPC exploits the object-oriented paradigm in which objects invoke the methods of other objects, including those residing on remote machines. Java provides the facilities to implement the object model through a mechanism called remote method invocation (RMI).

16.9.1 RMI and the general object model

In the OO paradigm, a client object invokes a method of an object located on a (remote) server host. In fact, the execution of the method takes place on the remote host where the remote object resides. The difference between a client–server model as described in the previous unit (i.e. using sockets) and the OO paradigm might seem slight but it is significant. It is a non-trivial task to program the setting up of communication between clients and a server involving the creation of socket connections and the use of input and output streams. It is much easier if the application programmer can simply invoke a method of a remote object. In this way, the developer remains within the OO paradigm rather than being concerned with sending messages using streams.

Java's RMI: a Background

The remote object

One of the fundamental ideas about Java objects is that, when defining a class, it is often a good idea to do so in two parts: an interface and an implementation. A Java interface contains the signatures of the public methods that can be invoked on an object of the class. The implementation contains the bodies of the public methods together with any fields and private methods that may be required. This separation into interface and implementation has many advantages. In this section the main advantage is that a client (a user of a remote object) needs only to know about the interface of the remote object’s class, whereas the server (the implementer of the remote object) needs to know about both the interface and the implementation of the class.

The second idea is that an object is created only when a Java program (strictly, a Java thread) is executed. An object exists in main memory until either it is no longer referenced (in which case the garbage collector reclaims the storage it occupied) or the thread that created it ceases execution. Thus, there must be a thread in existence if an object is to exist.

Thus, when there is a remote object, the client must be aware of the object’s interface (to ensure that the remote method invocations made by the client can be checked by the compiler). Of course, the implementations of these methods must be known to the server because it is the server host that executes the methods. This means that we shall develop classes for remote objects in the form of an interface plus an implementation. Both are required by the server (you can’t have an implementation without the associated interface) but only a copy of the interface is required by the client.
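The split between interface and implementation can be sketched as follows. The names Greeter, GreeterImpl and InterfaceSketch are hypothetical and are not part of the course material; the object is used locally here, without being exported, purely to show that client-style code compiles against the interface alone.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

public class InterfaceSketch {

    // The remote interface: all that the client needs to know about.
    public interface Greeter extends Remote {
        String greet(String name) throws RemoteException;
    }

    // The implementation: needed only by the server.
    public static class GreeterImpl implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Client-style code: typed against the interface, so the compiler can check
    // the invocation without ever seeing the implementation class.
    public static String callThroughInterface(Greeter greeter, String name) {
        try {
            return greeter.greet(name);
        } catch (RemoteException e) {
            return "remote failure: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        Greeter greeter = new GreeterImpl();
        System.out.println(callThroughInterface(greeter, "M301"));
    }
}
```

Because every method of a remote interface declares RemoteException, the client-style code is forced to handle communication failure even though the call looks like an ordinary method invocation.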
The RMI Registry

The RMI registry is a Java program named rmiregistry which, when it is executing, maintains a list of references to objects that have been registered with it. An object is registered by a server using the rebind method from the class Naming (part of the java.rmi package). The first argument of the rebind method is a String that specifies:
1. where the registry is located, by quoting the IP address of its host (localhost can be used when the registry is on the same host as the server) and the port on which the registry listens for communications (by default this is 1099, and can be omitted);
2. a programmer-defined name by which clients can gain access to the object.

The second argument is a reference to the object itself.

There is an alternative method to rebind known as bind. The difference between them depends upon whether or not an object with the same string name already exists within the registry when these methods are invoked. If such an object already exists, rebind will replace its reference with a reference to the new object whereas bind will throw an exception.

A client gains access to a remote object by executing the method lookup, also from the Naming class. This method has a single String argument, as in the rebind method, that specifies where the registry is located (the registry host's IP address and port number) and the String name by which the registry knows the object. The registry program rmiregistry is part of the Java system.

There is another useful method in the Naming class, namely list. This method returns a list of all the names of objects currently supported by the registry.

It is important to recognize that the RMI registry is a non-persistent scheme. The registry forgets all its contents when the rmiregistry program stops.

The stub object

An important point to note from Bacon, Figures 16.20 and 16.21 is that the client process does not communicate directly with the remote object. Instead, there is a stub object on the client's host that acts as a proxy for the remote object. Therefore, whenever the client invokes a method on the remote object, the message is sent to the stub, which is responsible for marshalling the arguments and converting the message into a stream object capable of being transmitted to the remote host using Java’s socket and stream client–server mechanism. Once the message reaches the remote host, it is transformed into a normal method invocation applied to the remote object. Any response (return value) from the remote object follows the reverse procedure. The stub on the client converts the return message into a value that is returned to the object that made the original remote call.

The stub object is created from the class that implements the remote object using a special program named rmic, as we shall illustrate in the next subsection.

Overview of RMI
(Taken from COMP3320 Distributed Systems, Java Remote Method Invocation (RMI))

Java RMI allows the programmer to invoke the methods of a remote class using the same semantics as local function calls.

[Figure: the local machine (client) declares SampleServer remoteObject; int s; … and executes s = remoteObject.sum(1,2); the arguments 1,2 are sent to the remote machine (server), which executes public int sum(int a,int b){ return a + b;} and returns 3; the client then executes System.out.println(s);]

The general RMI architecture:
[Figure: remote machine and local machine]
• The server must first bind its name to the registry.
• The client looks up the server name in the registry to establish remote references.
• The stub serializes the parameters and sends them to the skeleton (the server-side counterpart, sometimes also called a stub); the skeleton invokes the remote method and serializes the result back to the stub.

Stub and Skeleton
[Figure: the call passes from the RMI client to the stub, then to the skeleton, then to the RMI server; the return follows the reverse path]
• When a client invokes a remote method, the call is first forwarded to the stub.
• The stub is responsible for sending the remote call over to the server-side skeleton.
• The stub opens a socket to the remote server, marshals the object parameters and forwards the data stream to the skeleton.
• A skeleton contains a method that receives the remote calls, unmarshals the parameters, and invokes the actual remote object implementation.

Steps for developing an RMI system:
1 Define the remote interface.
2 Develop the remote object by implementing the remote interface.
3 Develop the client program.
4 Compile the Java source files.
5 Generate the client stubs and server skeletons.
6 Start the RMI registry.
7 Start the remote server objects.
8 Run the client.

Step 1. Defining the Remote Interface

To create an RMI application, the first step is to define a remote interface between the client and server objects.

    /* SampleServer.java */
    import java.rmi.*;

    public interface SampleServer extends Remote {
        public int sum(int a, int b) throws RemoteException;
    }

Step 2. Develop the remote object by implementing the remote interface

• The server is a simple unicast remote server.
• Create the server by extending java.rmi.server.UnicastRemoteObject.
• The server uses the RMISecurityManager to protect its resources while engaging in remote communication.

    /* SampleServerImpl.java */
    import java.rmi.*;
    import java.rmi.server.*;
    import java.rmi.registry.*;

    public class SampleServerImpl extends UnicastRemoteObject implements SampleServer {
        SampleServerImpl() throws RemoteException {
            super();
        }

The server must bind its name to the registry; the client will look up the server name. Use the java.rmi.Naming class to bind the server name to the registry. In this example the name is “SAMPLE-SERVER”. In the main method of your server object, the RMI security manager is created and installed.

    /* SampleServerImpl.java */
        public static void main(String args[]) {
            try {
                // set the security manager
                // (RMISecurityManager is deprecated in later Java releases)
                System.setSecurityManager(new RMISecurityManager());
                // create a local instance of the object
                SampleServerImpl Server = new SampleServerImpl();
                // put the local instance in the registry
                Naming.rebind("SAMPLE-SERVER", Server);
                System.out.println("Server waiting.....");
            } catch (java.net.MalformedURLException me) {
                System.out.println("Malformed URL: " + me.toString());
            } catch (RemoteException re) {
                System.out.println("Remote exception: " + re.toString());
            }
        }

• Implement the remote methods

    /* SampleServerImpl.java */
        public int sum(int a, int b) throws RemoteException {
            return a + b;
        }
    }

Step 3. Develop the client program

In order for the client object to invoke methods on the server, it must first look up the name of the server in the registry. You use the java.rmi.Naming class to look up the server name. The server name is specified as a URL in the form rmi://host:port/name. The default RMI port is 1099. The name specified in the URL must exactly match the name that the server has bound to the registry. In this example, the name is “SAMPLE-SERVER”. The remote method invocation is programmed using the name of the remote object reference (remoteObject) as prefix and the remote method name (sum) as suffix.
    /* SampleClient.java */
    import java.rmi.*;
    import java.rmi.server.*;

    public class SampleClient {
        public static void main(String[] args) {
            // set the security manager for the client
            System.setSecurityManager(new RMISecurityManager());
            // get the remote object from the registry
            try {
                System.out.println("Security Manager loaded");
                String url = "//localhost/SAMPLE-SERVER";
                SampleServer remoteObject = (SampleServer) Naming.lookup(url);
                System.out.println("Got remote object");
                System.out.println(" 1 + 2 = " + remoteObject.sum(1, 2));
            } catch (RemoteException exc) {
                System.out.println("Error in lookup: " + exc.toString());
            } catch (java.net.MalformedURLException exc) {
                System.out.println("Malformed URL: " + exc.toString());
            } catch (java.rmi.NotBoundException exc) {
                System.out.println("NotBound: " + exc.toString());
            }
        }
    }

Step 4 & 5. Compile the Java source files & Generate the client stubs and server skeletons

Assume the programs are compiled and executed on the host elpis in the directory ~/rmi. Once the interface is completed, you need to generate the stub and skeleton code. The RMI system provides an RMI compiler (rmic) that takes your compiled implementation class (here SampleServerImpl) and produces the stub code on its own.

    elpis:~/rmi> set CLASSPATH="~/rmi"
    elpis:~/rmi> javac SampleServer.java
    elpis:~/rmi> javac SampleServerImpl.java
    elpis:~/rmi> rmic SampleServerImpl
    elpis:~/rmi> javac SampleClient.java

Step 6. Start the RMI registry

RMI applications need to register with the Registry, and the Registry must be started manually by running rmiregistry. The rmiregistry uses port 1099 by default. You can also bind rmiregistry to a different port by giving the new port number as an argument: rmiregistry <new port>

    elpis:~/rmi> rmiregistry

Remark: on Windows, you have to type the following at the command line:

    > start rmiregistry

Step 7 & 8. Start the remote server objects & Run the client

Once the Registry is started, the server can be started and will be able to store itself in the Registry. Because of the fine-grained security model in Java 2, you must set up a security policy for RMI by setting the java.security.policy property to the file policy.all.

    elpis:~/rmi> java -Djava.security.policy=policy.all SampleServerImpl
    elpis:~/rmi> java -Djava.security.policy=policy.all SampleClient

16.9.4 Comparison of RMI with stream and socket programming

• RMI is at a higher level of abstraction than socket-level programming. It enables the details of sockets and data streams to be hidden.
• RMI clients can invoke a server method directly, but socket-level programming allows only values to be passed, which must be decoded and turned into a method call by the server. In RMI this decoding is performed automatically by the stubs (marshalling).
• RMI programs are much easier to maintain than socket-level programs. An RMI server can be modified or moved to another host without the need to change the client application (apart from resetting the URL for locating the server).
• RMI is implemented using socket-level programming. Socket-level programming is low level and prone to error, and RMI should be used in preference (the relationship is similar to that between semaphores and monitors).
• RMI supports the idea of callbacks, where the server invokes methods on the client. This facility enables interactive distributed applications to be developed. This can be done using sockets and streams but is difficult to program.
• However, both synchronous and asynchronous communication can be programmed using streams and sockets, whereas RMI is synchronous only.

16.10 Critique of synchronous invocation

We argued in Section 16.6 that a blocking primitive is more manageable than a non-blocking one for implementing remote operation invocation. Two such primitives, RPC and RMI, were discussed in detail. We saw how to implement an RPC and an RMI, both of which are synchronous, blocking calls. A system design issue is whether this synchronous call (RPC or RMI) paradigm is sufficient to meet the communications needs of applications.
Some systems have a real-time requirement for the transfer of massive amounts of data (e.g. multimedia systems with real-time voice and video). It is unlikely that RPC will be sufficient for this purpose. RPC communication requires all the data to be received before the operation commences and is likely to have some maximum data size well below that required by the application. Also, for some systems in which the data to be transferred is simply a (large) stream of bytes, there is no requirement for sophisticated argument marshalling and, indeed, to have such a mechanism would slow up the transmission.

Block 3: Concurrency 2
Unit 2: IPC in Non-shared Memory (II)

Section 2: Using Java's RMI mechanism

Objectives: On completing this section you should be able to:
• explain Java’s RMI mechanism for implementing an RPC system of distributed IPC;
• describe the important classes in Java’s RMI packages;
• describe the main components of a small application which uses RMI;
• compare the RMI model of distributed IPC with the client–server model;
• use the IDE to run RMI applications;
• explain the role of the security manager.

IMPORTANT
• Read Section 9 of the IDE handbook.
• Carry out the practical activities 2.1 and 2.2 and read the remaining parts of this section in Block 3.
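As a complement to the practical activities, the development steps of Section 1 can be tried out inside a single JVM. The sketch below is a variant built on several assumptions and is not the course's code: it creates the registry in-process with LocateRegistry.createRegistry instead of running rmiregistry, uses the Registry object directly instead of a Naming URL, relies on the dynamically generated stubs of later Java releases instead of rmic, installs no security manager, and uses the hypothetical names Adder, AdderImpl and RmiLoopback.

```java
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

public class RmiLoopback {

    // Step 1: the remote interface (hypothetical name).
    public interface Adder extends Remote {
        int sum(int a, int b) throws RemoteException;
    }

    // Step 2: the remote object implementation.
    public static class AdderImpl implements Adder {
        @Override
        public int sum(int a, int b) {
            return a + b;
        }
    }

    // Steps 6 to 8 in one JVM: start a registry, register the object, look it up, call it.
    public static int remoteSum(int a, int b) {
        // Assumption: use the loopback address so the stub does not depend on
        // host name resolution; this must be set before RMI is first used.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        AdderImpl impl = new AdderImpl();
        try {
            Registry registry = LocateRegistry.createRegistry(0); // 0 = any free port
            try {
                // Exporting returns the stub that stands in for the remote object.
                Adder stub = (Adder) UnicastRemoteObject.exportObject(impl, 0);
                try {
                    registry.rebind("SAMPLE-SERVER", stub);
                    // Client side: obtain the stub by name and invoke through it.
                    Adder remoteObject = (Adder) registry.lookup("SAMPLE-SERVER");
                    return remoteObject.sum(a, b);
                } finally {
                    UnicastRemoteObject.unexportObject(impl, true);
                }
            } finally {
                UnicastRemoteObject.unexportObject(registry, true);
            }
        } catch (RemoteException | NotBoundException e) {
            throw new IllegalStateException("RMI loopback call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(" 1 + 2 = " + remoteSum(1, 2));
    }
}
```

Unexporting the object and the registry at the end lets the JVM exit; a real server would instead stay exported and wait for clients, as SampleServerImpl does.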