Download Slides

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Recursive InterNetwork Architecture (RINA) wikipedia , lookup

Airborne Networking wikipedia , lookup

Dynamic Host Configuration Protocol wikipedia , lookup

Distributed operating system wikipedia , lookup

Lag wikipedia , lookup

Zero-configuration networking wikipedia , lookup

Remote Desktop Services wikipedia , lookup

Transcript
Distributed Systems
CS 15-440
Communication Paradigms (Cont’d) & Naming
Lecture 7, Sep 15, 2014
Mohammad Hammoud
 Last Session:
Today…
 A Complete Routing Example using the Bellman-Ford Algorithm
 Communication in Distributed Systems
 Inter-Process Communication
 Today’s Session:
 Part I: Communication Paradigms
 Remote Invocation and Indirect Communication
 Part II: Naming
 Naming Conventions and Name Resolution Algorithms
 Announcements:





P1 design report grades are out
PS1 grades will be out on Wednesday
PS2 is now posted. It is due on Saturday, Sep 27 by 11:59PM
Quiz 1 will be on Thursday, Sep 25 (during the recitation time)
Midterm exam will be on Wednesday, Oct 15 (during the class time)
Today…
Part I: Remote Invocation and
Indirect Communication
Communication Paradigms
Socket communication
Remote invocation
Indirect communication
Remote Invocation
Remote invocation enables an entity to call a procedure
that typically executes on an another computer without
the programmer explicitly coding the details of
communication
The underlying middleware will take care of raw-communication
Programmer can transparently communicate with remote entity
We will study two types of remote invocations:
a. Remote Procedure Calls (RPC)
b. Remote Method Invocation (RMI)
Remote Procedure Calls (RPC)
RPC enables a sender to communicate with a receiver using
a simple procedure call
No communication or message-passing is visible to the programmer
Basic RPC Approach
Machine A – Client
Client
Program
Machine B – Server
Communication
Module
Communication
Module
Request
…
add(a,b)
;
…
int add(int
x, int y) {
return
x+y;
}
Response
Client process
Client
Stub
Server
Procedure
Server process
Server Stub
(Skeleton)
Challenges in RPC
Parameter passing via Marshaling
Procedure parameters and results have to be
transferred over the network as bits
Data representation
Data representation has to be uniform
Architecture of the sender and receiver machines may differ
Challenges in RPC
Parameter passing via Marshaling
Procedure parameters and results have to be
transferred over the network as bits
Data representation
Data representation has to be uniform
Architecture of the sender and receiver machines may differ
Parameter Passing via Marshaling
Packing parameters into a message that will be
transmitted over the network is called parameter
marshalling
The parameters to the procedure and the result
have to be marshaled before transmitting them
over the network
Two types of parameters can passed
1. Value parameters
2. Reference parameters
1. Passing Value Parameters
Value parameters have complete information
about the variable, and can be directly encoded
into the message
e.g., integer, float, character
Values are passed through call-by-value
The changes made by the callee procedure are not
reflected in the caller procedure
2. Passing Reference Parameters
Passing reference parameters like value parameters in RPC
leads to incorrect results due to two reasons:
a. Invalidity of reference parameters at the server
Reference parameters are valid only within client’s address space
Solution: Pass the reference parameter by copying the data that is
referenced
b. Changes to reference parameters are not reflected back
at the client
Solution: “Copy/Restore” the data
Copy the data that is referenced by the parameter
Copy-back the value at server to the client
Challenges in RPC
Parameter passing via Marshaling
Procedure parameters and results have to be
transferred over the network as bits
Data representation
Data representation has to be uniform
Architecture of the sender and receiver machines may differ
Data Representation
Computers in DS often have different architectures and
operating systems
The size of the data-type differ
e.g., A long data-type is 4-bytes in 32-bit Unix, while it is 8-bytes
in 64-bit Unix systems
The format in which the data is stored differ
e.g., Intel stores data in little-endian format, while SPARC stores
in big-endian format
The client and server have to agree on how simple data is
represented in the message
e.g., format and size of data-types such as integer, char and float
Remote Procedure Call Types
Remote procedure calls can be:
Synchronous
Asynchronous (or Deferred Synchronous)
Synchronous vs. Asynchronous
RPCs
An RPC with strict request-reply blocks the client until
the server returns
Blocking wastes resources at the client
Asynchronous RPCs are used if the client does not need
the result from server
The server immediately sends an ACK back to client
The client continues the execution after an ACK from the server
Synchronous RPCs
Asynchronous RPCs
Deferred Synchronous RPCs
Asynchronous RPC is also useful when a client wants the
results, but does not want to be blocked until the call finishes
Client uses deferred synchronous RPCs
Single request-response RPC is split into two RPCs
First, client triggers an asynchronous RPC on server
Second, on completion, server calls-back client to deliver the results
Remote Method Invocation (RMI)
In RMI, a calling object can invoke a method on
a potentially remote object
RMI is similar to RPC, but in a world of
distributed objects
The programmer can use the full expressive power of
object-oriented programming
RMI not only allows to pass value parameters, but
also pass object references
Remote Objects and Supporting
Modules
In RMI, objects whose methods can be invoked remotely
are known as “remote objects”
Remote objects implement remote interface
During any method call, the system has to resolve whether
the method is being called on a local or a remote object
Local calls should be called on a local object
Remote calls should be called via remote method invocation
Remote Reference Module is responsible for translating
between local and remote object references
RMI Control Flow
Machine A – Client
Machine B – Server
Communication
Module
Obj A
Proxy
for B
Remote
Reference
Module
Communication
Module
Request
Response
Skeleton and
Dispatcher
for B’s class
Remote
Reference
Module
Remote
Obj B
Communication Paradigms
Socket communication
Remote invocation
Indirect communication
Indirect Communication
Recall: Indirect communication uses middleware to
Provide one-to-many communication
Mechanisms eliminate space and time coupling
Space coupling: Sender and receiver should know each other’s identities
Time coupling: Sender and receiver should be explicitly listening to each
other during communication
Approach used: Indirection
Sender  A Middle-Man  Receiver
Middleware for Indirect Communication
Indirect communication can be achieved through:
1. Message-Queuing Systems
2. Group Communication Systems
Middleware for Indirect Communication
Indirect communication can be achieved through:
1. Message-Queuing Systems
2. Group Communication Systems
Message-Queuing (MQ) Systems
Message Queuing (MQ) systems provide space and time
decoupling between sender and receiver
They provide intermediate-term storage capacity for messages
(in the form of Queues), without requiring sender or receiver to
be active during communication
1. Send message
to the receiver
Sender
Receiver
Traditional Request Model
1. Put message
into the queue
Sender
2. Get message
from the queue
MQ
Receiver
Message-Queuing Model
Space and Time Decoupling
MQ enables space and time decoupling between sender
and receivers
Sender and receiver can be passive during communication
However, MQ has other types of coupling
Sender and receiver have to know the identity of the queue
The middleware (queue) should be always active
Space and Time Decoupling (cont’d)
Four combinations of loosely-coupled communications
are possible in MQ:
Sender
MQ
Recv
1. Sender active; Receiver active
Sender
MQ
Recv
3. Sender passive; Receiver active
Sender
MQ
Recv
2. Sender active; Receiver passive
Sender
MQ
Recv
4. Sender passive; Receiver passive
Interfaces Provided by the MQ System
Message Queues enable asynchronous communication
by providing the following primitives to the applications:
Primitive
Meaning
PUT
Append a message to a specified queue
GET
Block until the specified queue is nonempty, and
remove the first message
POLL
Check a specified queue for messages, and remove
the first. Never block
NOTIFY
Install a handler (call-back function) to be called when
a message is put into the specified queue
Architecture of an MQ System
The architecture of an MQ system has to
address the following challenges:
a. Placement of the Queue
Is the queue placed near to the sender or receiver?
b. Identity of the Queue
How can sender and receiver identify the queue location?
c. Intermediate Queue Managers
Can MQ be scaled to a large-scale distributed system?
a. Placement of the Queue
Each application has a specific pattern of inserting and
receiving the messages
MQ system is optimized by placing the queue at a
location that improves performance
Typically, a queue is placed in one of the two locations
Source queues: Queue is placed near the source
Destination queues: Queue is placed near the destination
Examples:
“Email Messages” is optimized by the use of destination queues
“RSS Feeds” requires source queuing
b. Identity of the Queue
In MQ systems, queues are generally addressed
by names
However, the sender and the receiver should be
aware of the network location of the queue
A naming service for queues is necessary
Database of queue names to network locations is
maintained
Database can be distributed (similar to DNS)
c. Intermediate Queue Managers
Queues are managed by Queue Managers
Queue Managers directly interact with sending and receiving processes
However, Queue Managers are not scalable in dynamic
large-scale Distributed Systems (DSs)
Computers participating in a DS may change (thus changing the
topology of the DS)
There is no general naming service available to dynamically map queue
names to network locations
Solution: To build an overlay network (e.g., Relays)
c. Intermediate Queue Managers
(Cont’d)
Relay queue managers (or relays) assist in building
dynamic scalable MQ systems
Relays act as “routers” for routing the messages from sender to
the queue manager
Machine A
Application 1
Relay 1
Machine B
Application
Relay 1
Application 2
Relay 1
Machine C
Application
Middleware for Indirect Communication
Indirect communication can be achieved through:
1. Message-Queuing Systems
2. Group Communication Systems
Group Communication Systems
Group Communication systems enable one-to-many
communication
A prime example is multicast communication support for
applications
Multicast can be supported using two approaches
1. Network-level multicasting
2. Application-level multicasting
1. Network-Level Multicast
Each multicast group is assigned a unique
IP address
Applications “join” the multicast group
Multicast tree is built by connecting routers
and computers in the group
Recv 2
Sender
Recv 1
Network-level multicast is not scalable
Each DS may have a number of multicast
groups
Each router on the network has to store
information for multicast IP address for each
group for each DS
Recv 3
2. Application-Level Multicast (ALM)
ALM organizes the computers involved in a DS into an
overlay network
The computers in the overlay network cooperate to deliver
messages to other computers in the network
Network routers do not directly participate in the group
communication
The overhead of maintaining information at all the Internet
routers is eliminated
Connections between computers in an overlay network may
cross several physical links. Hence, ALM may not be optimal
Building Multicast Tree in ALM
Initiator (root) generates a multicast
identifier mid
If node P wants to join, it sends a join
request to the root
When request arrives at Q (any node):
If Q has not seen a join request before, it
becomes a forwarder
P becomes child of Q. Join request continues
to be forwarded
= Overlay link
= Member
N0
N1
N2
N4
N5
N6
N7
= Active link for Multicasting
= Forwarder
N3
= Non-member
Today…
Part II: Naming (Introduction)
Naming
Names are used to uniquely identify entities in Distributed
Systems
Entities may be processes, remote objects, newsgroups, …
Names are mapped to entities’ locations using name resolution
An example of name resolution
Name
http://www.cdk5.net:8888/WebExamples/earth.html
DNS Lookup
Resource ID (IP Address, Port, File Path)
55.55.55.55
MAC address
02:60:8c:02:b0:5a
8888
WebExamples/earth.html
Host
Names, Addresses and Identifiers
An entity can be identified by three types of references
1. Name
A name is a set of bits or characters that references an entity
Names can be human-friendly (or not)
2. Address
Every entity resides on an access point, and access point has an address
Addresses may be location-dependent (or not)
e.g., IP Address + Port
3. Identifier
Identifiers are names that uniquely identify entities
A true identifier is a name with the following properties:
a. An identifier refers to at-most one entity
b. Each entity is referred to by at-most one identifier
c. An identifier always refers to the same entity (i.e. it is never reused)
Naming Systems
A naming system is simply a middleware
that assists in name resolution
Naming systems are classified into three
classes based on the type of names used:
a. Flat naming
b. Structured naming
c. Attribute-based naming
Classes of Naming
Flat naming
Structured naming
Attribute-based naming
Flat Naming
In Flat Naming, identifiers are simply random bits of strings
(known as unstructured or flat names)
A flat name does not contain any information on how to
locate an entity
We will study four types of name resolution mechanisms
for flat names:
1.
2.
3.
4.
Broadcasting
Forwarding pointers
Home-based approaches
Distributed Hash Tables (DHTs)
1. Broadcasting
Approach: Broadcast the name/address to the complete
network. The entity associated with the name responds
with its current identifier
Example: Address Resolution Protocol (ARP)
Resolve an IP address to a MAC address
In this application,
Who has the address
192.168.0.1?
IP address is the address of the entity
MAC address is the identifier of the
access point
Challenges:
Not scalable in large networks
I am 192.168.0.1. My identifier
is 02:AB:4A:3C:59:85
This technique leads to flooding the network with broadcast messages
Requires all entities to listen to all requests
2. Forwarding Pointers
Forwarding Pointers enable locating mobile entities
Mobile entities move from one access point to another
When an entity moves from location A to location B, it
leaves behind (at A) a reference to its new location at B
Name resolution mechanism
Follow the chain of pointers to reach the entity
Update the entity’s reference when the present location is found
Challenges:
Reference to at-least one pointer is necessary
Long chains lead to longer resolution delays
Long chains are prone to failure due to
broken links
Forwarding Pointers – An Example
Stub-Scion Pair (SSP) chains implement remote invocation for
mobile entities using forwarding pointers
Server stub is referred to as Scion in the original paper
Each forwarding pointer is implemented as a pair:
(client stub, server stub)
The server stub contains a local reference to the actual object or a local
reference to another client stub
When object moves from A to B,
It leaves a client stub at A
It installs a server stub at B
Process P2
Process P1
Process P3
Process P4
n
= Process n;
= Remote Object;
= Caller Object;
= Server stub;
= Client stub
3. Home-Based Approaches
Each entity is assigned a home node
Home node is typically static (has fixed access point and address)
Home node keeps track of current address of the entity
Entity-home interaction:
Entity’s home address is registered at a naming service
Entity updates the home about its current address (foreign address) whenever it
moves
Name resolution
Client contacts the home to obtain the foreign address
Client then contacts the entity at the foreign location
3. Home-Based Approaches – An example
Example: Mobile-IP
1. Update home node about the
foreign address
Mobile entity
3a. Home node forwards the
message to the foreign address
of the mobile entity
Home node
3b. Home node replies the client
with the current IP address of
the mobile entity
4. Client directly sends all
subsequent packets directly to the
foreign address of the mobile entity
2. Client sends the packet to the
mobile entity at its home node
3. Home-Based Approaches – Challenges
Home address is permanent for an entity’s lifetime
If the entity permanently moves, then a simple home-based
approach incurs higher communication overhead
Connection set-up overheads due to communication
between the client and the home can be excessive
Consider the scenario where the clients are nearer to the mobile
entity than the home entity
Next Class
Flat naming (Cont’d)
Structured naming
Attribute-based naming