IS473 Distributed Systems
CHAPTER 6
Operating System Support
OUTLINE
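[Figure: System layers, from top to bottom: Applications; Middleware; Operating system; Computer and network hardware. The operating system and the hardware together form the platform.]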
Dr. Almetwally Mostafa
OUTLINE
• Distributed Operating System.
• Operating System Layer.
• Processes and Threads.
• Communication and Invocation.
• Operating System Architecture.
Network and Distributed OS
• Network operating system:
  - Has networking capability to access remote resources.
  - Retains autonomy (independence) in managing its own resources.
  - Remote resource access is not always transparent.
  - Separate system image on each node.
• Distributed operating system:
  - Single system image across multiple nodes.
  - Resource access is completely transparent.
  - Not in practical use, because:
    • Users need compatibility with existing applications.
    • Emulations of existing systems offer very poor performance.
    • Users prefer a degree of autonomy for their machines.
• The combination of middleware and network operating systems
  provides an acceptable balance between the requirement for
  autonomy and network-transparent resource access.
Operating System Layer
[Figure: System layers. Applications and services run over middleware, which spans Node 1 and Node 2. Beneath the middleware, each node has its own OS (OS1, OS2: kernel, libraries and servers, providing processes, threads, communication, ...) running on its own computer and network hardware. The OS and hardware on each node form the platform.]
Operating System Layer
• User satisfaction is achieved if the middleware-OS combination
  has good performance.
  - The OS running at a node provides its own abstractions of local hardware
    resources for processing, storage and communication.
  - Middleware utilizes a combination of these local resources to implement its
    mechanisms for remote invocations between objects or processes.
• OSs provide support for the middleware layer to work effectively:
  - Encapsulation: provide a transparent service interface to the resources of
    the computer.
  - Protection: protect resources from illegitimate access.
  - Concurrent processing: users/clients may share resources and access them
    concurrently.
  - Provide the resources needed for (distributed) services and
    applications to complete their tasks: communication and scheduling.
Operating System Layer
[Figure: Core OS functionality. Components: process manager, thread manager, communication manager, memory manager and supervisor.]
Operating System Layer
• The core OS components include the following:
  - Process manager:
    • Handles the creation of processes and operations upon them.
  - Thread manager:
    • Handles thread creation, synchronization and scheduling.
  - Communication manager:
    • Handles communication between threads attached to different processes
      on the same computer.
    • Some OSs also support communication between threads in remote processes.
  - Memory manager:
    • Manages physical and virtual memory.
  - Supervisor:
    • Dispatches interrupts, system call traps and other exceptions.
Processes and Threads
• Process (program in execution):
  - Unit of resource management for the operating system.
  - Consists of an execution environment:
    • Address space.
    • Thread synchronization and communication resources (e.g. semaphores).
    • Computing resources (file systems, windows, etc.).
  - Expensive to create and manage.
• Threads (lightweight processes):
  - Schedulable activities attached to processes.
  - Arise from the need for concurrent activities to share resources within
    one process.
    • Enable computation to overlap with input and output.
    • Allow concurrent processing of client requests in servers, with each request
      handled by one thread.
  - Cheaper to create and destroy than processes.
Processes and Threads
Address Spaces
• An address space is the unit of management of a process's virtual memory.
• It is large and consists of one or more regions, separated by
  inaccessible areas of virtual memory.
• A region is an area of contiguous virtual memory accessible
  by the threads of the owning process.
• Each region is specified by:
  - Its lowest virtual address and size.
  - Read/write/execute permissions for the process's threads.
  - Whether it can be grown upwards or downwards.
• Gaps are left between regions so that a region can grow
  without overlapping its neighbours.
• Data files can be mapped into the address space as an array
  of bytes in memory.
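The last point, mapping a file into the address space, can be illustrated with a minimal Java sketch. It assumes the standard `java.nio` `FileChannel.map` call; the class name `MappedFileDemo`, its methods and the scratch file are hypothetical and exist only for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedFileDemo {
    // Maps the whole file into the process's address space and sums its bytes,
    // reading the file as if it were an in-memory byte array.
    static long sumMappedBytes(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer region =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long sum = 0;
            for (int i = 0; i < region.limit(); i++) {
                sum += region.get(i) & 0xFF; // treat each byte as unsigned
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical demo: writes four bytes to a scratch file, then maps and sums them.
    static long demo() {
        try {
            Path file = Files.createTempFile("mapped-demo", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, new byte[] {1, 2, 3, 4});
            return sumMappedBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 10
    }
}
```

No explicit read call appears in the loop: the operating system pages the file contents in on demand as the mapped region is touched.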
Processes and Threads
Address Spaces
• There are at least three main regions:
  - Text: a fixed, unmodifiable region containing program code.
  - Heap: an extensible region, partly initialized by values stored in the
    program's binary file.
  - Stack: a downward-extensible region used by subroutines.
• The need to support a separate stack for each thread is the main reason for
  allowing an indefinite number of regions.
• Shared regions are regions of virtual memory mapped to identical physical
  memory for different processes, to enable inter-process communication.
[Figure: Address space layout, from address 0 up to 2^N: Text, Heap, Stack, Auxiliary regions.]
Processes and Threads
New Process Creation
• Traditionally, process creation is an indivisible operation provided by the
  operating system.
  - The UNIX fork system call creates a process with an execution
    environment copied from the caller.
• In a distributed system, however, the creation of a new process can
  be separated into two independent aspects:
  - The choice of a target host.
  - The creation of an execution environment.
• Choice of process host:
  - Determine the node at which the new process will reside, according to
    transfer and location policies for sharing the processing load:
    • The transfer policy determines whether to situate a new process locally or
      remotely.
    • The location policy determines which node should host a new process.
Processes and Threads
New Process Creation
• Choice of process host (cont.):
  - Location policies may be static or adaptive:
    • Static location policies operate without regard to the current state of the
      system. They are based on a mathematical analysis aimed at optimizing the
      system as a whole, and may be deterministic or probabilistic.
    • Adaptive location policies apply heuristics to make the decision, based on
      unpredictable run-time factors on each node (such as its current load).
  - A load sharing system may be centralized, hierarchical or decentralized:
    • In a centralized system, one manager component takes the decisions.
    • In a hierarchical system, several managers are organized in a tree
      structure, and decisions are made as far down the tree as possible.
    • Nodes in a decentralized system exchange information with one
      another directly to make allocation decisions, using:
      - Sender-initiated algorithm: the node that requires a new process to be
        created is responsible for initiating the transfer decision.
      - Receiver-initiated algorithm: a node with relatively low load advertises its
        existence to other nodes, so that they can transfer work to it.
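A sender-initiated decision can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the transfer policy is assumed to be a fixed load threshold, and the adaptive location policy is assumed to pick the least-loaded node. The class `LoadSharing` and its method names are hypothetical.

```java
class LoadSharing {
    // Transfer policy (assumed threshold rule): create the process remotely
    // only when the local load exceeds the threshold.
    static boolean shouldTransfer(int localLoad, int threshold) {
        return localLoad > threshold;
    }

    // Adaptive location policy (assumed rule): choose the node whose
    // currently reported load is smallest.
    static int chooseHost(int[] loads) {
        int best = 0;
        for (int i = 1; i < loads.length; i++) {
            if (loads[i] < loads[best]) {
                best = i;
            }
        }
        return best;
    }

    // Sender-initiated: the node that needs the new process (self) decides
    // whether and where to transfer it.
    static int placeProcess(int self, int[] loads, int threshold) {
        if (!shouldTransfer(loads[self], threshold)) {
            return self; // keep the new process local
        }
        return chooseHost(loads);
    }

    public static void main(String[] args) {
        System.out.println(placeProcess(0, new int[] {5, 1, 3}, 4)); // prints 1
        System.out.println(placeProcess(0, new int[] {2, 1, 3}, 4)); // prints 0
    }
}
```

A static location policy would replace `chooseHost` with a rule that ignores the `loads` array, for example a fixed or randomly chosen target.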
Processes and Threads
New Process Creation
• Creation of a new execution environment:
  - There are two approaches to defining and initializing the address space
    of a newly created process:
    • The address space has a statically defined format and is initialized with zeros.
    • The address space is defined with respect to an existing execution
      environment.
      - In the case of UNIX fork, the newly created child process shares the parent's text
        region and has its own heap and stack regions.
  - Copy-on-write approach:
    • A general approach to the inheritance of all regions of the parent process by the
      child process.
    • An inherited region is logically copied from the parent's region by sharing
      its page frames between the two address spaces.
    • A page in a region is physically copied only when one or the other process
      attempts to modify it.
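The copy-on-write steps above can be simulated in user-level Java. This is a toy model, not how a kernel implements it: frames are plain byte arrays, the page table is an array of references, and a sharer count stands in for the write-protection fault. All names (`CopyOnWriteDemo`, `Frame`, `AddressSpace`) are hypothetical.

```java
class CopyOnWriteDemo {
    // A physical frame plus a count of the address spaces that map it.
    static final class Frame {
        final byte[] data;
        int sharers = 1;
        Frame(byte[] data) { this.data = data; }
    }

    // A toy address space with a single region, described by its page table.
    static final class AddressSpace {
        final Frame[] pageTable;
        AddressSpace(Frame[] pageTable) { this.pageTable = pageTable; }

        // Logical copy: the child shares every frame of the parent's region.
        AddressSpace fork() {
            Frame[] childTable = pageTable.clone();
            for (Frame f : childTable) {
                f.sharers++;
            }
            return new AddressSpace(childTable);
        }

        byte read(int page, int offset) {
            return pageTable[page].data[offset];
        }

        // Physical copy happens here, and only for the page being modified.
        void write(int page, int offset, byte value) {
            Frame f = pageTable[page];
            if (f.sharers > 1) {
                f.sharers--;
                f = new Frame(f.data.clone());
                pageTable[page] = f;
            }
            f.data[offset] = value;
        }
    }

    static AddressSpace create(int pages, int pageSize) {
        Frame[] table = new Frame[pages];
        for (int i = 0; i < pages; i++) {
            table[i] = new Frame(new byte[pageSize]);
        }
        return new AddressSpace(table);
    }

    public static void main(String[] args) {
        AddressSpace parent = create(2, 4);
        AddressSpace child = parent.fork();
        child.write(0, 1, (byte) 7);
        System.out.println(parent.read(0, 1)); // prints 0
        System.out.println(child.read(0, 1));  // prints 7
        System.out.println(parent.pageTable[1] == child.pageTable[1]); // prints true
    }
}
```

After the write, page 0 has a private copy in the child while page 1 is still backed by the shared frame.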
Processes and Threads
New Process Creation
[Figure: Copy-on-write. Region RB in process B's address space is copied from region RA in process A's address space. In the kernel, A's page table and B's page table both refer to a shared frame. Panels: (a) before write; (b) after write.]
Processes and Threads
Threads Performance
• Consider a server that has a pool of one or more threads.
• Each thread repeatedly removes a client request from a queue of
  received requests and processes it.
• Example (how multi-threading maximizes server throughput):
  - Request processing: 2 ms
  - I/O delay (no caching): 8 ms
  - Single thread:
    • 10 ms per request, 100 requests per second.
  - Two threads (no caching):
    • One thread can process a request while the other is blocked on I/O, so
      the I/O delay is the bottleneck: 8 ms per request, 125 requests per second.
  - Two threads and caching:
    • 75% hit rate.
    • Mean I/O time per request: 0.75 * 0 ms + 0.25 * 8 ms = 2 ms.
    • 500 requests per second.
    • Caching increases the processing time per request to 2.5 ms, so the
      processor becomes the bottleneck.
    • 400 requests per second.
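The arithmetic in this example can be reproduced with a short Java sketch. It assumes a simple bottleneck model (one processor, one disk, and enough threads to overlap processing with I/O); the class `ServerThroughput` and its method names are hypothetical.

```java
class ServerThroughput {
    // One thread: processing and I/O happen one after the other.
    static double singleThreaded(double cpuMs, double ioMs) {
        return 1000.0 / (cpuMs + ioMs);
    }

    // Several threads on one processor: processing overlaps with I/O,
    // so the slower of the two stages limits the request rate.
    static double multiThreaded(double cpuMs, double ioMs) {
        return 1000.0 / Math.max(cpuMs, ioMs);
    }

    // Mean I/O time per request when cache hits cost no I/O time.
    static double meanIoMs(double hitRate, double missIoMs) {
        return (1.0 - hitRate) * missIoMs;
    }

    public static void main(String[] args) {
        System.out.println(singleThreaded(2, 8));                   // prints 100.0
        System.out.println(multiThreaded(2, 8));                    // prints 125.0
        System.out.println(multiThreaded(2, meanIoMs(0.75, 8)));    // prints 500.0
        System.out.println(multiThreaded(2.5, meanIoMs(0.75, 8)));  // prints 400.0
    }
}
```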
Processes and Threads
Threads Performance
[Figure: Client and server with threads. Client: thread 1 generates results; thread 2 makes requests to the server. Server: input-output with receipt and queuing of requests, served by N threads.]
Processes and Threads
Multi-threaded Server Architectures
• There are various ways of mapping requests to threads
  within a server.
• The threading architectures of various implementations are:
  - Worker pool architecture:
    • A pool of server threads serves the requests in a queue.
    • Priorities can be supported by maintaining a queue per priority level.
  - Thread-per-request architecture:
    • A thread lives only for the duration of request handling.
    • Maximizes throughput (no queueing).
    • Expensive overhead for thread creation and destruction.
  - Thread-per-connection/per-object architecture:
    • Compromise solution.
    • No per-request overhead for creation and deletion of threads.
    • Requests may still block behind a busy thread, hence throughput is not maximal.
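The first two architectures can be contrasted in a small Java sketch. It assumes the standard `java.util.concurrent` executor API for the worker pool; the "request" is reduced to incrementing a counter, and the class `WorkerPoolServer` and its method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class WorkerPoolServer {
    // Worker pool: a fixed set of threads takes requests from a shared queue.
    static int workerPool(int workers, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicInteger handled = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> handled.incrementAndGet()); // enqueue one request
        }
        pool.shutdown(); // accept no new requests, finish the queued ones
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled.get();
    }

    // Thread-per-request: a new thread is created and destroyed for each request.
    static int threadPerRequest(int requests) {
        AtomicInteger handled = new AtomicInteger();
        Thread[] threads = new Thread[requests];
        for (int i = 0; i < requests; i++) {
            threads[i] = new Thread(() -> handled.incrementAndGet());
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return handled.get();
    }

    public static void main(String[] args) {
        System.out.println(workerPool(4, 100));   // prints 100
        System.out.println(threadPerRequest(20)); // prints 20
    }
}
```

The pool version pays thread creation cost once for its four workers, while the thread-per-request version pays it for every request.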
Processes and Threads
Multi-threaded Server Architectures
[Figure: Alternative server threading architectures. (a) Thread-per-request: I/O thread, workers, remote objects. (b) Thread-per-connection: per-connection threads, remote objects. (c) Thread-per-object: I/O thread, per-object threads, remote objects.]
Processes and Threads
Threads vs. Multiple Processes
• Why is the multi-threaded process model preferred to
  multiple single-threaded processes?
  - Creating a new thread within an existing process is cheaper than
    creating a process (~10-20 times).
    • New process under Unix: 11 ms; new thread under Topaz: 1 ms.
  - Switching to a different thread within the same process is cheaper
    than switching between threads in different processes (~5-50 times).
    • Process switch in Unix: 1.8 ms; thread switch in Topaz: 0.4 ms.
  - Threads within a process can share data and other resources more
    conveniently and efficiently (without copying or messages).
    • No need for message passing.
    • Communication via shared memory.
  - Drawback: threads within a process are not protected from each other.
    • One thread can access another thread's data, unless a type-safe
      programming language is being used.
Processes and Threads
Threads Programming
• Some languages provide direct support for concurrent programming
  with threads (e.g. C, Ada95, Modula-3 and Java).
• Java provides a Thread class that includes the following
  methods for creating, destroying and synchronizing threads:
  - Thread(ThreadGroup group, Runnable target, String name) - Creates a new
    thread in the SUSPENDED state, which will belong to group and be
    identified as name; the thread will execute the run() method of target.
  - setPriority(int newPriority), getPriority() - Set and return the thread's
    priority.
  - run() - A thread executes the run() method of its target object, if it has one,
    and otherwise its own run() method.
  - start() - Changes the state of the thread from SUSPENDED to RUNNABLE.
  - sleep(long millisecs) - Causes the thread to enter the SUSPENDED state for
    the specified time.
  - destroy() - Destroys the thread (deprecated in later Java releases).
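A minimal sketch of the constructor and methods listed under Threads Programming, using the standard `java.lang.Thread` API. The class `ThreadLifecycleDemo`, the group name "workers" and the thread name passed in are hypothetical. Note that the JVM reports the state of a thread that has not yet been started as `NEW`, which corresponds to what the slide calls SUSPENDED.

```java
class ThreadLifecycleDemo {
    // Creates a named thread in a group, starts it, waits for it to finish,
    // and returns the name the thread observed for itself while running.
    static String runWorker(String name) {
        String[] seen = new String[1];
        Runnable target = () -> seen[0] = Thread.currentThread().getName();
        ThreadGroup group = new ThreadGroup("workers");
        Thread worker = new Thread(group, target, name);
        worker.setPriority(Thread.NORM_PRIORITY);
        worker.start(); // the thread becomes runnable and executes target.run()
        try {
            worker.join(); // wait until run() has returned
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    // State of a thread that has been created but not yet started.
    static Thread.State stateBeforeStart() {
        Thread idle = new Thread(() -> { });
        return idle.getState();
    }

    public static void main(String[] args) {
        System.out.println(runWorker("worker-1")); // prints worker-1
        System.out.println(stateBeforeStart());    // prints NEW
    }
}
```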
Processes and Threads
Java Thread Lifetimes
• A new thread is created in the SUSPENDED state on the same
  Java Virtual Machine (JVM) as its creator.
• A thread executes its run() method after it is made RUNNABLE
  with the start() method.
• Threads can be assigned a priority, and Java implementations
  will run a particular thread in preference to any thread with
  lower priority.
• A thread ends its life when it returns from the run() method
  or when its destroy() method is called.
• Programs can manage threads in groups:
  - A thread's group is assigned at the time of its creation.
  - Thread groups are useful to shield various applications running in parallel
    on one JVM from one another.
  - A thread in one group may not interrupt a thread in another group.
  - Thread groups facilitate control of the relative priorities of threads.
Processes and Threads
Java Thread Synchronization
• Each thread's local variables in methods are private to it.
• An object can have synchronized and non-synchronized
  methods.
  - Example: synchronized addTo() and removeFrom() methods
    serialize requests in the worker pool example.
• At most one thread at a time can be executing any of an
  object's synchronized methods.
• Threads can be blocked and woken up via condition variables:
  - A thread awaiting a certain condition calls an object's wait() method.
  - Another thread calls notify() or notifyAll() to awaken one or all blocked
    threads.
  - Example:
    • When a worker thread discovers that there are no requests to be processed, it
      calls wait() on the instance of Queue.
    • When the I/O thread adds a request to the queue, it calls the queue's notify()
      method to wake up a worker.
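The queue example above can be sketched as follows. The method names `addTo()` and `removeFrom()` come from the slide; the class name `RequestQueue`, the `demo()` helper and the request string are hypothetical. The `while` loop around `wait()` is the usual guard against waking up when the queue is still empty.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class RequestQueue {
    private final Deque<String> requests = new ArrayDeque<>();

    // Called by the I/O thread: enqueue a request and wake one waiting worker.
    synchronized void addTo(String request) {
        requests.addLast(request);
        notify();
    }

    // Called by a worker thread: block until a request is available.
    synchronized String removeFrom() throws InterruptedException {
        while (requests.isEmpty()) {
            wait(); // releases the object's lock while blocked
        }
        return requests.removeFirst();
    }

    // Hypothetical demo: one worker waits for a request added by the caller.
    static String demo() {
        RequestQueue queue = new RequestQueue();
        String[] result = new String[1];
        Thread worker = new Thread(() -> {
            try {
                result[0] = queue.removeFrom();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        queue.addTo("GET /index.html");
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints GET /index.html
    }
}
```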
Processes and Threads
Java Thread Scheduling
• A thread can call the special yield() method to allow other
  threads to be scheduled and make progress.
• There are two types of thread scheduling:
  - Preemptive scheduling:
    • A thread may be suspended at any point to make way for another thread.
  - Non-preemptive scheduling:
    • A thread runs until it makes a call to the threading system, which may then
      de-schedule it and schedule another thread to run.
    • A code section without a threading system call is automatically a critical
      section.
    • Threads run exclusively, and therefore cannot take advantage of a multiprocessor.
    • Care must be taken with long-running code sections that do not contain
      threading system calls.
    • Unsuitable for real-time applications, in which events must be processed by
      absolute times.
Processes and Threads
Threads Implementation
• Many operating system kernels provide support for multi-threaded
  processes (e.g. Windows NT, Solaris, and Mach).
  - They provide system calls for thread creation and management, and
    schedule individual threads.
• Other operating systems have only a single-threaded process
  abstraction.
  - Multi-threaded processes are implemented by linking a user-level library of
    procedures to application programs.
  - Such user-level threads suffer from the following problems:
    • Threads within a process cannot take advantage of a multiprocessor.
    • A thread that takes a page fault blocks the entire process and all its threads.
    • Threads within different processes cannot be scheduled according to a single
      scheme of relative prioritization.
  - But they have significant advantages:
    • Thread operations such as creation are significantly less costly.
    • The thread scheduling module can be customized, and more user-level
      threads can be supported, to suit particular application requirements.
Processes and Threads
Threads Implementation
• The advantages of user-level and kernel-level thread
  implementations can be combined:
  - The Mach OS enables user-level code to provide scheduling hints to the
    kernel's thread scheduler.
  - Solaris 2 adopts hierarchical scheduling that supports both kernel-level
    and user-level threads.
    • A user-level scheduler assigns each user-level thread to a kernel-level thread.
    • Takes advantage of a multiprocessor.
    • Disadvantage: still lacks flexibility.
      - If a kernel-level thread is blocked, then all user-level threads assigned to it are
        also prevented from running.
  - Several research projects have developed hierarchical scheduling further
    to provide greater efficiency and flexibility:
    • The FastThreads implementation of a hierarchic, event-based scheduling system:
      - The main system components are a kernel running on a computer with
        one or more processors, and a set of application programs running on it.
      - Each application process contains a user-level scheduler to manage its threads.
      - The kernel is responsible for allocating virtual processors to processes.
Processes and Threads
Threads Implementation
[Figure: Scheduler activations. (A) Assignment of virtual processors to processes: the kernel assigns virtual processors to process A and process B. (B) Events between user-level scheduler and kernel: P added, SA preempted, SA unblocked, SA blocked, P idle, P needed. Key: P = processor; SA = scheduler activation.]
Communication and Invocation
Invocation Performance
• The performance of RPC and RMI mechanisms is a critical
  factor for effective distributed systems.
  - Clients and servers may make many millions of invocation-related
    operations in their lifetimes.
• Software overheads often predominate over network
  overheads in invocation times.
  - Invocation times have not decreased in proportion with increases in
    network bandwidth.
• Each invocation mechanism executes code outside the scope of the calling
  procedure or object, and involves communicating the arguments
  to that code and returning data values to the caller.
• The important performance-related distinctions between
  invocation mechanisms are:
  - Whether they are synchronous or asynchronous.
  - Whether they involve a domain transition.
  - Whether they involve communication across a network.
Communication and Invocation
Invocation Performance
[Figure: (a) System call: a thread crosses the protection domain boundary between user and kernel; control transfer via trap instruction and via privileged instructions. (b) RPC/RMI (within one computer): thread 1 in User 1 and thread 2 in User 2, via the kernel. (c) RPC/RMI (between computers): thread 1 in User 1 and Kernel 1, across the network to Kernel 2 and thread 2 in User 2.]
Communication and Invocation
Invocation Performance
• A null invocation is an RPC (or RMI) without parameters that
  executes a null procedure and returns no values.
  - Important for measuring a fixed overhead, the latency.
  - Execution time for a null procedure call:
    • Local procedure call: < 1 microsecond
    • Remote procedure call: ~ 10 milliseconds (about 10,000 times slower)
• Much of the delay is taken up by the actions of the operating
  system kernel and middleware.
  - Network time for about 100 bytes (null invocation size)
    transferred at 100 megabits/sec accounts for only about 0.01 millisecond.
• Factors affecting remote invocation performance:
  - Marshalling/unmarshalling and operation dispatch at the server.
  - Data copying: application -> kernel space -> communication buffers.
  - Thread scheduling and context switching.
  - Protocol processing: for each protocol layer.
  - Network access delays: connection setup, network latency.
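The figures quoted for the null invocation can be checked with a few lines of Java. The class `InvocationLatency` and its method names are hypothetical; the inputs are the slide's own numbers, with the local call taken as exactly 1 microsecond for the ratio.

```java
class InvocationLatency {
    // Time on the wire, in milliseconds, for a message of the given size.
    static double wireTimeMs(int bytes, double bitsPerSecond) {
        return bytes * 8 / bitsPerSecond * 1000.0;
    }

    // How many times slower a remote call is than a local one.
    static double slowdown(double remoteMs, double localMicros) {
        return remoteMs * 1000.0 / localMicros;
    }

    public static void main(String[] args) {
        // 100 bytes at 100 megabits/sec: 0.008 ms, i.e. roughly 0.01 ms.
        System.out.printf("%.3f ms on the wire%n", wireTimeMs(100, 100e6));
        // 10 ms remote call versus a 1 microsecond local call.
        System.out.printf("%.0f times slower%n", slowdown(10, 1));
    }
}
```

The wire time is about three orders of magnitude smaller than the 10 ms total, which is why the delay is attributed mainly to the operating system and middleware.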
Communication and Invocation
Invocation Performance
• Shared regions may be used for rapid communication between
  a user process and the kernel, or between user processes.
  - Data is communicated by writing to and reading from the shared region,
    without copying it to and from the kernel address space.
• The delay that a client experiences during request-reply
  interactions over TCP is not necessarily worse than over UDP, and it
  is sometimes better for large messages.
  - The operating system's default buffering can be used to collect several
    small messages and then send them together, rather than sending them
    in separate packets.
• A more efficient invocation mechanism for the case of
  two processes on the same computer is lightweight RPC (LRPC):
  - Based on optimizations concerning data copying and thread scheduling.
  - Uses shared regions for client-server communication, with a different
    private region (an A stack) between the server and each of its local clients.
  - Each client and the server are able to pass arguments and return
    values directly via an A stack.
Communication and Invocation
Invocation Performance
[Figure: A lightweight remote procedure call. The client and server stubs at user level share an A stack. Steps: 1. Copy args; 2. Trap to kernel; 3. Upcall; 4. Execute procedure and copy results; 5. Return (trap).]