Lecture 14 - The University of Texas at Dallas

Synchronizing Processes
▸ Clocks
▹External clock synchronization (Cristian)
▹Internal clock synchronization (Gusella & Zatti)
▹Network Time Protocol (Mills)
▸ Decisions
▹Agreement protocols (Fischer)
▸ Data
▹Distributed file systems (Satyanarayanan)
▸ Memory
▹Distributed shared memory (Nitzberg & Lo)
▸ Schedules
▹Distributed scheduling (Isard et al.)
Agreement Problems
▸Require all non-faulty (or correct) processes to come
to an agreement
▸Three types of problems:
▹Consensus:
▸Each process Pi proposes a value vi and all non-faulty processes
agree on a consensus value c
▹Interactive Consistency:
▸Each process Pi proposes a value vi and all non-faulty processes
agree on a consensus vector c = <v1, v2, …, vN>
▹Byzantine (Generals or Reliable Broadcast):
▸One process Pg proposes a value vg and all non-faulty processes
agree on a consensus value c = vg
Relations Among the Problems
▸ The interactive consistency problem can be solved with a Byzantine protocol Bz
▸ The consensus problem can be solved with an interactive consistency protocol
▸ So the consensus problem can also be solved with a Byzantine protocol Bz
▹N copies of the Bz protocol are run in parallel, where each
processor Pi acts as the commander (Pg) for exactly one copy of
the protocol
▹The non-faulty processors use the majority vote of the
consensus vector as the consensus value
▸ Hence, a Byzantine protocol can solve all three problems
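This reduction can be sketched in Python. Here `byz_broadcast` is a hypothetical stand-in for any Byzantine protocol Bz, modeling the single value one run of Bz delivers to all non-faulty processes:

```python
from collections import Counter

def consensus_via_byzantine(values, byz_broadcast):
    """Reduce consensus to N parallel runs of a Byzantine protocol.

    values[i] is the value proposed by process Pi; byz_broadcast(i, v)
    models one run of Bz with Pi as commander (Pg) and returns the value
    delivered to all non-faulty processes for that run.
    """
    # Run one copy of Bz per process: each Pi commands exactly one copy.
    vector = [byz_broadcast(i, v) for i, v in enumerate(values)]
    # Non-faulty processes take the majority of the consensus vector.
    value, _ = Counter(vector).most_common(1)[0]
    return value

# Toy stand-in: with no faults, Bz simply delivers the commander's value.
honest_bz = lambda i, v: v
print(consensus_via_byzantine(["attack", "attack", "retreat"], honest_bz))
```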
The Byzantine Generals Problem
[Lamport, Shostak, & Pease, 1982.]
Basic idea is very similar to the consensus problem:
▸Each of N generals has a value v(i) (e.g., “attack” or “retreat”).
▸We want an algorithm to allow all generals to
exchange their values such that the following hold:
▹All non-faulty generals must agree on the values of
v(1),…,v(N).
▹If the i th general is non-faulty, then the value agreed for v(i)
must be the i th general’s value.
Byzantine Generals Problem
▸The problem described earlier can be solved by
restricting attention to one commanding general
and considering all others to be lieutenants.
▸A commanding general must send an order to
his N–1 lieutenants, such that:
IC1: All loyal lieutenants obey the same order.
IC2: If the commander is loyal, then loyal lieutenants obey the order he sends.
Oral Message Algorithm
▸Assumptions:
1. Every message that is sent is delivered correctly
2. The receiver of a message knows who sent it
3. The absence of a message can be detected
▹Assumptions #1 and #2 prevent a traitor from interfering
with the communication between two other generals
▹Assumption #3 foils a traitor who tries to prevent a
decision by simply not sending messages
▸Denoted OM(m), where m is the maximum number
of traitors the system can handle
Impossibility Theorem
▸If processes can only send unauthenticated
messages, more than two thirds of the processes
must be non-faulty to derive a solution
▸In other words, no solution exists for a system
with fewer than 3m + 1 nodes, where m is the
number of faulty processes
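A minimal check of this bound; the classic three-generals case with one traitor fails, while four generals suffice:

```python
def can_tolerate(n, m):
    """Oral-message Byzantine agreement is possible iff n >= 3m + 1."""
    return n >= 3 * m + 1

assert not can_tolerate(3, 1)  # three generals cannot mask one traitor
assert can_tolerate(4, 1)      # four generals can
assert not can_tolerate(6, 2)  # 6 < 3*2 + 1, so two traitors need seven
```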
Algorithm with Oral Messages
Algorithm OM(m) (defined recursively) tolerates m traitors.
Algorithm OM(0):
▹ Commander sends value to each lieutenant.
▹ Each lieutenant uses the value received from the commander (or
“retreat” if no message is received).
Algorithm OM(m), m > 0:
▹ Commander sends value to each lieutenant.
▹ Each lieutenant uses OM(m–1) to send the value received (take this
value to be “retreat” if not received) to the other N–2 lieutenants.
▹ Each lieutenant uses the majority of the values received from the
commander and the other lieutenants in the previous two steps.
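The recursion can be sketched as a Python simulation. This is a sketch under simplifying assumptions: every process runs in one address space, and a traitorous commander simply alternates conflicting orders; the names `om` and `majority` are mine, not from the paper:

```python
from collections import Counter

RETREAT, ATTACK = "retreat", "attack"

def majority(votes):
    # Majority of the received values; ties default to RETREAT.
    top = Counter(votes).most_common()
    if len(top) > 1 and top[0][1] == top[1][1]:
        return RETREAT
    return top[0][0]

def om(commander, lieutenants, value, m, traitors):
    """One run of OM(m); returns {lieutenant: value it uses}."""
    # Step 1: the commander sends his value to each lieutenant.  A
    # traitorous commander sends conflicting (alternating) orders.
    received = {}
    for idx, lt in enumerate(lieutenants):
        received[lt] = (ATTACK if idx % 2 == 0 else RETREAT) \
            if commander in traitors else value
    if m == 0:
        return received
    # Step 2: each lieutenant relays its value to the others via OM(m-1).
    relayed = {lt: om(lt, [p for p in lieutenants if p != lt],
                      received[lt], m - 1, traitors)
               for lt in lieutenants}
    # Step 3: each lieutenant takes the majority of all values it saw.
    return {i: majority([received[i]] +
                        [relayed[j][i] for j in lieutenants if j != i])
            for i in lieutenants}

# N = 4, m = 1, traitorous commander 0: loyal lieutenants still agree.
print(om(0, [1, 2, 3], ATTACK, 1, traitors={0}))
```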
Intuition
▸If the commander is loyal, then he sends the same
command to all lieutenants. In this case, the
lieutenants all agree on the correct command by
majority, as in the example.
▸If the commander is a traitor, then he may send
different commands to different lieutenants.
However, this leaves one fewer traitor among the
lieutenants, making it easier to reach agreement
among them. (When the commander is a traitor,
they can agree on any command.)
Signed Message Algorithm
▸Assumptions:
1. Every message that is sent is delivered correctly
2. The receiver of a message knows who sent it
3. The absence of a message can be detected
4. Signatures:
▸A loyal general’s signature cannot be forged, and any alteration of the contents of his signed messages can be detected
▸Anyone can verify the authenticity of a general’s signature
▸Denoted SM(m), the algorithm can cope with m
traitors for any number of generals
▹I.e., it is now possible to tolerate any number of traitors
SM(m) Algorithm
▸ Initially Vi = { }
1. The commander P0 signs and sends his value to every
lieutenant
2. If lieutenant i receives a message of the form v:0 from the
commander, then
▹ it adds v to Vi and
▹ sends the message v:0:i to every other lieutenant
3. If lieutenant i receives a message of the form v:0:j1:…:jk and v
is not in Vi then
▹ it adds v to Vi and
▹ if k < m, it sends the message v:0:j1:…:jk:i to every lieutenant other
than j1, …, jk
4. When lieutenant i will receive no more messages, it obeys
choice(Vi)
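A compact simulation of SM(m). Unforgeable signatures are modeled by the chain of signer ids carried with each message, and traitorous lieutenants are not modeled (the `orders` dict lets the commander act as a traitor by sending different signed values); these names are mine, not from the paper:

```python
from collections import deque

DEFAULT = "retreat"

def choice(V):
    # A lone order is obeyed; an empty or conflicting set (which exposes
    # a traitorous commander) falls back to the predetermined default.
    return next(iter(V)) if len(V) == 1 else DEFAULT

def sm(m, lieutenants, orders):
    """One run of SM(m).  orders[lt] is the signed value v:0 that the
    commander (process 0) sends lieutenant lt; a loyal commander sends
    everyone the same value."""
    V = {lt: set() for lt in lieutenants}
    # Each pending message is (recipient, value, signature chain).
    queue = deque((lt, orders[lt], (0,)) for lt in lieutenants)
    while queue:
        i, v, chain = queue.popleft()
        if v in V[i]:
            continue            # step 3: ignore already-seen values
        V[i].add(v)
        if len(chain) - 1 < m:  # k < m: co-sign and relay onward
            for j in lieutenants:
                if j != i and j not in chain:
                    queue.append((j, v, chain + (i,)))
    return {lt: choice(V[lt]) for lt in lieutenants}

# m = 1: a traitorous commander sends conflicting signed orders, but the
# lieutenants exchange them and reach the same decision.
print(sm(1, [1, 2], {1: "attack", 2: "retreat"}))
```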
Choice Function
▸The choice function is applied to a set of orders to
obtain a single one
▸Requirements:
▹If the set V consists of a single element v, then choice(V) = v
▹If the set V is empty, then choice(V) = a predetermined value
▸Possibilities:
▹choice(V) selects the majority of set V or a predetermined
value if there is not a majority
▹choice(V) selects the median of set V, if the elements of V
can be ordered
Choice Function
▸ The choice function is applied to a set of orders to obtain a
single one
▸ Requirements:
▹If the set V consists of a single element v, then choice(V) = v
▹If the set V is empty, then choice(V) = a predetermined value
▸ Basic Idea:
▹If the Commander is loyal, then all messages will be of the form
v:0:w*. (No forging.) So, all loyal lieutenants end up with Vi = {v}.
▹If the Commander is a traitor, then loyal lieutenants can detect it.
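The two candidate choice functions listed earlier can be sketched as follows. A tie (or a set of conflicting orders) yields the predetermined value, here assumed to be retreat; the default of 0 for the median variant is likewise an assumption for illustration:

```python
from collections import Counter
from statistics import median

DEFAULT = "retreat"

def choice_majority(V):
    # Majority choice over the received orders; empty sets and ties
    # fall back to the predetermined DEFAULT value.
    if not V:
        return DEFAULT
    if len(V) == 1:
        return next(iter(V))
    top = Counter(V).most_common()
    if len(top) > 1 and top[0][1] == top[1][1]:
        return DEFAULT
    return top[0][0]

def choice_median(V):
    # Median choice for orderable values (e.g. numeric estimates);
    # the empty-set default of 0 is an assumed predetermined value.
    if not V:
        return 0
    return median(sorted(V))
```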
SM(m) Example
▸ Scenario (m = 1, three generals): a traitorous commander sends Attack:0 to Lieutenant 1 and Retreat:0 to Lieutenant 2; Lieutenant 1 relays Attack:0:1 to Lieutenant 2, and Lieutenant 2 relays Retreat:0:2 to Lieutenant 1
▸ After step 3, V1 = V2 = {Attack, Retreat}
▸ Intuitively, both lieutenants can tell the commander is a traitor
▸ With no majority, choice would default to Retreat
Synchronizing Processes
▸ Clocks
▹External clock synchronization (Cristian)
▹Internal clock synchronization (Gusella & Zatti)
▹Network Time Protocol (Mills)
▸ Decisions
▹Agreement protocols (Fischer)
▸ Data
▹Distributed file systems (Satyanarayanan)
▸ Memory
▹Distributed shared memory (Nitzberg & Lo)
▸ Schedules
▹Distributed scheduling (Isard et al.)
Synchronizing Processes:
Distributed File Systems
CS/CE/TE 6378
Advanced Operating Systems
Distributed File Systems
▸Distributed File System (DFS):
▹a system that provides access to the same storage for
a distributed network of processes
(Diagram: distributed processes all accessing one common storage)
Benefits of DFSs
▸ Data sharing is simplified
▹ Files appear to be local
▹ Users are not required to specify remote servers to access
▸ User mobility is supported
▹ Any workstation in the system can access the storage
▸ System administration is easier
▹ Operations staff can focus on a small number of servers instead of a
large number of workstations
▸ Better security is possible
▹ Servers can be physically secured
▹ No user programs are executed on the servers
▸ Site autonomy is improved
▹ Workstations can be turned off without disrupting the storage
Design Principles for DFSs
▸ Utilize workstations when possible
▹ Opt to perform operations on workstations rather than servers to improve
scalability
▸ Cache whenever possible
▹ This reduces contention on centralized resources and transparently makes
data available whenever used
▸ Exploit file usage characteristics
▹ Knowledge of how files are accessed can be used to make better choices
▹ Ex: Temporary files are rarely shared, hence can be kept locally
▸ Minimize system-wide knowledge and change
▹ Scalability is enhanced if global information is rarely monitored or updated
▸ Trust the fewest possible entities
▹ Security is improved by trusting a smaller number of processes
▸ Batch if possible
▹ Transferring files in large chunks improves overall throughput
Quiz Question
▸Which of the following was not a design principle
from the Andrew and Coda file systems?
▹Cache whenever possible.
▹Decentralize operations when possible.
▹Minimize system-wide knowledge.
▹Trust the most possible entities.
Mechanisms for Building DFSs
▸ Mount points
▹ Enables filename spaces to be “glued” together to provide a single,
seamless, hierarchical namespace
▸ Client caching
▹ Contributes the most to better performance in DFSs
▸ Hints
▹ Pieces of information that can substantially improve performance if correct
but no negative consequence if erroneous
▹ Ex: Caching mappings of pathname prefixes
▸ Bulk data transfer
▹ Reduces network communication overhead by transferring in bulk
▸ Encryption
▹ Used for remote authentication, either with private or public keys
▸ Replication
▹ Storing the same data on multiple servers increases availability
Quiz Question
▸Which of the following is not an important
mechanism for developing distributed file
systems?
▹Caching data at clients, either entire files or portions
of files.
▹Encrypting data transmissions, either with private or
public keys.
▹Read-only data replication for files that change often.
▹Transferring data in bulk to reduce communication
overheads.
DFS Case Studies
▸Two case studies:
▹Andrew (AFS)
▹Coda
▸Both were:
▹Developed at Carnegie Mellon University (CMU)
▹Unix-based DFSs
▹Focused on scalability, security, and availability
Andrew File System (AFS)
▸Vice is a collection of trusted file servers
▸Venus is a service that runs on each workstation
to mediate shared file access
(Diagram: many Venus workstations surrounding a central cluster of Vice servers)
AFS-1
▸ Used from 1984 through 1985
▸ Each server contained a local file system mirroring the
structure of the shared file system
▸ If a file was not on the server, a search would end in a stub
directory that identified the server containing the file
▸ Clients cached pathname prefix information to direct file
requests to the appropriate servers
▸ Venus used a pessimistic approach to maintaining cache
coherence
▹All cached file copies were considered suspect
▹Venus would contact Vice to verify the cache was the latest
version before accessing the file
AFS-2
▸ Used from 1985 through 1989
▸ Venus now used an optimistic approach to maintaining
cache coherence
▹All cached files were considered valid
▹Callbacks were used
▸When files are cached on a workstation, the server promises to notify
the workstation if the file is to be modified by another machine
▸ A remote procedure call (RPC) mechanism was used to
optimize bulk file transfers
▸ Mount points and volumes were used instead of stub
directories to easily move files around among the servers
▹Each user was normally assigned a volume and a disk quota
▹Read-only replication of volumes increased availability
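The callback mechanism can be sketched with two toy classes (hypothetical names, not AFS code): the server promises to notify caching clients before a file changes, so a cached copy is treated as valid until a callback breaks it:

```python
class Server:
    """Toy sketch of AFS-2 callback-based cache coherence."""
    def __init__(self):
        self.files = {}       # path -> contents
        self.callbacks = {}   # path -> set of clients caching it

    def fetch(self, client, path):
        # Fetching a file registers a callback promise for that client.
        self.callbacks.setdefault(path, set()).add(client)
        return self.files[path]

    def store(self, writer, path, data):
        # Before the file changes, break callbacks: notify every other
        # caching client so it discards its now-stale copy.
        for client in self.callbacks.get(path, set()) - {writer}:
            client.invalidate(path)
        self.callbacks[path] = {writer}
        self.files[path] = data

class Client:
    def __init__(self, server):
        self.server, self.cache = server, {}

    def open(self, path):
        # Optimistic: a cached copy is valid until a callback breaks it.
        if path not in self.cache:
            self.cache[path] = self.server.fetch(self, path)
        return self.cache[path]

    def invalidate(self, path):
        self.cache.pop(path, None)
```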
Quiz Question
▸What is a callback?
▹A notification from a client that a local cache has
been modified.
▹A notification from a server that a file or directory is
to be modified.
▹Both of the above.
▹None of the above.
AFS-3
▸Used from 1989 through early 1990s
▸Supports multiple administrative cells, each with
its own servers, workstations, system admins,
and users
▹Each cell is completely autonomous
▸Venus now cached files in large chunks instead of
their entirety
Security in Andrew
▸ Protection domains
▹ Each is composed of users and groups
▹ Each group is associated with a unique owner (user)
▹ A protection server is used to immediately reflect changes in domains
▸ Authentication
▹ Upon login, a user’s password is used to obtain tokens from an
authentication server
▹ Venus uses these tokens to establish secure RPC connections to the servers
▸ File system protection
▹ Access lists are used to determine access to directories instead of
files, including negative rights
▸ Resource usage
▹ Andrew’s protection and authentication mechanisms guard against
denial of service and abuse of resources
Coda
▸A descendant of AFS-2
▸Substantially more resilient to server and network failures
▸Relies entirely on local resources (caches) when the servers are inaccessible
▸This allows a user to continue working regardless of failures elsewhere in the system
Coda Overview
▸ Clients cache entire files on their local disks
▸ Cache coherence is maintained by the use of callbacks
▸ Clients dynamically find files on servers and cache location
information
▸ Token-based authentication and end-to-end encryption
are used for security
▸ Provides failure resiliency through two mechanisms:
▹Server replication: storing copies of files on multiple servers
▹Disconnected operation: mode of optimistic execution in
which the client relies solely on cached data
Server Replication
▸Replicated Volume:
▹consists of several physical volumes or replicas that
are managed as one logical volume by the system
▸Volume Storage Group (VSG):
▹a set of servers maintaining a replicated volume
▸Accessible VSG (AVSG):
▹the set of servers currently accessible
▹Venus performs periodic probes to detect AVSGs
▹One member is designated as the preferred server
Quiz Question
▸What is a VSG?
▹Venus Service Group
▹Vice Server Group
▹Volume Storage Group
▹None of the above
Server Replication
▸Venus employs a Read-One, Write-All strategy
▸For a read request,
▹If a local cache exists,
▸Venus will read the cache instead of contacting the VSG
▹If a local cache does not exist,
▸Venus will contact the preferred server for its copy
▸Venus will also contact the other AVSG members for their version numbers
▸If the preferred version is stale, a new, up-to-date preferred
server is selected from the AVSG and the fetch is repeated
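The read path can be sketched as follows; the function name and the dict-based server model are hypothetical illustrations, not Coda's actual interfaces:

```python
def coda_read(path, cache, avsg, preferred):
    """Sketch of Coda's read-one protocol.  Each server in avsg maps
    path -> (version, data); `preferred` is an index into avsg."""
    if path in cache:                 # read-one: a local cache wins
        return cache[path]
    version, data = avsg[preferred][path]
    # Ask the other AVSG members only for their version numbers.
    versions = [srv[path][0] for srv in avsg]
    latest = max(versions)
    if version < latest:              # preferred copy is stale:
        preferred = versions.index(latest)      # pick an up-to-date
        version, data = avsg[preferred][path]   # server and refetch
    cache[path] = data
    return data
```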
Server Replication
▸Venus employs a Read-One, Write-All strategy
▸For a write,
▹When a file is closed, it is transferred to all members of
the AVSG
▹If the server’s copy does not conflict with the client’s
copy, an update operation handles transferring file
contents, making directory entries, and changing access
lists
▹A data structure called the update set, which summarizes
the client’s knowledge of which servers did not have
conflicts, is distributed to the servers
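The write-all path and the update set can be sketched similarly; again the function name and version-number conflict check are simplified assumptions, not Coda's real protocol messages:

```python
def coda_store(path, version, data, avsg):
    """Sketch of Coda's write-all protocol.  Each server in avsg maps
    path -> (version, data); `version` is the version the client last
    saw for this file."""
    update_set = []
    for sid, srv in enumerate(avsg):
        if srv.get(path, (0, None))[0] <= version:  # no conflict
            srv[path] = (version + 1, data)
            update_set.append(sid)
        # A newer copy on a server signals a conflict; skip it here.
    # The update set (which servers took the write without conflict)
    # is then distributed to the AVSG members.
    return update_set
```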
Disconnected Operation
▸Begins at a client when no member of a VSG is
accessible
▸Clients are allowed to rely solely on local caches
▸If a cache does not exist, the system call that
triggered the file access is aborted
▸Disconnected operation ends when Venus
reestablishes a connection with the VSG
▸Venus executes a series of update processes to
reintegrate the client with the VSG
Disconnected Operation
▸Reintegration updates can fail for two reasons:
▹There may be no authentication tokens that Venus can use
to communicate securely with AVSG members due to token
expirations
▹Conflicts may be detected
▸If reintegration fails, a temporary repository is created
on the servers to store the data in question until a
user can resolve the problem later
▸These temporary repositories are called covolumes
▸Mitigate is the operation that transfers a file or
directory from a workstation to a covolume
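The two failure cases above can be sketched in a toy replay loop; the log format, version check, and covolume list are hypothetical simplifications of Coda's actual reintegration machinery:

```python
def reintegrate(log, server, tokens_valid):
    """Sketch of reintegration after disconnected operation.

    log: list of (path, base_version, data) updates made while
    disconnected; server maps path -> (version, data)."""
    covolume = []  # temporary repository for updates that fail
    for path, base, data in log:
        current = server.get(path, (0, None))[0]
        if not tokens_valid or current > base:
            # Token expiry or a detected conflict: park the update in a
            # covolume until a user resolves it with the repair tool.
            covolume.append((path, data))
        else:
            server[path] = (base + 1, data)  # clean replay succeeds
    return covolume
```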
Conflict Resolution
▸When a conflict is detected, Coda first attempts to
resolve it automatically
▹Ex: partitioned creation of uniquely named files in the
same directory can be handled automatically by
selectively replaying the missing file creates
▸If automated resolution is not possible, Coda
marks all accessible replicas inconsistent and
moves them to their covolumes
▸Coda provides a repair tool to assist users in
manually resolving conflicts
Quiz Question
▸Which of the following is not true about conflict
resolution in the Coda DFS?
▹Coda attempts to resolve conflicts by recreating any
missing files in a directory.
▹Coda inspects workstations for the most up-to-date
cache of the conflicted file.
▹For file-level conflicts, Coda marks all replicas as
inconsistent and moves them to a covolume.
▹Users manually resolve file-level conflicts using a
provided repair tool.