Synchronizing Processes
▸ Clocks
  ▹ External clock synchronization (Cristian)
  ▹ Internal clock synchronization (Gusella & Zatti)
  ▹ Network Time Protocol (Mills)
▸ Decisions
  ▹ Agreement protocols (Fischer)
▸ Data
  ▹ Distributed file systems (Satyanarayanan)
▸ Memory
  ▹ Distributed shared memory (Nitzberg & Lo)
▸ Schedules
  ▹ Distributed scheduling (Isard et al.)

Agreement Problems
▸ Require all non-faulty (or correct) processes to come to an agreement
▸ Three types of problems:
  ▹ Consensus:
    ▸ Each process Pi proposes a value vi, and all non-faulty processes agree on a consensus value c
  ▹ Interactive Consistency:
    ▸ Each process Pi proposes a value vi, and all non-faulty processes agree on a consensus vector c = <v1, v2, …, vN>
  ▹ Byzantine (Generals or Reliable Broadcast):
    ▸ One process Pg proposes a value vg, and all non-faulty processes agree on a consensus value c (with c = vg when Pg is non-faulty)

Relations Among the Problems
▸ Since the interactive consistency problem can be solved with a Byzantine protocol Bz
▸ And the consensus problem can be solved with an interactive consistency protocol
▸ The consensus problem can be solved with a Byzantine protocol Bz
  ▹ N copies of the Bz protocol are run in parallel, where each processor Pi acts as the commander (Pg) for exactly one copy of the protocol
  ▹ The non-faulty processors use the majority vote of the consensus vector as the consensus value
▸ Hence, a Byzantine protocol can solve all three problems

The Byzantine Generals Problem [Lamport, Shostak, & Pease, 1982]
The basic idea is very similar to the consensus problem:
▸ Each of N generals has a value v(i) (e.g., “attack” or “retreat”).
▸ We want an algorithm that allows all generals to exchange their values such that the following hold:
  ▹ All non-faulty generals must agree on the values of v(1), …, v(N).
  ▹ If the i-th general is non-faulty, then the value agreed on for v(i) must be the i-th general’s value.

Byzantine Generals Problem
▸ The problem described earlier can be solved by restricting attention to one commanding general and considering all others to be lieutenants.
▸ A commanding general must send an order to his N–1 lieutenants, such that:
  ▹ IC1: All loyal lieutenants obey the same order.
  ▹ IC2: If the commander is loyal, then every loyal lieutenant obeys the order he sends.

Oral Message Algorithm
▸ Assumptions:
  1. Every message that is sent is delivered correctly
  2. The receiver of a message knows who sent it
  3. The absence of a message can be detected
  ▹ Assumptions #1 and #2 prevent a traitor from interfering with the communication between two other generals
  ▹ Assumption #3 foils a traitor who tries to prevent a decision by simply not sending messages
▸ Denoted OM(m), where m is the maximum number of traitors the system can handle

Impossibility Theorem
▸ If processes can only send unauthenticated messages, more than two thirds of the processes must be non-faulty for a solution to exist
▸ In other words, no solution exists for a system with fewer than 3m + 1 nodes, where m is the number of faulty processes

Algorithm with Oral Messages
Algorithm OM(m) (defined recursively) tolerates m traitors, provided there are at least 3m + 1 generals.
Algorithm OM(0):
  ▹ Commander sends value to each lieutenant.
  ▹ Each lieutenant uses the value received from the commander (or “retreat” if no message is received).
Algorithm OM(m), m > 0:
  ▹ Commander sends value to each lieutenant.
  ▹ Each lieutenant uses OM(m–1) to send the value received (take this value to be “retreat” if not received) to the other N–2 lieutenants.
  ▹ Each lieutenant uses the majority of the values received from the commander and the other lieutenants in the previous two steps.

Intuition
▸ If the commander is loyal, then he sends the same command to all lieutenants. In this case, the lieutenants all agree on the correct command by majority, as in the example.
▸ If the commander is a traitor, then he may send different commands to different lieutenants. However, this leaves one fewer traitor among the lieutenants, making it easier to reach agreement among them. (When the commander is a traitor, they can agree on any command.)

Signed Message Algorithm
▸ Assumptions:
  1. Every message that is sent is delivered correctly
  2. The receiver of a message knows who sent it
  3. The absence of a message can be detected
  4. Signatures:
    ▸ A loyal general’s signature cannot be forged, and any alteration of the contents of his signed messages can be detected
    ▸ Anyone can verify the authenticity of a general’s signature
▸ Denoted SM(m), the algorithm can cope with m traitors for any number of generals
  ▹ I.e., it is now possible to tolerate any number of traitors

SM(m) Algorithm
▸ Initially Vi = { }
  1. The commander P0 signs and sends his value to every lieutenant
  2. If lieutenant i receives a message of the form v:0 from the commander, then
    ▹ it adds v to Vi and
    ▹ sends the message v:0:i to every other lieutenant
  3. If lieutenant i receives a message of the form v:0:j1:…:jk and v is not in Vi, then
    ▹ it adds v to Vi and
    ▹ if k < m, it sends the message v:0:j1:…:jk:i to every lieutenant other than j1, …, jk
  4. When lieutenant i will receive no more messages, it obeys choice(Vi)

Choice Function
▸ The choice function is applied to a set of orders to obtain a single one
▸ Requirements:
  ▹ If the set V consists of a single element v, then choice(V) = v
  ▹ If the set V is empty, then choice(V) = a predetermined value
▸ Possibilities:
  ▹ choice(V) selects the majority of set V, or a predetermined value if there is no majority
  ▹ choice(V) selects the median of set V, if the elements of V can be ordered
▸ Basic idea:
  ▹ If the commander is loyal, then all messages will be of the form v:0:w*, because signatures cannot be forged. So all loyal lieutenants end up with Vi = {v}.
  ▹ If the commander is a traitor, then loyal lieutenants can detect it when they hold conflicting orders bearing his signature.

SM(m) Example
[Figure: the commander sends Attack:0 to Lieutenant 1 and Retreat:0 to Lieutenant 2; Lieutenant 1 relays Attack:0:1 to Lieutenant 2, and Lieutenant 2 relays Retreat:0:2 to Lieutenant 1]
▸ After step 3, V1 = V2 = {Attack, Retreat}
▸ Intuitively, both lieutenants can tell the commander is a traitor
▸ With no majority, choice would default to Retreat
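
The SM(m) steps and the example can be traced with a small simulation. This is a minimal sketch under stated assumptions, not a definitive implementation: signatures are modeled as a tuple of signer ids that nobody tampers with, the only faults modeled are a commander who sends different orders and lieutenants who stay silent, m is at least 1, and choice falls back to a predetermined value whenever the set is not a single element. The names sm, choice, commander_orders, and silent are illustrative.

```python
from collections import deque

DEFAULT = "Retreat"  # predetermined value (an assumption of this sketch)


def choice(orders):
    """Return the only element of a one-element set, otherwise the default.

    This satisfies both requirements on choice(V); majority or median
    would be alternative rules for sets with several elements.
    """
    return next(iter(orders)) if len(orders) == 1 else DEFAULT


def sm(m, commander_orders, silent=frozenset()):
    """Simulate SM(m) for m >= 1.

    commander_orders maps lieutenant id -> order signed by commander 0.
    A loyal commander sends everyone the same order; a traitor may not.
    Lieutenants listed in `silent` are traitors that never relay anything.
    Returns {lieutenant id: order obeyed} for the non-silent lieutenants.
    """
    lieutenants = sorted(commander_orders)
    V = {i: set() for i in lieutenants}
    # A message is (value, signature chain, recipient); chains start with 0.
    queue = deque((v, (0,), i) for i, v in commander_orders.items())
    while queue:
        v, chain, i = queue.popleft()
        if i in silent or v in V[i]:
            continue  # silent traitors do nothing; known values are not relayed
        V[i].add(v)
        k = len(chain) - 1  # number of lieutenant signatures on the message
        if k < m:
            for j in lieutenants:
                if j != i and j not in chain:
                    queue.append((v, chain + (i,), j))
    return {i: choice(V[i]) for i in lieutenants if i not in silent}


# Traitorous commander from the example: both lieutenants collect both orders.
print(sm(1, {1: "Attack", 2: "Retreat"}))
```

With m = 1 and a commander who sends Attack to Lieutenant 1 and Retreat to Lieutenant 2, both lieutenants end with {Attack, Retreat} and fall back to the default order, so they still obey the same order.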
Synchronizing Processes: Distributed File Systems
CS/CE/TE 6378 Advanced Operating Systems

Distributed File Systems
▸ Distributed File System (DFS):
  ▹ a system that provides access to the same storage for a distributed network of processes
[Figure: Common Storage]

Benefits of DFSs
▸ Data sharing is simplified
  ▹ Files appear to be local
  ▹ Users are not required to specify remote servers to access
▸ User mobility is supported
  ▹ Any workstation in the system can access the storage
▸ System administration is easier
  ▹ Operations staff can focus on a small number of servers instead of a large number of workstations
▸ Better security is possible
  ▹ Servers can be physically secured
  ▹ No user programs are executed on the servers
▸ Site autonomy is improved
  ▹ Workstations can be turned off without disrupting the storage

Design Principles for DFSs
▸ Utilize workstations when possible
  ▹ Opt to perform operations on workstations rather than servers to improve scalability
▸ Cache whenever possible
  ▹ This reduces contention on centralized resources and transparently makes data available wherever it is used
▸ Exploit file usage characteristics
  ▹ Knowledge of how files are accessed can be used to make better choices
  ▹ Ex: Temporary files are rarely shared and hence can be kept locally
▸ Minimize system-wide knowledge and change
  ▹ Scalability is enhanced if global information is rarely monitored or updated
▸ Trust the fewest possible entities
  ▹ Security is improved by trusting a smaller number of processes
▸ Batch if possible
  ▹ Transferring files in large chunks improves overall throughput

Quiz Question
▸ Which of the following was not a design principle from the Andrew and Coda file systems?
  ▹ Cache whenever possible.
  ▹ Decentralize operations when possible.
  ▹ Minimize system-wide knowledge.
  ▹ Trust the most possible entities.

Mechanisms for Building DFSs
▸ Mount points
  ▹ Enable file name spaces to be “glued” together to provide a single, seamless, hierarchical namespace
▸ Client caching
  ▹ Contributes the most to better performance in DFSs
▸ Hints
  ▹ Pieces of information that can substantially improve performance if correct but have no negative consequences if erroneous
  ▹ Ex: Caching mappings of pathname prefixes
▸ Bulk data transfer
  ▹ Reduces network communication overhead by transferring in bulk
▸ Encryption
  ▹ Used for remote authentication, either with private or public keys
▸ Replication
  ▹ Storing the same data on multiple servers increases availability

Quiz Question
▸ Which of the following is not an important mechanism for developing distributed file systems?
  ▹ Caching data at clients, either entire files or portions of files.
  ▹ Encrypting data transmissions, either with private or public keys.
  ▹ Read-only data replication for files that change often.
  ▹ Transferring data in bulk to reduce communication overheads.
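
The hint mechanism from the list of mechanisms can be illustrated with a cached table of pathname prefixes. This is a minimal sketch under stated assumptions: servers are plain dictionaries, the authoritative lookup is a caller-supplied function, and the first path component serves as the prefix. The class HintedLocator and all of its members are hypothetical and are not taken from AFS or Coda.

```python
class HintedLocator:
    """Client-side cache of pathname prefix -> server name, used as a hint."""

    def __init__(self, authoritative_lookup, servers):
        self.lookup = authoritative_lookup  # slow but always correct (hypothetical)
        self.servers = servers              # server name -> {path: contents}
        self.hints = {}                     # prefix -> server name, may be stale
        self.slow_lookups = 0               # how often the hint did not help

    def _prefix(self, path):
        # Use the first path component as the prefix: "/usr" for "/usr/a/b".
        return "/" + path.strip("/").split("/")[0]

    def fetch(self, path):
        prefix = self._prefix(path)
        server = self.hints.get(prefix)
        # Try the hint first; a wrong hint only costs one failed attempt.
        if server is not None and path in self.servers[server]:
            return self.servers[server][path]
        # Hint missing or stale: ask the authoritative source and refresh it.
        self.slow_lookups += 1
        server = self.lookup(path)
        self.hints[prefix] = server
        return self.servers[server][path]


servers = {"s1": {"/usr/a": "A"}, "s2": {}}
location = {"/usr/a": "s1"}
locator = HintedLocator(lambda p: location[p], servers)
locator.fetch("/usr/a")   # first access needs the authoritative lookup
locator.fetch("/usr/a")   # second access is served through the hint
```

The point of the design is that correctness never depends on the hint: when a file has moved, the stale entry is detected on use and replaced, so the only cost of a bad hint is one extra lookup.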
DFS Case Studies
▸ Two case studies:
  ▹ Andrew (AFS)
  ▹ Coda
▸ Both were:
  ▹ Developed at Carnegie Mellon University (CMU)
  ▹ Unix-based DFSs
  ▹ Focused on scalability, security, and availability

Andrew File System (AFS)
▸ Vice is a collection of trusted file servers
▸ Venus is a service that runs on each workstation to mediate shared file access
[Figure: Venus processes on workstations connected to Vice servers]

AFS-1
▸ Used from 1984 through 1985
▸ Each server contained a local file system mirroring the structure of the shared file system
▸ If a file was not on the server, a search would end in a stub directory that identified the server containing the file
▸ Clients cached pathname prefix information to direct file requests to the appropriate servers
▸ Venus used a pessimistic approach to maintaining cache coherence
  ▹ All cached file copies were considered suspect
  ▹ Venus would contact Vice to verify that the cached copy was the latest version before accessing the file

AFS-2
▸ Used from 1985 through 1989
▸ Venus now used an optimistic approach to maintaining cache coherence
  ▹ All cached files were considered valid
  ▹ Callbacks were used
    ▸ When a file is cached on a workstation, the server promises to notify the workstation if the file is to be modified by another machine
▸ A remote procedure call (RPC) mechanism was used to optimize bulk file transfers
▸ Mount points and volumes were used instead of stub directories to easily move files around among the servers
  ▹ Each user was normally assigned a volume and a disk quota
  ▹ Read-only replication of volumes increased availability

Quiz Question
▸ What is a callback?
  ▹ A notification from a client that a local cache has been modified.
  ▹ A notification from a server that a file or directory is to be modified.
  ▹ Both of the above.
  ▹ None of the above.

AFS-3
▸ Used from 1989 through the early 1990s
▸ Supports multiple administrative cells, each with its own servers, workstations, system admins, and users
  ▹ Each cell is completely autonomous
▸ Venus now cached files in large chunks instead of in their entirety

Security in Andrew
▸ Protection domains
  ▹ Each is composed of users and groups
  ▹ Each group is associated with a unique owner (user)
  ▹ A protection server is used to immediately reflect changes in domains
▸ Authentication
  ▹ Upon login, a user’s password is used to obtain tokens from an authentication server
  ▹ Venus uses these tokens to establish secure RPC connections
▸ File system protection
  ▹ Access lists, which can include negative rights, control access to directories rather than to individual files
▸ Resource usage
  ▹ Andrew’s protection and authentication mechanisms protect against denials of service and resources

Coda
▸ A descendant of AFS-2
▸ Substantially more resilient to server and network failures
  ▹ It relies entirely on local resources (caches) when the servers are inaccessible
▸ Allows a user to continue working regardless of failures elsewhere in the system

Coda Overview
▸ Clients cache entire files on their local disks
▸ Cache coherence is maintained by the use of callbacks
▸ Clients dynamically find files on servers and cache location information
▸ Token-based authentication and end-to-end encryption are used for security
▸ Provides failure resiliency through two mechanisms:
  ▹ Server replication: storing copies of files on multiple servers
  ▹ Disconnected operation: a mode of optimistic execution in which the client relies solely on cached data
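
The callback mechanism that AFS-2 and Coda use for cache coherence can be sketched as follows. This is a minimal sketch under stated assumptions: in-process objects stand in for Vice and Venus, a callback break is a direct method call rather than an RPC, and a write is pushed to the server immediately. The class and method names are illustrative and do not come from either system.

```python
class Server:
    """Stands in for a Vice file server that tracks callback promises."""

    def __init__(self):
        self.files = {}      # path -> contents
        self.callbacks = {}  # path -> set of clients holding a callback

    def fetch(self, client, path):
        # Handing out a copy also registers a callback promise for the client.
        self.callbacks.setdefault(path, set()).add(client)
        return self.files[path]

    def store(self, writer, path, data):
        self.files[path] = data
        # Break the callback of every other client that cached the file.
        for client in self.callbacks.get(path, set()) - {writer}:
            client.break_callback(path)
        self.callbacks[path] = {writer}


class Client:
    """Stands in for Venus: a cached file is trusted while its callback holds."""

    def __init__(self, server):
        self.server = server
        self.cache = {}      # path -> contents; presence means callback is valid
        self.fetches = 0

    def read(self, path):
        if path not in self.cache:           # no valid copy: go to the server
            self.cache[path] = self.server.fetch(self, path)
            self.fetches += 1
        return self.cache[path]              # optimistic: no validation call

    def write(self, path, data):
        self.cache[path] = data
        self.server.store(self, path, data)  # simplified store-on-close

    def break_callback(self, path):
        self.cache.pop(path, None)           # discard the now-stale copy


server = Server()
server.files["/f"] = "v1"
a, b = Client(server), Client(server)
a.read("/f")          # fetched once, then served from the cache
b.write("/f", "v2")   # the server breaks a's callback
a.read("/f")          # a refetches and sees "v2"
```

Compared with the pessimistic AFS-1 scheme, reads of a cached file need no server contact at all; the server does the work only when a file actually changes.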
Server Replication
▸ Replicated Volume:
  ▹ consists of several physical volumes or replicas that are managed as one logical volume by the system
▸ Volume Storage Group (VSG):
  ▹ a set of servers maintaining a replicated volume
▸ Accessible VSG (AVSG):
  ▹ the subset of the VSG that is currently accessible
  ▹ Venus performs periodic probes to keep track of AVSG membership
  ▹ One member is designated as the preferred server

Quiz Question
▸ What is a VSG?
  ▹ Venus Service Group
  ▹ Vice Server Group
  ▹ Volume Storage Group
  ▹ None of the above

Server Replication
▸ Venus employs a Read-One, Write-All strategy
▸ For a read request,
  ▹ If a local cache exists,
    ▸ Venus will read the cache instead of contacting the VSG
  ▹ If a local cache does not exist,
    ▸ Venus will contact the preferred server for its copy
    ▸ Venus will also contact the other AVSG members for their version numbers
    ▸ If the preferred server’s version is stale, a new, up-to-date preferred server is selected from the AVSG and the fetch is repeated

Server Replication
▸ Venus employs a Read-One, Write-All strategy
▸ For a write,
  ▹ When a file is closed, it is transferred to all members of the AVSG
  ▹ If the server’s copy does not conflict with the client’s copy, an update operation handles transferring file contents, making directory entries, and changing access lists
  ▹ A data structure called the update set, which summarizes the client’s knowledge of which servers did not have conflicts, is distributed to the servers

Disconnected Operation
▸ Begins at a client when no member of a VSG is accessible
▸ Clients are allowed to rely solely on local caches
▸ If a cache does not exist, the system call that triggered the file access is aborted
▸ Disconnected operation ends when Venus re-establishes a connection with the VSG
▸ Venus executes a series of update processes to reintegrate the client with the VSG

Disconnected Operation
▸ Reintegration updates can fail for two reasons:
  ▹ There may be no authentication tokens that Venus can use to communicate securely with AVSG members due to token expirations
  ▹ Conflicts may be detected
▸ If reintegration fails, a temporary repository is created on the servers to store the data in question until a user can resolve the problem later
▸ These temporary repositories are called covolumes
▸ Migrate is the operation that transfers a file or directory from a workstation to a covolume

Conflict Resolution
▸ When a conflict is detected, Coda first attempts to resolve it automatically
  ▹ Ex: partitioned creation of uniquely named files in the same directory can be handled automatically by selectively replaying the missing file creates
▸ If automated resolution is not possible, Coda marks all accessible replicas inconsistent and moves them to their covolumes
▸ Coda provides a repair tool to assist users in manually resolving conflicts

Quiz Question
▸ Which of the following is not true about conflict resolution in the Coda DFS?
  ▹ Coda attempts to resolve conflicts by recreating any missing files in a directory.
  ▹ Coda inspects workstations for the most up-to-date cache of the conflicted file.
  ▹ For file-level conflicts, Coda marks all replicas as inconsistent and moves them to a covolume.
  ▹ Users manually resolve file-level conflicts using a provided repair tool.
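
The automatic case described under Conflict Resolution, partitioned creation of uniquely named files in the same directory, can be sketched as a merge of directory replicas. This is a minimal sketch under stated assumptions: each replica is a dictionary of file name to contents, and two replicas conflict when the same name maps to different contents. Coda's actual resolution involves more machinery than this, and the function name resolve_directory is hypothetical.

```python
def resolve_directory(replicas):
    """Try to resolve directory replicas that diverged during a partition.

    replicas: list of dicts mapping file name -> contents, one per server.
    Returns (resolved, merged). resolved is False when some name was created
    with different contents on different replicas; that case is left for
    manual repair and the replicas are not modified.
    """
    merged = {}
    for replica in replicas:
        for name, contents in replica.items():
            if name in merged and merged[name] != contents:
                return False, None  # same name, different data: cannot automate
            merged[name] = contents
    # Replay the missing creates so every replica holds the same entries.
    for replica in replicas:
        for name, contents in merged.items():
            replica.setdefault(name, contents)
    return True, merged


# Each side of the partition created a different, uniquely named file.
r1 = {"a.txt": "1", "b.txt": "2"}
r2 = {"a.txt": "1", "c.txt": "3"}
resolved, merged = resolve_directory([r1, r2])
```

When the function reports failure, the sketch stops there; in the slides' terms this is the point at which the replicas would be marked inconsistent and handed to the repair tool.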