Distributed System Concepts and Architectures
By Master Prince

Outline
• Advantages and disadvantages of distributed OS
• Goals
• Transparency
• Services
• Architecture Models
• Communication Network Protocols
• Major Design Issues
• Distributed Computing Environment (DCE)

Distributed OS
• An integration of system services, presenting a transparent view of a multiple-computer system with distributed resources and control
• A collection of independent computers that appear to the users of the system as a single computer
• Examples
  – Personal workstations + a pool of processors + single file system
  – Robots on the assembly line + robots in the parts department
  – A large bank with hundreds of branch offices all over the world

Advantages of Distributed Systems Over Centralized Systems
• Economics – microprocessors offer a better price/performance than mainframes
• Speed – a distributed system may have more total computing power than a mainframe
• Inherent distribution – some applications involve spatially separated machines
• Reliability – if one machine crashes, the system as a whole can still survive
• Incremental growth – computing power can be added in small increments

Advantages of Distributed Systems Over Isolated Computers
• Data sharing – allow many users access to a common database
• Device sharing – allow many users to share expensive peripherals like color printers
• Communication – make human-to-human communication easier, for example, by e-mail
• Flexibility – spread the workload over the available machines in the most cost-effective way

Disadvantages of Distributed Systems
• Software – complex software
• Networking – the network can saturate or cause other problems
• Security – easy access also applies to secret data

Goals (I)
• Provide a high-performance and robust computing environment with least awareness of the management and control of distributed system resources
• Efficiency – difficult due to communication delays
  – Propagation delay – nothing can be done
  – Protocol overhead
    • Effective communication primitives, good protocols
  – Load distribution – bottleneck or congestion in network/SW
    • Balance and overlap computation and communication
    • Distributed processing and load sharing

Goals (II)
• Flexibility
  – User view: friendly system and freedom in using the system
    • Friendliness: user interface, consistency, reliability → use OO
    • Freedom:
      – No unreasonable restrictions in using systems
      – Easy to build additional tools or services
  – System view
    • Ability to evolve and migrate
    • Modularity, scalability, portability and interoperability
  – Difficult to achieve…
    • Heterogeneous HW/SW components

Goals (III)
• Consistency – made difficult by lack of global information, replication and partitioning of data, component failures, and complexity of interaction among components
  – User needs: uniformity in using the system and predictable system behavior
  – System needs: proper concurrency control mechanisms and failure handling and recovery procedures
• Robustness – problems with failures in communication links, processing nodes and client/server processes
  – System must reinitialize itself to a state where integrity is preserved, with only a small loss in performance
  – Handle exceptions and errors, changes to topology, long message delays, inability to locate server
  – Security: reliability, protection, and access control

Transparency
• Transparency
  – Hide all irrelevant system-dependent details from users
  – Create an illusion of the model users are supposed to see
  – Trade-off between simplicity and effectiveness
• Objective
  – Provide a logical view of a physical system and at the same time reduce the effect and awareness of the physical system to a minimum

Types of Transparency (I)
• Access: access local and remote system objects in the same way
  – Phone (local) vs. letter (remote)
• Location (name): no awareness of object location; use logical names
  – Area code for other cities
• Migration: object can be moved to different locations without changing names
  – Local numbers are changed if one moves to other cities
  – Need universal name (symbolic or numerical)
• Concurrency: sharing of objects without interference

Types of Transparency (II)
• Relocation: a resource may be moved to another location while in use
• Replication: consistency of multiple instances of files and data
• Parallelism: permit parallel activities without users knowing how, where, and when these activities are carried out by the system
• Failure: fault tolerance, graceful performance degradation, minimum damage to the user
• Performance: consistent and predictable performance level even if structure or load distribution changes
• Size: modularity and scalability; incremental growth in HW without user awareness
• Persistence: (software) resource may be in memory or on disk
• Revision: SW revisions not visible (vertical growth)

Categorization of Transparency Based on System Goals
• Efficiency
  – Concurrency
  – Parallelism
  – Performance
• Consistency
  – Access
  – Replication
  – Performance
  – Persistence
• Flexibility
  – Access
  – Location
  – Relocation
  – Migration
  – Size
  – Revision
• Robustness
  – Failure
  – Replication
  – Size
  – Revision

Distributed System Issues and Transparencies

  Major Issues                                     Transparencies
  -----------------------------------------------  -------------------------------------
  Communication, Synchronization,                  Interaction and control transparency
    Distributed algorithms
  Process scheduling, Deadlock handling,           Performance transparency
    Load balancing
  Resource scheduling, File sharing,               Resource transparency
    Concurrency control
  Failure handling, Configuration, Redundancy      Failure transparency

Services (I)
• Primitive services – most fundamental, in kernel
  – Must be implemented in the kernel of each node in the system
  – Communication – message passing (send/receive primitives)
    • Synchronous or asynchronous
    • Inter-node, inter-process
  – Synchronization – synchronous communication
    • Synchronous semantics of communication or synchronization server
  – Processor multiplexing – process server (for transparency reasons)
    • Creation, deletion, tracking for memory and processing time

Services (II)
• Services by system servers – fundamental, but need not be in the kernel
  – Provide fundamental services for managing processes, files, and process communication
  – Can be implemented anywhere in the system, and still perform functions basic to the operation of a distributed system
  – Mapping logical names to physical addresses
    • Name server: locate processes, users, machines
    • Directory server: locate files, communication ports
  – Translate addresses and locations into communication paths: network server
  – Broadcast messages: broadcast or multicast servers
  – Clocks for synchronization – impossible to agree on global clock information
    • Time server: physical clocks and logical clocks (for event ordering)
  – File servers, print servers, migration server, authentication server

Services (III)
• Value-added services – not essential in implementation of the system but useful; higher-level or special-purpose services (such as user applications)
  – Increase computational performance, enhance fault tolerance, cooperative activities
  – Example is a Web server
  – Groups of interacting processes
    • Group server: membership (add/remove), admission policies, privileges
  – Distributed conferencing server and concurrent editing server

System Architecture Models
• System architectures
  – Workstation-server model
    • Client workstations
      – Local processing capability and interface to the network
    • Server workstations
      – Dedicated for special services
  – Processor pool model – collect all processing power in one place, users use terminals only
    • Terminal: remote booting, remote file mounting, virtual terminal handling, packet assembling and disassembling (PAD)
    • File and processor allocation done by system
  – Integrated hybrid model

Workstation-Server Model
[Figure: workstation-server model, showing a file server and a printer server]

Processor-Pool Model
[Figure: processor-pool model]

Communication Network Architecture Models
• HW interconnection + inter-node inter-process communication protocols
• Hardware interconnection
  – Point-to-point links – direct connections between pairs of nodes
  – Multipoint links – allow connection of nodes into clusters
    • Common bus – time shared
      – IEEE 802 LAN Standard – Ethernet, Token Bus/Ring, FDDI…
    • Switch – space/time multiplexing at higher HW cost/complexity
      – Private switches for multiprocessor systems – cross-bar…
      – Public switches – ISDN, SMDS, ATM
• LAN, MAN, WAN
• Ratio of propagation delay to transmission delay
  – LAN: small. Close components, more suitable for distributed processing
  – MAN/WAN: large. More communication oriented

WAN, MAN, LAN
[Figure: WAN, MAN, and LAN joined by point-to-point links]

Communication Network Protocols
• Communication protocol: set of rules that regulate the exchange of messages to provide a reliable and orderly flow of information among communicating processes
• Connection-oriented communication service (analogy: telephone)
  – Need explicit setup of a connection channel before communication
  – Messages are delivered reliably and in sequence
  – Virtual circuit (logical) or circuit switching (physical)
• Connectionless communication service (analogy: postal service)
  – No initial connection establishment is necessary
  – Messages are delivered on a best-effort basis in timing and route and may arrive in arbitrary order
  – Datagram (logical) or packet switching (physical)

OSI Protocol Suite
• Seven-layer protocol suite
• OSI focuses on interconnecting computers
• A process communicates with a remote process by passing data through the seven layers, then the physical network, and finally through the remote layers in reverse order
  – Segmenting/reassembling
  – Transparency between layers – encapsulation
    • Add header for protocol data unit (PDU) from upper layer
    • The corresponding remote layer strips off the header
• A gateway or intermediate node only stores and forwards messages at the three lower network-dependent layers

OSI Protocol Suite (Cont.)
[Figure: OSI stacks on two end hosts (Application, Presentation, Session, Transport, Network, Data Link, Physical) connected by peer-to-peer protocols; an intermediate node implements only the Network, Data Link, and Physical layers; communication links join adjacent nodes]

OSI Protocol Suite (Cont.) – Physical Layer
• Specify the electrical and mechanical characteristics of the physical communication link – standardize
  – Coding method, modulation technique, wire/connector specification
  – Sharing of a common bus needs interface standards for the medium access control in the data link layer
• Reliable mapping of signals to bits – need bit synchronization
• Bit synchronization
  – Detection of the beginning of a bit and a sequence of bits
  – Bit synchronous: large blocks of bits transmitted at a regular rate
    • Offer higher data transfer speed and better link utilization
  – Character asynchronous: small fixed-size bit sequences transmitted asynchronously
    • Low-speed character-oriented terminals

OSI Protocol Suite (Cont.) – Data Link Control (DLC) Layer
• Ensure reliable data transfer of groups of bits (frames)
• Configuration setup
  – Establishment and termination of a connection
  – Full- or half-duplex, synchronous or asynchronous connection?
• Error controls
  – Transmission errors and loss or replication of data frames
  – Detected by checksum or time-out mechanisms
  – Recovered by retransmissions or forward error corrections
• Sequencing
  – Maintain an orderly delivery of frames by sequence numbers
  – Sequence numbers can assist error control and flow control of data frames
• Flow control of data frames
  – Permit the transmission of a frame only if it falls into an allowed window of buffers for the sender and the receiver
• Multipoint configuration: DLC sublayer – MAC sublayer – Physical layer
  – Resolve the access contention of the multiple access channel

OSI Protocol Suite (Cont.) – Network Layer
• Address issues of sending packets across the network through several link segments
• Routing function
  – Which link should be selected for forwarding a packet, based on its destination address
  – Static or dynamic routing; centralized or distributed
  – Routing decision can be made at the time when a connection is requested and is being established (connection-oriented), or on a packet-by-packet basis (connectionless, multiple path routing)
• Error, sequencing, and flow control function
  – Reassemble packets and discard duplicate ones
  – Congestion control for favorable routing nodes

OSI Protocol Suite (Cont.) – Transport Layer
• The most important layer from the OS view
  – The only interface between the communication sub-network layers and network-independent layers
• Provide reliable end-to-end communication between peer processes
  – All network-dependent faults or problems are to be shielded from the communicating processes
  – Message packets (breaking/reassembling)
  – Multiple sessions can be multiplexed on one transport connection
  – One session may occupy multiple transport connections
  – Five classes (TP0 to TP4) of transport services to support sessions
    • Depend on application and network quality
    • TP4: multiplexing, error detection, and retransmission

OSI Protocol Suite (Cont.) – Session, Presentation, Application Layers
• Session layer: add additional dialog and synchronization services to transport layer
  – Dialog: establishment of sessions
  – Synchronization: allow processes to insert checkpoints for efficient recovery from system crashes
• Presentation layer: data encryption, compression, and code conversion for messages that use different coding schemes
• Application layer: standard is completely left to the designer of the application

TCP/IP Protocol Suite
• Address inter-process and inter-node communication
  – How is communication between a pair of processes maintained?
    • Transport Layer → TCP (TP4 in OSI)
    • Connection-oriented (TCP) or connectionless (UDP)
  – How are messages routed through the network nodes?
    • Network Layer → IP (a little more than the OSI network layer)
    • Virtual circuit or datagram
• TCP/IP focuses on interconnecting networks
• (TCP, UDP) * (Virtual Circuit, Datagram IP)
  – Shift burden of maintaining reliable communication from network to OS
• Port and Socket (more in Chapter 4)
  – Port: inter-process communication endpoints
  – Socket: interface to port

TCP/IP Protocol Suite (Cont.)
[Figure: TCP/IP stacks on two hosts and a gateway. Application processes exchange messages through peer-to-peer protocols; the transport layer carries packets; the internet layer carries datagrams; the data link and physical layer carries frames in bits. The gateway implements only the internet layer and the data link and physical layer.]

Major Design Issues
• A distributed system consists of concurrent processes accessing distributed resources (which may be shared or replicated) through message passing in a network environment that may be unreliable and contain untrusted components
  – How to model and identify objects
  – How to coordinate the interaction among objects
  – How to achieve object communication
  – How to manage shared or replicated objects
  – How to protect objects and system security
• How to support transparency

Major Design Issues – Object Models and Naming Schemes
• Objects: processes, data files, memory, devices, processors, networks
• Assume all objects can be represented uniformly
  – An object is represented abstractly by the allowable operations
  – The physical details of the object are transparent to other objects
• To identify a server:
  – By name – map name to logical address
  – Physical or logical address – done by network service, port for logical
  – By service – needed by CAS

Major Design Issues – Distributed Coordination
• Coordinate interacting concurrent processes to achieve synchronization
• Requirements
  – Barrier synchronization – a set of processes (or events) must reach a common synchronization point before they can continue
  – Condition coordination – a set of processes (or events) must wait for an asynchronous condition set by other processes to maintain some ordering of execution
  – Mutual exclusion – concurrent processes must have mutual exclusion when accessing a critical shared resource
• Need knowledge of state information about other processes
  – Through messages → inaccurate or incomplete (unreliable network)
  – Centralized coordinator (leader election) or distributed resolution
• Deadlock handling – detect and recover
• Assimilate partial global state information and use it for decision making
  – Exchange local knowledge among cooperating sites

Major Design Issues (Cont.)
• IPC – use high-level methods for transparency in communication
  – Message passing – low level and physical
  – Client/server model – system interactions through message exchanges: request/reply
  – RPC – request/reply like a procedure call, built on top of the client/server model
    • RPC assumes point to point, but need groups (multicast, broadcast)
• Distributed resources – data processing capacity
  – Multiprocessor scheduling – static load distribution vs. dynamic load sharing
    • Process migration, real-time scheduling
  – Distributed file system and distributed shared memory
    • Sharing and replication of data

Major Design Issues (Cont.)
• Fault tolerance and security
  – Failure – unintentional intrusion – redundancy alleviates it
  – Security violation – intentional intrusion – need secure communication processes, integrity of messages
  – Need to authenticate clients/servers, messages

Distributed Computing Environment (DCE)
• Proposed by Open Software Foundation (OSF)
  – Develop and standardize an open Unix environment that is free from the influence of AT&T and Sun
• DCE: an integrated package of software and tools for developing distributed applications on an existing OS
• Hierarchically layered architecture

DCE Architecture
[Figure: DCE architecture]
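
Worked example: logical clocks for event ordering. The time-server bullet in the services slides mentions logical clocks but does not name an algorithm. The sketch below assumes Lamport's logical clock rule (increment on every local event or send; on receive, take the maximum of the local and received timestamps and add one). The class and method names are hypothetical and chosen only for illustration.

```python
class LamportClock:
    """Hypothetical minimal Lamport logical clock (not taken from the slides)."""

    def __init__(self) -> None:
        self.time = 0

    def local_event(self) -> int:
        # Any local event advances the clock by one.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending is itself an event; the returned value is the
        # timestamp to attach to the outgoing message.
        return self.local_event()

    def receive(self, msg_time: int) -> int:
        # Jump past both the local clock and the sender's timestamp, so the
        # receive event is ordered after the corresponding send event.
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.local_event()              # a's clock becomes 1
stamp = a.send()             # a's clock becomes 2; message carries 2
received_at = b.receive(stamp)
print(stamp, received_at)
```

The point of the rule is that a receive is always timestamped later than its send, even though the two nodes share no physical clock.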
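
Worked example: layer encapsulation. The OSI slide says each layer adds a header to the PDU from the layer above and the corresponding remote layer strips it off. The sketch below illustrates only that idea; the bracketed text headers and the function names are assumptions made for readability, not real OSI header formats, and the physical layer is omitted because it transmits bits without adding a header in this simplified picture.

```python
# Layers that add a header, listed from top to bottom of the stack.
LAYERS = ["application", "presentation", "session",
          "transport", "network", "data link"]


def encapsulate(payload: bytes) -> bytes:
    """Sender side: each layer prepends its own (illustrative) header."""
    pdu = payload
    for layer in LAYERS:
        pdu = f"[{layer}]".encode() + pdu
    return pdu


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: strip headers in reverse order, lowest layer first."""
    pdu = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not pdu.startswith(header):
            raise ValueError(f"missing {layer} header")
        pdu = pdu[len(header):]
    return pdu


frame = encapsulate(b"hello")
print(frame)
print(decapsulate(frame))
```

Because the data link header is added last, it ends up outermost, which is why an intermediate node can inspect the lower-layer headers without touching the upper-layer ones.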
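
Worked example: window-based flow control. The DLC slide states that a frame may be transmitted only if it falls into an allowed window. The sketch below shows the sender side of one possible scheme, assuming cumulative acknowledgements and unbounded sequence numbers; the slides do not specify these details, and the class name and methods are hypothetical.

```python
from typing import Optional


class SlidingWindowSender:
    """Hypothetical sender that keeps at most `window_size` frames unacknowledged."""

    def __init__(self, window_size: int) -> None:
        self.window_size = window_size
        self.base = 0       # oldest unacknowledged sequence number
        self.next_seq = 0   # sequence number of the next new frame

    def can_send(self) -> bool:
        # A frame is allowed only if it falls inside the current window.
        return self.next_seq < self.base + self.window_size

    def send(self) -> Optional[int]:
        # Return the sequence number sent, or None if the window is full.
        if not self.can_send():
            return None
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def ack(self, seq: int) -> None:
        # Cumulative acknowledgement: everything up to `seq` is confirmed,
        # so the window slides forward.
        if self.base <= seq < self.next_seq:
            self.base = seq + 1


sender = SlidingWindowSender(window_size=2)
sent = [sender.send(), sender.send(), sender.send()]  # third send is blocked
sender.ack(0)                                         # window slides by one
after_ack = sender.send()
print(sent, after_ack)
```

The same sequence numbers drive both flow control and error control here, which matches the slide's remark that sequence numbers can assist both.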
