Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Tuesday, November 25, 2008 The illiterate of the 21st century will not be those who cannot read and write, but those who cannot learn, unlearn, and relearn. --Unknown 1 Single Processor OS P2 P1 Process management Memory management IPC -- Naming, Concur Storage OS Kernel Processor Memory 2 Distributed OS P2 P1 Process management Memory management IPC -- Naming, Concur Storage OS Kernel Processor Memory Processor Memory 3 Issues • Same but different: Distributed P2 P1 Network Delay Lossy 4 Why Distribute? • To access resources • Improve availability • Enhance fault-tolerance • Consider Yahoo Mail vs. Airline Ticketing System 5 Serial computing • To be run on a single computer having a single Central Processing Unit (CPU); • A problem is broken into a discrete series of instructions. • Instructions are executed one after another. • Only one instruction may execute at any moment in time. 6 Parallel computing • Simultaneous use of multiple compute resources to solve a computational problem on multiple CPUs • A problem is broken into discrete parts that can be solved concurrently • Instructions from each part execute simultaneously on different CPUs 7 Grand Challenge Problems • Traditionally, parallel and distributed computing has been considered to be "the high end of computing“ – Motivated by numerical simulations of complex systems and "Grand Challenge Problems": • • • • • • • Global change Fluid turbulence Vehicle dynamics Ocean circulation Viscous fluid dynamics Superconductor modeling …. 8 9 BlueGene/L2 Supercomputer IBM remains dominant vendor of supercomputers 10 Agenda • Basic Concepts • Revision – Distributed systems – Characteristics – Issues/challenges • System Models – Introduction – Architectural Models 11 Revision Topics • Introduction and History • Examples of distributed systems – The internet – Interanets – Moble and ubiquitous computing • Resource sharing and the web – The world wide web • • • • • • • Challenges Heterogeneity Openness Security Scalability Failure Handling Concurrency Transparency HTML URL HTTP Dynamic Pages Web Services Discussion of the web Q1 12 Objectives of the lecture • To provide models for helping to perceive different distributed system designs. • To understand the characteristics of the most common architectural models of distributed systems. • To understand some of the factors that produce variety to these models. 13 Basic Elements Resources in a distributed systems are shared between users. They are normally encapsulated within in one of the computers and can be accessed from other computers by communication. Each resource is managed by a program, the resource manager, it offers communication interface enabling the resource to be accessed by the users. Resource managers in general can be modeled as processes. In object oriented system, resources are encapsulated in objects. 14 Distributed Systems Architecture: • Multi-tier architecture – Alternatives (a) thru (e) 15 Monday, 01, 2008 "If you were plowing a field, what would you rather use, two strong oxen or 1024 chickens?" (Commenting on parallel architectures) - Seymour Cray, Founder of Cray Research Distributed system models • Architectural models (definition): The way in which the components of systems interact with one another and the way in which they are mapped onto an underlying network of computers • Architectural models: – Placements of parts – Relationship between parts • Examples – Client /server – Peer-to-peer model – Proxy server • Fundamental models: – Formal description of system properties common in all architectural models 17 – reliability, security, performance Architectural models • Architecture: – Structure of separately specified components • Over all goal: – Structure should meet the present and future requirements on • Reliability • Manageability • Adaptability • Cost-effectiveness 18 Architectural models • Simplification by distinction between client, server and peer processes – Assessment of workload (responsibilities) – Failure impact analysis • Architectural models can be used to determine placement of processes • More dynamic systems can be built as variations on the client-server model – Moving code –applets, reduce delays, reduce traffic – Adding/removing nodes or components – Discovery and advertisement of services –mobile devices 19 Software Architecture & Service Layers Applic ations, services • Software architecture – Structuring software as layers or modules – Services offered and requested by processes on same or different computers (More recently) Middle w are Operatin g system Pla tform • Service layers (ch. 3 to 6) – A distributed service can be provided by one or more interacting server processes – Network Time Protocol Computer and netw ork hardw are Software and Hardware service layers in DS 20 Service Layers (contd..) • Platform • Operating system, hardware – Lowest level of layer provide services to Layer above them, So that they can be implemented independently in all computers – Facilitate communication and coordination among process – Supplies system programming interface 21 Service Layers (contd..) • Middleware – Masks heterogeneity • Represented by processes or objects in a set of computers that interact with each – To Implement communication and resource sharing • RMI • Group communication (Isis) – Supplies application programming interface 22 Service Layers: Middleware • Middleware – Examples: • Sun RPC – Simple, remote procedure calling package • OMG CORBA • Microsoft DCOM • Java RMI – Provides • Building blocks for software components • Infrastructural services for application programs 23 Service Layers: Middleware • Middleware services – – – – – Naming Security Transactions Persistent storage Event notification The end-to-end argument e.g. reliable file transfer (Saltzer, 1984) Correct behaviour in distributed programs depends upon checks at various levels • Limitations – Many distributed applications rely entirely on the services provided by the available middleware to support their needs for communication & data sharing • TCP reliable for packets (basic error correction and detection) • Mail transfer service increase fault tolerance (maintain record of progress) • Some functions can only be correctly implemented with the help of application level information 24 System Architectures: Client-Server Model Client inv oc ation result Server inv oc ation Server result Client Key: Proc es s: Computer: • Server may in turn be client of other servers – Web server may be client of local file server – DNS Servers – Serch engine is both a server and client 25 Client Server Communication Three – tiered Architecture System Architectures: Multiple Servers Model Service Server Client Server Client Server • Web servers manage their resources independently – A user can access any resource • Sun NIS (Netwrok Infomation Service) – Each NIS has its own replica of the password file 28 System Architectures: Multiple Servers Model • Services may be provided by multiple servers • Partitioned or replicated service-related objects • Replication provides – Increased performance – Increased availability – Increased fault-tolerance • But requires replica coordination / consistency preservation 29 System Architectures: Proxy Server Model Web s erv er Client Prox y s erv er Client Web s erv er • Cache: Store of reccently used data objects – Cache • collocated at each client (web bworser) • proxy server (shared cache) – Web browser • recently visited pages • HTTP to check the consistency 30 System Architectures: Proxy Server Model • Cache: a close store of recently used data – Considerably increases performance in many applications – But requires cache coherence protocols • Proxy server: a shared cache of resources – Most commonly used for web access 31 System Architectures: Peer-to-Peer Model Peer processes: processes that play similar roles ◦ No absolute distinction between client/server ◦ May still assume client/server roles from time to time ◦ Peer processes maintain the synchronization & consistency of resources and actions Example ◦ Napster Elimination of server processes reduces inter-process communication delay for local object access Increased fault-tolerance and scalability Coordination and to maintan replicas is difficult 32 System Architectures: Peer-to-Peer Model • P2P architecture is to exploit the resources in a large number of participating computers for the fulfillment of a given task Applic ation Applic ation Coordination c ode Coordination c ode Applic ation Coordination c ode 33 Agenda • Basic Concepts • Revision • System Models – Introduction – Architectural Models • Software Layers • System Architectures – Variations – Design requirements for distributed architectures 34 Variations on the Client-Server Model • Variations inside the client-server model from following factors: – the use of mobile code and mobile agents – lightweight clients, based on users’ need for low-cost computers and easy management – the requirment to add and remove mobile devices in a convinient manner 35 Variations on the Client-Server Model • Example of a mobile code a) client request results in the downloading of applet code Client Applet code Web server b) client interacts with the applet Client Applet Web server 36 Variations on the Client-Server Model • Example of a mobile code (cont.) – Server push model • Server initiates dialogue • “Pushes” information to client • Client needs application that listens for server pushes • Example – stockbroker 37 Variations on the Client-Server Model • Example of a mobile agent – a) The code and data are moved to the server Code + data Users Web server Results 38 Variations on the Client-Server Model • Mobile agents – A running program that travels between computers in a network • Carries out tasks on someone’s behalf – E.g., collect information, install programs • Has internal knowledge, beliefs and goals – Advantage: local access everywhere • Reduction in communication costs • Disconnected operation – Potential security threat • Limited applicability/ web crawlers 39 Variations on the Client-Server Model • Network computer – All files related to management of application is stored and managed remotely and application run locally. – Local disk may be used as cache – It downloads OS and application software from a remote file server • Minimum of local software is downloaded – As data and files are stored remotely so user can migrate from one NC to another NC – Applications are run locally but the files are managed by a remote file server • e.g., NFS 40 Variations of Client-Server Model (cont.) Thin client – is a PC with less of everything – Does not even run its own applications – Programs are run by a powerful computer server – Citrix WinFrame, teleporting and PCAnyWhere: thin client process for Windows – Advantage • Low cost – Disadvantage • Not good for highly interactive application like CAD • Delay in communication can effect application operation • e.g., X11 Server 41 Thin Clients Network computer or PC Thin Client Compute server network Application Process • VNC •WinFrame 42 Variations of Client-Server Model (cont.) • Mobile devices – capable of wireless networking • e.g. personal digital assistants (PDAs) • Spontaneous networking – the form of distribution that integrates mobile devices and other devices into a given network – Key features • easy connection to a local network – A device brought into a new network environments is transparently reconfigured to obtain connectivity there • easy integration with local services – automatic discovery of available services with no special configuration 43 – discovery services Mobile Devices and Spontaneous networking Music service gateway Alarm service Internet Hotel wireless network Discovery service Camera TV/PC Laptop PDA Guests devices 44 Variations of Client-Server Model (cont.) • Spontaneous networking (cont.) – Issues • supporting convenient connection and integration • limited connectivity – how the system can support the user so that they can continue to work while disconnected • security and privacy – track of users’ location – Discovery services • to accept and store details of services that are available on the network and to respond to queries from clients about them • Interfaces of discovery services – registration service : accept registration requests from servers, stores properties in database of currently available services 45 – lookup service : match requested services with available servers Agenda • Basic Concepts • Revision • System Models – Introduction – Architectural Models • Software Layers • System Architectures – Variations – Design requirements for distributed architectures 46 Design requirements for distributed architectures Performance issues Quality of Service Use of caching and replication Dependability issues 47 Performance issues Responsiveness Users of interactive application require a fast and consistent response to interaction Speed of response: load and performance of the server, delays in all software components Software layers must be optimized Throughput The rate at which computational work is done Data transfer between processes Throughput of software layer and network Balancing computer loads The ability to run applet at client removes load from the web server In some case load balancing may involve moving partiallycompleted work as the loads on hosts changes 48 Quality of Service (QoS) Quality of service The ability to meet the deadlines of users need Reliability Failure model Security Security model Performance Adaptability Example Time-critical data: movie service 49 Use of caching and replication The performance issues are major obstacles to the successful deployment of DS, Much progress has been made in the design of systems that overcome data replication and caching 50 Dependability issues • The dependability of computer systems – correctness, – security – fault tolerance • Correctness • Fault tolerance – reliability is achieved through redundancy • Replication of data and process • Redundancy in communication protocols • Security – the architectural impact of the requirement for security concerns the need to locate sensitive data and other resources only in computers 51 that can be effectively secured against attack Next Time: Fundamental Models of Distributed Systems • Conclude Chapter 2 – Interaction • Communication and coordination – Failure • Classification of faults and tolerance methods – Security • Classification of attacks and protection mechanisms • Lab 2: Socket Programming 52 Teaching Assistant • Zaeem Ahmad Abbasi – (BIT7) • [email protected] 53 Reading Material Q.1 From the book, answer the following questions 2.11 2.13 2.14 2.17 Paper Reading: Read first 5 pages only ! Discussion date: 16/12/2008 A. D. Birrell and B. J. Nelson. Implementing remote procedure calls. ACM Transactions on Computer Systems 2(1):39-59, February 1984 54 Monday, 02, 2008 "If an infinite number of computer programmers programmed for an infinite number of years, they would eventually come up with a working operating system. Bill Gates, being impatient, gave them two days and took the first one that was finished. — Robert X. Cringely, InfoWorld Revision • Architectural Model – Goal • • • • Reliability Manageability Adaptability Cost-effectiveness – Service Layers • Platform • Middleware – System Architecture • Client/Server • Proxy • Peer to Peer – Variations on Client/Server • Mobile code and mobile agent – Design requirements for distributed systems Objectives • To provide fundamental models that reflect common properties for distributed system designs. • To understand the characteristics of the most common fundamental models of distributed systems. System models – what and why? • System model: – Abstract, consistent description of a relevant aspect of a distributed system. • A system model could address: – What are the main entities in the system? – How do they interact? – What are the characteristics that affect their individual and collective behavior? • Aid – Design, analysis, discussion, etc. • The purpose of a system model: – Make explicit all assumptions. – To make generalizations concerning what is possible or impossible. (General purpose Algo, mathematical proof) Fundamental models Will our design work in this system ? Yes, if our assumptions hold in the system Interaction model: ◦ Performance of processes and communication channels, absence of a global clock, timing problems, …coordination, delays Failure model: ◦ Failures of processes and communication channels, ◦ We define and Classify the faults and their effects ◦ So, we can design systems that tolerate faults of each type Security model: ◦ Modular nature of DS and their openness ◦ It defines and classifies the forms that such attacks may take ◦ So we can design a system that are able to resist them Interaction model - basics • Interaction – Multiple server processes may cooperate to provide service. ( DNS) – A set of peer processes may cooperate to achieve common goal (voice conferencing system) • Communication & Coordination • Distributed Algorithm – definition of the steps to be taken by each of the processes of which DS is made of, including the transmission of messages. – Rate at which each process proceed and the timing of transmission of messages cannot in general be predicted. – Each process has its own state. • Significant factors affecting interacting processes: – Communication performance. (latency, bandwidth, jitter) – Lack of global notion of time. Interaction model – Significant factors • Performance of communication channels: – Latency. • Delay between sending of a message by one process and its receipt by another. • Transmission time – Time taken to for the first of the string of bits transmitted through a network to reach its destination. • Delay network access time – Increase significantly with increase in network load. • Operating system communication services time – In sending and receiving messages. Varies with load on OS – Bandwidth. • total amount of information that can be transmitted in given time – Jitter. • Variation in the time taken to deliver a series of messages. e.g. multimedia data Interaction model – Significant factors (cont.) • Computer clocks and timing events. – Local processes use time service – Different time values for processes at different systems – Drift rate • The relative amount of time that a clock differs from a perfect reference clock – Computers may use radio receivers to get time from GPS • Cost, accuracy, not inside, not for every computer Interaction model – synchronous vs. asynchronous Synchronous distributed systems: 1. The time to execute each step of a process has known lower and upper bounds. 2. Each message transmitted over a channel is received within a known bounded time. 3. Each process has a local clock whose drift rate from real time has a known bound. ◦ Difficult to arrive at realistic values and to provide guarantees ◦ Synchronous distributed systems can be built Known resource requirements Guaranteed resources Interaction model – synchronous vs. asynchronous • Asynchronous distributed systems (web) • no bounds on: 1. Process execution speed. 2. Message transmission delays. (email can take days) 3. Arbitrary clock drift rates. – Design of web browsers – Actual distributed systems are very often asynchronous • Sharing processors • Sharing network – Any solution valid for synchronous Interaction model – event ordering s end X receiv e 1 m1 2 Y receiv e 4 s end 3 m2 receiv e Phy sical time receiv e s end Z receiv e receiv e m3 A t1 t2 m1 m2 receiv e receiv e receiv e t3 Failure Models • Failure – System doesn’t give desired behavior • Component-level failure • System-level failure (incorrect result) • Fault – Cause of failure (component-level) • Transient: Not repeatable • Intermittent: Repeats, but (apparently) independent of system operations • Permanent: Exists until component repaired • Failure Model – How the system behaves when its not working properly Failure models - Types • Omission failures. • Arbitrary failures. • Timing failures. Failure model - omission failures (1) • A process or communication channel fails to perform actions that it is supposed to do. • Process omission failures: – Crash (and clean crash) • Use timeouts. – Process crash is called Fail-stop • If other processes can detect certainly that Process has been crashed. (fail to respond) • Can be produced in synchronous systems only where message delivery is guaranteed. Failure model – omission failure (2) • Communication omission failures: – Communication primitives are send and receive. • Send-omission failures. • Receive-omission failures. • Channel-omission failures. – Also known as dropping message • Generally caused by process p – Lack of buffer space at receiving end or intervening gateway – Network transmission error, detected by a checksum process q send m receive Communication channel Outgoing message buffer Incoming message buffer Failure model – Arbitrary failure • Arbitrary or Byzantine failures – Described as the worst possible failure semantics, in which any type of error may occur • Process/channel exhibits arbitrary behavior • Process may return a wrong value in response to an invocation – Arbitrary Process failure • Process may omit a step/s or Perform uninterested step/s – Arbitrary Communication Failure • Messages contents can be corrupted, a duplicate message can be sent or message can be lost on its way • Rare and can be detected by checksum or message numbering Failure model – overview of omission failures Class of failure Affects Description Crash Process Process halts and remains halted. Other processes may Fail-stop Process Process halts and remains halted. Other processes may not be able to detect this state. detect this state. Channel A message inserted in an outgoing message buffer never arrives at the other end’s incoming message buffer. Send-omission Process A process completes a but the message is not put in its outgoing message buffer. Receive-omission Process A message is put in a process’s incoming message buffer, but that process does not receive it. Arbitrary Process/channel exhibits arbitrary behaviour: it may Process (Byzantine) or channel send/transmit arbitrary messages at arbitrary times, commit omissions; a process may stop or take an incorrect step. Omission Failure model - timing failures • • Applicable in synchronous distributed systems. Time limits – Process execution time – Message delivery time – Clock drift rate Class Affects Description Performance Process Process exceeds the bounds on the interval between two steps. Performance Channel Clock Process A message’s transmission takes longer than the stated bound. Process’s local clock exceeds the bounds on its rate of drift from real time. Real Time Operating System Provides timing guarantee Multimedia Failure model - remedies • Masking failures: – A knowledge of the failure characteristic of a component can enable us to develop a reliable service which use such components which can fail. – Converting failure, checksum, retransmit message, replication, restoring information (convert arbitrary failure to omission failure) – A service masks a failure, either by hiding it altogether or by converting it into a more acceptable type of failure. • Reliability of one-to-one communication: – Correct message delivery in presence of failure – Validity: Any message in the outgoing message buffer is eventually delivered to the incoming message buffer. – Integrity: The message received is identical to one sent, and no messages are delivered twice. – Threats: Malicious Users and Protocols Security model - basics • The security of a distributed system: – securing the processes and the channels – protecting the objects against unauthorized access. • Protecting objects. A c cess right s Object inv oc ation Client result Principal (user) Netw ork Server Principal (s erv er) •Access rights: • Who is allowed to perform operation •Principal: • Authority associated with each invocation and each result – The behalf on which it is issued Summary • Models in general. • Architectural models: • Fundamental models: – Interaction. – Failure. – Security. Networking and Internetworking (Self Study) (Discussion date: 26the August) Objectives • To provide fundamental models that reflect common properties for distributed system designs. • To understand the characteristics of the most common fundamental models of distributed systems. • To understand the code for writing simple sockets --Next Time Reading Material Paper Reading: Home Assignment 1: Discussion date: 16th December Discussion date: 16th December Development of the Domain Name System, Paul V. Mockapetris,Kevin J. Dunlap. Computer Communication Review Vol. 18, No. 4, August 1988, pp. 123–133. 83 Extra 84 Notification Transmission Mechanisms (Permanent Connections) Client 1 Service Update Client 2 Client 3 Server Notification Transmission Mechanisms (Connect and Notify) Client 1 Service Update Client 2 Client 3 Server Client 1 Client 2 Client 3 Notification Transmission Mechanisms (Multicast) Client 1 Service Update Client 2 Client 3 Server Client 1 Client 2 Client 3