Distributed Systems (Credit Hours: 3)

This course covers advanced topics in distributed systems, with special emphasis on distributed computing and the services provided by distributed operating systems. Important topics include parallel processing, remote procedure call, concurrency, transactions, shared memory, message passing, and scalability.

Reference Books:
1. Distributed Systems: Concepts and Design by Coulouris, Dollimore, and Kindberg
2. Distributed Operating Systems by Andrew S. Tanenbaum

Course Evaluation
• Attendance & Class Participation: 05
• Assignments: 10
• Critical Reviews: 10
• Mid Term: 25
• Final Term: 50
• Total Marks: 100

Overview
• Multiprocessing (parallel processing)
• Tightly coupled processors
• Distributed systems (DS)
• Loosely coupled processors (distributed)
• Key features of DS
• Pros and cons of DS

Parallel Processing
From the beginning, computer scientists have challenged computers with larger and larger problems. Eventually, processors were combined on the same board to work on the same task in parallel, sharing the same memory. This is called parallel processing.

Types of Parallel Processing
In all three types, the processors are multiple:
• MISD – Multiple Instruction stream, Single Data stream
• SIMD – Single Instruction stream, Multiple Data stream
• MIMD – Multiple Instruction stream, Multiple Data stream

MISD
Multiple processors apply different instructions to a single data stream.
Example: multiple cryptography algorithms attempting to decrypt a single coded message.
Note: the search example sometimes given here (an unsorted dataset broken into sections of records, with each processor searching its own section for a specific key) strictly involves multiple data streams, so it is closer to SIMD than to MISD.

SIMD
SIMD (Single Instruction, Multiple Data) is a technique for achieving parallel execution across a set of processors through data-level parallelism.
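As a rough illustration of data-level parallelism, the following toy Python sketch applies one instruction ("multiply by a factor") across every data element, the way each SIMD lane would (this simulates the idea only; real SIMD hardware does this in parallel in a single clock cycle per lane group):

```python
# Toy illustration of SIMD-style execution: ONE instruction
# applied uniformly across MANY data elements ("lanes").
# This is a simulation of the concept, not real vector hardware.

def simd_multiply(data, factor):
    """Apply the same instruction (multiply by `factor`)
    to every data element, as each SIMD lane would."""
    return [x * factor for x in data]

# 100 numbers, each multiplied by three "at the same time"
data = list(range(100))
result = simd_multiply(data, 3)
print(result[:5])  # first few lanes: [0, 3, 6, 9, 12]
```

In real hardware the loop body is replaced by a single vector instruction; the point is that the instruction stream is one, while the data elements are many.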
SIMD (cont.)
Multiple processors execute the same instruction on separate data.
Example: a SIMD machine with 100 processors could multiply 100 numbers, each by the number three (3), at the same time.
• Single instruction: all processing units execute the same instruction at any given clock cycle.
• Multiple data: each processing unit can operate on a different data element.

MIMD
Multiple processors execute different instructions on separate data; for example, one processor multiplies, another searches, another adds, and another subtracts, each on its own data.
This is the most complex form of parallel processing. It is used for complex simulations, such as modeling the growth of cities.
• Currently the most common type of parallel computer.
• Multiple instruction: every processor may be executing a different instruction stream.
• Multiple data: every processor may be working with a different data stream.

Tightly Coupled Processors (hardware concept)
• e.g., multiprocessors, in which two or more CPUs share a main memory.
• More difficult to build than multicomputers.
• Easier to program (ordinary desktop programming).

Multiprocessing Systems
• Each processor is assigned a specific duty, but the processors work in close association, possibly sharing one memory module.
• The CPUs have local caches and access to a central shared memory. The IBM p690 Regatta (a mainframe) is an example of a multiprocessing system.

Multiprocessors
• Consist of some number of CPUs, all connected to a common bus along with a memory module.
• Bus-based multiprocessors require caching.
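The MIMD picture above (multiply, search, add, and subtract running on different data) can be sketched in Python, with each thread standing in for a processor running its own instruction stream on its own data stream. This is a toy model under those assumptions; real MIMD machines use independent CPUs, not threads:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy MIMD sketch: each "processor" (here, a thread) runs a
# DIFFERENT instruction stream on a DIFFERENT data stream.

def multiply(data): return data[0] * data[1]
def search(data):   return data.index("key")  # position of "key"
def add(data):      return sum(data)
def subtract(data): return data[0] - data[1]

# Four processors, four instruction streams, four data streams
tasks = [
    (multiply, (6, 7)),
    (search,   ["a", "key", "b"]),
    (add,      [1, 2, 3, 4]),
    (subtract, (10, 3)),
]

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(fn, data) for fn, data in tasks]
    results = [f.result() for f in futures]

print(results)  # [42, 1, 10, 7]
```

Contrast this with the SIMD sketch: there, one function ran over many elements; here, every worker runs a different function.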
With caching, memory incoherence becomes an issue:
• Write-through cache (updating): every update goes through to the actual memory, not only to the cache.
• Snooping (snoopy) cache (reading): every cache monitors the bus, picks up any write-through to memory, and applies it to itself if necessary.
It is possible to put about 32, or possibly 64, CPUs on a single bus.

Fig: A bus-based multiprocessor (three CPUs, each with a private cache, attached to the memory over a shared bus).

Why Multiprocessors?
1. Microprocessors are the fastest CPUs, and collecting several is much easier than redesigning one.
2. Instruction-level parallelism on a single processor is limited by data dependencies.
3. Improvements in parallel software (scientific applications, databases, operating systems) need multiprocessors.

Introduction to Distributed Systems

Definitions:
"A distributed system is a collection of independent computers that appear to the users of the system as a single computer." (Tanenbaum)
"A distributed system is one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages." (Coulouris, Dollimore, Kindberg)

Distributed Computing
• Distributed computing is the process of aggregating the power of several computing entities to collaboratively run a computational task in a transparent and coherent way, so that it appears as a single, centralized system.
• A distributed computer system is a loosely coupled collection of autonomous computers connected by a network, using system software to produce a single integrated computing environment.

Features of DS
• Distributed computing consists of a network of autonomous nodes.
• Loosely coupled.
• Nodes do not share primary or secondary storage.
• A well-designed distributed system does not crash if a node goes down.
• If a computing task is parallel in nature, scaling the system by adding extra nodes is much cheaper than buying a faster single machine.
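Returning to the bus-based multiprocessor above, the write-through and snooping behavior can be sketched as a toy simulation. All class and method names here are my own, purely illustrative; real coherence protocols (e.g., MESI) are considerably more involved:

```python
# Toy simulation of a bus-based multiprocessor: every write goes
# through to main memory over the bus (write-through), and every
# other cache snoops the bus, updating any copy it holds.

class Bus:
    def __init__(self):
        self.memory = {}   # main memory, shared by all CPUs
        self.caches = []   # all caches attached to the bus

    def write_through(self, addr, value, source):
        self.memory[addr] = value        # update reaches real memory
        for cache in self.caches:        # every other cache snoops
            if cache is not source:
                cache.snoop(addr, value)

class Cache:
    def __init__(self, bus):
        self.lines = {}
        self.bus = bus
        bus.caches.append(self)

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.write_through(addr, value, source=self)

    def snoop(self, addr, value):
        if addr in self.lines:           # update only if cached here
            self.lines[addr] = value

    def read(self, addr):
        if addr not in self.lines:       # miss: fetch from memory
            self.lines[addr] = self.bus.memory[addr]
        return self.lines[addr]

bus = Bus()
c1, c2 = Cache(bus), Cache(bus)
c1.write("x", 1)
c2.read("x")         # c2 now caches x == 1
c1.write("x", 2)     # write-through; c2 snoops and stays coherent
print(c2.read("x"))  # 2
```

Without the snoop step, c2 would keep serving the stale value 1 from its cache, which is exactly the memory-incoherence problem the slide describes.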
• Of course, if the processing task is highly non-parallel (every result depends on the previous one), a distributed computing system may not be very beneficial.

Communication in DS
• Network connections are the key feature.
• Remote access is established by message passing between nodes.
• Messages travel from CPU to CPU.
• Protocols are designed for reliability, flow control, failure detection, etc.
• Nodes communicate by sending and receiving network messages.

Distributed OS vs. Network OS
• The machines supporting a distributed operating system run under a single operating system that spans the network.
• With a network operating system, each machine runs an entire operating system of its own.
• Thus, under a DOS the print task might be running on one machine and the file system on another; each machine cooperates by running part of the software of the DOS.
• With a network OS, each node keeps its own complete OS, and the nodes as a whole are distributed across the network.

Advantages of DS over Centralized Systems
• Better price/performance than mainframes.
• More computing power (parallel and distributed processing).
• Meets the requirements of some inherently distributed applications.
• Improved reliability, because the system can survive the crash of one processor.
• Incremental growth: capacity can be added one processor at a time.
• Facilitates shared ownership of resources.

Disadvantages of DS
Network performance parameters:
• Latency: the delay between the execution of a send operation and the start of data arrival at the destination computer.
• Data transfer rate: the speed at which data can be transferred between two computers once transmission has begun.
• Total network bandwidth: the total volume of traffic that can be transferred across the network in a given time.
Further disadvantages:
• Dependency on the reliability of the underlying network.
• Higher security risk, due to more possible access points for intruders and possible communication with insecure systems.
• Software complexity.
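The latency and data-transfer-rate parameters above combine into a simple first-order model of message delivery time: total time = latency + size / transfer rate. The figures below are illustrative assumptions, not measurements of any particular network:

```python
# First-order model of one-hop message delivery time:
# total time = latency + size / data transfer rate.

def transfer_time(size_bytes, latency_s, rate_bytes_per_s):
    """Total time to move `size_bytes` across one network hop."""
    return latency_s + size_bytes / rate_bytes_per_s

# Illustrative example: 1 MB message, 5 ms latency, 100 MB/s rate
t = transfer_time(1_000_000, 0.005, 100_000_000)
print(f"{t * 1000:.1f} ms")  # 15.0 ms = 5 ms latency + 10 ms transfer
```

Note that for small messages the latency term dominates, while for bulk transfers the rate term dominates; this is why chatty protocols suffer on high-latency links even when bandwidth is plentiful.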
Loosely Coupled Processors (hardware concept)
• e.g., multicomputers, in which each processor has its own memory.
• Easy to build and commercially available (PCs).
• More complex to program (desktop plus socket programming).

A DS consisting of workstations on a LAN (each workstation has its own CPU and local memory; all are connected by the network).

Software Concepts
• Network Operating Systems (NOS)
  – Loosely coupled software on loosely coupled hardware.
  – e.g., a network of workstations connected by a LAN.
  – Each user has a workstation for his or her exclusive use.
  – Offers local services to remote clients.
• Distributed Operating Systems (DOS)
  – Tightly coupled software over loosely coupled hardware.
  – Creates the illusion that the entire network of computers is a single timesharing system, rather than a collection of distinct machines (a single-system image).
  – Users should not have to be aware of the existence of multiple CPUs in the system; no current system fulfills this requirement entirely.
• Multiprocessor Operating Systems
  – Tightly coupled software on tightly coupled hardware.
  – e.g., a UNIX timesharing system with multiple CPUs.
  – Key characteristic: a single run queue (in shared memory).
  – The basic design is mostly the same as a traditional OS; however, process synchronization, task scheduling, memory management, and security become more complex because memory is shared by many processors.

Summary
Comparison of three different ways of organizing n CPUs:

Item                                      | NOS          | DOS            | Multiprocessor OS
Does it look like a virtual uniprocessor? | No           | Yes            | Yes
Do all have to run the same OS?           | No           | Yes            | Yes
How many copies of the OS are there?      | N            | N (replicated) | 1
How is communication achieved?            | Shared files | Messages       | Shared memory
Are agreed-upon network protocols needed? | Yes          | Yes            | No
Is there a single run queue?              | No           | No             | Yes
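The "Messages" entry in the communication row can be made concrete with a minimal sketch: two simulated nodes that coordinate only by sending and receiving messages, never by sharing memory. Here `socket.socketpair()` stands in for a real network link between machines; in a genuine DS each node would be a separate computer:

```python
import socket

# Minimal message-passing sketch: two "nodes" communicate ONLY by
# exchanging messages over a connection (no shared memory), as in
# the DOS/NOS communication model. socketpair() simulates the link.

node_a, node_b = socket.socketpair()

node_a.sendall(b"PING")             # node A sends a request message
request = node_b.recv(1024)         # node B receives it ...
node_b.sendall(b"PONG:" + request)  # ... and replies with a message
reply = node_a.recv(1024)

print(reply.decode())  # PONG:PING

node_a.close()
node_b.close()
```

Everything each node knows about the other arrives through explicit messages, which is why agreed-upon network protocols are required for NOS and DOS but not for a shared-memory multiprocessor OS.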