Multiprocessor Systems (a)

Teodor Rus
[email protected]
The University of Iowa, Department of Computer Science

(a) These slides have been developed by Teodor Rus. They are copyrighted materials and may not be used in other course settings outside of the University of Iowa in their current form or modified form without the express written permission of the copyright holder. During this course, students are prohibited from selling notes to or being paid for taking notes by any person or commercial firm without the express written permission of the copyright holder.

Introduction to System Software – p.1/34

Multiprocessor systems
• A multiprocessor system has more than one control processor. Example: multi-core processors;
• Each control processor in a multiprocessor system executes programs either synchronously or asynchronously with the other processors;
• The control processors that compose a multiprocessor system may interact with each other by sharing resources (memory, I/O devices, information, time), by sending messages, or both.

Fact
The major problem in developing multiprocessor systems is the cost of communication incurred when the processors share resources or send and receive messages.

Multiprocessor systems
• Tightly coupled;
• Loosely coupled;
• Distributed multiprocessors.

Tightly coupled
All processors share the same memory (Figure 1).

[Figure 1: Tightly coupled multiprocessor. Processor 1, ..., Processor i, ..., Processor n are each connected to one shared memory.]

Loosely coupled
In addition to a common shared memory, each processor has its own private memory (Figure 2).

[Figure 2: Loosely coupled multiprocessors. Processor 1, ..., Processor i, ..., Processor n each have a private memory (Memory 1, ..., Memory i, ..., Memory n) and are also connected to a common shared memory.]

Distributed
• A distributed system is defined by a computer network;
• Each node of the network is occupied by a completely equipped computing platform, which could be based on:
1. A sequential machine;
2. Parallel machines;
3. Both sequential and parallel machines.

Software support
Software support for shared memory multiprocessor systems consists of:
1. Mechanisms for defining shared resources. Example: IPC shared resources in Unix System V.
2. Mechanisms for process creation, process execution, and controlling process interaction. Example: fork(), exec(), wait(), exit(), signal(), and pipe() in Unix.
3. High-level language support for partitioning programs into parallel streams of control and controlling their interaction. Example: P-threads (POSIX threads).

Example language support
• Shared data declarations, lock data type, and libraries of functions. Example: IPC (InterProcess Communication, Unix System V) mechanisms, tasking, threading (POSIX, Portable Operating System Interface), etc.
• Using coordination languages as extensions of the conventional sequential language. Examples: Linda, Message Passing Interface (MPI), Parallel Virtual Machine (PVM).
• Parallelizing compilers.

Distributed systems
Distributed systems are also called message-based multiprocessor systems or multi-computers.
Terminology:
1. Components of a multi-computer system are called sites or nodes;
2. The connection between two sites is called a communication line;
3. Communication lines between nodes are established by sockets.

Software support
The software support for a multi-computer system consists of:
1. IP addresses, which are mechanisms for recognizing other sites of the system.
Note: the IP (Internet Protocol) address of a site is a unique number that identifies the site on the network and lets sites communicate with each other.
2. Communication protocols, which are templates that define the formats of the information received from other sites.
Note: a communication protocol consists of a set of standard rules for data representation, signaling, authentication, and error detection.

More software support
• A monitor which transforms the information structures received from sender sites into the information structures of the receiver sites. Example: serialization and deserialization mechanisms used by Common Object Request Broker Architecture (CORBA) (a) and Remote Procedure Call (RPC).
• Mechanisms for process execution and process communication on remote sites. Example: Remote Procedure Call in C, Remote Method Invocation in Java, Sockets, CORBA, etc.

(a) See Object Management Group, OMG.

Why multiprocessors?
The main reasons for designing multiprocessor systems are:
• Sharing resources;
• Increasing computation power;
• Increasing reliability.

Sharing resources
• A user at one site of the system may be able to use an available resource at another site.
• Information is one of the most common resources shared in a multiprocessor system.
• Processors, memories, and devices can also be shared, and computing power can be distributed among the sites which define the multiprocessing system.
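
Example: sharing information under a lock
The shared data declarations and lock data type mentioned under language support can be illustrated with a minimal sketch. It is not taken from the slides: the names run_workers, counter, and worker are hypothetical, and Python threads stand in for the processors of a tightly coupled system only conceptually (the interpreter may not run them truly in parallel).

```python
import threading


def run_workers(num_workers: int, increments_per_worker: int) -> int:
    """Have several threads update one shared counter under a lock."""
    counter = {"value": 0}      # shared resource, visible to every thread
    lock = threading.Lock()     # lock guarding the shared counter

    def worker() -> None:
        for _ in range(increments_per_worker):
            # Only one thread at a time may do the read-modify-write.
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                # wait for every worker to finish
    return counter["value"]


if __name__ == "__main__":
    print(run_workers(num_workers=4, increments_per_worker=10_000))
```

Every acquisition of the lock is a small instance of the communication cost the slides identify as the major problem: the more often the workers touch the shared counter, the more time they spend waiting for each other.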
Increasing computing power
• Computations must be decomposed into sequential components that can be executed concurrently;
• The study of this decomposition is the task of parallel algorithms;
• Programming such algorithms is the task of parallel programming;
• Some programming languages now provide special constructs to express parallel computations;
• The operating system schedules the sequential components of a program on the available processors, which run in parallel.
Note: today's hot topic in multiprocessing systems is parallel programming of multi-core computers.

Increasing reliability
A system is reliable (error tolerant) if it is able to:
1. Detect the failure of system components;
2. Isolate the component which fails;
3. Redistribute the computing tasks in the system accordingly, i.e., reconfigure the system and reschedule its activity;
4. Inform the user about this new configuration by sending appropriate messages to the user's terminal.

System recovery
Properties of a multiprocessor system:
1. Failures of processors in a multiprocessor system are detectable;
2. Computing activities can be carried out by other processors;
3. Errors can be isolated and their propagation can be stopped.
Hence, a multiprocessor system can indeed be reliable, i.e., error tolerant.

System architectures
• Master-slave;
• Computer network;
• Data type.

Master-slave approach
• The multiprocessor system is provided with a master component and a collection of working components (slaves);
• Each working component can be assigned a specific activity, while the master component controls the system activity as a whole;
• A time-sharing system is an example of a master-slave multiprocessor system.
Question: Who is the master in a TSS?
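
Example: a master distributing work
The decomposition of a computation into sequential components, with a master assigning them to working components, can be sketched as follows. This is only an illustration under stated assumptions: the functions split and parallel_sum are hypothetical names, summation is an arbitrary stand-in for the computation, and a thread pool from the Python standard library plays the role of the working components.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data: list, parts: int) -> list:
    """Master step: cut the input into at most `parts` contiguous chunks."""
    if not data:
        return []
    size = -(-len(data) // parts)   # ceiling division; assumes parts >= 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum(data: list, num_workers: int) -> int:
    """Master: hand one chunk to each worker, then combine the results."""
    chunks = split(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Each worker runs an ordinary sequential sum over its own chunk.
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)        # combining step, done by the master


if __name__ == "__main__":
    print(parallel_sum(list(range(1000)), num_workers=4))
```

The master here controls the activity as a whole (splitting and combining), while each worker only sees its own sequential component. A reliable variant would additionally detect a failed worker and resubmit its chunk to another one; that step is omitted from this sketch.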
Computer-network approach
1. A computer network consists of a collection of independent computer systems;
2. The configuration of computer components may range from processors provided only with their own memory while sharing other components, to processors completely equipped with memory, I/O devices, and information carriers.
Note: the communication protocol between components classifies networks as synchronous, asynchronous, or hybrid.

Classification
• A synchronous network is characterized by the lock-step manner in which the processors that belong to the network operate.
• An asynchronous network is characterized by the asynchronous manner in which the processors that belong to the network operate. Each processor may independently execute its own stream of instructions.

Classification, continuation
• A hybrid network can be dynamically controlled (by software or by hardware) to behave as a synchronous or an asynchronous network.
• An example of a hybrid network is the Connection Machine CM-5.

Topology of a network
• The communication lines in a network are established according to the given "communication protocols".
• Since the cost of communication in a computer network depends upon its communication lines, the topology of the network is a characteristic of the system.
• We summarize such characterizations in the table in Figure 3.

Topology characterization

Topology             Cost         Speed          Reliability
Fully connected      Exponential  Exponential    Very reliable
Partially connected  Polynomial   Polynomial     Less reliable
Hierarchy            Logarithmic  Not very fast  Not reliable
Star                 Linear       Fast           Reliable
Ring                 Linear       Slow           Not reliable
Cube                 Linear       Very fast      Reliable
Multibus             Linear       Fast           Reliable

Figure 3: Network topologies

Data-type approach
• Each processor is defined by the collection of operations it can execute on the types of memory locations characterizing it.
• A processor becomes a hardware-recognized data type of the network, while its computations represent abstract data types.
• Scheduling requires matching the abstract data type to the real processor data type.
• Each computing process in these systems is controlled by the same operating system.

Fact
The most advanced example of hardware designed to support a data-type multiprocessor system seems to be the Intel iAPX-432.

Operating System Overview
Operating systems presented in this chapter can be characterized in terms of:
1. Data processed by the system;
2. Resources shared under the control of the system;
3. Concurrent activities controlled by the system;
4. Interaction controlled by the operating system;
5. Debugging mechanism under the operating system;
6. Constraints regarding system response time.

Operating systems discussed
1. Batch operating systems (BOS);
2. Multiprogramming batch (Multiprogramming BOS);
3. Interactive operating systems;
4. Time-sharing operating system (TSS);
5. Real-time systems;
6. Multiprocessing operating system.

Data
Data processed by the operating systems are:
1. Job data structure, JDS;
2. Process data structure, PDS;
3. Both, that is, JDS and PDS.

Sharing
Resources shared under the operating system control are:
1. Information stored in various memory media;
2. I/O devices available in the system;
3. The memory of the computer system;
4. The time, measured in processor clocks.

Concurrency
The concurrent activities controlled by the operating system are:
1. Processor and devices while performing computations specified by the same program;
2. Processor and devices while performing computations specified by different programs;
3. Processors and devices while processing in a multiprocessor environment.

Interaction
The interaction controlled by the operating system is one of:
1. No interaction is allowed between users and their computations;
2. Users interact with their computations during their life in the system.

Debugging
Program debugging under the OS is performed:
• Statically, from snapshot dumps;
• Dynamically, by interacting with the running program.

Time constraints
The time constraints controlled by the operating systems are:
1. No time constraints;
2. Processing should be complete in a given time interval;
3. Processing must be complete in a given time interval.
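
Example: the three time constraints
The difference between "should be complete" and "must be complete" can be made concrete with a small sketch. The names Constraint, DeadlineMissed, and check_deadline are hypothetical, and the chosen reactions (reporting "late" versus raising an error) are an assumption made for illustration; the slides do not prescribe how a system reacts to a missed interval.

```python
from enum import Enum


class Constraint(Enum):
    NONE = "none"      # 1. no time constraint
    SHOULD = "should"  # 2. processing should finish within the interval
    MUST = "must"      # 3. processing must finish within the interval


class DeadlineMissed(Exception):
    """Raised when a MUST constraint is violated."""


def check_deadline(elapsed: float, limit: float, constraint: Constraint) -> str:
    """Classify a finished computation against its time constraint."""
    if constraint is Constraint.NONE or elapsed <= limit:
        return "ok"
    if constraint is Constraint.SHOULD:
        return "late"   # tolerated, but worth reporting
    # Only MUST remains: a missed interval counts as a failure.
    raise DeadlineMissed(f"took {elapsed}s, limit was {limit}s")
```

Under this reading, a batch system would use NONE, an interactive or time-sharing system would typically treat response time as SHOULD, and a real-time system would treat it as MUST.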