Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CS4023 – Operating Systems (week 5) Communication models in processes Threads Dr. Atif Azad [email protected] 1 Acknowledgement • Significant material in this set of lectures has been borrowed from: – http://www.cs.berkeley.edu/~kubitron/courses/cs162 /. Copyright © 2010 UCB – http://cs162.eecs.berkeley.edu. Copyright © 2014 David Culler. Copyright ©2014 UCB. – http://www.os-book.com . © 2014 Silberschatz et al. – Dr Patrick Healy at CSIS Department, University of Limerick. 2 A simple address translation: Base & Bound code 0000… code 0000… Static Data Static Data heap heap stack Base Address stack 1000… code Program address Static Data heap Bound < 1100… • Can the pgm touch OS? • Can it touch other pgms? 9/3/14 1000… stack 1100… FFFF… Concurrency • “Thread” of execution – Independent Fetch/Decode/Execute loop – Operating in some Address space – Single-Threaded processes for this lecture. (also called “heavy weight” process) • Uniprogramming: one thread at a time – – – – MS/DOS, early Macintosh, Batch processing Easier for operating system builder Get rid of concurrency by defining it away Does this make sense for personal computers? • Multiprogramming: more than one thread at a time – Multics, UNIX/Linux, OS/2, Windows NT/2000/XP, Mac OS X • ManyCore Multiprogramming, right? The Basic Problem of Concurrency • The basic problem of concurrency involves resources: – Hardware: single CPU, single DRAM, single I/O devices – Multiprogramming API: users think they have exclusive access to shared resources • OS Has to coordinate all activity – Multiple users, I/O interrupts, … – How can it keep all these things straight? • Basic Idea: – abstract the notion of an executing program (process representation) – Then, worry about multiplexing these abstract beings • Dijkstra did this for the “THE system” – Few thousand lines vs 1 million lines in OS 360 (1K bugs) Multiprogramming – Many Processes Proc 1 Proc 2 … code Proc n 0000… Static Data heap OS stack sysmode 1 code Base 1000 … 0000… Static Data Bound 1100… FFFF… heap uPC xxxx… PC stack 0000 1234 code regs 00FF… 1000… 1100… 3000… Static Data … heap stack 3080… Multiprogramming – Interrupt (kernel code) Proc 1 Proc 2 … code Proc n 0000… Static Data heap OS stack sysmode 0 code Base 1000 … 0000… Static Data Bound 1100 … FFFF… heap uPC PC 0000 1234 IntrpVector[i] code regs • 00FF… How to save registers and set up system stack? stack 1000… 1100… 3000… Static Data … heap stack 3080… Multiprogramming – Process 2 Proc 1 Proc 2 … code Proc n 0000… Static Data heap OS stack sysmode 1 code 1000 … 1100 … 0000 1234 Base 3000 … 0000… Static Data Bound 0080 … FFFF… heap uPC PC regs 00FF… • xxxx xxxx stack 000 0248 code regs 00D0… How to save registers and set up system stack? 1000… 1100… 3000… Static Data … heap stack 3080… 8 Concurrency: illusion of many processors CPU1 CPU2 CPU3 Shared Memory CPU1 CPU2 CPU3 CPU1 CPU2 Time • Concurrency provides the illusion of multiple processors? – Multiplex in time! • Each process is like a virtual “CPU”; needs a structure to hold: – Program Counter (PC), Stack Pointer (SP) – Registers (Integer, Floating point, others…?) • Q: What structure holds this information: PCB. (coming next) • How switch from one CPU to the next? – Save PC, SP, and registers in current state block – Load PC, SP, and registers from new state block • What triggers switch? – Timer, voluntary yield, I/O, other things • Q: what states can a process go to after a switch? Diagram of Process State • As a process executes, it changes state – new: The process is being created – ready: The process is waiting to run – running: Instructions are being executed – waiting: Process waiting for some event to occur – terminated: The process has finished execution Multiplexing with Process Control Blocks (PCB) • The current state of process held in a process control block (PCB): – This is a “snapshot” of the execution and protection environment – Only one PCB active at a time • PCB also notes multiplexing process info : – CPU time to different processes (Scheduling): • Only one process “running” at a time • Give more time to important processes – Resources held by Process): • Memory Mapping: based and bound registers • I/O info: opened files, network connections etc. Process Control Block Process Representation in Linux Represented by the C structure task_struct pid t_pid; /* process identifier */ long state; /* state of the process */ unsigned int time_slice /* scheduling information */ struct task_struct *parent; /* this process’s parent */ struct list_head children; /* this process’s children */ struct files_struct *files; /* list of open files */ struct mm_struct *mm; /* address space of this process */ CPU Switch From Process to Process • This is also called a “context switch” • Code executed in kernel above is overhead – Overhead sets minimum practical switching time Context Switch in Detail • A context switch is the switching of the CPU from one process or thread to another. • In detail, kernel does the following: – Suspend one process and store its state as a PCB – retrieve the PCB for the next process from memory and restore it in the CPU’s registers – Start executing the new process as pointed by its program counter (PC) – The more complex the OS and the PCB the longer the context switch • Time dependent on hardware support – Some hardware provides multiple sets of registers per CPU multiple contexts loaded at once 14 Operations on Processes • Operating systems must provide mechanisms for: – process creation, – process termination, Process Creation • Parent process create children processes, which, in turn create other processes, forming a tree of processes • Generally, process identified and managed via a process identifier (pid) • Resource sharing options – Parent and children share all resources – Children share subset of parent’s resources – Parent and child share no resources • Execution options – Parent and children execute concurrently – Parent waits until children terminate A Tree of Processes in Linux init pid = 1 login pid = 8415 khelper pid = 6 bash pid = 8416 ps pid = 9298 emacs pid = 9204 sshd pid = 3028 kthreadd pid = 2 pdflush pid = 200 sshd pid = 3610 tcsch pid = 4005 • Run pstree command. Can’t see init? Instead see systemd? • http://www.pcworld.com/article/2841873/meet-systemd-the-controversialproject-taking-over-a-linux-distro-near-you.html Process Creation (Cont.) • Address space – Child duplicate of parent – Child has a program loaded into it • UNIX examples – fork() system call creates new process – exec() system call used after a fork() to replace the process’ memory space with a new program C Program Forking Separate Process ?? Parent Process Child Process Process Termination • Process executes last statement and then asks the operating system to delete it using the exit() system call. – Returns status data from child to parent (via wait()) – Process’ resources are deallocated by operating system • Parent may terminate the execution of children processes using the abort() system call. Some reasons for doing so: – Child has exceeded allocated resources – Task assigned to child is no longer required – The parent is exiting and the operating systems does not allow a child to continue if its parent terminates Process Termination • Some operating systems do not allow child to exist if its parent has terminated. If a process terminates, then all its children must also be terminated. – cascading termination. All children, grandchildren, etc. are terminated. – The termination is initiated by the operating system. • The parent process may wait for termination of a child process by using the wait()system call. The call returns status information and the pid of the terminated process • pid = wait(&status); • If no parent waiting (did not invoke wait()) process is a zombie • If parent terminated without invoking wait , process is an orphan Multiprocess Architecture – Chrome Browser • Many web browsers ran as single process (some still do) – If one web site causes trouble, entire browser can hang or crash • Google Chrome Browser is multiprocess with 3 different types of processes: – Browser process manages user interface, disk and network I/O – Renderer process renders web pages, deals with HTML, Javascript. A new renderer created for each website opened • Runs in sandbox restricting disk and network I/O, minimizing effect of security exploits – Plug-in process for each type of plug-in Resource Duplication in Child Process • Child gets a copy of parent’s resources • Can involve a lot of overhead – Copying several GB of data – Only to throw that memory space away and load a new address space with exec()? • Solution: – Copy on write 23 Copy On Write Code Segments Data Copies //Parent code segment (a) string i = “Good”; //Child1 Code Segment (b) { printf(“in child 1”); } //Child2 Code Segment (c) { printf(“in child 2”); i = i+ “Bye”; } a Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305 RWCString is class implemented using a Copy-on-Write technique “Good” Copy On Write Code Segments Data Copies //Parent code segment (a) string i = “Good”; //Child1 Code Segment (b) { printf(“in child 1”); } //Child2 Code Segment (c) { printf(“in child 2”); i = i+ “Bye”; } a b Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305 RWCString is class implemented using a Copy-on-Write technique “Good” Copy On Write Code Segments Data Copies //Parent code segment (a) string i = “Good”; //Child1 Code Segment (b) { printf(“in child 1”); } //Child2 Code Segment (c) { printf(“in child 2”); i = i+ “Bye”; } a b c Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305 RWCString is class implemented using a Copy-on-Write technique “Good” Copy On Write Code Segments Data Copies //Parent code segment (a) string i = “Good”; //Child1 Code Segment (b) { printf(“in child 1”); } //Child2 Code Segment (c) { printf(“in child 2”); i = i+ “Bye”; } a b “Good” c “Good Bye” Read section 9.3 of the book SGG: The page containing the data to be written to is copied and then amended. Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305 RWCString is class implemented using a Copy-on-Write technique Short Term Scheduler Process Scheduling Long Term Scheduler? Degree of multiprogramming • PCBs move from queue to queue as they change state – Decisions about which order to remove from queues are Scheduling decisions – Many algorithms possible (few weeks from now) Interprocess Communication • Processes within a system may be independent or cooperating • Cooperating process can affect or be affected by other processes, including sharing data • Reasons for cooperating processes: – – – – Information sharing (Copy and Paste) Computation speedup (parallel work) Modularity Convenience • Cooperating processes need interprocess communication (IPC) – Link: Interprocess communication on Windows • Two models of IPC – Shared memory – Message passing Communications Models (a) Message passing. (b) shared memory. Which is potentially faster? Shared Memory: Producer-Consumer Metaphor • An area of memory shared among the processes that wish to communicate • Paradigm for cooperating processes: producer process produces information that is consumed by a consumer process – Issue: synchronisation • The consumer should not read before the producer has produced; • producer should not over-produce. Synchronisation In == out empty in+1 == out full Consumer out Shared Buffer in Producer • Producer/Consumer may wait: – Either for the space/content to be available – Or CPU time (assume a single core: only one process at a time) 32 • Bounded-Buffer – Shared-Memory Solution Shared data #define BUFFER_SIZE 10 typedef struct { . . . } item; item buffer[BUFFER_SIZE]; int in = 0; int out = 0; item next_produced; while (true) { /* produce an item in next produced */ while (((in + 1) % BUFFER_SIZE) == out) ; /* do nothing */ buffer[in] = next_produced; in = (in + 1) % BUFFER_SIZE; } • Solution is correct, but can only use BUFFER_SIZE-1 elements Bounded Buffer – Consumer item next_consumed; while (true) { while (in == out) ; /* do nothing */ next_consumed = buffer[out]; out = (out + 1) % BUFFER_SIZE; /* consume the item in next consumed */ } Synchronisation covered in detail in Chapter 5 of SGG IPC POSIX Producer IPC POSIX Consumer Interprocess Communication – Message Passing • Mechanism for processes to communicate and to synchronize their actions • Message system – processes communicate with each other without resorting to shared variables • IPC facility provides two operations: – send(message) – receive(message) • The message size is either fixed or variable • If processes P and Q wish to communicate, they need to: – Establish a communication link between them – Exchange messages via send/receive Message Passing (Cont.) • Implementation of communication link – Physical: • Shared memory • Hardware bus • Network – Logical: • Direct or indirect • Synchronous or asynchronous Indirect Communication • Messages are directed and received from mailboxes (also referred to as ports) – Each mailbox has a unique id – Processes can communicate only if they share a mailbox • Properties of communication link – Link established only if processes share a common mailbox – A link may be associated with many processes – Each pair of processes may share several communication links – Link may be unidirectional or bi-directional Synchronization Message passing may be either blocking or non-blocking Blocking is considered synchronous Blocking send -- the sender is blocked until the message is received Blocking receive -- the receiver is blocked until a message is available Non-blocking is considered asynchronous Non-blocking send -- the sender sends the message and continue Non-blocking receive -- the receiver receives: A valid message, or Null message • Different combinations possible – If both send and receive are blocking, we have a rendezvous Communications in Client-Server Systems • • • • Pipes Sockets Remote Procedure Calls Remote Method Invocation (Java) Pipes • Acts as a conduit allowing two processes to communicate • Issues: – Is communication unidirectional or bidirectional? – In the case of two-way communication, is it half or fullduplex? – Must there exist a relationship (i.e., parent-child) between the communicating processes? – Can the pipes be used over a network? • Ordinary pipes – cannot be accessed from outside the process that created it. Typically, a parent process creates a pipe and uses it to communicate with a child process that it created. • Named pipes – can be accessed without a parent-child relationship. Ordinary Pipes • Ordinary Pipes allow communication in standard producerconsumer style • Producer writes to one end (the write-end of the pipe) • Consumer reads from the other end (the read-end of the pipe) • Ordinary pipes are therefore unidirectional • Require parent-child relationship between communicating processes • Windows calls these anonymous pipes • See Unix and Windows code samples in textbook Sockets • A socket is defined as an endpoint for communication • • • Concatenation of IP address and port IP: four numbers between 0 – 255: e.g. 161.25.19.8 port a number included at start of message packet to differentiate network services on a host. • The socket 161.25.19.8:1625 refers to port 1625 on host 161.25.19.8 • Communication consists between a pair of sockets • All ports below 1024 are well known, used for standard services. – Assigned by Internet Assigned Numbers Authority • Special IP address 127.0.0.1 (loopback) to refer to system on which process is running Socket Communication Sockets in Java • Three types of sockets – Connection-oriented (TCP) – Connectionless (UDP) – MulticastSocket class– data can be sent to multiple recipients • Consider this “Date” server: • See Fig. 3.22 in the book for the client code. Lab Sample code • FILE* outfile = fopen(“copy.txt”, “w”); • while ( (c=getc(infile)) !=EOF){ – putc(c,outfile); • } • No check on the system call’s return value • Can crash your program – Can bring your entire system down – Can cost millions to your business – An awkward meeting with the boss • Only going to deduct 1 mark for such mistakes. 47 Midterm Exam • Exam on Monday, 19 October at 1600 hrs at CSG001. • Special Needs: email Farshad Toosi for separate seating. • Everything covered including extra reading material • More instructions later on the class website. 48