Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CS519: Lecture 7 Uniprocessor and Multiprocessor Scheduling What and Why? What is processor scheduling? Why? At first to share an expensive resource – multiprogramming Now to perform concurrent tasks because processor is so powerful Future looks like past + now Rent-a-computer approach – large data/processing centers use multiprogramming to maximize resource utilization Systems still powerful enough for each user to run multiple concurrent tasks CS 519 – Operating Systems Theory 2 DCS, Rutgers University Assumptions Pool of jobs contending for the CPU CPU is a scarce resource Jobs are independent and compete for resources (this assumption is not always used) Scheduler mediates between jobs to optimize some performance criteria CS 519 – Operating Systems Theory 3 DCS, Rutgers University Types of Scheduling We’re mostly concerned with short-term scheduling CS 519 – Operating Systems Theory 4 DCS, Rutgers University What Do We Optimize? System-oriented metrics: Processor utilization: percentage of time the processor is busy Throughput: number of processes completed per unit of time User-oriented metrics: Turnaround time: interval of time between submission and termination (including any waiting time). Appropriate for batch jobs Response time: for interactive jobs, time from the submission of a request until the response begins to be received Deadlines: when process completion deadlines are specified, the percentage of deadlines met must be promoted CS 519 – Operating Systems Theory 5 DCS, Rutgers University Design Space Two dimensions Selection function Which of the ready jobs should be run next? Preemption Preemptive: currently running job may be interrupted and moved to Ready state Non-preemptive: once a process is in Running state, it continues to execute until it terminates or it blocks for I/O or system service CS 519 – Operating Systems Theory 6 DCS, Rutgers University Job Behavior CPU I/O-bound jobs Jobs that perform lots of I/O Tend to have short CPU bursts CPU-bound jobs Jobs that perform very little I/O Tend to have very long CPU bursts Distribution tends to be hyper-exponential Very large number of very short CPU bursts A small number of very long CPU bursts Disk CS 519 – Operating Systems Theory 7 DCS, Rutgers University Scheduling Algorithms FIFO: non-preemptive Round-Robin: preemptive Shortest Job Next: non-preemptive Shortest Remaining Time: preemptive at arrival Highest Response Ratio (turnaround/service time) Next: non-preemptive Priority with feedback: preemptive at time quantum CS 519 – Operating Systems Theory 8 DCS, Rutgers University Example Job Set Arrival Time Service Time 1 0 3 2 2 6 3 4 4 4 6 5 5 8 2 Process CS 519 – Operating Systems Theory 9 DCS, Rutgers University Behavior of Scheduling Policies CS 519 – Operating Systems Theory 10 DCS, Rutgers University Behavior of Scheduling Policies CS 519 – Operating Systems Theory 11 DCS, Rutgers University Priority with Feedback Scheduling After each preemption, process moves to lower-priority queue CS 519 – Operating Systems Theory 12 DCS, Rutgers University Scheduling Algorithms FIFO is simple but leads to poor average response times. Short processes are delayed by long processes that arrive before them RR eliminate this problem, but favors CPU-bound jobs, which have longer CPU bursts than I/O-bound jobs SJN, SRT, and HRRN alleviate the problem with FIFO, but require information on the length of each process. This information is not always available (although it can sometimes be approximated based on past history or user input) Feedback is a way of alleviating the problem with FIFO without information on process length CS 519 – Operating Systems Theory 13 DCS, Rutgers University It’s a Changing World Assumption about bi-modal workload no longer holds Interactive continuous media applications are sometimes processor-bound but require good response times New computing model requires more flexibility How to match priorities of cooperative jobs, such as client/server jobs? How to balance execution between multiple threads of a single process? CS 519 – Operating Systems Theory 14 DCS, Rutgers University Lottery Scheduling Randomized resource allocation mechanism Resource rights are represented by lottery tickets Have rounds of lottery In each round, the winning ticket (and therefore the winner) is chosen at random The chances of you winning directly depends on the number of tickets that you have P[wining] = t/T, t = your number of tickets, T = total number of tickets CS 519 – Operating Systems Theory 15 DCS, Rutgers University Lottery Scheduling After n rounds, your expected number of wins is E[win] = nP[wining] The expected number of lotteries that a client must wait before its first win E[wait] = 1/P[wining] Lottery scheduling implements proportional-share resource management Ticket currencies allow isolation between users, processes, and threads OK, so how do we actually schedule the processor using lottery scheduling? CS 519 – Operating Systems Theory 16 DCS, Rutgers University Implementation CS 519 – Operating Systems Theory 17 DCS, Rutgers University Performance Allocated and observed execution ratios between two tasks running the Dhrystone benchmark. With exception of 10:1 allocation ratio, all observed ratios are close to allocations CS 519 – Operating Systems Theory 18 DCS, Rutgers University Short-term Allocation Ratio CS 519 – Operating Systems Theory 19 DCS, Rutgers University Isolation Five tasks running the Dhrystone benchmark. Let amount.currency denote a ticket allocation of amount denominated in currency. Tasks A1 and A2 have allocations 100.A and 200.A, respectively. Tasks B1 and B2 have allocations 100.B and 200.B, respectively. Halfway thru experiment B3 is started with allocation 300.B. This inflates the number of tickets in B from 300 to 600. There’s no effect on tasks in currency A or on the aggregate iteration ratio of A tasks to B tasks. Tasks B1 and B2 slow to half their original rates, corresponding to the factor of 2 inflation caused by B3. CS 519 – Operating Systems Theory 20 DCS, Rutgers University Thread Scheduling for Cache Locality Traditionally, each resource (CPU, memory, I/O) has been managed separately Resources are not independent, however Policy for one resource can affect how another resource is used. For instance, the order in which threads are scheduled can affect performance of memory subsystem Neat paper that uses a very simple scheduling idea to enhance memory performance CS 519 – Operating Systems Theory 21 DCS, Rutgers University Main Idea When working with a large array, want to tile (block) for efficient use of the cache What is tiling? Restructuring loops for data re-use. Tiling by hand is a pain and is error-prone Compilers can automatically tile but not always. For instance, when program contains dynamically allocated or indirectly accessed data So, use threads and hints to improve cache utilization CS 519 – Operating Systems Theory 22 DCS, Rutgers University Example Thread ti is denoted by ti(ai1,…,aik), where aij is the address of the jth piece of data reference by thread ti. Simplify by using just 2 or 3 addresses or hints. Hints might be elements of rows or columns of a matrix, for ex. CS 519 – Operating Systems Theory 23 DCS, Rutgers University Algorithm Hash algorithm should assign threads to bins so that threads with similar hints fall in the same bin Threads in each bin are scheduled together 2-D plane of bins with 2 hints Key insight: the sum of the two dimensions of a bin is less than the cache size C Easy to extend to k hints CS 519 – Operating Systems Theory 24 DCS, Rutgers University Performance Fork = create and schedule null thread Run = execute and terminate null thread CS 519 – Operating Systems Theory 25 DCS, Rutgers University More Complex Examples Partial differential equation solver CS 519 – Operating Systems Theory 26 DCS, Rutgers University Multiprocessor Scheduling Load sharing: single ready queue; processor dequeues thread at the front of the queue when idle; preempted threads are placed at the end of queue Gang scheduling: all threads belonging to an application run at the same time Hardware partitions: chunk of the machine is dedicated to each application Advantages and disadvantages? CS 519 – Operating Systems Theory 27 DCS, Rutgers University Multiprocessor Scheduling Load sharing: poor locality; poor synchronization behavior; simple; good processor utilization. Affinity or per processor queues can improve locality. Gang scheduling: central control; fragmentation -unnecessary processor idle times (e.g., two applications with P/2+1 threads); good synchronization behavior; if careful, good locality Hardware partitions: poor utilization for I/Ointensive applications; fragmentation – unnecessary processor idle times when partitions left are small; good locality and synchronization behavior CS 519 – Operating Systems Theory 28 DCS, Rutgers University