Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Mobile operating system wikipedia , lookup
Burroughs MCP wikipedia , lookup
Copland (operating system) wikipedia , lookup
Plan 9 from Bell Labs wikipedia , lookup
Unix security wikipedia , lookup
Spring (operating system) wikipedia , lookup
Security-focused operating system wikipedia , lookup
Process management (computing) wikipedia , lookup
Advanced Operating Systems Lecture 6: Scheduling University of Tehran Dept. of EE and Computer Engineering By: Dr. Nasser Yazdani Univ. of Tehran Distributed Operating Systems 1 How efficiently use resources Sharing CPU and other resources of the systm. References Surplus Fair Scheduling: A Proportional-Share CPU Scheduling Algorithm for Symmetric Multiprocessors Scheduler Activations: Effective Kernel Support for UserLevel Management of Parallelism", Condor- A Hunter of Idle Workstation Virtual-Time Round-Robin: An O(1) Proportional Share Scheduler A SMART Scheduler for Multimedia Applications Linux CPU scheduling, Univ. of Tehran Distributed Operating Systems 2 Outline Scheduling Scheduling policies. Scheduling on Multiprocessor Thread scheduling Univ. of Tehran Distributed Operating Systems 3 What is Scheduling? Policies and mechanisms to allocates resources to entities. It is a general problem in any field, why? An O/S often has many pending tasks. The order may matter. Policy, correctness, or efficiency. Providing sufficient control is not easy. Threads, async callbacks, device input. Mechanisms must allow policy to be expressed. A good scheduling policy ensures that the most important entity gets the resources it needs Univ. of Tehran Distributed Operating Systems 4 Why Scheduling? This topic was popular in the days of time sharing, when there was a shortage of resources. Is it irrelevant now? in era of PCs when resources are plenty. The topic is back to handle massive Internet servers with paying customers Where some customers are more important than others New area such as multicores need scheduling Univ. of Tehran Distributed Operating Systems 5 Resources to Schedule? Resources : CPU time, physical memory, disk and network I/O, and I/O bus bandwidth. Entities to give resources : users, processes, threads, web requests. Univ. of Tehran Distributed Operating Systems 6 Key problems ? Gap between desired policy and available mechanism. Implementation is problem. Many conflicting goals (low latency, high throughput, and fairness), must make a trade-off between the goals. Interaction between different schedulers. One have to take a systems view. Just optimizing the CPU scheduler may do little to for the overall desired policy. Univ. of Tehran Distributed Operating Systems 7 Scheduling Policy Examples Allocate cycles in proportion to money. Maintain high throughput under high load. Never delay high pri thread by > 1ms. Maintain good interactive response. Can we enforce policy with the thread scheduler? Univ. of Tehran Distributed Operating Systems 8 General plan Understand where scheduling is occurring. Expose scheduling decisions, allow control. Account for resource consumption, to allow intelligent control. Where to schedule? Application is the best position to know scheduling requirements Which threads run best simultaneously Which are on Critical path But Kernel must make sure all play fairly Univ. of Tehran Distributed Operating Systems 9 Example Each process equal CPU time. Interrupt every 10 msec and select another in a round-robin fashion. Works if processes are compute-bound. What if a process waits for I/O? How long quantum? is 10 msec the right answer? Shorter quantum => better interactive , but lower throughput. What if the environment computes for 1 msec and sends an IPC to the file server environment? Shouldn't the file server get more CPU time because it operates on behalf of all other functions? Improvements: track "recent" CPU use and run one with least recent CPU use. (Still, if you sleep long enough you lose.) Other solution: directed yield; specify on the yield to which environment you are donating the remainder of the quantuam Tehran Distributed Operating Systems 10 (e.g.,Univ.toof the file server so that it can compute on the Scheduling is a System Problem Thread/process scheduler can’t enforce policies by itself. Needs cooperation from: All resource schedulers. Software structure. Conflicting goals may limit effectiveness. Univ. of Tehran Distributed Operating Systems 11 Goals Low latency High throughput People typing at editors want fast response - Network services can be latency-bound, not CPU-bound Minimize context switches to avoid wasting CPU, TLB misses, cache misses, even page faults. Fairness Univ. of Tehran Distributed Operating Systems 12 Scheduling Approaches FIFO + Fair - High latency Round robin + fair + low latency poor throughput STCF/SRTCF (shortest time/remaining time to completion first) + low latency + high throughput - unfair: Starvation Univ. of Tehran Distributed Operating Systems 13 Shortest Job First (SJF) Two types: Non-preemptive Preemptive Requirement: the elapse time needs to be known in advance Optimal if all jobs are available simultaneously (provable) Is SJF optimal if all the jobs are not available simultaneously? Univ. of Tehran Distributed Operating Systems 14 Preemptive SJF Also called Shortest Remaining Time First Schedule the job with the shortest remaining time required to complete Requirement: the elapse time needs to be known in advance Univ. of Tehran Distributed Operating Systems 15 Interactive Scheduling Usually preemptive Performance Criteria Time is sliced into quantum (time intervals) Decision made at the beginning of each quantum Min Response time best proportionality Representative algorithms: Priority-based Round-robin Multi Queue & Multi-level Feedback Shortest process time Guaranteed Scheduling Lottery Scheduling Fair Sharing Scheduling Univ. of Tehran Distributed Operating Systems 16 Priority Scheduling Each job is assigned a priority with FCFS within each priority level. Select highest priority job over lower ones. Rational: higher priority jobs are more mission-critical Example: DVD movie player vs. send email Problems: May not give the best AWT (Ave. Waiting Time) Indefinite blocking or starvation a process Univ. of Tehran Distributed Operating Systems 17 Set Priority Two approaches Static (for system with well known and regular application behaviors) Dynamic (otherwise) Priority may be based on: Cost to user. Importance of user. Aging Percentage of CPU time used in last X hours. Univ. of Tehran Distributed Operating Systems 18 Pitfall: Priority Inversion • • • • Low-priority thread X holds a lock. High-priority thread Y waits for the lock. Medium-priority thread Z pre-empts X. Y is indefinitely delayed despite high priority. When a higher priority process needs to read or modify kernel data that are currently being accessed by a lower priority process the higher priority process must wait! But the lower priority cannot proceed quickly due to scheduling. Solution: priority inheritance When a lower-priority process accesses a resource, it inherits high-priority until it is done with the resource in question. And then its priority reverses to its natural value. Univ. of Tehran Distributed Operating Systems 19 Pitfall: Long Code Paths Large-granularity locks are convenient. Non-pre-emptable threads are an extreme case. May delay high-priority processing. Every resource with multiple waiting threads has a scheduler, Locks, disk driver, memory allocator. The schedulers may not cooperate or even be explicit Univ. of Tehran Distributed Operating Systems 20 Pitfall: Efficiency Efficient disk use requires unfairness. Efficient paging policy creates delays. Shortest-seek-first vs FIFO. Read-ahead vs data needed now. O/S may swap out my idle Emacs to free memory. What happens when I type a key? Thread scheduler doesn’t control these. Univ. of Tehran Distributed Operating Systems 21 Pitfall: Server Processes User-level servers schedule requests. X11, DNS, NFS. They usually don’t know about kernel’s scheduling policy. Network packet scheduling also interferes. Univ. of Tehran Distributed Operating Systems 22 Pitfall: Hardware Schedulers Memory system scheduled among CPUs. I/O bus scheduled among devices. Interrupt controller chooses next interrupt. Hardware doesn’t know about O/S policy. O/S often doesn’t understand hardware. Univ. of Tehran Distributed Operating Systems 23 Time Quantum Time slice too large FIFO behavior Poor response time Time slice too small Too many context switches (overheads) Inefficient CPU utilization Heuristic: 70-80% of jobs block within time-slice Typical time-slice 10 to 100 ms Time spent in system depends on size of job. Univ. of Tehran Distributed Operating Systems 24 Multi-Queue Scheduling Hybrid between priority and round-robin Processes assigned to one queue permanently Scheduling between queues Example of different priorities Fixed Priorities % CPU spent on queue System processes Interactive programs Background Processes Student Processes Address the starvation and infinite blocking problems Univ. of Tehran Distributed Operating Systems 25 Scheduling Approaches Multilevel feedback queues A job starts with the highest priority queue If time slice expires, lower the priority by one level If time slice does not expire, raise the priority by one level Age long-running jobs Univ. of Tehran Distributed Operating Systems 26 Multi-Processor Scheduling: Load Sharing Decides Which process to run? How long does it run Where to run it? (CPU (horsepower)) I want to ride it Process 1 Univ. of Tehran … Process 2 Process n Distributed Operating Systems 27 Multi-Processor Scheduling Choices Self-Scheduled Master-Slave Each CPU dispatches a job from the ready queue One CPU schedules the other CPUs Asymmetric One CPU runs the kernel and the others runs the user applications. One CPU handles network and the other handles applications Univ. of Tehran Distributed Operating Systems 28 Gang Scheduling for MultiProcessors A collection of processes belonging to one job are running at the same time If one process is preempted, all the processes of the gang are preempted. Helps to eliminate the time a process spends waiting for other processes in its parallel computation. Univ. of Tehran Distributed Operating Systems 29 Lottery Scheduling Claim Priority-based schemes are ad hoc Lottery scheduling Randomized scheme Based on a currency abstraction, A process is scheduled to run if it has a ticket. Idea: Processes own lottery tickets CPU randomly draws a ticket and execute the corresponding process Univ. of Tehran Distributed Operating Systems 30 Properties of Lottery Scheduling Guarantees fairness through probability in a long run. Guarantees no starvation, as long as each process owns one ticket To approximate SRTCF Short jobs get more tickets Long jobs get fewer Univ. of Tehran Distributed Operating Systems 31 Partially Consumed Tickets What if a process is chosen, but it does not consume the entire time slice? The process receives compensation tickets Idea Get chosen more frequently But with shorter time slice Different implementation. Sort tickets and start with the large one Generate a random number and see who has the ticket. Univ. of Tehran Distributed Operating Systems 32 Ticket Currencies Load Insulation A process can dynamically change its ticketing policies without affecting other processes Need to convert currencies before transferring tickets Univ. of Tehran base:3000 1000 2000 Alice:200 Bob:100 200 100 process1:500 process2:100 200 300 100 thread1 thread2 thread3 Distributed Operating Systems 33 Condor Identifies idle workstations and schedules background jobs on them Guarantees job will eventually complete Analysis of workstation usage patterns Only 30% Remote capacity allocation algorithms Up-Down algorithm Allow fair access to remote capacity Remote execution facilities Remote Unix (RU) Univ. of Tehran Distributed Operating Systems 34 Condor Issues Leverage: performance measure Ratio of the capacity consumed by a job remotely to the capacity consumed on the home station to support remote execution Checkpointing: save the state of a job so that its execution can be resumed Transparent placement of background jobs Automatically restart if a background job fails Users expect to receive fair access Small overhead Univ. of Tehran Distributed Operating Systems 35 Condor - scheduling Hybrid of centralized static and distributed approach Each workstation keeps own state information and schedule Central coordinator assigns capacity to workstations Workstations use capacity to schedule Univ. of Tehran Distributed Operating Systems 36 Real time Systems Issues are scheduling and interrupts Must complete task by a particular deadline Examples: Accepting input from real time sensors Process control applications Responding to environmental events How does one support real time systems If short deadline, often use a dedicated system Give real time tasks absolute priority Do not support virtual memory Use early binding Univ. of Tehran Distributed Operating Systems 37 Real time Scheduling To initiate, must specify System accepts or rejects Deadline Estimate/upper-bound on resources If accepted, agrees that it can meet the deadline Places job in calendar, blocking out the resources it will need and planning when the resources will be allocated Some systems support priorities But this can violate the RT assumption for already accepted jobs Univ. of Tehran Distributed Operating Systems 38 User-level Thread Scheduling Possible Scheduling 50-msec process quantum run 5 msec/CPU burst Univ. of Tehran Distributed Operating Systems 39 Kernel-level Thread Scheduling Possible scheduling 50-msec process quantum threads run 5 msec/CPU burst Univ. of Tehran Distributed Operating Systems 40 Thread Scheduling Examples Solaris 2 priority-based process scheduling with four scheduling classes: real-time, system, time sharing, interactive. A set of priorities within each class. The scheduler converts the class-specific priorities into global priorities and selects to run the thread with the highest global priority. The thread runs until (1) it blocks, (2) it uses its time slice, or (3) it is preempted by a higher priority threads. JVM schedules threads using a preemptive, priority-based scheduling algorithm. schedules the ``runnable'' thread with the highest priority. If two threads have the same priority, JVM applies FIFO. schedules a thread to run if (1) other thread exits the ``runnable state'' due to block(), exit(), suspend() or stop() methods; (2) a thread with higher priority enters the ``runnable''state. Univ. of Tehran Distributed Operating Systems 41 Surplus Fair Scheduling Motivation Server Web Streaming Network End-stations E-commerce Diverse web and multimedia applications popular HTTP, Streaming, e-commerce, games, etc. Applications hosted on large servers (typically multiprocessors) Key Challenge: Design OS mechanisms for Resource Management Univ. of Tehran Distributed Operating Systems 42 Requirements for OS Resource Management Fair, Proportionate Allocation Application Isolation Eg: 20% for http, 30% for streaming, etc. Misbehaving/overloaded applications should not affect other applications Efficiency OS mechanisms should have low overheads Focus: Achieving these objectives for CPU scheduling on multiprocessor machines Univ. of Tehran Distributed Operating Systems 43 Proportional-Share Scheduling Wt=1 2/3 1/3 Applications CPU bandwidth Associate a weight with each application and allocate CPU bandwidth proportional to weight Existing Algorithms Wt=2 Ideal algorithm: Generalized Processor Sharing (GPS) E.g.: WFQ, SFQ, SMART, BVT, etc. Question: Are the existing algorithms adequate for multiprocessor systems? Univ. of Tehran Distributed Operating Systems 44 Starvation Problem SFQ : Start tag of a thread ( Service / weight ) Schedules the thread with minimum start tag ... Start tag is add with service time/weight S1=0 S1=1 S2=0 S2=100 CPU 1 ... CPU 2 Univ. of Tehran 0 100 S1=11 A (Wt=100) S2=1000 B starves S3=10 CPU 2 Time ... S1=10 1000 S3=110 ... B (Wt=1) C (Wt=1) 1100 C arrivesOperating Systems Distributed 45 Weight Readjustment Reason for starvation: Infeasible Weight Assignment (eg: 1:100 for 2 CPUs) Accounting is different from actual allocation Observation: A thread can’t consume more than 1 CPU bandwidth A thread can be assigned at most (1/p) of total p CPU bandwidth wi 1 Feasibility Constraint: p wj Univ. of Tehran j Systems Distributed Operating 46 Weight Readjustment (contd.) Goal: Convert given weights to feasible weights Decreasing Order of weights ... CPU 1 CPU 2 CPU 3 ... CPU p Efficient: Algorithm is O(p) Can be combined with existing algorithms Univ. of Tehran Distributed Operating Systems 47 Effect of Readjustment SFQ with Readjustment SFQ without Readjustment Number of iterations (105) 30 25 A (wt=10) A (wt=10) 20 B (wt=1) 15 B (wt=1) 10 C (wt=1) 5 0 C (wt=1) 0 10 20 30 Time (s) 40 50 0 10 20 30 40 50 Time (s) Weight Readjustment gets rid of starvation problem Univ. of Tehran Distributed Operating Systems 48 Short Jobs Problem Frequent arrivals and departures of Ideal SFQ short jobs Number of iterations (105) 20 J1, wt=20 J2-J21, wt=1x20 J_short, wt=5 15 J1, wt=20 J2-J21, wt=1x20 J_short, wt=5 10 5 0 0 5 10 15 20 25 30 35 40 0 5 10 20 25 30 35 40 Time (s) Time (s) 15 SFQ does unfair allocation! Univ. of Tehran Distributed Operating Systems 49 Surplus Fair Scheduling Surplus = ServiceActual - ServiceIdeal Service received by thread i Actual Ideal Surplus Time t Scheduler picks the threads with least surplus values Lagging threads get closer to their due Threads that are ahead are restrained Univ. of Tehran Distributed Operating Systems 50 Surplus Fair Scheduling (contd.) Start tag (Si) : Weighted Service of thread i Si = Servicei / wi Virtual time (v) : Minimum start tag of all runnable threads Surplus (α i ) : α i = Servicei - Servicelagging = wi Si - wi v Scheduler selects threads in increasing order of surplus Univ. of Tehran Distributed Operating Systems 51 Surplus Fair Sched with Short Jobs Number of iterations (105) Ideal Surplus Fair Sched 20 J1, wt=20 J2-J21, wt=1x20 J_short, wt=5 15 J1, wt=20 J2-J21, wt=1x20 J_short, wt=5 10 5 0 0 5 10 15 20 25 30 35 40 0 5 10 Time (s) 15 Surplus Fair Scheduling does proportionate allocation Univ. of Tehran Distributed Operating Systems 20 25 30 35 40 Time (s) 52 Proportionate Allocation Processor Shares received by two web servers 7 6 5 Processor Allocation 4 (Normalized) 3 2 1 0 1:1 1:2 1:4 1:7 Weight Assignment Univ. of Tehran Distributed Operating Systems 53 Application Isolation MPEG decoder with background compilations 50 Surplus Fair Time-sharing 40 Frame Rate 30 (frames/sec) 20 10 0 0 2 4 6 8 10 Number of background compilations Univ. of Tehran Distributed Operating Systems 54 Scheduling Overhead 10 Surplus Fair Time-sharing 8 Context switch 6 time (microsec) 4 2 0 0 10 20 30 40 Number of processes 50 Context-switch time(~10μ s) vs. Quantum size (~100ms) Distributed Operating Systems Univ. of Tehran 55 Summary Existing proportional-share algorithms inadequate for multiprocessors Readjustment Algorithm can reduce unfairness Surplus Fair Scheduling practical for multiprocessors Achieves proportional fairness, isolation Has low overhead Heuristics for incorporating processor affinity Source code available at: http://lass.cs.umass.edu/software/gms Univ. of Tehran Distributed Operating Systems 56 Scheduler Activations In a multiprocessor system, threads could be managed in: User Space only Kernel Space only Key feature: Cooperative Key feature: Preemptive User Space on top of Kernel Space Univ. of Tehran Some User-Level Threads ------------------------Some Kernel-Level Threads ------------------------Some CPUs Distributed Operating Systems 57 Scheduler activations User level scheduling of threads Kernel allocates threads to tasks Application maintains scheduling queue Makes upcall to scheduling code in application when thread is blocked for I/O or preempted Only user level involved if blocked for critical section User level will block on kernel calls Kernel returns control to application scheduler Univ. of Tehran Distributed Operating Systems 58 User-Level Thread Management Sample measurements were obtained by Firefly running Topaz (in microsecs). Procedure call: 7 microsecs. Kernel Trap: 19 microsecs. Operation FastThreads Topaz Threads Ultrix Processes Null Fork 34 948 11300 Signal-Wait 37 441 1840 Univ. of Tehran Distributed Operating Systems 59 User-level on top of kernel threads Three layers: Some User-Level Threads (how many?) Problems caused by: --------------Some Kernel-Level Threads (how many?) --------------Some CPUs (how many?) Univ. of Tehran Kernel threads are scheduled obliviously with respect to the user-level thread state Kernel threads block, resume, and are preempted without notification to user level Distributed Operating Systems 60 The way out: Scheduler Activation Processor allocation is done by the kernel Thread scheduling is done by each address space The kernel notifies the address space thread scheduler of every event affecting the address space The address space notifies kernel of the subset of user-level events that can affect processor allocation decisions Univ. of Tehran Distributed Operating Systems 61 Scheduler Activation (cont.) Goal Design a kernel interface and a user-level thread package that can combine the functionality of kernel threads with performance and flexibility of user-level threads. Secondary Goal: If thread operations do not involve kernel intervention the achieved performance should be similar to user-level threads. Univ. of Tehran Distributed Operating Systems 62 Scheduler Activation (cont.) The difficulty is IN achieving all the above: in a multi-programmed/multiprocessor system the required control and scheduling information is distributed between the kernel and the user-space! To be able to manage the application’s parallelism successfully: user-level support routines (software) must be aware of kernel events (processor reallocations, I/O requests and completions etc.) –this often is all HIDDEN stuff from the application. Univ. of Tehran Distributed Operating Systems 63 Scheduler Activation (cont.) 1. Provide each application with a VIRTUAL MULTIPROCESSOR. 2. To achieve the above, the kernel NOTIFIES. 3. Application knows how many processors are allocated. Application has total control over the processors and its own scheduling. The OS has control over the allocation of processors among address spaces and ability to change the number of processors assigned to an application during its execution. The address space thread scheduler for kernel events affecting it! Application has complete knowledge of its scheduling state. The user-space thread system NOTIFIES the kernel for kernel operations that may affect the allocation of processors (helping in good performance!). Kernel mechanism that achieves the above: SCHEDULER ACTIVATIONS. Univ. of Tehran Distributed Operating Systems 64 Scheduler Activation Data Structures Each scheduler activation maintains two execution stacks: One mapped into the kernel. Another mapped into the application address space. Each user-level thread is provided with its own stack at creation time. When a user-level thread calls into kernel, it uses its activation’s kernel stack. The user-level thread scheduler runs on the activation’s userlevel stack. The kernel maintains. Activation control block: records the state of the scheduler activation’s thread when it blocks in the kernel or it is preempted. Keeps track of which thread is running on every scheduler Univ. of Tehran Distributed Operating Systems 65 When a new Program is started… Kernel creates a scheduler activation Assigns to it a processor Does an upcall to the user-space application (at a fixed entry point). The user-level system Receives the upcall and uses the scheduler activation as the context to initialize itself and start running the first (main) thread! The first thread may ask for the creation of more threads and additional processors. For each processor the kernel will create a new activation and upcall the user-level to say that the new processor is there The user-level picks a thread and executes it on the new processor. Univ. of Tehran Distributed Operating Systems 66 Notify the user level of an event… The kernel created a new scheduler activation Assigns to it a new processor and upcalls the user-space. As soon as the upcall happens An event can be processed Run a user-level thread Trap and block into the kernel KEY DIFFERENCE between Scheduler Activation and Kernel Threads If an activation’s user-level thread is stopped by the kernel, the thread is never directly resumed by the kernel! INSTEAD a new scheduler activation is done whose prime objective is to notify the user-space that the thread has been suspended. Then… the user-level thread system removes the state of the thread from the “old” activation. Tells the kernel that the “old” activation can be reused. Decides which thread to run on the processor INVARIANT: number of activations = number of processors allocated to a job. Univ. of Tehran Distributed Operating Systems 67 Kernel vectors to user-Space as Activations Add-this-processor(processor #) /* Execute a runnable user-level thread */ Processor-has-been-preempted(preempted activation # and its machine state) /* Return to the ready list the user-level thread that was executing in the context of the preempted scheduler activation */ Scheduler-activation-has-blocked(blocked activation #) /* the blocked activation no longer uses its processor */ Scheduler-activation-has-been-unlocked(unblocked activation # and its machine state) /*Return to the ready list the user-level thread that was executing in context of the blocked scheduler activation */ Univ. of Tehran Distributed Operating Systems 68 I/O happens for Thread (1)….. T1 User Program (2) User-Level Runtime System (3) (1) (A) (4) (B) Add Processor Add Processor Operating System Kernel Processors Univ. of Tehran Distributed Operating Systems 69 A’s Thread has blocked on an I/O request T2 User Program (2) User-Level Runtime System (3) (1) B (A) Operating System Kernel (4) (B) (C) A’s thread has blocked Processors Univ. of Tehran Distributed Operating Systems 70 A’s Thread I/O completed T3 User Program (2) (3) User-Level Runtime System (1) (1) (A) (B) (C) (4) (D) Operating System Kernel Processors Univ. of Tehran Distributed Operating Systems 71 A’s Thread resumes on Scheduler Activation D T4 User Program (2) (3) (1) User-Level Runtime System (C) (1) (4) (D) Operating System Kernel Processors Univ. of Tehran Distributed Operating Systems 72 User-Level Events Notifying the Kernel Add-more-processors (additional # of processor needed) /* allocate more processors to this address space and start them running scheduler activations */ This-processor-is-idle() /* preempt this processor if another address space needs it */ The kernel’s processor allocator can favor address spaces that use fewer processors (and penalize those that use more). Univ. of Tehran Distributed Operating Systems 73 Thread Operation Latencies (msec.) Univ. of Tehran Distributed Operating Systems 74 Speedup Univ. of Tehran Distributed Operating Systems 75 Next Lecture Concurrency References Fast Mutual Exclusion for Uniprocessors On Optimistic Methods for Concurrency Control Univ. of Tehran Distributed Operating Systems 76