Download Application Performance in the QLinux Multimedia Operating System

Application Performance in the QLinux Multimedia Operating System Jun Wang Overview      H-SFQ CPU/Packet Scheduler Cello Disk Scheduler Lazy Receiver Processing (LRP) Related Works Open Discussions H-SFQ CPU/Packet Scheduler Requirements for New CPU Scheduler Example 1 Consider a MPEG decoder and some conventional processes running together. The CPU scheduler is the standard time sharing scheduler. The MPEG decoder has a soft real-time demand which is 25 frames per seconds. When the number of conventional processes increases, what is the problem? Complaint: The video refreshes too slow! Requirements for New CPU Scheduler Example 2 Consider some conventional processes and some hard real-time processes such as control system of aircraft running together in a real-time OS. The CPU scheduler is Earliest Deadline First (EDF). When the number of hard real-time processes increases, what is the problem? Complaint: I can’t input anything into my editor! Requirements for New CPU Scheduler Problem Analysis 1. A conventional CPU scheduler can not adapt multiple classes of applications simultaneously. 2. A conventional CPU scheduler can not isolate different classes of applications so that increasing the load of one class of applications does not influence other classes of applications. H-SFQ CPU Scheduler —— A Sample Hierarchy H-SFQ CPU Scheduler —— Features  Multiple level scheduling structure ♠class-independent scheduler for non-leaf node & classspecific scheduler for leaf node  Support for multiple service classes ♠solving the simultaneous occurrence problem  Predictable resource allocation ♠solving the isolation problem  Proper accounting of resource usage ♠accounting of CPU usage by virtual time SFQ CPU Scheduler —— Algorithm Description(1) Example: Consider two threads A and B with weights 1 and 2,respectively,which become runnable at time t=0. Thread A will run 80ms to finish, and Thread B will run 110ms to finish. But both of them will block at some time. Let the scheduling quantum for each thread be 10ms. SFQ CPU Scheduler —— Algorithm Description(2) SFQ CPU Scheduler —— Properties    SFQ achieves fair allocation of CPU regardless of variation in available processing bandwidth SFQ does not require the length of the quantum to be known a priori SFQ provides bounds on maximum delay incurred and minimum throughput achieved by the threads H-SFQ CPU Scheduler —— implementation Weight Start-tag Finish-tag H-SFQ CPU Scheduler —— Evaluation(1) H-SFQ CPU Scheduler —— Evaluation(2) Open Discussions — Section One   How to convert different user requirements into weights in H-SFQ? HLS: another hierarchical framework for soft Real-time schedulers ♣Multiple kinds of schedulers in non-leaf nodes ♣Support both relative allocations and absolute allocations Cello Disk Scheduler Requirements for New Disk Scheduler Conventional Disk schedulers experience the same simultaneous occurrence problems and isolation problem as the conventional CPU schedulers Cello Disk Scheduler —— Architectural Principles Cello Disk Scheduler —— Features    Multiple level scheduling structure Class-independent scheduler combines weight and disk service time into its decision of scheduling. Class-specific scheduler steal the slack time of soft real-time requests for interactive requests as much as possible Cello Disk Scheduler —— Implementation Ideas(1) Example: There are 3 classes disk requests with equal weights in the system, one is soft real-time class with 20ms deadline for every request and 6ms service time, one is interactive class with 3 ms service time, the last is throughputintensive class. The following figure shows the contents of pending queues and scheduled queue at the beginning of a certain time interval which is 45ms. soft real-time r3 r2 r1 interactive i3 i2 i1 throughput-intensive t1 Scheduled Queue Cello Disk Scheduler —— Implementation Ideas(2) soft real-time r3 r2 interactive i3 i2 i1 throughput-intensive t1 soft real-time r3 interactive i3 i2 i1 throughput-intensive t1 Switching from one class to another class when one of three conditions is met: Scheduled Queue ᆞthe pending queue for a class is r1 empty ᆞthe sum of requests outweigh the class’s proportion ᆞthe sum of requests outweigh the current time interval, Scheduled Queue r2 r1 Different class-specific Cello Disk Scheduler schedulers have different insert Slack time of a strategies. Ideas(3) —— Implementation request i: equal to ᆞReal-time: EDF+SCAN Max(0,li-ei). Li: the latest time by softthe real-time which disk must r3 this begin servicing interactive request i2 i1time Ei: thei3earliest throughput-intensive at which the disk t1 may begin servicing a request soft real-time r3 interactive throughput-intensive t1 ᆞInteractive: Steal slack time ᆞThroughput-intensive: insert towards the tail Scheduled Queue i3 r2 i2 i1 r1 Unused disk bandwidth for interactive class can be assigned to other classes according to their weight. Scheduled Queue t1 r3 i3 r2 i2 i1 r1 Cello Disk Scheduler —— Evaluation(1) Cello Disk Scheduler —— Evaluation(2) Open Discussions — Section Two Overload of recomputing slack time after insertion of each request. Possible reductions?  Lazy Receiver Processing (LRP) Conventional Linux Network Architecture for Receiving Packets Problem Analysis     Eager receiver processing ♠starving applications problem Lack of effective load shedding ♠livestock problem Lack of traffic separation ♠applications isolation problem Inappropriate resource accounting ♠Account CPU time unfair LRP Architecture Features of LRP     Lazy receiver processing Dropping packets as soon as possible Different NI channels for different sockets accurate resource accounting Comparision of Two Architectures(1) Demultiplexing Income Packets Conventional Linux LRP Demultiplexing Position Sockets Queue NI Channel The initial Position of Dropping Packets Sockets Queue NI Channel What is the advantage of early demultiplexing packets in LRP? Traffic separation and effective load shedding Comparision of Two Architectures(2) Packets Processing Conventional Linux LRP Time when the packet is passed from NI when the application process call receive() TCP/IP Protocol Processing Postion Hard/Soft interrupt handler driven as kernel level invoked in receive() as user level Who charge for Protocol Processing Time The process when a interrupt comes The process who will consume the packet What is the advantage of delayed packet processing in LRP? Support stability and fairness under overload and correct CPU time accounting Open Discussions —Section Three  Accounting CPU time ♣Virtual time How to make up the stolen time for a process? ♣Lazy receiver processing Related Works QLinux ♠ Resource Reservation for different classes of applications by multiple level schedulers ♠ Reallocating dynamically unused resources ControlWare & QLinux A Problem with QLinux : how to map QoS directly to the appropriate weight? QLinux can only tell that you can get some performance guarantee with a certain weight. But if you want to get some certain QoS, what is the appropriate weight assigned to you? A possible solution: using some feedback mechanism like ControlWare to adjust the weights until finding the appropriate weight for a certain QoS. QuO & QLinux QuO solves how to transmit the QoS information between a client and an object based on the QoS parameters of underlying OS and network. So QLinux can serve very well for QuO. References [1] P. Goyal and X. Guo and H.M. Vin, A Hierarchical CPU Scheduler for Multimedia Operating Systems, Proceedings of 2nd Symposium on Operating System Design and Implementation (OSDI'96), Seattle, WA, pages 107-122, October 1996. [2] P. Goyal and H. M. Vin and H. Cheng, Start-time Fair Queuing: A Scheduling Algorithm for Integrated Services Packet Switching Networks, Proceedings of ACM SIGCOMM'96, pages 157-168, August 1996. [3] P Shenoy and H M. Vin, Cello: A Disk Scheduling Framework for Next Generation Operating Systems, Proceedings of ACM SIGMETRICS Conference, Madison, WI, pages 44-55, June 1998. [4] P. Druschel and G. Banga, Lazy Receiver Processing (LRP): A Network Subsystem Architecture for Server Systems, Proceedings of the 2nd Symposium on Operating System Design and Implementation (OSDI'96), Seattle, WA, Pages 261-275, October 1996 [5] Vijay Sundaram, Abhishek Chandra, Pawan Goyal and Prashant Shenoy, Application Performance in the QLinux Multimedia Operating System , Proceedings of the Eighth ACM Conference on Multimedia, Los Angeles, CA, pages 127-136, November 2000. November 2000. [6] John Regehr and John A. Stankovic, HLS:A Framework for Composing Soft RealTime Schedulers, Proceedings of the 22nd IEEE Real-Time Systems Symposium (RTSS 2001), pages 3-14, London, UK, December 3-6 2001. [7] John Regehr and John A. Stankovic, Augmented CPU Reservations: Towards Predictable Execution on General-Purpose Operating Systems, Proceedings of the Seventh IEEE Real-Time Technology and Applications Symposium (RTAS 2001), pages 141-148, Taipei, Taiwan, May 30-June 1 2001 [8] John Regehr and John A. Stankovic, Operating System Support for Multimedia:

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Application Performance in the QLinux Multimedia Operating System