Control-Based Load Shedding in Data Stream Management

Yicheng Tu†, Song Liu‡, Sunil Prabhakar†, Bin Yao‡
†Indiana Center of Database Systems, Department of Computer Sciences, 305 N. University Street, West Lafayette, IN 47907
‡School of Mechanical Engineering, 140 S. Intramural Drive, West Lafayette, IN 47907

Introduction

Data Stream Management Systems (DSMSs) process a large number of data streams to answer user-specified queries. These systems are generally built on a query-passive, data-active model: all data are pushed to the database server for processing, and query results are sent to the users continuously. Processing delay is critical in a DSMS because query results generated from old data are useless to users. When the system is overloaded, some data tuples have to be discarded without processing in order to achieve the desired processing delay. This is called load shedding.

Key Questions:
• When?
• How much?
• Where?
We focus on the first two questions.

Figure 1. Push-based DSMS system model (labels: Data, DSMS, Query Results, User).

Objective

To design and implement a load shedding framework that
• minimizes data loss;
• maintains the target processing delay despite disturbances:
  - bursty data arrivals;
  - internal dynamics of the DSMS;
• is robust, i.e., works for a wide range of input streams.

Figure 2. Examples of disturbances in data processing in a DSMS. Top: bursty arrival rates; Bottom: unit processing costs.

Our approach

- View load shedding as a feedback control problem
- Develop a dynamic model for a specific DSMS
- Design the controller via rigorous control-theoretical methods
- Work on a real DSMS: the open-source Borealis system

Figure 3. The feedback control loop for load shedding. Output (y): average tuple delay; Input (u): tuple injection rate to the DSMS; target delay value (yr) and control error (e).

Results

- Obtained a first-order linear model for Borealis
- Pole-placement-based design yielded a PD controller:
  [controller equation not preserved in the source]
  where c and H are system-specific constants and T is the control period.
- Identified and solved several DSMS-specific problems
- Control framework evaluated with real and synthetic data

Figure 4. Performance of our load shedding solution (CTRL); AURORA, an open-loop solution that represents the state of the art in DSMS load shedding; and BASELINE, a naïve feedback-based solution.

Figure 5. Relative performance of CTRL to AURORA and BASELINE. A, B, C: various aspects of delay violations; D: percentage of data discarded.

Figure 6. Robustness of CTRL and AURORA tested with input streams of different burstiness (a smaller bias factor represents a more bursty stream).

Conclusions

1. First database work that uses feedback-control-theoretical methods;
2. Rigorous system modeling and controller design generate a PD controller that controls average tuple delays by adjusting the amount of load shedding;
3. Control framework implemented and evaluated in a real DSMS. Experiments show that the feedback-control-based method significantly improves control of delays, with the same amount of data loss, compared to current solutions.
4. The above solution is also robust.

Acknowledgements

This is joint work with my advisor, Prof. Sunil Prabhakar, and with Dr. Song Liu and Prof. Bin Yao of the School of Mechanical Engineering at Purdue University. The author would also like to thank Ms. Nesime Tatbul and Prof. Stan Zdonik, both from the Computer Science department of Brown University, for providing the Aurora/Borealis source code.
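
As an illustration of the feedback loop described in Figure 3, the following is a minimal Python sketch of a PD-style load shedder driving a toy queue model. It is not the poster's controller or the Borealis implementation: the queue model, the gains `kp` and `kd`, the arrival pattern, and the function name `simulate` are all hypothetical. The sketch also assumes that c is a per-tuple processing cost and H the fraction of CPU available, so that H/c is the service rate; the poster only states that c and H are system-specific constants.

```python
import random


def simulate(target_delay=2.0, periods=200, T=1.0, c=0.005, H=0.97,
             kp=0.5, kd=0.2, seed=0):
    """Toy closed-loop load shedder; returns (delays, shed_fraction).

    Assumed plant: a single queue drained at service_rate = H / c tuples/s.
    Assumed controller: a PD law on the delay error (gains are illustrative).
    """
    rng = random.Random(seed)
    service_rate = H / c          # tuples per second the engine can process
    queue = 0.0                   # tuples currently waiting
    prev_error = 0.0
    delays = []
    arrived = 0.0
    shed = 0.0

    for _ in range(periods):
        # Output y: estimated delay of a tuple entering the queue now.
        y = queue / service_rate
        e = target_delay - y      # control error e = yr - y

        # PD law: proportional plus difference term, scaled from seconds
        # of delay to a rate adjustment in tuples per second.
        v = (service_rate / T) * (kp * e + kd * (e - prev_error))
        prev_error = e

        # Input u: tuple injection rate allowed into the DSMS.
        u = max(0.0, service_rate + v)

        # Disturbance: bursty arrivals at 0.5x, 1x, or 3x the service rate.
        rate = service_rate * rng.choice([0.5, 1.0, 3.0])

        # Load shedding: anything above u is dropped before processing.
        admitted = min(rate, u)
        arrived += rate * T
        shed += (rate - admitted) * T

        # Queue evolves over one control period of length T.
        queue = max(0.0, queue + (admitted - service_rate) * T)
        delays.append(y)

    return delays, shed / arrived


if __name__ == "__main__":
    delays, shed_fraction = simulate()
    print(f"max delay: {max(delays):.2f} s, shed: {shed_fraction:.1%}")
```

In this sketch the controller decides when and how much to shed (the first two key questions) by comparing the arrival rate with the allowed injection rate u each period; which tuples to drop is left out, as it is on the poster.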