Research Projects in DSRG Lab
Elke A. Rundensteiner
Database Systems Research Group
Email: [email protected]
Office: Fuller 238
Phone: Ext. 5815
WebPages: http://www.cs.wpi.edu/~rundenst
          http://davis.wpi.edu/dsrg

Project Topics in a Nutshell:

Distributed Data Sources:
• EVE : Data Warehousing over Distributed Data
• TOTAL-ETL : Distributed Extract Transform Load
[NSF’96,NSF02,NSF05?]

XML/Web Data Systems:
• RAINBOW : XML to Relational Databases
• MASS : Native XQuery Processing System
[Verizon,IBM,NSF05, NSF05?]

Databases & Visualization:
• Scalable Visual High-Dim. Data Exploration
• Data and Visual Quality Support in XMDV
[NSF’97,NSF01,NSF05]

Stream Monitoring System:
• Scalable Query Engine for Data Streams
• Fire Prediction and Monitoring Appl.
[NSF05a?, NSF05b?]

CAPE : Engine for Querying and Monitoring Streaming Data

Example of Stream Data Applications:
• Market Analysis – Streams of Stock Exchange Data – get rich
• Critical Care – Streams of Vital Sign Measurements – save lives
• Physical Plant Monitoring – Streams of Environmental Readings – protect env

Databases Upside Down
[Figure: static data answering one-time queries, contrasted with standing queries answered over streams of data]

Stream Query Processing
[Figure: users register continuous queries with a Distributed Stream Query Engine and receive answers; streaming data flows in, streaming results flow out]
• High workload of queries
• Real-time and accurate responses required
• Streaming data may have time-varying rates and high volumes
• Memory and CPU resource limitations
• Available resources for executing each operator may vary over time
• Run-time distribution and adaptations required

Good news … for a research student
We can lean on the oldie and goodie,
Yet so many new and unsolved problems at our fingertips due to new light!
Interesting (yet doable) research challenges
Even possibilities for start-up (if you are so inclined)

Research Contributions
• Scalable Query Operators (Punctuations): Adapt and select among tasks such as memory purging, stream reading, memory-to-disk shuffling, punctuation propagation, index selection, etc.
• Synchronized Plan Spilling: Operators selectively spill data to disk to offset system overload, with adaptive re-load to improve performance.
• Adaptive Operator Scheduling: A selector scores alternative scheduling algorithms based on their effect on QoS requirements, and selects a candidate.
• On-line Query Plan Migration: On-line plan restructuring and then on-line migration to the new plan, even for stateful operators.
• Distributed Plan Execution: Adaptively distribute computations across multiple machines to optimize QoS requirements without information loss.

We got it all . . . and more
• If you like theory: algorithms for NP-complete optimization, graph theory
• If you like systems: distributed allocation, scheduling, and parallelism of query execution
• If you like networking: quality-of-query, load-shedding, grid-computing
• If you like AI: learning of scheduling selection, run-time adaptation
• If you like software engineering: huge query engine code base, we really need you

So where is the database in this stuff?
One answer : Who cares? If it’s fun, it’s database stuff
Second answer : Development of a new generation of “data query engine”

A driving application: FIRE
Sensors in Rooms
Engineering Data for Fire Science

Futuristic Monitoring Queries
• Track a smoke cloud (moving cluster) in terms of its speed and severity
• Find the scope and direction of fire spread
• Match given sensor readings of a fire with a fire stream simulation to determine similarity
• Is this a prank (outlier), or are we dealing with an actual fire?
• What path should people take to leave this building?
• Are any sensor readings faulty, so that they should be ignored?
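To make the outlier and faulty-sensor questions above concrete, here is a minimal sketch of a standing query over a stream of sensor readings. It is an illustration only and not CAPE or FireEngine code: the function name `flag_outliers`, the `(sensor_id, value)` tuple layout, and the `window` and `threshold` parameters are all assumptions introduced for this example. It flags a reading when it deviates strongly from the median of that sensor's most recent readings.

```python
from collections import deque
from statistics import median


def flag_outliers(readings, window=5, threshold=15.0):
    """Hypothetical standing query over (sensor_id, value) pairs.

    Yields (sensor_id, value, is_outlier) for every incoming reading.
    A reading counts as an outlier when the sensor already has a full
    window of history and the value differs from the median of that
    window by more than `threshold`.
    """
    history = {}  # sensor_id -> bounded buffer of recent values
    for sensor_id, value in readings:
        recent = history.setdefault(sensor_id, deque(maxlen=window))
        is_outlier = (
            len(recent) == window
            and abs(value - median(recent)) > threshold
        )
        # Bounded state: the deque drops its oldest value automatically,
        # so memory per sensor stays constant however long the stream runs.
        recent.append(value)
        yield sensor_id, value, is_outlier


if __name__ == "__main__":
    # Made-up temperature readings for one sensor, for illustration only.
    stream = [("s1", 20.0), ("s1", 21.0), ("s1", 20.5), ("s1", 80.0), ("s1", 21.0)]
    for row in flag_outliers(stream, window=3, threshold=15.0):
        print(row)
```

The query is written as a generator so that answers are produced as readings arrive instead of after the data has been stored. In this sketch a single isolated spike is flagged, while a sustained rise eventually shifts the median and stops being flagged, which is one simple (assumed) way to separate a prank or faulty reading from an actual fire.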
FireEngine : Fire Stream Processing

If questions, email me: [email protected]
Better, drop by DSRG Labs : Fuller 319 & 318
My office : Fuller 238