Caching with “Good Enough” Currency, Consistency, and Completeness

Hongfei Guo, University of Wisconsin
Per-Åke Larson, Microsoft Research
Raghu Ramakrishnan, University of Wisconsin

Motivation: Scaling Google
  …

Motivation: Scaling a DBMS by Caching
  Problem: How to tell whether the cached data is “good enough” for an application?
  NO data quality requirements from the applications!
  NO data quality guarantees from the caching DBMS!
  [Diagram: Application Server (app-specific code), Caching DBMS, Backend DBMS; asynchronous updates between the backend and the caching DBMS]

Big Picture
  Apps: Specify data quality requirements in queries [SIGMOD 2004] [SIGMOD 2004 Demo]
  Fine-grained data quality-aware database caching model
    Cache admin: Specifies local data quality
    Cache: Keeps track of local data quality [VLDB 2005]
    Query processing: Enforces data quality constraints [SIGMOD 2004] [VLDB 2005]
  System performance evaluation [ongoing work]
  [Diagram: Application Server, Caching DBMS, Backend DBMS]

Contributions
  Goal: fine-grained data quality-aware cache management
  Problems and a comprehensive solution:
    Problem: How does the cache track data quality?
    Solution: Cache properties
    Problem: How does the admin specify cache properties?
    Solution: Dynamic cache model
    Problem: How to maintain the cache efficiently?
    Solution: Efficient cache maintenance and “safety”
    Problem: How to enforce data quality constraints for queries?
    Solution: Efficiently enforce data quality checking

Review: Data Quality Metrics (informal)
  Currency: The time elapsed since the copy became stale
  Consistency: A query result is (snapshot) consistent iff it is as if evaluated from a snapshot of the master database
  C&C: Currency & Consistency

Review: Proposed SQL Syntax
  BookCopy
    bid  title      author
    1    databases  Raghu
    2    databases  Ullman

  ReviewCopy
    rid  bid  text
    1    1    …
    2    1    …
    3    2    …

  SELECT *
  FROM Books B, Reviews R
  WHERE B.bid = R.bid AND B.title = 'Databases'
  CURRENCY BOUND 10 min ON (B, R)

  Variants of the currency clause shown on the slide:
    CURRENCY BOUND 10 min ON (B, R)                 (currency bound; B and R form one consistency class)
    CURRENCY BOUND 10 min ON (B), 30 min ON (R)     (separate consistency classes)
    CURRENCY BOUND 10 min ON (B, R) BY B.bid        (group by)

  Query result
    bid  title      author  bid  rid  text
    1    databases  Raghu   1    1    …
    1    databases  Raghu   1    2    …
    2    databases  Ullman  2    3    …

Roadmap: Background; Cache data quality properties; Cache property specification; Enforcing data quality constraints; Future directions and conclusions

Why Define Cache Properties?
  Cache properties = contract
  [Diagram labels: Queries with relaxed C&C requirements, Query processing, Results, Cache properties, Cache maintenance]

Cache Properties (P+3C)
  Presence: per object
  Consistency: a set of objects
  Completeness: per predicate
  Currency: object staleness
  Together they describe the status of the local data.

Presence
  Example: SELECT * FROM Authors A WHERE authorId = 1
  Question: Is an object present in the cache?

Consistency and Currency
  Example: SELECT * FROM Authors A WHERE authorId IN (1, 2, 3) CURRENCY BOUND 10 min ON (A)
  Question: Is a set of objects consistent and no more than 10 minutes old?

Completeness
  Example: SELECT * FROM Authors A WHERE city = 'Madison'
  Question: Are ALL authors from Madison in the cache?
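
The four questions above (present? consistent? current enough? complete?) can be illustrated with a small Python sketch. This is only a toy model under stated assumptions, not the mechanism used in the work presented: the names `CachedAuthor`, `ToyCache`, `snapshot_id`, `stale_point`, and `good_enough` are hypothetical, consistency is simplified to "all copies reflect the same master snapshot", and currency is taken as now minus the time the copy became stale.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set


@dataclass
class CachedAuthor:
    author_id: int
    city: str
    snapshot_id: int              # hypothetical: master snapshot this copy reflects
    stale_point: Optional[float]  # hypothetical: time the copy became stale, None if still current


@dataclass
class ToyCache:
    rows: Dict[int, CachedAuthor] = field(default_factory=dict)
    complete_cities: Set[str] = field(default_factory=set)  # predicates known to be complete

    def present(self, author_id: int) -> bool:
        """Presence: is the object in the cache?"""
        return author_id in self.rows

    def currency(self, author_id: int, now: float) -> float:
        """Currency: seconds elapsed since the copy became stale (0 if not stale)."""
        stale_point = self.rows[author_id].stale_point
        return 0.0 if stale_point is None else now - stale_point

    def consistent(self, author_ids: Iterable[int]) -> bool:
        """Consistency (simplified): all copies reflect the same master snapshot."""
        return len({self.rows[a].snapshot_id for a in author_ids}) <= 1

    def complete(self, city: str) -> bool:
        """Completeness: are ALL authors from this city cached?"""
        return city in self.complete_cities


def good_enough(cache: ToyCache, author_ids: Iterable[int],
                bound_minutes: float, now: float) -> bool:
    """Present, mutually consistent, and no staler than the currency bound."""
    ids = list(author_ids)
    if not all(cache.present(a) for a in ids):
        return False
    if not cache.consistent(ids):
        return False
    return all(cache.currency(a, now) <= bound_minutes * 60 for a in ids)


# Example data mirroring the Authors examples above (timestamps are made up).
now = 1_000_000.0
cache = ToyCache(
    rows={
        1: CachedAuthor(1, "Madison", snapshot_id=7, stale_point=now - 300),
        2: CachedAuthor(2, "Madison", snapshot_id=7, stale_point=None),
        3: CachedAuthor(3, "Seattle", snapshot_id=7, stale_point=None),
    },
    complete_cities={"Madison"},
)
```

With this data, `good_enough(cache, [1, 2, 3], 10, now)` holds because author 1 has been stale for five minutes, while a two-minute bound would send the query elsewhere.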

Basic Concepts
  [Diagram labels: Tables, Object, Snapshots, Master Database, H1, Cache, H2, View 1, View 2, View 3]

Cache Property Examples
  Currency = now - stale point
  [Diagram labels: Consistent, Complete, Present, Stale point, Master Database, H1, Cache, H2, View 1, View 2, View 3]

Roadmap: Background; Cache data quality properties; Cache property specification; Enforcing data quality constraints; Future directions and conclusions

Specifying Cache Properties
  Specified as integrity constraints
  Single view:
    Presence constraint
    Consistency constraint
    Completeness constraint
  Between two views:
    Presence correlation constraint
    Consistency correlation constraint

Presence Constraint
  CREATE VIEW AuthorCopy AS SELECT * FROM Authors
  CREATE TABLE AuthorList_PCT (authorId int)
  ALTER VIEW AuthorCopy ADD PRESENCE ON authorId IN (SELECT authorId FROM AuthorList_PCT)

  Annotations: partially materialized view [Zhou et al 2005]; control table (AuthorList_PCT); control key (authorId)

  AuthorCopy (caching DBMS; the Authors table lives in the backend DBMS)
    authorId  name    city
    1         Alice   Madison
    2         Bob     Madison
    3         Cedric  Seattle

  AuthorList_PCT
    authorId
    1
    2
    3

Consistency Constraint
  CREATE TABLE CityList_CsCT (city string)
  ALTER VIEW AuthorCopy ADD CONSISTENCY ON city IN (SELECT city FROM CityList_CsCT)

  Annotation: cache region

  AuthorCopy: as above (authors 1, 2, 3)
  AuthorList_PCT: as above (1, 2, 3)

  CityList_CsCT
    city
    Madison

Completeness Constraint
  CREATE TABLE CityList_CpCT (city string)
  ALTER VIEW AuthorCopy ADD COMPLETENESS ON city IN (SELECT city FROM CityList_CpCT)

  AuthorCopy: as above (authors 1, 2, 3)

  CityList_CpCT
    city
    Madison
    New York

  AuthorList_PCT
    authorId
    1
    3

Presence Correlation Constraint
  ALTER VIEW BookCopy ADD PRESENCE ON authorId IN (SELECT authorId FROM AuthorCopy)

  Dependency chain: AuthorList_PCT -(authorId)-> AuthorCopy -(authorId)-> BookCopy

  AuthorList_PCT: 1, 2, 3
  AuthorCopy: as above (authors 1, 2, 3)

  BookCopy
    isbn  authorId  title
    111   1         aaa
    222   1         bbb
    333   2         ccc
    444   3         ddd
    555   3         eee

Consistency Correlation Constraint
  ALTER VIEW BookCopy ADD CONSISTENCY ROOT

  Dependency chain: AuthorList_PCT -(authorId)-> AuthorCopy -(authorId)-> BookCopy
  Tables: AuthorList_PCT, AuthorCopy, and BookCopy as on the previous slide

Cache Schema Example
  [Diagram: control tables AuthorList_PCT, ReviewerList_PCT, CityList_CsCT; views AuthorCopy, ReviewerCopy, BookCopy, ReviewCopy; edge labels authorId, reviewerId, authorId, isbn, reviewId]

Roadmap: Background; Cache data quality properties; Cache property specification; Enforcing data quality constraints; Future directions and conclusions

Extension to the Optimizer
  Compile-time consistency checking
  Run-time currency and inexpensive consistency checking
  Cost estimation

Run-time C&C Checking
  ChoosePlan
    C&C guard passes: local plan using V
    C&C guard fails: remote plan requesting E
  Currency guard: Checks whether local view V satisfies the currency requirement
  Consistency guard: Checks whether local view V satisfies the consistency requirement

Future Directions
  Comprehensive performance evaluation
    Cache configurations?
    Comparison with other replication solutions?
  Automate cache design/tuning
    How to get a good cache schema? (i.e., cache region granularity, assignment)
  Improve current prototype
    Read-write transactions?
  Adaptive data quality-aware caching policies
    Control-table content?
    Refresh intervals?

Summary
  Goal: fine-grained data quality-aware cache management
  A comprehensive solution:
    Four cache properties
    Dynamic cache model
    Efficient cache maintenance and “safety”
    Efficiently enforce C&C checking
  Questions?

So long, and thanks for all the fish!

Simple Consistency Guards Overhead
  [Bar chart: execution time (ms), split into query and consistency guard, for queries Qa, Qb, Qc run locally and remotely. Guard overhead labels in extraction order: 1.6%, 1.72%, 1.66%, 1.59%, 16.56%, 14.00%]

Single Table Consistency Guard Overhead
  [Bar chart: execution time (ms), split into query and consistency guard, for guards A11a, A11b, A12, S11, S12 run locally and remotely (Qa is used). Guard overhead labels in extraction order: 6.06%, 4.95%, 2.33%, 7.48%, 8.79%, 62.85%, 58.32%, 23.77%, 71.41%, 16.98%]
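
The guard overheads reported above come from the run-time C&C check, in which a ChoosePlan operator picks the local plan when the guard passes and the remote plan otherwise. The following Python sketch shows that control flow only. It is a hedged illustration, not the prototype's implementation: `make_currency_guard`, `choose_plan`, and the `last_refresh` timestamp are hypothetical names, and the guard assumes the staleness of view V can be bounded by the time since its last refresh.

```python
from typing import Callable, List, Tuple

Rows = List[Tuple[int, str]]


def make_currency_guard(last_refresh: float, bound_seconds: float,
                        clock: Callable[[], float]) -> Callable[[], bool]:
    """Build a guard that passes while the view is within the currency bound.

    Assumption: the view's staleness is at most the time since `last_refresh`.
    """
    def guard() -> bool:
        return clock() - last_refresh <= bound_seconds
    return guard


def choose_plan(guard: Callable[[], bool],
                local_plan: Callable[[], Rows],
                remote_plan: Callable[[], Rows]) -> Tuple[str, Rows]:
    """Evaluate the guard at run time and execute exactly one of the two plans."""
    if guard():
        return "local", local_plan()
    return "remote", remote_plan()


# Example with a fixed clock so the outcome is deterministic.
fixed_clock = lambda: 1000.0
local_rows: Rows = [(1, "Alice")]
remote_rows: Rows = [(1, "Alice"), (2, "Bob")]

fresh_guard = make_currency_guard(last_refresh=700.0, bound_seconds=600.0, clock=fixed_clock)
stale_guard = make_currency_guard(last_refresh=100.0, bound_seconds=600.0, clock=fixed_clock)

fresh_choice = choose_plan(fresh_guard, lambda: local_rows, lambda: remote_rows)
stale_choice = choose_plan(stale_guard, lambda: local_rows, lambda: remote_rows)
```

Because the guard is evaluated when the query runs, the same compiled plan can serve both cases: the cost of the check is paid on every execution, which is the overhead the charts measure.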