“Good Enough” Database Caching
Hongfei Guo
University of Wisconsin-Madison

Motivation: Scaling Google
  …

Motivation: Scaling A DBMS By Caching
  How to tell whether the cached data is “good enough” for an application?
  NO data quality requirements from the applications!
  NO data quality guarantees from the caching DBMS!
  [Figure labels: Application Server, App code, specific, …, Caching DBMS, Asynchronous Updates, Backend DBMS]

The Thesis
  Apps: Specifies data quality requirements in queries
  Cache: Enforces data quality constraint [SIGMOD 2004] [SIGMOD 2004 Demo]
  Cache admin: Specify local data quality to be maintained by cache
    (Data quality-centric database caching model) [TR 2005] [VLDB 2005]
  System performance evaluation
  [Figure labels: Application Server, Caching DBMS, Backend DBMS]

Data Quality Metrics (informal)
  Currency: The elapsed time since this copy becomes stale
  Consistency: A query result is (snapshot) consistent iff it is as if evaluated from a snapshot of the master database
  C&C: Currency & Consistency

Roadmap
  Background
  Specifying data quality constraints in SQL
  Data quality-centric caching model
  Enforcing data quality constraints
  System performance evaluation
  Other research
  Conclusions and future directions

Specifying Data Quality Constraints in SQL
[Guo, Larson, Ramakrishnan and Goldstein, SIGMOD 2004]
  Currency requirements
  Consistency requirements
  Extend SQL to specify relaxed C&C requirements
  Formal semantics of C&C constraints

Currency Requirements
  Example 1: The caching database keeps BookCopy
  Customer A is about to purchase: he wants the data to be exactly current
    (High data quality is preferred)
  Customer B is browsing: it is ok if the data is no more than 3 days out of sync
    (Quick response time is preferred)
  Different apps may have different currency requirements for the same query

Consistency Requirements
  Example 2:
    SELECT *
    FROM Books B, Reviews R
    WHERE B.bid = R.bid AND B.title = “Databases”

  BookCopy
    bid  title      author
    1    databases  Raghu
    2    databases  Ullman

  ReviewCopy
    rid  bid  text
    1    1    …
    2    1    …
    3    2    …

  Query result
    bid  title      author  bid  rid  text
    1    databases  Raghu   1    1    …
    1    databases  Raghu   1    2    …
    2    databases  Ullman  2    3    …

  Possible requirements:
    The whole query result be consistent
    Books be consistent & Reviews be consistent
    Each book be consistent with its reviews
  Different apps may have different consistency requirements for the same query

Proposed SQL Syntax
    SELECT *
    FROM Books B, Reviews R
    WHERE B.bid = R.bid AND B.title = “Databases”
    CURRENCY BOUND 10 min ON (B, R)
  Variants of the currency clause shown on the slide:
    CURRENCY BOUND 10 min ON (B, R)
    CURRENCY BOUND 10 min ON (B), 30 min ON (R)
    CURRENCY BOUND 10 min ON (B, R) BY B.bid
  Callouts: Currency bound, Consistency class, Group by
  (BookCopy, ReviewCopy, and the query result are the same tables as in Example 2.)

Specifying Data Quality Constraints in SQL: Contributions
  Extend SQL to express C&C constraints
    Single-block queries
    Multi-block (i.e., nested) queries
    Timeline constraint
  Formal semantics of C&C constraints
  Callout: Provides correctness standard for using cached or replicated data

Data Quality-Centric Caching Model
[Guo, Larson and Ramakrishnan, submitted]
  Cache data quality properties
  Cache property specification
  Maintenance and “safety”

Why Define Cache Properties?
  Query processing
  Cache Properties (= contract)
  Cache maintenance

Cache Properties (P+3C)
  Presence: per object
  Consistency: a set of objects
  Completeness: per predicate
  Currency: object staleness

Basic Concepts
  [Figure labels: Tables, Object, View 1, View 2, View 3, Master Database, H1, H2, Snapshots, Cache]

Cache Property Examples
  Currency = now - stale point
  Consistent
  Complete
  Present
  [Figure labels: View 1, View 2, View 3, Master Database, H1, H2, Stale point, Cache]

Specifying Cache Properties
  Specified as integrity constraints
    Presence constraint
    Consistency constraint
    Completeness constraint
    Presence correlation constraint
    Consistency correlation constraint

Presence Constraint
    CREATE VIEW AuthorCopy AS
      SELECT * FROM Authors
    CREATE TABLE AuthorList_PCT (authorId int)
    ALTER VIEW AuthorCopy ADD PRESENCE ON authorId IN
      (SELECT authorId FROM AuthorList_PCT)
  Callouts: Partially materialized view [Zhou et al. 2005], control key, control table

  AuthorCopy
    authorId  name    city
    1         Alice   Madison
    2         Bob     Madison
    3         Cedric  Seattle

  AuthorList_PCT
    authorId
    1
    2
    3
  [Figure labels: Backend DBMS, Caching DBMS]

Consistency Constraint
    CREATE TABLE CityList_CsCT (city string)
    ALTER VIEW AuthorCopy ADD CONSISTENCY ON city IN
      (SELECT city FROM CityList_CsCT)
  Callout: Cache Region

  AuthorCopy
    authorId  name    city
    1         Alice   Madison
    2         Bob     Madison
    3         Cedric  Seattle

  CityList_CsCT
    city
    Madison

  AuthorList_PCT
    authorId
    1
    2
    3

Completeness Constraint
    CREATE TABLE CityList_CpCT (city string)
    ALTER VIEW AuthorCopy ADD COMPLETENESS ON city IN
      (SELECT city FROM CityList_CpCT)

  AuthorCopy
    authorId  name    city
    1         Alice   Madison
    2         Bob     Madison
    3         Cedric  Seattle

  CityList_CpCT
    city
    Madison

  AuthorList_PCT
    authorId
    1
    3

Presence Correlation Constraint
    ALTER VIEW BookCopy ADD PRESENCE ON authorId IN
      (SELECT authorId FROM AuthorCopy)
  Cache graph: AuthorList_PCT --authorId--> AuthorCopy --authorId--> BookCopy

  AuthorList_PCT
    authorId
    1
    2
    3

  AuthorCopy
    authorId  name    city
    1         Alice   Madison
    2         Bob     Madison
    3         Cedric  Seattle

  BookCopy
    isbn  authorId  title
    111   1         aaa
    222   1         bbb
    333   2         ccc
    444   3         ddd
    555   3         eee

Consistency Correlation Constraint
    ALTER VIEW BookCopy ADD CONSISTENCY ROOT
  Cache graph: AuthorList_PCT --authorId--> AuthorCopy --authorId--> BookCopy
  (AuthorList_PCT, AuthorCopy, and BookCopy are the same tables as on the previous slide.)

Cache Schema Example
  [Figure: cache graph over AuthorList_PCT, ReviewerList_PCT, AuthorCopy, ReviewerCopy, BookCopy, and ReviewCopy; edge labels authorId, reviewerId, authorId, isbn, reviewId]

Pull-Maintenance
  Refresh a region by pulling query results
  When refreshing a region, also refresh the affected closure
    All overlapping regions
    All correlated regions

Pull-Maintenance (overlapping regions)
  AuthorList_PCT
    authorId
    1
    3
    4

  TitleList_CsCT
    title
    aaa

  BookCopy
    isbn  authorId  title
    111   1         aaa
    222   1         bbb
    333   1         ccc
    444   3         aaa
    555   4         eee

Pull-Maintenance (correlated regions)
  Cache graph: AuthorList_PCT --authorId--> AuthorCopy --authorId--> BookCopy

  AuthorCopy
    authorId  name    city
    1         Alice   Madison
    3         Cedric  Seattle

  BookCopy
    isbn  authorId  title
    111   1         aaa
    222   1         bbb
    333   1         ccc
    444   3         aaa
    555   3         eee

Inefficient Pulling
  Callout: Shared-row problem

  AuthorCopy
    authorId  name    city
    1         Alice   Madison
    3         Cedric  Seattle

  AuthorBookCopy
    authorId  isbn
    1         111
    1         222
    1         333
    3         111
    3         555

  BookCopy
    isbn  price  title
    111   10     aaa
    222   20     bbb
    333   30     ccc
    555   50     eee

Issues
  Inefficient pulling:
    Calculation of the affected closure requires checking the rows
  Efficient pulling:
    The affected closure does NOT depend on the instance of a view
    Only requires forward pull among correlated views

Theoretical Results
  Definition (Safe partially materialized views): A partially materialized view V is safe if the following two conditions hold for every instance of the cache that satisfies all integrity constraints:
    1. For any pair of regions in V, either they don’t overlap or one is contained in the other.
    2. If V is gray, let X denote the set of regions in V defined by presence control-key values. X is a partitioning of V and no pair of regions in X is contained in any one region defined on V.
  Callout: Property held for every instance

  Cache schema design rules:
    Rule 1: A cache graph is a DAG.
    Rule 2: Only red nodes can have independent completeness or consistency control-tables.
    Rule 3: Every PMV with more than one parent must be a red circle.
    Rule 4: If a PMV has the shared row problem according to Lemma 5.2, then it cannot be gray.
    Rule 5: A PMV cannot have noncompatible control-tables.
  Callout: Syntactically checkable conditions (polynomial)

  Theorem: Given a cache schema <W, E>, if it satisfies the design rules, then every PMV in W is safe. Conversely, if the schema violates one of these rules, there is an instance of the cache satisfying all specified integrity constraints in which some PMV is unsafe.
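
The design rules above are described as syntactically checkable. As a minimal sketch of what such a check could look like, the Python below tests Rule 1 (the cache graph is a DAG) and Rule 3 (a PMV with more than one parent must be red). The graph encoding, the function names, the example edges, and the node colors are all illustrative assumptions and are not taken from the talk; Rules 2, 4, and 5 are not covered.

```python
from collections import deque

def is_dag(nodes, edges):
    """Rule 1 sketch: Kahn's algorithm; True iff the directed graph has no cycle."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Every node is removed exactly once iff there is no cycle.
    return visited == len(nodes)

def rule3_violations(pmvs, edges, color):
    """Rule 3 sketch: PMVs with more than one parent that are not red.

    Assumption: every incoming edge of the cache graph counts as a parent.
    """
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    return sorted(v for v in pmvs
                  if len(parents.get(v, ())) > 1 and color[v] != "red")

# Hypothetical schema loosely modeled on the Cache Schema Example slide.
nodes = ["AuthorList_PCT", "AuthorCopy", "BookCopy",
         "ReviewerList_PCT", "ReviewerCopy", "ReviewCopy"]
edges = [("AuthorList_PCT", "AuthorCopy"), ("AuthorCopy", "BookCopy"),
         ("BookCopy", "ReviewCopy"),
         ("ReviewerList_PCT", "ReviewerCopy"), ("ReviewerCopy", "ReviewCopy")]
pmvs = ["AuthorCopy", "BookCopy", "ReviewerCopy", "ReviewCopy"]
# Colors are made up for illustration; the talk does not list them.
color = {"AuthorCopy": "gray", "BookCopy": "gray",
         "ReviewerCopy": "gray", "ReviewCopy": "red"}
```

With these assumed inputs, `is_dag(nodes, edges)` accepts the schema, and `rule3_violations(pmvs, edges, color)` would report `ReviewCopy` only if it were colored gray, since it is the one view with two parents.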

Data Quality-Centric Caching Model: Contributions
  Four cache properties
  Specifying cache properties
  Cache property unit: cache region
  Safe views and efficient pulling
  Callout: Provides an abstraction layer (contract) between query processing and cache maintenance

Enforcing Data Quality Constraints Overview
  Simple case: View-level consistency
    [Guo, Larson, Ramakrishnan and Goldstein, SIGMOD 2004]
    [Guo, Larson, Ramakrishnan and Goldstein, SIGMOD 2004 Demo]
    Implemented in MS SQL Server code base
  General case: Row-level consistency
    [Guo, Larson and Ramakrishnan, submitted]

Extension to MTCache Framework
  MTCache Framework [Larson et al. 2004]
  [Figure labels: Queries with Relaxed C&C Requirements, Shadow Databases, Query Optimizer, Cache Region Metadata, Local Materialized Views, Execution Engine, Heartbeat Tables, Caching DBMS, Backend DBMS, Results]
  Query optimizer callout: The best plan that:
    Satisfies consistency requirements
    Includes run-time currency checking

Simple Case Assumptions
  Fully materialized views
  Each view is consistent
  Push-based maintenance
    E.g., MS replication service

C&C Tracking Mechanism
  Consistency tracking: cache region (CR)
    The unit of update propagation
    Data mutually consistent all the time
    Properties, e.g., est. delay, est. interval
  Currency tracking: heartbeat table
  [Figure: views V1 to V5 at the backend and the cache, grouped into cache regions CR1 and CR2; heartbeat table with columns Cid and Timestamp]

Extension to the Optimizer
  Compile-time consistency checking
  Run-time currency checking
  Cost estimation

Consistency Checking
  Enforced at optimization time
  Immediately prune a sub-plan if it violates consistency constraints
  Example:
    Q1: σ(Books ⋈ Reviews) CURRENCY 5 ON (Books, Reviews)
    Candidate sub-plan: Merge join of (Local scan Reviews) and (Remote query on Books)

Run-time Currency Checking
  When view V matches expression E, E is replaced by a ChoosePlan operator with a currency guard over two branches:
    Local plan using V
    Remote plan requesting E
  Currency guard: Check if local view V satisfies currency requirement

Cost Estimation
  Cost for the SwitchUnion operator:
    C = p * Clocal + (1 - p) * Cremote + Ccg
  where
    p       : probability that the local branch will be used
    Clocal  : cost of execution of the local branch
    Cremote : cost of execution of the remote branch
    Ccg     : cost of currency checking

Estimating p
  Compute p from three parameters:
    f : estimated refresh interval
    d : estimated minimal delay
    B : currency bound

    p = 0            if B - d ≤ 0
    p = (B - d) / f  if 0 < B - d ≤ f
    p = 1            if B - d > f

Changing The Assumptions
  Fully materialized views  ->  Partially materialized views
  Consistent views          ->  Row-level consistency
  Push-based maintenance    ->  Pull-based maintenance
  Callouts:
    More general algorithms
    Run-time check for consistency constraints that cannot be validated at compile-time

Run-time C&C Checking
  When view V matches expression E, E is replaced by a ChoosePlan operator with a C&C guard over two branches:
    Local plan using V
    Remote plan requesting E
  Currency guard: Check if local view V satisfies currency requirement
  Consistency guard: Check if local view V satisfies consistency requirement

Performance Evaluation Goals
  Currency guards overhead
    Simple checks
  Consistency guards overhead
    A spectrum of checks ranging from simple to complicated

Experimental Setting
  Back-end hosts a TPCD database tpcd1gh with scale factor 1.0 (~1GB)
  Cache server has a shadow of tpcd1gh
  Two local views: custCopy, orderCopy
  LAN connection between cache and backend server

Queries Used
  Qa: key select
    SELECT * FROM Customers C
    WHERE c_custkey = 1
    CURRENCY 10 ON (C)
  Qb: join query
    SELECT * FROM Customers C, Orders O
    WHERE c_custkey = o_custkey AND c_custkey = 1
    CURRENCY 10 ON (C), 20 ON (O)
  Qc: nonkey select
    SELECT * FROM Customers C
    WHERE c_nationkey = 1
    CURRENCY 10 ON (C)

Currency Guards Overhead
  [Figure: bar chart of execution time (ms), split into query and currency guard, for Qa, Qb, and Qc under local and remote execution. Overhead labels on the chart: 0.41%, 3.66%, 15.26%, 21.3%, 3.59%, 4.31%]

Simple Consistency Guards Overhead
  [Figure: bar chart of execution time (ms), split into query and consistency guard, for Qa, Qb, and Qc under local and remote execution. Overhead labels on the chart: 1.6%, 1.72%, 1.66%, 1.59%, 16.56%, 14.00%]

Single Table Consistency Guard Overhead
  [Figure: bar chart of execution time (ms), split into query and consistency guard, for guards A11a, A11b, A12, S11, and S12 under local and remote execution (Qa is used). Overhead labels on the chart: 6.06%, 4.95%, 2.33%, 7.48%, 8.79%, 62.85%, 58.32%, 23.77%, 71.41%, 16.98%]

Enforcing Data Quality Constraints: Contributions
  Algorithms for enforcing C&C constraints in query processing
  Implemented a prototype in MS SQL Server code base for a restricted case
  Callout: Provides DBMS guarantees for C&C requirements

System Performance Evaluation
  Push vs. pull maintenance
  Performance model
  Model parameters and settings
  Experiments and analysis

“Push” Maintenance
  [Figure labels: Publication, views V1 to V5, log sniffing, Distribution Database, Distribution Agent, Subscriptions, Updates]

Push vs. Pull
  “Push” model:
    Incremental
    Only view level regions
    Only limited types of views: selection and projection views
  “Pull” model:
    Re-computing
    Maximal flexibility

Performance Model Overview
  Single-site DBMS ([ACL87])
  Cache-master configuration
  Model refinement
  User model
  Transaction model
  Data quality requirements
  Cache region concept
  Cache-master interaction
  Cache maintenance
  Transaction processing
  Consistency-class based locking for the cache
  Network cost
  Sequential vs. random disk access

Logical queuing model [ACL87] (single-site)
  [Figure labels: TERMINALS, delay, ready queue, cc queue, CC, blocked queue, BLOCK, RESTART, object queue, ACCESS, update queue, UPDATE, think, think? (YES/NO)]

Physical queuing model [ACL87] (single-site)
  [Figure labels: TERMINALS, delay, ready queue, think, cpu, disk]

Queuing model for a cache-master configuration
  [Figure labels: TERMINALS, SUBMIT, COMMIT, MASTER, CACHE, remote queries, remote queue, distribution agents, refresh transactions, updates]

Model Parameters for Single-Site DBMS
  Parameter             Meaning
  db_size               Number of objects in database
  mpl                   Multiprogramming level
  max_size              Size of largest transaction
  min_size              Size of smallest transaction
  write_prob            Pr (write X | read X)
  read_only_percentage  Percentage of read-only transactions
  ext_think_time        Mean transaction think time
  obj_io                Disk time for accessing an object
  obj_io_seek           Disk seeking time
  obj_io_transfer       Disk transfer time for an object
  obj_cpu               CPU time for accessing an object
  num_cpus              Number of CPUs
  num_disks             Number of disks

Model Parameters for a Cache-Master Configuration (1)
  Parameter               Meaning
  num_terms_total         Total number of terminals
  num_terms_cache         Number of terminals at a cache
  num_caches              Number of caches
  network_delay_query     Network delay for sending a query
  network_delay_transfer  Network delay for sending an object
  num_regions             Number of cache regions at each cache
  max_num_classes         Maximal number of consistency classes per Xact
  min_num_classes         Minimal number of consistency classes per Xact
  num_classes             Number of consistency classes of the database
  refresh_interval        Refresh interval
  currency_bound          Currency bound

Model Parameters for a Cache-Master Configuration (2)
  Parameter                Meaning
  log_sniffing_fixed_cpu   Fixed part of CPU time for log sniffing a transaction
  log_sniffing_unit_cpu    Unit CPU time for log sniffing a write action
  log_sniffing_fixed_disk  Fixed part of disk time for log sniffing a transaction
  log_sniffing_unit_disk   Unit disk time for log sniffing a write action
  distribution_fixed_cpu   Fixed part of CPU time for distributing updates
  distribution_unit_cpu    Unit CPU time for distributing a write action
  distribution_fixed_disk  Fixed part of disk time for distributing updates
  distribution_unit_disk   Unit disk time for distributing a write action
  seq_prob_copier          Pr (copier reads are sequential)
  seq_prob_refresh         Pr (copier writes are sequential)

Parameter Setting (1)
  Parameter             Value
  db_size               10,000 pages
  num_terms_total       300
  num_terms_cache       15
  mpl                   50
  max_size              12-page readset (maximum)
  min_size              4-page readset (minimum)
  write_prob            0.25
  read_only_percentage  90
  ext_think_time        1 second
  obj_io                35 milliseconds
  obj_io_seek           30 milliseconds
  obj_io_transfer       5 milliseconds
  obj_cpu               15 milliseconds

Parameter Setting (2)
  Parameter               Value
  num_cpus (master)       2
  num_disks (master)      4
  num_cpus (cache)        1
  num_disks (cache)       2
  num_caches              0, 1, 3, 5, 8, 10, 13, and 15
  network_delay_query     20 milliseconds
  network_delay_transfer  5 milliseconds
  seq_prob_copier         1
  seq_prob_refresh        0 for push, 1 for pull
  num_regions             1

Parameter Setting (3)
  Parameter                Value
  log_sniffing_fixed_cpu   15 milliseconds
  log_sniffing_unit_cpu    5 milliseconds
  log_sniffing_fixed_disk  20 milliseconds
  log_sniffing_unit_disk   5 milliseconds
  distribution_fixed_cpu   200 milliseconds
  distribution_unit_cpu    5 milliseconds
  distribution_fixed_disk  200 milliseconds
  distribution_unit_disk   5 milliseconds

Performance Metrics
  Throughput
    Number of transactions completed per second
  Response time
    Elapsed time between transaction submission and completion
  Local workload ratio
    Ratio of number of reads completed at the caches to the total number of reads submitted to the caches
  Conflict ratios
    Blocking ratio: average number of times that a transaction has to block per commit
    Restarting ratio: average number of times that a transaction has to restart per commit
  Utilization
    Disk utilization
    CPU utilization

Experiments and Analysis
  Impact of writes
  Impact of relaxing refresh interval
  Impact of relaxing data quality requirements
  Impact of push vs. pull
  Settings noted on the slide:
    Only one cache region, push
    Equal-sized cache regions, push vs. pull

Impact of Writes
  Scenario 1: never refresh
  Scenario 2: continuous refresh
  [Figure: System Throughput (∞ currency bound, ∞ refresh interval)]
  [Figure: System Throughput (∞ currency bound, 0 refresh interval)]

Summary
  Improvement is marginal when read-only percentage is low (80, 70, 50)
  Cache maintenance overhead worsens the situation

Impact of Relaxing Refresh Interval
  Scenario 1: low cache maintenance overhead
  Scenario 2: high cache maintenance overhead
  [Figure: System Throughput (low overhead)]
  [Figure: System Throughput (high overhead)]

Summary
  The cache maintenance overhead increases when:
    the number of caches increases
    the maintenance overhead increases

Impact of Relaxing Data Quality Requirements
  Scenario 1: 0 refresh interval
  Scenario 2: 5s refresh interval
  Scenario 3: 50s refresh interval
  Scenario 4: ∞ refresh interval
  [Figure: System Throughput (0 refresh interval)]
  [Figure: System Throughput (5s refresh interval)]
  [Figure: System Throughput (50s refresh interval)]
  [Figure: System Throughput (∞ refresh interval)]
  [Figure: Local Workload Ratio (0 refresh interval)]
  [Figure: Local Workload Ratio (5s refresh interval)]
  [Figure: Local Workload Ratio (50s refresh interval)]
  [Figure: Local Workload Ratio (∞ refresh interval)]

Summary
  Tradeoff between refresh interval and currency bound
    Refresh interval: determines refresh overhead
    Currency bound: determines local workload ratio
  Balance refresh interval with currency bound for better system performance
  Callout: Choose appropriate refresh interval according to workload currency bounds

Impact of Push vs. Pull
  Settings:
    Skewed setting (decaying currency bounds)
    Uniform setting (same currency bound)
  Number of cache regions:
    Push: 1, 20, 40 and 100
    Pull: 100, 200, 500 and 1,000
  [Figure: System throughput (skewed, push)]
  [Figure: System throughput (skewed, pull)]
  [Figure: Local Workload Ratio (skewed, push)]
  [Figure: Local Workload Ratio (skewed, pull)]
  [Figure: System throughput (non-skewed, push)]
  [Figure: System throughput (non-skewed, pull)]
  [Figure: Local Workload Ratio (non-skewed, push)]
  [Figure: Local Workload Ratio (non-skewed, pull)]

Summary
  Impact of fine cache region granularity
    More opportunity for lazy maintenance
    Smaller regions
    More copier/refresh transactions
  Finer granularity gives worse performance for non-skewed workload
  Callout: Choose appropriate cache region granularity according to workload C&C requirements

Performance Modeling: Contributions
  Developed a detailed model for a complex system: data quality-aware cache-master configuration
  Systematic performance evaluation
  Callout: Provides insights into performance tradeoffs

Related Work
  Relaxing data quality
    Distributed databases
      Read-only transactions [Garcia-Molina et al. 1982]
      Demarcation protocol [Barbará et al. 1992]
      TACC [Yu et al. 2000]
      Epsilon-serializability [Pu et al. 1992]
    Warehousing and web views
      WebViews [Labrinidis et al. 2003]
      FAS [Röhm et al. 2002]
      Obsolescent views [Gal 1999]
      Distributed views [Segev et al. 1990]
      Freshness-driven web caching [Li et al. 2003]
    Replica management
      Quasi-copies [Alonso et al. 1998], [Gallersdörfer et al. 1995]
      Good-enough views [Seligman et al. 1997]
      TRAPP [Olston et al. 2000]
  Caching
    Database caching
      DBCache [Altinel et al. 2003]
      Constraint-based database caching [Härder et al. 2004]
      Mid-Tier caching [TimesTen 2002]
      Shared-storage caching [Khalil et al. 2002]
    Others
      Semantic caching [Dar et al. 1996]
      Cache in Postgres [Stonebraker et al. 1990]
      Predicate-based caching [Keller et al. 1996]
      WATCHMAN [Scheuermann et al. 1996]
      Cache investment [Kossmann et al. 2000]
      DECAF [Kiernan et al. 2000]
      Proxy caching [Luo et al. 2001]
  Uniqueness of our approach (query-centric):
    Query: Specifies fine-grained C&C constraints
    Admin: Flexible local data quality control in terms of granularity and properties
    Caching DBMS: Provides C&C guarantees for individual query

Other Research
  UW:
    Indexing large-scale, dynamic one-dimensional intervals [In preparation]
      A family of data structures
      Differed index
    Evaluating different locking protocols for database caching [ongoing]
    Quality of services evaluation of multicast streaming protocols [SIGMETRICS 2002]
  MS: SchemaGen project [Software released]
    Designed and implemented a relational schema generator for annotated XML schemas
  MSR-Redmond: RECYCLE project
    Added support for update statistics for query result caching in SQL Server

Future Directions
  Adaptive data quality aware caching policies
  Improve current prototype
    Read-write transactions?
    Time-line constraints?
  Apply “good enough” to other forms of replications
    Indexing data?
    Control-table content?
    Refresh intervals?
  Automate cache design/tuning
    How to get a good cache schema? (i.e., cache region granularity, assignment)

Summary
  Problem: Gap between applications and caching DBMS
  A comprehensive solution:
    Specifying data quality constraints
    Data quality-centric cache model
    Enforcing data quality constraints
    Systematic performance evaluation
  So long, and thanks for all the fish!
  Questions?
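
Supplementary sketch: the Cost Estimation and Estimating p slides give the SwitchUnion cost as C = p * Clocal + (1 - p) * Cremote + Ccg, with p derived from the currency bound B, the estimated minimal delay d, and the estimated refresh interval f. The Python below is a minimal sketch of that arithmetic only; the function and argument names are illustrative assumptions and do not come from the prototype.

```python
def local_probability(bound, delay, interval):
    """Estimate p, the probability that the local branch is used.

    bound    -- currency bound B
    delay    -- estimated minimal delay d
    interval -- estimated refresh interval f
    All three are assumed to be in the same time unit.
    """
    slack = bound - delay
    if slack <= 0:
        return 0.0          # the local copy can never be fresh enough
    if slack <= interval:
        return slack / interval
    return 1.0              # the local copy is always fresh enough

def switch_union_cost(p, c_local, c_remote, c_guard):
    """Expected cost C = p * Clocal + (1 - p) * Cremote + Ccg."""
    return p * c_local + (1 - p) * c_remote + c_guard
```

For instance, with B = 10, d = 2, and f = 16 (made-up numbers), the slack B - d is 8, so p is 8/16 = 0.5, and the expected cost is the average of the local and remote costs plus the guard cost.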