Caching with “Good Enough” Currency, Consistency, and Completeness
Hongfei Guo, University of Wisconsin
Per-Åke Larson, Microsoft Research
Raghu Ramakrishnan, University of Wisconsin

Motivation: Scaling Google
…

Motivation: Scaling a DBMS by Caching
Problem: How to tell whether the cached data is “good enough” for an application?
- NO data quality requirements from the applications!
- NO data quality guarantees from the caching DBMS!
[Diagram: an application server running app-specific code queries the caching DBMS, which receives asynchronous updates from the backend DBMS.]

Big Picture
- Apps: Specify data quality requirements in queries [SIGMOD 2004] [SIGMOD 2004 Demo]
- Fine-grained data quality-aware database caching model [VLDB 2005]
  - Cache admin: Specifies local data quality
  - Cache: Keeps track of local data quality
- Query processing: Enforces data quality constraints [SIGMOD 2004] [VLDB 2005]
- System performance evaluation [ongoing work]
[Diagram: Application Server, Caching DBMS, Backend DBMS]

Contributions
Goal: fine-grained data quality-aware cache management
A comprehensive solution:
- Problem: How does the cache track data quality? Solution: cache properties
- Problem: How does the cache admin specify cache properties? Solution: dynamic cache model
- Problem: How to maintain the cache efficiently? Solution: efficient cache maintenance and “safety”
- Problem: How to enforce data quality constraints for queries? Solution: efficiently enforce data quality checking

Review: Data Quality Metrics (informal)
- Currency: the time elapsed since this copy became stale
- Consistency: a query result is (snapshot) consistent iff it is as if it had been evaluated from a single snapshot of the master database
- C&C: Currency & Consistency

Review: Proposed SQL Syntax
BookCopy:
bid  title      author
1    databases  Raghu
2    databases  Ullman

ReviewCopy:
rid  bid  text
1    1    …
2    1    …
3    2    …

SELECT *
FROM Books B, Reviews R
WHERE B.bid = R.bid AND B.title = 'Databases'
CURRENCY BOUND 10 min ON (B, R) BY B.bid

Clause parts: currency bound (10 min), consistency class ((B, R)), group by (BY B.bid).
Variant with one consistency class per table: CURRENCY BOUND 10 min ON (B), 30 min ON (R)

Query result:
bid  title      author  bid  rid  text
1    databases  Raghu   1    1    …
1    databases  Raghu   1    2    …
2    databases  Ullman  2    3    …

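One way to read the currency clause is that each ON (...) group forms a consistency class with its own staleness bound. The following is a minimal Python sketch of that reading, not the authors' implementation. The names `CachedCopy`, `ConsistencyClass` and `satisfies` are hypothetical, the sketch assumes the cache records which master snapshot each copy reflects and how stale it is, and the BY grouping is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CachedCopy:
    table: str            # table alias used in the query, e.g. "B" or "R"
    snapshot_id: int      # master snapshot this copy reflects (assumed to be tracked)
    staleness_min: float  # minutes since the copy became stale

@dataclass(frozen=True)
class ConsistencyClass:
    bound_min: float      # currency bound in minutes
    tables: tuple         # aliases that must be mutually consistent

def satisfies(copies, clause):
    """Return True if the cached copies meet every class in the clause."""
    by_table = {c.table: c for c in copies}
    for cls in clause:
        members = [by_table[t] for t in cls.tables]
        # Currency: every copy in the class must be within the bound.
        if any(m.staleness_min > cls.bound_min for m in members):
            return False
        # Consistency: all copies in the class must come from one snapshot.
        if len({m.snapshot_id for m in members}) > 1:
            return False
    return True

copies = [CachedCopy("B", snapshot_id=7, staleness_min=4.0),
          CachedCopy("R", snapshot_id=6, staleness_min=12.0)]

# CURRENCY BOUND 10 min ON (B, R): one class containing both tables.
strict = [ConsistencyClass(10, ("B", "R"))]
# CURRENCY BOUND 10 min ON (B), 30 min ON (R): two independent classes.
relaxed = [ConsistencyClass(10, ("B",)), ConsistencyClass(30, ("R",))]

print(satisfies(copies, strict))   # False: R is too stale and from another snapshot
print(satisfies(copies, relaxed))  # True: each copy meets its own bound
```

The same cached data fails the stricter clause and passes the relaxed one, which is the point of letting the application state how much staleness and inconsistency it can tolerate.
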
Roadmap
- Background
- Cache data quality properties
- Cache property specification
- Enforcing data quality constraints
- Future directions and conclusions

Why Define Cache Properties?
Cache properties = contract
[Diagram: queries with relaxed C&C requirements enter query processing, which produces results; cache properties sit between query processing and cache maintenance.]

Cache Properties (P+3C)
- Presence: per object
- Consistency: a set of objects
- Completeness: per predicate
- Currency: object staleness
Describe local data status

Presence
Example:
SELECT *
FROM Authors A
WHERE authorId = 1
Question: Is an object present in the cache?

Consistency and Currency
Example:
SELECT *
FROM Authors A
WHERE authorId IN (1, 2, 3)
CURRENCY BOUND 10 min ON (A)
Question: Is a set of objects consistent and no more than 10 minutes old?

Completeness
Example:
SELECT *
FROM Authors A
WHERE city = 'Madison'
Question: Are ALL authors from Madison in the cache?

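Basic Concepts
[Diagram: the master database (labeled H1) contains tables made up of objects and passes through a sequence of snapshots; the cache (labeled H2) holds View 1, View 2, and View 3.]
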
Cache Property Examples
Currency = now - stale point
[Diagram: master database (H1) and cache (H2) with View 1, View 2, and View 3, annotated with a stale point and with examples of consistent, complete, and present data.]

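The currency formula can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names `currency` and `within_bound` and the timestamps are hypothetical, and it assumes the cache knows the stale point of each copy.

```python
from datetime import datetime, timedelta

def currency(now, stale_point):
    """Currency of a cached copy: time elapsed since it became stale."""
    return now - stale_point

def within_bound(now, stale_point, bound):
    """True if the copy is no staler than the query's currency bound."""
    return currency(now, stale_point) <= bound

now = datetime(2005, 1, 1, 12, 0)
stale_point = datetime(2005, 1, 1, 11, 52)   # hypothetical stale point

age = currency(now, stale_point)
print(age)                                                     # 0:08:00
print(within_bound(now, stale_point, timedelta(minutes=10)))   # True
print(within_bound(now, stale_point, timedelta(minutes=5)))    # False
```
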
Roadmap
- Background
- Cache data quality properties
- Cache property specification
- Enforcing data quality constraints
- Future directions and conclusions

Specifying Cache Properties
Specified as integrity constraints
Single view:
- Presence constraint
- Consistency constraint
- Completeness constraint
Between two views:
- Presence correlation constraint
- Consistency correlation constraint

Presence Constraint
[Diagram: Backend DBMS and Caching DBMS; the caching DBMS holds the two tables below.]

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

AuthorList_PCT:
authorId
1
2
3

Presence Constraint
CREATE VIEW AuthorCopy AS
SELECT * FROM Authors
-- partially materialized view [Zhou et al 2005]

CREATE TABLE AuthorList_PCT (authorId int)
-- control table; authorId is the control key

ALTER VIEW AuthorCopy ADD
PRESENCE ON authorId IN
(SELECT authorId FROM AuthorList_PCT)

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

AuthorList_PCT:
authorId
1
2
3

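The effect of a presence constraint can be sketched in Python: the control table decides which backend rows the partially materialized view holds. This is a simplified illustration, not the caching DBMS's mechanism. The function `materialize` is hypothetical, and the fourth author row is invented so that there is something the cache does not hold.

```python
# Backend Authors table; the last row is invented for illustration.
authors = [
    {"authorId": 1, "name": "Alice", "city": "Madison"},
    {"authorId": 2, "name": "Bob", "city": "Madison"},
    {"authorId": 3, "name": "Cedric", "city": "Seattle"},
    {"authorId": 4, "name": "Dana", "city": "Seattle"},   # not on the slide
]

def materialize(backend_rows, control_keys, key="authorId"):
    """Rows the view must hold: those whose key appears in the control table."""
    wanted = set(control_keys)
    return [row for row in backend_rows if row[key] in wanted]

author_list_pct = [1, 2, 3]                       # control table contents
author_copy = materialize(authors, author_list_pct)
print([r["name"] for r in author_copy])           # ['Alice', 'Bob', 'Cedric']

# Adding a key to the control table changes what the cache must hold.
wider_copy = materialize(authors, author_list_pct + [4])
print(len(wider_copy))                            # 4
```

Because the cached contents are driven by ordinary table rows, the cache admin can change what is cached by inserting into or deleting from the control table.
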
Consistency Constraint
CREATE TABLE CityList_CsCT (city string)

ALTER VIEW AuthorCopy ADD
CONSISTENCY ON city IN
(SELECT city FROM CityList_CsCT)
-- callout on the slide: cache region

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

CityList_CsCT:
city
Madison

AuthorList_PCT:
authorId
1
2
3

Completeness Constraint
CREATE TABLE CityList_CpCT (city string)

ALTER VIEW AuthorCopy ADD
COMPLETENESS ON city IN
(SELECT city FROM CityList_CpCT)

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

CityList_CpCT:
city
Madison
New York

AuthorList_PCT:
authorId
1
3

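A small Python sketch can show how a completeness constraint interacts with a presence constraint. It is an illustration under stated assumptions, not the system's implementation: the functions `required_rows` and `is_complete_for` are hypothetical, and the presence control table is assumed to hold authorIds 1 and 3.

```python
authors = [
    {"authorId": 1, "name": "Alice", "city": "Madison"},
    {"authorId": 2, "name": "Bob", "city": "Madison"},
    {"authorId": 3, "name": "Cedric", "city": "Seattle"},
]

def required_rows(rows, present_ids, complete_cities):
    """Rows the cache must hold under both constraints."""
    ids, cities = set(present_ids), set(complete_cities)
    # A row is cached if its key is listed (presence) or if its city is one
    # for which the cache promises to hold every author (completeness).
    return [r for r in rows if r["authorId"] in ids or r["city"] in cities]

def is_complete_for(city, complete_cities):
    """Can the cache answer 'all authors from <city>' locally?"""
    return city in set(complete_cities)

author_list_pct = [1, 3]                  # assumed presence control table
city_list_cpct = ["Madison", "New York"]  # completeness control table

author_copy = required_rows(authors, author_list_pct, city_list_cpct)
print([r["name"] for r in author_copy])            # ['Alice', 'Bob', 'Cedric']
print(is_complete_for("Madison", city_list_cpct))  # True
print(is_complete_for("Seattle", city_list_cpct))  # False
```

Bob is cached only because Madison is listed as complete. Cedric is present, but the cache makes no promise about Seattle, so a query for all Seattle authors cannot be answered from the cache alone.
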
Presence Correlation Constraint
ALTER VIEW BookCopy ADD
PRESENCE ON authorId IN
(SELECT authorId FROM AuthorCopy)

AuthorList_PCT:
authorId
1
2
3

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

BookCopy:
isbn  authorId  title
111   1         aaa
222   1         bbb
333   2         ccc
444   3         ddd
555   3         eee

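The presence correlation between the two views can be sketched in Python as a filter of one view by the keys present in the other. The function `correlated_presence` is hypothetical, and the reduced AuthorCopy (authors 1 and 3 only) is an assumption chosen to make the effect visible.

```python
books = [
    {"isbn": 111, "authorId": 1, "title": "aaa"},
    {"isbn": 222, "authorId": 1, "title": "bbb"},
    {"isbn": 333, "authorId": 2, "title": "ccc"},
    {"isbn": 444, "authorId": 3, "title": "ddd"},
    {"isbn": 555, "authorId": 3, "title": "eee"},
]

def correlated_presence(child_rows, parent_rows, key="authorId"):
    """Child rows that must be cached: those whose key occurs in the parent view."""
    parent_keys = {r[key] for r in parent_rows}
    return [r for r in child_rows if r[key] in parent_keys]

# Assumed cache state: only authors 1 and 3 are currently in AuthorCopy.
author_copy = [
    {"authorId": 1, "name": "Alice", "city": "Madison"},
    {"authorId": 3, "name": "Cedric", "city": "Seattle"},
]

book_copy = correlated_presence(books, author_copy)
print([b["isbn"] for b in book_copy])   # [111, 222, 444, 555]
```

With this correlation, a join between a cached author and that author's books can be answered locally, because every book of a cached author is guaranteed to be cached too.
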
Presence Correlation Constraint
[Diagram: AuthorList_PCT → AuthorCopy → BookCopy, with both edges labeled authorId.]
(Tables AuthorList_PCT, AuthorCopy, and BookCopy are unchanged from the previous slide.)

Consistency Correlation Constraint
ALTER VIEW BookCopy ADD
CONSISTENCY ROOT

(Tables AuthorList_PCT, AuthorCopy, and BookCopy are the same as in the Presence Correlation Constraint example.)

Consistency Correlation Constraint
[Diagram: AuthorList_PCT → AuthorCopy → BookCopy, with both edges labeled authorId.]
(Tables AuthorList_PCT, AuthorCopy, and BookCopy are the same as in the Presence Correlation Constraint example.)

Cache Schema Example
[Schema diagram. Control tables: AuthorList_PCT, ReviewerList_PCT, CityList_CsCT. Views: AuthorCopy, ReviewerCopy, BookCopy, ReviewCopy. Edge labels: authorId, reviewerId, authorId, isbn, reviewId.]

Roadmap
- Background
- Cache data quality properties
- Cache property specification
- Enforcing data quality constraints
- Future directions and conclusions

Extension to the Optimizer
- Compile-time consistency checking
- Run-time currency and inexpensive consistency checking
- Cost estimation

Run-time C&C Checking
[Plan diagram: a ChoosePlan operator evaluates a C&C guard and selects either the local plan using V or the remote plan requesting E.]
Currency guard: checks whether local view V satisfies the currency requirement
Consistency guard: checks whether local view V satisfies the consistency requirement

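The guarded plan choice can be sketched in Python as follows. This is a simplified illustration rather than the optimizer's actual operator: `ViewState`, `currency_guard`, `consistency_guard` and `choose_plan` are hypothetical names, and the sketch assumes that views in the same cache region are mutually consistent and that the cache tracks each view's staleness.

```python
from dataclasses import dataclass

@dataclass
class ViewState:
    staleness_min: float   # minutes since the view became stale
    region_id: int         # cache region of the view (assumed bookkeeping)

def currency_guard(view, bound_min):
    """Does the local view satisfy the currency requirement?"""
    return view.staleness_min <= bound_min

def consistency_guard(views):
    """Assumption: views in one cache region are mutually consistent."""
    return len({v.region_id for v in views}) <= 1

def choose_plan(views, bound_min, local_plan, remote_plan):
    """Run the local plan only if every guard passes, else go to the backend."""
    ok = consistency_guard(views) and all(
        currency_guard(v, bound_min) for v in views)
    return local_plan() if ok else remote_plan()

fresh = [ViewState(3.0, region_id=1), ViewState(5.0, region_id=1)]
stale = [ViewState(3.0, region_id=1), ViewState(25.0, region_id=1)]
split = [ViewState(3.0, region_id=1), ViewState(5.0, region_id=2)]

for views in (fresh, stale, split):
    print(choose_plan(views, 10, lambda: "local", lambda: "remote"))
# local
# remote
# remote
```

The guards run at execution time, so the same compiled plan can use the cache when it is fresh enough and fall back to the backend when it is not.
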
Future Directions
- Comprehensive performance evaluation
  - Cache configurations?
  - Comparison with other replication solutions?
- Automate cache design/tuning
  - How to get a good cache schema? (i.e., cache region granularity, assignment)
- Improve current prototype
  - Read-write transactions?
- Adaptive data quality-aware caching policies
  - Control-table content?
  - Refresh intervals?

Summary
Goal: fine-grained data quality-aware cache management
A comprehensive solution:
- Four cache properties
- Dynamic cache model
- Efficient cache maintenance and “safety”
- Efficiently enforce C&C checking
Questions?

So long, and thanks for all the fish!

Simple Consistency Guards Overhead
[Bar chart: execution time (ms, axis 0 to 80) split into query time and consistency guard time, for queries Qa, Qb, and Qc under local and remote execution. Percentage labels on the chart: 1.6%, 1.72%, 1.66%, 1.59%, 16.56%, 14.00%.]

Single Table Consistency Guard Overhead
[Bar chart: execution time (ms, axis 0 to 7) split into query time and consistency guard time, for guards A11a, A11b, A12, S11, and S12 under local and remote execution (Qa is used). Percentage labels on the chart: 6.06%, 4.95%, 2.33%, 7.48%, 8.79%, 62.85%, 58.32%, 23.77%, 71.41%, 16.98%.]
