Download slides - Andrew.cmu.edu

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
August 25, VLDB 2009
Believe It or Not –
Adding belief annotations
to databases
Wolfgang Gatterbauer, Magda Balazinska,
Nodira Khoussainova, and Dan Suciu
University of Washington
http://db.cs.washington.edu/beliefDB/
High-level overview
•
•
DBMS: manage consistent data
Applications need inconsistent DM
– Scientific databases
reason: disagreement !
– Internet community databases
•
Community DBMS: manage inconsistent views
•
This work: Belief databases
– manage data and curation
– grounded in modal and default logic
– implemented on top of relational model
2
Agenda
•
Motivating example
•
Logic foundations
•
Relational implementation
•
Discussion
3
Motivating application
•
NatureMapping project (http://depts.washington.edu/natmap/)
– volunter contribute animal observations
– one person curates the database
problem: does not scale!
Observations
id uid
2
species
Alice Crow
date
location
comment
06-14-08
Lake Placid
found feathers
Sightings (S)
sid uid
species
s2 Alice Crow
date
location
06-14-08 Lake Placid
Comments (C)
cid comment
sid
c1 found feathers
s2
4
1. Distinct database instances
S
sid uid
species
date
location
s2 Alice Crow
06-14-08 Lake Placid
sid uid
date
Alice
S
Bob
species
s2 Alice Raven
location
06-14-08 Lake Placid
D1: Belief worlds: individually consistent,
mutually possibly inconsistent
5
1. Distinct database instances
S
sid uid
species
date
location
s2 Alice Crow
06-14-08 Lake Placid
sid uid
date
Alice
S
Bob
species
s2 Alice Raven
location
06-14-08 Lake Placid
BeliefSQL
I: Alice believes that she saw a crow.
insert into BELIEF ‘Alice’ Sightings
values (‘s2’,‘Alice’,‘Crow’,’06-14-08’,‘Lake Placid’)
I: Bob believes that she actually saw a raven.
insert into BELIEF ‘Bob’ Sightings
values (‘s2’,‘Alice’,‘Raven’,’06-14-08’,‘Lake Placid’)
Q: Who believes something different
than Alice and what?
select U2.name, S1.species, S2.species
from Users as U,
BELIEF ‘Alice’ Sightings as S1,
BELIEF U.uid Sightings as S2,
where S1.sid = S2.sid
and
S1.species <> S2.species
A: {(‘Bob’, ‘Crow’, ‘Raven’)}
6
2. Open world assumption
S
sid uid
species
date
location
s2 Alice Crow
06-14-08 Lake Placid
sid uid
date
Alice
S
Bob
S
Carol
species
location
s2 Alice Raven
06-14-08 Lake Placid
sid uid
date
species
location
s2 Alice Crow
06-14-08 Lake Placid
s2 Alice Raven
06-14-08 Lake Placid
−
−
Adapted key constraints !
D2: Model incomplete knowledge with exlicit negative beliefs
7
2. Open world assumption
S
Alice
S
Bob
S
Carol
believe that Alice saw a crow
species I: Carol
datedoes not
location
nor a raven.
s2 Alice Crow
06-14-08 Lake Placid
insert into BELIEF ‘Carol’ not Sightings
values (‘s2’,‘Alice’,‘Crow’,’06-14-08’,‘Lake Placid’)
insert into BELIEF ‘Carol’ not Sightings
Placid’)
sid uid
species values
date(‘s2’,‘Alice’,‘Raven’,’06-14-08’,‘Lake
location
sid uid
s2 Alice Raven
06-14-08 Lake Placid
sid uid
date
species
location
s2 Alice Crow
06-14-08 Lake Placid
s2 Alice Raven
06-14-08 Lake Placid
−
−
8
2. Open world assumption
S
sid uid
species
s2 Alice Crow
Alice
S
species
s2 Alice Raven
Bob
S
Carol
sid uid
sid uid
species
Q: Who disagrees with a sighting from
‘06-14-08’ that Alice believes?
U.name, S1.species
date selectlocation
from Lake
Users
as U,
06-14-08
Placid
BELIEF ‘Alice’ Sightings as S1,
BELIEF U.uid not Sightings as S2
where S1.sid = S2.sid
S1.uid = S2.uid
date and location
and
S1.species = S2.species
06-14-08 Lake Placid
and
S1.date = ‘06-14-08’
and
S2.date = ‘06-14-08’
and
S1.location = S2.location
‘Crow’), (‘Carol’, ‘Crow’)}
date A: {(‘Bob’,
location
s2 Alice Crow
06-14-08 Lake Placid
s2 Alice Raven
06-14-08 Lake Placid
−
−
9
3. Higher-order beliefs
S
sid uid
species
date
location
s2 Alice Crow
06-14-08 Lake Placid
sid uid
date
Alice
S
species
s2 Alice Raven
C
Bob
Bob
Alice
C
location
06-14-08 Lake Placid
cid comment
sid
c1 plain black feathers
s2
cid comment
sid
c1 purple-black feathers
s2
D3: Beliefs about other user’s beliefs: allow discussion between users
10
3. Higher-order beliefs
to location
Bob, Alice believes that the
species I: According
date
of the Lake
sighted
animal were plain black.
s2 Alice Crow feathers
06-14-08
Placid
insert into BELIEF ‘Bob’ BELIEF ‘Alice’ Comments
values (‘c1’, ‘plain black feathers’, ‘s2’)
S
sid uid
S
sid uid
Alice
species
s2 Alice Raven
C
Bob
Bob
Alice
C
date
location
06-14-08 Lake Placid
cid comment
sid
c1 plain black feathers
s2
cid comment
sid
c1 purple-black feathers
s2
11
3. Higher-order beliefs
S
s2
Alice
S
sid
s2
C
Bob
Bob
cid
c1
Alice
C
does Alice believe according
species Q: Which
date comments
location
to Bob, which he does not believe himself?
Alice Crow
06-14-08 Lake Placid
select C1.cid, C1.comment
from BELIEF ‘Bob’ BELIEF ‘Alice’ Comments as C1,
BELIEF ‘Bob’ not Comments as C2
uid
species
date
location
where C1.cid = C2.cid
Alice Raven and 06-14-08
Lake Placid
C1.comment
= C2.comment
and
C1.sid = C2.sid
comment
sid
A: {(‘c1’,‘plain-black feathers’)}
plain black feathers
s2
sid uid
cid comment
sid
c1 purple-black feathers
s2
12
3. Higher-order beliefs
S
s2
Alice
S
sid
s2
C
Bob
Bob
cid
c1
Alice
C
does Alice believe according
species Q: Which
date comments
location
to somebody, which (s)he does not believe themself?
Alice Crow
06-14-08 Lake Placid
select U.name, C1.sid, C1.comment
from Users as U,
BELIEF U.uid BELIEF ‘Alice’ Comments as C1,
uid
species
date
location
BELIEF U.uid not Comments as C2
Alice Raven where
06-14-08
Placid
C1.cid =Lake
C2.cid
and
C1.comment = C2.comment
comment
sid = C2.sid
and
C1.sid
plain black feathers
s2
A: {(‘Bob’, ‘c1’, ‘plain-black feathers’)}
sid uid
cid comment
sid
c1 purple-black feathers
s2
13
4. Message board assumption
S
sid uid
species
date
location
s2 Alice Crow
06-14-08 Lake Placid
sid uid
date
Alice
S
species
s2 Alice Raven
C
Bob
S
Bob
Alice
06-14-08 Lake Placid
cid comment
sid
c1 plain black feathers
s2
sid uid
species
s2 Alice Crow
C
location
date
location
06-14-08 Lake Placid
cid comment
sid
c1 purple-black feathers
s2
D4: Default assumption: models a trusting attitude & avoids repeated inserts
14
4. Message board assumption
S
sid uid
species
s2 Alice Crow
Alice
S
sid
s2
C
Bob
cid
c1
S
sid
select S1.sid, S1.species
from BELIEF ‘Bob’ BELIEF ‘Alice’ Sightings as S1,
uid
species
dateBELIEFlocation
‘Bob’ not Sightings as S2
S1.sid =Lake
S2.sid
Alice Raven where
06-14-08
Placid
and
S1.uid = S2.uid
comment
sid
and
S1.species
= S2.species
and
S1.date
= S2.date
plain black feathers
s2
and
S1.location = S2.location
A: {(‘s2’, ‘Crow’)}
uid
species
date
location
s2 Alice Crow
Bob
Alice
C
Q: Which animal sightings does Alice believe
date
location
according to Bob, which he does not
06-14-08
Lake Placid
believe
himself?
06-14-08 Lake Placid
cid comment
sid
c1 purple-black feathers
s2
15
What we have seen so far
•
4 Design decisions for Belief Database model
–
–
–
–
•
Distinct belief worlds
Open world assumption (OWA)
Higher-order beliefs
Message board assumption
BeliefSQL
– SQL + ‘BELIEF’ (+ ‘not’)
16
Agenda
•
Motivating example
•
Logic foundations
•
Relational implementation
•
Discussion
17
Logic foundations: Belief statements
S
Bob
Alice
Carol
…
Alice
Alice
Bob
…
Bob Alice
species
s2 Alice Crow
Alice
…
sid uid
…
…
insert into BELIEF ‘Alice’ S
values (‘s2’, ‘Alice’, ‘Crow’,…)
i: ☐Alice S+(‘s2’,‘Alice’,‘Crow’,…)
modal operator
& belief path (w)
relational tuple (t)
sign (s)
Carol
ε
…
Carol
Alice
…
belief statement
 = ☐w
ts
“annotation of
ground tuple”
Belief database D = {1, …, n}
Bob
…
18
Logic foundations: Entailment
S
Bob
Alice
Carol
…
Alice
S
…
Bob Alice
Alice
Alice
Bob
…
Bob Alice
sid uid
species
…
s2 Alice Crow
…
sid uid
…
species
s2 Alice Crow
…
select *
from BELIEF ‘Bob’ BELIEF ‘Alice’ S
Carol
ε
…
…
One belief annotation:
D = {1}
1=☐Alice S+(…‘Crow’,…)
…
More than one entailed belief:
D ☐Bob Alice S+(…‘Crow’,…)
Carol
Alice
Bob
19
Logic foundations: Message board assumption
Message board assumption
If D ☐w ts
and ☐u w ts consistent with D
then D ☐u w ts
Default logic
 : ☐u 
☐u 
non-monotonic reasoning !
D
D\D
D
Explicit beliefs
(annotations)
Implicit beliefs
(assumptions)
Entailed beliefs
(extension)
belief database “contains” more than the explicit belief annotations !
20
“Semi-formal” problem statement
INPUT
OUTPUT
Belief statements
i1: 1
i2: 2
...
in: n
Adapted
key constraints
Message board
assumption
 : ☐u 
☐u 
D
 ?
D
☐w1…wd R+(x1,…xl) ?
q(x) :− ☐w Ri+(xi)
Belief Conjunctive Queries (BCQ)
q(x) :− ☐w1R1s1(x1), …, ☐wgRgsg(xg)
21
Agenda
•
Motivating example
•
Logic foundations
•
Relational implementation
•
Discussion
22
Canonical Kripke structure
Belief statements*
i1: s11+
i2: ☐Bob s11−
i3: ☐Bob s12−
i4: ☐Alice s21+
i5: ☐Alice c11+
i6: ☐Bob s22+
i7: ☐BobAlice c21+
i8: ☐Bob c22+
Message board
assumption
Carol
Carol Alice
#0
{s11+}
#1
{s11+,s21+,c11+}
Carol
Bob
Carol
Bob
#2
Alice
#3
{s11+,s21+,c11+,c21+}
Bob
{s11−,s12−,s22+,c22+}
 : ☐i 
☐i 
* Running example from the paper
23
Relational representation
Sightings_INTERNAL
tid
sid uid
species
Sightings_V
date
location
wid tid
E
sid s
e
wid1 uid
wid2
s1.1 s1 Carol Bald eagle 06-14-08 Lake Forest
#0 s1.1 s1 + y
#0
Alice #1
s1.2 s1 Carol Fish eagle 06-14-08 Lake Forest
#1 s1.1 s1 + n
#0
Bob
s2.1 s2 Alice Crow
06-14-08 Lake Placid
#1 s2.1 s2 + y
#0
Carol #0
s2.2 s2 Alice Raven
06-14-08 Lake Placid
#2 s1.1 s1 − y
#1
Bob
#2 s1.2 s1 − y
#1
Carol #0
#2 s2.2 s2 + y
#2
Alice #3
#3 s1.1 s1 + n
#2
Carol #0
#3 s2.1 s2 + n
#3
Bob
#3
Carol #0
Comments_INTERNAL
tid
cid
comment
sid
c1.1 c1
found feathers
s2
c2.1 c2
plain black feathers
s2
c2.2 c2
purple-black feathers s2
#2
#2
#2
Comments_V
wd tid
cid
s
e
#1 c1.1 c1
+ y
#2 c2.2 c2
+ y
#3 c1.1 c1
+ n
#3 c2.1 c2
+ y
D
S
wid
d
wid1 wid2
#0
0
#1
#0
#1
1
#2
#0
#2
1
#3
#1
#3
2
24
Example Translation of a Belief CQ (BCQ)
Q: Who disagrees with a sighting from ’06-14-08’ that Alice believes?
BeliefSQL
Translation into SQL
select U.name, S1.species
from Users as U,
BELIEF ‘Alice’ Sightings as S1,
BELIEF U.uid not Sightings as S2
where S1.sid = S2.sid
and
S1.uid = S2.uid
and
S1.species = S2.species
and
S1.date = ‘06-14-08’
and
S2.date = ‘06-14-08’
and
S1.location = S2.location
select
into
from
where
and
and
and
E1.uid as uid1, V.tid, V.sid, R.uid, R.species, R.date, R.location, V.s
T2
E as E1, Sightings_V as V, Sightings_STAR as R
E1.wid1=0
V.wid=E1.wid2
V.tid=R.tid
E1.uid='1';
select
into
from
where
and
and
E1.uid as uid1, V.tid, V.sid, R.uid, R.species, R.date, R.location, V.s
T1
E as E1, Sightings_V as V, Sightings_STAR as R
E1.wid1=0
V.wid=E1.wid2
V.tid=R.tid;
select
from
where
and
and
and
T1.uid1, T2.species
T1 as T1, T2 as T2
T1.sid=T2.sid
((T1.s=0 and T1.uid=T2.uid and T1.species=T2.species
and T1.date='6-14-08' and T1.location=T2.location) or
(T1.s=1 and (T1.uid<>T2.uid or T1.species<>T2.species
or T1.date<>'6-14-08' or T1.location<>T2.location)))
T2.s=1
T2.date='6-14-08';
drop
drop
table T2;
table T1;
q(x,y) :− ☐Alice S+(u,v,y,‘06-14-08’,z),
☐x S−(u,v,y,‘06-14-08’,z)
25
Agenda
•
Motivating example
•
Logic foundations
•
Relational implementation
•
Discussion
26
Experiments
Size
|R*|
Relative overhead ρ := n
ρ = O(mdmax)
In theory: e.g. 100 users, max. depth 2
ρ  10,000
m … #users
dmax … maximum
depth of
belief
annotation
Experiments: ρ  21 – 1,009
Size not limitation of semantics, but of relational implementation!
Time
Depends on type of query (3 types in paper)
Experiments on 10,000 annotations (ρ =22.4):
Q1: ~0.1 s
Q2: ~0.4 s
Q3: ~4.5 s
Considerable speed-up to come!
27
Inspirations and related work (excerpt)
•
Annotations
– Buneman et al. [ICDT 2001 / ICDT 2007]
– Bhagwat et al. [VLDBJ 2005], Geerts et al. [ICDE 2006]
– Srivastava & Velegrakis [SIGMOD 2007]
•
Modal logic
– Fagin et al. [1995]
– Calvanese et al. [IS 2008]
– Nguyen [LJ-IGPL 2008]
•
Uncertain / incomplete information
– Sarma et al. [ICDE 2006]
– Green & Tannen [IEEE Data Eng. 2006]
– Dalvi & Suciu [PODS 2007]
•
Inconsistency / key violations
– Arenas et al. [PODS 1999]
– Fuxman et al. [SIGMOD 2005]
•
Peer-to-peer computing / collaborative data sharing
– Bernstein et al. [WebDB 2002]
– Ives et al. [SIGMOD record 2008]
28
Conclusions
•
Proposed BELIEF databases
– Goal: manage, curate inconsistent data
•
Implementation
– Logical foundations
– Relational translation
•
Current work
– making it compact and fast
29
BACKUP
30
Relative overhead of relational representation
Relative overhed (|R|/n)
1E+4
1E+3
1E+2
1E+1
1E+1
1E+2
1E+3
Number of annotations (n)
1E+4
Distribution of belief
path depths (Pr[k=x])
31
Query types and execution times
32
Belief Conjunctive Queries (BCQ)
33
Revisiting the semantics / the user
(3) ?
-> Structured discourse
(2) BeliefSQL
Conflicts in belief worlds:
OWA, keys, ML, DA
(1) SQL
BELIEF ’Alice’ (…,’eagle’,…)
-> ’Alice’ASSERTS (…,’eagle’,…)
BELIEF ’Bob’ BELIEF ’Alice’
(…,’black feathers’,…)
-> ’Bob’SUGGESTS that the ASSUMPTION
(…,’black feathers’,…) has led ‘Alice’ to her
original observation
Standard relational model
34
Related documents