Cooperative Query Answering
Erick Martinez
Nov. 19, 2002
MOTIVATION:

- Responses to queries posed by a user of a database do not always contain the information required.
- DB and information systems are often hard to use because they do not explicitly attempt to cooperate with their users. They answer literally the queries posed to them.
- A user might need more information than requested, or might actually need different information.
- An answer with extra or alternative information may be more useful and less misleading to a user.
Cooperative Answer (CA)

A CA should be a correct, non-misleading, and
useful answer to a query.
Q0: "Which students are enrolled?"
A0: "joana, jacob, shakil, …"
A0: "∀X. student(X)"
Grice's maxims

Maxim of Quality: a system should never give an answer
which might mislead the user

Maxim of Quantity: an answer should not be more
informative, or more detailed, than necessary

Maxim of Relation: an answer should always be relevant to
the user who asked the question

Maxim of Manner: an answer should not be ambiguous,
leaving the user with choices to make about its meaning
Database Stonewalling

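Q1: "Who passed COSC6115 in the winter semester of 2001?"
A1: "No one"

Q2: "Who failed COSC6115 in the winter semester of 2001?"
A2: "No one"

Q3: "Who taught COSC6115 in the winter semester of 2001?"
A3: "No one"

DB stonewalling: the database answers the question exactly as posed (for example, a yes/no question with a bare yes or no) regardless of whether the answer is misleading. Here A3 shows that the course was not taught at all, so A1 and A2 are literally correct but misleading.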
QUERY / ANSWER SYSTEMS

- Natural language interfaces
- Databases (relational)
- Logic programming and deductive databases (*)
Deductive Databases (DDB)

A deductive database consists of three parts:
- Facts: the set of all facts constitutes the extensional database (EDB).
- Rules: the set of rules constitutes the intensional database (IDB).
- Integrity constraints (IC): the set of logical formulas that must be true of the database,
  e.g. IC0: <- enrolled_in(X, Y), not student(X).

Distinction between data and knowledge:
- Data is represented in the EDB, and knowledge in the IDB and IC.
- Knowledge is the semantics of the DB: that which must be true of the DB's state, and the logical conclusions that must follow from the given data.
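
As a rough illustration of how an integrity constraint is checked against the data, the sketch below stores a tiny EDB as Python sets and looks for violations of IC0. The facts and the function name `violations_ic0` are assumptions made for this example; they are not taken from the slides.

```python
# Hypothetical EDB: each relation is a set of tuples (illustrative facts only).
edb = {
    "student": {("joana",), ("jacob",)},
    "enrolled_in": {("joana", "COSC6115"), ("shakil", "COSC6115")},
}

def violations_ic0(db):
    """Return enrolled_in facts violating IC0: <- enrolled_in(X, Y), not student(X)."""
    students = {x for (x,) in db["student"]}
    return sorted((x, y) for (x, y) in db["enrolled_in"] if x not in students)

# shakil is enrolled but not recorded as a student, so IC0 is violated.
print(violations_ic0(edb))
```

Adding `("shakil",)` to the `student` relation would make the database consistent with IC0 again.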
TECHNIQUES

- Evaluation of presuppositions in a query (*)
- Detection and correction of misconceptions in a query (*)
- Relaxation and generalization of queries and responses (*)
- Consideration of specific information about a user's state of mind
- Formulation of intensional answers
Presuppositions:

- Usually, asking a query presupposes not only the existence of all components of the query, but also an answer to the query itself.
  e.g. "Which employees own red cars?"
  Q4: <- emp(X), owns(X,Y), car(Y), red(Y).
- Two atoms in a query are joined if they share a variable.
- A query is connected if every two atoms in the query are linked by a chain of joined atoms.
- A conjunctive query with n atoms has 2^n - 2 proper sub-queries (exponential cost).
- Algorithm: report the smallest sub-queries that fail, considering only connected sub-queries.
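
The definitions of "joined" and "connected" can be sketched in Python for Q4. The encoding of an atom as a `(predicate, variables)` pair and the helper names are assumptions of this sketch, not part of the original slides.

```python
from itertools import combinations

# Q4 as a list of atoms: (predicate, variables).
Q4 = [("emp", ("X",)), ("owns", ("X", "Y")), ("car", ("Y",)), ("red", ("Y",))]

def joined(a, b):
    """Two atoms are joined if they share a variable."""
    return bool(set(a[1]) & set(b[1]))

def is_connected(atoms):
    """True if every atom is reachable from the first through joined atoms."""
    atoms = list(atoms)
    reached, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for j in range(len(atoms)):
            if j not in reached and joined(atoms[i], atoms[j]):
                reached.add(j)
                frontier.append(j)
    return len(reached) == len(atoms)

def proper_subqueries(query):
    """All non-empty proper subsets of the query's atoms (2^n - 2 of them)."""
    for size in range(1, len(query)):
        yield from combinations(query, size)

all_subs = list(proper_subqueries(Q4))
connected_subs = [s for s in all_subs if is_connected(s)]
print(len(all_subs), len(connected_subs))  # 14 proper sub-queries, 11 connected
```

Restricting attention to connected sub-queries discards combinations such as emp(X), car(Y), whose atoms share no variable even indirectly.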
Presuppositions: lattice of sub-queries

Q4: "Which employees own red cars?"

<- emp(X), owns(X,Y), car(Y), red(Y).

<- emp(X), owns(X,Y), car(Y).
<- emp(X), owns(X,Y), red(Y).
<- owns(X,Y), car(Y), red(Y).

<- emp(X), owns(X,Y).
<- owns(X,Y), car(Y).
<- owns(X,Y), red(Y).
<- car(Y), red(Y).

<- emp(X).
<- owns(X,Y).
<- car(Y).
<- red(Y).

- If a sub-query has no answers, the query cannot have any answers either (scalar implicature).
- Finding presuppositions (failed sub-queries) is independent of domain-specific knowledge.
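
The idea of reporting the smallest failing sub-queries can be sketched as a bottom-up walk through the lattice. This is only an illustration under stated assumptions: the database contents in `DB` are invented (employees own cars, but no car is red), and the brute-force evaluator `has_answer` stands in for whatever query engine a real system would use.

```python
from itertools import combinations

# Hypothetical EDB: employees own cars, but nothing is red.
DB = {
    "emp": {("ann",), ("bob",)},
    "owns": {("ann", "c1"), ("bob", "c2")},
    "car": {("c1",), ("c2",)},
    "red": set(),
}
Q4 = (("emp", ("X",)), ("owns", ("X", "Y")), ("car", ("Y",)), ("red", ("Y",)))

def has_answer(atoms, binding=None):
    """Backtracking check that a conjunction of atoms has at least one answer in DB."""
    binding = binding or {}
    if not atoms:
        return True
    (pred, args), rest = atoms[0], atoms[1:]
    for row in DB[pred]:
        new = dict(binding)
        # Bind each variable to the row's constant, or check an existing binding.
        if all(new.setdefault(v, c) == c for v, c in zip(args, row)):
            if has_answer(rest, new):
                return True
    return False

def connected(atoms):
    """True if the atoms form one component under the shares-a-variable relation."""
    seen, todo = {atoms[0]}, [atoms[0]]
    while todo:
        a = todo.pop()
        for b in atoms:
            if b not in seen and set(a[1]) & set(b[1]):
                seen.add(b)
                todo.append(b)
    return len(seen) == len(atoms)

def minimal_failures(query):
    """Smallest connected sub-queries (the full query included) that have no answers."""
    failed = []
    for size in range(1, len(query) + 1):
        for sub in combinations(query, size):
            if not connected(sub):
                continue
            if any(set(f) <= set(sub) for f in failed):
                continue  # a smaller failing sub-query already explains this one
            if not has_answer(sub):
                failed.append(sub)
    return failed

print(minimal_failures(Q4))
```

With this data the only minimal failure is red(Y), so a cooperative answer could say "nothing is red" instead of "no employee owns a red car".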
Misconceptions:

Integrity constraints:
IC1: <- professor(X), student(X).
IC2: <- enrolled_in(X, Y), not student(X).

Query:
"Which professor is enrolled in COSC6115?"
Q5: <- professor(X), enrolled_in(X, 'COSC6115').

Answer:
"No one is both a professor and a student. Anyone who is enrolled
in a class is a student. So no one is a professor and enrolled in
a class."
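
One way to picture misconception detection is to treat the query atoms as hypothetical facts, close them under the integrity constraints, and see whether a denial is triggered. The sketch below is a deliberate simplification and not the method used by any particular system: it keeps only the person argument of each atom, reads IC2 as the implication "enrolled_in implies student", and uses invented names (`IMPLICATIONS`, `DENIALS`, `find_misconception`).

```python
# IC2 read as an implication: enrolled_in(X, _) implies student(X).
IMPLICATIONS = {"enrolled_in": "student"}
# IC1 as a denial: no X may satisfy both predicates.
DENIALS = [("professor", "student")]

def find_misconception(query):
    """query: list of (predicate, subject_variable) pairs.
    Returns (p, q, var) for a violated denial, or None if the query is consistent."""
    known = set(query)
    changed = True
    while changed:  # close the query atoms under the implications
        changed = False
        for pred, var in list(known):
            implied = IMPLICATIONS.get(pred)
            if implied and (implied, var) not in known:
                known.add((implied, var))
                changed = True
    for p, q in DENIALS:
        for pred, var in known:
            if pred == p and (q, var) in known:
                return (p, q, var)
    return None

# Q5 projected onto the person variable X.
Q5 = [("professor", "X"), ("enrolled_in", "X")]
print(find_misconception(Q5))
```

For Q5 the sketch reports that X would have to be both a professor and a student, which is the content of the cooperative answer on the slide; the query can be rejected without consulting the data.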
Relaxation:

Taxonomy clause:
C6: travel(From, To) <-
    serves_area(A, From),
    serves_area(B, To),
    flight(A,B) *.

Reciprocal clause:
C6T: relax(flight(A,B)) <-
    serves_area(A, From),
    serves_area(B, To),
    travel(From, To).

Relaxation step: let θ be the substitution obtained by unifying an atom in the goal with the key atom (*) of the taxonomy clause.
1. Apply θ across the taxonomy clause.
2. Replace the query atom with the head atom of the taxonomy clause.
3. Add the non-key literals from the body of the taxonomy clause to the new query as constraints on the variables.
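
The three-step relaxation procedure can be sketched in Python for function-free atoms. Several things here are assumptions of the sketch: the `Var` class, the dictionary layout of `C6`, and the use of one-way matching (clause variables bound to query terms) in place of full unification, with no renaming of variables.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

def match_key(key, atom):
    """Match the key atom against a query atom; returns a substitution or None.
    One-way matching only: a simplification of full unification."""
    (k_pred, k_args), (a_pred, a_args) = key, atom
    if k_pred != a_pred or len(k_args) != len(a_args):
        return None
    theta = {}
    for k, a in zip(k_args, a_args):
        if isinstance(k, Var):
            if theta.setdefault(k, a) != a:
                return None
        elif k != a:
            return None
    return theta

def substitute(atom, theta):
    """Apply the substitution theta to every term of the atom."""
    pred, args = atom
    return (pred, tuple(theta.get(t, t) for t in args))

A, B, FROM, TO = Var("A"), Var("B"), Var("From"), Var("To")
# Taxonomy clause C6: head, non-key body literals, key literal (*).
C6 = {
    "head": ("travel", (FROM, TO)),
    "non_key": [("serves_area", (A, FROM)), ("serves_area", (B, TO))],
    "key": ("flight", (A, B)),
}

def relax(query, index, clause):
    """Relax query[index] through the taxonomy clause; returns the new query or None."""
    theta = match_key(clause["key"], query[index])
    if theta is None:
        return None
    constraints = [substitute(l, theta) for l in clause["non_key"]]  # steps 1 and 3
    head = substitute(clause["head"], theta)                         # steps 1 and 2
    return query[:index] + constraints + [head] + query[index + 1:]

q6 = [("flight", ("Dulles", "Orly"))]
print(relax(q6, 0, C6))
```

Relaxing flight('Dulles', 'Orly') this way yields serves_area('Dulles', From), serves_area('Orly', To), travel(From, To).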
… Relaxation:

Original query:
Q6: <- flight('Dulles', 'Orly').
Q6r: <- relax(flight('Dulles', 'Orly')).

Relaxing via reciprocal clause C6T:
Q6r': <- serves_area('Dulles', From),
         serves_area('Orly', To),
         travel(From, To).

Resolving with taxonomy clause C6:
Q6r'': <- serves_area('Dulles', From),
          serves_area('Orly', To),
          serves_area(A, From),
          serves_area(B, To),
          flight(A, B).

[Figure: airport taxonomy. Node labels: airport, washington_dc, 'Dulles', 'National', baltimore, 'BWI', paris, 'Orly', 'De Gaulle']

- When A = 'Dulles' and B = 'Orly', Q6r'' solves flight('Dulles', 'Orly') again and returns the same answers as the original query.
- When A ≠ 'Dulles' and B ≠ 'Orly', Q6r'' returns new answers: From = 'Washington, D.C.', and serves_area(A, 'Washington, D.C.') will be satisfied by A = 'National', A = 'BWI' …
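
To see how the relaxed query picks up neighbourhood answers, the sketch below evaluates Q6r'' over a small set of facts. The `serves_area` facts follow the airports named on the slide, but the area name 'Paris' and the whole `FLIGHT` relation are invented for illustration.

```python
SERVES_AREA = {
    ("Dulles", "Washington, D.C."), ("National", "Washington, D.C."),
    ("BWI", "Washington, D.C."), ("Orly", "Paris"), ("De Gaulle", "Paris"),
}
# Hypothetical schedule: note there is no direct Dulles-Orly flight.
FLIGHT = {("National", "De Gaulle"), ("BWI", "Orly")}

def relaxed_flights(origin, dest):
    """Answers (A, B) to Q6r'': flights between airports serving the same areas."""
    answers = set()
    for o, frm in SERVES_AREA:
        if o != origin:
            continue
        for d, to in SERVES_AREA:
            if d != dest:
                continue
            for a, b in FLIGHT:
                if (a, frm) in SERVES_AREA and (b, to) in SERVES_AREA:
                    answers.add((a, b))
    return sorted(answers)

print(("Dulles", "Orly") in FLIGHT)       # the original query Q6 fails
print(relaxed_flights("Dulles", "Orly"))  # the relaxed query finds nearby flights
```

Under these assumed facts the literal query has no answer, while the relaxed query offers the BWI to Orly and National to De Gaulle flights as alternatives.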
Generalization:

C6T: relax(flight(A,B)) <-
    serves_area(A, From),
    serves_area(B, To),
    travel(From, To).

- Relaxation is strictly a syntactic notion, a rewrite mechanism.
- Generalization is a semantic counterpart to relaxation.
- Literal answers to the relaxed query should include the answers to the original query, plus some new neighbourhood answers with respect to the original query.
- After applying relaxation, the new query is a generalization only if all the non-key atoms are satisfied whenever the key atom is satisfied (a conservative reciprocal clause).
- When all reciprocal clauses are conservative, resolution over a relaxed query will produce all the answers of the original query.
USER GOALS AND MODELS

Types of knowledge about a user relevant to CA:
- Interests and preferences
- Needs: user constraints (UC)
- Goals and intent
MY KEY POINTS:

- CA is mostly intended for DDB as a platform.
- For RDB, a deductive database interface should be implemented on top of the relational system.
- The system should support natural language input to some extent for some domains (the natural language translator generates a logical query).
- The system should produce natural language responses.
- CA techniques, in particular relaxation, can be useful for applications like Internet queries.
- It is not evident that first-order logic can serve as an adequate ontology for CA.
The End
That’s that’s that’s all folks …
A CA SYSTEM (at U of Maryland)

- Uniform system:
  - Defined and implemented through logic
  - Uniform representation and support for all cooperative methods
- Portable:
  - General approach for RDB, DDB and logic programs
  - Domain-independent
- Natural language interface:
  - Accept natural language queries
  - Provide cohesive and coherent responses in natural language
Deductive Database Structure:

EDB:
prerequisite('MATH-300', 'MATH-350').
prerequisite('MATH-350', 'MATH-400').
teaches(smith, 'MATH-400').
…

IDB:
teaches(X, Y) <- teaches(X, Z),
                 prerequisite(Y, Z).
…

IC:
<- enrolled_in(X, Y), not student(X).
…
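
How the IDB rule extends the EDB can be sketched with a naive bottom-up evaluation in Python. Only the three facts listed above are used; the function name `derive_teaches` and the set-of-tuples encoding are assumptions of this sketch, and a real deductive database would use a more efficient strategy.

```python
PREREQUISITE = {("MATH-300", "MATH-350"), ("MATH-350", "MATH-400")}
TEACHES = {("smith", "MATH-400")}

def derive_teaches(teaches, prerequisite):
    """Naive fixpoint of teaches(X, Y) <- teaches(X, Z), prerequisite(Y, Z)."""
    derived = set(teaches)
    while True:
        # One application of the rule to everything derived so far.
        new = {(x, y) for (x, z) in derived for (y, z2) in prerequisite if z == z2}
        if new <= derived:
            return derived  # nothing new: fixpoint reached
        derived |= new

print(sorted(derive_teaches(TEACHES, PREREQUISITE)))
```

Starting from teaches(smith, 'MATH-400'), the rule derives that smith also teaches 'MATH-350' and, through it, 'MATH-300'.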