Deductive Databases with Universally Quantified Conditions
Weiling Li and Rajshekhar Sunderraman
Department of Computer Science
Georgia State University
[email protected], [email protected]
Abstract. This paper presents an extension to deductive databases, called quantified deductive databases, that incorporates universally quantified expressions (in coded form) in the body of rules. Since universally quantified expressions
contain negations in their semantics, quantified deductive databases fall under the category of deductive databases with
negation. Furthermore, depending on other factors such as recursion and negation in the body of rules, quantified deductive
databases may be classified as stratified or non-stratified deductive databases. Applications of quantified deductive
databases in expressing relational queries as well as in the bottom-up computation of the (weak) well-founded model of
general deductive databases are presented. It is conjectured that quantified deductive databases include the class of
modularly stratified deductive databases.
Categories and Subject Descriptors: H.Information Systems [H.m. Miscellaneous]: Databases
General Terms: Data Models, Logic
Keywords: Deductive Databases, Negation, Stratification
1. INTRODUCTION
Deductive databases were introduced over 30 years ago ([Gallaire and Minker 1978],[Gallaire, Minker,
and Nicolas 1981],[Gallaire, Minker, and Nicolas 1984]). A deductive database (sometimes referred
to as a DATALOG program) consists of a set of facts that correspond to a relational database and a set
of logical rules that define predicates that correspond to relational views. However, unlike relational
databases where the views are restricted to be non-recursive, deductive databases can express recursive
views. Moreover, deductive rules with arbitrary negation ([Ullman 1994]) in their bodies are much
more expressive and can represent many views that are not possible with rules without negation or
with restricted negation such as stratified negation, where negation is not allowed within a recursive
view.
In deductive databases, a deductive rule is of the form:
P :- L1, ..., Ln.
and is interpreted as the if-then rule: if L1, ..., Ln are true then one can infer that P is true. In particular,
if the predicates in the rule have variables as arguments and, for some values of these variables,
L1, ..., Ln are true when the variables are replaced by their values, then one can infer P with
its variables replaced by the same values. For example, consider the following deductive database:
//Facts
p(a,b).
p(b,c).
//Rules
r(X,Y) :- p(X,Z), p(Z,Y).
SBBD - Simpósio Brasileiro de Banco de Dados
Based on the given facts, the rule allows us to infer r(a,c) to be true because there exists a substitution
of values for variables (X by a, Z by b, and Y by c) that makes the conditions in the rule match the
given facts.
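The firing of this rule can be mimicked in a few lines of Python. This is only an illustrative sketch: the set p and the helper name infer_r are assumptions made here and are not part of the paper's formalism.

```python
# Facts p(a,b) and p(b,c), stored as a set of pairs.
p = {("a", "b"), ("b", "c")}

def infer_r(p_facts):
    """Evaluate r(X,Y) :- p(X,Z), p(Z,Y) by joining p with itself on Z."""
    return {(x, y) for (x, z1) in p_facts for (z2, y) in p_facts if z1 == z2}

r = infer_r(p)
print(r)  # {('a', 'c')}
```

The only pair of facts that agrees on Z is p(a,b) with p(b,c), which is the substitution described above.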
Rules in deductive databases are fired when there exists a substitution for variables that makes the
conditions in the rule true, i.e. we are restricted to existentially quantified conditions in the rule. This
makes it difficult to express rules that require universally quantified conditions. For example,
consider the following facts and rules depicting an undirected graph:
node(a).
node(b).
node(c).
edge(a,a).
edge(a,b).
edge(a,c).
edge(X,Y) :- edge(Y,X).
Now, consider the predicate star(X), which is true if there is an edge from node X to every node in
the graph. The star(X) predicate is difficult to express with ordinary deductive rules, since the
condition that must be satisfied for star(X) to be inferable is a universally quantified condition.
It would be convenient if such conditions could be expressed in the rules. We propose an extension to
include universally quantified conditions. So, in this example, we could express the rule for star(X)
as follows:
star(X) :- edge(X,*:node(*)).
The *-term in the condition of the rule would correspond to the universally quantified statement
“node X is connected to every node in the graph”. In the above example, this condition is satisfied
only by node a.
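The graph example can be checked with a small Python sketch. The variable names are illustrative, and the choice to draw candidate X values from the node facts is an assumption of this sketch rather than something the rule itself states.

```python
nodes = {"a", "b", "c"}
edge = {("a", "a"), ("a", "b"), ("a", "c")}

# edge(X,Y) :- edge(Y,X): close the edge relation under symmetry.
edge |= {(y, x) for (x, y) in edge}

# star(X) :- edge(X,*:node(*)): X must have an edge to every node.
# Assumption: candidate X values are taken from `nodes`.
star = {x for x in nodes if all((x, n) in edge for n in nodes)}
print(star)  # {'a'}
```

Nodes b and c each have an edge only to a, so they fail the universally quantified condition.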
In this paper we introduce a class of deductive databases, called quantified deductive databases,
that allows universally quantified conditions in the body of rules. Our conjecture is that the class
of quantified deductive databases introduced in this paper properly contains the class of stratified
DATALOG (see Fig. 1) and also includes the class of modularly stratified DATALOG ([Ross 1994]).
We present two applications of quantified deductive databases: one in relational querying and the
other in the bottom-up computation of the weak well-founded model of general deductive databases.
The rest of the paper is organized as follows: Section 2 introduces the syntax and semantics of
quantified deductive databases and Section 3 presents the two applications.
2. QUANTIFIED DEDUCTIVE DATABASES
In this section, we introduce quantified deductive databases. The syntax is first introduced by extending ordinary deductive database terms with quantified terms. These quantified terms are then used
in atomic formulas to produce quantified atomic formulas, which are subsequently used in if-then deductive rules. The semantics of quantified literals is then discussed. Finally, the semantics of quantified
deductive databases is informally discussed.
2.1 Syntax
We now introduce the syntax of quantified deductive databases through a series of definitions.
Definition 2.1. An alphabet is a finite set of symbols that includes constants (such as numbers and
strings), variables, predicates, and the special syntactic symbols listed below:
(, ), ,, :-, <, <=, =, <>, >, >=, :, *, and #.
The two new symbols introduced in this paper are * and #. Their significance will become clear soon.
Fig. 1. Deductive Database Classes
Definition 2.2. Given an alphabet, a term is either a constant symbol or a variable symbol from
the alphabet or one of the following:
(1) *
(2) *:p(t1,...,tn), where p is an n-ary predicate symbol, t1, ..., tn are terms exactly one of
which is * and the remaining are either constant or variable symbols.
(3) #
(4) #:p(t1,...,tn), where p is an n-ary predicate symbol, t1, ..., tn are terms exactly one of
which is # and the remaining are either constant or variable symbols.
We introduce four new kinds of terms to regular deductive databases. Their significance will become
clear soon. In the above definition, p(t1,...,tn) is referred to as the limiting predicate. Note that
we do not allow the terms within a limiting predicate to be a term with another limiting predicate (it
would be interesting to consider such cases and is left for future work).
Definition 2.3. Given an alphabet, let p be an n-ary predicate symbol and let t1, ..., tn be terms
as defined earlier, at least one of which is a variable or a constant symbol, and if one of them is a
*-term (or a #-term) then none of the remaining terms is a #-term (or a *-term). Then, p(t1,...,tn)
is an atomic formula, also referred to as a quantified atomic formula.
Quantified atomic formulas encode universally quantified statements about the predicate. As is clear
in the definition, atomic formulas may include only *-terms or only #-terms, but not a mix of these
terms. Here are some valid quantified atomic formulas:
p(*, X, *)
p(*:e(*,U), X, *:f(U,*))
p(X, Y, #)
q(#:g(#))
and some invalid atomic formulas:
p(*, X, #)
p(*:e(*,*), X, *:f(U,*))
The meaning of valid quantified atomic formulas will be discussed in the next sub-section.
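Two of the syntactic restrictions above lend themselves to a mechanical check. The following Python sketch covers only those two: a limiting predicate must contain exactly one occurrence of its quantifier symbol with all other arguments plain, and an atomic formula must not mix *-terms with #-terms. The tuple encoding of limited terms and the function names are assumptions of this sketch; the requirement that at least one argument be a variable or constant is deliberately not checked here.

```python
def is_plain(t):
    """A constant or variable symbol: any string other than '*' or '#'."""
    return isinstance(t, str) and t not in ("*", "#")

def quantifier(t):
    """Return '*' or '#' for a quantified term, or None for a plain term.

    A limited term such as *:e(*,U) is encoded as ('*', 'e', ['*', 'U']).
    It is rejected unless exactly one argument is the quantifier symbol
    and every other argument is plain (Definition 2.2).
    """
    if isinstance(t, tuple):
        q, _pred, args = t
        rest = [a for a in args if a != q]
        if len(args) - len(rest) != 1 or not all(is_plain(a) for a in rest):
            raise ValueError("ill-formed limiting predicate")
        return q
    return t if t in ("*", "#") else None

def no_mixing(args):
    """True if the argument list uses only *-terms or only #-terms."""
    try:
        quants = {quantifier(t) for t in args} - {None}
    except ValueError:
        return False
    return len(quants) <= 1

print(no_mixing(["*", "X", "#"]))  # False
```

Under this encoding the four valid examples pass the check and the two invalid examples fail it, the first for mixing * with # and the second because e(*,*) contains two occurrences of *.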
Definition 2.4. A literal is defined as either a quantified atomic formula P or a negated quantified
atomic formula not P.
Definition 2.5. Deductive rules are expressed as:
P :- Q1, ..., Qn.
where P is a quantified atomic formula and Q1, ..., Qn are quantified literals. These rules are also
referred to as quantified rules.
Definition 2.6. A quantified deductive database is a finite set of quantified rules.
2.2 Semantics of Quantified Literals
We now introduce the informal semantics of quantified atomic formulas. Without loss of generality,
we introduce the semantics by considering atomic formulas involving predicates of small arity.
p(X,*). This simple quantified literal is interpreted as the universally quantified formula:
(forall Y)(dom(Y) -> p(X,Y)).
where dom is a unary predicate that includes all constants mentioned in the database, i.e.
dom(c) if and only if c is a constant present in the database.
Informally, the meaning of the quantified literal is the set of X values that are related (via predicate p)
to all Y values from dom. In a similar way, the quantified literal p(*,X) is interpreted as the quantified
formula:
(forall Y)(dom(Y) -> p(Y,X)).
p(X,#). This simple quantified literal is interpreted as the universally quantified formula:
(forall Y)(p(X,Y) -> dom(Y)).
Informally, the meaning of the quantified literal is the set of X values that are related (via predicate
p) to only Y values from dom. In a similar way, the quantified literal p(#,X) is interpreted as the
quantified formula:
(forall Y)(p(Y,X) -> dom(Y)).
p(X,*:e(*)). This quantified literal is interpreted as the universally quantified formula:
(forall Y)(e(Y) -> p(X,Y)).
Such quantified literals are much more useful than the simpler ones discussed earlier. Usually, universally quantified statements are made on a subset of the universal domain. For example, someone
may make an observation “All men in this room are wearing a tie” rather than “All men are wearing
a tie”. In this case, the informal meaning of the quantified literal is the set of X values that are
related (via predicate p) to all Y values from unary predicate e. In a similar way, the quantified literal
p(*:e(*),X) is interpreted as the universally quantified formula:
(forall Y)(e(Y) -> p(Y,X)).
p(X,#:e(#)). This quantified literal is interpreted as the universally quantified formula:
(forall Y)(p(X,Y) -> e(Y)).
In this case, the informal meaning of the quantified literal is the set of X values that are related
(via predicate p) to only Y values from unary predicate e. In a similar way, the quantified literal
p(#:e(#),X) is interpreted as the universally quantified formula:
(forall Y)(p(Y,X) -> e(Y)).
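The difference between the *-form ("all") and the #-form ("only") with a unary limiting predicate can be seen on a small relation. In this Python sketch the function names are illustrative, and candidate X values are assumed to come from the first column of p, which mirrors the p(X,_) guard used in the translation of Section 2.3.

```python
def star_values(p, e):
    """X values for p(X,*:e(*)): (forall Y)(e(Y) -> p(X,Y))."""
    candidates = {x for (x, _) in p}  # assumed range of X
    return {x for x in candidates if all((x, y) in p for y in e)}

def hash_values(p, e):
    """X values for p(X,#:e(#)): (forall Y)(p(X,Y) -> e(Y))."""
    candidates = {x for (x, _) in p}  # assumed range of X
    return {x for x in candidates if all(y in e for (x2, y) in p if x2 == x)}

p = {(1, "a"), (1, "b"), (2, "a"), (3, "c")}
e = {"a", "b"}
print(sorted(star_values(p, e)))  # [1]
print(sorted(hash_values(p, e)))  # [1, 2]
```

Value 2 is related only to members of e but not to all of them, so it satisfies the #-form and fails the *-form.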
p(X,*:e(*,Z)). This is an example of a quantified literal which has a limiting predicate with variables.
The quantified literal is interpreted as the universally quantified formula:
(forall Y)(e(Y,Z) -> p(X,Y)).
This quantified literal denotes the pairs of X and Z values such that the X value is related (via predicate
p) to all Y values related (via predicate e) to the Z value.
p(X,#:e(#,Z)). This is an example of a quantified literal which has a limiting predicate with variables.
The quantified literal is interpreted as the universally quantified formula:
(forall Y)(p(X,Y) -> e(Y,Z)).
This quantified literal denotes the pairs of X and Z values such that the X value is related (via predicate
p) to only Y values related (via predicate e) to the Z value.
p(*:e(*,U), X, *:f(U,*)). This is an example with two *-terms. The quantified literal is interpreted as the universally quantified formula:
(forall W,Y)( (exists U)e(W,U) and (exists U)f(U,Y) -> p(W,X,Y)).
The quantified literal denotes the set of X-values (present as the middle argument of predicate p) that
relate to all W and Y-values (coming from the first argument of predicate e and the second argument
of predicate f respectively).
A more formal definition of the meaning of quantified atomic formulas is being worked out. In
general the meaning of a quantified atomic formula is a relation whose arity is equal to the number of
variables present as arguments in the formula - note that the * and # terms do not count as variables.
The tuples in the relation would be the values of the variables that make the equivalent universally
quantified formula true.
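As an illustration of the arity remark, the literal p(X,*:e(*,Z)) has two variables and so denotes a binary relation. The Python sketch below computes it under two assumptions that the informal semantics leaves open: X ranges over the first column of p and Z ranges over the second column of e. The function name is hypothetical.

```python
def star_pairs(p, e):
    """Pairs (X,Z) such that (forall Y)(e(Y,Z) -> p(X,Y))."""
    xs = {x for (x, _) in p}  # assumed range of X
    zs = {z for (_, z) in e}  # assumed range of Z
    return {
        (x, z)
        for x in xs
        for z in zs
        if all((x, y) in p for (y, z2) in e if z2 == z)
    }

p = {("x1", 1), ("x1", 2), ("x2", 1)}
e = {(1, "u"), (2, "u"), (1, "v")}
result = star_pairs(p, e)
print(sorted(result))  # [('x1', 'u'), ('x1', 'v'), ('x2', 'v')]
```

Here x2 is paired with v but not with u, because u is related to Y value 2 and p(x2,2) does not hold.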
2.3 Semantics of Quantified Deductive Databases
There are several approaches to express the semantics of quantified deductive databases. One method
is to provide an algorithm to translate a quantified deductive database into an ordinary deductive
database. Then, the traditional semantics of deductive databases could be used. We now present
the translation process using two examples. In each, we provide the individual steps to arrive at an
equivalent ordinary deductive database. A detailed translation algorithm is being worked out.
Example: Consider the following quantified deductive database rule involving *:
r(X,Y) :- p(X,*:e(*)), q(X,Y).
The rule expands to:
r(X,Y) :- (forall U)(e(U) -> p(X,U)), q(X,Y).
The formula (forall U)(e(U) -> p(X,U)) expresses a unary predicate, say pPred(X) that can be
defined by the following 3 ordinary deductive rules:
temp1(X,U) :- p(X,_), e(U).
temp2(X) :- temp1(X,U), not p(X,U).
pPred(X) :- p(X,_), not temp2(X).
These three rules basically code the double-negation equivalent of the universally quantified formula.
An explanation follows:
The universally quantified formula
(forall U)(e(U) -> p(X,U))
is equivalent to the formula (without universal quantifier):
not (exists U)(e(U) and not p(X,U))
To be able to write this equivalent formula in the form of deductive rules, we add the predicate p(X,_)
twice to limit the variable X in sub-formulas as follows:
p(X,_) and not (exists U)(p(X,_) and e(U) and not p(X,U))
The three deductive rules are now apparent. temp1(X,U) expresses the conjunctive sub-formula:
p(X,_) and e(U).
temp2(X) expresses the sub-formula:
(exists U)(p(X,_) and e(U) and not p(X,U)).
Finally, pPred(X) expresses the entire formula
p(X,_) and not (exists U)(p(X,_) and e(U) and not p(X,U))
So, the output ordinary deductive database would include the above 3 rules and the following rule:
r(X,Y) :- pPred(X), q(X,Y).
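The translated rules for the *-case can be traced on sample data with set comprehensions. The relations p, e and q below are made up for this sketch; each Python statement corresponds to one of the ordinary rules, as noted in the comments.

```python
p = {(1, "a"), (1, "b"), (2, "a")}
e = {"a", "b"}
q = {(1, "k"), (2, "m")}

# temp1(X,U) :- p(X,_), e(U).
temp1 = {(x, u) for (x, _) in p for u in e}
# temp2(X) :- temp1(X,U), not p(X,U).
temp2 = {x for (x, u) in temp1 if (x, u) not in p}
# pPred(X) :- p(X,_), not temp2(X).
p_pred = {x for (x, _) in p if x not in temp2}
# r(X,Y) :- pPred(X), q(X,Y).
r = {(x, y) for (x, y) in q if x in p_pred}
print(r)  # {(1, 'k')}
```

Value 2 lands in temp2 because p(2,b) is missing, so only value 1 survives into pPred.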
Example: Consider the following quantified deductive database rule involving #:
r(X,Y) :- p(X,#:e(#)), q(X,Y).
The rule expands to:
r(X,Y) :- (forall U)(p(X,U) -> e(U)), q(X,Y).
The formula (forall U)(p(X,U) -> e(U)) expresses a unary predicate, say pPred(X) that can be
defined by the following 2 ordinary deductive rules:
temp1(X) :- p(X,U), not e(U).
pPred(X) :- p(X,U), not temp1(X).
The first rule collects X values that are related to U values that are not present in e and the second rule
eliminates these values to produce the X values that satisfy the universally quantified formula. So, the
output ordinary deductive database would include the above 2 rules and the following rule:
r(X,Y) :- pPred(X), q(X,Y).
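The #-case translation can be traced the same way. Again the sample relations are invented for this sketch, and the final assertion-style comparison against a direct reading of the universally quantified formula is an addition made here for illustration.

```python
p = {(1, "a"), (1, "b"), (2, "a"), (2, "z")}
e = {"a", "b"}
q = {(1, "k"), (2, "m")}

# temp1(X) :- p(X,U), not e(U).
temp1 = {x for (x, u) in p if u not in e}
# pPred(X) :- p(X,U), not temp1(X).
p_pred = {x for (x, _) in p if x not in temp1}
# r(X,Y) :- pPred(X), q(X,Y).
r = {(x, y) for (x, y) in q if x in p_pred}

# Direct reading of (forall U)(p(X,U) -> e(U)) over the X values in p.
direct = {x for (x, _) in p if all(u in e for (x2, u) in p if x2 == x)}
print(r, p_pred == direct)  # {(1, 'k')} True
```

Value 2 is excluded because it is related to z, which is not in e.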
A second approach to defining the semantics is to extend the traditional model-theoretic or the
fixpoint semantics for deductive databases to quantified deductive databases. This is easily done
for quantified deductive databases with no explicit negation in the body of rules. Current work is
addressing this approach.
3. APPLICATIONS
In this section, we present two applications of quantified rules. One is in the area of querying relational
databases. Quantified rules are quite useful and easy to use to express complicated relational queries
that involve verifying a universally quantified condition. The second application is in the area of
computing the intended model of general deductive databases. A database transformation approach
that uses quantified rules is presented.
3.1 Relational Querying
Quantified rules can be used to answer many relational queries without the use of explicit negation.
Consider the following relational database recording information about movies, actors, and directors:
movie(TITLE)
director(TITLE,DIRECTOR)
actor(TITLE,ACTOR)
The movie table records titles of movies, the director table records which directors direct a particular
movie, and the actor table records which actor acts in a particular movie. It is possible to have
multiple directors for a movie as well as a director of a movie also playing the role of an actor in the
same movie. We now illustrate several queries against the movie database that involve universally
quantified conditions.
Query 1: Get actors who act in all movies. The query can be answered using the following quantified
rule:
answer(A) :- actor(*:movie(*),A).
To clearly understand the rule, the condition in the above rule may be expanded using the definition
as follows:
answer(A) :- (forall T)(movie(T) -> actor(T,A)).
A bottom-up evaluation of the rule involves the "divide" relational operator shown below:
ANSWER(A) = ACTOR(T, A) ÷ MOVIE(T)
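Relational division for Query 1 can be sketched in Python as follows. The function name divide and the sample titles and actors are hypothetical; the sketch assumes candidate actors are those appearing in the actor table.

```python
def divide(actor, movie):
    """ACTOR(T,A) divided by MOVIE(T): actors paired with every title in movie."""
    actors = {a for (_, a) in actor}
    return {a for a in actors if all((t, a) in actor for t in movie)}

movie = {"m1", "m2"}
actor = {("m1", "ann"), ("m2", "ann"), ("m1", "bob")}
print(divide(actor, movie))  # {'ann'}
```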
Query 2: Get actors who do not act in all movies. The query can be answered using the following
quantified rule:
answer(A) :- actor(T,A), not actor(*:movie(*),A).
The conditions in the above rule may be expanded using the definition as follows:
answer(A) :- actor(T,A), not (forall T)(movie(T) -> actor(T,A)).
A bottom-up evaluation of the rule involves the "project", "minus" and "divide" relational operators
shown below:
ANSWER(A) = project[A](ACTOR(T, A)) − (ACTOR(T, A) ÷ MOVIE(T))
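The project, divide and minus steps of Query 2 map directly onto set operations. The data below is hypothetical and the variable names are chosen for this sketch only.

```python
movie = {"m1", "m2"}
actor = {("m1", "ann"), ("m2", "ann"), ("m1", "bob")}

# project[A](ACTOR(T,A))
all_actors = {a for (_, a) in actor}
# ACTOR(T,A) divided by MOVIE(T)
in_all = {a for a in all_actors if all((t, a) in actor for t in movie)}
# minus
answer = all_actors - in_all
print(answer)  # {'bob'}
```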
Query 3: Get directors such that every actor has acted in at least one of his or her movies. The
query can be answered using the following quantified rules:
aperson(A) :- actor(T,A).
r(A,D) :- actor(T,A), director(T,D).
answer(D) :- r(*:aperson(*),D).
The condition in the third rule above may be expanded using the definition as follows:
answer(D) :- (forall A)(aperson(A) -> r(A,D)).
A bottom-up evaluation of the rules involves the "project", "join" and "divide" relational operators
shown below:
ANSWER(D) = project[A, D](ACTOR(T, A) ⋈ DIRECTOR(T, D)) ÷ project[A](ACTOR(T, A))
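Query 3 can be followed rule by rule in Python. The sample tables are invented for this sketch; the comments show which rule each statement stands for.

```python
actor = {("m1", "ann"), ("m2", "bob"), ("m3", "ann")}
director = {("m1", "dee"), ("m2", "dee"), ("m3", "eli")}

# aperson(A) :- actor(T,A).
aperson = {a for (_, a) in actor}
# r(A,D) :- actor(T,A), director(T,D).   (join on T, then project on A and D)
r = {(a, d) for (t1, a) in actor for (t2, d) in director if t1 == t2}
# answer(D) :- r(*:aperson(*),D).        (divide by aperson)
answer = {d for (_, d) in r if all((a, d) in r for a in aperson)}
print(answer)  # {'dee'}
```

Director eli is excluded because bob has not acted in any of eli's movies.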
Query 4: Get pairs of actors who have acted in exactly the same set of movies. The query can be
answered using the following quantified rule:
answer(A1,A2) :-
    actor(*:actor(*,A2),A1),
    actor(*:actor(*,A1),A2),
    A1<A2.
The conditions in the rule may be expanded using the definition as follows:
answer(A1,A2) :-
    (forall T)(actor(T,A2) -> actor(T,A1)),
    (forall T)(actor(T,A1) -> actor(T,A2)),
    A1<A2.
As can be seen in the expansion, the first condition verifies that A1 acts in all the movies in which A2
has acted and the second condition verifies that A2 acts in all the movies in which A1 has acted.
Another way to solve this query is by using the following quantified rule:
answer(A1,A2) :-
    actor(*:actor(*,A2),A1),
    actor(#:actor(#,A2),A1),
    A1<A2.
Note the use of # and the exchange of the two variables in the second condition; the second condition
verifies that A1 acts only in movies in which A2 has acted.
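Both formulations of Query 4 amount to requiring that each actor's set of movies contains the other's. A Python sketch of that reading, with hypothetical data and a helper name movies_of introduced only for illustration:

```python
actor = {("m1", "ann"), ("m2", "ann"), ("m1", "bob"), ("m2", "bob"), ("m1", "cy")}

def movies_of(a):
    """Titles in which actor a appears."""
    return {t for (t, a2) in actor if a2 == a}

actors = {a for (_, a) in actor}
answer = {
    (a1, a2)
    for a1 in actors
    for a2 in actors
    # A1 acts in every movie of A2, and A1 acts only in movies of A2.
    if a1 < a2
    and movies_of(a2) <= movies_of(a1)
    and movies_of(a1) <= movies_of(a2)
}
print(answer)  # {('ann', 'bob')}
```

The comparison a1 < a2 plays the same role as A1<A2 in the rule: it reports each pair once and drops pairs of an actor with themselves.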
3.2 Bottom-up Computation of (Weak) Well-Founded Model
Deductive databases with arbitrary negation in the body of rules have important applications in
knowledge representation, data integration and database repairs, and many other emerging areas.
Semantics of such deductive databases have been studied and two popular approaches have emerged:
the well-founded model semantics ([van Gelder, Ross, and Schlipf 1988]) and the stable model semantics ([Gelfond and Lifschitz 1988]). A precursor to the well-founded model was published by Fitting
([Fitting 1985]) in which the basic idea behind the well-founded model was introduced. This model is
sometimes referred to as the Fitting model or the weak well-founded model. In [Li, Khabya, Fang and
Sunderraman 2010], we proposed a program transformation algorithm that takes as input a deductive
database with arbitrary negation and transforms it into a Fitting-model equivalent deductive database
in which the arbitrary negations are not present. The transformation introduces two new predicates,
p_plus and p_minus, for each predicate p in the original deductive database. The quantified deductive
rules introduced in this paper can be used to simplify the transformation process. We illustrate the
transformation process and the use of quantified rules in its output with an example.
Example: Consider the following general deductive database:
edge(1,2).
edge(2,3).
vc(X):- edge(X,Y), not vc(Y).
This simple deductive database is non-stratified and defines the concept of a vertex cover in undirected
graphs. The transformed program (with comments) is shown below:
//for each constant introduce a dom() fact.
dom(1).
dom(2).
dom(3).
// For the extensional predicate edge(), introduce the positive facts.
edge_plus(1,2).
edge_plus(2,3).
// For the extensional predicate edge(), introduce the following rule
edge_minus(X,Y) :- dom(X), dom(Y), not edge_plus(X,Y).
// The rule for vc_plus; replace negated literal by vc_minus
vc_plus(X) :- edge_plus(X,Y), vc_minus(Y).
// Rules for vc_minus
// For each literal in the body introduce temp() predicate
temp(X,Y) :- edge_minus(X,Y).
temp(X,Y) :- dom(X), vc_plus(Y).
// Introduce rule for vc_minus
vc_minus(X) :- temp(X,*).
The transformed deductive database is a quantified deductive database with no explicit negation
involving the intensional predicates. A bottom-up computation of the fixed point is guaranteed to
terminate. The Fitting model of the original program can easily be extracted from the fixed point of the
transformed program. The following is the computation process:
Iteration | edge_plus(X,Y) | edge_minus(X,Y) | vc_plus(X) | temp(X,Y) | vc_minus(X)
Iteration 1 | {(1,2),(2,3)} | {(1,1),(1,3),(2,1),(2,2),(3,1),(3,2),(3,3)} | ∅ | {(1,1),(1,3),(2,1),(2,2),(3,1),(3,2),(3,3)} | {3}
Iteration 2 | unchanged | unchanged | {2} | result from the last iteration ∪ {(1,2)} | result from the last iteration ∪ {1}
Iteration 3 | unchanged | unchanged | unchanged | unchanged | unchanged
The iteration stops when no change happens. A bottom-up evaluation of the transformed program
results in the values vc_plus = {2} and vc_minus = {1,3}, which coincide with the Fitting
model for the input program.
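The bottom-up computation can be reproduced with a naive fixed-point loop in Python. This is a sketch under stated assumptions rather than the authors' implementation: edge_minus is computed as the complement of edge_plus over dom × dom, the *-term in the vc_minus rule is taken to range over dom, and all variable names other than the predicate names are chosen here.

```python
dom = {1, 2, 3}
edge_plus = {(1, 2), (2, 3)}
# edge_minus(X,Y) :- dom(X), dom(Y), not edge_plus(X,Y).
edge_minus = {(x, y) for x in dom for y in dom if (x, y) not in edge_plus}

vc_plus, vc_minus, temp = set(), set(), set()
iterations = 0
while True:
    iterations += 1
    # vc_plus(X) :- edge_plus(X,Y), vc_minus(Y).
    new_plus = {x for (x, y) in edge_plus if y in vc_minus}
    # temp(X,Y) :- edge_minus(X,Y).    temp(X,Y) :- dom(X), vc_plus(Y).
    new_temp = edge_minus | {(x, y) for x in dom for y in new_plus}
    # vc_minus(X) :- temp(X,*).
    new_minus = {x for x in dom if all((x, y) in new_temp for y in dom)}
    if (new_plus, new_temp, new_minus) == (vc_plus, temp, vc_minus):
        break  # nothing changed in this pass: fixed point reached
    vc_plus, temp, vc_minus = new_plus, new_temp, new_minus

print(sorted(vc_plus), sorted(vc_minus), iterations)  # [2] [1, 3] 3
```

The third pass changes nothing, so the loop stops with vc_plus = {2} and vc_minus = {1,3}, the values reported for the example.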
4. CONCLUSIONS AND FUTURE WORK
We have presented quantified deductive databases and their applications. The class of quantified
deductive databases is easily seen to encompass stratified deductive databases. If explicit negations
are not allowed in quantified deductive databases, an intuitive semantics can be defined using a
bottom-up fixpoint technique. The conjecture is that in this case, a unique fixpoint semantics exists.
Further investigation is needed to exactly classify quantified deductive databases in the spectrum of
deductive databases with negation.
REFERENCES
M. Fitting. A Kripke-Kleene semantics for logic programs. J. of Logic Programming, 4:295–312, 1985.
H. Gallaire and J. Minker. Logic and Data Bases, Plenum Press, New York, 1978.
H. Gallaire, J. Minker, and J.M. Nicolas. Advances in Data Base Theory, Plenum Press, New York, 1981.
H. Gallaire, J. Minker, and J.M. Nicolas. Logic and Databases : A Deductive Approach, in ACM Computing Surveys,
16(2):151–184, June 1984.
M. Gelfond and V. Lifschitz. The stable model semantics for logic programming. In Proceedings of the 5th International
Conference and Symposium on Logic Programming, Seattle, Washington, pages 1070–1080. IEEE, 1988.
W. Li, K. Khabya, M. Fang and R. Sunderraman. Handling Negation in General Deductive Databases: A Program
Transformation Method, International Conference on Management of Data (COMAD), pp. 39-49, 2010.
Kenneth A. Ross. Modular stratification and magic sets for Datalog programs with negation, in Journal of ACM,
41(6):1216–1266, 1994.
J.D. Ullman. Assigning an Appropriate Meaning to Database Logic with Negation, in Computers as Our Better
Partners, pp. 216-225, World Scientific Press, 1994.
Allen van Gelder, Kenneth A. Ross, and John S. Schlipf. Unfounded sets and well-founded semantics for general logic
programs. In Proceedings of the Seventh Annual ACM Symposium on Principles of database systems, pages 221–230.
ACM, 1988.