Deductive Databases with Universally Quantified Conditions

Weiling Li and Rajshekhar Sunderraman
Department of Computer Science
Georgia State University
[email protected], [email protected]

Abstract. This paper presents an extension to deductive databases, called quantified deductive databases, that incorporates universally quantified expressions (in coded form) in the body of rules. Since universally quantified expressions contain negations in their semantics, quantified deductive databases fall under the category of deductive databases with negation. Furthermore, depending on other factors such as recursion and negation in the body of rules, quantified deductive databases may be classified as stratified or non-stratified deductive databases. Applications of quantified deductive databases in expressing relational queries as well as in the bottom-up computation of the (weak) well-founded model of general deductive databases are presented. It is conjectured that the class of quantified deductive databases includes the class of modularly stratified deductive databases.

Categories and Subject Descriptors: H. Information Systems [H.m. Miscellaneous]: Databases
General Terms: Data Models, Logic
Keywords: Deductive Databases, Negation, Stratification

1. INTRODUCTION

Deductive databases were introduced over 30 years ago ([Gallaire and Minker 1978],[Gallaire, Minker, and Nicolas 1981],[Gallaire, Minker, and Nicolas 1984]). A deductive database (sometimes referred to as a DATALOG program) consists of a set of facts that correspond to a relational database and a set of logical rules that define predicates that correspond to relational views. However, unlike relational databases, where the views are restricted to be non-recursive, deductive databases can express recursive views.
Moreover, deductive rules with arbitrary negation ([Ullman 1994]) in their bodies are much more expressive and can represent many views that are not possible with rules without negation or with restricted negation such as stratified negation, where negation is not allowed within a recursive view.

In deductive databases, a deductive rule is of the form:

  P :- L1, ..., Ln.

and is interpreted as the if-then rule: if L1, ..., Ln are true then one can infer that P is true. In particular, if the predicates in the rule have variables as arguments, then for any values of these variables for which L1, ..., Ln are true when the variables are replaced by their values, one can infer P with its variables replaced by those values. For example, consider the following deductive database:

  //Facts
  p(a,b).
  p(b,c).
  //Rules
  r(X,Y) :- p(X,Z), p(Z,Y).

SBBD - Simpósio Brasileiro de Banco de Dados

Based on the given facts, the rule allows us to infer r(a,c) to be true because there exists a substitution of values for variables (X by a, Z by b, and Y by c) that makes the conditions in the rule match the given facts.

Rules in deductive databases are fired when there exists a substitution for variables that makes the conditions in the rule true, i.e. we are restricted to existentially quantified conditions in the rule. This makes it difficult to express rules that need universally quantified conditions. For example, consider the following facts and rules depicting an undirected graph:

  node(a). node(b). node(c).
  edge(a,a). edge(a,b). edge(a,c).
  edge(X,Y) :- edge(Y,X).

Now, consider the predicate star(X), which is true if there is an edge from node X to every node in the graph. Deductive rules for the star(X) predicate are difficult to write, since the condition that must hold for star(X) to be inferable is a universally quantified condition.
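To make the universally quantified condition behind star(X) concrete, the following Python sketch evaluates it directly over the graph facts given above. It is only an illustration under stated assumptions: the helper names symmetric_closure and star are hypothetical and do not come from the paper, and the closure loop is a naive stand-in for bottom-up evaluation of the rule edge(X,Y) :- edge(Y,X).

```python
# Facts from the undirected-graph example in the text.
nodes = {"a", "b", "c"}
edges = {("a", "a"), ("a", "b"), ("a", "c")}


def symmetric_closure(edge_set):
    """Apply edge(X,Y) :- edge(Y,X) until no new fact is derived.

    Hypothetical helper: a naive fixpoint loop, not the paper's algorithm.
    """
    closed = set(edge_set)
    while True:
        derived = {(y, x) for (x, y) in closed} - closed
        if not derived:
            return closed
        closed |= derived


def star(node_set, edge_set):
    """Return the nodes X that have an edge (X, Y) for every Y in node_set."""
    return {x for x in node_set if all((x, y) in edge_set for y in node_set)}


all_edges = symmetric_closure(edges)
star_nodes = star(nodes, all_edges)
print(sorted(star_nodes))  # prints ['a']
```

The all(...) test is what an ordinary rule body cannot state directly: a rule fires as soon as one substitution satisfies its conditions, whereas here every node Y has to be checked.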
It would be convenient if such conditions could be expressed in the rules. We propose an extension to include universally quantified conditions. In this example, the rule for star(X) could then be expressed as follows:

  star(X) :- edge(X,*:node(*)).

The *-term in the condition of the rule corresponds to the universally quantified statement “node X is connected to every node in the graph”. In the above example, this condition is satisfied only by node a.

In this paper we introduce a class of deductive databases, called quantified deductive databases, that allows universally quantified conditions in the body of rules. Our conjecture is that the class of quantified deductive databases introduced in this paper properly contains the class of stratified DATALOG (see Fig. 1) and also includes the class of modularly stratified DATALOG ([Ross 1994]). We present two applications of quantified deductive databases: one in relational querying and the other in the bottom-up computation of the weak well-founded model of general deductive databases.

The rest of the paper is organized as follows: Section 2 introduces the syntax and semantics of quantified deductive databases and Section 3 presents the two applications.

2. QUANTIFIED DEDUCTIVE DATABASES

In this section, we introduce quantified deductive databases. The syntax is first introduced by extending ordinary deductive database terms with quantified terms. These quantified terms are then used in atomic formulas to produce quantified atomic formulas, which subsequently are used in if-then deductive rules. The semantics of quantified literals is then discussed. Finally, the semantics of quantified deductive databases is informally discussed.

2.1 Syntax

We now introduce the syntax of quantified deductive databases through a series of definitions.

Definition 2.1.
An alphabet is a finite set of symbols that includes constants (such as numbers and strings), variables, predicates, and the special syntactic symbols listed below:

  (, ), ,, :-, <, <=, =, <>, >, >=, :, *, and #.

The two new symbols introduced in this paper are * and #. Their significance will become clear soon.

Fig. 1. Deductive Database Classes

Definition 2.2. Given an alphabet, a term is either a constant symbol or a variable symbol from the alphabet or one of the following:

  (1) *
  (2) *:p(t1,...,tn), where p is an n-ary predicate symbol and t1, ..., tn are terms, exactly one of which is * and the remaining of which are either constant or variable symbols.
  (3) #
  (4) #:p(t1,...,tn), where p is an n-ary predicate symbol and t1, ..., tn are terms, exactly one of which is # and the remaining of which are either constant or variable symbols.

We introduce four new kinds of terms to regular deductive databases. Their meaning is given in Section 2.2. In the above definition, p(t1,...,tn) is referred to as the limiting predicate. Note that we do not allow the terms within a limiting predicate to be a term with another limiting predicate (it would be interesting to consider such cases; this is left for future work).

Definition 2.3. Given an alphabet, let p be an n-ary predicate symbol and let t1, ..., tn be terms as defined earlier, at least one of which is a variable or a constant symbol, and if one of them is a *-term (or a #-term) then none of the remaining terms is a #-term (or a *-term). Then, p(t1,...,tn) is an atomic formula, also referred to as a quantified atomic formula.

Quantified atomic formulas encode universally quantified statements about the predicate. As is clear from the definition, atomic formulas may include only *-terms or only #-terms, but not a mix of these terms.
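The syntactic restrictions of Definitions 2.2 and 2.3 can be sketched as a small checker. The Python code below is an illustration, not part of the paper: terms are represented as plain strings, the names term_kind and is_valid_atom are hypothetical, and only two conditions are checked (a limiting predicate contains exactly one occurrence of its marker and otherwise only constant or variable symbols, and an atomic formula does not mix *-terms with #-terms). The clause of Definition 2.3 requiring at least one variable or constant argument is deliberately not enforced here.

```python
import re

# A limited term looks like "*:e(*,U)" or "#:g(#)".
LIMITED = re.compile(r"^([*#]):(\w+)\((.*)\)$")
# A constant or variable symbol, e.g. "X", "a", "42".
SIMPLE = re.compile(r"^\w+$")


def term_kind(term):
    """Return '*', '#', or 'plain' for a well-formed term, else None."""
    if term in ("*", "#"):
        return term
    if SIMPLE.match(term):
        return "plain"
    m = LIMITED.match(term)
    if not m:
        return None
    marker, _pred, arg_text = m.groups()
    args = [a.strip() for a in arg_text.split(",")]
    # Exactly one argument of the limiting predicate is the marker itself.
    if args.count(marker) != 1:
        return None
    # Every other argument must be a constant or variable symbol.
    if any(a != marker and not SIMPLE.match(a) for a in args):
        return None
    return marker


def is_valid_atom(terms):
    """Check the argument list of one atomic formula p(t1,...,tn)."""
    kinds = [term_kind(t) for t in terms]
    if None in kinds:
        return False
    # *-terms and #-terms may not be mixed in one atomic formula.
    return not ("*" in kinds and "#" in kinds)


print(is_valid_atom(["*:e(*,U)", "X", "*:f(U,*)"]))  # True
print(is_valid_atom(["*", "X", "#"]))                # False
print(is_valid_atom(["*:e(*,*)", "X", "*:f(U,*)"]))  # False
```

Splitting the argument text on commas is sufficient only because nested limiting predicates are excluded by Definition 2.2; a grammar that allowed nesting would need a real parser.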
Here are some valid quantified atomic formulas:

  p(*, X, *)
  p(*:e(*,U), X, *:f(U,*))
  p(X, Y, #)
  q(#:g(#))

and some invalid atomic formulas:

  p(*, X, #)
  p(*:e(*,*), X, *:f(U,*))

The meaning of valid quantified atomic formulas will be discussed in the next sub-section.

Definition 2.4. A literal is defined as either a quantified atomic formula P or a negated quantified atomic formula not P.

Definition 2.5. Deductive rules are expressed as:

  P :- Q1, ..., Qn.

where P is a quantified atomic formula and Q1, ..., Qn are quantified literals. These rules are also referred to as quantified rules.

Definition 2.6. A quantified deductive database is a finite set of quantified rules.

2.2 Semantics of Quantified Literals

We now introduce the informal semantics of quantified atomic formulas. Without loss of generality, we introduce the semantics by considering atomic formulas involving predicates of small arity.

p(X,*).
This simple quantified literal is interpreted as the universally quantified formula:

  (forall Y)(dom(Y) -> p(X,Y)).

where dom is a unary predicate that includes all constants mentioned in the database, i.e. dom(c) if and only if c is a constant present in the database. Informally, the meaning of the quantified literal is the set of X values that are related (via predicate p) to all Y values from dom. In a similar way, the quantified literal p(*,X) is interpreted as the quantified formula:

  (forall Y)(dom(Y) -> p(Y,X)).

p(X,#).
This simple quantified literal is interpreted as the universally quantified formula:

  (forall Y)(p(X,Y) -> dom(Y)).

Informally, the meaning of the quantified literal is the set of X values that are related (via predicate p) to only Y values from dom. In a similar way, the quantified literal p(#,X) is interpreted as the quantified formula:

  (forall Y)(p(Y,X) -> dom(Y)).

p(X,*:e(*)).
This quantified literal is interpreted as the universally quantified formula:

  (forall Y)(e(Y) -> p(X,Y)).

Such quantified literals are much more useful than the simpler ones discussed earlier. Usually, universally quantified statements are made on a subset of the universal domain. For example, someone may make the observation “All men in this room are wearing a tie” rather than “All men are wearing a tie”. In this case, the informal meaning of the quantified literal is the set of X values that are related (via predicate p) to all Y values from unary predicate e. In a similar way, the quantified literal p(*:e(*),X) is interpreted as the universally quantified formula:

  (forall Y)(e(Y) -> p(Y,X)).

p(X,#:e(#)).
This quantified literal is interpreted as the universally quantified formula:

  (forall Y)(p(X,Y) -> e(Y)).

In this case, the informal meaning of the quantified literal is the set of X values that are related (via predicate p) to only Y values from unary predicate e. In a similar way, the quantified literal p(#:e(#),X) is interpreted as the universally quantified formula:

  (forall Y)(p(Y,X) -> e(Y)).

p(X,*:e(*,Z)).
This is an example of a quantified literal which has a limiting predicate with variables. The quantified literal is interpreted as the universally quantified formula:

  (forall Y)(e(Y,Z) -> p(X,Y)).

This quantified literal denotes the pairs of X and Z values such that the X value is related (via predicate p) to all Y values related (via predicate e) to the Z value.

p(X,#:e(#,Z)).
This is an example of a quantified literal which has a limiting predicate with variables. The quantified literal is interpreted as the universally quantified formula:

  (forall Y)(p(X,Y) -> e(Y,Z)).

This quantified literal denotes the pairs of X and Z values such that the X value is related (via predicate p) to only Y values related (via predicate e) to the Z value.
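The readings of the *-literals and #-literals given so far can be tried out on small finite relations. The Python sketch below is an illustration under assumptions that are not in the paper: the function names and the sample relations p, e and e2 are hypothetical, and candidate X values are drawn from the first column of p (and Z values from the second column of e2) so that the answers stay finite.

```python
def star_literal(p, limit):
    """p(X,*:limit(*)): X values with p(X,Y) for every Y in limit."""
    xs = {x for (x, _) in p}
    return {x for x in xs if all((x, y) in p for y in limit)}


def hash_literal(p, limit):
    """p(X,#:limit(#)): X values whose p-partners Y all lie in limit."""
    xs = {x for (x, _) in p}
    return {x for x in xs if all(y in limit for (x2, y) in p if x2 == x)}


def star_literal_by_z(p, e2):
    """p(X,*:e2(*,Z)): pairs (X,Z) with p(X,Y) for every Y with e2(Y,Z)."""
    xs = {x for (x, _) in p}
    zs = {z for (_, z) in e2}
    return {
        (x, z)
        for x in xs
        for z in zs
        if all((x, y) in p for (y, z2) in e2 if z2 == z)
    }


# Hypothetical sample data, chosen only to exercise the three readings.
p = {(1, "a"), (1, "b"), (2, "a"), (3, "c")}
e = {"a", "b"}
e2 = {("a", "g1"), ("b", "g1"), ("c", "g2")}

print(sorted(star_literal(p, e)))        # prints [1]
print(sorted(hash_literal(p, e)))        # prints [1, 2]
print(sorted(star_literal_by_z(p, e2)))  # prints [(1, 'g1'), (3, 'g2')]
```

In this data, 2 satisfies the # reading but not the * reading: its only partner a lies in e, yet it is not related to every element of e. That is the difference between "only Y values from e" and "all Y values from e".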
p(*:e(*,U), X, *:f(U,*)).
This is an example with two *-terms. The quantified literal is interpreted as the universally quantified formula:

  (forall W,Y)( (exists U)e(W,U) and (exists U)f(U,Y) -> p(W,X,Y)).

The quantified literal denotes the set of X-values (present as the middle argument of predicate p) that relate to all W and Y-values (coming from the first argument of predicate e and the second argument of predicate f respectively).

A more formal definition of the meaning of quantified atomic formulas is being worked out. In general, the meaning of a quantified atomic formula is a relation whose arity is equal to the number of variables present as arguments in the formula; note that the * and # terms do not count as variables. The tuples in the relation are the values of the variables that make the equivalent universally quantified formula true.

2.3 Semantics of Quantified Deductive Databases

There are several approaches to expressing the semantics of quantified deductive databases. One method is to provide an algorithm that translates a quantified deductive database into an ordinary deductive database. Then, the traditional semantics of deductive databases can be used. We now present the translation process using two examples. In each, we provide the individual steps to arrive at an equivalent ordinary deductive database. A detailed translation algorithm is being worked out.

Example: Consider the following quantified deductive database rule involving *:

  r(X,Y) :- p(X,*:e(*)), q(X,Y).

The rule expands to:

  r(X,Y) :- (forall U)(e(U) -> p(X,U)), q(X,Y).

The formula (forall U)(e(U) -> p(X,U)) expresses a unary predicate, say pPred(X), that can be defined by the following 3 ordinary deductive rules:

  temp1(X,U) :- p(X,_), e(U).
  temp2(X) :- temp1(X,U), not p(X,U).
  pPred(X) :- p(X,_), not temp2(X).
These three rules basically code the double-negation equivalent of the universally quantified formula. An explanation follows. The universally quantified formula

  (forall U)(e(U) -> p(X,U))

is equivalent to the formula (without universal quantifier):

  not (exists U)(e(U) and not p(X,U))

To be able to write this equivalent formula in the form of deductive rules, we add the predicate p(X,_) twice to limit the variable X in sub-formulas as follows:

  p(X,_) and not (exists U)(p(X,_) and e(U) and not p(X,U))

The three deductive rules are now apparent. temp1(X,U) expresses the conjunctive sub-formula: p(X,_) and e(U). temp2(X) expresses the sub-formula: (exists U)(p(X,_) and e(U) and not p(X,U)). Finally, pPred(X) expresses the entire formula

  p(X,_) and not (exists U)(p(X,_) and e(U) and not p(X,U))

So, the output ordinary deductive database would include the above 3 rules and the following rule:

  r(X,Y) :- pPred(X), q(X,Y).

Example: Consider the following quantified deductive database rule involving #:

  r(X,Y) :- p(X,#:e(#)), q(X,Y).

The rule expands to:

  r(X,Y) :- (forall U)(p(X,U) -> e(U)), q(X,Y).

The formula (forall U)(p(X,U) -> e(U)) expresses a unary predicate, say pPred(X), that can be defined by the following 2 ordinary deductive rules:

  temp1(X) :- p(X,U), not e(U).
  pPred(X) :- p(X,U), not temp1(X).

The first rule collects X values that are related to U values that are not present in e, and the second rule eliminates these values to produce the X values that satisfy the universally quantified formula. So, the output ordinary deductive database would include the above 2 rules and the following rule:

  r(X,Y) :- pPred(X), q(X,Y).

A second approach to defining the semantics is to extend the traditional model-theoretic or the fixpoint semantics for deductive databases to quantified deductive databases.
This is easily done for quantified deductive databases with no explicit negation in the body of rules. Current work is addressing this approach.

3. APPLICATIONS

In this section, we present two applications of quantified rules. One is in the area of querying relational databases. Quantified rules make it easy to express complicated relational queries that involve verifying a universally quantified condition. The second application is in the area of computing the intended model of general deductive databases. A database transformation approach that uses quantified rules is presented.

3.1 Relational Querying

Quantified rules can be used to answer many relational queries without the use of explicit negation. Consider the following relational database recording information about movies, actors, and directors:

  movie(TITLE)
  director(TITLE,DIRECTOR)
  actor(TITLE,ACTOR)

The movie table records titles of movies, the director table records which directors direct a particular movie, and the actor table records which actors act in a particular movie. A movie may have multiple directors, and a director of a movie may also act in the same movie. We now illustrate several queries against the movie database that involve universally quantified conditions.

Query 1: Get actors who act in all movies.
The query can be answered using the following quantified rule:

  answer(A) :- actor(*:movie(*),A).

To clearly understand the rule, the condition in the above rule may be expanded using the definition as follows:

  answer(A) :- (forall T)(movie(T) -> actor(T,A)).

A bottom-up evaluation of the rule involves the "divide" relational operator shown below:

  ANSWER(A) = ACTOR(T,A) ÷ MOVIE(T)

Query 2: Get actors who do not act in all movies.
The query can be answered using the following quantified rule:

  answer(A) :- actor(T,A), not actor(*:movie(*),A).

The conditions in the above rule may be expanded using the definition as follows:

  answer(A) :- actor(T,A), not (forall T1)(movie(T1) -> actor(T1,A)).

A bottom-up evaluation of the rule involves the "project", "minus" and "divide" relational operators shown below:

  ANSWER(A) = project[A](ACTOR(T,A)) − (ACTOR(T,A) ÷ MOVIE(T))

Query 3: Get directors such that every actor has acted in at least one of his or her movies.
The query can be answered using the following quantified rules:

  aperson(A) :- actor(T,A).
  r(A,D) :- actor(T,A), director(T,D).
  answer(D) :- r(*:aperson(*),D).

The condition in the third rule above may be expanded using the definition as follows:

  answer(D) :- (forall A)(aperson(A) -> r(A,D)).

A bottom-up evaluation of the rules involves the "project", "join" and "divide" relational operators shown below:

  ANSWER(D) = project[A,D](ACTOR(T,A) ⋈ DIRECTOR(T,D)) ÷ project[A](ACTOR(T,A))

Query 4: Get pairs of actors who have acted in exactly the same set of movies.
The query can be answered using the following quantified rule:

  answer(A1,A2) :- actor(*:actor(*,A2),A1), actor(*:actor(*,A1),A2), A1<A2.

The conditions in the rule may be expanded using the definition as follows:

  answer(A1,A2) :- (forall T)(actor(T,A2) -> actor(T,A1)), (forall T)(actor(T,A1) -> actor(T,A2)), A1<A2.

As can be seen in the expansion, the first condition verifies that A1 acts in all the movies in which A2 has acted and the second condition verifies that A2 acts in all the movies in which A1 has acted. Another way to solve this query is by using the following quantified rule:

  answer(A1,A2) :- actor(*:actor(*,A2),A1), actor(#:actor(#,A2),A1), A1<A2.
Note the use of # and the exchange of the two variables in the second condition; the second condition verifies that A1 acts only in movies in which A2 has acted.

3.2 Bottom-up Computation of (Weak) Well-Founded Model

Deductive databases with arbitrary negation in the body of rules have important applications in knowledge representation, data integration and database repairs, and many other emerging areas. The semantics of such deductive databases has been studied and two popular approaches have emerged: the well-founded model semantics ([van Gelder, Ross, and Schlipf 1988]) and the stable model semantics ([Gelfond and Lifschitz 1988]). A precursor to the well-founded model was published by Fitting ([Fitting 1985]), in which the basic idea behind the well-founded model was introduced. This model is sometimes referred to as the Fitting model or the weak well-founded model.

In [Li, Khabya, Fang and Sunderraman 2010], we proposed a program transformation algorithm that takes as input a deductive database with arbitrary negation and transforms it into a Fitting-model equivalent deductive database in which the arbitrary negations are not present. The transformation introduces two new predicates, p_plus and p_minus, for each predicate p in the original deductive database. The quantified deductive rules introduced in this paper can be used to simplify the transformation process. We illustrate the transformation process, and the use of quantified rules in its output, with an example.

Example: Consider the following general deductive database:

  edge(1,2). edge(2,3).
  vc(X) :- edge(X,Y), not vc(Y).

This simple deductive database is non-stratified and defines the concept of a vertex cover in undirected graphs. The transformed program (with comments) is shown below:

  //for each constant introduce a dom() fact.
  dom(1). dom(2). dom(3).
  // For the extensional predicate edge(), introduce the positive facts.
  edge_plus(1,2). edge_plus(2,3).
  // For the extensional predicate edge(), introduce the following rule
  edge_minus(X,Y) :- dom(X), dom(Y), not edge_plus(X,Y).
  // The rule for vc_plus; replace negated literal by vc_minus
  vc_plus(X) :- edge_plus(X,Y), vc_minus(Y).
  // Rules for vc_minus
  // For each literal in the body introduce temp() predicate
  temp(X,Y) :- edge_minus(X,Y).
  temp(X,Y) :- dom(X), vc_plus(Y).
  // Introduce rule for vc_minus
  vc_minus(X) :- temp(X,*).

The transformed deductive database is a quantified deductive database with no explicit negation involving the intensional predicates. A bottom-up computation of the fixed-point is guaranteed to terminate. The Fitting model of the original program can easily be extracted from the fixed-point of the transformed program. The following is the computation process:

  Iteration   | edge_plus(X,Y) | edge_minus(X,Y)                             | vc_plus(X) | temp(X,Y)                                   | vc_minus(X)
  Iteration 1 | {(1,2),(2,3)}  | {(1,1),(1,3),(2,1),(2,2),(3,1),(3,2),(3,3)} | ∅          | {(1,1),(1,3),(2,1),(2,2),(3,1),(3,2),(3,3)} | {3}
  Iteration 2 | unchanged      | unchanged                                   | {2}        | result from the last iteration ∪ {(1,2)}    | result from the last iteration ∪ {1}
  Iteration 3 | unchanged      | unchanged                                   | unchanged  | unchanged                                   | unchanged

The iteration stops when no change happens. A bottom-up evaluation of the transformed program results in the values vc_plus = {2} and vc_minus = {1,3}, which coincide with the Fitting model for the input program.

4. CONCLUSIONS AND FUTURE WORK

We have presented quantified deductive databases and their applications. The class of quantified deductive databases is easily seen to encompass stratified deductive databases. If explicit negations are not allowed in quantified deductive databases, an intuitive semantics can be defined using a bottom-up fixpoint technique. The conjecture is that in this case, a unique fixpoint semantics exists.
Further investigation is needed to exactly classify quantified deductive databases in the spectrum of deductive databases with negation.

REFERENCES

M. Fitting. A Kripke-Kleene semantics for logic programs. J. of Logic Programming, 4:295–312, 1985.
H. Gallaire and J. Minker. Logic and Data Bases. Plenum Press, New York, 1978.
H. Gallaire, J. Minker, and J.M. Nicolas. Advances in Data Base Theory. Plenum Press, New York, 1981.
H. Gallaire, J. Minker, and J.M. Nicolas. Logic and Databases: A Deductive Approach. ACM Computing Surveys, 16(2):151–184, June 1984.
M. Gelfond and V. Lifschitz. The stable model semantics for logic programming. In Proceedings of the 5th International Conference and Symposium on Logic Programming, Seattle, Washington, pages 1070–1080. IEEE, 1988.
W. Li, K. Khabya, M. Fang, and R. Sunderraman. Handling Negation in General Deductive Databases: A Program Transformation Method. International Conference on Management of Data (COMAD), pages 39–49, 2010.
Kenneth A. Ross. Modular stratification and magic sets for Datalog programs with negation. Journal of the ACM, 41(6):1216–1266, 1994.
J.D. Ullman. Assigning an Appropriate Meaning to Database Logic with Negation. In Computers as Our Better Partners, pages 216–225. World Scientific Press, 1994.
Allen van Gelder, Kenneth A. Ross, and John S. Schlipf. Unfounded sets and well-founded semantics for general logic programs. In Proceedings of the Seventh Annual ACM Symposium on Principles of Database Systems, pages 221–230. ACM, 1988.