Optimization of Real Conjunctive Queries
Optimization Surajit Hewlett Palo IBM Laboratories Alto, CA The has been studied research nated). tion queries in In this techniques from carry to the over we We the set-theoretic seman- are study not the show under that set-theoretic number the such do not to of that optimize is semantically sive to evaluate. queries class queries that attention has most this trast, of equivalence cial A amount of queries, arise in SQL, ination two functions practice class. The granted direct title to copy provided commercial of the specific ACM all or part of this are not made the ACM and its data appear, material or distributed copyright and is by permission of tha Association To copy otherwise, or to republish, notice notice is lem for for tion ia given of change used that con- in commer- semantics; eliminated relathat unless This is, elim- is done elimination expensive. is, In is rnultisets), as COUNT) set- database sets, language duplicate in the Second, for might aggregate are sensitive to the semantics makes Do the known in to the the the results queries setting know, the optimization queries conjunctive As we shall underlying to re-examine conjunctive set-theoretic for Computing requires e fee are tuples. not men- assumes of tuples. setting. and the above either queries the con- about carry prob- bag-theoretic optimiza- over from bag-theoretic results the setting? do not carry over. queries is typically permission. ACM-PODS-5/93/Washington, 01993 fee the copiee advantage, publication that copying Mechinery. and/or that without of in real-life The requested. First, multiplicity We of duplicate term are (such it necessary Permission query is explicitly reasons. and applying invariably of be computationally since in has bag-theoretic tuples queries optimization relations, contain the con- optimization almost are bags (another duplicate has queries. on systems. 
results not of for complexity however, to query databases, tions of so research a significant that expen- or do a result understood. computational semantics; relations they questions of relational of queries are a query less fundamental In general, classes into equivalence class of conjunctive number into but deciding attracted is the a large equivalent is undecidable, on certain queries query optimization. relational focused relational a given Thus, is one of the in query fall to well to is. research research As problem conjunctive management tioned on transforming the the theoretic based of joins. very at ion queries database Techniques is of corresponds optimization minimize 14). minimization which is a difficulty, results Introduction is the number what junctive 1 research the minimiz There setting. [ASU79A, see [U89](Chapter queries how we know bag- conjunctive of conjuncts, research, know optimization setting bag-theoretic the for extensively DBS90]; of this junctive elimi- optimiza- ive queries focus this studied CM77, The com problem been minimizing In bag-theoretic conjunct semantics. this 95120-6099 optimization has ASU79B, queries eliminated). duplicates paper for are have general problems theoretic assumes duplicates SQL (i.e., conjunctive Unfortunately, invariably (i.e., contrast, for extensively. almost semantics tics problem CA Center vardiQalmaden.ibm. queries optimization Vardi Research Jose, E-mail: .com Y, Almaden San 94304 chaudhuriQhpl.hp Abstract The Queries Moshe Chaudhuri Packard E-mail: of Real Conjunctive 0-89791 Equivalence D.C. -593 -3/931000510059 approached . ..+1 .50 co of conjunctive via the notion of containment. In the set-theoretic setting, contained in another is always a subset Two queries in answer to if they of cent ainment. vestigation under thus start conjunctive bag-theoretic query leaves lations are the middle of conjunctive theoretic setting tainment in the problem is known ASU79B, CM77]. 
some sufficient under We is at erarchy, the NP(= semantics plexity X;) that does (rather the theoretic semantics. seems der set-theoretic to fact, portant the sible; ordering The picture queries are by conjuncts above for The result practical queries. We indicate that consider unions present, perhaps of same from seems not indicates the in the a in a bag set. B will B is clear is a bag of tuples is an assignment names. which (wing, is where pre- An is that “;” the relation PART consists of tuples: [2]), (flap, Seattle; Portland; im- Seattle; [1]), [1])} ~~~, the end is indicated but tuples. That Seattle)l = of the tuple in square PART contains Seattle), the other under marks that (engine, two only copies one of each Seattle)l I(wing, the This of the tuple copy is, I(engine, 1, and and brackets. of = 2, Portland)l = of conis posto re- We say that query. ement a fairly bleak than of conjunctive two Let complex- be restricted all database annotated {(engine, under of conjuncts conjunctive 2.1: the following means however, relat ton a of an ele- Ial, when A of Intuitively, element A arity. annota- un- semantics to paint the (or simply to relation Example prob- queries optimization case multiplicity of an element context). of relations the the of an element fixed conjunctive in the this occurrences for by lal~ the multiplicity must in integer. duplicate multiplicity [GJ79]. result contain be denoted bag- the no optimization removal and bag- that we then the isomorphic. of this represents elements; called is a positive of duplicates be NP-hard two situation show re- duplicates Semantics also the multiplicity NP-complete bag-theoretic queries optimization the ment; is equivalence the no is possible. 
of an element, number com- problem, to semantics any of the underlying equivalence has that consequence bag-theoretic junctive level in is still known under when may optimization While CM77], show are equivalent cisely bag hi- to be strict, of the graph-isomorphism we element, Since surprisingly, easier. We the database set-theoretic A bag is a set of annotated the computational semantics is not Bag-Theoretic 11~-hard. first problem semantics but 2 op- queries. (i.e., this We optimization tion containment), Here ASU79B, as the the query than be bag-theoretic In for equivalence lem NP con- polynomial is believed the be sets between some of some concept examine [ASU79A, is [St77]. change where Intuitively, semantics. is decid- problem. key equivalence in the semantics. of meaningful of conjunctive to re- for whether problem is at increase of the Since know hierarchy hierarchy indicates conditions of the bag-theoretic of this latter we do have semantics level con- that [ASU79A, necessary the second while than The while even that polynomial-time ity contrast, polynomial-time this harder setting. bag-theoretic prove the be in the bag- to be NP-complete In we do not tainment able. to set-theoretic and some containment, II; seems queries known ground theoretic Containment under equivreduces respectively, the possibility the situation are allowed). con- semantics. equivalence, unions also consider and queries [SY8 1]. We show hold for containment conjunctive and open timization in- of queries does not This to our union containment sult we say answer semantics, of conjunctive can be expressed We by considering tainment to latter. if the answer of the equivalence alence contained setting, a subbag set-theoretic is former the are in another is always and again terms the of is contained to the former a query to the Inthebag-theoretic a query the latter, that answer are equivalent in each other. that we say if the results is lost. queries. 
or equal will we be denoted between the Under 60 by =b two (=.). ~b. by ~.. bags (resp., of B’ if each el- also in B! multiplicity. bag relationship that First, a bag B is a subbag of B is contained We will The The with define subset equality sets) will a greater the sub- relationship relationship be denoted by Example 2.2: In this example, The B ~b B’. operational queries B = {(engine, Seatt/e; [1])} B’ = {(engine, Seattle; [2]), (flap, (wing, Portland; is operational Seattle; [1]), of [1])} in are bag also define union of two multiplicity bags. bags factors The bag Example B’ the 2.3: in Example (wing, tuple in either be denoted The k [3]), (flap, the on Example B and the bag: Seattle; Let Example for this section mantics we describe is the we start and the the bag-theoretic queries. optimization with the Since of description of SQL se- our “real-life” sults An SQL Conjunctive conjunctive the following SELECT coluxnnlist rellist WHERE equal where columnist selected, (called ple tion the table queries, conjunctive query is an the complete is SQL) names equalit among syntax cross conjunctive 3.1: The tuple of the tuples: Portland; SQL query of relation the [1]), ylist following wing, Seattie, application . ID , SUPPLIER , PART WHEItE SUPPLIER . CITY re- [1]), [1])} Seattle; the the [2]), Portland; flap, of only Seattle, condition following engine, Seattle, flap, the application following relation with in tuples the qualify: Seattle; [2]), Seattle; [1]), } of the selection relation as answer list results to the query 3.1. {(Boeing, engine; [2]), (Boeing, flap; [1])) tu- is a conjuncattributes. we refer the (For reader is an example of a 3.2 Logical In this subsection, mantics. SUPPLIER relations of query. SELECT Seattle; Seattle, of conjunctive FROM of the attributes of (possibly relation of SQL, list product (Boeing, equalitylist to [D87].) 
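The bag notions of Section 2 (multiplicity, subbag, bag union) can be sketched in a few lines of Python. This is only an illustration, assuming a bag is encoded as a `collections.Counter` that maps each tuple to its multiplicity; the helper names `is_subbag` and `bag_union` are hypothetical and do not come from the paper. The data are the bags B and B' of Example 2.2.

```python
from collections import Counter

# Bags of Example 2.2, encoded as tuple -> multiplicity.
B = Counter({("engine", "Seattle"): 1})
B_prime = Counter({
    ("engine", "Seattle"): 2,
    ("flap", "Seattle"): 1,
    ("wing", "Portland"): 1,
})

def is_subbag(a, b):
    """True if every element of a occurs in b with greater or equal multiplicity."""
    return all(b[t] >= m for t, m in a.items())

def bag_union(a, b):
    """Bag union: the multiplicities of each tuple in either bag are added."""
    return a + b

union = bag_union(B, B_prime)
```

With these definitions B is a subbag of B', and in the union the tuple (engine, Seattle) has multiplicity 3, as in Example 2.3.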
Example that of the [1])} engine, in the is the list and conjunctive relation: {(Boeing, rellist of equalities in the [1])} [2]), (wing, Seattle, Finally, variables), among us assume consists {(Boeing, After Queries itylist in are the Let PART consists Seattle; in Example to that the tuples moti- form: FROM in qualifying Seattle; Seatt/e; (Boeing, be for (Boeing, SQL taken. be described 3.1. SUPPLIER relation in the queries. 3.1 is conditions Queries of conjunctive vation will the cross obtained the us consider {(Boeing, [1]), Therefore, In rellist attributes in relation (flap, Conjunctive for the selection tuple details 3.2: query {(engine, 3 the The First, paper. SQL [1])} Portland; each conjunctive [D87] of the bags the to projected full in of the Finally, product. colunmlist. by U. the B U B’ Seattle; The by adding us consider 2.2. {(engine, each will Let operator. is obtained for union bag unton each equalitylist (see of SQL). relations we apply cross We will the of SQL as follows semantics product Next, semantics defined The the previous PART. ID alent = PART. CITY (for 61 we describe queries two and section) see also the their approaches a detailed see [NPS91]; Queries Conjunctive are then semantical [MPR90]). logical syntax denotational se- (of this section shown to be equiv- account and of SQL, A logical conjunctive query is a rule The of the form: result denoted Query(X) :-Cl (Xl), assignment . . . . C~(X~) result where the X and the Xi’s Cj’s are head of the body of the a query relation query in this that there rather, by multiple variables will are the distinguished of the over mapping bag Example 3.3: The equalities are cap- variables, Example 3.6: ample and 3.3 assignment The (a) s-id mapped and p_id p.id) Supplier(s-id, Part(p.icl, (b) The denotational queries mappmgs. tive query of data that mapping in Q. X, 3.4: Let and database 3.3 mapping where to Seattle and mapping. 
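The operational semantics of an SQL conjunctive query (cross product, then selection by the equality list, then projection on the column list, with no duplicate elimination) can be traced on Example 3.2 with a small sketch. The encoding of relations as `Counter` objects and the helper names `cross_product`, `select`, and `project` are assumptions made for illustration, not part of the paper.

```python
from collections import Counter
from itertools import product

# Relations of Example 3.2; attribute order is (ID, CITY).
SUPPLIER = Counter({("Boeing", "Seattle"): 1})
PART = Counter({
    ("engine", "Seattle"): 2,
    ("flap", "Seattle"): 1,
    ("wing", "Portland"): 1,
})

def cross_product(r, s):
    """Concatenate tuples; multiplicities are multiplied."""
    out = Counter()
    for (t1, m1), (t2, m2) in product(r.items(), s.items()):
        out[t1 + t2] += m1 * m2
    return out

def select(r, predicate):
    """Keep qualifying tuples with their multiplicities unchanged."""
    return Counter({t: m for t, m in r.items() if predicate(t)})

def project(r, columns):
    """Project without duplicate elimination: multiplicities are summed."""
    out = Counter()
    for t, m in r.items():
        out[tuple(t[c] for c in columns)] += m
    return out

# SELECT SUPPLIER.ID, PART.ID FROM SUPPLIER, PART
# WHERE SUPPLIER.CITY = PART.CITY
joined = cross_product(SUPPLIER, PART)      # columns: S.ID, S.CITY, P.ID, P.CITY
qualified = select(joined, lambda t: t[1] == t[3])
answer = project(qualified, (0, 2))
```

Here `answer` holds (Boeing, engine) with multiplicity 2 and (Boeing, flap) with multiplicity 1, matching the answer relation given in Example 3.2.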
Ci (Xi) is an tive variables of Q is Section and let the X be a data value by L9(C;(X;)) query Example to engine, now due to rni 10( Ci(Xi))l, to the = 0 of Q define an The over multiplicity i = D the tuple Boeing, c-id derived mapping 1, . . . ,n. is the tuple use Example due tion name The following ical syntax. {(Boeing, the engine; Consider The (O(X); Example assignment [2])} p-id of (a) assign- Therefore, the wing; [1])} Conjunctive by 0. the result [m]) a the in 3.4. The there relation in order that of no rela- once in rellist. every the log- relation in variable. rellist introduce with as the con- beginning the a variables attributes in the information. every attribute equalit for ylist, among equalities there induces tion on the variables. tive from each variable corresponds each variables is 62 of conjunct variable, each of SQL generate a distinct re- 1 than of steps attribute same equality mapping form in the more sequence every tinct with of conjunc- we assume introduce 3. Since due a transformation syntax canonical is repeated schema Let sketch to logical as introduced every 2. For m = ml m2 . . . mn. 3.5: to [1])}. SQL 3. For simplicity, These sult and that bag: we briefly rellist, is an assignment assignment There possible: Seattle result [2]), (Boeing, syntax query 1. For in Ex- 3.2. 9 can to in Ex- of assignment The vs. section, in We are except result wing; in the corresponding query that c-id 3.5. engine; will D, the 3.2. queries. We is mapped. in query as (l), The is {(Boeing, the SQL junctive is mapped to In this from @ be an assignment us consider s-id p-id D to the we denote the Example Queries of a conjunc- by O(X) and Same Example Logical conjunc- in the body Let database to which Example ample D in assignment a database in conjunct We denote O maps tuple into in D. 
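The denotational semantics can be sketched as a brute-force evaluator: every assignment maps each conjunct to a tuple of the corresponding relation, the tuple it contributes has multiplicity m = m1 · m2 · … · mn, and the result is the bag union over all assignments. The sketch below assumes that every argument is a variable and that every head variable occurs in the body, so choosing one tuple per conjunct is the same as choosing an assignment. The function name `evaluate` and the query encoding are hypothetical.

```python
from collections import Counter
from itertools import product

def evaluate(head, body, db):
    """Evaluate a logical conjunctive query under bag semantics.

    head: tuple of distinguished variables
    body: list of (relation_name, argument_variables)
    db:   dict relation_name -> Counter(tuple -> multiplicity)
    """
    result = Counter()
    choices = [list(db[rel].items()) for rel, _ in body]
    for combo in product(*choices):          # one tuple per conjunct
        theta, mult, consistent = {}, 1, True
        for (_, args), (tup, m) in zip(body, combo):
            for var, val in zip(args, tup):
                if theta.setdefault(var, val) != val:
                    consistent = False       # a variable got two different values
                    break
            if not consistent:
                break
            mult *= m                        # m = m1 * m2 * ... * mn
        if consistent:
            result[tuple(theta[v] for v in head)] += mult
    return result

db = {
    "Supplier": Counter({("Boeing", "Seattle"): 1}),
    "Part": Counter({("engine", "Seattle"): 2,
                     ("flap", "Seattle"): 1,
                     ("wing", "Portland"): 1}),
}
# Query(s_id, p_id) :- Supplier(s_id, c_id), Part(p_id, c_id)
answer = evaluate(("s_id", "p_id"),
                  [("Supplier", ("s_id", "c_id")), ("Part", ("p_id", "c_id"))],
                  db)
```

On this database the evaluator returns the same bag as the operational reading of the SQL query, which is the equivalence of the two semantics that this section relies on.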
of Q into variable mapping values every of us consider c-id) of logical terms assignment to a tuple to which in Q as above of Q such the defined An assignment mapped semantics is mappings mappings results 3.2.1 tive assignment (b) {(Boeing, : – c-id), to these to wing. in evaluation (s-id, by takmappings to Boeing, to engine; ment 3.2 can be as: Query re is the assignment Let two is mapped in Example and is obtained all database are van’ables. query due D, O is any D is, Q(D) over a database where D. is given expressed over 1-1oro, of Q into union results Q by upon in the of variables. selected the is the look of literals equalities occurrences X be called sometimes or set ing a query is given due to 0. That is the . . . . C~(X~) are no explicit representation; tured head will bag and Query(X) C’l(X1), (we as the Note of variables, names. and query itself body). are tuples of Q(D), equality a dis- is a corresponding such as X an equivalence We select equivalence to predicate class, by its representative. = Y. rela- a represent and a- replace 4. The the distinguished variables representatives of the respond in the head variables to the attributes in the An are that important ment cor- for relationship colundist. observation conjunctive than is that queries bag contain- is a strictly set containment for stronger conjunctive queries. Example 3.7: ample 3.1. CITY Assume as the first SUPPLIER step, the the the of SUPPLIER attributes query in Ex- has ID and schema second attributes relations. s-id, variables of Part. In the we create the c-id p-id (a) first for and second the Proposition 4.1: of the In variables and transformation, SQL the the PART we introduce the the that and and tributes for Consider at- For any ifQ <b Q’ c-id), step (b) ., of There exist such body Part(p-id, that the cl.-id third step, to obtain we apply the the The c.id), head result Query the equality c-id = ing Query Part(p.id, (s-id, c-id). 
(s-id, us assume p-id) p-id) in the fourth Part (p-id, c-id) Q database Z’ransform(Q) obtained from denotes an SQL the query 3.8: Let Q be an SQL the Transform(Q) about results to any be easier the of of Therefore, for D Q the the and prove our rest syntax of this semantics. paper, of Q is bag contained denoted by Q <b Q’, we use the D. In contrast, be denoted ~b Q’(D) set containment by Q ~. result of Q has two of Q’ has exactly (bag) us consider, result Q ~b Q’. however, [1]), (q(a); a [2])}. tuples, one B Q’, any would the that a conjunct distinguished next by be of set mappings. each conjunct to a It in the of Q. an identity there Q’ of Q’ to vari- in the body The mapping is known that is a containment to Q (see [CM77]). can we strengthen proposition queries. 63 to a characterization Proposition of Q in Q’. u maps when Q obtained a query variables. precisely from from u of variables of Q such bag a characterizabe of cent ainment of Q’ into Q’ set contain- characterization mapping is required that than expect known in terms subsection would ables How over one containment Q is a mapping the Containment stronger body for query for is strictly Thus, query Results in another if Q(D) {(p(a); A containment Conjunc- and Let tuples seen in the previous of bag Q <s Definitions A query will q(X) Q’. Conditions get Q’ following the (bag) containment problems and S, 4.2 mapping database p(x) strengthening results Queries Basic P(X), Therefore, ment. are equal. and equivalence logical Containment 4.1 the consider :- tuple. mapping tive us :- containment and latter. 4 Let Q(X) the We have conjunctive applying database to state containment terms by the follow- 4.l(b) Q’(X) with the tion in Q’, Q transformation. Then It will and : – c.id), that query query. Proposition 4.2: whereas Theorem Q <b Q’ does but Q example. Then, above queries holds, queries: 1 by the Q’ cl-id). Clearly, Let Q’, is: Supplier(s-id, conjunctive and Q’. hold. Example We add Q also Q <. 
body Suppiier(s-id, step. queries then conjunctive Q <, We illustrate In conjunctive holds, cl-id not Suppiier(s-id, two this of bag provides 4.3: Let characterization containment? to The a clue. Q and Q’ be conjunctive (a) If Q <h Q’, then the number no less than query (b) for any relation ofp-conjuncts name in the query the corresponding p, Theorem is queries. IfQ relation name, Q’ number in the Q. tainment If Q <h Q’, there then for every M a containment to Q such that conjunct mapping The above Q’ 1 E u(Q’). proposition between tainment. If set Q <, less constraining Q’, the there are conjuncts is for Q to be contained Q’, however, be fewer Q’ To get Q’ result bag yields and containment, juncts. As independently nan [I R92]. We riow of “coverage” show yields that ple, Q <~ Example 4.fi enough that a strong It con- requires enough queries. Q’ If there onto Q, Q Q’ for the following two r(Z) Q(X, Z) a p(x), q(U, X), q(V, Z), T(Z) is easy that that from there Q’ is no to Q. onto We con- can show, Q <b Q’. definition of containment over show all that this for a method this bag to test such as the quan- full paper, can over somea finite the multiplicities While procedure the quantification where symbolically. cases, In quantification by of databases, given involves databases. be replaced number from following observe however, tification to Q is onto mapping to mapping provide the us consider q(V, Y), decision Consider containment H Q <h Q’. 4.5: is no onto q(U, Y), tain Example there exam- bag be conjunctive is a containment then condi- In the following to Q. Let tainment notion condition and Q’ of an onto a necessary p(X), we will Q but existence is not discov- Ramakrish- z times Let 4.6 were and Z) that if Q ~b a(Q’). 4.4: Q. Theorem the the same M a con- Q’(X, The Proposition onto with there multiplicity, shows, u from iff queries. would in Q by the conjuncts mapping Q’, be conjunctive in Ioannidis cent ainment. from Q’ Q’ however, mapping containment. 
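A containment mapping from Q' to Q, and the stronger requirement that it be onto, can be searched for by brute force. The sketch below is an illustration under stated assumptions: all arguments are variables, the head of the source query must map positionally to the head of the target query, and "onto" means that every conjunct occurrence in the body of the target is the image of some conjunct of the source. The function names and the two small queries are hypothetical, and the search is exponential, in line with the NP-completeness results discussed in the text.

```python
from itertools import product

def containment_mappings(src, dst):
    """Yield (sigma, choice) for each containment mapping from src to dst.

    A query is (head_variables, [(predicate, argument_variables), ...]).
    choice[i] is the index of the dst conjunct that conjunct i of src maps to.
    """
    (src_head, src_body), (dst_head, dst_body) = src, dst
    if len(src_head) != len(dst_head):
        return
    candidates = [
        [j for j, (pred, args) in enumerate(dst_body)
         if pred == s_pred and len(args) == len(s_args)]
        for s_pred, s_args in src_body
    ]
    for choice in product(*candidates):
        sigma = {}
        pairs = [(src_head, dst_head)]
        pairs += [(src_body[i][1], dst_body[j][1]) for i, j in enumerate(choice)]
        # sigma must send each source variable to exactly one target variable
        if all(sigma.setdefault(u, v) == v
               for us, vs in pairs for u, v in zip(us, vs)):
            yield sigma, choice

def has_containment_mapping(src, dst):
    return any(True for _ in containment_mappings(src, dst))

def has_onto_containment_mapping(src, dst):
    targets = set(range(len(dst[1])))
    return any(set(choice) == targets
               for _, choice in containment_mappings(src, dst))

one = (("X",), [("p", ("X",))])                    # Q1(X) :- p(X)
two = (("X",), [("p", ("X",)), ("p", ("X",))])     # Q2(X) :- p(X), p(X)
```

For these two queries there are containment mappings in both directions, so they are set-equivalent, but only the mapping from the two-conjunct query to the one-conjunct query is onto. By the sufficient condition stated in the text, that gives bag containment of the one-conjunct query in the two-conjunct query, not the converse.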
A containment from mapping bag Q’ it to ensure have a sufficient general, for multiplicity. enough 4.3 of the conjuncts Q <b 4.4 and ered tion and conjuncts consequently lower Q’ should is the conjuncts there and high Proposition ‘(coverage” in Q. that “easier” we need with means the that of of Q’ Thus, Fewer means tuples body con- Q then mapping containment dif- bag of Q. Q’, mappings, in t uples that the body in a basic and in Q’. would assignment would out then than fewer in brings containment Let has no two Proposition 1 in Q, u from In ference 4.6: does not containment, bag are yield it containment queries a does in cer- in Example 4.7. three queries: Complexity 4.3 Q(X, Y) :- s(X, Z), t(iV, Y), Q’(X, Y) :- s(X, Z), t(Z, Y) Q“(X, Y) :- s(X, Z), t(Z, Y), t(Z, The complexity from Q Q onto to Q“ that Q“. ~, also Q’ and Thus, Q. tainment there On u(Y, out that a containment a containment we the mapping turns is have that W) Proposition Q’ hand, there from Q onto Q“. ~b Q. mapping mapping other Q“ of set be NP-complete lated Observe <b of Containment W’) from ilar Q and is no con- Indeed, Since Theorem characterization is known the 4.6 to condition are of closely re- of set containment, that this condition The problem has a sim- complexity. 4.8: whether 1 and surprising Theorem it 4.4 to the it is not containment [CM77]. there conjunctive ts a conjunctive query Q’ onto of determining mapping a conjunctive from a query Q is NP-complete. The be condition necessary conjunctive of Proposition and sufficient 4.4 for turns a large out to class of The queries. tainment 64 suggestion given of by the intractability NP-completeness of set con- result if [CM77] is somewhat plexity is is in terms typically database. misleading, much smaller test if Q <, To apply Q’ yields the goal arise to the in practice, body of Q. 
this full paper rithm for testing than Q’, the com- the of the have is quite describe Lemma this queries of algo- of onto containment existence of onto containment that mapping the is in general so Theorem complexity now of bag describe in terms tells oracle Turing St77] second bound level reader It Q’(X) Clearly, The is referred is in is believed of returns to cates. that bound 4.9: The for bag con- bag containment Q’ problem is saw harder than complexity of problem. set containment containment even know is The pre- is an open containment. bag We do not bag if the 4.3) Q’. Over but the this Q’ returns Q #b Q’. database database, four Q dupli- , problem for bag containment in a conjunctive of Q by such a sufficient Q’. We show coverage had (Proposi- for such We now of query coverage condition 4.4). two zsomorphac ment 4 that Q that coverfor condition bag is neces- sufficient. conjunctive iff mappings there from queries are Q and one-to-one Q’ onto Q and Q’ containvice versa. is 5.2: Let Q ~b Q’ iff Q and Q and Q’ Q’ be conjunctive are isomorphic. ConjuncCorollary tive us consider [2])}. a simple and queries. of P(X) “coverage” and Theorem Equivalence se- Q and p(x), p(x) conditions We say that are decidable. 5 queries : – query requires sary that is equivalence :- in Section equivalence 4.9 suggests semantics the set-theoretical the duplicates age (Proposition 11~-hard. Theorem under Let Therefore, We tion cise two necessary Theorem Q’. of {(p(a); a conjunctive lower than to of con- the tainment. indeed Q =, consisting in H;. the similar equivalence bag-theoretic Consider Q(X) problem II; when We in terms class hierarchy. state this 5.1: the general. results, the property queries neces- about hierarchy. The contained We can now for the details. 
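Theorem 5.2 reduces bag equivalence of conjunctive queries to an isomorphism test. A brute-force version of that test is sketched below, assuming all arguments are variables and heads are matched positionally: it looks for a bijection between the conjunct occurrences of the two bodies that is induced by a one-to-one renaming of variables. The name `isomorphic` and the sample queries are hypothetical; the search is factorial in the number of conjuncts and is meant only to make the definition concrete.

```python
from itertools import permutations

def isomorphic(q1, q2):
    """True if q1 and q2 are identical up to variable renaming and conjunct order.

    A query is (head_variables, [(predicate, argument_variables), ...]).
    """
    (head1, body1), (head2, body2) = q1, q2
    if len(head1) != len(head2) or len(body1) != len(body2):
        return False
    for perm in permutations(range(len(body2))):
        sigma, consistent = {}, True
        # treat the heads as one extra "conjunct" with a shared dummy predicate
        pairs = [(("", head1), ("", head2))]
        pairs += [(body1[i], body2[j]) for i, j in enumerate(perm)]
        for (p1, a1), (p2, a2) in pairs:
            if p1 != p2 or len(a1) != len(a2):
                consistent = False
                break
            for u, v in zip(a1, a2):
                if sigma.setdefault(u, v) != v:
                    consistent = False
                    break
            if not consistent:
                break
        # the renaming must be one-to-one
        if consistent and len(set(sigma.values())) == len(sigma):
            return True
    return False

q_single = (("X",), [("p", ("X",))])                   # Q'(X) :- p(X)
q_double = (("X",), [("p", ("X",)), ("p", ("X",))])    # Q(X)  :- p(X), p(X)
q_a = (("X",), [("p", ("X", "Y")), ("q", ("Y",))])
q_b = (("X",), [("q", ("Z",)), ("p", ("X", "Z"))])     # q_a renamed and reordered
```

The pair `q_double` and `q_single` is set-equivalent but not isomorphic, which agrees with the example in this section where the database {(p(a); [2])} separates the two queries.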
of the not is defined machines; for is strictly in polynomial-time hierarchy [GJ79, but us nothing containment a lower of the polynomial-time NP sufficient 4.8 that under conjunctive Example Recall to prove stronger precisely <~ Q hold. mantics. mappings. sary, queries a strictly practical. existence Q’ 4.1, showing junctive that a similar Q -b Q’ holds Q’. clearly, Q <h Q’ and It is straightforward to see whether algorithm we will size For many Q = both which we simply of Q and tuple In the since of the size of the queries, Queries 5.3: querzes Bag equivalence is polynomially of conjunctwe equivalent to graph iso- morphism. The focus that In on the and containment tion vates the A query Q’(D). setting both We is quite equivalence any database If denote equivalence Q and it by fact Corollary equivalence D, queries are NP- equivalence in Sec- ter problem in the bag- morphism difficult. This moti- are •~ of Q and to another bag Q’. Q’ In will The query crux tive equivalent, we placement contrast, the denoted lent queries 65 to here, of the in NP, [GJ79]. misses, in of a conjunctive optimizing query it is known with that setting query by a smaller for it isois not Focusing however, set-theoretic set the lat- graph but of than While [CM77], be matter in the equivalence easier queries. be NP-complete conjunctive conjuncts: by bag perhaps is NP-complete is known to us that is of conjunctive complexity =, be Q(D) tells queries problem known the we have that 5.3 conjunctive directly. Q’ Q the containment. however, saw, Q is bag equivalent Q’ iff over from to containment semantics studying stems reducible of conjunctive [CM77]. 4 that theoretic set is set-theoretic complete will containment equivalence every the on point. 
conjuncis the re- an equivanumber conjunctive of — query Q there tive query other is a mmzmally Q’, i.e., conjunctive than Q’ orem however, bag-equivalent up optimization the and tion class ttes conjunct An Theorem 5.2 of conjunctive the {(Boeing, are and the relation the ive queries not carry in over with Then, ques- Seatt/e; [1]), to the first SQL query SQL provides obtained SQL and the ability by second SQL the Therefore, statement for Semantics to take evaluating union individual bag union of the bags queries. is given ALL A and B are SQL compatible, tributes i.e., (the quired corresponding to be type A and B are SQL sume that evaluating lations the yields TA A UNION union same TB are by: A union our queries. the U a database. Then, and TB. 6.1: Consider schema three relations PART (ID, CITY) , CAPITAL (CITY, the of the [1])}. and Semantics expressions is an ex- u . . . UQn(X), tween by for the re- SQL query database CITY), arity the SQL The for result of all set of U over equivalence and the logical a be- approach extended of conjunctive to queries. We can represent in Example and same can be easily union 6.2: the The a~p~oach queries query and the SQL 6.1 by QI (1) U Q2(I), query where given consists SUPPLIER a conjunctive same variables. conjunctive equivalence by taking is the D is U1<i<nQi(D). Example The in form Qi has database us as- obtained each Qi’s given Example results [2]), (.fiap; of conjunctive distinguished purpose, Let relations B is obtained below. query Syntax of the where of at- are also re- For ALL of TA relation [11)} combined Logical 6.2 The number attributes conjunctive and the and are union- compatible). A and B over for bag the relations B statements contains the [2])}. 
query Q1(X) where [1])} I pression A UNION Pittsburgh; yields {(engine; {(engine; Syntax [1]), the Queries SQL tuples: Portland; [1]), (brake, {Uk 6.1 [1])} (~iap, Seattle; inequalz- Conjunctive of SUPPLIER for PART has the following Seatfie; (engine, The Union for applicable interesting queries {(engine, bssic [K88]. 6 relation are identical Thus, is simply setting. is whether larger for that tuple The- queries they reordering. bag-theoretic to conjunctive setting us assume has the no to Q has fewer According when technique set-theoretic in the two Let conjunc- to Q and equivalent [CM77]. precisely to renaming equivalent is equivalent query conjuncts 5.2, Q’ of Q~(l):-part(l, (ID, C), supplier(l, QZ(l):-part(l, C) C), capita/(C, N) COUNTRY). 9 SELECT FROM PART . ID PART , SUPPLIER WHERE PART . CITY = SUPPLIER. Equivalence 6.3 Sagiv ALL for UNION and PART . ID FROM PART , WHERE PART . CITY must Q <~ Qj. COUNTRY optimizing = COUNTRY. CAPITAL in the 66 Yannakakis conjunctive there SELECT and Containment CITY exist This [SY81] have shown that if <~ Ui Qi, then queries, some suggests a union set-theoretic Qj Q in the union the following Ui Qi setting: such approach of conjunctive that to queries (a) eliminate redundant eliminate Qi conjunctive if Qi ~~ Qj queries, for some Example i.e., j # i, and which then (b) replace each a minimally Assume giv The query. that did <~ answers (b) junctive mizable Qj applicable, query in the union theoretic that does carry not assume that plicates and consists ated due cates that Q’ and 6.3: and Q“ First, to Q’(X, Q“(X, each such duplilater by as COUNT or AVERAGE. : –Student(id, age) 7.1 not Containment conA query mini- Q is bag-set denoted any set-valued of Sagiv to the set bag- Q’, database containment between contained by Q Sbs in another if Q(D) D. is an bag There such that are conjunctive queries It turns and query Lb Q’(D) out intermediate containment Q <h Q’ LIQ”, 7.2 Example but Q -fb only Consider 7.1. 
those The over that bag- relationship set containment. a variant query, students following queries satisfy the given who of the query below, in considers are also employed. : -p(X), q(U, X), q(V, Z), r(Z) Z) : -p(X), q(U, X), q(V, X), r(Z) q(V, Z), r(Z) : -p(X), q(U, Z), a student Example failure open mization of the for step in Sagiv-Yannakakis’ possibility unions that a characterization of conjunctive of of Theorem meaningful conjunctive would be of bag equivalence may 7.3: Consider will use the obtain for unions no duplicates. tainment An and tions in the arise often set-valued is a set, important equivalence database but the Q jobs. following age) Student(id, age), It is not hard to verify that database relations <~, Q’, S queries. Female(id) Q <b$ Q’, but Q <b 1 When the condition of Proposition are 4.3 has set-valued, to be weak- ened. Set-valued analogously. Q’, multiple Student(id, the Databases term that <. :- Proposition a relation jobtitle) A to queries. Set-Valued to Q have : – Q’. opti- queries. direction Emp(id, Q(id) I The age), It is easy to see that Z) Z) : –Student(id, claims Q’(id) We the the Second, proposition. Q(X, 7 that I since first and Age be gener- if Q(age) The leaves ID may Q $b Q“. Proof: of the attributes can no du- can be processed function Example Q’ contains Observe are generated query, We semantics. Proposition Q, following students. duplicates to projection. Q’(age) Q’, over of the However, the even the result the of all the STUDENT relation an aggregate 5.2. however, age set- answer. since Consider the respectively. bag- over is by itself to Theorem Yannakakis carry contributes is not It so happens, the since, in the above according to to be negative. tuples by of Sa- bag-theoretic applicable, Q~ and of the over the seems Qj , both result we then to is not the carry Could above multiplicity and conjunctive technique (a) step equivalent setting. optimization Qi query Yannakakis theoretic ting? 
conjunctive hypothetically and step remaining 7.1: returns are i.e., relation databases special arises are case when set-valued. Let If Q <b, Q, is a containment there Q such to refer a relation 7.4: queries. that Q’, Q and then for Q’ be conjunctive every mapping vartable u from v in Q’ to v E u(Q’). with Example defined vide of conthe This rela- ment. case says that then in practice, 67 7.3 provides a sufficient The sufficient for Q <b Q’. us condition if the with for condition containment Certainly, this a clue bag-set to in Proposition mapping is also pro- contain4.4 is onto a sufficient condition for the restricted weaken the condition containment onto mapping the where in the query However, 7.2 we can restricted u from if V ~~ u(V’) of variables caee. for case. A query Q’ to Q is variable- V And Q and V’ Equivalence A are the Q’, set Q’(D) Q’ respectively. Q is bag-set denoted for any Proposition queraes. Q’ 7.5: If there Let Q and Q’ be conjunctive is a containment variable-onto Q, then mapping Q <b, Example Q 7.6: the is Consider only from Example Therefore, variable-onto. Proposition 7.5 that We note that Proposition bag-set 7.3. mapping it Q <h$ Q’. it follows 7.5 is not Observe from Q’ follows from Example 7.10: from complexity is similar in nature whet her Thus, there the following there one query to the problem is an onto is a following two bag-set : – l(x, -Z),P(X> Y) :- P(X, Y) difference between equivalence is that under tuple Q’ bag-set equiv- v bag duplicate equivalence Iiterals are equivalence, since each one. We representation literals and duplicate has multiplicity is a canonical Q if all of determining is not the not will say of a query are removed from Q. to another cent ainment result that but Y) database whether from and for of Containment of determining Observe Q(X, A key mapping equivalence 4.7 that condition that variable-onto with is an interme- bag Q(X, redundant The =h As alent. bag-set Complexity equivalence between query Q(D) D. 
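Over set-valued databases every stored tuple has multiplicity one, so a repeated literal in a query body contributes nothing, and the canonical representation simply drops the repeats. A minimal sketch of that step follows; the function name `canonical` and the example query are hypothetical, and literals are compared syntactically (same predicate and same argument variables).

```python
def canonical(query):
    """Remove duplicate literals from the body, keeping first occurrences in order.

    A query is (head_variables, [(predicate, argument_variables), ...]).
    """
    head, body = query
    seen, kept = set(), []
    for literal in body:
        if literal not in seen:
            seen.add(literal)
            kept.append(literal)
    return head, kept

# Q(X, Y) :- p(X, Y), p(X, Y)   has the canonical form   Q(X, Y) :- p(X, Y)
q = (("X", "Y"), [("p", ("X", "Y")), ("p", ("X", "Y"))])
q_canonical = canonical(q)
```

Bag-set equivalence can then be tested by comparing canonical representations, which is the only optimization Theorem 7.11 leaves available when the relations are known to be set-valued.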
to containment. 7.1.1 that database are set equivalent ~ a necessary to another if we have set equivalence. Q’. Example containment bag-set relationship queries that Q’, set-valued cent ainment, diate equivalent Q =h~ Theorem mapping. 7.11: queries. surprising. Q; QI are ~b, Let Q2 canonical Q1 iff Q; and Q2 ~h Q; be conjunctive representations where Q; of QI and and Q2 respectively. Theorem 7.7: whether there conjunctive query problem of a conjunctive query Q’ determining mapping variable-onto from Corollary a tive a conjunctive that is a sufficient condition complexity lowing condition and of bag-set Proposition tween the of Theorem does not tell and The a connection bag-set Theorem 7.7 timization, us about containment. establishes bag containment 7.12 queries Bag-set equivalence is polynomzally of conjunc- equivalent to graph isomorphism. Q is NP-complete. We observe the The is erals, fol- is possible set-valued. be- containment. 7.11 shows namely of conjunctive only some of the only very of removing in the In the full t ion that that limited lit- relations are case where paper, queries we discuss over relations are op- duplicate optimiza- databases known where to be set- valued. Proposition tion 7.8: There of bag containment is a polynomial to bag-set reduc- containment. 8 From lowing Proposition result Theorem 7.8 and Theorem follows. 7.9 The bag-set Related containment problem is 11~-hard. Bag containment tive queries As in the for unrestricted bag-set containment case, the remains decision prob- 68 and equivalence first addressed [D GK82]. These notions by Klausner [K86] 1 in relational 1Klausner et al.. open. were al. tended lem Work 4.9 the fol- algebra also corrected the with were for conjunc- by also context Dayal of additional the earlier et addressed results an ex- control by Dayal over duplicate the query are harder tained elimination than by Recently, in our model, and the conditions consider the results than setting our aspect ob- [IR92] of ours. 
[ASU79A] Aho results. in- V., SIAM Sagiv of Journal [ASU79B] do and Aho [CM77] V., Sagiv Queries,” Chandra Remarks paper we studied conjunctive the semantics. We niques the set-theoretic from over to the bag containment to be showed on Merlin As queries sible in the setting is an a posteriori on join nation in commercial (See We rem [S*79, have shown does leaving for lence the database that open of the of queries than database not unions we discussed rather [DBS90] pos- on join the [GJ79] em- have a discussion our attention problem with the need of optimizing tics. Umesh Dayal work in this area. comments This Waqar to [IR92] Garey on an earlier a M., and equiva- 1992. setting where [JK84] was inspired by who brought to version [K86] on pp. Sagiv Y., of 3rd 1990, Johnson 117- “Op- conjunc- Int ’1 Conf. pp. bag D. on 455-469. M., E., Surveys 16:2, Klausner A., 1986. Klug A., and Theory and of Co., Ramakrishnan Technical Re- Wisconsin-Madison, J., “Query Systems,” pp. R., of Conjunctive Science of Koch Database the Freeman Containment Computer sity, to 1979. University Jarke S., Computers Guide W. Y. Databases,” practical us by pointing Albert H., with Elimination,” J., Proc. A Queries,” work with R. Symposaum subclass Theory, Ioannidis port, the Katz Systems, Biskup of queries,” Finally, Hasan queries helped Joseph P., “Generalized no duplicates. address Standard, Algebra ACM of Database Dublish in Acknowledgement N., Duplicate NP-completeness, sys- of optimiza- case of containment to the SQL First Intractability: Theo- queries. in the bag-theoretic relations over Database bag-theoretic possibility New 1982. tive elimi- Sagiv-Yannakakis’ conjunctive in ACM This current management to the 9th Computing, Relational San Francisco, extend Proc. Goodman the timization JK84]). [SY81] setting, tion ordering are is~ bag- of the U., of 123, in the of conjuncts. justification “Optimal queries 1987. Extended Principles it is not by Dayal Proc. 
conjunctive queries phasis tems setting, Wesley, Control further they conjunctive removal that conjunctive when P. M., of on 77-90. C, J., A Guide “An contain- found unlike set-theoretic [DGK82] seems set two precisely a consequence, to minimize theoretic We setting are equivalent morphic. We found than Date Addison carry queries queries. bag-theoretic [D87] Theory pp. D., 435-454. of conjunctive 1977, J. of Re- tech- do not conjunctive harder of conjunctive queries optimization setting. of prob- bag-theoretic setting bag-theoretic in the under that computationally ment that optimization queries Unman pp. databases,” Symp. York, for 218- Transactions 4:4, A. K., relational this D., pp. of a Class ACM Systems, Implementation In 8:2, Y, Optimization lational com- plexity. lems J. Expressions,” of Computing A. “Efficient nor of computational Concluding Unman 246. Database !3 Y, Relational some They problem, A. “Equivalence contain- and found to equivalence the problems problem similar the study and References As a result, Ramakrishnan addressed in the bag-theoretic do they SQL. equivalence are weaker Ioannidis sufficient not and Klausner dependently ment than containment Optimization ACM Computing 111-152. “Multirelations Ph .D thesis, in Relational Harvard Univer- semanto past has given [K88] containing useful pp. of the ‘draft. 69 146-160. “On Conjunctive Inequalities,” J. ACM Queries 35:1, [MPR90] Mumick nan R., I. S., Pirahesh “The gregates,” Proc. ference, [NPS91] Negri M., of the 1990, Pelagatti Transactions 16:3, 1991, pp. P. Selinger lection ment. Proc. and VLDB pp. 264277. of AgCon- Sbattella SQL L., Queries,” on Database Systems, 513-534. G. et.al.: in a Relational ference Ramakrish- 16th G., Semantics ACM H., of Duplicates Brisbane, “Formal [S*79] Magic Access Path Database of the ACM on Management Se- Manage- SIGMOD of Data, ConJune 79, pp.23-34. [St77] Stockmeyer L. J., hierarchy”, ence, [SY81] Vol Sagiv pp. J. 
D., Science Computer M., Sci- “Equivalences expressions difference Knowledge-base puter polynomial-time 1–22. Yannakakis and 27,1980, Unman pp. Relational union [U89] 3, 1977, Y., among “The Theoretical with operators,” the JACM 633-655. Ptincip/es Systems, Press, of Database Vol 2, and Com- 1989. 70