Download E - Read

PART 4 DATA STORAGE AND QUERY Chapter 14 Query Optimization Main parts in Chapter 14    §14.1 Overview  why optimization needed §14.2 Transformation of relational expressions  equivalence rules §14.4 Choice of evaluation plans — Query optimization  cost-based optimization,§14.2  heuristic optimization, §14.3 May 2008 Database System Concepts - Chapter 14 Query Optimization - 3 §14.1 Overview A SQL query statement may corresponds to several equivalent expressions ( or query evaluation plans ), each of which may have different costs  Query optimization is needed to choice an efficient queryevaluation plan with lower costs   E.g.1. select * from r, s where r.A = s.A σr.A=s.A (r ╳ s) is much slower than r May 2008 s Database System Concepts - Chapter 14 Query Optimization - 4 §14.1 Overview  E.g.2 Find the names of all customers who have an account at any branch located in Brooklyn  select customer-name from branch, account, depositor where branch.name=account.name AND account.number =depositor.number AND branch-city = Brooklyn  May 2008 Fig.14.1 Database System Concepts - Chapter 14 Query Optimization - 5 Fig. 14.1 Equivalent Expression 14.1 Overview (cont.)  The procedures of optimization  generating the equivalent expressions/query trees, by transforming of relational algebra expressions according to equivalence rules in§14.2  generating alternative evaluation plans, by annotating the resultant equivalent expressions with implementation algorithms for each operations in the expressions  choosing the optimal (that is, cheapest) or near-optimal plan based on the estimated cost, by  cost-based optimization  heuristic optimization May 2008 Database System Concepts - Chapter 14 Query Optimization - 7 14.1 Overview (cont.)  The cost of operations is needed to estimate based on statistical information about relations  e.g. number of tuples, number of distinct values for join attributes, etc. May 2008 Database System Concepts - Chapter 14 Query Optimization - 8 §14.2 Transformation of Relational Expressions  Definition. Two relational algebra expressions are equivalent, if on every legal database instances, the two expression generate the same set of tuples May 2008 Database System Concepts - Chapter 14 Query Optimization - 9 14.2.1 Equivalence Rules Fig.14.0.1 Schema diagram for the banking enterprise May 2008 Database System Concepts - Chapter 14 Query Optimization - 10 14.2.1 Equivalence Rules (cont.)  Rule1.(选择串接律, 将1个选择操作分解为2个选择操作) Conjunctive selection operations can be deconstructed into a sequence of individual selections    ( E )    (  ( E )) 1   2 1 2 e.g. refer to Rule2. (选择交换律) Selection operations are commutative :   (  ( E ))    (  ( E )) 1 May 2008 2 2 1 Database System Concepts - Chapter 14 Query Optimization - 11 14.2.1 Equivalence Rules (cont.)  Rule3. (投影串接律) Only the final operation in a sequence of project operations is needed  L1 ( L2 ( ( Ln ( E )) ))   L1 ( E )   May 2008 note: it is required that L1⊆ L2⊆ ….⊆ Ln e.g.  {loan-number} {  {loan-number, amount} (loan) } =  {loan-number} (loan) Database System Concepts - Chapter 14 Query Optimization - 12 14.2.1 Equivalence Rules (cont.) Rule4. (以连接操作代替选择和笛卡尔乘积) Selections can be combined with Cartesian products and theta joins a. (E1 ╳ E2) = E1  E2 note: definition of  join b. 1(E1 2 E2) = E1 1 2 E2  e.g.  branch-name=“Perryridge”( borrower loan) = borrower (branch-name=“Perryridge)  (borrower.loan-numbr=loan.loan-number) loan  By Rule4, the number of operations can be reduced, and the costs of the left-hand expressions are less than that of the right-hand expressions  often used in heuristic query optimizing  May 2008 Database System Concepts - Chapter 14 Query Optimization - 13 14.2.1 Equivalence Rules (cont.)  Rule5. (连接操作可交换) Theta join operations are commutative E1  E2 = E2  E1  Fig. 14.2 May 2008 Database System Concepts - Chapter 14 Query Optimization - 14 14.2.1 Equivalence Rules (cont.)  Rule6. (连接操作的结合率, associative) a. natural join operations are associative (E1 E2) E3 = E1 (E2 E3) b. Theta joins are associative in the following manner: (E1 1 E2) 2  3 E3 = E1 1 3 (E2 2 E3) , where 2 involves attributes from only E2 and E3 May 2008 Database System Concepts - Chapter 14 Query Optimization - 15 Rule 5 Rule 6a Rule 7a Fig.14.2 Pictorial Depiction of Equivalence Rules 14.2.1 Equivalence Rules (cont.)  Rule7. Selection operation distributes over theta-join(选择操作对于连接操作的分配率,选择条件下移)  a. when all the attributes in 0 involve only the attributes of one of the expressions , e.g. E1, being joined 0E1   May 2008  E2) = (0(E1))  E2 b. when  1 involves only the attributes of E1, and 2 involves only the attributes of E2 1 E1 e.g. in Fig.14.0.2  E2) = (1(E1))  ( (E2)) Database System Concepts - Chapter 14 Query Optimization - 17 When Rule7 is applied, the size of relations participating in theta-join operation is reduced, thus evaluation costs are reduced  branch account Fig.14.0.2 An example of Rule7 14.2.1 Equivalence Rules (cont.)  Rule8. Projection operation distributes over theta-join(投影操作对于连接操作的分配率,投影下移)  a. L1 and L2 are the attributes of E1 and E2 respectively. if join condition  involves only attributes in L1  L2, then  L1  L2 ( E1.......  E2 )  ( L1 ( E1 ))......  ( L2 ( E2 )) May 2008 Database System Concepts - Chapter 14 Query Optimization - 19 L1 L2  {branch-name, branch-city}  {branch-name, amount} branch loan {branch-name, branch-city}  {branch-name, amount} branch E1: branch (branch-name, branch-city, assets) E2: loan(loan-number, branch-name, amount) Fig.14.0.3 An example of Rule 8a loan 14.2.1 Equivalence Rules (cont.) b. consider a join E1  E2  let L1 and L2 be sets of attributes from E1 and E2, respectively  let L3 be attributes of E1 that are involved in join condition , but are not in L1  L2, and  let L4 be attributes of E2 that are involved in join condition , but are not in L1  L2.  L1  L2 ( E1..... E2 )   L1  L2 (( L1  L3 ( E1 ))......  ( L2  L4 ( E2 ))) May 2008 Database System Concepts - Chapter 14 Query Optimization - 21 L1 L2  {branch-city}  { loan-number} E1: branch (branch-name, branch-city, assets) E2: loan(loan-number, branch-name, amount)   {branch-city}  { loan-number} B. assets> L. amount L3 branch L4 loan B. assets> L. amount  {branch-city, assets} branch Fig.14.0.4 An example of Rule 8b  {loan-number, amount} loan 14.2.1 Equivalence Rules (cont.)  Rule9. The set operations union and intersection are commutative E1  E2 = E2  E1 E1  E2 = E2  E1 set difference is not commutative  Rule10. Set union and intersection are associative (E1  E2)  E3 = E1  (E2  E3) (E1  E2)  E3 = E1  (E2  E3) May 2008 Database System Concepts - Chapter 14 Query Optimization - 23 14.2.1 Equivalence Rules (cont.)  Rule11. (选择操作对于集合并、交、差的分配率) The selection operation distributes over ,  and –.   (E1 – E2) =  (E1) – (E2)  similarly for  and  in place of –   (E1 – E2) = (E1) – E2  similarly for  in place of –, but not for   Rule12. (投影操作对于集合并的分配率) The projection operation distributes over union L(E1  E2) = (L(E1))  (L(E2)) May 2008 Database System Concepts - Chapter 14 Query Optimization - 24 14.2.2 Example One    Find the names of all customers who have an account at some branch located in Brooklyn customer-name(branch-city = “Brooklyn” (branch (account depositor))) Transformation using rule 7a. customer-name ((branch-city =“Brooklyn” (branch)) (account depositor)) Performing the selection as early as possible reduces the size of the relation to be joined. May 2008 Database System Concepts - Chapter 14 Query Optimization - 25 14.2.2 Example Two  For customer-name((branch-city = “Brooklyn” (branch)   account) depositor) When we compute t ←(branch-city = “Brooklyn” (branch) account ) we obtain a temporal relation t whose schema is: T(branch-name, branch-city, assets, account-number, balance) Push projections using equivalence rules 8a and 8b; eliminate unneeded attributes from intermediate results to get:  customer-name (  account-number t(branch-name, branch-city, assets, accountnumber, balance) depositor(customer-name, account-number)) May 2008 Database System Concepts - Chapter 14 Query Optimization - 26 14.2.2 Example Three  Find the names of all customers with an account at a Brooklyn branch whose account balance is over $1000. customer-name((branch-city = “Brooklyn”  balance > 1000 (branch (account depositor)))  Transformation using join associatively (Rule 6a), refer to Fig.14.3 May 2008 Database System Concepts - Chapter 14 Query Optimization - 27 14.2.2 Example Three (cont.) Fig. 14.3 Multiple transformations May 2008 Database System Concepts - Chapter 14 Query Optimization - 28 14.2.3 Join Ordering   For all relations r1, r2, and r3, (r1 r2) r3 = r1 (r2 r3 ) If r2 r3 is quite large and r1 r2 is small, we choose (r1 r2) r3 so that we compute and store a smaller temporary relation. May 2008 Database System Concepts - Chapter 14 Query Optimization - 29 14.2.3 An Example    Consider the expression customer-name ((branch-city = “Brooklyn” (branch)) account depositor) Could compute account depositor first, and join result with branch-city = “Brooklyn” (branch) , but account depositor is likely to be a large relation. Since it is more likely that only a small fraction of the bank’s customers have accounts in branches located in Brooklyn, it is better to compute branch-city = “Brooklyn” (branch) account first. May 2008 Database System Concepts - Chapter 14 Query Optimization - 30 §14.3 Estimating Statistics of Expression Results  How to estimate the cost of an operation  estimating the size of the resultant relations produced by the operation p, according to statistics in the catalog about the relations on which the operation is conducted   May 2008 e.g. customer-name(balance2500(account) customer) computing the cost of the operation in terms of disk access, by means of the cost formulae shown in chapter13  disk access is the predominant cost, and the number of block transfers from disk is used as a measure of the actual cost of evaluation. Database System Concepts - Chapter 14 Query Optimization - 31 14.3.1 Statistics in Catalog for Optimization       nr: number of tuples in a relation r br: number of blocks containing tuples of r lr: size of a tuple of r in bytes fr: blocking factor of r  the number of tuples of r that fit into one block V(A, r): number of distinct values that appear in r for attribute A  same as the size of A(r) Statistics about indices, such as heights of B+-trees, that is HTi, and the number of leaf pages in the indices, are also in catalog May 2008 Database System Concepts - Chapter 14 Query Optimization - 32 Statistics in Catalog for Optimization (cont.)  If tuples of r are stored together physically in a file, then:     nr  br  f r  May 2008 Database System Concepts - Chapter 14 Query Optimization - 33 Statistics in Catalog for Optimization (cont.)  An example for statistical information  faccount= 20 (20 tuples of account fit in one block)  V(branch-name, account) = 50 (50 branches)  V(balance, account) = 500 (500 different balance values)  naccount = 10000 (account has 10,000 tuples)  assume the following indices exist on account +  a primary, B -tree index for attribute branch-name +  a secondary, B -tree index for attribute balance May 2008 Database System Concepts - Chapter 14 Query Optimization - 34 14.3.2 Selection Size Estimation   The size estimation of the result of a selection operation depends on the selection predicates, that is single equality predicate, single comparison predicate, and combination of predicates Equality selection A=a (r)  assuming uniform distribution of values in relations, and the value a appears in attribute A of some record in r  the selection is estimated to have  nr / V(A, r) tuples  nr /V(A, r) /fr blocks that these tuples occupy May 2008 Database System Concepts - Chapter 14 Query Optimization - 35 14.3.2 Selection Size Estimation (cont.)  Comparison selections AV(r)  the lowest and highest values of attributes, that is min(A, r) and max(A, r), are available in catalog  the estimated number of records/tuples satisfying comparison condition A V  0, if v < min(A, r)  nr, if v  max(A, r)  , otherwise v  min( A, r ) nr . max( A, r )  min( A, r )  May 2008 in absence of statistical information cost is assumed to be nr / 2 Database System Concepts - Chapter 14 Query Optimization - 36 14.3.2 Selection Size Estimation (cont.)  The selectivity of a condition i is the probability that a tuple in the relation r satisfies i . If si is the number of satisfying tuples in r, the selectivity of i is given by si /nr. Conjunction: 1 2. . .  n (r). The estimate for number of  tuples in the result is: n  s1  s2  . . .  sn r nrn Disjunction:1 2 . . .  n (r). Estimated number of tuples:   sn  s1 s2  nr  1  (1  )  (1  )  ...  (1  )  nr nr nr    Negation: (r). Estimated number of tuples: nr – size((r)) May 2008 Database System Concepts - Chapter 14 Query Optimization - 37 14.4 Choice of Evaluation Plans — Query optimization  How to generate the evaluation plan for an relational algebra expression with lower or the lowest cost  generating of the optimized query trees  selecting  implementation algorithms for each operations in the tree  pipeline or materialization strategy for the expression  E.g. Fig.14.5 May 2008 Database System Concepts - Chapter 14 Query Optimization - 38 Fig.14.5 An evaluation plan 14.4 Choice of Evaluation Plans (cont.)    Query optimization  choosing the evaluation plan with lowest or the lower cost Query optimization is conducted by two strategies/approaches  cost-based optimization,§14.4.2  heuristic optimization, §14.4.3! Practical query optimizers incorporate elements of these two approaches May 2008 Database System Concepts - Chapter 14 Query Optimization - 40 14.4.2 Cost-based Optimization  Cost-based optimization  search all the possible plans and choose the best plan  a optimal plan with the minimum costs can be obtained  Cost-based optimization is expensive, though results optimal plan May 2008 Database System Concepts - Chapter 14 Query Optimization - 41 14.4.3 Heuristic Optimization  Heuristic optimization  uses heuristics (or heuristic rules) to choose a better plan  a suboptimal plan with lower costs can be obtained  Principles: transforms the query-tree by heuristics to reduce cost  perform selection early to reduce the number of tuples  perform projection early to reduces the number of attributes  substitute Cartesian product and selection with join May 2008 Database System Concepts - Chapter 14 Query Optimization - 42 14.4.3 Heuristic Optimization (cont.)  May 2008 perform most restrictive selection and join operations before other similar operations  the most restrictive operations generate resulting relations with smallest size Database System Concepts - Chapter 14 Query Optimization - 43 Steps in Heuristic Optimization    Step1 使用rule1，将conjunctive selection 分解为多个单独的选择操作，以使单个选择操作尽可能沿查询树下移（尽早执行选择操作，以减少中间计算结果） Step2 根据选择操作的交换率和分配率，利用rule2, rule7.a, rule7.b, rule11，将查询树上的每个选择操作尽可能移向叶节点，以便尽早执行选择操作 Step3 根据连接操作的结合律和交换率，使用rule6，重新安排查询树中的叶结点，使得具有restrictive selection特征的叶结点先执行  restrictive selection：执行此操作后，产生的结果关系最小（所含元组最少）  refer to Fig. 14.3 May 2008 Database System Concepts - Chapter 14 Query Optimization - 44 Steps in Heuristic Optimization (cont.)    Step4 利用rule4.a, 以连接操作代替相邻的选择和笛卡尔乘积操作 Step5 利用rule3, 8.a, 8.b,12, 将查询树上的投影操作尽可能下移，以便尽早执行投影操作，减少中间计算结果 Step6 将最后的查询树分解为多个子树，使子树中的各操作可以采用流水线方式执行，以减少对外设的访问次数 e.g. Fig.14.5 重点：前5步！！！ May 2008 Database System Concepts - Chapter 14 Query Optimization - 45 Example One    Given  S(S#, sname, age, sex)  C(C#, cname, teacher)  SC(S#, C#, grade) Find all the students, who are male, and get grades more than 90 when they learn some one course Step1. SQL statement selection conditions  SELECT DISTINCT sname FROM S, SC WHERE S.s# = SC.s# AND sex=M AND grade > 90 join condition May 2008 Database System Concepts - Chapter 14 Query Optimization - 46 Example One (cont.)  Principles select A1, A2, …, An from r1, r2, …, rn where P corresponds to ∏A1, A2, …, An (σP ( r1 ╳ r2 ╳ …╳ rn ) ) May 2008 Database System Concepts - Chapter 14 Query Optimization - 47 Example One (cont.)  Step2. Initial Relational algebra expression  E1 = sname( s.s#=sc.s# ∧ sex=M ∧grade>90 (S ╳ SC ) )  Step3. Initial query tree  Fig. 14.0.5 (a), E1  Step4. By means of Rule1, Rule7b, distribute  operations over relation S and SC  Fig. 14.0.5 (b), E2 May 2008 Database System Concepts - Chapter 14 Query Optimization - 48 sname sname s.s#=sc.s# s.s#=sc.s# ∧ sex=M ∧grade>90 ╳ ╳ sex=M grade>90 SC (S#, C#, grade) S (S#, sname, age, sex) (a) Initial query tree E1 S SC （b）E2 Example One (cont.)    Step5. By means of Rule4.a, replace selection and Cartesian product with natural join  Fig. 14.0.5 (c), E3 Step6. !! By means of Rule8b, distribute  operations over relation S and SC  Fig. 14.0.5 (d), The final optimized expression E4 that is equivalent to initial expression E1 is  sname {sname, S# (sex=M (S)) grade>90 (grade, S# (SC)) } May 2008 Database System Concepts - Chapter 14 Query Optimization - 50 sname sname s.s#=sc.s# sex=M grade>90 SC (S#, C#, grade) S s.s#=sc.s# Πsname, S# ΠS# sex=M grade>90 S SC(S#, C#, grade) (S#, sname, age, sex) (S#, sname, age, sex) （c）E3 （d）E4 Example Two  Consider the following insurance database, where the primary keys are underlined,  Person(driver-id, name, address)  Car(license, model, year)  Accident(report-number, date, location)  Participated(driver-id , license, report-number, damageamount) May 2008 Database System Concepts - Chapter 14 Query Optimization - 52 Example Two (cont.)  Problems  Give a SQL statement to find out the driver name, license, report-number of the accidents which happened in Beijing and before May 3, 2008  Give the initial query tree for this query, and construct an optimized and equivalent relational algebra expression for it by means of heuristic optimization May 2008 Database System Concepts - Chapter 14 Query Optimization - 53 Example Two (cont.)   Step1. SQL statement  Select name, license, Accident.report-number From Person, Participate, Accident Where Person.driver_id = Participate.driver_id AND Accident.report_number = Participate.report_number AND Location=Beijing AND date < 5/3/2008 Step2. Initial expression  name, license, Accident.report-number ( Person.driver_id = Participate.driver_id AND Accident.report_number = Participate.report_number AND Location=Beijing AND date < 5/3/2003 (Person ╳ Participate ╳ Accident ) May 2008 Database System Concepts - Chapter 14 Query Optimization - 54  name, license, Accident.report-number  Person.driver_id = Participate.driver_id AND Accident.report_number = Participate.report_number AND Location=Beijing AND date < 5/3/2003 ╳ ╳ Person Participate Accident (a) initial query tree  name, license, Accident.report-number Accident.report.number = Participate.report_number  name, license, Accident.report_number report-number Person.driver_id = Participate.driver_id  driver_id, name Location=Beijing  driver_id, AND date < 5/3/2003 license, report-number Participate Accident (report-number, date ,location) Person (driver-id, license, report-number, damage-amount) (driver-id, name, address) (b) optimal query tree Example Three  Consider the following relations in banking enterprise database, where the primary keys are underlined  branch (branch-name, branch-city, assets),  loan ( loan-number, branch-name , amount)  borrower( customer-name, loan-number, borrow-date)  customer (customer-name, customer-street, customer-city)  account (account-number, branch-name, balance)  depositor (customer-name, account-number , deposit-date) May 2008 Database System Concepts - Chapter 14 Query Optimization - 57 Example Three (cont.)  For the query “ Find the names of all customers who have an loan at any branch that is located in Brooklyn and have assets more than $100,000, requiring that loan-amount is less than $1000”  give an SQL statement for this query  given a initial query tree for the query, and convert it into an optimized query tree by means of heuristic optimization May 2008 Database System Concepts - Chapter 14 Query Optimization - 58 Example Three (cont.)  May 2008 SQL select customer-name from borrower, loan, branch where loan.loan-number=borrower.loan-number and branch.branch-name=loan.branch-name and branch-city=”Brooklyn” and assets>100000 and amount<1000 Database System Concepts - Chapter 14 Query Optimization - 59  customer-name  customer- name, loan-number loan.loan-number loan.loan-number branch-name ,branch-name borrower ( customer-name, loan-number, borrow-date)  amount< 100 loan( loan-number, branchname , amount)  branch-city=”Brooklyn” AND assets>100000 Branch (branch-name, branch-city, assets) Example Four (cont.)     给定如下关系数据库 Employee(Employee#, Name, Address, Super-E#, E-D#) Department(D#, Dname, manager#, depart-location) Project(P#, Pname, Plocation, P-D#) 查询：对于每个在“哈尔滨”进行的工程项目，列出其工程项目号P# ，工程所属部门号P-D#, 该部门领导的名字Name 和地址Address 要求：写出该查询的SQL语句；转换为关系代数表达式并利用启发式方法进行查询优化；给出优化后的关系代数表达式。 Select P#, P-D#, Name, Address) From Project, Department, Employee Where Plocation =哈尔滨 AND P-D#= D# AND manager# = Employee#  P#, P-D#, Name, Address  employee#, name, address Employee  P#, P-D#, manager#  P#, P-D#  P-location=“哈尔滨” Project D#, manager# Department 作业        Consider the following schema, where the primary keys are underlined , Suppliers (supplier-id, supplier-name, city, telephone, address) Parts ( parts-id, parts-name, parts-color) Catalog (supplier-id, parts-id, price ) . (1) Give an SQL statement to find out the name and telephone of the suppliers who supply a red part whose price is below $2000. (2) Translate this SQL statement into an initial query tree, and give an optimized query tree for it, by means of heuristic query optimization. May 2008 Database System Concepts - Chapter 14 Query Optimization - 63 Appendix A Sybase 平台下的查询执行计划优化  Select max(BscpID) From BSC May 2008 Database System Concepts - Chapter 14 Query Optimization - 64 Appendix A Sybase 平台下的查询执行计划优化（续） May 2008 Database System Concepts - Chapter 14 Query Optimization - 65 Appendix A Sybase 平台下的查询执行计划优化（续）  查询实验内容  单表、多表（如3张表）的查询、插入、删除、修改所对应的查询执行计划比较  有无索引、索引类型对查询、插入、删除、修改等操作的代价的影响  查询结果相同、但表达形式不一样的多个等价的查询、删除、插入、修改语句的查询执行计划的比较例如。 delete from loan where amount between 0 and 50 May 2008 Database System Concepts - Chapter 14 Query Optimization - 67 Appendix A Sybase 平台下的查询执行计划优化（续） delete from loan where amount in ( select amount from loan where amount >= 0 and amount <= 50) May 2008 Database System Concepts - Chapter 14 Query Optimization - 68 Appendix B DDBS查询代价与数据分布的关系分布式数据库系统DDBS中, 数据在各站点的分布和站点间通讯代价/传输时间将影响到整个查询代价  例. 4  S(S#，SNAME，AGE，SEX) , 10 个元组，站点A 5  C(C#，CNAME，TEACHER), 10 个元组，站点B  SC(S#，C#，GRADE), 106个元组，站点A 4  元组平均长度100bit，站点间传输速率v = 10 bit/s ，通信延迟d=1s。假设无副本  查询要求  在支持分片透明性的DDBS（分布在A和B 2个站点）中，查询“所有选修“MATH“的男生的学号和姓名”  May 2008 Database System Concepts - Chapter 14 Query Optimization - 69 Appendix B DDBS查询代价与数据分布的关系 (续)   SQL（全局）查询语句  SELECT S#，SNAME FROM S，SC， C WHERE S.S# = SC.S# AND SC.C# = C.C# (连接条件) AND SEX=M AND CNAME= MATH （选择条件）查询代价(query cost)  只考虑通信代价 TC (x) =1s ╳ 传输次数 + 传输的元组bit数 ÷ 104 bit/s May 2008 Database System Concepts - Chapter 14 Query Optimization - 70 Appendix B DDBS查询代价与数据分布的关系 (续)   策略1  将关系C传到A地，在A地进行查询处理 5 4 3  TC1 = 1s + 10 ╳100 bit ÷ 10 bit/s ≈ 10 s ≈ 16.7 min 策略2  将关系S和SC传到B地，在B地进行查询处理 4 6 4  TC2 = 2s + （10 +10 ） ╳ 100 bit ÷ 10 bit/s ≈10100 s≈28h May 2008 Database System Concepts - Chapter 14 Query Optimization - 71 Appendix B DDBS查询代价与数据分布的关系 (续) 策略3 5  A地从关系SC、S中求出所有男学生的成绩元组，共10 个  根据元组中的C#, 向B地的C关系询问(应答过程) CNAME=MATH？共询问105次 5  TC3 ≈ 2 ╳ 10 ╳ 1 ≈ 2.3天  策略4  在B地从关系C中求出CNAME=MATH的课程元组，假设有10个  根据C#询问A地的S、SC的连接，核实是否为选修MATH （SC.C#= C#?）的男生(S.SEX=M?)  May 2008 Database System Concepts - Chapter 14 Query Optimization - 72 Appendix B DDBS查询代价与数据分布的关系 (续)    TC4 ≈ 2 ╳ 10 ╳ 1 ≈ 20s 策略5 5  在A地从关系S、SC中求出男生的的课程元组，有10 个  将结果传到B地，在B地执行查询 5 4 3  TC5= 1s + 10 ╳ 100 bit ÷ 10 bit/s ≈ 10 s ≈ 16.7 m 策略6  在B地从关系C中求出CNAME=MATH的课程元组，有 10个  将结果传到A地，在A地执行查询 4  TC6 = 1s + 10 ╳100 bit ÷ 10 bit/s ≈ 1s May 2008 Database System Concepts - Chapter 14 Query Optimization - 73 Appendix C DB2查询优化对于IBM DB2来说，它有非常好的优化算法，大致是基于代价估算后选择代价最小的策略。  代价估算的方法  搜集系统目录相关数据和要被执行的SQL语句，然后选择若干方案进行代价计算。  DB2应用程序开发过程中部署测试环境时，可以专门设置虚拟的但是和将来实际运行环境比较一致的系统目录数据来模拟真实环境下的DB2的SQL执行策略。  对于DB2的优化器也可以设置不同的优化级别，可以使用 SET CURRENT QUERY OPTION命令或者某些子句进行相应设置，其中  0：最小的优化  1：相当于DB2/6000 Version 1中的优化加一些低成本的优化  May 2008 Database System Concepts - Chapter 14 Query Optimization - 74 Appendix C DB2查询优化（续） 2：使用opt level 5的功能，但连接优化上进行了一定简化  3：中等优化，大约相当于DB2 for Z/OS  5：使用一定数量的优化，包括Heuristic Rules（这是默认的值）  7：使用一定数量的优化，不包括Heuristic Rules  9：使用所有可用的优化技术  May 2008 Database System Concepts - Chapter 14 Query Optimization - 75 Appendix C DB2查询优化（续）有4种查询优化策略访问工具，分别能够针对静态SQL、动态SQL、CLI应用程序、DRDA应用程序请求者（DRDA AR）提供图形界面或者文本界面查询访问或者提供基本信息。  第一种。基本的是直接访问Explain Table（可以直接使用 SELECT语句来访问它，所以这是一个各种OS平台上都可以采取的措施），它的缺点是没有提供任何界面，必须由人去理解表中的数据。另外只能对执行的SQL来处理，如果是未执行的静态包则无法产生信息。  第二种。基于图形界面的Visual Explain（主要用于windows 平台，甚至可以在windows平台上对诸如UNIX上的DB2系统上的SQL语句进行这种图形化分析），它不支持DADA AR，也不支持分析未执行的静态包，不能一次分析多条SQL语句，但最直观。 Database System Concepts - Chapter 14 Query Optimization - 76 May 2008  Appendix C DB2查询优化（续）   第三种。db2exfmt以预先定义好的报告格式获得explain table的信息，它不支持DADA AR，也不支持分析未执行的静态包，但支持一次分析多条语句。第四种。db2expln允许我们查看作为静态绑定包（static package）的一部分访问策略，它提供一种文本格式描述的策略。不支持CLI或者DRDA AR，可以一次分析多条语句。 May 2008 Database System Concepts - Chapter 14 Query Optimization - 77 Have a break

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download E - Read