* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Talk slides
Entity–attribute–value model wikipedia , lookup
Oracle Database wikipedia , lookup
Microsoft Access wikipedia , lookup
Concurrency control wikipedia , lookup
Ingres (database) wikipedia , lookup
Relational algebra wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Open Database Connectivity wikipedia , lookup
ContactPoint wikipedia , lookup
Database model wikipedia , lookup
Relational model wikipedia , lookup
Versant Object Database wikipedia , lookup
Outline  SQL Server Optimizer  Enumeration architecture  Search space: flexibility/extensibility  Cost and statistics  Automatic Physical Tuning  Database Tuning Advisor  New Directions Running Example (TPC-H Database) Primary Index Secondary Index Running Example (Query) Sample query: Obtain information about certain ordered lineitems that are filtered by suppliers and parts. SELECT l_orderkey, l_linenumber, o_orderstatus FROM lineitem JOIN orders ON l_orderkey = o_orderkey WHERE l_suppkey < 2000 AND l_partkey < 2000 l_orderkey, l_linenumber, o_orderstatus l_suppkey<2000, l_partkey<2000 l_orderkey=o_orderkey lineitem orders SQL Server Query Optimizer  Based on Cascades Framework  Transformation-based, top-down approach  Optimization = Tasks + Memo ( Programs   = Algorithms + Data Structures ) Fully cost-based Flexible and Extensible  Search space easy to change  New operators and rules easy to add SQL Query Parsing, Validation Normalization (predicate unfolding, join collapsing, view substitution, etc.) Cost-based Optimization Execution Plan The Memo  Search Space Memory    Compactly stores all explored alternatives (AND-OR graph) Groups together equivalent operator trees and their plans Provides memoization, duplicate detection, property and cost management, etc. Groups a<10 R b>20 S b>20 a<10 1) SELECT (a<10, ) R R Expressions 1) SELECT (a<10, ) 2) JOIN (x=y, , ) 3) ... 1) GET(R) R b>20 1) SELECT (b>20, ) S S S 1) GET(S) 1) JOIN (x=y, , ) 2) ... Optimization Tasks Optimize Inputs Initialize Memo Optimize Root Group Obtain optimization context Optimize children groups Calculate costs, pick best Optimize Group Apply Rule Explore Group Obtain enforcers and implementation rules for all expressions Apply Rules by Promise Pruning checks Generate bindings, substitutes Restrict ruleset optimizing?Opt Inputs:Explore Explore Group For each expression -Get Guidance -Explore Expression. Explore Expression Apply-always rules Explore Inputs Rest of Apply-always Rules Other Rules by Promise SQL Server Optimizer: Summary  Transformation-based, top-down approach No need for bottom-up interesting orders  Fully Cost-based No separation into phases (heuristic+cost)  Flexible and Extensible New operators, rules, and strategies are simple to add  Adaptive  Automatic statistics create and refresh  Automatic optimization levels  Physical Tuning Problem Statement Workload Database Physical Design Tool Configuration Set of physical structures (i.e., indexes and views) that make similar workloads execute as fast as possible Challenges  Recommend a variety of physical structures.  Indexes,   indexed views, partitions, XML indexes, etc. Support space constraints, update queries. Exceptionally large search space, especially for materialized views and partitions.  Strong interaction among access paths.  Merging needed: Optimal solution with suboptimal parts.  Cannot implement and test alternatives!  Industrial-strength quality (not trivial!) Design Principles    Workload-driven: Take into account database usage. What-if API: Determine impact without actually materializing physical design. Don’t second-guess the query optimizer!  An index is useful only if the query optimizer decides to use it.  Don’t use external estimator of goodness, but the optimizer-estimated cost.  Keep tool reasonably separated from today’s optimizer (extensibility). Database Tuning Advisor in Yukon Workload Compress Workload Tuning Client Candidate Selection (per query) Merging What-if API Query Optimizer Database Server Metadata … Enumeration Yes Time? No Recommendation - Create Hypothetical Index/View. - Optimize Query with respect to hypothetical configurations. New Architecture Workload Get Optimal Configuration (per query) Tuning Client Relaxation Yes Requests API Request Identification What-if API Query Optimizer Time? No Recommendation   Instrumenting the query optimizer. Search strategy based on relaxations. Metadata … Database Server Related Bibliography      Surajit Chaudhuri An Overview of Query Optimization in Relational Systems. PODS 1998: 34-43 Goetz Graefe The Cascades Framework for Query Optimization. IEEE Data Eng. Bull. 18(3): 19-29 (1995) Surajit Chaudhuri, Vivek R. Narasayya An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. VLDB 1997: 146-155 Sanjay Agrawal, Surajit Chaudhuri, Vivek R. Narasayya Automated Selection of Materialized Views and Indexes in SQL Databases. VLDB 2000: 496-505 Nicolas Bruno, Surajit Chaudhuri Automatic Physical Database Tuning: A Relaxation-based Approach. SIGMOD 2005