Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Chapter 6: Database Evolution • Title: AutoAdmin “What-if” Index Analysis Utility • Authors: Surajit Chaudhuri, Vivek Narasayya • ACM SIGMOD 1998 AutoAdmin “What-if” Index Analysis Utility • Problem – Problem Statement – Why is this problem important? – Why is this problem hard? • Approaches – Approach description, key concepts – Contributions (novelty, improved) – Assumptions Problem Statement – Index Selection • Given – Database, i.e. tables – A Database Management System, i.e. SQL Server • Find – Index Selection Tool to compare alternate indexes • by estimated processing cost of workload queries – Tools to define workload queries, alternative indices, … • Objectives – Assist database administrators, i.e. reduce manual effort • Constraints – Avoid physical alternation to Databases – Avoid scan of large tables to create statistics for query optimizer Why is this problem important? • Total cost of ownership – Database administration is a significant component – Annual salaries compare with cost of software and hardware • Thus, DBA time is valuable – Tools to assist DBA produce significant savings • DBA perform many services – Performance tuning via Index selection – Other, e.g. recovery, upgrade, … Why is this problem Hard? • Index selection is intrinsically a hard search problem. – A large # of possible single and multi-column indexes – Various types of indexes – Variety of usage by query optimizer • e.g., indexed-only access • Trade-off between read and update queries – Cost of update, insert, delete, bulk load may go up! – Impact analysis is required Novelty of Contribution • Limitations of Related Work – Lack of impact analysis – Reliance of large physical changes to databases • Ex. ‘views’ in simulating hypothetical database [Stonebraker] • Very computation intensive • Contributions – Built ‘index selection’ & ‘index analysis’ utility. (This paper more focuses on index analysis utility.) – Hypothetical index structures – Enables a large class of analysis at low cost. • Query optimizer to evaluate indexes • Sampling to collect statistics Basic Concepts • Ideas for modeling – ‘Workload’ = a set of SQL statements – ‘Configuration’ = a set of indexes – Hypothetical Configuration Analysis (HCA) • Summary analysis on simulation results • Ideas for efficient implementation – Simulate a hypothetical configuration • estimate the cost of queries in the workload – Sampling to estimate statistics needed by optimizer Key Concepts - Simulating Configurations • Simulating Hypothetical Configurations to Estimate – (a) the cost of queries in the workload: is relative to the total cost of the workload in the current configuration. – (b) usage of indexes: represents the indexes that are expected to be used by the server to answer the query if the hypothetical configuration exists Key Concepts - Sampling Strategy • Used for creating hypothetical indexes. – Accuracy – Figure 6 – Cost – Figure 5 • Details - adaptive page-level sampling algorithm – Psuedo-code (fig.. 3) Key Concepts - Architecture Validation Methodology • Prototyped index analysis utility – for Microsoft SQL Server 7.0 • Case Study – Workload – TPC-D – Decision Support benchmark – Usage analysis • Figure 10. Usage of each index in configuration for the workload • Figure 11. – Distribution of selection conditions on a given table – Comparison of two configuration by cost • Figure 13 – For 10 most expensive queries • Figure 14 – By SQL Statement types Examples of Summary Analysis • Distribution of workload by SQL type • Distribution of conditions over tables Summary • Paper’s focus – Index Analysis Utility • Ideas – Simulating hypothetical indexes can estimate the cost of queries and usage of indexes. • Contributions – The first of its kind, index analysis utility with low cost – Presented efficient mechanism for implementation • Analytical Validation – Case study with Microsoft SQL Server 7.0 Assumptions, Rewrite today • Assumptions – The simulation represents the real behavior of database system. • Rewrite today – Compare with newer methods • DB2 Design Advisor – Experimental evaluation • To measure the efficiency of the index analysis utility