Database Systems: Design, Implementation, and Management, Eighth Edition
Chapter 11: Database Performance Tuning and Query Optimization

Objectives
• In this chapter, you will learn:
  – Basic database performance-tuning concepts
  – How a DBMS processes SQL queries
  – About the importance of indexes in query processing
  – About the types of decisions the query optimizer has to make
  – Some common practices used to write efficient SQL code
  – How to formulate queries and tune the DBMS for optimal performance

Database Performance-Tuning Concepts
• The goal of database performance is to execute queries as fast as possible
• Database performance tuning
  – Set of activities and procedures designed to reduce the response time of the database system
• All factors must operate at optimum level with minimal bottlenecks
• Good database performance starts with good database design

Performance Tuning: Client and Server
• Database performance-tuning activities can be divided into:
  – Client side (SQL performance tuning)
    • Generate an SQL query that returns the correct answer in the least amount of time
    • Use the minimum amount of resources at the server
  – Server side (DBMS performance tuning)
    • DBMS environment configured to respond to clients’ requests as fast as possible
    • Optimum use of existing resources

DBMS Architecture
• All data in a database are stored in data files
• Data files (assigned by the DBA)
  – Automatically expand in predefined increments known as extents (for example, 10 KB or 10 MB)
  – Grouped in file groups or table spaces
• Table space or file group (created by the DBMS)
  – Logical grouping of several data files that store data with similar characteristics
  – Examples:
    • System table space: data dictionary
    • User table space: tables created by users
    • Index table space: holds all indexes
    • Temporary table space: temporary sorting or grouping

DBMS Architecture (continued)
• Data cache or buffer cache: shared, reserved memory area
  – Stores the most recently accessed data blocks (read from the data files) in RAM
  – Also caches system catalogs and the contents of indexes
• SQL cache or procedure cache:
  – Stores the most recently executed SQL statements
  – Also stores PL/SQL procedures
  – Stores the processed version of the SQL, which is ready for execution, not the SQL as written by the user
• To work with data, the DBMS retrieves the data from permanent storage (data files) and places it in RAM (data cache)

DBMS Architecture (continued)
• Input/output request: low-level data access operation to or from computer devices (memory, hard disk, printer)
  – An I/O disk read operation retrieves an entire physical disk block (4 KB, 8 KB, etc., depending on the OS), generally containing multiple rows
• Working with data in the data cache is faster than working with data in the data files because:
  – The DBMS does not wait for the hard disk to retrieve the data
• The majority of performance-tuning activities focus on minimizing I/O operations
  – RAM access times range from 5 to 70 ns (nanoseconds), while hard disk access times range from 5 to 15 ms (milliseconds)
• Typical DBMS processes:
  – Listener, User, Scheduler, Lock manager, Optimizer

DBMS Architecture (continued)
• Listener. Listens for clients’ requests and hands the SQL requests off to other DBMS processes. Once a request is received, the listener passes it to the appropriate user process.
• User. The DBMS creates a user process to manage each client session. Therefore, when you log on to the DBMS, you are assigned a user process. This process handles all requests you submit to the server. There are many user processes, at least one per logged-in client.
• Scheduler. Organizes the concurrent execution of SQL requests.
• Lock manager. Manages all locks placed on database objects.
• Optimizer. Analyzes SQL queries and finds the most efficient way to access the data.

Database Statistics
• Gathering database statistics is another DBMS process that plays an important role in query optimization
• The DBMS uses the statistics to make critical decisions about improving query processing efficiency
• Database statistics are measurements about database objects and available resources:
  – Tables
  – Indexes
  – Number of processors used
  – Processor speed
  – Temporary space available (for grouping and sorting)
• Can be gathered manually by the DBA or automatically by the DBMS

Database Statistics (continued)
• Example:
    ANALYZE <TABLE/INDEX> object_name COMPUTE STATISTICS;

Query Processing
• The DBMS processes queries in three phases:
  – Parsing
    • DBMS parses the query and chooses the most efficient access/execution plan
  – Execution
    • DBMS executes the query using the chosen execution plan
  – Fetching
    • DBMS fetches the data and sends the result back to the client

SQL Parsing Phase
• Break down the query into smaller units
• Transform the original SQL query into a slightly different version of the original SQL code that is:
  – Fully equivalent
    • Optimized query results are always the same as those of the original query
  – More efficient
    • Optimized query will almost always execute faster than the original query

SQL Parsing Phase (continued)
• The query optimizer analyzes the SQL query and finds the most efficient way to access the data. The query is:
  – Validated for syntax compliance
  – Validated against the data dictionary
    • Table and column names are correct
    • User has proper access rights
  – Analyzed and decomposed into components
  – Optimized
  – Prepared for execution

SQL Parsing Phase (continued)
• Access plans are DBMS-specific
  – Translate the client’s SQL query into the series of complex I/O operations required to read the data from the physical data files and generate the result set
• The DBMS checks whether an access plan already exists for the query in the SQL cache
  – If so, the DBMS reuses the access plan to save time
  – If not, the optimizer evaluates various plans and the chosen plan is placed in the SQL cache

SQL Execution Phase
• All I/O operations indicated in the access plan are executed
  – Locks acquired
  – Data retrieved and placed in the data cache
  – Transaction management commands processed

SQL Fetching Phase
• Rows of the resulting query result set are returned to the client
• The DBMS may use the temporary table space to store temporary data

Query Processing Bottlenecks
• A bottleneck is a delay introduced in the processing of an I/O operation that slows the system
• Components that can cause bottlenecks:
  – CPU
  – RAM
  – Hard disk
  – Network
  – Application code

Tue 2-7

Indexes and Query Optimization
• Indexes
  – Crucial in speeding up data access
  – Facilitate searching, sorting, and using aggregate functions as well as join operations
  – Ordered set of values that contains the index key and pointers to the table rows
• It is more efficient to use an index to access a table than to scan all rows in the table sequentially
• Example:
    SELECT CUS_NAME, CUS_STATE
    FROM CUSTOMER
    WHERE CUS_STATE = 'FL';

Indexes and Query Optimization (continued)
• Data sparsity refers to the number of different values a column could possibly have. For example, a STU_SEX column in a STUDENT table can have only two possible values, M or F; therefore, that column is said to have low sparsity. In contrast, the STU_DOB column that stores the student date of birth can have many different date values; therefore, that column is said to have high sparsity.

Optimizer Choices
• Query optimization is the central activity during the parsing phase
• The optimizer must choose which indexes to use, which tables to access first, how to perform joins, etc.
• Rule-based optimizer
  – Uses a set of preset rules and points to determine the best approach to execute a query
    • Assigns a “fixed cost” to each SQL operation
    • For example, a full table scan has a set cost of 10, while a table access by row ID has a set cost of 3
    • The costs are then added to yield the cost of the execution plan
• Cost-based optimizer
  – Uses algorithms based on statistics about the objects being accessed
  – Adds up processing cost, I/O costs, and resource costs to derive the total cost

Optimizer Choices (continued)
• Example:
    SELECT P_CODE, P_DESCRIPT, P_PRICE, V_NAME, V_STATE
    FROM PRODUCT, VENDOR
    WHERE PRODUCT.V_CODE = VENDOR.V_CODE
      AND VENDOR.V_STATE = 'FL';
• Let’s assume that the database statistics indicate that:
  – The PRODUCT table has 7,000 rows
  – The VENDOR table has 300 rows
  – Ten vendors are located in Florida
  – One thousand products come from vendors in Florida

Wed 3-7

SQL Performance Tuning
• Evaluated from the client perspective
  – Most current relational DBMSs perform automatic query optimization at the server end
  – Most SQL performance optimization techniques are DBMS-specific
    • Rarely portable
• The majority of performance problems are related to poorly written SQL code
• A carefully written query usually outperforms a poorly written query

Index Selectivity
• Indexes are the most important technique used in SQL performance optimization
• Indexes are used when:
  – An indexed column appears by itself in the search criteria of a WHERE or HAVING clause
  – An indexed column appears by itself in a GROUP BY or ORDER BY clause
  – A MAX or MIN function is applied to an indexed column
  – Data sparsity is high
• Index selectivity: a measure of how likely an index is to be used in query processing

Index Selectivity (continued)
• General guidelines for indexes:
  – Create indexes for each attribute used in a WHERE, HAVING, ORDER BY, or GROUP BY clause
  – Do not use indexes in small tables or in tables with low sparsity
  – Declare primary and foreign keys so the optimizer can use the indexes in join operations
  – Declare indexes on join columns other than PK/FK

Conditional Expressions
• Normally expressed within the WHERE or HAVING clauses of an SQL statement
• Restrict the output of a query to only the rows matching the conditional criteria

Conditional Expressions (continued)
• Common practices for efficient SQL:
  – Use simple columns (std_name) or literals (10 or 'FL') in conditionals
    • Avoid using expressions (p_min * 100)
  – Numeric field comparisons are faster
    • Character comparisons are slow
    • Null comparisons are the slowest
  – Equality comparisons are faster than inequality comparisons (>, <, <>, >=, <=)
    • LIKE is also slow
  – Transform conditional expressions to use literals
    • Change P_PRICE - 10 = 7 to read P_PRICE = 17
  – Write equality conditions first
    • Change P_QOH < P_MIN AND P_MIN = P_REORDER AND P_QOH = 10
      to P_QOH = 10 AND P_MIN = P_REORDER AND P_MIN > 10
  – AND: put the condition most likely to be false first
  – OR: put the condition most likely to be true first
  – Avoid NOT

Query Formulation
• If an end user gives you a sample output and tells you to match that output format, you must write the corresponding SQL:
  – Identify what columns and computations are required
  – Identify the source tables
  – Determine how to join the tables (normally a natural join)
  – Determine what selection criteria are needed
    • Simple (P_PRICE > 10)
    • Nested (P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT))
  – Determine in what order to display the output

DBMS Performance Tuning
• Includes managing the DBMS processes in primary memory (allocating memory for caching purposes) and managing the structures in physical storage (allocating space for the data files)
• DBMS performance tuning at the server end focuses on setting parameters used for:
  – Data cache
  – SQL cache
  – Sort cache
  – Optimizer mode (rule-based or cost-based)
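Worked example (I/O versus RAM access times). The DBMS Architecture slides quote RAM access times of 5 to 70 ns and hard disk access times of 5 to 15 ms. The short Python sketch below only does the arithmetic on those quoted ranges to show the size of the gap; the variable names are illustrative and not part of the slides.

```python
# Access-time ranges quoted in the slides, expressed in nanoseconds.
RAM_NS = (5, 70)                     # 5 to 70 ns
DISK_NS = (5 * 1_000_000,            # 5 ms  = 5,000,000 ns
           15 * 1_000_000)           # 15 ms = 15,000,000 ns

# Smallest gap: fastest disk access compared with slowest RAM access.
min_ratio = DISK_NS[0] / RAM_NS[1]
# Largest gap: slowest disk access compared with fastest RAM access.
max_ratio = DISK_NS[1] / RAM_NS[0]

print(f"A disk access is roughly {min_ratio:,.0f}x to {max_ratio:,.0f}x "
      f"slower than a RAM access")
```

Even the smallest ratio is in the tens of thousands, which is why most tuning activities aim to serve data from the data cache rather than from the data files.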
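Worked example (gathering statistics). The ANALYZE ... COMPUTE STATISTICS statement in the Database Statistics slide is DBMS-specific. As a hedged stand-in, the sketch below uses SQLite (through Python's standard sqlite3 module), whose own ANALYZE command stores statistics in the sqlite_stat1 table. The VENDOR layout, the index name VENDOR_STATE_NDX, and the non-Florida state value are assumptions made for illustration; only the row counts (300 vendors, 10 in Florida) come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT, V_STATE TEXT)"
)
conn.execute("CREATE INDEX VENDOR_STATE_NDX ON VENDOR (V_STATE)")

# 300 vendors, 10 of them in Florida; 'TN' for the rest is an arbitrary choice.
rows = [(i, f"Vendor {i}", "FL" if i <= 10 else "TN") for i in range(1, 301)]
conn.executemany("INSERT INTO VENDOR VALUES (?, ?, ?)", rows)

# SQLite's ANALYZE gathers statistics about tables and indexes
# and records them in the sqlite_stat1 table.
conn.execute("ANALYZE")

stats = conn.execute(
    "SELECT tbl, idx, stat FROM sqlite_stat1 WHERE idx = 'VENDOR_STATE_NDX'"
).fetchall()
for tbl, idx, stat in stats:
    # The first integer in 'stat' is the number of rows in the index.
    print(tbl, idx, stat)
```

The optimizer reads these stored measurements when it compares candidate access plans, which is the role the slides assign to database statistics.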
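Worked example (index versus full table scan). The Indexes and Query Optimization slides state that using an index is more efficient than scanning all rows sequentially. The sketch below runs the slides' CUSTOMER query in SQLite and asks for the access plan before and after creating an index. SQLite, the sample rows, the CUS_CODE column, and the index name CUS_STATE_NDX are assumptions for illustration; other DBMSs expose plans through different commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_NAME TEXT, CUS_STATE TEXT)"
)
states = ["FL", "GA", "TN", "AL", "KY"]
conn.executemany(
    "INSERT INTO CUSTOMER (CUS_NAME, CUS_STATE) VALUES (?, ?)",
    [(f"Customer {i}", states[i % len(states)]) for i in range(1000)],
)

QUERY = "SELECT CUS_NAME, CUS_STATE FROM CUSTOMER WHERE CUS_STATE = 'FL'"

def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Without an index the only option is to read every row of the table.
plan_without_index = plan(QUERY)

# With an index on CUS_STATE the DBMS can look up the 'FL' entries directly.
conn.execute("CREATE INDEX CUS_STATE_NDX ON CUSTOMER (CUS_STATE)")
plan_with_index = plan(QUERY)

print(plan_without_index)
print(plan_with_index)
```

In SQLite the first plan is reported as a SCAN of the table and the second as a SEARCH using CUS_STATE_NDX; the query text is identical in both cases, so the difference comes entirely from the optimizer's choice of access path.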
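Worked example (data sparsity). The slides define data sparsity as the number of different values a column could possibly have. A rough way to estimate it on existing data is to count the distinct values actually stored, as sketched below with SQLite. The STUDENT rows are generated sample data and STU_NUM is an assumed key column; counting observed values approximates, but is not identical to, the set of possible values.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_SEX TEXT, STU_DOB TEXT)"
)

# 200 sample students: alternating sex codes, a different birth date for each.
start = date(2000, 1, 1)
conn.executemany(
    "INSERT INTO STUDENT (STU_SEX, STU_DOB) VALUES (?, ?)",
    [("M" if i % 2 else "F", (start + timedelta(days=i)).isoformat())
     for i in range(200)],
)

def distinct_values(column):
    # Column names come from the fixed tuple below, never from user input.
    sql = f"SELECT COUNT(DISTINCT {column}) FROM STUDENT"
    return conn.execute(sql).fetchone()[0]

sparsity = {col: distinct_values(col) for col in ("STU_SEX", "STU_DOB")}
print(sparsity)
```

STU_SEX yields only two distinct values (low sparsity), so an index on it would rarely narrow a search, whereas STU_DOB yields one value per row in this sample (high sparsity) and is a much better index candidate.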
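Worked example (comparing access plans by cost). The Optimizer Choices slides give statistics for the PRODUCT and VENDOR join but stop before comparing plans. The sketch below compares two candidate plans under a deliberately simplified cost model, assumed here for illustration: every row read counts as one I/O operation, no indexes are used, and every product references exactly one vendor. The plan names and step breakdown are a sketch of how a cost-based optimizer might reason, not the output of any particular DBMS.

```python
# Statistics stated in the slides.
PRODUCT_ROWS = 7_000
VENDOR_ROWS = 300
FL_VENDORS = 10

def plan_a_cost():
    """Join everything first, then filter on V_STATE = 'FL'."""
    a1 = PRODUCT_ROWS + VENDOR_ROWS   # read both tables to form the Cartesian product
    a2 = PRODUCT_ROWS * VENDOR_ROWS   # scan 2,100,000 rows for matching vendor codes
    a3 = PRODUCT_ROWS                 # scan the 7,000 matched rows for V_STATE = 'FL'
    return a1 + a2 + a3

def plan_b_cost():
    """Filter VENDOR on V_STATE = 'FL' first, then join."""
    b1 = VENDOR_ROWS                  # scan VENDOR, keeping the 10 Florida vendors
    b2 = PRODUCT_ROWS + FL_VENDORS    # read PRODUCT and the 10 vendors for the product
    b3 = PRODUCT_ROWS * FL_VENDORS    # scan 70,000 rows for matching vendor codes
    return b1 + b2 + b3

cost_a = plan_a_cost()
cost_b = plan_b_cost()
print(f"Plan A: {cost_a:,} I/O operations")
print(f"Plan B: {cost_b:,} I/O operations")
print(f"Plan A costs about {cost_a / cost_b:.0f} times as much as Plan B")
```

Both plans return the same 1,000 rows, but applying the selective V_STATE condition before the join shrinks the intermediate result from 2,100,000 rows to 70,000, which is the kind of decision the statistics make possible.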
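Worked example (transforming a conditional expression to use a literal). The Conditional Expressions slides recommend rewriting P_PRICE - 10 = 7 as P_PRICE = 17. The sketch below checks in SQLite that both forms return the same rows and compares their access plans. SQLite, the generated PRODUCT rows, the integer prices, and the index name P_PRICE_NDX are assumptions for illustration; how other DBMSs treat expressions on indexed columns may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, P_PRICE INTEGER)"
)
# 500 sample products with integer prices from 0 to 49.
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?)",
    [(f"P{i:04d}", f"Product {i}", i % 50) for i in range(500)],
)
conn.execute("CREATE INDEX P_PRICE_NDX ON PRODUCT (P_PRICE)")

EXPR_FORM = "SELECT P_CODE, P_DESCRIPT FROM PRODUCT WHERE P_PRICE - 10 = 7"
LITERAL_FORM = "SELECT P_CODE, P_DESCRIPT FROM PRODUCT WHERE P_PRICE = 17"

def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# The two conditions are logically equivalent, so the result sets match.
same_rows = sorted(conn.execute(EXPR_FORM)) == sorted(conn.execute(LITERAL_FORM))

expr_plan = plan(EXPR_FORM)        # column wrapped in an expression
literal_plan = plan(LITERAL_FORM)  # column compared directly with a literal

print(same_rows)
print(expr_plan)
print(literal_plan)
```

In SQLite the expression form is answered with a full SCAN because the indexed column is buried inside arithmetic, while the literal form becomes a SEARCH on P_PRICE_NDX. This matches the guideline that an indexed column should appear by itself in the search criteria.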