Download DB2_Ch11

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

DBase wikipedia , lookup

IMDb wikipedia , lookup

Tandem Computers wikipedia , lookup

Microsoft Access wikipedia , lookup

Oracle Database wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Ingres (database) wikipedia , lookup

Functional Database Model wikipedia , lookup

Concurrency control wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Database wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

SQL wikipedia , lookup

PL/SQL wikipedia , lookup

ContactPoint wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Clusterpoint wikipedia , lookup

Relational model wikipedia , lookup

Database model wikipedia , lookup

Transcript
Database Systems: Design,
Implementation, and
Management
Eighth Edition
Chapter 11
Database Performance Tuning and
Query Optimization
Objectives
• In this chapter, you will learn:
– Basic database performance-tuning concepts
– How a DBMS processes SQL queries
– About the importance of indexes in query
processing
– About the types of decisions the query optimizer
has to make
– Some common practices used to write efficient
SQL code
– How to formulate queries and tune the DBMS for
optimal performance
Database Systems, 8th Edition
2
Database Performance-Tuning Concepts
• Goal of database performance is to execute
queries as fast as possible
• Database performance tuning
– Set of activities and procedures designed to
reduce response time of database system
• All factors must operate at optimum level with
minimal bottlenecks
• Good database performance starts with
good database design
Database Systems, 8th Edition
3
Database Systems, 8th Edition
4
Performance Tuning: Client and Server
• Database performance-tuning activities can be
divided into:
– Client side
• Generate SQL query that returns correct answer in
least amount of time
• Using minimum amount of resources at server
• SQL performance tuning
– Server side
• DBMS environment configured to respond to
clients’ requests as fast as possible
• Optimum use of existing resources
• DBMS performance tuning
Database Systems, 8th Edition
5
DBMS Architecture
• All data in database are stored in data files
• Data files (Assigned by DBA)
– Automatically expand in predefined increments known
as extends (10kb or 10mb)
– Grouped in file groups or table spaces
• Table space or file group: (created by DBMS)
– Logical grouping of several data files that store data
with similar characteristics
– Ex: System Table space – data dictionary
User table space – tables created by users
Index table space – to hold all indexes
Temporary table space – Temp sorting or grouping
Database Systems, 8th Edition
6
Database Systems, 8th Edition
7
DBMS Architecture (continued)
• Data cache or buffer cache: shared, reserved
memory area
– Stores most recently accessed data blocks (that was
read from data files) in RAM
– Also caches system catalogs and content of indexes
• SQL cache or procedure cache:
– stores most recently executed SQL statements
– Also PL/SQL procedures
– Stores the proposed version of SQL NOT the user
written SQL. This version is ready for execution.
• To work with data, DBMS retrieves data from permanent
storage (data files) and places it in RAM (data cache)
Database Systems, 8th Edition
8
DBMS Architecture (continued)
• Input/output request: low-level data access operation
to/from computer devices (memory, hard disk, printer)
– I/O disk read operation retrieves an entire physical disk
block (4kb, 8kb,…etc depending on OS), generally
containing multiple rows
• Data cache is faster than data in data files because:
– DBMS does not wait for hard disk to retrieve data
• Majority of performance-tuning activities focus on
minimizing I/O operations
– RAM access times range from 5 to 70 ns (nanoseconds), while
hard disk access times range from 5 to 15 ms (milliseconds).
• Typical DBMS processes:
– Listener, User, Scheduler, Lock manager, Optimizer
Database Systems, 8th Edition
9
DBMS Architecture (continued)
•
Listener. It listens for clients’ requests and handles the
processing of the SQL requests to other DBMS processes.
Once a request is received, the listener passes the request to
the appropriate user process.
•
User. The DBMS creates a user process to manage each
client session. Therefore, when you log on to the DBMS, you
are assigned a user process. This process handles all requests
you submit to the server. There are many user processes—at
least one per each logged-in client.
•
Scheduler. The scheduler process organizes the concurrent
execution of SQL requests.
•
Lock manager. This process manages all locks placed on
database objects.
•
Optimizer. The optimizer process analyzes SQL queries and
finds the most efficient way to access the data.
Database Statistics
• Another DBMS process that plays an important role in
query optimization is gathering database statistics
• Make critical decisions about improving query
processing efficiency
• Measurements about database objects and available
resources
–
–
–
–
–
Tables
Indexes
Number of processors used
Processor speed
Temporary space available (for grouping and sorting)
• Can be gathered manually by DBA or automatically by
DBMS
Database Systems, 8th Edition
11
Database Statistics (continued)
• Example:
ANALYZE <table/index> object name COMPUTE STATISTICS.
Database Systems, 8th Edition
12
Query Processing
• DBMS processes queries in three phases
– Parsing
• DBMS parses the query and chooses the most
efficient access/execution plan
– Execution
• DBMS executes the query using chosen
execution plan
– Fetching
• DBMS fetches the data and sends the result back
to the client
Database Systems, 8th Edition
13
Database Systems, 8th Edition
14
SQL Parsing Phase
• Break down query into smaller units
• Transform original SQL query into slightly
different version of original SQL code
– Fully equivalent
• Optimized query results are always the same as
original query
– More efficient
• Optimized query will almost always execute faster
than original query
Database Systems, 8th Edition
15
SQL Parsing Phase (continued)
• Query optimizer analyzes SQL query and finds
most efficient way to access data
– Validated for syntax compliance
– Validated against data dictionary
• Tables, column names are correct
• User has proper access rights
– Analyzed and decomposed into components
– Optimized
– Prepared for execution
Database Systems, 8th Edition
16
SQL Parsing Phase (continued)
• Access plans are DBMS-specific
– Translate client’s SQL query into series of
complex I/O operations
– Required to read the data from the physical data
files and generate result set
• DBMS checks if access plan already exists for
query in SQL cache
• DBMS reuses the access plan to save time
• If not, optimizer evaluates various plans
– Chosen plan placed in SQL cache
Database Systems, 8th Edition
17
Database Systems, 8th Edition
18
SQL Execution Phase
SQL Fetching Phase
• All I/O operations indicated in access plan are
executed
– Locks acquired
– Data retrieved and placed in data cache
– Transaction management commands processed
• Rows of resulting query result set are returned
to client
• DBMS may use temporary table space to store
temporary data
Database Systems, 8th Edition
19
Query Processing Bottlenecks
• Delay introduced in the processing of an I/O
operation that slows the system
–
–
–
–
–
CPU
RAM
Hard disk
Network
Application code
Database Systems, 8th Edition
20
Tue 2-7 Indexes and Query Optimization
• Indexes
– Crucial in speeding up data access
– Facilitate searching, sorting, and using
aggregate functions as well as join operations
– Ordered set of values that contains index key
• More efficient to use index to access table than
to scan all rows in table sequentially
Database Systems, 8th Edition
21
SELECT CUS_NAME, CUS_STATE
FROM CUSTOMER
WHERE CUS_STATE = 'FL';
Database Systems, 8th Edition
22
Indexes and Query Optimization
• Data sparsity refers to the number of different
values a column could possibly have. For
example, a STU_SEX column in a STUDENT
table can have only two possible values, M or
F; therefore that column is said to have low
sparsity. In contrast, the STU_DOB column that
stores the student date of birth can have many
different date values; therefore, that column is
said to have high sparsity.
Optimizer Choices
• The central activity during the parsing phase
• Must chose indexes, tables to use first, how to make join, ...etc.
• Rule-based optimizer
– Uses set of preset rules and points to determine best approach to
execute query
• assign a “fixed cost” to each SQL operation
• For example, a full table scan has a set cost of 10, while a table
access by row ID has a set cost of 3.
• the costs are then added to yield the cost of the execution plan
• Cost-based optimizer
– Algorithms based on statistics about objects being accessed
– Adds up processing cost, I/O costs, resource costs to derive total
cost
Database Systems, 8th Edition
24
SELECT P_CODE, P_DESCRIPT, P_PRICE, V_NAME, V_STATE
FROM
PRODUCT, VENDOR
WHERE PRODUCT.V_CODE = VENDOR.V_CODE
AND VENDOR.V_STATE = 'FL';
Let’s assume that the database statistics indicate that:
• The PRODUCT table has 7,000 rows.
• The VENDOR table has 300 rows.
• Ten vendors are located in Florida.
• One thousand products come from vendors in Florida.
Database Systems, 8th Edition
25
Wed 3-7 SQL Performance Tuning
• Evaluated from client perspective
– Most current relational DBMSs perform
automatic query optimization at the server end
– Most SQL performance optimization techniques
are DBMS-specific
• Rarely portable
• Majority of performance problems related to
poorly written SQL code
• Carefully written query usually outperforms a
poorly written query
Database Systems, 8th Edition
26
Index Selectivity
• Indexes are the most important technique used in
SQL performance optimization
• Indexes are used when:
– Indexed column appears by itself in search criteria
of WHERE or HAVING clause
– Indexed column appears by itself in GROUP BY or
ORDER BY clause
– MAX or MIN function is applied to indexed column
– Data sparsity is high
• Index selectivity: a measure of how likely an
index will be used
Database Systems, 8th Edition
27
Index Selectivity (continued)
• General guidelines for indexes:
– Create indexes for each attribute in WHERE,
HAVING, ORDER BY, or GROUP BY clause
– Do not use in small tables or tables with low
sparsity
– Declare primary and foreign keys so optimizer
can use indexes in join operations
– Declare indexes in join columns other than
PK/FK
Database Systems, 8th Edition
28
Conditional Expressions
• Normally expressed within WHERE or HAVING
clauses of SQL statement
• Restricts output of query to only rows matching
conditional criteria
Database Systems, 8th Edition
29
Conditional Expressions (continued)
• Common practices for efficient SQL:
– Use simple columns (std_name) or literals (10 or ‘FL’) in conditionals
• Avoid using expressions (p_min *100)
– Numeric field comparisons are faster
• Character comparisons are slow.
• Null comparison is the slowest.
– Equality comparisons faster than inequality (>,<.<>,>=,<=)
• Like is also slow
– Transform conditional expressions to use literals
P_PRICE − 10 = 7, change it to read P_PRICE = 17
– Write equality conditions first
Change P_QOH < P_MIN AND P_MIN = P_REORDER AND P_QOH = 10
To
P_QOH = 10 AND P_MIN = P_REORDER AND P_MIN > 10
– AND: Use condition most likely to be false first
– OR: Use condition most likely to be true first
– Avoid NOT
Database Systems, 8th Edition
30
Query Formulation
• If an end user gives you a sample output and tells
you to match that output format, you must write the
corresponding SQL:
–
–
–
–
Identify what columns and computations are required
Identify source tables
Determine how to join tables (Normally natural join)
Determine what selection criteria is needed
• Simple (P_PRICE > 10)
• Nested (P_PRICE > = ( SELECT AVG(P_PRICE) FROM PRODUCT).
– Determine in what order to display output
Database Systems, 8th Edition
31
DBMS Performance Tuning
• Includes managing the DBMS processes in primary
memory (allocating memory for caching purposes)
and managing the structures in physical storage
(allocating space for the data files).
• DBMS performance tuning at server end focuses
on setting parameters used for:
–
–
–
–
Data cache
SQL cache
Sort cache
Optimizer mode (Rule-based, cost-based)
Database Systems, 8th Edition
32