Getting the Best Out of Your Data Warehouse
Performance Tips

Scott McKay
WhereScape Limited

Abstract

Performance. It could almost always be better, but is it good enough? This can be a difficult question to answer in a data warehouse environment where much of the load may consist of ad hoc queries. However, poor performance can turn an otherwise successful data warehousing project into a failure. Data volumes and the expectations of data warehouses are ever increasing, so performance considerations can’t be ignored. This paper focuses on the design and setup of a dimensional data warehouse to get the best out of it. It is intended to dispel a few myths about data warehouse environments and provide a few practical performance tips. The tips are based on a number of years’ experience as an Oracle DBA, combined with two years specializing in data warehouse development and performance.

Overview

Performance is often found wanting after a project is implemented and data volumes start growing. There can be a number of reasons for this. Whatever the reasons, they often stem from performance not being designed in from the beginning. It is too easy to blame issues on the hardware, or to purchase bigger hardware as a solution to the problem. Often this is nothing more than a temporary bandaid, as it masks underlying issues that will get worse as data volumes grow. Getting a data warehouse to handle today’s ever increasing data volumes and complex analytical needs is often a balancing act between query performance, load times and the ever important cost.

Common Myths

1. “A data warehouse is read only so write performance is not important.”

There are two good reasons why write performance is critical to a data warehouse environment.

The data has to be loaded into the database. This may seem obvious, but it often seems to be forgotten when designing or purchasing the IO subsystem. As data warehouses grow bigger and more complex, the time required for data loading grows larger.
At the same time companies are becoming larger and more global, so the actual window available for loading the data is shrinking. This compounds the issue, and having the data available when the business would like it is often a challenge in mature data warehouse environments.

Query performance. This may not be quite as obvious, until you consider the types of queries that a data warehouse is generally expected to answer. A typical data warehouse query might be “Show me how my sales have been trending over the last 5 years” or “How have my inventory costs been tracking?”. Both of these would require the database to sort and aggregate potentially large volumes of data. If the data volume is large then these operations will happen in the temporary segments on disk. Sorts on disk are relatively write intensive and will certainly suffer if write performance is a bottleneck.

2. “Full table scans are bad.”

Full table scans may be bad in a particular case. In another case, a full table scan may be the best way of achieving the desired result. Consider the inventory query mentioned above. This may involve reading all of the records in a table which contains a snapshot of your entire inventory each month, for several years. To see how your inventory cost is trending, you would probably have to read all the records in the table, and a full table scan is the best way of doing this.

What about a scenario where you have 3 years’ worth of inventory snapshots and you want to look at the past year? You need to view one third of the table. Using an index on a date field might seem good, because then there are two thirds of the table that you don’t need to read. This seems logical, except that it depends on the work required to get the table rowid from the index. For a standard b-tree index the database may have to read an average of 2 or 3 blocks to traverse the index to get to the required entry, and then perform another block read for the rowid lookup.
In this case, with a third of the rows required, a full table scan would be much faster, because it reads each table block only once and each block holds many rows. If you were using a bitmap index, however, it would quite likely be faster than either of the other approaches.

A common data loading approach is to load a subset of data each night into load and/or stage tables for processing before loading fact tables. In this case you want to be using full table scans for the processing of your load and stage tables, because you are processing all of the rows in the tables.

3. “Additional indexes improve performance.”

This sounds similar to myth number 2, and in some cases the reasoning is the same, but it is worth a separate mention because it is so common. I have seen several major performance issues fixed by actually removing indexes from tables. In some cases the indexes may have improved the particular query or data load which was being worked on at the time, but had adverse effects on other parts of the system. The reasons for this are generally as discussed above.

Another potential issue with over-indexing is the adverse effect that it has on inserting data into a table. In a data warehouse load, thousands of records are often being inserted at once. In this case it can often be better to drop the indexes on the table before loading and then recreate them at the end of the load process. While index creation can be a relatively expensive operation, the time for this is often significantly less than the cost of the database maintaining these indexes while inserting the data. Bitmap indexes are particularly expensive to maintain when inserting data, but can be quicker to create, so it is definitely worth considering dropping any bitmap indexes on a table before loading any data into it, and then recreating them after the data is loaded.

It should be noted that adding an additional index to a table can be exactly what is required. However, it should be done only after considering what negatives it may cause, along with other options.
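The block-read argument behind myth 2 can be made concrete with a rough calculation. The sketch below follows the paper's simplified accounting (2 or 3 index blocks plus one table block per row fetched through a b-tree index, against one read per table block for a full scan). The table size, the rows-per-block figure and all function names are hypothetical and chosen only for illustration; a real optimizer also accounts for caching, multiblock reads and leaf-block range scans, so treat the numbers as indicative only.

```python
import math


def full_scan_block_reads(total_rows: int, rows_per_block: int) -> int:
    """Blocks read when every table block is visited exactly once."""
    return math.ceil(total_rows / rows_per_block)


def btree_block_reads(selected_rows: int, index_depth: int = 3) -> int:
    """Blocks read when each selected row costs one index traversal
    (index_depth blocks) plus one table block read via its rowid."""
    return selected_rows * (index_depth + 1)


def break_even_fraction(rows_per_block: int, index_depth: int = 3) -> float:
    """Fraction of the table below which the index path reads fewer blocks
    than the full scan, under the same simplified model."""
    return 1 / (rows_per_block * (index_depth + 1))


# Hypothetical table: 36 monthly snapshots of 100,000 rows, 100 rows per block.
total_rows = 36 * 100_000
rows_per_block = 100
selected_rows = total_rows // 3  # the most recent year of snapshots

scan_reads = full_scan_block_reads(total_rows, rows_per_block)
index_reads = btree_block_reads(selected_rows)

print(f"full table scan : {scan_reads:>9,} block reads")
print(f"b-tree index    : {index_reads:>9,} block reads")
print(f"break-even at   : {break_even_fraction(rows_per_block):.2%} of rows")
```

Under these assumptions the full scan reads 36,000 blocks while the index path reads 4,800,000, and the index only wins when well under 1% of the rows are needed. Changing `rows_per_block` or `index_depth` shifts the break-even point, which is why the right choice depends on the particular table and query.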
Application Design Tips

1. Use a Dimensional Star Schema Design.

This is described in many texts and is a common standard for data warehouse design, so I won’t describe it here (refer to The Data Warehouse Toolkit by Ralph Kimball). It is a simple design which is easy to model and performs well for queries.

2. Generate artificial keys for dimension table joins.

This can greatly simplify the development of queries over the star schema, because all of the table joins are reduced to single column joins and nulls have been given a value. Performance can also be better, as any index on the key contains just a single column.

3. Partition tables that are likely to grow large.

This can have significant benefits for both query performance and load performance. Range partitioning by month is often the best for both scenarios in a data warehouse, for the following reasons:

Queries often have date ranges or restrictions, so partitioning by a date field lends itself to query performance in many cases.

As data loads are generally done over only the most recent data, or recently changed data, the data load can be written to work only on the related partitions. This can have significant benefits for very large tables where indexes are dropped before the load and rebuilt after. If local indexes are used, then the indexes on the unaffected partitions can remain untouched.

Another big advantage is that often only a certain “window” of data is required to be kept. In these cases the old partitions can be dropped instantly, rather than expensive deletes being required.

4. Use bitmap indexes on fact tables for dimension table joins.

Dimensions are generally orders of magnitude smaller than the fact tables that they relate to. This means that there are generally a large number of like dimension keys in the fact table. Bitmap indexes are ideally suited to this type of join. In rare cases where a large dimension contains almost as many rows as the related fact table (i.e.
the dimension key values are relatively unique in the fact table), a standard b-tree index may perform better.

5. Use materialized views to create generic aggregate tables.

Materialized views, when used in conjunction with the query rewrite feature in Oracle, can significantly improve query performance. If the materialized view is kept generic then it can potentially improve any number of queries. An example of this might be a sales fact table that shows sales by customer, product and month. A materialized view which selects all the columns from this table except product, aggregating sales across products, could then be automatically used by the optimizer for any query which does not reference product (e.g. “Who is my best customer?”).

Initial System Setup

Large block size

Usually set to 32 KB for Oracle. The database is generally dealing with large data sets and large transactions, so performance is improved by having a large block size. There may be some advantage in having a tablespace with a smaller block size for any particularly small dimensions which exist.

Large pga_aggregate_target (or sort_area_size)

Sorts on disk are expensive, and a data warehouse is generally dealing with much larger data sets than a transactional system. There may be some advantage in setting workarea_size_policy to manual within a job, and then setting large values for the sort_area_size, hash_area_size and bitmap_merge_area_size parameters. These parameters can all be modified using the “alter session” command.

Use a smaller number of large rollback segments rather than a lot of little ones

“Snapshot too old” errors are common when the initial size of the rollback segments is too small.

Use local, temporary and read only tablespaces where applicable

Local (locally managed) tablespaces in particular can improve the speed of data loads, because a lot of extent management may be required during the dropping and creating of indexes before and after the load.
For the same reason uniform extent sizes within a particular tablespace can lead to a significant overall disk space saving. Having 3 different uniform extent sizes (small, medium and large) may be useful.

Use the Keep buffer pool

This is particularly useful for dimensions and will give significant performance benefits over a single very large buffer pool. The reason for this becomes clear when you consider the type of tables in a data warehouse. Dimension tables are generally small, and the same dimensions are constantly referenced by different queries. Fact tables, by contrast, can be very large and are often accessed in isolation. If you have your fact tables in the same buffer pool as your dimension tables then the dimension tables may be getting constantly aged out. A keep buffer pool large enough to hold all your dimension tables and the indexes on them is a good idea. You can then assign your dimensions and dimension indexes to the keep pool using the “alter table” or “alter index” commands.

Table Statistics

Analyze database tables using the dbms_stats database package. This allows the database to use the cost based optimizer and all the Oracle performance features which have been introduced since Oracle 7. Table statistics also provide much more accurate statement cost details when doing explain plans.

Diagnosing Existing Performance Problems

Statspack

Oracle Statspack is a great utility for diagnosing existing performance problems, and it is shipped free with the database. It works by taking snapshots of various performance metrics and storing them in database tables. The supplied Statspack report can then be used to give a picture of what has happened between any two snapshots taken. For a new system, or a system experiencing performance issues, running a Statspack snapshot at 6 hour intervals can be particularly useful.
The report can be produced at a number of different levels of detail, but the beginning of the report in all cases provides a useful overview of system performance statistics, such as shared pool statistics, buffer pool statistics and IO metrics. Also provided is a formatted summary of the System Wait Events from the v$system_event view.

System Wait Events

These are often overlooked, but can generally provide a very quick indication of where the bottlenecks are in a system. They are usually the best place to start. Searching Oracle Metalink for a particular wait event from Statspack will generally return a useful note which not only describes what the event is in some detail, but also provides suggestions as to how to reduce the waits for the event. An example of the wait events shown by Statspack is given in Appendix 1.

Session Wait Events

If a specific process or time of day is causing performance issues then Session Wait Events from the v$session_wait view can be particularly useful. If looking in real time, this view can be queried several times for a particular SID, which gives a good indication of where most of the time is going. When trying to look at what the issues were for historical periods of bad performance, the Statspack report can provide a summary of the Session Waits occurring at the time of a snapshot.

Tuning Individual Statements

Tuning system parameters can have a dramatic effect on systems where there is a fundamental setup issue. However, this is not usually the case. Most performance problems can be attributed to poorly designed or poorly written SQL statements. Selecting from v$sqlarea at any particular time and ordering by buffer_gets or disk_reads descending can provide an instant picture of what the most resource intensive statements in the system are.
Tuning the poorly performing statements can often have the biggest impact on overall system performance, because it can free up a lot of resources that may otherwise have been causing a bottleneck.

Hints

SQL hints are a great way of changing the execution plan of a statement to make it perform more efficiently. There are a large number of different types of hints available, so I won’t go into detail on them here. Using an explain plan utility you can quickly and easily try different hints and see the effect that they have on the execution path of the statement.

Rewrite the Statement

The performance of a statement can change dramatically when it is rewritten to return the same result set via a different method. An example of this may be changing a statement which uses a “where not in” clause to “where not exists”, or to a “minus”. Each of these methods could be made to return the same result set, but may do it via quite different execution paths.

Add Indexes

While over-indexing tables is usually not a good idea, adding an index is still often the best way to improve query performance. Generally, if the column or combination of columns which is used for a join or lookup is unique, or very selective, then adding an index on it should not have any adverse effects on any other queries. If the column or columns are not very selective then an index on them, potentially resulting in a large range scan, can often perform worse. If there are other indexes on the table where range scans are performed then other queries may also change to use any additional index created.

Conclusions

There are a number of common mistakes made with regard to data warehouse system setup. Many can be attributed to myths (or misunderstandings) about the role of the data warehouse and its requirements for performing that role. Application design is one of the keys to avoiding performance issues later on.
Generally, for a data warehouse, a dimensional star schema provides a robust and simple design which performs well for data warehouse type queries. Following good practices when purchasing and setting up a data warehouse system is also important.

If performance problems do occur then Oracle provides a large number of tools for diagnosing and solving these problems within the database. Key among these are Statspack, the System and Session Wait views and the explain plan utility. Most problems stem from poorly performing individual SQL statements, and the biggest performance gains can usually be obtained by tuning individual statements. Oracle’s query hints are particularly useful for this.

Performance is something that should be considered and designed in from the beginning of any system implementation. When issues occur it is generally best to follow a pragmatic approach to diagnosing what is causing the problems, rather than just spending money on more powerful hardware.

References

Oracle Metalink: http://metalink.oracle.com
Oracle 9i Database Utilities, Release 2 (9.2), March 2002, Oracle Corporation.
Oracle 9i Database Performance Tuning Guide and Reference, Release 2 (9.2), March 2002, Oracle Corporation.
The Data Warehouse Lifecycle Toolkit, Ralph Kimball et al., 1998, John Wiley and Sons.
Appendix 1: Statspack System Wait Events

Wait Events for DB: DWT  Instance: DWT  Snaps: 283 -327
-> s  - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)

                                                     Total Wait   Avg wait   Waits
Event                               Waits   Timeouts   Time (s)       (ms)    /txn
---------------------------- ------------ ---------- ---------- ---------- -------
PL/SQL lock timer                  12,604     12,537    194,811      15456     0.0
db file sequential read         9,250,844          0     67,709          7    29.8
db file parallel write            127,206          0     56,831        447     0.4
db file scattered read          4,480,652          0     54,684         12    14.4
SQL*Net message from dblink     1,926,862          0     39,846         21     6.2
direct path read                2,325,280          0     26,306         11     7.5
direct path write               2,300,831          0     22,017         10     7.4
free buffer waits                  23,111     20,728     21,509        931     0.1
SQL*Net more data from dblin    7,139,501          0     17,017          2    23.0
async disk IO                     200,718          0      7,305         36     0.6
log file parallel write         1,031,109          0      5,634          5     3.3
write complete waits                4,502      4,215      4,291        953     0.0
enqueue                             2,624      1,181      3,862       1472     0.0
log file sequential read          102,660          0      3,641         35     0.3
local write wait                    4,301      2,914      3,018        702     0.0
buffer busy waits                  58,778        607      1,943         33     0.2
log file sync                     109,580         93      1,849         17     0.4
PX Deq: Execute Reply               1,703        120        763        448     0.0
control file parallel write        34,068          0        732         21     0.1
log file switch completion          2,639        286        496        188     0.0
log buffer space                    2,957         55        445        151     0.0
latch free                        179,875      7,075        390          2     0.6
row cache lock                      5,052         69        263         52     0.0
control file sequential read       58,803          0        259          4     0.2
single-task message                   363          0        140        385     0.0
db file parallel read               4,417          0        117         26     0.0
inactive session                       60         60         59        986     0.0
library cache load lock               134          4         16        121     0.0
process startup                       240          0         12         51     0.0
SQL*Net more data to client       132,054          0         10          0     0.4
LGWR wait for redo copy            66,473         73          6          0     0.2
log file single write                 708          0          3          4     0.0
PX Deq Credit: send blkd           12,955          0          3          0     0.0
SQL*Net message to dblink       1,926,862          0          3          0     6.2
library cache pin                     461          0          3          6     0.0
PX Deq: Table Q Get Keys              190          0          1          4     0.0
SQL*Net break/reset to clien          991          0          1          1     0.0
PX Deq: Parse Reply                   224          0          0          2     0.0
PX Deq Credit: need buffer          7,416          0          0          0     0.0
PX Deq: Signal ACK                     79          2          0          2     0.0
kksfbc child completion                 9          9          0         13     0.0
PX Deq: Join ACK                      231          0          0          0     0.0
undo segment extension            119,908    119,904          0          0     0.4
wait list latch free                    3          0          0         18     0.0
SQL*Net break/reset to dblin            4          0          0          9     0.0
slave TJ process wait                   1          1          0         18     0.0
PX Deq: Table Q qref                   69          0          0          0     0.0
kkdlgon                                 1          0          0          2     0.0
SQL*Net more data to dblink            14          0          0          0     0.0
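
As one possible way of reading a report like this, the sketch below takes the first eight rows of the appendix (event, waits and total wait time in seconds, copied from the report), recomputes the average wait in milliseconds and shows each event's share of the time across those eight rows only. The function and variable names are hypothetical, and the sketch makes no attempt to decide which events should be treated as idle; that judgement is left to the reader and the relevant Metalink notes.

```python
# (event, waits, total wait time in seconds) for the first eight rows of Appendix 1
TOP_EVENTS = [
    ("PL/SQL lock timer", 12_604, 194_811),
    ("db file sequential read", 9_250_844, 67_709),
    ("db file parallel write", 127_206, 56_831),
    ("db file scattered read", 4_480_652, 54_684),
    ("SQL*Net message from dblink", 1_926_862, 39_846),
    ("direct path read", 2_325_280, 26_306),
    ("direct path write", 2_300_831, 22_017),
    ("free buffer waits", 23_111, 21_509),
]


def avg_wait_ms(waits: int, total_seconds: int) -> float:
    """Average wait per event occurrence, in milliseconds."""
    return total_seconds * 1000 / waits


def summarize(events):
    """Return (event, rounded avg wait ms, % of listed wait time),
    ordered by total wait time descending."""
    listed_total = sum(total for _, _, total in events)
    ordered = sorted(events, key=lambda e: e[2], reverse=True)
    return [
        (name, round(avg_wait_ms(waits, total)), 100 * total / listed_total)
        for name, waits, total in ordered
    ]


rows = summarize(TOP_EVENTS)
for name, avg_ms, share in rows:
    print(f"{name:<28} {avg_ms:>7} ms {share:>6.1f}%")
```

The recomputed averages agree with the report's "Avg wait (ms)" column. An event with few waits but a high average (such as free buffer waits here) points to a different kind of problem than one with millions of short waits (such as db file sequential read), which is why both columns are worth looking at.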