* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download disk i/o`s - Progress Community
Microsoft Jet Database Engine wikipedia , lookup
Relational model wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Tandem Computers wikipedia , lookup
Clusterpoint wikipedia , lookup
Object-relational impedance mismatch wikipedia , lookup
DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications John Goodson Vice President, DataDirect, R&D Agenda Topic: Understanding how drivers work and what factors influence performance Topic: Guidelines for improving performance Topic: Review 2 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications At The End, You Should Understand: How driver implementations trade off performance for functionality How to design for performance How to avoid pitfalls in some drivers 3 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications ODBC Provides a common API to all data stores Cross platform Initially a Microsoft standard based on a draft pre-release of the X/Open SQL Access Group CLI ODBC 3.0 aligns with the ISO and ANSI CLI standards 4 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications JDBC JDBC is a Java™ interface for accessing SQL data sources Based on ODBC, Open Group SQL CLI, and ANSI/ISO CLI JDBC 1.0 designed for basic usage with emphasis on ease of use JDBC 2.0 enhanced for advanced features and server-side use JDBC 3.0 goal was to “round out” API by providing missing functionality 5 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Performance Issues Developing code that works is hard enough without having to make it work fast! Tough to know all the specifics of a single backend No exceptions are thrown to say your code is running too slow Most performance “help” is DBMS server focused … not middleware focused 6 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications How Do Drivers Work? Each database vendor defines a communication protocol of how the database server will communicate with clients This communication protocol is defined in terms of packets that will be transferred using a network transport (i.e. information will be encoded into a specific database defined format that will be sent over the network (tcp-ip) to/from clients) The database server only understands requests that are sent in this format 7 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications How Do Drivers Work – Part 2 Every database vendor defines their packet descriptions and their model differently – Oracle and DB2 are cursor based models – SQL Server and Sybase are streaming models – Progress OpenEdge is a cursor based model Every JDBC request will most likely result in many packets being sent to/from the database – Why? – (1) packets are fixed size and data might not fit in one packet – (2) JDBC API does not map 1:1 with database communication protocol 8 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Why Do We Care How Drivers Work? The two slowest things in a database system are – Network i/o – Disk i/o The goal is to keep applications using the CPU and not waiting on movement of data across a network or to/from a storage device. 9 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications What Will Applications Do? – – – – – – – Connect Look for data Read data Change data Delete data Make changes permanent (commit) Disconnect 10 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications What Happens When We Connect? Login credentials verified by dbms Shared memory (allocated and) assigned to connection – Depending on db between 24k and 64k Various queues updated with connection presence Connection recorded in audit files on disk Database context established Code page set Transaction information initialized Connection properties applied Initialization statements executed 11 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Connecting Connecting is the single slowest thing you can do in a well tuned application – Lots of network i/o’s … disk i/o’s … CPU cycles Using connection pooling, we can make the slowest (network i/o and disk i/o intensive) operation in your application one of the fastest (using no network i/o or disk i/o) With JDBC, Connection pooling requires use of JDBC DataSources … don’t connect via Driver Manager 12 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications What Will Applications Do? Connect Look for data Read data Change data Delete data Make changes permanent (commit) Disconnect 13 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Looking for Data – Catalog Functions/DatabaseMetaData Compared to other xDBC methods, catalog calls/DatabaseMetaData methods that return ResultSet objects are slow – getTables, getColumns, getProcedures, … xDBC drivers have to execute complex queries – and many times multiple queries – in order to return the results mandated by the xDBC specs. 14 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Getting a List of Tables What you want: a list of all tables that I have starting with the prefix NYC ResultSet WSrs = WSc.getTables (null, null, “NYC%”, null); 15 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Getting a List of Tables (part 2) What a driver thinks you want: A list of all tables, views, synonyms, system tables, aliases, or temporary tables owned by any schema in any catalog/database starting with the prefix NYC ResultSet WSrs = WSc.getTables (“cat1”, “johng”, “NYC%”, “TABLE”); 16 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Avoid Search Patterns for DatabaseMetaData Methods Using null arguments or search patterns to Catalog/DatabaseMetaData methods results in time-consuming queries and, potentially, increases network traffic due to unwanted results. Common places to see null that should be “fixed” – Schema name (most all) – Table type (getTables/SQLTables) – Catalog name (most all) 17 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Oracle SQL … getProcedures … yuck! select a.owner, a.object_name, INSTR(a.object_type, 'PROC'), '1' from sys.all_objects a where a.object_type IN ('FUNCTION','PROCEDURE') and a.status = 'VALID' union all select c.owner, c.synonym_name, INSTR(a.object_type, 'PROC'), '1' from sys.all_objects a, sys.all_synonyms c where c.table_owner = a.owner and c.table_name = a.object_name and a.object_type IN ('FUNCTION','PROCEDURE') and a.status = 'VALID' union all select distinct b.owner, CONCAT(b.package_name, '.' || b.object_name), min(b.position), max(b.overload) from sys.all_arguments b where b.package_name IS NOT NULL group by b.owner, CONCAT(b.package_name, '.' || b.object_name) union all select distinct c.owner, CONCAT(c.synonym_name, '.' || b.object_name), min(b.position), max(b.overload) from sys.all_arguments b, sys.all_synonyms c where c.table_owner = b.owner and c.table_name = b.package_name and b.package_name IS NOT NULL group by c.owner, CONCAT(c.synonym_name, '.' || b.object_name) union all select distinct c.owner, c.synonym_name, min(b.position), max(b.overload) from sys.all_arguments b, sys.all_synonyms c where c.owner = b.owner and c.table_owner=b.package_name and c.table_name=b.object_name group by c.owner, c.synonym_name 18 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications getTypeInfo Even getTypeInfo may result in a network round-trip – Do this once and cache the information 19 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Use Prepared Statement Metadata Instead of getColumns (when possible) getColumns/SQLColumns – driver sends a potentially complex SQL query to the server – SQL is preprocessed and executed on the server – driver must then obtain result set descriptions, bind results, and retrieve rowset from the server – Usually at least 3 network i/o’s (probably more) ResultSetMetaData/SQLDescribeCol – Simple query, which evaluates to no rows, sent to server – NOT EXECUTED – Only result set information obtained – Usually 2 network i/o (prepare + close) 20 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Example PreparedStatement pstmt = WScon.prepareStatement (“select * from StockPurchasePlan where 0 = 1”); ResultSetMetaData rsmd = pstmt.getMetaData(); int count = rsmd.getColumnCount; // this code would replace // getColumns (“ct1”, “johng”, // “StockPurchasePlan”, “%”); 21 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications What Will Applications Do? Connect Look for data Read data Change data Delete data Make changes permanent (commit) Disconnect 22 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Prepared Statements Many times our applications do the same things over and over again by specifying SQL that is parameterized SELECT firstname, lastname, address, phone FROM employees WHERE empid = ? UPDATE balance SET balance = balance + ? WHERE acctid = ? 23 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Prepared Statements (2) Many times prepared statements are not as efficient as executing the SQL directly (using the Statement object in JDBC/SQLExecDirect in ODBC) Use prepared statements if the query is to be executed more than twice in this session OR if it will be executed frequently by the system (JDBC) Execute directly if the query is to be executed once or twice 24 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Prepared Statement Side Effect Side effect: There is no way to control what gets put in the prepared statement pool. If you create a prepared statement, then it’s going into the pool and might remove some other statement you want to stay in the pool. 25 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Even More on Prepared Statements Prepare, Execute, Describe – Always preferred – Saves network roundtrip with most databases Prepare, Describe, Execute – Always slower 26 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Stored Procedure Invocation Use parameter markers for arguments to stored procedures … do not use literal arguments {call getFunds (?,?,?)} // good {call getFunds (1234, ‘JPG’, 53.23)} 27 © 2005 Progress Software Corporation // bad DB-15 ODBC/JDBC OpenEdge™ Applications Why Not Use Literal Arguments? The spec says not to use literals More code for the programmer, but executions can be done via RPCs (remote procedure calls) instead of executed as language events Use of RPCs improves performance by – Eliminating parameter processing on server – Parsing done on server 28 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Example: {call foo (12,’abc’)} Server gets a generic SQL packet – Parse SQL – Semantically validate Does procedure foo exist? Is foo defined to have 2 arguments? – SQL is always a Unicode character string Look up data type for foo’s parameter 1 (int) Convert ’12’ to an integer Look up data type for foo’s parameter 2 (varchar) Convert ‘abc’ to varchar – Execute 29 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Example: {call foo (?,?)} Server gets a specific stored procedure packet where the packet structure contains the arguments in the defined server format for the parameters – Execute only 30 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Understand Cursor Types Forward only cursors are always the fastest cursor type (most of the time) static/insensitive scrollable cursors will retrieve all rows for the query to the client (most of the time) keyset/sensitive scrollable cursors will retrieve all keys to the client, then each fetch (next) will result in an execute 31 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Forward-Only Cursors driver will pre-fetch rows from the server For small result sets, can fit entire contents of the result set in one network roundtrip Side effects: – First fetch will be slower than subsequent fetches for application – Cursor position is not where you “think” it is – Read Committed data might have already been read 32 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Scroll Insensitive/Static Cursors drivers will usually cache the entire contents of the result set on the client “Lazy fetch” model … only fetch the data when you have to. If you request a scroll sensitive cursor and the driver can’t support a scroll sensitive cursor, then it will downgrade to scroll insensitive Not scalable! 33 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Keyset/Scroll Sensitive Cursors drivers usually cache a set of keys to the client Example, ‘select * from table’ with Oracle – Results in the driver executing: ‘select rowid from table’ – Retrieves all rowids – Each call to SQLFetch/rs.next() will result in the driver executing ‘select * from table where rowid = ?’ and retrieving the result 34 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Use Batches Instead of Prepared Statements Bulk load of data has typically been done by preparing an INSERT statement and executing many times. Preparing an INSERT statement and executing 100 times: – 101 network round-trips Using batch queries/addBatch() to compound 100 inserts: – 1 network round-trip Network round-trips are more expensive than DBMS CPU cycles 35 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Arrays In ODBC, use SQLParamOptions to execute multiple parameterized queries with one network roundtrip Not as easy in JDBC; however, use batch queries and eventually all drivers will optimize to use parameter arrays 36 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Retrieving JDBC Data The hard work is not just positioning to the data, follow a few more guidelines when putting data into buffers Use getter methods with result column ordinals instead of with column names. String fullname = rs.getString (3); // good String fullname = rs.getString (“lname”); // bad 37 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications JDBC Retrieving Data (2) Only use getObject() unless it’s absolutely necessary. String children = (String) rs.getObject (2); // bad Int children = rs.getInt (2); 38 © 2005 Progress Software Corporation // fast DB-15 ODBC/JDBC OpenEdge™ Applications Avoid Long Data (Including LOBs) Select * from table – Easy! Most people do this! – Slow! Even if you don’t ask for column data, a performance penalty has been paid as the data has already been moved across the network to the client Particularly bad when the table contains long varchar/long varbinary/LOB data Compounded because the driver is trying to array fetch … which is might not be able to do 39 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications CLOB/BLOB Using CLOB and BLOB objects sparingly Although you might think that only parts of the LOB are being transported across the network, most database wire protocols retrofit LOBs into existing packet structures JDBC drivers emulate getClob/getBlob by retrieving the entire object locally 40 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Various other Guidelines Use getGeneratedKeys() – Saves network i/o’s Don’t call rs.last() or fetch absolute last Don’t write code that relies on the number of result rows from a query – Drivers have no idea how many rows a query will return – Drivers have to fetch all rows to know how many rows the query will return 41 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications More guidelines Understand: – setMaxRows – setMaxFieldSize Both limit network i/o by requesting that the server not send data across the wire Side effects: – setMaxRows also applies to updates/deletes in some databases – These options will persist in the connection 42 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications What Will Applications Do? Connect Look for data Read data Change data Delete data Make changes permanent (commit) Disconnect 43 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Transactions Committing transactions is expensive! – Network i/o’s – Numerous non-sequential disk i/o’s Don’t run in auto-commit mode Distributed txs are at least 4 times as slow as local transactions Use DTC/JTA only if required! 44 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications What Will Applications Do? Connect Look for data Read data Change data Delete data Make changes permanent (commit) Disconnect 45 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Summary Avoid: – Network i/o – Disk i/o Think about what drivers are doing to process your requests Stop lazy programming ... Fill in the null arguments 46 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Summary Avoid network I/O Avoid disk I/O Avoid sloppy programming 47 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Questions? 48 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications Thank you for your time! 49 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications 50 © 2005 Progress Software Corporation DB-15 ODBC/JDBC OpenEdge™ Applications