Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Entity–attribute–value model wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Concurrency control wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Relational model wikipedia , lookup
Clusterpoint wikipedia , lookup
Database model wikipedia , lookup
Chapter 11 Group Functions (up to p.420) Jason C. H. Chen, Ph.D. Professor of MIS School of Business Gonzaga University Spokane, WA 99258 USA [email protected] Dr. Chen, Oracle Database System (Oracle) 1 Objectives • Differentiate between single-row and multiple-row functions • Use the SUM and AVG functions for numeric calculations • Use the COUNT function to return the number of records containing non-NULL values • Use COUNT(*) to include records containing NULL values • Use the MIN and MAX functions with nonnumeric fields Dr. Chen, Oracle Database System (Oracle) 2 Objectives (continued) • Determine when to use the GROUP BY clause to group data • Identify when the HAVING clause should be used • List the order of precedence for evaluating WHERE, GROUP BY, and HAVING clauses • State the maximum depth for nesting group functions • Nest a group function inside of a single-row function Dr. Chen, Oracle Database System (Oracle) 3 Objectives (continued) • Calculate the standard deviation and variance of a set of data, using the STDDEV and VARIANCE functions • Explain the concept of multidimensional analysis • Perform enhanced aggregation grouping with the GROUPING SETS, CUBE, and ROLLUP • Use composite columns and concatenated groupings in grouping operations Dr. Chen, Oracle Database System (Oracle) 4 Refresh the Database • 1. Download chapter 11 files from Bb to c:\oradata\chapter11\ • 2. Run the following script file – Start c:\oradata\chapter11\JLDB_Build_11.sql Dr. Chen, Oracle Database System (Oracle) 5 Group Functions • Return one result per group of rows processed • Are also called multiple-row and aggregate functions • All group functions ignore NULL values except COUNT(*) • Use DISTINCT to suppress duplicate values Dr. Chen, Oracle Database System (Oracle) 6 Added Clauses Figure 11-1 SELECT statement syntax Dr. Chen, Oracle Database System (Oracle) 7 SUM Function • Calculates total amount stored in a numeric column for a group of rows Figure 11-2 Dr. Chen, Oracle Database System (Oracle) Using the SUM function to calculate order profit 8 AVG Function • Calculates the average of numeric values in a specified column Figure 11-4 Using the AVG function to calculate average profit Dr. Chen, Oracle Database System (Oracle) 9 Figure 11-4 Using the AVG function to calculate average profit -- chapter 11, Figure 11-5(B); p.407 SELECT TO_CHAR(AVG( retail - cost), '$999.99') "Average Profit" FROM books WHERE category = 'COMPUTER'; Dr. Chen, Oracle Database System (Oracle) 10 COUNT Function • Two purposes – Count non-NULL values – Count total records, including those with NULL values Dr. Chen, Oracle Database System (Oracle) 11 COUNT Function – Non-NULL Values • Include column name in argument to count number of occurrences Figure 11-9 Dr. Chen, Oracle Database System (Oracle) Using the COUNT function with the DISTINCT option 12 COUNT Function – NULL Values • Include asterisk in argument to count number of rows Figure 11-11 Dr. Chen, Oracle Database System (Oracle) Using the COUNT(*) function to include NULL values 13 MAX Function • Returns largest value Figure 11-13 Dr. Chen, Oracle Database System (Oracle) Using the MAX function on numeric data 14 MIN Function • Returns the smallest value Figure 11-16 Dr. Chen, Oracle Database System (Oracle) Using the MIN function on date data 15 Datatypes • The COUNT, MIN, and MAX functions can be used on values with character, numeric, and date datatypes Dr. Chen, Oracle Database System (Oracle) 16 Grouping Data • GROUP BY clause – Used to group data – Must be used for any individual column in the SELECT clause with a group function – Cannot reference column aliases Dr. Chen, Oracle Database System (Oracle) 17 GROUP BY Example Figure 11-18 Adding the GROUP BY clause Dr. Chen, Oracle Database System (Oracle) 18 Common Error • A common error is missing a GROUP BY clause for nonaggregated columns in the SELECT clause Figure 11-17 Flawed query: Including both aggregate and nonaggregate columns requires a GROUP BY clause Dr. Chen, Oracle Database System (Oracle) 19 It is an individual record (just “Profit” not “Highest Profit”) Figure 11-19 Inappropriate use of GROUP BY Dr. Chen, Oracle Database System (Oracle) 20 Figure 11-20 Calculate the total amount due by each customer and order Dr. Chen, Oracle Database System (Oracle) 21 Restricting Aggregated Output • HAVING clause serves as the WHERE clause for grouped data. • It is used to eliminate certain groups from further consideration. 4 records are eliminated Dr. Chen, Oracle Database System (Oracle) Figure 11-21 Using a HAVING clause to restrict which groups are displayed 22 Restricting Aggregated Output (continued) • When included in the same SELECT statement, the clauses are evaluated in the order of: – WHERE – GROUP BY – HAVING Dr. Chen, Oracle Database System (Oracle) 23 Restricting Aggregated Output (continued) Figure 11-22 Using the WHERE, GROUP BY, and HAVING clauses Dr. Chen, Oracle Database System (Oracle) 24 Figure 11-23 Using a HAVING clause to restrict grouped output Dr. Chen, Oracle Database System (Oracle) 25 Figure 11-24 Filtering correctly with the WHERE and HAVING clauses Dr. Chen, Oracle Database System (Oracle) 26 It is quite inefficient and considered poor SQL programming practice. The statement must process ALL rows in the BOOKS table with the aggregated calculation and then eliminate categories. Figure 11-25 Filtering incorrectly with the HAVING clauses Dr. Chen, Oracle Database System (Oracle) Figure 11-24 Filtering correctly with the WHERE and HAVING clauses 27 Nesting Functions • Inner function is resolved first • Maximum nesting depth: 2 Figure 11-26 Nesting group functions Dr. Chen, Oracle Database System (Oracle) 28 Exercises • Practice all the examples in the text. • A Script file is available on the Bb (file name: ch11Queries.sql) • After completing all examples, do the HW. In-class Exercise • #7 (p.447) Dr. Chen, Oracle Database System (Oracle) 7. SELECT customer#, COUNT(*) FROM orders GROUP BY customer#; 29 Homework - Hands-On Assignments Upload the spooled file (*.txt) to the Bb (under “Assignments & Projects”) by the deadline. Read and Practice all examples on Chapters 11 • 1. Run the script files (in the folder \oradata\chapter11\): JLDB_Build_11.sql • 2. Read Oracle assignment and create a script file Oracle_ch11_Lname_Fname.sql for questions (ALL EVEN problems; pp.446-448) on “Hands-on Assignments”. Use appropriate COLUMN or other SQL commands to produce readable outputs (or your grade will be discounted –see a sample output on the Bb) • 3. Execute and test one problem at a time and make sure they are all running successfully. • 4. When you done, spool the script files (see next slide for spooling instructions) and UPLOAD the source (*.sql) and spooled files (Oracle_ch11_Spool_Lname_Fname.txt) to Bb by the midnight before the next class. • 5. Turn in a hardcopy (*.txt) next class. Dr. Chen, Oracle Database System (Oracle) 30 How to Spool your Script and Output Files After you tested the script file of Oracle_ch11_Lname_Fname.sql successfully, follow the instructions below to spool both script and output files: Step 0. Run the following script file from SQL*Plus (since you have created JLDB tables) – Start c:\oradata\chapter11\JLDB_Build_11.sql • 1. type the following on SQL> – Spool c:\oradata\Oracle_ch11_Spool_Lname_Fname.txt (make sure your name is entered) • 2. open Oracle_ch11_Lname_Fname.sql that you already tested • 3. copy and paste all the SQL commands (including all comments) to the SQL*PLUS • 4. type Spool Off on the SQL> The output should contain your personal information, all SQL commands and their solution on the .txt file and saved in C: drive (oradata\ folder) Upload the spooled file (*.txt) to the Bb (under “Assignments & Projects”) by the deadline. Dr. Chen, Oracle Database System (Oracle) 31 Summary • The AVG, SUM, STDDEV, and VARIANCE functions are used only with numeric fields • The COUNT, MAX, and MIN functions can be applied to any datatype • The AVG, SUM, MAX, MIN, STDDEV, and VARIANCE functions all ignore NULL values • By default, the AVG, SUM, MAX, MIN, COUNT, STDDEV, and VARIANCE functions include duplicate values Dr. Chen, Oracle Database System (Oracle) 32 Summary (continued) • The GROUP BY clause is used to divide table data into groups • If a SELECT clause contains both an individual field name and a group function, the field name must also be included in a GROUP BY clause • The HAVING clause is used to restrict groups in a group function • Group functions can be nested to a depth of only two. The inner function is always performed first, using the specified grouping. The results of the inner function are used as input for the outer function. Dr. Chen, Oracle Database System (Oracle) 33 Summary (continued) • The STDDEV and VARIANCE functions are used to perform statistical analyses on a set of data • GROUPING SETS operations can be used to perform multiple GROUP BY aggregations with a single query • The CUBE extension of the GROUP BY calculates aggregations for all possible combinations or groupings of columns included • The ROLLUP extension of the GROUP BY calculates increasing levels of accumulated subtotals for the column list provided • Composite columns and concatenated groupings can be used in GROUPING SETS, CUBE, and ROLLUP operations • The GROUP_ID function helps eliminate duplicate grouping results Dr. Chen, Oracle Database System (Oracle) 34 STDDEV Function Figure 11-27 Dr. Chen, Oracle Database System (Oracle) Using the STDDEV function 35 VARIANCE Function • Determines data dispersion within a group Figure 11-28 Dr. Chen, Oracle Database System (Oracle) Using the VARIANCE function 36 Enhanced Aggregation for Reporting • Oracle provides extensions to the GROUP BY clause, which allow both aggregation across multiple dimensions or the generation of increasing levels of subtotals with a single SELECT statement • A dimension is a term used to describe any category used in analyzing data, such as time, geography, and product line • Each dimension could contain various levels of aggregation; for example, the time dimension may include aggregation by month, quarter, and year; the product dimension may include product type, sales, store, region and month. Dr. Chen, Oracle Database System (Oracle) 37 Figure Slicing a data cube REGION CUSTOMER Dr. Chen, Oracle Database System (Oracle) 38 Excel Pivot Table Example Figure 11-30 A pivot table with two dimensions on a row Dr. Chen, Oracle Database System (Oracle) 39 Excel Pivot Table Example (continued) Figure 11-31 A pivot table with one row and one column dimension Dr. Chen, Oracle Database System (Oracle) 40 GROUPING SETS • The grouping sets expression is the component on which the other GROUP BY extensions, ROLLUP and CUBE, are built. • With this extension, you can use a single query statement to perform multiple GROUP BY clauses. • The single query in Figure 11-32 produces the average retail price for books in four groupings: 1) publisher (the Name column) and category, 2) category, 3) publisher, and 4) overall average. -- chapter 11, Figure 11-32; p.427 Try without SELECT name, category, COUNT(isbn), GROUING TO_CHAR(AVG(retail), '99.99') "Avg Retail" SETS FROM publisher JOIN books USING (pubid) WHERE pubid IN (2,3,5) GROUP BY GROUPING SETS (name, category, (name, category), ()) Dr. Chen, Oracle Database System (Oracle) 41 Without GROUPING SETS -- chapter 11, Figure 11-32(a); p.427 SELECT name, category, COUNT(isbn), TO_CHAR(AVG(retail), '99.99') "Avg Retail" FROM publisher JOIN books USING (pubid) WHERE pubid IN (2,3,5) GROUP BY name, category; Dr. Chen, Oracle Database System (Oracle) 42 Grouping Sets Figure 11-32 Using a GROUPING SETS expression in a GROUP BY clause Dr. Chen, Oracle Database System (Oracle) 43 The CUBE Extension • The CUBE extension of GROUP BY instructs Oracle to perform aggregations for all possible combinations of the specified columns (e.g., two columns: Name (publisher name) and Category). • The outputs matches Figure 11-32. • If you need only a subset of the four aggregate levels calculated, you must use the GROUPING SETS expression because the CUBE extension always performs all aggregation levels. • Adding a GROUPING function to the CUBE extension (Fig. 11-35) to identify subtotal rows in the results (helpful in labeling sorting, and restricting output). Dr. Chen, Oracle Database System (Oracle) 44 CUBE Figure 11-34 Dr. Chen, Oracle Database System (Oracle) Using the CUBE extension of GROUP BY 45 Figure 11-35 The GROUPING function returns a 1 to identify subtotal rows Dr. Chen, Oracle Database System (Oracle) 46 The ROLLUP Extension • The ROLLUP extension of GROUP BY calculates cumulative subtotals for the specified columns. • If multiple columns are indicated, subtotals are performed for each column in the argument list, except the one on the far right. A grand total is also calculated. Dr. Chen, Oracle Database System (Oracle) 47 ROLLUP Figure 11-37 Dr. Chen, Oracle Database System (Oracle) Using the ROLLUP extension of GROUP BY 48 Category is the column outside the ROLLUP and is considered the aggregate value. A subtotal is calculated for the aggregate value as well as for each unique value of the ROLLUP column in the aggregate value – i.e., by Category and each Name in each Category . Figure 11-38 Dr. Chen, Oracle Database System (Oracle) Using a partial ROLLUP 49