Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Entity–attribute–value model wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Concurrency control wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Relational algebra wikipedia , lookup
Clusterpoint wikipedia , lookup
ContactPoint wikipedia , lookup
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui Basic OLAP Queries Lecturer : Dr. Pavle Mogin SWEN 432 Advanced Database Design and Implementation SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 1 Plan for OLAP Queries • Basic multidimensional OLAP queries: – Roll – up and Drill – down – Slice and Dice – Pivoting – Readings : • Ramakrishnan, Gehrke: “Database Management Systems”, Chapter 25, Sections 25.3, 25.4, and 25.6 • Chaudhuri and Dayal : “An Overview of Datawarehousing and OLAP Technologies” SWEN 432 Advanced Database Design and Implementation 2017 Multidimensional Data Analysis • OLAP brings some special basic operations • These are: – Roll – up and Drill down, – Slice and Dice, and – Pivot OLAP Queries 2 SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 3 Roll – Up and Drill - Down • Roll – up operation corresponds to taking the current aggregation level of fact values and doing a further aggregation (getting coarser grained data) • Each roll-up operation can be expressed using a SQL/92 SELECT…AGG()… GROUP BY… statement • Drill – down is just an opposite operation of roll – up, but it requires the access to data of finer granularity SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 4 Classification of Roll – Up • Roll – up operations can be classified onto: – Dimensional roll-ups that are done by dropping one or more dimensions – Hierarchical roll-ups that are done by climbing up the attribute hierarchies of dimensions – Mixed roll-ups that combine the previous two techniques • Note that in an ultimate case, the hierarchical roll-up above the top level of an attribute hierarchy (to the attribute “all”) can be viewed as converting to a dimensional roll-up SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 5 Example Star Schema • In the examples that follow, we shall consider the following star schema: – Fact Table : • Sales (ShopId, ProdId, TimeId, Total) • Total is a fact – Dimension tables: • Location(ShopId, City ), • Product(ProdId, ProdName, Type) , • Time(TimeId, Week) – Dimension attribute hierarchies: • ShopIdCity (location hierarchy) • ProdIdType (product hierarchy) • TimeIdWeek (time hierarchy) SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 6 Dimensional Roll – Ups • One dimension dropped: SELECT ProdId, ShopId, SUM(Total ) AS Total_by_Prod_Loc FROM Sales GROUP BY ProdId, ShopId ; • Two dimensions dropped: SELECT ProdId, SUM(Total ) AS Total_by_Prod FROM Sales GROUP BY ProdId ; • All dimensions dropped: SELECT SUM(Total ) AS Overall_Total FROM Sales; SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 7 Hierarchical Roll- Ups (one dimension) – One roll – up would give sales by product, city and day SELECT ProdId, City, TimeId, SUM(Total ) AS City_Total FROM Sales NATURAL JOIN Location GROUP BY City, ProdId, TimeId ; – The second roll – up would give sales by product type, shop and day SELECT Type, ShopId, TimeId, SUM(Total ) AS Type_Total FROM Sales NATURAL JOIN Product GROUP BY Type, ShopId, TimeId ; – The third roll – up Week_Total would give sales by product, shop and week SELECT ProductId, ShopId, Week, SUM(Total ) AS Week_Total FROM Sales NATURAL JOIN Time GROUP BY Week, ShopId, ProductId ; SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 8 Hierarchical Roll-Ups (2d and 3d) • Two dimensional roll-up: SELECT Type, City, TimeId, SUM(Total ) AS Type_City_Tot FROM Sales NATURAL JOIN Product NATURAL JOIN Location GROUP BY Tipe, City, TimeId ; (There would be also two other 2d roll-ups possible (Type_Week and City_Week ) but these are not shown) • Three dimensional roll-up: SELECT Type, City, Week, SUM(Total ) AS Type_City_Week_Tot FROM Sales NATURAL JOIN Product NATURAL JOIN Location NATURAL JOIN Time GROUP BY Type, City, Week ; SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 9 Roll – Up Summary • Generally, if there are k dimensions, it is possible to define more than 2 k roll - up SQL queries of the type: SELECT <attribute_list> AGG(attribute_name) FROM <list_of_tables> WHERE <list_of_conditions> GROUP BY <grouping_list> • There are exactly 2 k dimensional roll – ups • The number of hierarchical roll – ups depends on the number of levels of dimension attribute hierarchies • A simple (with no parallel branches) attribute hierarchy with n levels, gives raise to n roll-ups SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 10 Slice_and_Dice Operations • Slice operation corresponds to reducing the number of dimensions by taking a projection of facts on a proper subset of dimensions for some selected attribute values of dimensions that are being dropped • Slice operation amounts to equality select condition • Dice operation amounts to range select condition on one dimension, or to equality select condition on more than one dimension • Both terms come from visualizing the effect of these operations on a hypercube data set • Here, the term select relates to the operator of the relational algebra SWEN 432 Advanced Database Design and Implementation 2017 A Sales Hypercube Wellington Nelson Auckland January Eraser 123 Linear 23 Pencil 1230 February March 99 183 1 13 1111 2303 OLAP Queries 11 SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 12 Slice Operation Visualization Wellington Nelson Auckland January Eraser 123 Linear 23 Pencil 1230 ProdId, CityId, Amount (MonthId = January(Sales)) SWEN 432 Advanced Database Design and Implementation 2017 Dice Operation Visualization Wellington Nelson Auckland January Eraser 123 Linear 23 Pencil 1230 February 99 1 1111 ProdId, CityId, MonthId, Amount (MonthId {January, February}(Sales)) OLAP Queries 13 SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 14 SQL and Slice_and_Dice Operations • Consider Sales fact table with Product, City, and Time as dimensions: Sales (ProdId, CityId, MonthId, Amount ) • A slice operation can be described in the following way SELECT ProdId, CityId, Amount FROM Sales WHERE MonthId = ‘January’ ; • A dice operation would be: SELECT ProdId, CityId, MonthId, Amount FROM Sales WHERE MonthId =‘January’ OR MonthId = ‘February’ ; or SELECT CityId, Amount FROM Sales WHERE MonthId =‘January’ AND ProdId = ‘Pencil’ ; SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 15 Pivoting • In a fact table, each tuple corresponds to at least one measure value and there is a column for each of dimensions • The simplest view of pivoting is that it selects two dimensions to aggregate one of measures • The aggregated values are often displayed in a grid where each point in the (x, y ) coordinate system corresponds to an aggregated value of the measure • The x and y coordinate values are the values of the selected two dimensions • The result of pivoting is also called cross – tabulation SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 16 Extension of a Star Schema Location Sales CityId City CityId PerId TimId Amnt Time TimId Day 1 Well 1 1 1 230 2 Nels 1 1 2 300 1 Mon 3 Auck 1 1 8 310 2 Tue SalesPerson 1 2 7 50 3 Wed PerId Name 2 3 1 550 4 Thu 1 John 2 3 5 100 5 Fri 2 Susan 3 4 6 880 6 Sut 3 James 3 5 1 60 7 San 4 Susan 3 5 2 60 8 Mon 5 Ann 3 5 4 140 SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 17 Pivoting on City and Day Mon Auckland Tue Wed 60 60 0 Nelson 550 0 Wellington 540 1150 SubTotal Thu Fri 140 Sat San SubTotal 0 880 0 1140 0 0 100 0 0 650 300 0 0 0 0 50 890 360 0 140 100 880 50 2680 SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 18 Expressing Pivoting by SQL (body) SELECT City, Day, SUM(Amnt) AS Sales FROM Sales NATURAL JOIN Location NATURAL JOIN Time GROUP BY City, Day; City Day Sales Auckland Monday 60 Wellington Monday 540 Auckland Tuesday 60 Wellington Tuesday 300 Auckland Thursday 140 Wellington Sunday 50 Auckland Saturday 880 Nelson Monday 550 Nelson Friday 100 City Day Sales But, this is only the body of the pivoting table SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 19 Expressing Pivoting by SQL (subs&total) SELECT Day, SUM(Amnt) AS Total FROM Sales NATURAL JOIN Time GROUP BY Day; Day Total Mon 1150 Tue 360 Thu 140 Fri 100 Sat 880 San 50 SELECT City, SUM(Amnt) AS Total FROM Sales NATURAL JOIN Location GROUP BY City; City Total Auck 1140 Nels 650 Well 890 SELECT SUM(Amnt) FROM Sales; SUM(Amnt) 2680 SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 20 Summary of the Cross Tab Example SELECT City, Day, SUM(Amnt)… GROUP BY City, Day {City, Day} SELECT City … {City} {Day} GROUP BY City; SELECT Day … GROUP BY Day; {} SELECT SUM(Amnt) FROM Sales Structures like that one are often called cube lattice SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 21 Replacing Pivot Operation by SQL • To express a pivot operation involving two dimensions, we would have to issue the following SQL/92 statement: SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 22 Replacing Pivot Operation by SQL (SELECT City, Day, SUM(Amnt) FROM Sales NATURAL JOIN Location NATURAL JOIN Time GROUP BY City, Day) UNION ALL (SELECT City, ‘ ’, SUM(Amnt) FROM Sales NATURAL JOIN Location GROUP BY City) UNION ALL (SELECT ‘ ’, Day, SUM(Amnt) FROM Sales NATURAL JOIN Time GROUP BY Day) UNION ALL (SELECT ‘ ’, ‘ ’, SUM(Amnt) FROM Sales); SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 23 A Relational Representation of Pivoting City Day Sales Auckland Monday 60 Auckland Auckland Tuesday 60 Nelson 650 Auckland Thursday 140 Wellington 890 Auckland Saturday 880 Monday 1150 Nelson Monday 550 Tuesday 360 Nelson Friday 100 Thursday 140 Wellington Monday 540 Friday 100 Wellington Tuesday 300 Saturday 880 Wellington Sunday 50 City Day Sales 1140 Sunday 50 2680 SWEN 432 Advanced Database Design and Implementation 2017 OLAP Queries 24 Summary of Basic OLAP Operations OLAP operation Use of attribute hierarchy No of dimensions Use of aggregate functions Use of May be replaced by Roll-up Y Same or smaller Y N SELECT… AGG GROUP BY Drill-down Y Same or greater May use N SELECT… AGG GROUP BY Slice N Smaller N Y SELECT… WHERE cond Dice N May be Smaller N Y SELECT… WHERE cond Pivot / cross tab Y Smaller Y (on many Subsets) N Multiple roll-up