DW-2: Designing a Data Warehousing System
Hwan-Seung Yong (용환승), Ewha Womans University (이화여자대학교)
http://dblab.ewha.ac.kr/hsyong

ER Modeling
• Defines the detailed relationships
• Defines entities that are fully normalized
• Follows third normal form or greater
• Produces a complex database design
• Stores data at the lowest level of transactional detail
• Increases the number of joined tables in queries
• Is fixed in nature

Dimensional Modeling
• Defines entities that are denormalized
• Follows the relational model but violates the ER modeling rules
• Produces a simple database design that is easily understood by users
• Stores data at the lowest level of transactional detail or as summarized data
• Decreases the number of joined tables in queries
• Is extensible

Modeling a Data Warehouse
• Using a Star Schema
• Components of a Star Schema
• Using a Snowflake Schema
• Choosing a Schema

Components of a Star Schema
[Diagram: the fact table Sales_Fact in the center, surrounded by five dimension tables]
• Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey (dimensional keys, together forming the multipart key); RequiredDate, ... (measures)
• Employee_Dim: EmployeeKey, EmployeeID, ...
• Time_Dim: TimeKey, TheDate, ...
• Product_Dim: ProductKey, ProductID, ...
• Shipper_Dim: ShipperKey, ShipperID, ...
• Customer_Dim: CustomerKey, CustomerID, ...

Using a Snowflake Schema
[Diagram: the product dimension normalized into a chain of tables hanging off the fact table]
• Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, RequiredDate, ...
• Product_Dim: ProductKey, Product Name, Product Size, Product Brand ID
• Product_Brand_ID: Product Brand, Product Category ID
• Product_Category_ID: Product Category, Product Category ID

Choosing a Schema

                           Star schema    Snowflake schema
  Overall Row Count        Higher         Lower
  Model Understandability  Easier         More difficult
  Number of Tables         Fewer          More
  Query Complexity         Simpler        More complex
  Dimensional Searching    Quicker        Slower
  Bitmapped Indexing       Supports       Inhibits

Defining Dimension Characteristics
Applying Characteristics to Dimension Tables
• Define a primary key
• Use descriptive character data
• Include highly correlated description columns
Designing for Usability and Extensibility
• Minimize or avoid using codes or abbreviations
• Create columns that are useful for levels of aggregation
• Minimize the number of rows that will change over time

Identifying Dimension Hierarchies
• Consolidated hierarchy: a single Store Location table whose columns carry every level (Continent, Country, State, City, Store, ...)
• Separate hierarchy: one table per level (Continent, Country, State or Province, City, Store Number), each linked to the level above it

Defining Conventional Dimensions
• Time dimension: year, quarter, month, week, day, time
• Geographic dimension: nation, state, city, district, location
• Product dimension
• Customer dimension

Sharing Dimensions Among Other Data Marts
[Diagram: the Production, Purchasing, and Sales data marts around a Time dimension]
• One instance exists and is shared among data marts
• Multiple instances exist in individual data marts

Implementing a Star Schema
• Estimating Size of the Data Warehouse
• Creating a Database
• Creating Tables
• Creating Constraints
• Creating Indexes

Creating a Database
Using CREATE DATABASE options
• SIZE
• MAXSIZE
• FILEGROWTH
Setting database options
• Read-only: no locking
• Trunc. log on chkpt. (truncate log on checkpoint): no serious recovery
• SELECT INTO/Bulkcopy: no logging

Implementing a Star Schema (Example)

Creating Tables

Fact table:

  CREATE TABLE sales_fact_1997 (
    product_id   int   not null,
    customer_id  int   not null,
    store_id     int   not null,
    time_id      int   not null,
    store_sales  float not null,
    store_cost   float not null,
    unit_sales   real  not null )

Dimension tables:

  CREATE TABLE Product (
    product_id          int       not null,
    product_class_id    int       not null,
    product_family      char(50)  not null,
    product_department  char(50)  not null,
    product_category    char(50)  not null,
    product_subcategory char(50)  not null,
    brand_name          char(255) not null,
    product_name        char(255) not null,
    primary key (product_id) )

  CREATE TABLE Customer (
    customer_id    int       not null,
    country        char(50)  not null,
    state_province char(50)  not null,
    city           char(50)  not null,
    lname          char(100) not null,
    primary key (customer_id) )

  CREATE TABLE Store (
    store_id      int      not null,
    store_country char(50) not null,
    store_state   char(50) not null,
    store_city    char(50) not null,
    primary key (store_id) )

  CREATE TABLE Time (
    time_id   int      not null,
    the_month char(15) not null,
    quarter   char(2)  not null,
    the_year  int      not null,
    primary key (time_id) )

Data Importing Using DTS (Data Transformation Services)
[Screenshot: using the query result to specify the data]
[Screenshot: specifying the table]

Define FOREIGN KEY Constraints

  ALTER TABLE sales_fact_1997 ADD foreign key (customer_id) references Customer
  ALTER TABLE sales_fact_1997 ADD foreign key (product_id) references Product
  ALTER TABLE sales_fact_1997 ADD foreign key (time_id) references Time
  ALTER TABLE sales_fact_1997 ADD foreign key (store_id) references Store

Creating Composite Indexes

  CREATE INDEX fact ON sales_fact_1997 ( product_id, customer_id, store_id, time_id )

Transforming Data

Convert (Roman-numeral reg_id values become integers)

  Before:
  buyer_name    reg_id  Item_id  total_sales
  Adam Barr     II      32       17.60
  Sean Chai     IV      48       52.80
  Erin O’Melia  VI      9        8.82
  ...           ...     ...      ...

  After:
  buyer_name    reg_id  Item_id  total_sales
  Adam Barr     2       32       17.60
  Sean Chai     4       48       52.80
  Erin O’Melia  6       9        8.82
  ...           ...     ...      ...

Combine (buyer_first and buyer_last are merged into buyer_name)

  Before:
  buyer_first  buyer_last  reg_id  Item_id  total_sales
  Adam         Barr        2       32       17.60
  Sean         Chai        4       48       52.80
  Erin         O’Melia     6       9        8.82
  ...          ...         ...     ...      ...

  After:
  buyer_name    reg_id  Item_id  total_sales
  Adam Barr     2       32       17.60
  Sean Chai     4       48       52.80
  Erin O’Melia  6       9        8.82
  ...           ...     ...      ...

Calculate (total_sales is the price multiplied by the quantity, for example .55 × 32 = 17.60)

  Before:
  buyer_name    reg_id  Item_id  price_id  qty_id
  Adam Barr     2       1052     .55       32
  Sean Chai     4       1023     1.10      48
  Erin O’Melia  6       1069     .98       9
  ...           ...     ...      ...       ...

  After:
  buyer_name    reg_id  Item_id  total_sales
  Adam Barr     2       32       17.60
  Sean Chai     4       48       52.80
  Erin O’Melia  6       9        8.82
  ...           ...     ...      ...

Transforming Data with a Lookup Query

  Lookup table in the data warehouse (State_lookup):
  ST Abr.  State
  FL       Florida
  WY       Wyoming
  AR       Arkansas

  Source data (Customer_source):
  Name         State
  D. Smith     FL
  L. Wilson    WY
  P. Salinger  AR

  Destination after the transform (Customer_dim):
  Name         State
  D. Smith     Florida
  L. Wilson    Wyoming
  P. Salinger  Arkansas

Components of a Cube
[Diagram: a sales cube with its parts labeled]
• Dimensions: Location, Product, Time
• Members: Atlanta, Denver, Detroit (Location); Grapes, Cherries, Melons, Apples, Pears (Product); Q1, Q2, Q3, Q4 (Time)
• Cell: one Sales value at the intersection of a member from each dimension
• Levels: Day 1, Day 2, ...; Week 1, Week 2, ...; January, February, ...; Q1 to Q4
• Properties:

  Period     Start      End
  Quarter 1  July 1     September 30
  Quarter 2  October 1  December 31
  Quarter 3  January 1  March 31
  Quarter 4  April 1    June 30

Storing Cubes
• Storing in a ROLAP Structure
• Storing in a MOLAP Structure
• Storing in a HOLAP Structure
• Comparing Storage Structures

Storing in a ROLAP Structure
• Data mart or data warehouse: SQL Server, Oracle, or other relational database
• Data in OLAP environment: ROLAP data and ROLAP aggregations, both kept in the relational database

Storing in a MOLAP Structure
• Data mart or data warehouse: SQL Server, Oracle, or other
• Data in OLAP environment: MOLAP data and MOLAP aggregations

Storing in a HOLAP Structure
• Data mart or data warehouse: SQL Server, Oracle, or other
• Data in OLAP environment: ROLAP data and MOLAP aggregations

Comparing Storage Structures

                       ROLAP             HOLAP             MOLAP
  Base Data Storage    Relational table  Relational table  Cube
  Aggregation Storage  Relational table  Cube              Cube
  Query Performance    Fast              Faster            Fastest
  Storage Consumption  Low               Medium            High
  Maintenance          Low               Medium            High
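
The ROLAP column of the comparison (base data and aggregations both stored in relational tables) can be made concrete with a small sketch. It uses Python's standard sqlite3 module with an in-memory database rather than SQL Server, and the table names time_dim, sales_fact, and agg_sales_by_quarter, as well as the sample rows, are assumptions made for illustration, not part of the original slides.

```python
import sqlite3


def build_rolap_demo():
    """Build a tiny star schema plus a ROLAP-style aggregation table (in memory)."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Dimension table: the quarter column is a level useful for aggregation.
    cur.execute(
        """CREATE TABLE time_dim (
               time_id   INTEGER PRIMARY KEY,
               the_month TEXT    NOT NULL,
               quarter   TEXT    NOT NULL,
               the_year  INTEGER NOT NULL)"""
    )
    # Fact table: a dimensional key plus one measure (hypothetical, simplified).
    cur.execute(
        """CREATE TABLE sales_fact (
               product_id  INTEGER NOT NULL,
               time_id     INTEGER NOT NULL REFERENCES time_dim(time_id),
               store_sales REAL    NOT NULL)"""
    )

    # Made-up sample rows, for illustration only.
    cur.executemany(
        "INSERT INTO time_dim VALUES (?, ?, ?, ?)",
        [(1, "January", "Q1", 1997),
         (2, "February", "Q1", 1997),
         (3, "April", "Q2", 1997)],
    )
    cur.executemany(
        "INSERT INTO sales_fact VALUES (?, ?, ?)",
        [(10, 1, 17.60), (11, 1, 52.80), (10, 2, 8.82), (11, 3, 20.00)],
    )

    # ROLAP-style aggregation: precomputed once and stored as an ordinary
    # relational table, so later queries read it instead of re-joining facts.
    cur.execute(
        """CREATE TABLE agg_sales_by_quarter AS
               SELECT t.the_year, t.quarter, SUM(f.store_sales) AS total_sales
               FROM sales_fact AS f
               JOIN time_dim AS t ON f.time_id = t.time_id
               GROUP BY t.the_year, t.quarter"""
    )
    conn.commit()
    return conn


def sales_by_quarter(conn):
    """Read the stored aggregation; no join against the fact table is needed."""
    return conn.execute(
        """SELECT the_year, quarter, total_sales
           FROM agg_sales_by_quarter
           ORDER BY the_year, quarter"""
    ).fetchall()


if __name__ == "__main__":
    for year, quarter, total in sales_by_quarter(build_rolap_demo()):
        print(year, quarter, round(total, 2))
```

The trade-off matches the comparison table: the aggregation costs extra relational storage and has to be rebuilt when facts change, while a MOLAP or HOLAP engine would hold the same totals in a cube structure instead.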