Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Adamson & Venerable Chapter 2 & working out a Homework 5/6 Solution Transforming Relational Databases into Dimensional Diagrams Spring 2012 Dimensional modeling in Sales • In a DW designed to analyze SALES data, important component of a dimensional model is the Product Dimension. • Product dimension includes important characteristics used to differentiate the product in the marketplace, called Discriminators. Key Business Term: Discriminators • Discriminators: Descriptive characteristics of a product that further describe it and are relevant to purchasing decisions. Tracking discriminators allows the business analyst to monitor performance of various product styles, influencing production and marketing plans. • Discriminators for a men’s suit: Cloth, color, style/cut, weight, size • Discriminators for vehicles: Model name, model styling package, line, category, exterior color, model year, interior color Other Dimensions in Sales • Time dimension: Time key, month, day, date, year, day of week, quarter. • Customer_Demographic dimension: This does not require a row for each customer, but groups customers by different combinations of age, gender, income, and geography. The degree of demographic segmentation varies by industry. • Dealer dimension: Data on dealer performance are needed, to support decisions on which dealers should be eased out of business. • Method_Of_Payment dimension (lease, financing options, etc.) Fact Table: Storing derived facts • A commonly used derived fact should be stored, and not calculated in reports and queries. Cutting such “redundant” key measures from the fact table results in the following: 1. Development of reports gets more complex 2. Increased potential for errors in reports 3. Increased documentation requirements 4. A Hundred dollars’ worth of disk space is saved (40 MB of space savings for a 10-million row fact table) Transformation Stages (Key for Homework #5/6) • De-normalization Process – Start with Normalized Tables – Determine Dimensions and Fact Tables – Delete Relationships • Rebuild Tables • Rebuild Relationship Diagram as Star Diagram, a.k.a., Dimension Table • Provides information needed to complete Homework 6, too! Premiere Products ERD Redrawn to form most likely Star Diagram Delete Relationships Rebuild Tables • Using copied operations database • Be sure all ops. data is saved and backed up … multiple times. • Data staging & cleansing – Denormalize extra relationships • • • • – – – – – Order:OrderLine; Customer:SalesRep; Order:OrderLine OrderDetail Customer:SalesRep OrderDetail Transform data for new tables in Access: Make Table Export data files, if needed, to rebuild elsewhere (Excel) Additional Transformations as needed in Excel Create the Time dimension Re-Import data files to new tables Order_OrderLine Query Restructured Data -- Stage 1 Preparing Rep:Customer Join to Order_OrderLine OrderDetail Query Order of Columns Not critical Sort order not critical but Good time to revise Make-Table OrderDetail Query OrderDetail Table Set Primary Keys Data Cleansing Fix dates Add Indexes Build Star Diagram Fact Table What’s Missing? Time Dimension Use SQL to eliminate Redundant dates Built-in Functions Built-in Date/Time Functions Excel Time Table Create Time table using Excel Formulas: 1. Fill 2. Copy/paste 3. Compute Quarter 4. Conversion Formulas 5. Etc. Import Time Table to Access Time Table w/Indexes Premier Products Star Diagram PP -- Relation List Fact Table • OrderDetail[OrderNum, PartNum, OrderDate, CustNo, RepNo, NumOrdered, Price] Dimension Tables • Customer[CustNum, CustName, Street, City, State, Zip, Balance, CreditLimit] • Rep[RepNum, LastName, FirstName, Street, City, State, Zip, Commission, Rate] • Part[PartNum, Desc, OnHand, Class, Warehouse, Price] • Time[TimeKey, Day_of_Week, Month, Year, JulianDate, Quarter, etc.] Replacing OrderDate with Time_key Tools/Analyze/Tables Tools/Analyze/Tables Tools/Analyze/Tables Tools/Analyze/Documenter Tools/Analyze/Documenter CUSTOMER table http://www.webopedia.com GUID Last modified: Thursday, June 20, 2002 • Short for Globally Unique Identifier, a unique 128-bit number that is produced by the Windows OS or by some Windows applications to identify a particular component, application, file, database entry, and/or user. For instance, a Web site may generate a GUID and assign it to a user's browser to record and track the session. A GUID is also used in a Windows registry to identify COM DLLs. Knowing where to look in the registry and having the correct GUID yields a lot information about a COM object (i.e., information in the type library, its physical location, etc.). Windows also identifies user accounts by a username (computer/domain and username) and assigns it a GUID. Some database administrators even will use GUIDs as primary key values in databases. • GUIDs can be created in a number of ways, but usually they are a combination of a few unique settings based on specific point in time (e.g., an IP address, network MAC address, clock date/time, etc.). Tools/Analyze/Documenter OrderDetail table Henry Books ERD (before) BookInventory Henry Books ERD (before) Update Query for Inventory Table [Physical Inventory Date] Inventory Table: Date Updated BookInventory Fact Table Note: 2 Time Keys [Pick one] BookInventory Fact Table Note: 2 Time Keys [Pick one] BookInventory Fact Table Note: 2 Time Keys [Pick one] BookInventory Fact Table Note: 2 Time Keys [Pick one] Henry Books Star Diagram HB -- Relation List Fact Table • BookTrans[BookCode, AuthorNum, BranchNum, Sequence, PubCode, OnHand, InventoryDate] Dimension Tables • Book[BookCode, Title, Type, Price, Paperback] • Publisher[PubCode, PubName, City, …State, Zip] • Author[AuthorNum, LastName, FirstName, …] • Branch[BranchNum, BranchName, Location, Employees] • Time[TimeKey, Day_of_Week, Month, Year, JulianDate, Quarter, etc.] Summary • Complete Transformations • How normal are the resulting tables? – 1NF, 2NF, 3NF? • Document Transformation maps • Prepare for Appending Tables with new data • Tools/Analyze/Documenter (next time)