Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Data Warehouse DSS Business Intelligence [email protected] BI Evolution History Legacy MIS Reports Current Decision Support Systems Business Intelligence Source: META Group Inc. 2005+ Business Performance Management •Hand coded •Report writers •OLAP •Dashboard/mining •Single system data •Joined operating data •DW •Enterprise portals •Summary metrics •Statistical metrics •Predictive metrics •Recommendations •Extreme latency •Extreme cost •Extreme ‘infoglut’ •Extreme integration Moving beyond one-way info delivery to true BPM © Minder Chen, 2004-2010 Data Warehouse - 2 BI Questions • What happened? – What were our total sales this month? • What’s happening? – Are our sales going up or down, trend analysis • Why? – Why have sales gone down? • What will happen? – Forecasting & “What If” Analysis • What do I want to happen? – Planning & Targets Source: Bill Baker, Microsoft © Minder Chen, 2004-2010 Data Warehouse - 3 BI Business Intelligence (BI) is the process of gathering meaningful information to answer questions and identify significant trends or patterns, giving key stakeholders the ability to make better business decisions. “The key in business is to know something that nobody else knows.” -- Aristotle Onassis PHOTO: HULTON-DEUTSCH COLL “To understand is to perceive patterns.” — Sir Isaiah Berlin "The manager asks how and when, the leader asks what and why." — “On Becoming a Leader” by Warren Bennis © Minder Chen, 2004-2010 Data Warehouse - 4 BI Definition • Business intelligence provides the ability to transform data into usable, actionable information for business purposes. BI requires: – Collections of quality data and metadata important to the business – The application of analytic tools, techniques, and processes – The knowledge and skills to use business analysis to identify/create business information – The organizational skills and motivation to develop a BI program and apply the results back into the business © Minder Chen, 2004-2010 Data Warehouse - 5 Business Intelligence Increasing potential to support business decisions (MIS) Making Decisions Data Presentation Visualization Techniques End User Business Analyst Data Analyst Data Mining Information Discovery Data Exploration OLAP, MDA, Statistical Analysis, Querying and Reporting Data Warehouses / Data Marts Data Sources (Paper, Files, Information Providers, Database Systems, OLTP) © Minder Chen, 2004-2010 DBA Data Warehouse - 6 Inmon's Definition of Data Warehouse – Data View • A warehouse is a – subject-oriented, – integrated, – time-variant and – non-volatile collection of data in support of management's decision making process. Source: http://www.intranetjournal.com/features/datawarehousing.html – Bill Inmon in 1990 © Minder Chen, 2004-2010 Data Warehouse - 7 Inmon's Definition Explain • Subject-oriented: They are organized around major subjects such as customer, supplier, product, and sales. Data warehouses focus on modeling and analysis to support planning and management decisions v.s. operations and transaction processing. • Integrated: Data warehouses involve an integration of sources such as relational databases, flat files, and online transaction records. Processes such as data cleansing and data scrubbing achieve data consistency in naming conventions, encoding structures, and attribute measures. • Time-variant: Data contained in the warehouse provide information from an historical perspective. • Nonvolatile: Data contained in the warehouse are physically separate from data present in the operational environment. © Minder Chen, 2004-2010 Data Warehouse - 8 The Data Warehouse Process Data Marts and cubes Source Systems Clients Data Warehouse 1 Design the Data Warehouse © Minder Chen, 2004-2010 2 Populate Data Warehouse Query Tools Reporting Analysis Data Mining 3 Create OLAP Cubes 4 Query Data Data Warehouse - 9 BI Architecture Source: http://www.rpi.edu/datawarehouse/docs/DW-Architecture.pdf © Minder Chen, 2004-2010 Data Warehouse - 10 BI Infrastructure Components © Minder Chen, 2004-2010 Data Warehouse - 11 Key Concepts in BI Development Lifecycle Technology Data Application © Minder Chen, 2004-2010 Data Warehouse - 12 Performance Dashboards for Information Delivery © Minder Chen, 2004-2010 Data Warehouse - 13 OLTP Versus Business Intelligence: Who asks what? OLTP Questions • When did that order ship? • How many units are in inventory? • Does this customer have unpaid bills? • Are any of customer X’s line items on backorder? © Minder Chen, 2004-2010 Analysis Questions • What factors affect order processing time? • How did each product line (or product) contribute to profit last quarter? • Which products have the lowest Gross Margin? • What is the value of items on backorder, and is it trending up or down over time? Data Warehouse - 14 Classification of Entity Types © Minder Chen, 2004-2010 Data Warehouse - 15 Transaction Level Order Item Fact Table © Minder Chen, 2004-2010 Data Warehouse - 16 OLTP Versus OLAP OLTP Questions • When did that order ship? • How many units are in inventory? • Does this customer have unpaid bills? • Are any of customer X’s line items on backorder? © Minder Chen, 2004-2010 OLAP Questions • What factors affect order processing time? • How did each product line (or product) contribute to profit last quarter? • Which products have the lowest Gross Margin? • What is the value of items on backorder, and is it trending up or down over time? Data Warehouse - 17 Requirements © Minder Chen, 2004-2010 Data Warehouse - 18 Dimensional Design Process Business Requirements • Select the business process to model • Declare the grain of the business process/data in the fact table • Choose the dimensions that apply to each fact table row • Identify the numeric facts that will populate each fact table row Data Realities © Minder Chen, 2004-2010 Data Warehouse - 19 Star Schema Source: Moody and Kortink, "From ER Models to Dimensional Models: Bridging the Gap between OLTP and OLAP Design, Part I," Business Intelligence Journal, Summer 2003, pp. 7-24. © Minder Chen, 2004-2010 Data Warehouse - 20 Identifying Measures and Dimensions Performance Measures for KPI Measures Performance Drivers Dimensions The attribute varies continuously: The attribute is perceived as a constant or discrete value: •Balance •Unit Sold •Cost •Sales •Description •Location •Color •Size © Minder Chen, 2004-2010 Data Warehouse - 21 A Dimensional Model for a Grocery Store Sales Why? © Minder Chen, 2004-2010 Data Warehouse - 22 Facts Table Measurements of business events. DateID ProductID Dimensions CustomerID Units Dollars Measures The Fact Table contains keys and units of measure © Minder Chen, 2004-2010 Data Warehouse - 23 Fact Tables Fact tables have the following characteristics: • Contain numeric measures (metric) of the business • May contain summarized (aggregated) data • May contain date-stamped data • Are typically additive • Have key value that is typically a concatenated key composed of the primary keys of the dimensions • Joined to dimension tables through foreign keys that reference primary keys in the dimension tables © Minder Chen, 2004-2010 Data Warehouse - 24 Store Dimension • It is not uncommon to represent multiple hierarchies in a dimension table. Ideally, the attribute names and values should be unique across the multiple hierarchies. © Minder Chen, 2004-2010 Data Warehouse - 25 Inside a Dimension Table • Dimension table key: Uniquely identify each row. Use surrogate key (integer). • Table is wide: A table may have many attributes (columns). • Textual attributes. Descriptive attributes in string format. No numerical values for calculation. • Attributes not directly related: E.g., product color and product package size. No transitive dependency. • Not normalized (star schemar). • Drilling down and rolling up along a dimension. • One or more hierarchy within a dimension. • Fewer number of records. © Minder Chen, 2004-2010 Data Warehouse - 26 Product Dimension • SKU: Stock Keeping Unit • Hierarchy: – Department Category Subcategory Brand Product © Minder Chen, 2004-2010 Data Warehouse - 27 Hierarchy © Minder Chen, 2004-2010 Data Warehouse - 28 Levels and Members Year Quarter Month 1999 Quarter 1 Jan 1999 Quarter 1 Feb 1999 Quarter 1 Mar 1999 Quarter 2 Apr 1999 Quarter 2 May 1999 Quarter 2 Jun 1999 Quarter 3 Jul 1999 Quarter 3 Aug 1999 Quarter 3 Sep 1999 Quarter 4 Oct 1999 Quarter 4 Nov 1999 Quarter 4 Dec © Minder Chen, 2004-2010 Data Warehouse - 29 Operations in Multidimensional Data Model • Aggregation (roll-up) – dimension reduction: e.g., total sales by city – summarization over aggregate hierarchy: e.g., total sales by city and year -> total sales by region and by year • Selection (slice) defines a subcube – e.g., sales where city = Palo Alto and date = 1/15/96 • Navigation to detailed data (drill-down) – e.g., (sales - expense) by city, top 3% of cities by average income • Visualization Operations (e.g., Pivot) © Minder Chen, 2004-2010 Data Warehouse - 30 Drilling down in a data mart is nothing more than adding row headers from the dimension tables. Drilling up is removing row headers. We can drill down or up on attributes from more than one explicit hierarchy and with attributes that are part of no hierarchy. © Minder Chen, 2004-2010 Data Warehouse - 31 Avoid Null Key in the Fact Table • Include a row in the corresponding dimension table to identify that the dimension is not applicable to the measurent. 11: No Promotion Sales Fact Table © Minder Chen, 2004-2010 Data Warehouse - 32 Querying the Retail Sales Schema © Minder Chen, 2004-2010 Data Warehouse - 33 Dragging and dropping dimensional attributes and facts into a simple report © Minder Chen, 2004-2010 Data Warehouse - 34 ETL ETL = Extract, Transform, Load • Moving data from production systems to DW • Checking data integrity • Assigning surrogate key values • Collecting data from disparate systems • Reorganizing data © Minder Chen, 2004-2010 Data Warehouse - 35 Building The Warehouse Transforming Data © Minder Chen, 2004-2010 Data Warehouse - 36 Pivot Table in Excel © Minder Chen, 2004-2010 Data Warehouse - 37 Use of Data Mining • • • • • Customer profiling Market segmentation Buying pattern affinities Database marketing Credit scoring and risk analysis © Minder Chen, 2004-2010 Data Warehouse - 38 OLAP and Data Mining Address Different Types of Questions While reporting and OLAP are informative about past facts, only data mining can help you predict the future of your business. OLAP Data Mining What was the response rate to our mailing? What is the profile of people who are likely to respond to future mailings? How many units of our new product did we Which existing customers are likely to buy sell to our existing customers? our next new product? Who were my 10 best customers last year? Which 10 customers offer me the greatest profit potential? Which customers didn't renew their policies Which customers are likely to switch to the last month? competition in the next six months? Which customers defaulted on their loans? Is this customer likely to be a good credit risk? What were sales by region last quarter? What are expected sales by region next year? What percentage of the parts we produced yesterday are defective? What can I do to improve throughput and reduce scrap? Source: http://www.dmreview.com/editorial/dmreview/print_action.cfm?articleId=2367 © Minder Chen, 2004-2010 Data Warehouse - 39 UA’s Existing Data Warehouse Balanced Scorecard Financial Perspective Improve Stakeholder Value Stakeholder Value ROCE Revenue Growth Strategy Increase Customer Value Build the Franchise New Revenue Services Productivity Strategy Improve Cost Structure Profitability Customer Acquisition Cost per Unit Improve Asset Utilization Asset Utilization Customer Retention • Customer Profitability Product Leadership Customer Intimacy Customer Perspective Customer Value Proposition Product/Service Attributes Price Quality Time (Innovation Processes) Learning & Growth Perspective © Minder Chen, 2004-2010 Relationship Function Service Image Relations Brand Customer Satisfaction “Build the Franchise” Internal Perspective Operational Excellence “Increase Customer Value” “Achieve Operational Excellence” “Become a Good Neighbor” (Customer Management Processes) (Operations & Logistics Processes) (Regulatory & Environmental Processes) A Motivated and Prepared Workforce Strategic Competencies Strategic Technologies Climate for Action Data Warehouse - 40