Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Business Intelligence : a primer Rev April 2012 Introduction & overview The paradigm of BI systems Platforms Appendix Review questions Our approach to BI Rep DSS Plan Exec Mon Ctl BI Architecture Info BI systems modelling ES taxonomy Worker Customer Dash Manager Cost Flexibility & speed 1. Production unit cost 2. Productivity 3. Usage / workload 1. Process duration 2. Activity timeliness 3. Resource flexiblity Quality & satisfaction 1. 2. Spec conformity of service and products Technology dependability KPI Identification / mapping HIGO 1. Customer access / acquisition unit cost 2. Customer use cost 1. Expectation conformity 1. Response time of the service 2. Response timeliness 2. Service dependability 3. Vendor flexiblity 3. Customer satisfaction 1. Execution unit cost 2. Preparation effort 1. Technology response time 2. Technology timeliness 3. Activity & technology flexibility 1. 2. 3. Expectation conformity of work / work environment Technology dependability Employee satisfaction Enterprise Information Modeling SIRE Aggregate Strategic Level (ASL) Analytic Information Modeling DFM GUI Modeling GOA Implementation Level Rich Semantic Level (RSL) Software Engineering Interface (SEI) Business Intelligence: the role within Enterprise Systems Management support Management Information Systems [Planning & Management Control + Business Intelligence ] Front-end systems (Support the life cycle of customers and end products) Back-end systems (Support the cycle of production and delivery) Administrative systems (Finance, HR etc.) Operations support Acronyms • • • • • • • • • • • • ABC: Activity Base Costing ABM: Activity Based Management BI: Business Intelligence BW: Business Warehouse (synonym of DW) BSC: Balanced Score Card CPM: Corporate Performance Management (synonym of SEM) CRM: Customer Relationship Management CSF: Critical Success Factor DBMS: Data Base Management System DSS: Decision Support System DW: Data Warehouse EIS: Executive Information System • • • • • • • • • • • • EPM: Enterprise Performance Management (synonym of SEM) ERP: Enterprise Resource Planning ERM: Enterprise Resource Management ES: Enterprise System KPI: Key Performance Indicator MBO: Management By Objectives MRP: Manufacturing Resource Management ODS: Operational Data Store OLAP: On Line Analytical Processing OLTP: On Line Transaction Processing SCM: Supply Chain Management SEM: Strategic Enterprise Management Characteristics of Analytic & Management Information • Information is – Periodical – Output of computation or aggregations – Reflects objectives or actual data • E.g. data of P& L of an imaginative Car Company come from different transaction processing systems – – – – • Sales Purchasing Accounting Etc. Therefore, the design of BI / MIS : – Is top-own – Defines first target data i.e. the variables that BI should process – Identifies corresponding source data – Defines the process to extract and transform source in target data The 4-layer paradigm of BI /MIS systems Decision support engines (DSS) Presentation / reporting engine (EIS, reporting) Mining & other application engines DATA MART DATA WAREHOUSE Loading Tranformation Extraction DATA ENTRY BASI DATI OPERATIVE BASI DATI OPERATIVE BASI DATI OPERATIVE Transactions Data Bases The 4-layer paradigm of BI /MIS systems • • • • • BI/MIS applications are based on 4 layers Layer 1 contains source data, typically stored in Transaction Data Base Layer 2 extracts information, and transforms source data into Multi-key & Time-dependent data Layer 3 stores such transformed information Layer 4 processes transformed information according various purposes – Support decisions (DSS) • – Prepare reports and dashboard (Report) • – E.g. define the sale budget E.g., sales performance Mine stored data (Mining) • E.g. identify customer who may churn Business Intelligence : a primer Rev April 2012 Introduction & overview The paradigm of BI systems Platforms Appendix Review questions “Jones” case study • CONTEXT – The Supermarket Chain «Jones» includes 300 shops in 3 regions with 60k items on sale – A POS (Point Of Sale) system supports all activities of each shop : • • • • item receiving, storing, scrapping, selling – Specifically, POS terminals record sales transactions and issue receipts • REQUIREMENTS – Management want to analyze sales – Facts : Sales – Measures: amount, quantity, number of tickets – Analysis dimensions • Date • Item • Shop – Time span : 24 months rolling Level 1 (source data) «Jones» case study Ticket # 2002a23b11 Store #0021MI Item #190 #69 #90 TOTALE Payment Date Des Pen Mat Lib Receipt Heading • # Store • # Ticket • Amount • Payment • Date Price 3560 550 32000 Qty Amount 2 7.12 10 5.50 1 32.00 44.62 Fidelity P. 120109 Item Master Data • # Item • # Store • Description • Price • Qunatity mesuere • Stock on hand • Stock at the beginning of the day • Average forecasted dayly sale Receipt detail • # Ticket • # Item • Amount • Qty Level 2 • DSS Report/ dashboard – Select source data – Check and clean source data (data cleaning o data cleansing) – Staging of extracted data (as needed) – Log of extractions Mining & other • DATA MART DATA ENTRY TRANSACTIONS DATABASES Extraction can be – Automatic: a batch procedure that runs periodically (e.g. daily, weekly, monthly) – Interactive: integrates and fixes automatic data DATA WAREHOUSE LOADING TRANSFORMATION EXTRACTION Extraction includes • ETL can use intermediate databases – Staging Area : where extracted data are temporarily parked (e.g. Data of each individual shop) – Operational Data Store (ODS): where granular data are stored and reconciled for future use (e.g. receipt data) Level 3 DSS Report/ dashboard Mining & other DATA MART DATA WAREHOUSE LOADING TRANSFORMATION EXTRACTION DATA ENTRY TRANSACTIONS DATABASES • Data are stored in Data Warehouse and Data Marts • A Data Warehouse is a “subject-oriented, integrated, time-variant (temporal), non volatile collection of summary and detailed data, used to support strategic decisionmaking process for the enterprise” (Inmon 1996) • Data Mart is a smaller warehouse, often a subset or extraction of a warehouse. • Warehouse e Mart typically adopt different data schemas Level 3 : Data Warehouse Key table 1 •Key 1 •Attribute 1 •Attribute 2 •Attribute …. Key table 2 •Key 2 •Attribute 1 •Attribute 2 •Attribute …. Fact table •Key 1 •Key 2 •Key … •Measure 1 •Measure 2 •Measure …. Key table … •Key … •Attribute 1 •Attribute 2 •Attribute …. • The warehouse is typically implemented by relational database, whose schema reflects the corresponding DFM (Dimensional Fact Model). • In relational schemas: • Fact tables: • Store the value of facts (measures) • Are identified by multiple keys (K>= 2) • Key tables • Describe the attributes of dimensions Level 3: Data Warehouse: star schema Jones case study Shop Shop# • Description • Shop-class • ZIP-code Time Date# • Week-day • Flag work/holyday for local calendar • Date in muslim calendar • Flag work/holyday for muslim calendar Sales • • • • • • Date# Item# Shop# Sales amount Sales qty Number of receipts Item Item# • Billing-metric •Item description • Bar-code# • Package qty • Package-class • Supplier-brand • Item-class • A simple implementation of the DFM is a STAR schema where key tables are implemented only for immediate keys • Further analysis / segmentation is obtained by queries on attributes of key tables Level 3 : Data Warehouse : Snow flake schema Jones Case study ZIP Shop Provinceregion Area Holiday Sales Date Muslim date Week-day Chinese date Class Superclass Time Item Supplier • A full implementation of the DFM requirements implies a snow flake schema with a key table for every hierarchy node Level 3: design steps The process from extraction up to data warehouse creation is supported by warehouse building tools that are incorporated in most BI platforms 1 2 Target Data design 3 Mapping of Source Data into Target Data 4 ETL code generation 5 Source Data Base Identification 6 Data extraction Creation of Data Warehouse Level 3: design steps : detail Level 3: Data Mart DSS Report/ dashboard Mining & other DATA MART DATA WAREHOUSE LOADING TRANSFORMATION EXTRACTION DATA ENTRY TRANSACTIONS DATABASES • Data mart store frequently accessed information • From a same warehouse multiple data marts can be created • Data marts are typically implemented by hypercube (OLAP technology) Level 3: Data Mart Customer History Sales Analysis Marketing Accounting Shop Data Warehouse From a same warehouse multiple data marts can be created Level 3: Data Mart : Hyper-cube : display Pages Facts Columns Level 3: Data Mart : Hyper-cube : logic Fact Sales Shop Item Time Event Dimension Quantity = 20 Amount= 100 • An hypercube is a matrix of tables • A Fact (e.g. Sales) is identified in a multidimensional space whose axes are Analysis Dimensions (e.g. Shop, Time, Item) • An hypercube enables to instantly retrieve complex information e.g. : – Sales in last Year (aggregation of Time) – by Region (=aggregation of Shops) – by Category (= aggregation of Product) Level 3: Data Mart : Hyper-cube : logic Shops Shop Item Month BUDGET MB21000 MB21000 MB21000 MB21000 MB31000 MB31000 MB31000 MB31000 MB41000 MB41000 MB41000 MB41000 0601 0601 0602 0602 0601 0601 0602 0602 0601 0601 0602 0602 Jan Feb Jan Feb Jan Feb Jan Feb Jan Feb Jan Feb 50 55 50 60 65 45 55 50 60 70 65 75 MB21000 MB31000 MB41000 Item 0601 Date Jan Feb Mar ITEM’ SHOP OLAP dimensions = warehouse key 0602 Apr Level 3: Data Mart : Hyper-cube : logic Dimension Product Hierarchy Type Category Svelto …. Ajax Washing powder House Cleaning Dash … Soap Palmolive Dairy Bread & Biscuit Food Drinks Tools Hardware Nuts & bolts All Products • Dimensions are arranged in «aggregation hierarchies» (roll-up) • Levels of hierarchies are called «dimensional attributes» • A multidimensional analysis is performed by navigating trough aggregation levels of dimensions Level 3: Data Mart : Hyper-cube : implementation • FACT TIME ITEM Date Tempo (ch) Tempo attributi (da def.) Prodotto (ch) Prodotto attributi (da def.) Item Shop Sales-amount Sales-qty Shop PuntoVendita (ch) PuntoVendita attributi (da def.) • Shop Shop Shop Receipt-number Item Item Item A wise approach to implement multidimensional information is to have an hyper-cube for each measure This easies arithmetic operations and keeps hyper-cubes light Level 4 DSS Report/ dashboard Mining & other DATA MART DATA WAREHOUSE LOADING TRANSFORMATION EXTRACTION DATA ENTRY TRANSACTIONS DATABASES • It processes information for management from various perspectives – Define / assess decisions and program (DSS) – Present information with a friendly navigation that enables roll up and drill down (EIS & dashboard) – Produce structured reports (reporting) – Identify trends an pattern in stored information (mining and profiling) Leve 4 : reporting Information distribution and privileges handling Format editing Semantic Layer Data Marts Data warehouse Data Bases Level 4: reporting : semantic layer • Purpose: to map data from heterogeneous sources • Generally semantic layer includes a set of types e.g.: – Dimensions (= warehouse keys) – Dimensions attributes ( = key attributes) – Measures and Facts Level 4: reporting : format editing • Includes editing functions by which report pages are defined. • He content of the report is obtained by dragging an dropping information item from the catalogue of the semantic layer • Further activities manage the layout of pages Level 4: reporting : information distribution Level 4 : DSS • A DSS is a computer based application designed to support semi-structured management decisions by – Searching and analyzing information on a collection of sources – Compute and assess results (e.g. sensitivity analysis) • Typical application fields are: – – – – Planning Budgeting Optimization Funding and Investment Decisions • ERP / CRM vendors offer DSS suites for corporate planning as Oracle’s EPM and SAP’s BO Level 4 : DSS : an example (budgeting) The control system produces monthly a financial report and a report with physical performance indicators (KPI) Financial report and KPI report are on 5 dimensions: 1. 2. 3. 4. 5. Time Cost centers Item Sales channel Activity Sales data come from the Sales systems and are stored in a data mart; the same approach is also for sales budget, actual costs and budget costs Data marts are merged in two hyper-cubes, respectively KPI and Financial. Over hyper-cubes a software processes reports on P&L, A&L, Cashflow, KPI Ricavi a budget Ricavi Conto economico Processi di calcolo Sistema di vendita Ricavi Stato patrimoniale Dati Processi finanziari di calcolo Cashflow Sistema amministrativo Voci economiche e patrimoniali Spese e costi a budget Costi KPI Processi di calcolo KPI Processi di calcolo Memorizzazione e calcolo Elaborazione report Level 4 : Analysis Engines • Data mining applications for research and marketing are designed for – Discover in a data base relations and associations previously unknown (“data mining helps end user extract useful business information from large databases” (Berson 1997)). – Mining software is a key in marketing to calculate predictive indicators as • • • • Churning, Fraud risk, Saving attitude, Economic potential etc. • Customer Profiling systems (Analytic CRM) Business Intelligence : a primer Rev April 2012 Introduction & overview The paradigm of BI systems Platforms Appendix Review questions BI solutions are offered by all main vendors • • BI is 5-10% of the ES market Main vendors offer BI products & applications – ES vendors • Oracle: the largest DB vendor – – • SAP: the largest ERP vendor – – • – products on Warehousing and applications from vendors acquired (Essbase, Hyperion ) Applications: EPM analogous of SAP’s SEM Applications: Strategic Enterprise Management (SEM) to support the entire management and analysis life cycle Products : Crystal report, Business Object (founder of reporting paradigm) Microsoft : Office products , SQL server family BI vendors • • • SAS: founder of BI and the largest BI independent vendor, offers a wide range of applications by industry and business area, and specific solutions Microstrategy Open source platforms: e.g. Pentaho Business Intelligence Platforms : SAS • By industry – – – – – • … Education Financial Services Government ….. By solution – – – – – – – – – – – – – – Analytics Business Analytics Business Intelligence Customer Intelligence Data Management Fraud & Financial Crimes High-Performance Analytics IT & CIO Enablement On Demand Solutions Performance Management Risk Management SAS® 9.3 Supply Chain Intelligence Sustainability Management • Featured solutions – SAS® 9.3 – SAS® Clinical Data Integration – SAS® Curriculum Pathways® – SAS® Enterprise Guide® – SAS® Enterprise Miner™ – SAS Fraud Framework for Government – SAS® High-Performance Analytics – SAS® Inventory Optimization – SAS® OnDemand for Academics – SAS® Social Media Analytics – SAS® Text Analytics – SAS® Visual Data Discovery Business Intelligence : a primer Rev April 2012 Introduction & overview The paradigm of BI systems Platforms Appendix Review questions Data Warehouse and Data Mart vs Database Data base Data Warehouse Data Mart Conceptual modeling (Rich Semantic Layer) ERA DFM DFM Information type (Master, Event, Analysis) Master + Event Analysis Analysis Information organization Normalized (e.g. 3NF) Star or snowflake Hypercube Data schema Relational Relational OLAP or Relational Processing orientation Create + Update Read Read Typical data operations Insert one individual record or modify one or multiple records Access a vector of records Roll-up, Drill down, Dice Access one ore multiple a vector of records Roll-up, Drill down, Dice Transaction example Enter a customer order Segment customer in Italy with a degree of loyalty >70% by age and region Segment customer in Italy with a degree of loyalty >70% by age and region Business Intelligence : a primer Rev April 2012 Introduction & overview The paradigm of BI systems Platforms Appendix Review questions Review questions • Illustrate the input, process and output of the four layers of BI systems • What is an Hyper-cube ? • What is a data mart? • What is a data warehouse? Compare data warehouse versus classic database in terms of – – – – Conceptual modeling (Rich Semantic Layer) Implementation (DB schema) Information type (Master, Event, Analysis) Processing orientation