Chapter 9 Business Intelligence Systems
Jason C. H. Chen, Ph.D.
Professor of MIS
School of Business Administration
Gonzaga University
Spokane, WA 99258
Dr. Chen, Management Information Systems

Study Questions
Q1: How do organizations use business intelligence (BI) systems?
Q2: What are the three primary activities in the BI process?
Q3: How do organizations use data warehouses and data marts to acquire data?
Q4: How do organizations use reporting applications?
Q5: How do organizations use data mining applications?
Q6: How do organizations use BigData applications?
Q7: What is the role of knowledge management systems?
Q8: What are the alternatives for publishing BI?
Q9: 2026?

Database vs. Data Warehouse
[Figure: Database (DBMS) vs. Data Warehouse (???)]

Q1: How Do Organizations Use Business Intelligence (BI) Systems?
• Information systems generate enormous amounts of operational _____ that contain patterns, relationships, clusters, and trends about customers, suppliers, business partners, and employees. These can facilitate management, especially planning and forecasting.
• Business intelligence (BI) systems produce such information from operational data.
• Because data communications and data storage are essentially free, enormous amounts of data are created and stored every day: 12,000 gigabytes per person worldwide in 2009.

Q1: How Do Organizations Use Business Intelligence (BI) Systems?
Fig 9-1: Components of a Business Intelligence System

Tools vs. Applications vs. Systems
• A BI ____ (e.g., decision-tree analysis) is one or more computer programs. BI tools implement the logic of a particular procedure or process.
• A BI __________ is the use of a tool on a particular type of data for a particular purpose.
• A BI _______ is an information system having all five components (what are they?) that delivers the results of a BI application to the users who need those results.

Why do organizations need business intelligence?
• BI systems are computer programs that provide valuable information for decision making.
• Three primary BI systems:
– __________ tools read data, process them, and format the data into structured reports (e.g., by sorting, grouping, summing, and averaging) that are delivered to users. They are used primarily for assessment. RFM (Recency, Frequency, and Monetary Value) analysis is one reporting tool.
– ____________ tools process data using statistical, regression, decision-tree, and market-basket techniques to discover hidden patterns and relationships, and make predictions based on the results.
– ___________ _________ tools store employee knowledge and make it available to whoever needs it. These tools are distinguished from the others because the source of the data is human knowledge.

How Do Organizations Use BI?
Fig 9-2 Example Uses of Business Intelligence
[Figure: four numbered example uses, [1] to [4]; one callout reads "(Decision Support Systems)"]

What Are Typical BI Applications?
• Identifying changes [or patterns] in purchasing patterns (data warehouse)
– Important life events cause customers to change what they buy.
• BI for entertainment
– Netflix has data on watching, listening, and rental habits; that data shows what people actually want, not what they say they want.
• Predictive policing/ ______
– Analyze data on past crimes, including location, date, time, day of week, type of crime, and related data, to predict where crimes are likely to occur.

Just-in-Time Medical Reporting
• Example of real-time data mining and reporting.
• Injection notification services
– Software analyzes the patient's records and, if injections are needed, recommends them as the exam progresses.
• Blurry edge of medical ethics.

Q2: What Are the Three Primary Activities in the BI Process?
Fig 9-3 Three Primary Activities in the BI Process
[Figure: activities labeled [1], [2], [3]]

Q3: Components and Functions of a Data Warehouse
Fig 9-14 Components of a Data Warehouse
• Functions of a data warehouse
– Obtain data from operational, internal, and external databases.
– Cleanse data.
– Organize and relate data.
– Catalog data using metadata.

Data Warehouse vs. Data Mart
Data Mart is a _________ of Data Warehouse
Fig 9-15 Data Mart Examples

Independent data mart data warehousing architecture
[Figure labels:]
• Legacy system: operational database
• ETL: separate ETL for each independent data mart
• Data marts: mini-warehouses, limited in scope
• Data access complexity due to multiple data marts

Q4: How Do Organizations Use Reporting Applications?
• Create meaningful information from disparate data sources
• Deliver information to users on time
• Basic operations:
1. Sorting
2. Filtering
3. Grouping
4. Calculating
5. Formatting

What are typical reporting applications?
• RFM analysis allows you to analyze and rank customers according to their purchasing patterns, as this figure shows.
– __________: How recently has the customer purchased items? => leads and opportunities
– __________: How frequently does the customer purchase items? => retention
– __________ Value: How much does the customer spend on each purchase?
=> profitability
• RFM Analysis
– Sort the data by date (for recency), number of purchases (for frequency), and purchase amount (for monetary value), respectively.
– Divide each sorted list into five groups.
– Assign 1 to the top 20%, 2 to the next 20%, 3 to the third 20%, 4 to the fourth 20%, and 5 to the bottom 20%.
– The _______ the score, the better the customer.

RFM Analysis: Example RFM Scores
• _ecently
• _requently
• _oney
The lower the score, the better the customer and, consequently, the more profitable the customer is for the company.
Fig 9-16 Example of RFM Scores
Organizations can find their most valuable customers through "RFM":
– Recency: How recently has the customer purchased items? => leads and opportunities
– Frequency: How frequently does the customer purchase items? => retention
– Monetary Value: How much does the customer spend on each purchase? => profitability

Interpreting RFM Score Results – more examples
• Ajax has ordered recently and orders frequently. Its M score of 3 indicates that it does not order the most expensive goods. Ajax is a good, regular customer, but the sales team should try to up-sell more expensive goods to it.
• Bloominghams has not ordered in some time, but when it did, it ordered frequently, and its orders were of the highest monetary value. It may have taken its business to another vendor. The sales team should contact this customer immediately.
• Caruthers has not ordered for some time, did not order frequently, and did not spend much. The sales team should not waste any time on this customer.
• Davidson is in the middle. Set it up on an automated contact system or use the Davidson account as a training exercise.
80/20 Rule (Pareto Principle)

RFM Analysis Classification Scheme

OLAP and its Applications
• Online Analytical Processing (OLAP), a second type of reporting tool, is more generic than RFM.
• OLAP provides the dynamic ability to sum, count, average, and perform other arithmetic operations on groups of data. OLAP reports are also called OLAP cubes.
• What software and function enable you to create OLAP and its applications?
• ANSWER
– _________ with
– __________

Example of Grocery Sales OLAP Report
http://dwreview.com/OLAP/
http://www.tableausoftware.com
• Online Analytical Processing (OLAP) cubes use
– Measures, which are data items of interest. In Fig 9-17, a measure is Store Sales Net.
– Dimensions, which are characteristics of a measure. In Fig 9-17, a dimension is Product Family.
OLAP Product Family by Store Type
Fig 9-17 Example Grocery Sales OLAP Report

Example of Expanded Grocery Sales OLAP Report
Drill down
Fig 9-18 Example of Expanded Grocery Sales OLAP Report

On-Line Analytic Processing (OLAP)
• Enables managers and analysts to interactively examine and manipulate large amounts of detailed and consolidated data from different dimensions.
• Analytical processing:
– Drill-up (consolidation): the ability to move from detailed data to aggregated data
  • Profit by Product >>> Product Line >>> Division
– Drill-down: the ability to move from summary/general to lower/specific levels of detail
  • Revenue by Year >>> Quarter >>> Week >>> Day
– Slice and dice: the ability to look across dimensions
  • Sales by Region
  • Profit and Revenues by Product Line

Slicing a data cube
[Figure: data cube with dimensions including REGION and CUSTOMER]

PART II

Q5 How Do Organizations Use Data-mining Applications?
Businesses use statistical techniques to find __________ and relationships among data and use them for classification and prediction.
Data mining techniques are a blend of statistics and mathematics with artificial intelligence (AI) and machine learning.
Fig 9-20 Source Disciplines of Data Mining
[Figure label: Data Warehouse]

Unsupervised vs. Supervised Data Mining
• Data mining is an automated process of discovering and extracting hidden and/or unexpected patterns in collected data in order to create models for decision making that predict future behavior based on analyses of past activity.
• There are two types of data-mining techniques:
– Unsupervised data-mining characteristics:
  • No model or hypothesis exists before running the analysis
  • Analysts apply data-mining techniques and then observe the results
  • Analysts create hypotheses after the analysis is completed
  • Apply statistical techniques such as market-basket analysis to estimate parameters of a model
  • Cluster analysis (and decision tree), a common technique in this category, groups entities together that have similar characteristics
– Supervised data-mining characteristics:
  • Analysts develop a model prior to their analysis
  • Regression analysis is a technique in this category that measures the impact of a set of variables on another variable
  • Neural networks predict values and make classifications
  • Used for making predictions

Unsupervised vs. Supervised Data Mining
Unsupervised
• Analyst does not start with an a priori hypothesis or model
• Hypothesized model created afterward, based on analytical results, to explain the patterns found
• Example: market-basket analysis and cluster analysis to find groups (decision tree)
Supervised
• Model created before analysis
• Hypotheses created before analysis
• Regression analysis: make predictions
• Example: cell phone weekend minutes (next slide)

Supervised Data Mining
• Uses an a priori model to compute the outcome of the model
• Prediction, such as regression analysis
• Analysts predict the number of weekend cell phone minutes:
• Ex: A customer is 21 years old and has had an account for 6 months. How many weekend minutes can be predicted?
• Answer:
CellPhoneWeekendMinutes = 12 + (17.5*CustomerAge) + (23.7*NumberMonthsofAccount)
= 12 + 17.5*___ + 23.7*___
= ___________

Market-Basket Analysis
• Market-basket analysis
– Identifies sales patterns in large volumes of data
– Is an unsupervised data-mining tool
– Products customers tend to buy together
– Probabilities of customer purchases
– Identifies cross-selling opportunities
Customers who bought fins also bought a mask.

Interpretation on the Results
Market-basket analysis is an unsupervised data-mining tool for determining sales patterns. It helps businesses create cross-selling opportunities (i.e., selling related products together). Three terms used with this type of analysis are:
Support: the probability that two items will be purchased together (e.g., fins and a mask are purchased together)
Confidence: a conditional probability estimate (e.g., the proportion of customers who bought a mask who also bought fins)
Lift: the ratio of confidence to the base probability (e.g., the probability that a customer buys fins given that they bought a mask, divided by the probability that a customer walking into the store buys fins)

Market Basket Analysis at a Dive Shop (Transactions = 400)
Fig 9-21 Market-Basket Analysis at a Dive Shop

Market-Basket Example: Dive Shop
Transactions = 400
Fins and masks are bought together in 250 transactions, which determines the probability that these two items will be purchased together.
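The three measures defined for the dive-shop example can be checked with a few lines of Python. This is only a sketch under stated assumptions: the `PairCounts` class and the function names are made up for illustration, and the count of 280 fins transactions is inferred from the 0.7 base probability used in this example (0.7 × 400), not read from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairCounts:
    """Transaction counts for one pair of items (hypothetical helper)."""
    total: int      # all transactions
    with_a: int     # transactions containing item A (fins)
    with_b: int     # transactions containing item B (mask)
    with_both: int  # transactions containing both A and B


def support(c: PairCounts) -> float:
    # Probability that A and B are purchased together.
    return c.with_both / c.total


def confidence(c: PairCounts) -> float:
    # Conditional probability of A given that B was purchased.
    return c.with_both / c.with_b


def lift(c: PairCounts) -> float:
    # Confidence divided by the base probability of buying A.
    return confidence(c) / (c.with_a / c.total)


# Dive shop: 400 transactions, 270 with a mask, 250 with both.
# with_a=280 is an assumption derived from the 0.7 base probability.
dive_shop = PairCounts(total=400, with_a=280, with_b=270, with_both=250)

print(f"support    = {support(dive_shop):.3f}")
print(f"confidence = {confidence(dive_shop):.3f}")
print(f"lift       = {lift(dive_shop):.2f}")
```

A lift above 1 means that buying a mask makes buying fins more likely than it is for a customer in general; a lift near 1 would suggest the two purchases are unrelated.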
Support is 250/400 = 0.625.
This means that out of every 100 transactions, about 62.5 (62.5%) have both fins and a mask in the shopping basket.
There exist cross-selling opportunities.
The _______ the support value, the ______ the cross-selling opportunities will be.

Confidence: the conditional probability that a customer bought fins given that he or she also bought a mask. Therefore, confidence is 250/270 = 0.926.
This means that 92.6% of the time, when a mask is bought, fins are bought as well.

Lift: the ratio of confidence (0.926) to the base probability (0.7) of buying an item (fins): 0.926/0.7 = 1.32.
Therefore, the likelihood that customers buy fins when they buy a mask increases by 32% (1.32 - 1).

Decision Trees
• Hierarchical arrangement of criteria to predict a classification or value
• Unsupervised data mining technique
• Basic idea of a decision tree
– Select attributes most useful for classifying something on some criteria to create "pure groups"

Credit Score Decision Tree
Fig 9-22 Credit Score Decision Tree

Decision Tree Example for MIS Classes (hypothetical data)
• A decision tree is a hierarchical arrangement of criteria that predicts a classification or value. It is an unsupervised data-mining technique that selects the most useful attributes for classifying entities on some criterion. It uses if…then rules in the decision process. Here are two examples.
• If student is a junior and works in a restaurant, then predict grade > 3.0
• If student is a senior and is a nonbusiness major, then predict grade ≤ 3.0
• If student is a junior and does not work in a restaurant, then predict grade ≤ 3.0
• If student is a senior and is a business major, then make no prediction
[Figure labels: [A] [B] [C] [D]]
Decision Tree Examples for MIS Class (Hypothetical Data)

What are typical data-mining applications?

DM Capability: Associations/Affinity (Unsupervised): association between items
Description: Discover rules that correlate one set of events or items with another set of events or items.
Example: Market Basket Analysis: 75% of customers who buy Coke also buy corn chips (good for CRM analysis)

DM Capability: Sequence/Temporal Patterns (Supervised): time-based affinity (statistical analysis)
Description: Relate events in time based on a series of preceding events.
Example: Time-Based Analysis: 60% of customers buy TVs followed by digital camcorders

DM Capability: Clustering (Unsupervised): grouping items according to statistical similarities
Description: Create partitions so that all members of each set are similar according to some metric or set of metrics.
Example: Customer Segmentation: Meals charged on a business-issued gold card are typically purchased on weekdays and have a mean value of greater than $250, whereas meals purchased using a personal platinum card occur predominantly on weekends, have a mean value of $175, and include a bottle of wine more than 65% of the time.

DM Capability: Classification (Unsupervised): assigns new records to existing classes
Description: Discover rules that define whether an item or event belongs to a particular subset or class of data.
Example: Decision Tree Analysis (Customer Segmentation): Customers with excellent credit history have a debt/equity ratio of less than 10%

Q6: How Do Organizations Use BigData Applications?
• BigData? – Not just big. Three V's.
• Huge _______ – petabyte and larger
• Rapid _______ – generated rapidly
• Great _______ – structured data, free-form text, log files, graphics, audio, and video
• Because BigData is huge, fast, and varied, it can't be processed using traditional techniques.
• MapReduce is a technique for harnessing the power of thousands of computers working in parallel.
• Push ____________ to the data instead of pushing data to a computing node.

Q7: What Is the Role of Knowledge Management Systems?
• Knowledge management
– The process of creating value from intellectual capital and sharing knowledge with those who need that capital
– Preserving organizational memory by capturing and storing lessons learned and best practices of key employees
– Scope of KM same as SM in hyper-social organizations
– Enhance employee retention rates by recognizing and rewarding knowledge sharing
– Streamline operations and reduce costs
• Knowledge management system (KMS) – an information system that supports the capturing and use of an organization's "know-how"

Tacit vs. Explicit Knowledge
• Intellectual and knowledge-based assets fall into two categories:
• _______ knowledge is personal, context-specific, and hard to formalize and communicate.
• ________ knowledge can be easily collected, organized, and transferred through digital means.

Tacit and Explicit Knowledge
[Figure labels: Oral Communication; "Tacit" Knowledge, 50-95%; Information Request; Information Feedback; "Explicit" Knowledge; Explicit Knowledge Base, 5-50%]

Q8: What Are the Two Functions of a BI Server?
This figure shows the components of a generic BI system. A BI application server delivers results in a variety of formats to devices for consumption by BI users. A BI server provides two functions: management and delivery.
Which option (pull or push) applies to the "Grocery Store (UK)" case?
Fig 9-29 Elements of Generic Business Intelligence System

Q9: 2026?
• Exponentially more information about customers, better data mining techniques.
• Companies buy and sell your purchasing habits and psyche.
• Singularity
– Computer systems adapt and create their own software without human assistance.
– Machines will possess and create information for themselves.
– Will we know what the machines will know?

• END of CHAPTER 9