Webcast - searchsap.com
September 10, 2002

ERP Centric Data Mining and Knowledge Discovery

Naeem Hashmi
Chief Technology Officer
Information Frameworks
e-mail: [email protected]
Web: http://infoframeworks.com

Copyrights 2002 ERP Data Mining & Knowledge Discovery webcast searchsap.com Sept 10, 2002

About the Speaker

Naeem Hashmi
• Founder and CTO of Information Frameworks; an author, speaker and world-renowned expert on emerging Information Architectures, Integration and Business Intelligence Technologies.
• Author of the best-selling book:
  – SAP Business Information Warehouse for SAP, 2000
• Technical Editor:
  – SAP BW Certification Guide, authored by Catherine Roze, 2002
• Contributing Author, SAP BW Handbook, 2002
• Member of Intelligent ERP magazine's board of editors and a frequent speaker at IT industry conferences including SAP TechEd, ASUG, Oracle Open World, DCI, The ERP World, Data Mining and the Data Warehouse Institute.
• 25+ years of experience in emerging Information Technology research, development, and management; Information Architectures; Enterprise Application Integration; e-business; ERP applications; Data Warehousing; Data Mining; CRM; Internet, Object and Client/Server Technologies; and Strategic Consulting.
• Email: [email protected]  URL: http://infoframeworks.com  Tel: 603-432-4550

Agenda
• Data Mining and Knowledge Discovery Basics
• ERP Vendors and Data Mining Solutions
• Data Mining in SAP Business Information Warehouse
• Pros and Cons of ERP-centric Data Mining
• Q&A

Data Mining and Knowledge Discovery Basics

What is Data Mining and Knowledge Discovery?
• Data Mining is a tactical process that uses mathematical algorithms to sift through large data stores to extract data patterns, models and rules.
• Knowledge Discovery is the process of identifying and understanding potentially useful hidden anomalies, trends and patterns.
Data mining is an integral part of the knowledge discovery process.

Data Mining and Statistics?
• Data mining sounds very similar to regression analysis, but its approach and purpose are quite different:
  – Statistical methods test a hypothesis on a data set
  – Data mining starts from the data sets to construct a hypothesis

Data Mining - Present State
Application domains:
  – Business: 317 (73%)
  – Life Sciences: 85 (20%)
  – Other: 31 (7%)
Source: http://www.kdnuggets.com/polls/

Data Mining Methodologies
CRISP-DM: CRoss Industry Standard Process for Data Mining (http://www.crisp-dm.org/)
Source: http://www.kdnuggets.com/polls/
Six-step process:
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
Source: http://www.crisp-dm.org/

Data Mining Process
CRoss Industry Standard Process (CRISP) for Data Mining (http://www.crisp-dm.org/)
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
Data Warehouse: Data Understanding and Data Preparation will initially take about 60% to 80% of the data mining project time.
Source: http://www.crisp-dm.org/

Data Mining - Tools and Data Formats
Domains: Business 317 (73%), Life Sciences 85 (20%), Other 31 (7%)
Data formats:
  – Flat files: 57%
  – Proprietary: 37%
  – DBMS: 27%
Source: http://www.kdnuggets.com/polls/

Data Mining Technology
Techniques:
• Visualization: use human pattern recognition capabilities
• Statistics: applying statistical techniques to predict
• Decision Trees: building scripts based on historic data
• Association Rules (Rule Induction): reasoning from specific facts to reach a hypothesis
• Clustering: finding and visualizing groups of facts that were not previously known
• Neural Networks: learning how to solve problems based on examples
• K-Nearest Neighbor: classification by looking at similar data
• Genetic Algorithms: survival of the fittest
• …
Usage: Discover, Understand, Predict

Data Mining Models
Two types of data mining models:
• Prediction Models (Prediction and Classification)
  – Regression algorithms (predict numerical outcome): Neural Networks, Rule Induction
  – Classification algorithms (predict symbolic outcome): CHAID, discriminant analysis
• Descriptive Models (Grouping & Associations)
  – Clustering/Grouping algorithms: K-means, Kohonen, Factor Analysis
  – Association algorithms: Apriori, Sequence

Traditional DM vendors
• SPSS Clementine
• SAS Enterprise Miner
• IBM Intelligent Miner
• Salford CART/MARTS
• …more

Database Vendors – DM within the Products
• Data Mining Engine in Oracle 9i
  – Oracle 9i consists of key products: Oracle9i Database, Oracle9i Application Server, Oracle9i Developer Suite
• IBM Intelligent Miner into DB2
• TeraMiner into Teradata
• Microsoft – SQL Server 2000
• When you implement DM functionality in a DBMS, you are limited to a specific database engine, which is not very flexible in a typical enterprise application landscape (a heterogeneous environment).

Data Mining Standards
• PMML - Predictive Model Markup Language
• OLE DB for Data Mining
• Java Data Mining API
• Other data exchange standards for analytics that need data mining extensions:
  – CWM: Common Warehouse Metamodel
  – XML/A: XML for Analytics
  – CPEX: Customer Profile EXchange
  – xCIL: Extensible Customer Information Language

ERP Vendors and Data Mining Solutions

Enterprise Applications Landscape
• ERP Solutions
  – Oracle
  – PeopleSoft
  – SAP
• ERP vendors have extended the scope of their applications far beyond traditional ERP functions to a wide array of business solutions such as:
  – Customer Relationship Management
  – Business Intelligence
  – Enterprise Portals
• Siebel
• Oracle Business Intelligence Solution
• PeopleSoft Enterprise Performance Management
• SAP Business Information Warehouse

Oracle Business Intelligence Solution
Business Processes (Pre-Built Portlets):
• Response to Lead (27)
• Lead to Quote (56)
• Quote to Order (15)
• Order to Cash (34)
• Demand to Build (40)
• Procure to Pay (28)
• Revenue to Compensation (29)
• Expiration to Renewal (33)
• Issue to Resolution (51)
• HR Family (43)
Oracle 9i DM Integration:
• Oracle Marketing Online for Campaign Management
• Oracle9iAS Personalization
• iStore
• more to come…
Oracle 9i Business Intelligence:
• Oracle9iDS Warehouse Builder
• Oracle9iDS Reports
• Oracle9iAS Clickstream Intelligence
• Oracle9i Data Mining
• Oracle9iAS Discoverer
• Oracle9iAS Portal
• Oracle9iAS Personalization
• Oracle9iDS Business Intelligence Beans
Source: Oracle

PeopleSoft Business Intelligence Solution
Enterprise Performance Management (EPM): Customer Profitability Finance Workforce Analytics Supply Chain Management Process Workforce Rewards Enrollment Management Retail Merchandise Project Analysis Student Administration Balanced Scorecard CRM Prospect Analysis Employee Scorecard Customer Scorecard Vendor Scorecard Data mining Capabilities CRM Marketing Analysis CRM Sales Effectiveness CRM Service Effectiveness
No word on PeopleSoft data mining tools/technologies for predictive analytics, whether home-grown, acquired or 3rd-party products.
No response from PeopleSoft contacts.

SAP Business Intelligence Solution
Business Information Warehouse
• SAP CRM: campaign management, opportunity analytics, activity reporting, service analytics, customer behavior modeling
• SAP Markets, Procurement: bidding, pattern-based offering, spend optimization
• SAP SCM: demand planning, SCOR KPIs
• SAP Portals: e-commerce analysis
• SAP Financials, Human Capital Management, SEM: balanced scorecard, planning, economic profit, benchmarking, employee turnover & retention, corporate investment management
• Content: +1700 Queries, +420 InfoCubes, 90 ODS Objects
• Platform capabilities: closed loop, drill-through (report-report i/f), remote cubes (read through), real-time data warehousing, data mining, write back to operational system
Source: SAP

CRM Vendors – Data Mining Integration
• Oracle CRM
  – Pre 9i: Darwin
  – Post 9i: ODM
• RightPoint and E.piphany
• SPSS and Siebel
• SAP CRM
  – Native data mining built into SAP BW, database independent
  – IBM Intelligent Miner interface with SAP BW
• PeopleSoft CRM
  – No official data mining product or vendor solution
  – Waiting for their response on what they have
Data Mining in SAP Business Information Warehouse

SAP BW 3.0b Data Mining Implementation
• Currently for the Customer subject area
• Algorithms supported
  – Decision Trees
  – Scoring
  – Clustering/Segmentation
  – Association
• Data mining process (no extensive data staging)
  – Model definition
  – Training the model
  – Performing prediction using the training results
  – Uploading the results back into BW
  – Utilizing the mining results (on the operational side)
SAPGUI is the interface to the data mining modeling and analysis.

Modeling a Decision Tree: Create a mining model
• Model columns
• Specifying the column parameters
• Data type of the column
• The nature of the column content
• Specifying the values in case the original values in the column are to be treated differently
• Indicating the prediction column
• Indicating the key column
Source: SAP

Modeling a Decision Tree: Specify Model Parameters
• Size of the window (such as 10%)
• Use a portion (%) of the data for training or the whole data set for training
• The number of repeats with different samples
• Stop training when the number of cases under the given node is less than or equal to the specified value
• Stop training when the accuracy is greater than or equal to the expected accuracy
• Use the information gain threshold to check the relevance
• If the tree is too big, prune the tree without violating the expected accuracy
Source: SAP

Modeling a Decision Tree: Create a training source and map the model columns
• BW Query
• Runtime parameters for query
• Model columns
• Selected source columns
• Mapping between model column and source column
Source: SAP

SAP BW Data Mining – Process Steps
• Create a mining model
• Train the model
• Predictions using training results
• Using the data mining results against BW Query
Source: SAP

Viewing Decision Tree Training Results
• This decision tree predicts whether the customer has left or is still "on board"
• Chances of a customer leaving is 70.7% if the profession is "LABOURER"
• Out of a total of 705 cases, 41 cases are covered under this node
• Chart shows the distribution at the selected node
• 28/41 customers are likely to leave
• 13/41 customers are likely to stay
Source: SAP

Data Mining – Decision Trees
Results are uploaded into BW, then BEx is used for further analysis.
Source: SAP

Data Mining – Association
• Create an Association model
• Define model columns
• Train the model
• Predictions using training results
• Using the data mining results against BW Query
Source: SAP
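To make the idea of an association model more concrete, the sketch below counts support and confidence for single-item rules over a handful of baskets. The baskets, item names and thresholds are hypothetical illustration data, and this is a generic pairwise rule count, not the algorithm or interface that SAP BW uses internally.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions (not from the webcast): each set is one customer basket.
BASKETS = [
    {"printer", "paper", "toner"},
    {"printer", "toner"},
    {"paper", "pens"},
    {"printer", "paper", "toner", "pens"},
    {"paper", "toner"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return sorted rules (lhs, rhs, support, confidence) of the form lhs -> rhs."""
    n = len(baskets)
    # How many baskets contain each item, and each unordered pair of items.
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence: of the baskets containing lhs, the share that also contain rhs.
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    for lhs, rhs, support, confidence in pair_rules(BASKETS):
        print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Raising `min_support` or `min_confidence` shrinks the rule list, which is the usual way to keep only the stronger associations for follow-up analysis.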
Data Mining – Cluster Analysis
• Create a Cluster model
• Train the model
• Predictions using training results
• Using the data mining results against BW Query
Source: SAP

Viewing Cluster Analysis Results
Results are uploaded into BW, then BEx is used for further analysis.
Source: SAP

SAP Data Mining
• Good attempt to implement a few data mining algorithms
• Very traditional data mining approach
• Requires a well-versed statistician or data mining expert to model and interpret the results
• Source is a BEx query, a big limitation in DM
• Weak visualization
• BEx for additional discovery: slicing and dicing

SAP BW - IBM Intelligent Miner
IBM Intelligent Miner is designed to:
• Copy data from SAP BW to IBM Intelligent Miner
  – Results of reports in BW
  – Modeling in Business Explorer Analyzer
  – Data direct from InfoCubes (for cross-selling analysis)
  – Descriptions, hierarchies
• Load results data from IBM IM back into SAP BW
  – Results of segmentation can be loaded as master data or hierarchies
• Transport data through wizards in SAP BW
  – Possible to get a good view of Intelligent Miner results from SAP BW

Pros and Cons of ERP-centric Data Mining

ERPs and Data Mining: Good and the Bad News
• Good News
  – Known business processes
  – Few data sources
  – Improved data quality
  – Metadata integration
  – Near real-time data mining
  – Closed-loop knowledge discovery
  – Consistent infrastructure
• Bad News
  – Complex data structures
  – Performance
  – Availability
  – Very few data mining algorithms today
CRISP-DM: 1. Business Understanding 2. Data Understanding 3. Data Preparation 4. Modeling 5. Evaluation 6. Deployment

Data Mining Process and ERP Data Mining
• Business Understanding, Data Understanding, Data Preparation: will reduce data mining project time up to 50%
• Deployment
CRISP-DM steps: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment
Source: http://www.crisp-dm.org/
Good news for future business applications

Q&A

INFORMATION FRAMEWORKS
• KNOWLEDGE TRANSFER: Seminars, Webinars, Keynotes, Panel Moderator, Publications, Hands-on training, Conferences, Executive and Senior IT Management Consulting
• INFORMATION TECHNOLOGY INVESTORS: Market Research, Market Assessment, Competitive Analysis, Technology due
• INFORMATION TECHNOLOGY ORGANIZATION
  – Enterprise Information Architectures (EIA): Business Case Development, Information Architecture, Application Deployment Architectures implementation, Legacy Application Migration Strategies, ERP Application deployment strategies
  – Enterprise Applications Integration (EAI): Architectures, Service Modeling and design, EAI technology assessment, Tools and Technology Assessment, Vendor Selection and Assessment, Conference Room Pilot implementation
  – Business Intelligence and Portals: Architectures, Methodologies, Tool/technology/Vendor assessment and selection; Data Warehouse, Data Marts, Analytics, Information Delivery Deployment Architectures; Business Intelligence and eBusiness Integration architectures; Portals Strategies, Business case, Assessment, Architectures, Modeling, Planning and Knowledge Transfer
• SOFTWARE AND SOLUTION VENDORS: Technology/Solution Assessment, Product Strategy, Solution Strategy, Product Positioning, Competitive Analysis, Software product architecture, Marketing Strategy, Product Performance and Benchmarking Consulting, Hardware Configuration
http://infoframeworks.com

Questions
Naeem Hashmi
Chief Technology Officer
September 10, 2002
Email: [email protected]
Web Site: http://infoframeworks.com
Tel: 603-432-4550