New Features in Enterprise Miner
Dr. John Brocklebank, SAS Institute Inc.
Gerhard Held, SAS Institute EMEA
Copyright © 2000 SAS EMEA

New Features in Enterprise Miner: Agenda
• The Big Picture: Importance of analytics in today’s marketplace
• Enterprise Miner 4.0: Integration
• Enterprise Miner 4.0: New analytics/graphics
• Beyond Enterprise Miner 4.0
• Summary

Analysts: Analytical Applications are Key!
Three key stages of CRM implementation
• Operational CRM: Sales, marketing, and service automation
• Analytical CRM: 74% of Global 2000 plan to invest more
• Collaboration CRM: Customer channels
Measuring Web Success
…instead, firms need an intelligent infrastructure to track visitors and their activities…
• Web-based reporting
• OLAP and query tools
• Data mining tools uncover hidden opportunities

Analytical Applications are Key … and SAS Institute Leads the Pack!
SAS Institute is by far the worldwide market leader in statistical / data mining revenues (29.3%, 1999)

Enterprise Miner Release 4.0 to come with SAS Release 8.e
• Clients: Windows 2000, NT, 98, and 95
• Servers:
  - Windows 2000 and NT
  - Solaris 2.6 and 7 or higher
  - MVS ESA 5 or prior releases including all releases of OS/390
  - HP-UX 10.20 and 11.0
  - Compaq Tru64 Unix 4.0E
  - Intel ABI: all compliant ABI+ systems [New]

Integration: Enterprise Miner 4.0 is V8e enabled
• Long variable and table names up to 32 bytes
• Handles data with mixed-case variable names
• Documentation is integrated with the rest of the SAS help system in HTML format

Sampling Tools for Metadata Creation in Warehouse Administrator Add-ins

Integration: Converts EM Score Code to C Functions for Deployment
• DATA step score code to C functions
• Beta for EM 4.0

New Analytics/Graphics: New Tree Viewer
• Written in Microsoft Foundation Classes (MFC)
• Creates a thin client viewer
• Interactive tree display, presentation quality, printing
• %let emv4tree=1; * add to autoexec.sas;
• Select the “New view…” popup-menu item from the Tree node icon to launch the browser.

MFC Based Tree Browser

Results Browser for Associations

• Fast new neural network methodology
• Supports stand-alone principal components analysis
Experimental for EM 4.0

PROC DMVQ for SOM/Kohonen
• Dedicated procedure, provides enhanced speed
• Builds dummy variables and incorporates them into the score code instead of calculating them outside of the procedure

SAS Code Node: New Macro Variables and Improved Interface

Integrated Installation; Quick Conversion
• Version 3.0x project: simply open it
• Version 2.0x project: use the Import Wizard

Major R&D Efforts on the Way
Some in SAS Release 8.2/EM 4.1
• Text Mining: preprocessing and variable reduction
• Memory-Based Reasoning: fast models for e-intelligence
• Analytic Recommendation Engine using the new Model Repository
• Java-based score code
• Forecasting for cross-sectional time series
• Genomic data mining add-ons for SNP linkage analysis and microarray expression profiling...

What Is Text Mining?
It is a process of
• converting free-form textual data to an intelligent infrastructure so that we can extract implicit meaning and
• discover heretofore unknown information
• via data analytical tools.

Applications & Customers
• Customer relationship learning
• E-mail routing
• Newsgroup filtering
• Newswire/news report analysis
• Document analysis, etc.

Remove Noise Words
• Stop words, e.g. are, hence, maybe, of, the,…
• Punctuation
• Non-discriminating words

Analyze Word Morphology
• Irregular words
  - understand & understood
  - swim & swam
• Stemming
  - walk & walking
  - dance & danced
• Hyphenated words
  - e-commerce

Create Frequency Table
• Count occurrences of terms in each document

Term             Frequency count   Doc   Key
e-commerce       1                 2     1869
World wide web   2                 2     2001
World wide web   1                 3     2001
software         1                 1     2005
software         1                 5     2005

Reduce Dimension via SVD
• Project document vectors into a k-dimensional best-fit subspace
• Choosing k properly should reduce “noise” in the data but preserve all relevant information
• Add additional target to projected image

Fuzzy Pattern Matching
• When “fuzzy” pattern matching is used for categorization, it is called Memory-Based Reasoning (MBR), or Lazy Learning.
• Categorization can be done by having each of the neighbors “vote” on what category value to predict for the scored instance.

When MBR is useful
• Target needs to be determined “on-the-fly”: e-intelligence, other web applications
• Different profiles will predict the same target value.
• Target has many values, perhaps not mutually exclusive.
• Want to incrementally change the model over time (forgetting possible).

Analytic Recommendation Engine (ARE)
• ARE plugs analytical results into an API
• API implemented as Java classes
  - Configurable to do score lookup or real-time scoring
  - Currently deployed in the SAS Publishing Web site www.sas.com
• Accesses Model Repository (MR) where all information is contained

Genome Miner Solution

Multiple Regression with Time Series Errors
An Example of Extended Inputs

Year   Y         X1        X2        Company
1975   317.60    3078.50   2.80      AA
1976   391.80    4661.70   52.60     AA
1977   410.60    5387.10   156.90    AA
.      .         .         .         .
.      .         .         .         .
1998   1304.40   6241.70   1777.30   AA
1999   1486.70   5593.60   2226.30   AA
2000   .         4989.18   2675.10   AA
2001   .         5045.91   3123.89   AA
1975   26.63     290.60    162.00    BB
1976   23.39     291.10    174.00    BB
.      .         .         .         .
.      .         .         .         .

Data Mining Knowledge Solutions: Applications/Industries
• Available:
  - Cross-selling in finance
  - Fraud detection in finance
  - Rate making in insurance
  - Churn management in telco (CRM Knowledge Solution)
  - Intrusion detection (joint usage: Systems, e-intelligence)
• Coming this year:
  - Mining of quality data in manufacturing
  - Database marketing in retail
  - Credit scoring in banking
  - Web mining
  - Customer attrition in finance

Data Mining and other Initiatives
• Data Mining and CRM
  - Customer acquisition, retention, cross-sell, up-sell, profitability, fraud, closed-loop
• Data Mining and Web-based Computing
  - On-line scoring of customers over Web, WAP phone, PALM
• Data Mining and “e”
  - Customer profiling, personalization, tailoring web site to user behaviour, identifying potential, increasing “stickiness” of site
• Data Mining and Pharma
  - Finding active chemical structures, new drug discovery, outcomes research, sales and marketing
• Data Mining and Systems
  - Capacity planning, intrusion detection

Conclusion
• Analytical applications are key and SAS software is leading the pack!
• Enterprise Miner Version 4.0:
  - V8e enabled/integrated
  - Windows 2000, MVS, Intel ABI
  - Deployment using C-scoring
  - Some new GUIs, algorithms: MFC Tree Browser, DMNeural
• Major R&D on the way: text mining, MBR, MR, Java-based score code, forecasting, genomics
• DM-based Knowledge Solutions, co-operation with other Initiatives
• … and Web Mining is a key part from now on!
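
Illustrative sketch: text mining preprocessing
The preprocessing steps described in the text mining slides (removing noise words, analyzing word morphology, creating a frequency table) can be sketched in a few lines of Python. This is not Enterprise Miner code and does not reflect how the SAS implementation works. The stop-word list and irregular-word lookup contain only the examples given on the slides, the function names are hypothetical, and the stemmer is a deliberate simplification that reduces a word only when its base form also occurs in the corpus.

```python
import re
from collections import Counter

# Assumed for illustration: stop words seeded with the slide's examples.
STOP_WORDS = {"are", "hence", "maybe", "of", "the"}
# Assumed for illustration: irregular forms taken from the slide's examples.
IRREGULAR = {"understood": "understand", "swam": "swim"}
# Lowercase words; hyphenated words such as "e-commerce" are kept whole.
TOKEN = re.compile(r"[a-z]+(?:-[a-z]+)*")


def tokenize(text):
    """Lowercase the text, drop punctuation, and remove stop words."""
    return [t for t in TOKEN.findall(text.lower()) if t not in STOP_WORDS]


def stem(word, vocabulary):
    """Reduce a word to a base form that also occurs in the vocabulary."""
    if word in IRREGULAR:
        return IRREGULAR[word]
    for suffix in ("ing", "ed", "d"):
        base = word[: -len(suffix)]
        if word.endswith(suffix) and base in vocabulary:
            return base
    return word


def frequency_table(documents):
    """Return sorted (term, doc_id, count) rows for a {doc_id: text} mapping."""
    tokens = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    vocabulary = {t for doc_tokens in tokens.values() for t in doc_tokens}
    counts = Counter(
        (stem(t, vocabulary), doc_id)
        for doc_id, doc_tokens in tokens.items()
        for t in doc_tokens
    )
    return sorted((term, doc_id, n) for (term, doc_id), n in counts.items())


# Made-up documents that exercise the slide's example words.
docs = {
    1: "The dance of e-commerce: walk, walking, danced!",
    2: "Maybe swim, swam; hence understood.",
}
for row in frequency_table(docs):
    print(row)
```

Each output row corresponds to one line of a term-by-document frequency table. Rows of this kind would be the input to a dimension reduction step such as SVD, which this sketch does not cover.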