New Features in Enterprise Miner
Dr. John Brocklebank, SAS Institute Inc.
Gerhard Held, SAS Institute EMEA
Copyright © 2000 SAS EMEA
Agenda
• The Big Picture: Importance of analytics in today’s marketplace
• Enterprise Miner 4.0: Integration
• Enterprise Miner 4.0: New analytics/graphics
• Beyond Enterprise Miner 4.0
• Summary
Analysts: Analytical Applications are Key!
Three key stages of CRM implementation
• Operational CRM: Sales, marketing, service automation
• Analytical CRM: 74% of Global 2000 plan to invest more
• Collaboration CRM: Customer channels
Measuring Web Success
…instead, firms need an intelligent infrastructure to track visitors and their activities…
• Web-based reporting
• OLAP and query tools
• Data mining tools uncover hidden opportunities
Analytical Applications are Key
… and SAS Institute Leads the Pack!
SAS Institute is by far the worldwide market leader in statistical/data mining revenues (29.3%, 1999)
Enterprise Miner Release 4.0 to come with SAS Release 8.e
• Clients: Windows 2000, NT, 98, and 95
• Servers:
  • Windows 2000 and NT
  • Solaris 2.6 and 7 or higher
  • MVS ESA 5 or prior releases including all releases of OS/390
  • HP-UX 10.20 and 11.0
  • Compaq Tru64 Unix 4.0E
  • Intel ABI: all ABI+ compliant systems
New
Integration: Enterprise Miner 4.0 is V8e enabled
• Long variable and table names up to 32 bytes
• Handles data with mixed-case variable names
• Documentation is integrated with the rest of the SAS help system in HTML format
Sampling Tools for Metadata Creation in Warehouse Administrator Add-ins
Integration: Converts EM Score Code to C Functions for Deployment
DATA step score code to C functions
Beta for EM 4.0
New Analytics/Graphics: New Tree Viewer
• Written in Microsoft Foundation Classes
• Creates a thin client viewer
• Interactive Tree Display, presentation quality, printing
• %let emv4tree=1; * add to autoexec.sas;
• Select the “New view…” popup-menu item from the Tree node icon to launch the browser.
MFC Based Tree Browser
Results Browser for Associations
• Fast new neural network methodology
• Supports stand-alone principal components analysis
Experimental for EM 4.0
PROC DMVQ for SOM/Kohonen
• Dedicated procedure, provides enhanced speed
• Builds dummy variables and incorporates these into score code instead of calculating these outside of the procedure
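
To illustrate what it means for dummy variables to live inside the score code, here is a minimal Python sketch, not the output of PROC DMVQ. The input names (`income`, `region`), the level list, and the cluster seeds are hypothetical. The point is only that the caller passes the raw record and the dummy coding happens inside the scoring function.

```python
import math

# Hypothetical levels of a categorical input; not taken from the slides.
REGION_LEVELS = ("north", "south", "west")


def dummies(value, levels=REGION_LEVELS):
    """Return one 0/1 indicator per level of the categorical input."""
    return [1.0 if value == level else 0.0 for level in levels]


def score(record, seeds):
    """Assign a raw record to the nearest cluster seed.

    The dummy variables are built here, inside the score code, so the
    caller never has to prepare them beforehand.
    """
    vector = [record["income"]] + dummies(record["region"])
    distances = [math.dist(vector, seed) for seed in seeds]
    return distances.index(min(distances))


# Two hypothetical seeds in (income, north, south, west) space.
seeds = [[1.0, 1.0, 0.0, 0.0], [5.0, 0.0, 1.0, 0.0]]
cluster = score({"income": 1.2, "region": "north"}, seeds)
```

Keeping the encoding inside the scoring function means the deployed code needs only the raw inputs, which is the benefit the slide describes.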
SAS Code Node: New Macro Variables and Improved Interface
Integrated Installation; Quick Conversion
• Version 3.0x project: simply open it
• Version 2.0x project: use the Import Wizard
Major R&D Efforts on the Way
Some in SAS Release 8.2/EM 4.1
• Text Mining - Preprocessing and Variable Reduction
• Memory Based Reasoning - Fast models for e-intelligence
• Analytic Recommendation Engine using new Model Repository
• Java based score code
• Forecasting for cross sectional time series
• Genomic data mining add-ons for SNP linkage analysis and microarray expression profiling...
What Is Text Mining?
It is a process of
• converting free-form textual data to an intelligent infrastructure
so that we can
• extract implicit meaning and
• discover heretofore unknown information
via data analytical tools.
Applications & Customers
• Customer relationship learning
• E-mail routing
• Newsgroup filtering
• Newswire/News report analysis
• Document analysis
etc.
Remove Noise Words
• Stop words, e.g. are, hence, maybe, of, the,…
• Punctuation
• Non-discriminating words
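
The three kinds of noise listed above can be sketched in a few lines of Python. This is an illustrative sketch, not the Enterprise Miner implementation. The stop list contains only the slide's examples, and the 90% document-frequency threshold for "non-discriminating" is an assumption.

```python
import string
from collections import Counter

# Stop list built from the slide's examples only; a real list is much longer.
STOP_WORDS = {"are", "hence", "maybe", "of", "the"}

# Keep the hyphen so that terms such as "e-commerce" survive as one token.
PUNCTUATION = string.punctuation.replace("-", "")


def remove_noise_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, strip punctuation, and drop stop words."""
    table = str.maketrans(PUNCTUATION, " " * len(PUNCTUATION))
    tokens = text.lower().translate(table).split()
    return [t for t in tokens if t not in stop_words]


def non_discriminating_terms(token_lists, max_fraction=0.9):
    """Return terms found in more than max_fraction of the documents.

    The threshold is an assumed heuristic: a term that occurs almost
    everywhere cannot help tell documents apart.
    """
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    n_docs = len(token_lists)
    return {term for term, df in doc_freq.items() if df / n_docs > max_fraction}


tokens = remove_noise_words("The sales of e-commerce software are, maybe, rising.")
```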
Analyze Word Morphology
• Irregular words
  • understand & understood
  • swim & swam
• Stemming
  • walk & walking
  • dance & danced
• Hyphenated words
  • e-commerce
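
A minimal Python sketch of the three cases above, assuming a tiny hand-made irregular-form table and two suffix rules. Real stemmers use far larger rule sets. As with most stemmers, the stem only has to be the same for related forms and need not be a dictionary word.

```python
# Irregular forms need a lookup table; these entries are the slide's examples.
IRREGULAR = {"understood": "understand", "swam": "swim"}


def normalize(word):
    """Map a word to a shared stem so that related forms count as one term."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    if "-" in w:
        # Hyphenated words are kept whole as a single term.
        return w
    for suffix in ("ing", "ed"):
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            w = w[: -len(suffix)]
            break
    # Drop a trailing "e" so that "dance" and "danced" share a stem.
    if w.endswith("e") and len(w) - 1 >= 3:
        w = w[:-1]
    return w
```

With these rules, "walk" and "walking" share one stem, "dance" and "danced" share another, and "swam" is mapped to "swim" through the lookup table.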
Create Frequency Table
• Count occurrences of terms in each document
Term             Frequency count   Doc   Key
e-commerce       1                 2     1869
World wide web   2                 2     2001
World wide web   1                 3     2001
software         1                 1     2005
software         1                 5     2005
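
Counting terms per document can be sketched with Python's standard `collections.Counter`. The function name and the sample documents are hypothetical. The numeric term key shown on the slide is omitted because the transcript does not say how it is assigned.

```python
from collections import Counter


def term_frequency_table(docs):
    """Build sorted (term, doc_id, count) rows from {doc_id: [terms]}."""
    rows = []
    for doc_id, terms in docs.items():
        for term, count in Counter(terms).items():
            rows.append((term, doc_id, count))
    return sorted(rows)


# Hypothetical documents, already reduced to normalized terms.
docs = {
    1: ["software"],
    2: ["e-commerce", "world wide web", "world wide web"],
}
table = term_frequency_table(docs)
```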
Reduce Dimension via SVD
• Project document vectors into a k-dimensional best-fit subspace
• Choosing k properly should reduce “noise” in the data but preserve all relevant info
• Add additional target to projected image
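
In standard notation, which is introduced here and does not appear on the slides, let A be the term-by-document frequency matrix and a_j the column for document j. The truncated SVD gives the best rank-k approximation in the Frobenius norm, and each document is represented by its k coordinates in the subspace spanned by the first k left singular vectors:

```latex
A = U \Sigma V^{\top}, \qquad
A_k = U_k \Sigma_k V_k^{\top}
    = \arg\min_{\operatorname{rank}(B) \le k} \lVert A - B \rVert_F, \qquad
\hat{d}_j = U_k^{\top} a_j \in \mathbb{R}^{k}
```

Here U_k and V_k hold the first k columns of U and V, and Σ_k holds the k largest singular values.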
Fuzzy Pattern Matching
• When “fuzzy” pattern matching is used for categorization, it is called Memory-Based Reasoning, or Lazy Learning.
• Categorization can be done by having each of the neighbors “vote” on what category value to predict for the scored instance.
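
The neighbor "vote" can be sketched as a plain k-nearest-neighbor classifier. This is a minimal illustration, assuming Euclidean distance and an unweighted majority vote. The slides do not specify either choice, and the training data is hypothetical.

```python
import math
from collections import Counter


def mbr_predict(train, query, k=3):
    """Predict a category by majority vote among the k nearest neighbors.

    train is a list of (feature_vector, category) pairs. Nothing is fitted
    in advance ("lazy learning"): all work happens at scoring time.
    """
    neighbors = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(category for _, category in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical stored cases: two features and a category each.
train = [
    ((0, 0), "churn"),
    ((0, 1), "churn"),
    ((5, 5), "stay"),
    ((6, 5), "stay"),
    ((5, 6), "stay"),
]
prediction = mbr_predict(train, (0.2, 0.1), k=3)
```

Because the model is just the stored cases, adding or dropping cases changes its predictions immediately, with no refitting.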
When MBR is useful
• Target needs to be determined “on-the-fly”: e-intelligence, other web applications
• Different profiles will predict same target value.
• Target has many values, perhaps not mutually exclusive.
• Want to incrementally change the model over time (forgetting is possible).
Analytic Recommendation Engine (ARE)
• ARE plugs analytical results into an API
• API implemented as Java classes
• Configurable to do score lookup or real-time scoring
• Currently deployed in the SAS Publishing Web site www.sas.com
• Accesses Model Repository (MR) where all information is contained
Genome Miner Solution
Multiple Regression with Time Series Errors
An Example of Extended Inputs
Year   Y         X1        X2        Company
1975   317.60    3078.50   2.80      AA
1976   391.80    4661.70   52.60     AA
1977   410.60    5387.10   156.90    AA
.      .         .         .         .
.      .         .         .         .
1998   1304.40   6241.70   1777.30   AA
1999   1486.70   5593.60   2226.30   AA
2000   .         4989.18   2675.10   AA
2001   .         5045.91   3123.89   AA
1975   26.63     290.60    162.00    BB
1976   23.39     291.10    174.00    BB
.      .         .         .         .
.      .         .         .         .
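
One common way to write such a model is shown below. The slide does not state the error structure, so the first-order autoregressive error is an assumption, and the symbols are introduced here for illustration. For company i and year t:

```latex
y_{it} = \beta_0 + \beta_1 x_{1,it} + \beta_2 x_{2,it} + \nu_{it}, \qquad
\nu_{it} = \varphi\, \nu_{i,t-1} + \varepsilon_{it}
```

Under that assumption, the rows for 2000 and 2001, where X1 and X2 are supplied but Y is missing, would be forecast h steps beyond the last observed year T as:

```latex
\hat{y}_{i,T+h} = \hat{\beta}_0 + \hat{\beta}_1 x_{1,i,T+h} + \hat{\beta}_2 x_{2,i,T+h}
                + \hat{\varphi}^{\,h}\, \hat{\nu}_{iT}
```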
Data Mining Knowledge Solutions
Applications/Industries
• Available:
  • Cross-selling in finance
  • Fraud detection in finance
  • Rate making in insurance
  • Churn management in telco (CRM Knowledge Solution)
  • Intrusion detection (joint usage: Systems, e-intelligence)
• Coming this year:
  • Mining of quality data in manufacturing
  • Database marketing in retail
  • Credit scoring in banking
  • Web mining
  • Customer attrition in finance
Data Mining and other Initiatives
• Data Mining and CRM
  • Customer acquisition, retention, cross-sell, up-sell, profitability, fraud, closed-loop.
• Data Mining and Web-based Computing
  • On-line scoring of customer over Web, WAP phone, PALM
• Data Mining and “e”
  • Customer profiling, personalization, tailoring web site to user behaviour, identify potential, increase “stickiness” of site
• Data Mining and Pharma
  • Find active chemical structures, new drug discovery, outcomes research, sales and marketing
• Data Mining and Systems
  • Capacity planning, intrusion detection
Conclusion
• Analytical applications are key and SAS software is leading the Pack!
• Enterprise Miner Version 4.0:
  • V8e enabled/integrated
  • Windows 2000, MVS, Intel ABI
  • Deployment using C-scoring
  • Some new GUIs, algorithms: MFC Tree Browser, DMNeural
• Major R&D on the way: text mining, MBR, MR, Java-based score code, forecasting, genomics.
• DM-based Knowledge Solutions, co-operation with other Initiatives
• … and Web Mining is a key part from now on!