Data Mining + Business Intelligence
Integration, Design and Implementation
ABOUT ME
Vijay Kotu
Data, Business, Technology, Statistics
www.LearnPredictiveAnalytics.com
BUSINESS INTELLIGENCE
-
Result
Making data accessible
Wider distribution
Dimensional slicing
Mostly as-is reporting
DATA MINING
-
Finding useful patterns in data
Limited distribution
Algorithms
Insights and Predictions
DATA MINING
Data mining, in simpler terms, is finding useful patterns in data.
“It is non-trivial process of finding useful, valid, novel, understandable patterns or relationships in the data to make important decisions” (Fayyad et al., 1996)
Statistics
Quantitative
Operations Research
Computing
Machine Learning
Data Stores
Computation
Machine Learning,
Optimization, Algorithms
DATA MINING: MODELS
DATA MINING: TYPES
Tasks
Regression
Classification
Feature Selection
Clustering
Data Mining
Text Mining
Anomaly detection
Time Series
Applications
Association
DATA MINING: TYPES
Tasks and examples
Classification: Assigning voters to known buckets by political party, e.g., soccer moms. Bucketing new customers into one of the known customer groups.
Regression: Predicting the unemployment rate for next year. Estimating an insurance premium.
Anomaly detection: Detecting fraudulent credit card transactions. Network intrusion detection.
Time series: Sales forecasting, production forecasting, virtually any growth phenomenon that needs to be extrapolated.
Clustering: Finding customer segments in a company based on transaction, web and customer call data.
Association analysis: Finding cross-selling opportunities for a retailer based on transaction purchase history.
DATA MINING: TYPES
Tasks and algorithms
Classification: Decision trees, neural networks, Bayesian models, rule induction, k-nearest neighbors
Regression: Linear regression, logistic regression
Anomaly detection: Distance based, density based, local outlier factor (LOF)
Time series: Exponential smoothing, ARIMA, regression
Clustering: k-means, density-based clustering (DBSCAN)
Association analysis: FP-Growth, Apriori
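As one illustration of the algorithms listed for time series, here is a minimal sketch of simple exponential smoothing in plain Python. The function name, the `monthly_sales` figures and the choice of `alpha` are assumptions made for this example; they do not come from the slides.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing.

    s[0] = series[0]
    s[t] = alpha * series[t] + (1 - alpha) * s[t - 1]
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Made-up monthly sales figures, for illustration only.
monthly_sales = [100, 110, 105, 120]
smoothed = exponential_smoothing(monthly_sales, alpha=0.5)

# With this simple method, the one-step-ahead forecast is the last smoothed value.
forecast = smoothed[-1]
print(smoothed, forecast)
```

A larger `alpha` weights recent observations more heavily; a smaller one produces a smoother, slower-reacting series.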
DATA MINING: PROCESS
Data Mining Scoring
Data Mining + Business Intelligence
CLASSIC BI ARCHITECTURE
Diagram: Security Layer; Extraction, Transformation & Loading; Star Schema; Staging; OLAP; Dashboards, reports, alerts, ad hoc...
ANALYTICAL ARCHITECTURE #1
Data Mining Tool Scoring
Diagram: Data Mining Tool; Extraction, Transformation & Loading; Star Schema; Staging; OLAP; Dashboards, reports, alerts, ad hoc...
The data mining tool does the scoring. Robust modeling and scoring capabilities. The BI tool reports the scores like any other data points.
Limitations: New records cannot be scored unless scoring is provided by the DM tool. Requires multiple analytical tools.
ANALYTICAL ARCHITECTURE #2
Database Scoring
Diagram: Extraction, Transformation & Loading; Star Schema; Staging; OLAP; Dashboards, reports, alerts, ad hoc...
The database does the scoring. Can handle large data. Model, scoring and data in one place.
Limitations: DB vendors have to provide a full DM suite. Analysis skills.
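To make the database-scoring idea concrete, the following sketch expresses a model as a SQL expression so that scores are computed where the data lives. It uses Python's built-in `sqlite3` module as a stand-in for a warehouse; the `customers` table, its columns and the linear coefficients are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical linear churn-score model; the coefficients are made up.
INTERCEPT, W_CALLS, W_TENURE = 0.2, 0.05, -0.01

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, support_calls INTEGER, tenure_months INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 4, 10), (2, 0, 30)],
)

# The scoring formula runs inside the database engine, next to the data.
rows = conn.execute(
    "SELECT id, ? + ? * support_calls + ? * tenure_months AS score "
    "FROM customers ORDER BY id",
    (INTERCEPT, W_CALLS, W_TENURE),
).fetchall()

for customer_id, score in rows:
    print(customer_id, round(score, 3))
```

Because the score is just another column in the result set, a BI layer on top of the database could report it like any other metric.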
ANALYTICAL ARCHITECTURE #3
BI Scoring: Native Modeling
Diagram: Extraction, Transformation & Loading; Star Schema; Staging; OLAP; Dashboards, reports, alerts, ad hoc...
The BI platform does the scoring. Good integration of predictive metrics with BI metrics. Security. Distribution. Real-time scoring.
Limitations: Performance. Limited functionality.
ANALYTICAL ARCHITECTURE #4
BI Scoring: Data Mining Tool Modeling
Diagram: Data Mining Tool; PMML Model; Extraction, Transformation & Loading; Star Schema; Staging; OLAP; Dashboards, reports, alerts, ad hoc...
The BI platform does the scoring. The model is built in the DM tool and imported into the BI platform as PMML. Real-time scoring. Supports a wide selection of algorithms.
Limitations: Performance.
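The hand-off in this architecture can be sketched as follows: a model exported as PMML is parsed and then used to score a record. The PMML fragment below is a hypothetical, stripped-down regression model (a real export would also carry a Header, DataDictionary and MiningSchema), and the field names `visits` and `basket_size` are invented for the example. This is not how any particular BI platform implements PMML import.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal PMML fragment; not a complete, schema-valid document.
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_1" version="4.1">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="10.0">
      <NumericPredictor name="visits" coefficient="2.0"/>
      <NumericPredictor name="basket_size" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

NS = {"pmml": "http://www.dmg.org/PMML-4_1"}


def load_linear_model(pmml_text):
    """Extract the intercept and per-field coefficients from the fragment."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(record, intercept, coefficients):
    """Apply the linear model to one record (a dict of field values)."""
    return intercept + sum(c * record[name] for name, c in coefficients.items())


intercept, coefficients = load_linear_model(PMML_DOC)
prediction = score({"visits": 3, "basket_size": 4}, intercept, coefficients)
print(prediction)
```

The point of the sketch is the separation of concerns: the modeling tool only has to emit the XML, and the scoring side only has to interpret it.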
ANALYTICAL ARCHITECTURE
Data Mining Tool Scoring
Database Scoring
BI Scoring
- Native Modeling
- Data Mining Tool Modeling
USE CASE
Association Analysis or
Market Basket Analysis
CLICKSTREAM DATA
Can be generalized to transactions
Applies to any product purchases in an enterprise
CLICKSTREAM DATA
Creation of Association Rules
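The creation of association rules from clickstream sessions can be sketched with the two standard measures, support and confidence. The code below is a brute-force illustration over item pairs only, not Apriori or FP-Growth, and the page names in `sessions` are made up for the example.

```python
from itertools import combinations


def association_rules(transactions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs.

    support(X)        = fraction of transactions containing every item in X
    confidence(A->B)  = support({A, B}) / support({A})
    """
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))

    def support(itemset):
        return sum(1 for basket in baskets if itemset <= basket) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Hypothetical clickstream sessions: the pages visited in each session.
sessions = [
    ["home", "news", "sports"],
    ["home", "news"],
    ["home", "sports"],
    ["news", "weather"],
]
rules = association_rules(sessions, min_support=0.5, min_confidence=0.9)
print(rules)
```

The same function applies unchanged to purchase transactions, since each session is treated simply as a set of items.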
DATA MINING USING BI SYSTEM
Model Building in BI
MicroStrategy Desktop > Data Mining Services
DATA MINING SERVICE
MicroStrategy Desktop > Data Mining Services
MODEL DETAILS
MicroStrategy Desktop > Data Mining Services
RESULTS
MicroStrategy Desktop > Data Mining Services
PMML
MicroStrategy Desktop > Data Mining Services
BI VS. DATA MINING THINKING
Number of customers lost last month
Production downtime report
ROI for Marketing Campaigns
Yesterday’s revenue
Who will most likely churn in the next 10 days
What part of the process will fail, and how to mitigate it
What action will the prospect take next
Tomorrow’s
Data Mining + Business Intelligence
ISSUES
Data Mining
Business Intelligence
- People: Data mining skills and business intelligence skills are largely distinct
- Organization: The two functions live in different organizations within an enterprise
- Technology: Minimal overlap in tools, platforms and technology
- Use cases: Historical reporting vs. prediction and insights
BENEFITS
Data Mining
Business Intelligence
- Distribution: Data mining insights will have wider, real-time distribution
- Smarter Analytics: History + Predictions
- Visual discovery: Common link
- Security: Secure delivery of insights
RECOMMENDED READING
Advanced Reporting Guide: Enhancing Your Business Intelligence
OPEN SOURCE DATA MINING TOOLS
THANK YOU
Vijay Kotu
linkedin.com/in/vkotu
www.LearnPredictiveAnalytics.com
Data Mining + Business Intelligence
Appendix
CLUSTERING
Data Set
k-Means Clustering
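The k-means procedure named in this appendix can be sketched in plain Python: assign each point to its nearest centroid, recompute each centroid as the mean of its points, and repeat until nothing changes. The data points and the starting centroids below are invented for illustration, and the starting centroids are passed in explicitly so the run is deterministic.

```python
def k_means(points, centroids, max_iterations=100):
    """Lloyd's algorithm for k-means with caller-supplied initial centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids
            ]
            clusters[distances.index(min(distances))].append(p)

        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Made-up two-dimensional data with two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

In practice the result depends on the starting centroids, which is why tools usually run several random initializations and keep the best one.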
DECISION TREES
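A decision tree is grown by repeatedly choosing the split that makes the resulting groups purest. The sketch below shows that single step for one numeric attribute using Gini impurity; other criteria such as entropy are equally common, and the slides do not say which one is used. The `ages` and `churned` values are made up for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Find the threshold on one numeric attribute that minimizes the
    weighted Gini impurity of the two resulting groups.

    Returns (threshold, impurity); threshold is None if no split improves
    on the impurity of the unsplit data.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_impurity = None, gini(labels)
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between identical values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Made-up training data: customer age and whether the customer churned.
ages = [22, 25, 47, 52]
churned = ["yes", "yes", "no", "no"]
threshold, impurity = best_split(ages, churned)
print(threshold, impurity)
```

A full tree builder would apply this search to every attribute, split on the best one, and recurse into each branch until the groups are pure or a stopping rule is met.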