Introduction to SQL Server Data Mining
Nick Ward
SQL Server & BI Product Specialist
Microsoft Australia
Agenda
• What is Data Mining?
• Why use Data Mining?
• Data Mining Tasks
• Data Mining Process
• SQL Server 2005 Data Mining Demonstration
• SQL Server 2005 Data Mining Discussion
What is Data Mining?
What is not Data Mining?
• Ad-Hoc Query
• Event Notifications
• Multidimensional Analysis/Slice Dice
• Statistics
• OLAP
• Canned or ad-hoc reports
What is Data Mining?
• “Data mining is the semiautomatic extraction of patterns, changes, associations, anomalies, and other statistically significant structures from large data sets.” R. Grossman
• Also known as
– Machine Learning
– Predictive Analytics
Why Data Mining?
Disk
Processor
Time
Types of Analysis
• Query-Reporting-Analysis: “What happened?”
– Simple Reports
– Key Performance Indicators
– OLAP Cubes – Slice/Dice
• Real-Time: “What is happening?”
– Events/Triggers
• Data Mining
– “What will happen?”
– “How/why did this happen?”
Data Mining Tasks
• Explores Your Data
• Finds Patterns
• Performs Predictions
Data Mining Tasks
[Diagram: Training Data → DM Engine → Mining Model; Data To Predict + Mining Model → DM Engine → Predicted Data]
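The two-step flow in the diagram (train a mining model from training data, then apply the model to new data to get predictions) can be sketched in a few lines of Python. This is only an illustration of the idea, not how the SQL Server DM engine works internally: the `train` and `predict` functions, the toy rows, and the majority-vote "model" are all assumptions made for this example.

```python
from collections import Counter, defaultdict

def train(rows, input_col, target_col):
    """Build a toy 'mining model': the most common target per input value."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[input_col]][row[target_col]] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}

def predict(model, rows, input_col, default=None):
    """Apply the model to new rows, returning one prediction per row."""
    return [model.get(row[input_col], default) for row in rows]

# Hypothetical training data: does a customer buy a bike?
training_data = [
    {"commute": "short", "buys_bike": "yes"},
    {"commute": "short", "buys_bike": "yes"},
    {"commute": "short", "buys_bike": "no"},
    {"commute": "long", "buys_bike": "no"},
    {"commute": "long", "buys_bike": "no"},
]

# Step 1: training data -> engine -> mining model
model = train(training_data, "commute", "buys_bike")

# Step 2: data to predict + mining model -> engine -> predicted data
new_data = [{"commute": "short"}, {"commute": "long"}, {"commute": "medium"}]
predictions = predict(model, new_data, "commute", default="unknown")
print(predictions)
```

The model is built once and can then be reused for any number of prediction runs, which is the point of separating the two steps.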
Customer Examples
• ComputerFleet (Australia): Predict when hired equipment will be returned
• Sanford Securities (Australia): Data mining automation
• Clait Health Services: Identify patients likely to suffer deteriorating health for pro-active treatment
• AIM Healthcare: Identify billing errors, duplicate payments etc. to minimize costs
Data Mining Tasks
• Classification
• Estimation
• Segmentation
• Association
• Forecasting
• Text Analysis
Data Mining Tasks: Classification
• What type of membership card should I offer?
• Which customers will respond to my mailing?
• Is this transaction fraudulent?
• Will I lose this customer?
• Will this product be defective?
• Why is my system failing?
• Which patients' health will degrade?
Data Mining Tasks: Estimation
• How much revenue will I get from this customer?
• How long will this asset be in service?
• What is the mean time to failure?
• What is the particle density of this fluid?
Data Mining Tasks: Segmentation
• Describe my customers
• How can I differentiate my customers?
• How can I organize my data in a manner that makes sense?
• Is this record an outlier?
Data Mining Tasks: Association
• What items are bought together?
• Which services are used together?
• What products should I recommend to my customers?
Data Mining Tasks: Forecasting
– What are projected revenues for all products?
– What are inventory levels next month?
Data Mining Tasks: Text Analysis
• Analysis of unstructured data
– Finds key terms and phrases in text
– Conversion to structured data
– Feed into other algorithms (Classification, Segmentation, Association)
• How do I handle call center data?
• How can I classify mail?
• What can I do with web feedback?
Data Mining Process
CRISP-DM
[Diagram: CRISP-DM cycle around a central Data store, with the phases Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment; labelled “Doing Data Mining” and “Putting Data Mining to Work”]
www.crisp-dm.org
Value of Data Mining
[Chart: Relative Business Value and Business Knowledge plotted against Usability (Easy to Difficult) for Reports (Static), Reports (Adhoc), OLAP, and SQL Server 2005 Data Mining]
Data Mining User Interface
• SQL Server BI Development Studio
– Creation and exploration environment
– Data Mining projects inside Visual Studio solutions with related projects
– Source Control Integration
• SQL Server Management Studio
– Single place for management of all SQL Server technologies
– Manage, Browse, and Query Data Mining Models
Data Mining
Data Mining Algorithms
• Classification
• Estimation
• Segmentation
• Association
• Forecasting
• Text Analysis
Data Mining Algorithms: Classification
• Decision Trees
• Neural Nets
• Naïve Bayes
• Logistic Regression
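To make one of the classification algorithms listed above concrete, here is a minimal Naïve Bayes sketch in plain Python. It is a textbook version with Laplace smoothing, not the Microsoft Naive Bayes implementation; the function names and the small fraud data set are hypothetical and exist only for this example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, target):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)   # (class, attribute) -> value counts
    values_seen = defaultdict(set)        # attribute -> distinct values
    for row in rows:
        for attr, value in row.items():
            if attr != target:
                value_counts[(row[target], attr)][value] += 1
                values_seen[attr].add(value)
    return class_counts, value_counts, values_seen

def classify(model, case):
    """Return the class with the highest log-probability (Laplace smoothed)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for cls, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for attr, value in case.items():
            count = value_counts[(cls, attr)][value]
            score += math.log((count + 1) / (n + len(values_seen[attr])))
        if score > best_score:
            best_class, best_score = cls, score
    return best_class

# Hypothetical transactions for "Is this transaction fraudulent?"
transactions = [
    {"channel": "web", "amount": "high", "fraud": "yes"},
    {"channel": "web", "amount": "high", "fraud": "yes"},
    {"channel": "web", "amount": "low", "fraud": "no"},
    {"channel": "store", "amount": "low", "fraud": "no"},
    {"channel": "store", "amount": "high", "fraud": "no"},
    {"channel": "store", "amount": "low", "fraud": "no"},
]
nb_model = train_naive_bayes(transactions, "fraud")
print(classify(nb_model, {"channel": "web", "amount": "high"}))
print(classify(nb_model, {"channel": "store", "amount": "low"}))
```

Working in log-probabilities avoids numeric underflow when many attributes are multiplied together, and the smoothing keeps an unseen attribute value from zeroing out a class.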
Data Mining Algorithms: Estimation
• Decision Trees
• Neural Nets
• Logistic Regression
• Linear Regression
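As an illustration of estimation, the simplest algorithm in the list above, linear regression, can be sketched as an ordinary least-squares fit of a single line. The data (months as a customer versus monthly revenue) and the function name `fit_line` are made up for this example, and the sketch is not meant to reflect how the SQL Server algorithm is implemented.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: months as a customer vs. monthly revenue
months = [1, 2, 3, 4, 5]
revenue = [120, 140, 160, 180, 200]

slope, intercept = fit_line(months, revenue)
estimate_month_6 = slope * 6 + intercept
print(slope, intercept, estimate_month_6)
```

Unlike classification, the output here is a continuous number, which is what questions such as "How much revenue will I get from this customer?" call for.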
Data Mining Algorithms: Segmentation
• Clustering
• Sequence Clustering
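Segmentation by clustering can be illustrated with k-means, one common clustering technique. The sketch below is a generic k-means loop with fixed starting centroids so the result is repeatable; it is not a reproduction of the Microsoft Clustering algorithm, and the customer tuples (age, yearly spend in thousands) are invented for the example.

```python
def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Keep the old centroid if a cluster ends up empty
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (age, yearly spend in thousands)
customers = [(25, 2), (27, 3), (23, 1), (55, 9), (60, 10), (58, 11)]
centroids, clusters = k_means(customers, centroids=[customers[0], customers[3]])
print(centroids)
print(clusters)
```

Each resulting cluster is a candidate customer segment, and a record that sits far from every centroid is a natural answer to "Is this record an outlier?".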
Data Mining Algorithms: Association
• Association Rules
• Decision Trees
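Association rules are usually scored by support (how often the items occur together) and confidence (how often the consequent appears when the antecedent does). The sketch below computes both for a single rule over a handful of hypothetical shopping baskets; the function name `rule_stats` and the data are assumptions for illustration, and a real algorithm would also search for candidate rules rather than score one given rule.

```python
def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(1 for basket in baskets if antecedent <= basket)
    n_both = sum(1 for basket in baskets if (antecedent | consequent) <= basket)
    support = n_both / len(baskets)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

# Hypothetical baskets for "What items are bought together?"
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
support, confidence = rule_stats(baskets, {"bread"}, {"milk"})
print(support, confidence)
```

Here the rule "bread implies milk" holds in 2 of 4 baskets (support) and in 2 of the 3 baskets that contain bread (confidence).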
Data Mining Algorithms: Forecasting
• Time Series
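A time series forecast predicts future values from the history of the series itself. As a deliberately simple stand-in (a moving average, not the Microsoft Time Series algorithm), the sketch below forecasts several steps ahead by feeding each prediction back into the history. The inventory figures and the `forecast` function are hypothetical.

```python
def forecast(series, window, steps):
    """Forecast `steps` future values, each as the mean of the last
    `window` values (earlier forecasts are fed back in as history)."""
    history = list(series)
    predictions = []
    for _ in range(steps):
        next_value = sum(history[-window:]) / window
        predictions.append(next_value)
        history.append(next_value)
    return predictions

# Hypothetical monthly inventory levels
inventory = [90, 100, 110, 120]
next_two_months = forecast(inventory, window=2, steps=2)
print(next_two_months)
```

A moving average lags behind a trending series, so production-grade forecasting models typically also account for trend and seasonality.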
Data Mining Algorithms: Text Analysis
• Integration Services
– Term Extraction Transform
– Term Lookup Transform
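The general idea behind term extraction and term lookup (find key terms in free text, then count them per document so the text becomes structured columns) can be sketched as follows. This is a rough Python illustration and not the Integration Services transforms themselves; the stop-word list, function names, and support-call snippets are all assumptions.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list for the example
STOP_WORDS = {"the", "a", "is", "my", "and", "to", "was", "it", "i"}

def extract_terms(text):
    """Lower-case the text, split it into words, and drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def to_structured(documents, top_n=2):
    """Pick the top_n most frequent terms overall (extraction), then
    count each of them in every document (lookup)."""
    totals = Counter(t for doc in documents for t in extract_terms(doc))
    key_terms = [term for term, _ in totals.most_common(top_n)]
    rows = [[extract_terms(doc).count(t) for t in key_terms] for doc in documents]
    return key_terms, rows

# Hypothetical call center notes
calls = [
    "The printer is broken",
    "Printer jams and the printer was slow",
    "My invoice is wrong",
    "Broken printer, broken tray",
]
key_terms, rows = to_structured(calls)
print(key_terms)
print(rows)
```

The resulting rows of term counts are ordinary structured data, so they can be fed into classification, segmentation, or association algorithms.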
Data Mining Programmability
• DMX Query Interface
– OLEDB, ADO, ADO.Net, ADOMD.Net, XMLA
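' Slide pseudocode: type names are as shown on the slide; conn is an open connection
Dim cmd As New ADOMD.Command
Dim reader As ADOMD.DataReader
cmd.Connection = conn
' Run a DMX prediction query and read the results
Set reader = cmd.ExecuteReader("Select Predict(Gender)…")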
• Data Mining Object Model
– Analysis Management Objects (AMO)
– ADOMD.Net, Server ADOMD.Net
– Direct access to Mining content
– CLR User Defined Procedures execute on the server
• Expandability
– Plug-In Algorithms
– Plug-In Viewers
Session Summary
• Data Mining is the automatic extraction of information from data for descriptive or predictive purposes
• Data Mining addresses a wide variety of problems
• SQL Server 2005 contains a full-featured set of data mining tools and APIs for the creation and deployment of data mining solutions.
Next Steps
1) SQL Server website: http://www.microsoft.com/sql
2) Virtual labs
3) Data Mining Tutorial
4) Find more info at: http://www.sqldatamining.com
5) Ask Questions: news:microsoft.public.sqlserver.datamining