13 October, 2014 | SQL Server User Group Norway
Taking Your Application Design to the Next Level with SQL Server Data Mining
Presenter Introduction
Peter Myers
BI Expert – Bitwise Solutions
BBus, SQL Server MCSE, MCT, SQL Server MVP
Experienced in designing, developing and maintaining
Microsoft database and application solutions since 1997
Focuses on education and mentoring
Based in Melbourne, Australia
[email protected]
http://www.linkedin.com/in/peterjsmyers
Session Objectives
To introduce data mining and the data mining process
To introduce SQL Server data mining
To describe the SQL Server data mining algorithms
To demonstrate how data mining can be used to enrich
.NET application experiences
Session Outline
Introducing Data Mining
Describing the Data Mining Process
Introducing Analysis Services 2014
Demonstrations
A Message for Developers
Summary
Resources
Introducing Data Mining
Addresses the problem:
“Too much data and not enough information”
Enables data exploration, pattern discovery, and
prediction — which lead to knowledge discovery
Forms a key part of a Business Intelligence solution
Introducing Data Mining
Enabling Predictive Analytics
[Figure: role of software plotted against business insight]
Role of Software: Passive, Interactive, Proactive
Business Insight: Presentation, Exploration, Discovery
Passive: Canned Reporting
Interactive: Ad hoc Reporting, Data Model (BISM)
Proactive: Data Mining, Predictive Analytics
Introducing Data Mining
Business Scenarios
[Figure: Data Mining at the center, surrounded by the scenarios below]
Seek Profitable Customers
Correct Data During ETL
Understand Customer Needs
Detect and Prevent Fraud
Build Effective Marketing Campaigns
Anticipate Customer Churn
Predict Sales and Inventory
Introducing Data Mining
Describing the Data Mining Process
[Figure: the CRISP-DM cycle, with Data at the center]
Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment
“Putting Data Mining to Work”
www.crisp-dm.org
Introducing Data Mining
Data Preparation
Often significant amounts of effort are required to prepare
data for mining
Transforming data to clean and reformat it
Isolating and flagging abnormal data
Appropriately substituting missing values
Discretizing continuous values into ranges
Normalizing values between 0 and 1
Of course, having the required data to begin with is important
When designing systems, give consideration to attributes that may be
required as inputs for classification
For example, demographic data: Age, Gender, Region, etc.
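Three of the preparation steps listed above (substituting missing values, normalizing between 0 and 1, and discretizing into ranges) can be sketched in a few lines of Python. This is only an illustration: the function names, the mean-substitution rule, the age boundaries and the band labels are all assumptions made for the example, not something prescribed by SQL Server.

```python
from statistics import mean

def fill_missing(values):
    """Replace None with the mean of the known values (one possible substitution rule)."""
    known = [v for v in values if v is not None]
    fallback = mean(known)
    return [fallback if v is None else v for v in values]

def min_max_normalize(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values equal: nothing to spread out
    return [(v - lo) / (hi - lo) for v in values]

def discretize(value, boundaries, labels):
    """Return the label of the first range whose upper boundary exceeds the value."""
    for upper, label in zip(boundaries, labels):
        if value < upper:
            return label
    return labels[-1]  # above every boundary: last range

# Hypothetical Age column with one missing entry.
ages = [23, None, 35, 62]
filled = fill_missing(ages)
normalized = min_max_normalize(filled)
bands = [discretize(a, [30, 50], ["young", "middle", "senior"]) for a in filled]
```

In practice this kind of work is usually done in the ETL layer or in the data source view before the model is processed; the sketch only shows the arithmetic involved.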
The Data Mining Process
Design Time • Process Time • Query Time
Design time: Mining Model
Process time: Training Data is fed to the Data Mining Engine to train the Mining Model
Query time: Data to Predict is passed to the Data Mining Engine and Mining Model, which return Predicted Data
Introducing Analysis Services 2014
BI Semantic Model (BISM)
Developed by using tabular or multidimensional
development approaches
Delivers intuitive browsing and high performance query
results
Performs calculations that are difficult to express with
relational queries
Supports advanced Business Intelligence, including KPIs
Data Mining
Discovers patterns in data
Patterns can be used to surface knowledge about data,
and may be used for predictive analytics
Introducing Analysis Services 2014
(Continued)
Hides the complexity of an advanced technology
Includes a full suite of algorithms to automatically identify and store
patterns in data
Patterns can be used to surface knowledge about data, and can be used for
predictive analytics
Includes visualizations to explore patterns
Uses standard programming interfaces:
XMLA
DMX – Data Mining Extensions
Allows direct integration with other Microsoft BI products
Delivers a complete framework for developing intelligent applications
Capabilities and features have changed little since SQL Server 2005
Introducing Analysis Services 2014
Data Mining Algorithms
Microsoft Naïve Bayes
Quick and approachable algorithm
Used for classification
Microsoft Decision Trees
Popular data mining technique
Used for classification, regression and association
Microsoft Linear Regression
Finds the best possible straight line through a series of points
Used for prediction analysis
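To make "the best possible straight line" concrete, here is a minimal ordinary least squares sketch in Python. It assumes "best" means minimizing the squared vertical distances to the points, and it is not a description of how the Microsoft Linear Regression algorithm is implemented internally. The function name and the sample points are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
prediction = slope * 5 + intercept  # predicted value for a new input x = 5
```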
Introducing Analysis Services 2014
Data Mining Algorithms (Continued)
Microsoft Neural Network
More sophisticated than Decision Trees and Naïve Bayes, this algorithm
can explore extremely complex scenarios
Used for classification and regression tasks
Microsoft Logistic Regression
A particular case of the Neural Network algorithm
Microsoft Clustering
Finds natural groupings inside data
Supports segmentation and anomaly detection tasks
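As a rough picture of what "finding natural groupings" means, the sketch below runs a tiny one-dimensional k-means loop in Python. K-means is used here only as a familiar stand-in technique; the function name, the starting centers and the sample values are assumptions for the example, and nothing here reflects the internals of the Microsoft Clustering algorithm.

```python
def k_means_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move each center to its group's mean."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical values with two obvious groupings, and two arbitrary starting centers.
centers, groups = k_means_1d([1, 2, 3, 10, 11, 12], [0, 5])
```

A value that sits far from every center is a candidate anomaly, which is the idea behind using clustering for anomaly detection.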
Introducing Analysis Services 2014
Data Mining Algorithms (Continued)
Microsoft Sequence Clustering
Groups a sequence of discrete events into natural groups based on
similarity
Microsoft Time Series
Used to predict future values from a time series
Was improved in SQL Server 2008 to produce more accurate long-term
forecasts
Microsoft Association Rules
Commonly supports market basket analysis to learn what products are
purchased together
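The idea behind market basket analysis can be illustrated by counting how often pairs of products appear in the same basket. The Python sketch below does only that simple co-occurrence counting; it is not the Microsoft Association Rules algorithm, and the function names, product names and baskets are invented for the example.

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

def also_bought(item, baskets, top=3):
    """Products most often found in the same basket as the given item."""
    scores = Counter()
    for (a, b), n in pair_counts(baskets).items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [name for name, _ in scores.most_common(top)]

# Hypothetical purchase history.
baskets = [
    ["book_a", "book_b"],
    ["book_a", "book_b", "book_c"],
    ["book_a", "book_c"],
    ["book_b", "book_d"],
    ["book_a", "book_b"],
]
recommendations = also_bought("book_a", baskets)
```

A real association model would also weigh how common each product is overall, so that best sellers do not dominate every recommendation.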
Introducing Analysis Services 2014
Data Mining Algorithms (Continued)
Classify: Decision Trees, Logistic Regression, Naïve Bayes, Neural Networks
Estimate: Decision Trees, Linear Regression, Logistic Regression, Neural Networks
Cluster: Clustering
Forecast: Time Series
Associate: Association Rules, Decision Trees
Introducing Analysis Services 2014
Integration and Development Scenarios
Distribute Reporting Services data mining reports
Embed visualizations into Windows Forms applications
to allow users to explore and understand the discovered
patterns
Integrate predictions:
Data correction during ETL processing
Targeted advertising
“Those that bought this book also purchased these books”
And many more scenarios…
Help validate or repair user data entry
Introducing Analysis Services 2014
Data Mining Visualizations
In contrast to OLTP and BISM queries, data mining queries
typically extract information that the user is not aware of
Appreciate that end users do not typically query data mining
models directly
Visualizations can effectively present data discoveries
SQL Server provides algorithm-specific visualizations that can
Be used to test and explore models in SQL Server Data Tools and SQL Server
Management Studio
Be embedded into Windows Forms applications
Developers can construct and plug in custom data mining
viewers
Introducing Analysis Services 2014
Programmability
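[Figure: client and server programming interfaces for Analysis Services]
Client applications and their interfaces:
C++ App: OLE DB
VB App: ADO
.NET App: ADO.NET
Any App: AMO
Any Platform, Any Device: WAN
Transport: XMLA over TCP/IP, XMLA over HTTP
Analysis Server: OLAP, Data Mining
Server side: Server ADOMD.NET, Data Mining Interfaces, .NET Stored Procedures, Microsoft Algorithms, Third-Party Algorithms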
Demonstrations
1. Creating, training, and testing data mining models with SSDT
2. Embedding a data mining visualization into a Windows
application
3. Authoring a Reporting Services report based on a data mining
model, and embedding the report into a Windows application
4. Automating data validation with data mining
5. Enhancing an E-commerce web application with market basket
analysis
A Message for Developers
Take your application to the next level
by embedding data mining results and
predictive capabilities!
Embed custom visualizations into Windows
Forms applications to allow users to explore
and understand the discovered model patterns
Integrate predictions:
Targeted advertising
“Those that bought this book also purchased these books”
Forecasts of sales or inventory
Help validate or repair user entry
Summary
Data mining enables data exploration, pattern discovery,
and prediction — which lead to knowledge discovery
Analysis Services 2005, 2008, 2008 R2, 2012 and 2014:
Include a full suite of algorithms to automatically identify and
store patterns in data
Include visualizations to explore patterns
Use standard programming interfaces
Allow direct integration with other Microsoft BI products
Deliver a complete framework for developing intelligent
applications
Questions
Resources
Microsoft website: Business Intelligence – Predictive Analytics
http://www.microsoft.com/en-us/server-cloud/solutions/businessintelligence/predictive-analytics.aspx
Links to datasheet and technical documentation
SQL Server Data Mining web site
http://www.sqlserverdatamining.com
Site designed and maintained by the SQL Server Data Mining team
Includes: Live samples, tutorials, webcasts, tips and tricks, and FAQ
Book: “Data Mining with Microsoft SQL Server 2008”
Wiley
Authors: Jamie MacLennan, ZhaoHui Tang and Bogdan Crivat
Thank You