Gavin Russell-Rockliff
BI Technical Specialist, Microsoft
BIN305

Please Raise Your Hand If You've Ever…
• <… Put a Party Reference Here… >
• Attended a Statistics Lecture ??
• Got a Statistics Degree ??
• Used SQL Server Data Mining ??

Agenda
• Data Mining – What is it?
• Data Mining – How do we do it?
• Demonstrations: Visualisation, Reporting, ETL, Application
• Q&A

Data Mining – What Is It?
• According to Encarta (noun): "Search for Hidden Information"
• "The locating of previously unknown patterns and relationships within data"
• Server-Driven Discovery
• Uses a combination of statistics, probability analysis and database technologies

DM Enables Predictive Analysis
[Chart: Role of Software (Passive, Interactive, Proactive) plotted against Business Insight (Presentation, Exploration, Discovery); plotted items: Canned reporting, Ad-hoc reporting, OLAP, Data mining, Predictive Analysis]

Business Scenarios
• Forecasting sales
• Churn Analysis
• Detecting fraud or invalid data
• Targeting promotions
• Cross-selling
• Determine Business Drivers

Our End-to-End BI Offering: The Big Picture
[Diagram, top to bottom]
• DELIVERY
• END USER TOOLS AND PERFORMANCE MANAGEMENT APPLICATIONS
• BI PLATFORM (RDBMS, ETL, OLAP, Reporting): SQL Server Reporting Services, SQL Server Analysis Services, SQL Server DBMS, SQL Server Integration Services
• Mainframe/Departmental Systems
(A second build of this slide highlights SQL Server Analysis Services within the BI platform.)

SQL Server™ 2008 Data Mining Key Drivers
• Keep Development Simple
• Retain Full Suite of Algorithms
• Manage Large Volumes
• Allow for Integration

SQL Server™ 2008 Algorithms
• Microsoft Naïve Bayes: quick and approachable algorithm; used for classification
• Microsoft Decision Trees: popular data mining technique; used for classification, regression and association
• Microsoft Linear Regression: finds the best possible straight line through a series of points; used for prediction analysis

SQL Server™ 2008 Algorithms (Continued)
• Microsoft Neural Network: more sophisticated than Decision Trees and Naïve Bayes, this algorithm can explore extremely complex scenarios; used for classification and regression tasks
• Microsoft Logistic Regression: a particular case of the Neural Network algorithm
• Microsoft Clustering: finds natural groupings inside data; supports segmentation and anomaly detection tasks

SQL Server™ 2008 Algorithms (Continued)
• Microsoft Sequence Clustering: groups a sequence of discrete events into natural groups based on similarity
• Microsoft Time Series: used to predict future values from a time series; has been improved in SQL Server 2008 to produce more accurate long-term forecasts
• Microsoft Association Rules: commonly supports market basket analysis to learn what products are purchased together

Data Mining Algorithm Usage: What is your task?
Predict Variable
• Naïve Bayes
• Decision Trees
• Neural Network
• Logistic Regression
Predict Value
• Decision Trees
• Linear Regression
• Neural Network
• Logistic Regression
Marketing Cluster
• Clustering
Forecast Value
• Time Series
Associate
• Association Rules
• Decision Trees

Data Mining Process
• Define the Problem
• Data Preparation
• Model Validation: Accuracy, Reliability, Usefulness
• Model Visualisation

Describing the Data Mining Process
[Diagram, shown in several builds]
• Phases: Design time, Process time, Query time
• Model Creation + Processing: Training Data, Data Mining Engine, Mining Model, Data Mining Visualization
• Prediction: Data to Predict, Mining Model, Data Mining Engine, Predicted Data

Predicting the Future

Data Mining for the Developer

Related Content
• Breakout Sessions: Using MDX for Enhanced Scorecards and Dashboards (BIN 307)

Track Resources
• www.sqlserverdatamining.com
• www.microsoft.com/sql
• twitter.com/gavinrr

Resources
• www.microsoft.com/teched: Sessions On-Demand & Community
• www.microsoft.com/learning: Microsoft Certification & Training Resources
• http://microsoft.com/technet: Resources for IT Professionals
• http://microsoft.com/msdn: Resources for Developers

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
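
Supplementary sketch: the slides describe Linear Regression as finding the best possible straight line through a series of points, and describe a process in which a model is trained at process time and then used to predict at query time. The following is a minimal illustration of that idea using ordinary least squares in plain Python. It is not the Microsoft Linear Regression implementation, and it does not use SQL Server or DMX. The function names (`fit_line`, `predict`) and the sample sales figures are hypothetical and exist only for this example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x; zero means every x is identical.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; no unique line exists")
    # Sum of cross deviations of x and y.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: month number against sales (invented for illustration).
months = [1, 2, 3, 4, 5]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]

# "Process time": train the model on the training data.
model = fit_line(months, sales)

# "Query time": feed the model data to predict and read back the predicted value.
forecast = predict(model, 6)
print(f"slope={model[0]}, intercept={model[1]}, forecast for month 6={forecast}")
```

Keeping the fitting step separate from the prediction step mirrors the split the slides draw between processing a mining model and querying it: the fitted pair can be stored and reused for many predictions without revisiting the training data.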