13 October, 2014 | SQL Server User Group Norway

Taking Your Application Design to the Next Level with SQL Server Data Mining

Presenter Introduction
Peter Myers
BI Expert – Bitwise Solutions
BBus, SQL Server MCSE, MCT, SQL Server MVP
• Experienced in designing, developing and maintaining Microsoft database and application solutions since 1997
• Focuses on education and mentoring
• Based in Melbourne, Australia
[email protected]
http://www.linkedin.com/in/peterjsmyers

Session Objectives
• To introduce data mining and the data mining process
• To introduce SQL Server data mining
• To describe the SQL Server data mining algorithms
• To demonstrate how data mining can be used to enrich .NET application experiences

Session Outline
• Introducing Data Mining
• Describing the Data Mining Process
• Introducing Analysis Services 2014
• Demonstrations
• A Message for Developers
• Summary
• Resources

Introducing Data Mining
• Addresses the problem: “Too much data and not enough information”
• Enables data exploration, pattern discovery, and prediction, which lead to knowledge discovery
• Forms a key part of a Business Intelligence solution

Introducing Data Mining: Enabling Predictive Analytics
[Diagram: Role of Software (Passive, Interactive, Proactive) set against Business Insight (Presentation, Exploration, Discovery). Labels: Canned Reporting, Ad hoc Reporting, Data Model (BISM), Data Mining, Predictive Analytics.]

Introducing Data Mining: Business Scenarios
• Seek Profitable Customers
• Correct Data During ETL
• Understand Customer Needs
• Detect and Prevent Fraud
• Build Effective Marketing Campaigns
• Anticipate Customer Churn
• Predict Sales and Inventory

Introducing Data Mining: Describing the Data Mining Process
[Diagram: CRISP-DM cycle (www.crisp-dm.org) around Data: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment. Caption: “Putting Data Mining to Work”.]

Introducing Data Mining: Data Preparation
• Often significant amounts of effort are required to prepare data for mining:
  • Transforming for cleaning and reformatting
  • Isolating and flagging abnormal data
  • Appropriately substituting missing values
  • Discretizing continuous values into ranges
  • Normalizing values between 0 and 1
• Of course, having the required data to begin with is important
  • When designing systems, give consideration to attributes that may be required as inputs for classification
  • For example, demographic data: Age, Gender, Region, etc.

The Data Mining Process
Design Time • Process Time • Query Time
[Diagram shown in three builds: (1) Mining Model; (2) Mining Model, Data Mining Engine, Training Data; (3) Mining Model, Data Mining Engine, Data to Predict, Predicted Data.]

Introducing Analysis Services 2014
BI Semantic Model (BISM)
• Developed by using tabular or multidimensional development approaches
• Delivers intuitive browsing and high performance query results
• Performs calculations difficult to perform by using relational queries
• Supports advanced Business Intelligence, including KPIs
Data Mining
• Discovers patterns in data
• Patterns can be used to surface knowledge about data, and may be used for predictive analytics

Introducing Analysis Services 2014 (Continued)
• Hides the complexity of an advanced technology
• Includes a full suite of algorithms to automatically identify and store patterns in data
  • Patterns can be used to surface knowledge about data, and can be used for predictive analytics
• Includes visualizations to explore patterns
• Uses standard programming interfaces:
  • XMLA
  • DMX – Data Mining Extensions
• Allows direct integration with other Microsoft BI products
• Delivers a complete framework for developing intelligent applications
• Capabilities and features have changed little since SQL Server 2005

Introducing Analysis Services 2014: Data Mining Algorithms
• Microsoft Naïve Bayes
  • Quick and approachable algorithm
  • Used for classification
• Microsoft Decision Trees
  • Popular data mining technique
  • Used for classification, regression and association
• Microsoft Linear Regression
  • Finds the best possible straight line through a series of points
  • Used for prediction analysis

Introducing Analysis Services 2014: Data Mining Algorithms (Continued)
• Microsoft Neural Network
  • More sophisticated than Decision Trees and Naïve Bayes, this algorithm can explore extremely complex scenarios
  • Used for classification and regression tasks
• Microsoft Logistic Regression
  • A particular case of the Neural Network algorithm
• Microsoft Clustering
  • Finds natural groupings inside data
  • Supports segmentation and anomaly detection tasks

Introducing Analysis Services 2014: Data Mining Algorithms (Continued)
• Microsoft Sequence Clustering
  • Groups a sequence of discrete events into natural groups based on similarity
• Microsoft Time Series
  • Used to predict future values from a time series
  • Was improved in SQL Server 2008 to produce more accurate long-term forecasts
• Microsoft Association Rules
  • Commonly supports market basket analysis to learn what products are purchased together

Introducing Analysis Services 2014: Data Mining Algorithms (Continued)
• Classify: Decision Trees, Logistic Regression, Naïve Bayes, Neural Networks
• Estimate: Decision Trees, Linear Regression, Logistic Regression, Neural Networks
• Cluster: Clustering
• Forecast: Time Series
• Associate: Association Rules, Decision Trees

Introducing Analysis Services 2014: Integration and Development Scenarios
• Distribute Reporting Services data mining reports
• Embed visualizations into Windows Forms applications to allow users to explore and understand the discovered patterns
• Integrate predictions:
  • Data correction during ETL processing
  • Targeted advertising: “Those that bought this book also purchased these books”
  • And many more scenarios…
• Help validate or repair user data entry

Introducing Analysis Services 2014: Data Mining Visualizations
• In contrast to OLTP and BISM queries, data mining queries typically extract information that the user is not aware of
• Appreciate that end users do not typically query data mining models directly
• Visualizations can effectively present data discoveries
• SQL Server provides algorithm-specific visualizations that can:
  • Be used to test and explore models in SQL Server Data Tools and SQL Server Management Studio
  • Be embedded into Windows Forms applications
• Developers can construct and plug in custom data mining viewers

Introducing Analysis Services 2014: Programmability
[Diagram: client applications (C++ App, VB App, .NET App, Any App) and client interfaces (OLE DB, ADO, ADO.NET, AMO), on any platform and any device, reach the Analysis Server (OLAP, Data Mining) by XMLA over TCP/IP or, across a WAN, XMLA over HTTP. Server-side labels: Server ADOMD.NET, Data Mining Interfaces, .NET Stored Procedures, Microsoft Algorithms, Third-Party Algorithms.]

Demonstrations
1. Creating, training, and testing data mining models with SSDT
2. Embedding a data mining visualization into a Windows application
3. Authoring a Reporting Services report based on a data mining model, and embedding the report into a Windows application
4. Automating data validation with data mining
5. Enhancing an E-commerce web application with market basket analysis

A Message for Developers
Take your application to the next level by embedding data mining results and predictive capabilities!
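As one illustration of an embedded predictive capability, the market basket scenario from the demonstrations might be sketched as follows. This is a minimal Python sketch that only counts which items appear together in past orders. It is not the Microsoft Association Rules algorithm, and the function names and the order history are hypothetical, invented for this example.

```python
from collections import Counter
from itertools import combinations


def build_co_purchase_counts(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = {}
    for basket in baskets:
        # De-duplicate items so a repeated item is counted once per basket.
        for a, b in combinations(sorted(set(basket)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def also_purchased(counts, item, top_n=3):
    """Return up to top_n items most often bought together with `item`."""
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts.get(item, Counter()).items(),
                    key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical order history, invented for illustration only.
BASKETS = [
    ["SQL Basics", "Data Mining Intro", "BI Cookbook"],
    ["SQL Basics", "Data Mining Intro"],
    ["Data Mining Intro", "BI Cookbook"],
    ["SQL Basics", "Report Design"],
]

COUNTS = build_co_purchase_counts(BASKETS)
SUGGESTIONS = also_purchased(COUNTS, "SQL Basics")
print(SUGGESTIONS)
```

In a real application the counts would be replaced by a query against a trained mining model. The sketch only shows the shape of the result an application consumes: a short ranked list of related items.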
• Embed custom visualizations into Windows Forms applications to allow users to explore and understand the discovered model patterns
• Integrate predictions:
  • Targeted advertising: “Those that bought this book also purchased these books”
  • Forecasts of sales or inventory
• Help validate or repair user entry

Summary
• Data mining enables data exploration, pattern discovery, and prediction, which lead to knowledge discovery
• Analysis Services 2005, 2008, 2008 R2, 2012 and 2014:
  • Include a full suite of algorithms to automatically identify and store patterns in data
  • Include visualizations to explore patterns
  • Use standard programming interfaces
  • Allow direct integration with other Microsoft BI products
  • Deliver a complete framework for developing intelligent applications

Questions

Resources
• Microsoft website: Business Intelligence – Predictive Analytics
  http://www.microsoft.com/en-us/server-cloud/solutions/businessintelligence/predictive-analytics.aspx
  Links to datasheet and technical documentation
• SQL Server Data Mining web site
  http://www.sqlserverdatamining.com
  Site designed and maintained by the SQL Server Data Mining team
  Includes: live samples, tutorials, webcasts, tips and tricks, and FAQ
• Book: “Data Mining for SQL Server 2008”, Wiley Press
  Authors: ZhaoHui Tang and Jamie MacLennan

Thank You
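Supplementary sketch for the Data Preparation slide: three of the listed steps (substituting missing values, discretizing continuous values into ranges, and normalizing values between 0 and 1) might look roughly like this in plain Python. Substituting the mean is only one possible choice, and the function names, age bands, and sample values are assumptions made for illustration, not part of the presentation.

```python
from statistics import mean


def substitute_missing(values):
    """Replace None entries with the mean of the known values (one simple choice)."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


def normalize(values):
    """Rescale values linearly into the range 0 to 1 (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(value, boundaries, labels):
    """Map a continuous value to the label of the range it falls into."""
    for upper, label in zip(boundaries, labels):
        if value < upper:
            return label
    # Anything at or above the last boundary gets the final label.
    return labels[-1]


# Hypothetical ages with one missing entry, invented for illustration.
AGES = [20, None, 40, 60]
FILLED = substitute_missing(AGES)
SCALED = normalize(FILLED)
BANDS = [discretize(a, [30, 50], ["under 30", "30 to 49", "50 and over"])
         for a in FILLED]
print(FILLED, SCALED, BANDS)
```

Each step is kept as a separate function, so a preparation pipeline can apply only the transformations that a given mining model needs.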