Point of View
Microsoft Data Mining
September 2012
www.oakton.com.au

Contents

Summary
The Problem
What is Data Mining?
How Can It Be Used?
Microsoft Data Mining
Outcome
Links
Contacts

Summary

Microsoft Data Mining, part of SQL Server, can help users find the hidden gems of information locked away in enterprise content management systems like Microsoft SharePoint. While search can find information that is related to the search terms, data mining can help organisations leverage the combined activity of all their staff and find information that is not necessarily directly related to the search terms, but is still useful and relevant.

The Problem

Most organisations have, or are in the process of implementing, some form of enterprise content management (ECM) system. Microsoft SharePoint is by far the most popular of these tools and allows organisations to store and manage a whole range of information, from documents and reports to blogs and discussions.

Many organisations are discovering that the greatest challenge in the face of the information explosion is finding the content that is most useful and relevant. Search has evolved significantly and nowadays you can be reasonably confident of finding what you are searching for. The challenge remains with discovering things you weren’t searching for, that maybe you didn’t even know existed, but that are still useful and relevant to you.

This white paper explores how Oakton used Microsoft’s Market Basket Analysis Data Mining algorithm to provide a cost-effective and powerful tool for discovering these “hidden” information assets.

What is Data Mining?

Data Mining covers a range of statistical techniques for analysing large volumes of data and identifying useful correlations and trends.
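
As a rough illustration of the kind of correlation such techniques look for, the sketch below counts how often pairs of items appear together in a set of records and ranks the items most often seen alongside a given one. It is a simplified stand-in written in plain Python, not Microsoft's algorithm or Oakton's implementation. The sample site names, the `count_pairs` and `recommend` functions and the `min_share` threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each set is one "basket",
# for example the sites a single user has viewed.
baskets = [
    {"hr-policies", "leave-forms", "payroll"},
    {"hr-policies", "leave-forms"},
    {"hr-policies", "payroll"},
    {"project-alpha", "leave-forms"},
    {"hr-policies", "leave-forms", "project-alpha"},
]


def count_pairs(baskets):
    """Count the baskets containing each item and each unordered pair of items."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        item_counts.update(basket)
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(basket), 2):
            pair_counts[(a, b)] += 1
    return item_counts, pair_counts


def recommend(item, baskets, min_share=0.5):
    """Return (other_item, share) tuples, strongest first.

    share is the fraction of baskets containing `item` that also
    contain `other_item`; pairs below `min_share` are dropped.
    """
    item_counts, pair_counts = count_pairs(baskets)
    total = item_counts[item]
    if total == 0:
        return []  # item never seen, so nothing to recommend
    results = []
    for (a, b), n in pair_counts.items():
        if item == a:
            other = b
        elif item == b:
            other = a
        else:
            continue
        share = n / total
        if share >= min_share:
            results.append((other, share))
    # Highest share first; ties broken alphabetically for stable output.
    return sorted(results, key=lambda r: (-r[1], r[0]))


print(recommend("hr-policies", baskets))
```

With the sample data above, "hr-policies" appears in four baskets, three of which also contain "leave-forms" and two of which contain "payroll", so those two items pass the 0.5 threshold while "project-alpha" (one basket in four) does not. A production data-mining engine would typically apply further measures and pruning beyond this simple share.
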
The specific data mining technique used here, Market Basket Analysis, is based on the assumption that if you buy one or more items, you are more likely to buy from a related group of items. The “related group” is determined by looking at the contents of previous shopping baskets and determining a number of “average” baskets. When your selected item matches some of the contents of one of these “average” baskets, you are statistically more likely to be interested in the other items in that basket. The basic idea can be illustrated as follows:

[Figure: illustration of the Market Basket Analysis idea]

How Can It Be Used?

Market Basket Analysis can be applied to a range of areas where a business may be interested in “typical” combinations of items. Examples include any organisation attempting to suggest additional items you might be interested in, such as Amazon’s recommendations based on items other people have browsed.

To solve the challenge of discovering information assets in an ECM system such as Microsoft’s SharePoint 2010, Oakton can utilise user activity to define a basket. This activity can include viewing documents, sites, reports and blogs.

By making a basket out of user activity, Oakton can use Market Basket Analysis to find common clusters of behaviour, in this case of sites viewed. Then, when a user visits a particular site, other sites that visitors to that site also viewed can be recommended. These recommendations can be weighted so that only the other sites that were frequently visited are shown.

These other sites may be unrelated to the immediate activity, but there is a high probability that they would be of interest to the user. Essentially, businesses can tap into the combined knowledge and search capability of all users and distil the information that has the highest probability of being useful based on their behaviour.

Microsoft Data Mining

Microsoft Data Mining forms part of the SQL Server suite of applications and was initially released in 2005.
This release proved to be well conceived and robust, and provided a powerful set of tools that can easily be implemented and applied to most data-mining problems. Since then, it has been fundamentally unchanged, although subsequent SQL Server releases have seen a number of refinements such as new algorithms.

The core data-mining engine forms part of SQL Server Analysis Services, and data-mining models are most easily developed using the BI Development Studio that is bundled with SQL Server. This data-mining engine can easily read data either from the SQL Server relational database or from an Analysis Services cube.

The most important development since the original release has been an Excel add-in released for Office 2007. This brings the full power of data mining into the most popular BI tool in the world, Microsoft Excel, enabling knowledge workers to use powerful data-mining techniques over their own data on their desktops.

Traditionally, data mining has been seen as a specialist activity only carried out by statistical experts using expensive software. As part of Microsoft’s drive to democratise Business Intelligence, Microsoft Data Mining breaks down those barriers and opens up data mining to ordinary business users through the familiar Excel interface.

Outcome

The outcome is a seamlessly integrated solution that presents further ECM content that is statistically likely to be of interest, based on the content currently being viewed.

The Suggested Items list shows content that the current user has not previously seen, supporting the discovery of entirely new content. By linking it to the currently viewed content, the solution effectively provides a dynamic, crowdsourced search that remains relevant at all times.

With the organisation already owning Microsoft SharePoint 2010 (and therefore SQL Server) and Microsoft Office 2010, there were no additional licensing or server costs.
By using the Excel Data Mining add-in, development time was dramatically reduced, with the overall development effort being less than 15 days (this included the SharePoint integration). These key factors meant the upfront investment was very low.

While no formal benchmarking was done to allow the benefits to be measured, the anecdotal evidence and user feedback suggest that users of the system are discovering a far wider range of useful information assets than previously. Data mining is opening up valuable data stored in the ECM solution that users would otherwise never have found through traditional search tools.

Links

1. SharePoint 2010 market penetration: http://www.newhorizons.com/content/800401157-detail-sharepoint-2010-increases-market-penetration.aspx
2. Microsoft SQL Server: http://www.microsoft.com/sqlserver/en/us/default.aspx
3. Microsoft Data Mining: http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/data-mining.aspx
4. Microsoft PowerPivot: http://www.microsoft.com/en-us/bi/powerpivot.aspx
5. Excel Data Mining Add-In: http://office.microsoft.com/en-us/excel-help/data-mining-add-ins-HA010342915.aspx

When every decision is based upon the definition of a problem, people routinely rush to identify the problem so they can get on with what they think is the real work of solving it. An ill-conceived problem, though, only leads to an ill-conceived solution: get it wrong and it is costly and disruptive.

At Oakton, we think differently: instead of jumping in, we step back and invest time and effort to improve our understanding of the problem you’re trying to solve. We focus on examining the problem from different perspectives to master what we believe is the most important step: clearly defining the problem in the first place.

We’re an Australian consulting and technology firm founded in 1988.
Our business is helping create tangible value by blending business insights and specialist technology solutions to give our clients a significant advantage in today’s rapidly changing world.

Oakton Consulting Technology

Contacts

Oakton
Level 8, 271 Collins Street
Melbourne VIC 3000
t +61 3 9617 0200
f +61 3 9621 1951
e [email protected]
www.oakton.com.au

Future events cannot reliably be predicted accurately. Oakton makes no statements, representations or warranties about the accuracy or completeness of, and you should not rely on, any information relating to this document, including forecasts and estimates (‘Information’) disclosed to you by Oakton. To the full extent permitted by law, Oakton disclaims all responsibility for Information and all liability (including without limitation, liability in negligence) for all expenses, losses, damages and costs you may incur as a result of the Information being inaccurate or incomplete in any way for any reason.

© Oakton Services Pty Ltd 2011. This work is copyright. Except as permitted under the Copyright Act 1968 (Cth), no part of this publication may be reproduced by any process, without the written permission of Oakton Services Pty Ltd.
Oakton is a registered trademark of Oakton Limited.

Oakton Services Pty Ltd
ABN 31 100 103 268

Melbourne (Head office)
Level 8, 271 Collins Street
Melbourne VIC 3000, Australia
t +61 3 9617 0200
f +61 3 9621 1951

Perth
Level 14, Governor Stirling Tower
197 St Georges Terrace
Perth WA 6000, Australia
t +61 8 6188 7680
f +61 8 6188 7607

Sydney
Level 3, 65 Berry Street
North Sydney NSW 2060, Australia
t +61 2 9923 9800
f +61 2 9929 6731

Canberra
Unit 2, 45 Wentworth Avenue
Kingston ACT 2604, Australia
t +61 2 6230 1997
f +61 2 6230 1919

Brisbane
Level 5, 200 Mary Street
Brisbane QLD 4000, Australia
t +61 7 3136 2900
f +61 7 3136 2999

Hyderabad
Krishe-e, 8-2-293, Plot 499, Road 36
Jubilee Hills, 500033 Hyderabad, India
t +91 40 23552694
Voip: +61 3 9617 0294