Point of View
Microsoft Data Mining
September
2012
www.oakton.com.au
Contents
Summary
The Problem
What is Data Mining?
How Can It Be Used?
Microsoft Data Mining
Outcome
Links
Contacts
Summary
Microsoft Data Mining, part of SQL Server, can help users find the hidden gems
of information locked away in enterprise content management systems like
Microsoft SharePoint.
While search can find information that is related to the search terms, data
mining can help organisations leverage the combined activity of all their staff
and find information that is not necessarily directly related to the search terms,
but is still useful and relevant.
The Problem
Most organisations have, or are in the process of implementing, some form of
enterprise content management (ECM) system. Microsoft SharePoint is by far
the most popular of these tools and allows organisations to store and manage
a whole range of information, from documents and reports to blogs and
discussions. Many organisations are discovering that the greatest challenge in
the face of the information explosion is finding the content that is most useful
and relevant.
Search has evolved significantly and nowadays you can be reasonably confident
of finding what you are searching for. The remaining challenge is discovering
things you weren’t searching for, that maybe you didn’t even know existed, but
that are still useful and relevant to you.
This white paper explores how Oakton used Microsoft’s Market Basket Analysis
Data Mining algorithm to provide a cost-effective and powerful tool for
discovering these “hidden” information assets.
What is Data Mining?
Data Mining covers a range of statistical techniques for analysing large volumes
of data and identifying useful correlations and trends. The specific data mining
technique used here, Market Basket Analysis, is an analysis technique based on
the assumption that if you buy one or more items you are more likely to buy
from a related group of items. The “related group” is determined by looking
at the contents of previous shopping baskets and determining a number of
“average” baskets. When your selected item matches some of the contents
of one of these “average” baskets then you are statistically more likely to be
interested in the other items in the basket.
The basic idea can be illustrated as follows:
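As a minimal sketch of the idea (not the original illustration, and not Microsoft's implementation), the Python snippet below counts how often pairs of items share a basket and derives two standard association measures, support and confidence. The baskets, item names and the `pair_rules` helper are assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; the item names are invented for illustration.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
    {"bread", "butter", "jam"},
]


def pair_rules(baskets):
    """Return {(a, b): (support, confidence)} for every ordered pair of items.

    support    = fraction of all baskets that contain both a and b
    confidence = fraction of baskets containing a that also contain b
    """
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        item_counts.update(basket)
        for a, b in combinations(sorted(basket), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    total = len(baskets)
    return {
        (a, b): (count / total, count / item_counts[a])
        for (a, b), count in pair_counts.items()
    }


rules = pair_rules(baskets)
support, confidence = rules[("butter", "bread")]
print(f"butter -> bread: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket containing butter also contains bread, so a shopper who picks butter would be shown bread. The reverse rule is weaker, because bread also appears in a basket without butter.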
How Can It Be Used?
Market Basket Analysis can be applied to a range of areas where a business
may be interested in “typical” combinations of items. Examples include any
organisation attempting to suggest additional items you might be interested in,
such as Amazon’s recommendations based on items other people have browsed.
To solve the challenge of discovering information assets in an ECM system
such as Microsoft SharePoint 2010, Oakton can use user activity to define
a basket. This activity can include viewing documents, sites, reports and blogs.
By making a basket out of user activity, Oakton can use Market Basket Analysis
to find common clusters of behaviour, in this case clusters of sites viewed.
Then, when a user visits a particular site, other sites that visitors to that
site also viewed can be recommended. These recommendations can be weighted so
that only the other sites that were frequently visited are shown.
These other sites may be unrelated to the immediate activity, but there
is a high probability that they would be of interest to the user. Essentially,
businesses can tap into the combined knowledge and search capability of all
users and distil the information that has the highest probability of being useful
based on their behaviour.
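The approach described above can be sketched in a few lines of Python. This is an illustrative simplification, not Oakton's solution or the SQL Server algorithm: the activity log, the site names, the `build_baskets` and `recommend` helpers and the `min_share` threshold are all assumptions introduced for the example.

```python
from collections import Counter, defaultdict

# Hypothetical activity log of (user, site) views; all names are invented.
activity_log = [
    ("alice", "HR Policies"), ("alice", "Leave Forms"), ("alice", "Payroll"),
    ("bob", "HR Policies"), ("bob", "Leave Forms"),
    ("carol", "HR Policies"), ("carol", "Leave Forms"), ("carol", "Project Wiki"),
    ("dave", "Project Wiki"), ("dave", "Payroll"),
]


def build_baskets(log):
    """Group viewed sites by user, so each user's activity forms one basket."""
    baskets = defaultdict(set)
    for user, site in log:
        baskets[user].add(site)
    return dict(baskets)


def recommend(baskets, current_site, min_share=0.5):
    """Rank other sites by the share of current_site's visitors who viewed them.

    Sites whose share falls below min_share are dropped, so only
    frequently co-visited sites are shown.
    """
    visitors = [b for b in baskets.values() if current_site in b]
    if not visitors:
        return []
    counts = Counter(
        site for b in visitors for site in b if site != current_site
    )
    scored = [(site, n / len(visitors)) for site, n in counts.items()]
    kept = [(site, share) for site, share in scored if share >= min_share]
    # Highest share first; ties broken alphabetically for a stable order.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


baskets = build_baskets(activity_log)
print(recommend(baskets, "HR Policies"))
```

With the default threshold only "Leave Forms" is recommended to someone viewing "HR Policies", because all three visitors of that site also viewed it, while "Payroll" and "Project Wiki" were each viewed by only one of the three.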
Microsoft Data Mining
Microsoft Data Mining forms part of the SQL Server suite of applications and was
initially released in 2005. This release proved to be well conceived and robust,
and provided a powerful set of tools that can easily be implemented and applied
to most data-mining problems. Since then, it has remained fundamentally
unchanged, although subsequent SQL Server releases have brought a number of
refinements such as new algorithms.
The core data-mining engine forms part of SQL Server Analysis Services, and
data-mining models are most easily developed using the BI Development Studio that
is bundled with SQL Server. This data-mining engine can easily read data either
from the SQL Server relational database or from an Analysis Services cube.
The most important development since the original release has been an Excel
add-in released for Office 2007. This brings the full power of data mining into the
most popular BI tool in the world, Microsoft Excel, enabling knowledge workers
to use powerful data-mining techniques over their own data on their desktops.
Traditionally, data mining has been seen as a specialist activity only carried out
by statistical experts using expensive software. As part of Microsoft’s drive to
democratise Business Intelligence, Microsoft Data Mining breaks down those
barriers and opens up data mining to ordinary business users through the
familiar Excel interface.
Outcome
The outcome was a seamlessly integrated solution that presents further ECM
content that is statistically likely to be of interest, based on the content
currently being viewed.
The Suggested Items list contains content that the current user has not
previously seen, supporting the discovery of entirely new content. By linking it
to the currently viewed content, the solution effectively provides a dynamic,
crowdsourced search that remains relevant at all times.
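The filtering step behind such a list could look like the following Python sketch. It is only an assumption about how previously seen content might be excluded; the `suggested_items` function, the `limit` parameter and the sample data are hypothetical and do not come from the delivered solution.

```python
def suggested_items(ranked_candidates, already_seen, limit=5):
    """Drop content the user has already viewed and keep the top `limit` items.

    ranked_candidates is assumed to be ordered best first, for example by
    how often other viewers of the current content also viewed each item.
    """
    unseen = [item for item in ranked_candidates if item not in already_seen]
    return unseen[:limit]


# Hypothetical ranked recommendations for the content being viewed.
candidates = ["Leave Forms", "Payroll", "Project Wiki", "Travel Policy"]
seen_by_user = {"Leave Forms", "Project Wiki"}

print(suggested_items(candidates, seen_by_user, limit=2))
```

Filtering after ranking keeps the two concerns separate: the ranking reflects the behaviour of all users, while the seen set is specific to the person currently viewing the page.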
With the organisation already owning Microsoft SharePoint 2010 (and
therefore SQL Server) and Microsoft Office 2010, there were no additional
licensing or server costs. By using the Excel Data Mining add-in, development
time was dramatically reduced, with the overall development effort being
less than 15 days (this included the SharePoint integration). These key factors
meant the upfront investment was very low.
While no formal benchmarking was done to allow the benefits to be measured,
anecdotal evidence and user feedback suggest that users of the system are
discovering a far wider range of useful information assets than previously. Data
mining is opening up valuable data stored in the ECM solution that users would
otherwise never have found through traditional search tools.
Links
1. SharePoint 2010 market penetration: http://www.newhorizons.com/content/800401157-detail-sharepoint-2010-increases-market-penetration.aspx
2. Microsoft SQL Server: http://www.microsoft.com/sqlserver/en/us/default.aspx
3. Microsoft Data Mining: http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/data-mining.aspx
4. Microsoft PowerPivot: http://www.microsoft.com/en-us/bi/powerpivot.aspx
5. Excel Data Mining Add-In: http://office.microsoft.com/en-us/excel-help/data-mining-add-ins-HA010342915.aspx
When every decision is based upon the definition of a problem, people routinely rush to identify the
problem so they can get on with what they think is the real work of solving it. An ill-conceived problem,
though, only leads to an ill-conceived solution. Get it wrong and it is costly and disruptive.
At Oakton, we think differently: instead of jumping in, we step back and invest time and effort to improve
our understanding of the problem you’re trying to solve. We focus on examining the problem from different
perspectives to master what we believe is the most important step: clearly defining the problem in the first
place.
We’re an Australian consulting and technology firm founded in 1988. Our business is helping create
tangible value by blending business insights and specialist technology solutions to give our clients a
significant advantage in today’s rapidly changing world.
Oakton Consulting Technology
Contacts
Oakton
Level 8, 271 Collins Street
Melbourne VIC 3000
t +61 3 9617 0200
f +61 3 9621 1951
e [email protected]
www.oakton.com.au
Future events cannot reliably be predicted accurately. Oakton makes no statements, representations or
warranties about the accuracy or completeness of, and you should not rely on, any information relating
to this document, including forecasts and estimates (‘Information’) disclosed to you by Oakton. To the
full extent permitted by law, Oakton disclaims all responsibility for Information and all liability (including
without limitation, liability in negligence) for all expenses, losses, damages and costs you may incur as a
result of the Information being inaccurate or incomplete in any way for any reason.
© Oakton Services Pty Ltd 2011. This work is copyright. Except as permitted under the Copyright Act 1968
(Cth), no part of this publication may be reproduced by any process, without the written permission of
Oakton Services Pty Ltd.
Oakton is a registered trademark of Oakton Limited
Oakton Services Pty Ltd ABN 31 100 103 268
Melbourne Head office Level 8 271 Collins Street Melbourne VIC 3000 Australia t +61 3 9617 0200 f +61 3 9621 1951
Perth Level 14 Governor Stirling Tower 197 St Georges Terrace Perth WA 6000 Australia t +61 8 6188 7680 f +61 8 6188 7607
Sydney Level 3 65 Berry Street North Sydney NSW 2060 Australia t +61 2 9923 9800 f +61 2 9929 6731
Canberra Unit 2 45 Wentworth Avenue Kingston ACT 2604 Australia t +61 2 6230 1997 f +61 2 6230 1919
Brisbane Level 5 200 Mary Street Brisbane QLD 4000 Australia t +61 7 3136 2900 f +61 7 3136 2999
Hyderabad Krishe-e 8-2-293 Plot 499 Road 36 Jubilee Hills 500033 Hyderabad India t +91 40 23552694 Voip: +61 3 9617 0294