Using Structured and Unstructured Data as part of an Analytical Process
Managing Future Requirements Now
Copyright © 2006, SAS Institute Inc. All rights reserved.

Agenda
Why is this stuff important?
• Trends in analytics and analytical data
Why can’t I simply use the same approach I’m currently using?
• Analytics’ unique characteristics
So what’s the solution?
• Strategies to plan for and manage information growth for analytics

Key Messages
Analytics has been and will continue to be a competitive differentiator
Structured and unstructured data volumes are rapidly increasing
Traditional reporting-driven information management strategies are not always effective
Data quality is paramount to effective analytics
Execution times can be in the order of years, so you need to plan now to succeed in the future

Why is this stuff important?

We’re going through an information revolution …
WWW: 170 Terabytes
Emails: 35,000,000,000/day (400,000 Terabytes/yr)
Telephone: 17.3 Exabytes/yr
Source: University of California, Berkeley

And our information sources keep growing …
Customer information
ID columns
Purchases / Services
Demographic, Financial
Profiling
Time Series
Text-based customer interactions
Non-text-based customer interactions

Our customer knowledge keeps increasing ...
[Figure: terabytes of data (0 to 100) plotted against time (1960 to 2010); labels: Customer Data Availability, Analytical Capacity, Execution Capacity, Knowledge Gap, Execution Gap]

And we’re moving beyond reporting.
“Tedious data mining and static reports have had their day.
The new business intelligence applies business analytics to fresh data and puts analysis in the hands of those who need it.”
Source: InfoWorld

And some companies are specifically competing on analytics…
“The idea of competing on analytics is not entirely new”
“What is new is the spreading of analytical competition from individual business units to an enterprise-wide perspective”
-- Thomas H. Davenport (author)
Source: Harvard Business Review (January 2006)

We’ve entered the “Era of Analytics”.
“Previous bases for competition … have been eroded … That leaves three things as the basis for competition:
• Efficient & effective execution
• Smart decision making
• Ability to wring every last drop of value from business processes
… all of which can be gained through sophisticated use of analytics.”
“Competing on Analytics” (Davenport & Harris)
Harvard Business School Press
Worldwide Release: March 6, 2007

Why can’t I use the same approach I’m currently using?

Analytics is …
Data-driven insight for better decisions.
A process encompassing a range of techniques dealing with the collection, classification, analysis, and interpretation of data to gain insight, reveal patterns, anomalies, key variables and relationships.

But, more importantly, what’s critical?
Sufficient historical data
Sufficient granularity
Clean, accurate data
Breadth and representativeness of data

What is a “model”?
An abstraction of reality
• Simplifies reality via assumptions
• Defines constraints and actors
• Narrows our focus by eliminating everything other than what we’re concerned about
Why do we use them?
• Helps us gain insight about real-world processes / objects
• Gives us something we can “play” with
• They’re cheaper than using the real things

No, really - what is a “model”?
What are some examples?
• The theory of relativity
• A 100:1 scale architectural rendition of a proposed building
• A “clay” of a car
• A catwalk / clothes model

So what does an analytical model look like?
Example specification: Risk of Default on a Loan
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
Example implementation: Risk of Default on a Loan
CreditRisk = 300 + (15 * Income) + (22 * Age)

Text Mining is no different.
[Diagram labels: Text Preprocessing, Reading the text files, Term weighting/rollup, Dimension Reduction, Singular Value Decomposition, Document analysis]

Analytics drives significant value …
[Figure: customer lifecycle diagram; axis labels: Customer Value, Time / insight]
Lifecycle labels: Acquisition / Activating, Target / acquire prospect, Welcome Prg., Customer development, Harvest, Win Back, Up/X-sale, Service/advice, Churn Prevention / Attrition, Cancellation
Pro-activity based on “If” events: Lifetime, Usage/purchase, Behaviour, Critical
Analytical insight: Behaviour Scoring, Response rates, Entry Scoring, Contact Policy, Fraud Detection, Segmentation (Value / Needs), Tariff Plan Optimisation, X Sell / Up Sell, Credit / Collections, Churn Propensity, Churn Segmentation, Satisfaction score

But it requires data, which can take many forms …
[Figure: chart with labels Highly aggregated data, Highly disaggregated data, Interactive, Analytical]

Different activities have different requirements …
[Figure: chart with labels Highly aggregated data, Highly disaggregated data, Interactive, Analytical; two paths marked: the reporting path and the analytics path]
And different approaches are there for good reason …
[Figure: chart with labels Highly aggregated data, Highly disaggregated data, Interactive, Analytical; two paths marked: the reporting path and the analytics path]

Reporting-driven data management processes aren’t always appropriate …
Traditional reporting processes support highly managed activities
Analytical processes are flexible and iteratively driven
Successful companies are managing the two processes differently
[Diagram labels: Structured Process, Source Data Systems, Integration, DW Storage, Metadata, BI, Unstructured Process, Hand coded Extracts, Analytical Tools, Hypothesis, Integrate, Interpret]

However, the two are closely aligned.
Hypothesis
Descriptive: measures the past (What)
Inferential: brings deep understanding of the past and is predictive of the future (Why)

So what’s the solution?

Predictive Analytics: A Summary
Going from seeing small bits to understanding the bigger picture
Integration
• Data: being able to link the unseen
• Models: provide the complete picture of the customer
• Technology: support the integration
• People: integrate all stakeholders of analytics into the business process

We’re using more and more data …
We used to work with a couple of dozen variables
Nowadays at least a couple of hundred
• Data from different sources
• Derived data (differences, ratios, trends etc.)
• Data from combined algorithms (market basket analysis combined with clustering combined with predictive modeling)
Can become thousands
• Pharma: micro-array data
• Interactions
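To make "derived data (differences, ratios, trends)" concrete, here is a minimal sketch of how three derived variables might be computed from one customer's history. The function name `derive_features`, the monthly-spend input, and the sample figures are hypothetical and not from the presentation; the trend is taken here as an ordinary least-squares slope, which is only one of several reasonable definitions.

```python
from statistics import mean


def derive_features(monthly_spend):
    """Derive difference, ratio, and trend variables from one customer's
    monthly spend history (oldest value first, at least two values)."""
    first, last = monthly_spend[0], monthly_spend[-1]
    n = len(monthly_spend)
    x_mean = (n - 1) / 2          # mean of the month indices 0..n-1
    y_mean = mean(monthly_spend)

    # Least-squares slope of spend against month index.
    numerator = sum((i - x_mean) * (y - y_mean)
                    for i, y in enumerate(monthly_spend))
    denominator = sum((i - x_mean) ** 2 for i in range(n))

    return {
        "difference": last - first,                 # change over the window
        "ratio": last / first if first else None,   # relative change
        "trend": numerator / denominator,           # average change per month
    }


# Hypothetical four-month spend history for a single customer.
features = derive_features([100.0, 110.0, 120.0, 130.0])
print(features)
```

Each raw time series yields several derived columns in this way, which is one reason variable counts grow from dozens to hundreds.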
This has some major implications …
History is key
• To build a model, you need historical data
• This history must be collected over time
• To use analytics effectively now, you must have started planning and executing up to two years ago
• Start collecting data now if you want to remain competitive
Data quality can be a showstopper
• All the data in the world is useless if it isn’t accurate
• Capturing and storing this data can be expensive if it isn’t useful
• Bad quality data can delay an analytics project by years
• Solve the data quality problem when you start, not afterwards

This has some major implications …
Granularity is essential
• Statistics works by extracting trends from large amounts of information
• Pre-summarised information is almost always useless
• The enterprise data warehouse may not be the best location for this data
• Don’t assume everything must be in a single data warehouse
It’s not just about data, it’s about the right data
• Knowing what data is important can be a challenge
• Requires a highly consultative approach with the business
• Helps to be tied back to strategic business drivers / business model
• Understand the business and the problem, and also involve the right stakeholders
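A small illustration of the granularity point, using made-up numbers: two hypothetical customers whose pre-summarised figures (average monthly spend) are identical, although their row-level histories are very different. The variable names and values are assumptions for illustration only.

```python
from statistics import mean, pstdev

# Hypothetical row-level monthly spend for two customers.
customer_a = [50, 50, 50, 50]      # steady spender
customer_b = [5, 10, 35, 150]      # rapidly growing spender

# Pre-summarised view: one average per customer.
summary_a = mean(customer_a)
summary_b = mean(customer_b)

# Row-level view: the spread is only recoverable from the detailed data.
spread_a = pstdev(customer_a)
spread_b = pstdev(customer_b)

print("averages:", summary_a, summary_b)
print("spreads: ", spread_a, spread_b)
```

The averages match, so a model fed only the summary cannot tell these customers apart; the difference is visible only in the disaggregated rows.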