MGS 4020 Business Intelligence
Data Mining and Data Visualization
Apr 16, 2013
Georgia State University - Confidential
MGS4020_10.ppt/Apr 16, 2013/Page 1

Agenda
• Data Mining
• Marketing Analytics
• Example

What is Data Mining?
• A set of activities used to find new, hidden, or unexpected patterns in data
• Verification versus Discovery
• Accuracy in predicting consumer behavior

OLAP – Online Analytical Processing
• MOLAP – Multidimensional OLAP
• ROLAP – Relational OLAP
[Diagram labels: Data Warehouse / Data Mart; RDBMS]

Techniques and Technologies
• Techniques Used to Mine the Data
    • Classification
    • Association
    • Sequence
    • Cluster
• Data Mining Technologies
    • Statistical Analysis
    • Neural Networks, Genetic Algorithms and Fuzzy Logic
    • Decision Trees

Market Basket Analysis
• Market Basket Analysis
    • Most common and useful in Marketing
    • What products customers purchase together
      (Diapers and Beer sell well on Thursday nights)
• Benefits
    • Better target marketing
    • Product positioning with stores (virtual stores)
    • Inventory management
• Limitations
    • Large volume of real transactions needed
    • Difficult to correlate frequently purchased items with infrequently purchased items
    • Results of previous transactions could have been affected by other marketing promotions

Market Basket Analysis
Association Rules for Market Basket Analysis
• All associations are unidirectional and take on the following form:
      Left-hand side rule IMPLIES Right-hand side rule
• Left and Right hand side can both contain multiple items (Multidimensional Market Analysis)
• Examples:
    • Steak IMPLIES Red Wine
    • Hunting Magazines IMPLIES Smokeless Tobacco
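As a rough illustration of how a rule of this form can be scored, the sketch below counts baskets for a rule and derives the support, confidence, and lift measures this deck uses for Market Basket Analysis. The basket contents and the helper name `score_rule` are made up for illustration; only the three measures come from the slides.

```python
from typing import Dict, List, Set

# Hypothetical toy baskets; each set holds the items of one transaction.
BASKETS: List[Set[str]] = [
    {"steak", "red wine", "potatoes"},
    {"steak", "red wine"},
    {"steak", "beer"},
    {"red wine", "cheese"},
    {"diapers", "beer"},
    {"bread"},
    {"steak", "red wine", "bread"},
    {"cheese", "bread"},
    {"red wine"},
    {"beer", "chips"},
]


def score_rule(baskets: List[Set[str]], lhs: Set[str], rhs: Set[str]) -> Dict[str, float]:
    """Score the one-directional rule `lhs IMPLIES rhs` over a list of baskets."""
    n_total = len(baskets)
    n_lhs = sum(1 for b in baskets if lhs <= b)            # baskets with every LHS item
    n_rhs = sum(1 for b in baskets if rhs <= b)            # baskets with every RHS item
    n_both = sum(1 for b in baskets if (lhs | rhs) <= b)   # baskets where the rule holds
    if n_lhs == 0 or n_rhs == 0:
        raise ValueError("both sides of the rule must occur in at least one basket")
    support = n_both / n_total              # share of all baskets where the rule is true
    confidence = n_both / n_lhs             # share of LHS baskets that also hold the RHS
    lift = confidence / (n_rhs / n_total)   # confidence relative to the RHS base rate
    return {"support": support, "confidence": confidence, "lift": lift}


if __name__ == "__main__":
    forward = score_rule(BASKETS, {"steak"}, {"red wine"})
    reverse = score_rule(BASKETS, {"red wine"}, {"steak"})
    print("steak IMPLIES red wine:", forward)
    print("red wine IMPLIES steak:", reverse)
```

Swapping the two sides leaves support unchanged but changes confidence, which is why the rules are treated as unidirectional. With the deck's own figures the same arithmetic gives support 11/100 = 11% and confidence 11/17, about 65%; a rule with 35% confidence whose right-hand side appears in 40% of baskets has lift 0.35/0.40, about 0.88, which is below 1.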
Market Basket Analysis
3 Measures of Market Basket Analysis
• Support – the percentage of baskets in the analysis where the rule is true
    • Of 100 baskets, 11 contained both steak and red wine.
    • 11% support
• Confidence – the percentage of baskets containing the left-hand side items that also contain the right-hand side items
    • Of the 17 baskets that contained steak, 11 contained red wine.
    • 65% confidence
• Lift – compares the rule's confidence with the likelihood of finding the right-hand side item in any random basket
    • Also referred to as Improvement
    • Lift of less than 1 means the rule is less predictive than random choice
    • If Confidence is 35%, but the right-hand side item is in 40% of the baskets, the rule offers no Improvement over random selection.

Market Basket Analysis
Market Basket Analysis results can be:
• Trivial
    • Hot Dogs IMPLIES Hot Dog Buns
    • TV IMPLIES TV Warranty
• Inexplicable
Virtual Items – Associating non-items or other attributes into the correlation study
    • “New Customer”

Limitations of Data Mining
• All relevant data items / attributes may not be collected by the operational systems
• Data noise or missing values (data quality)
• Large database requirements and multi-dimensionality

Agenda
• Data Mining
• Marketing Analytics
• Example

Why use Analytics?
Some Benefits Are Quantifiable
• 15% to 51%+ increase in net sales
• For one product over a 3 yr period, $650mm in cost savings & over $350mm in incremental contribution
• >50% more accurate targeting of likely residential movers
• ROI of over 2500%
• Annual incremental revenue of > $178mm
• 24% reduction in churn rate from modeling/targeting likely churners
Other Benefits Not So Easily Quantified
• Decisions based on exhibited behaviors
• Makes data actionable
• Easier to measure results
• Validate instincts and opinions
• Enhanced what-if analysis & planning
• Less guesswork, more facts
• Built-in process improvement

Advanced analytics can help to answer the following questions …
• How do I determine which offers to make to my customers?
• What do my best customers look like, and where can I find more of them?
• What is the return on my marketing investment? How might my marketing plans be tweaked to optimize investment?
• Who are my most valuable customers? What are my key value drivers?
• Which of my customers have the greatest potential for growth – and which have little or no potential?
• Which of my customers are most vulnerable? What are the triggers causing them to leave or churn?
• Where should I employ my assets to meet customer demand?

Marketing Analytics Landscape
Strategy & Tactics: Guiding the business & helping to make numbers
    Business Planning, Forecasting, Corp Strategy, Financial Metrics, Profitability Analysis
Acquisition / Growth
Acquisition: Where can I find new customers?
Growth: Where can I find more revenue & profit from my current customers?
• Customer Acquisition
• Propensity to buy & response modeling
• Prospect profiling
• Marketing Optimization
• Event driven marketing
• Market Basket Analysis
• Online and Retail Channels
Retention / Reacquisition
Retention: Which of my customers are at risk and how can I keep them?
• Customer and product churn modeling
• Retentive stickiness of key products
Reacquisition: Which customers do I want to win back?
• Customer reacquisition
• Customer profitability analysis
• Prediction of key events (e.g., residential movers)
Customer Knowledge – Who are my customers?
    Segmentation & Profiles, External Data, Mkt Share/Wallet Share, Channel Preference Modeling

Direct Marketing Campaign Platform
[Figure: customer lifecycle diagram with the stages ACQUIRE, ACTIVATION, RETAIN, and REACTIVATE, ranking customers from highest value to lowest value]
• ACQUIRE (different channels): POS, Partners, Advertising, Store
• ACTIVATION: E-mail Address; Purchased Promotion; Purchase / No Purchase
• RETAIN vehicles: Statements, Newsletters, Inserts, Direct mail, Personalized kits, E-mail, Telephone
• Triggered Promotions (for example): Days since last purchase = X
    • X = 30 days for PTNM
    • X = 60 days for GOLD
    • X = 120 days for CLUB
• Test Area downgrade trigger
• Lowest value customers: < 1 purchase in last 12 mo
• REACTIVATE vehicles: Direct Mail, E-mail, Statements
• Reactivation conditions shown: Vc compared with Cost to reactivate (including Vc < Cost to reactivate); Time since inactive = X, and Point balance > X
• Other labels: “FIRE”; Ugly Postcard???

General Data Mining Methods
• Classification: Predicting which customers will purchase, based on demographics, psychographics, firmographics, service history, transactions, credit history, etc. Statistical algorithms and decision trees are used for these problems with much success.
• Association: Market Basket Analysis: which customers who purchase an additional telephone line are also likely to purchase dialup internet service? Pattern matching works well: associative rules, fuzzy logic, neural networks.
• Sequencing: Which types of activities precede each other; e.g., do customer hospitality and gaming activities show patterns or sequences?
  We use a combination of statistical modeling and simulations to identify these trigger points for action, and to estimate the marginal value of each.
• Clustering is useful for determining similar groups based on how closely they resemble each other. A multitude of clustering techniques exist, with the primary difference being in how they define what is “close”. Clustering can be very useful for marketing messaging and advertising, strategy development and implementation, and channel development.

Analytics Process
• DISCOVERY: Identifying Opportunities, Scoping Effort, Objective Setting, Developing Hypotheses
• DATA PREPARATION: Data Warehouse, External Data Append, Data Extraction, Data Validation
• KNOWLEDGE DEVELOPMENT: Hypothesis Testing, Segmentation, Offer Optimization, Statistical Modeling, Customer Behavior Scoring
• LEVERAGING ANALYTICS: Direct Mail, Telemarketing, Email, Loyalty Campaign
• POST ANALYSIS FEEDBACK: Results Decomposition, Feedback for Refining Analytics

Summary
• Analytics allow quantifiable, intelligent decision making
• Analytics can be leveraged across all areas of a business
• Different analytical methods apply to different situations
• Modeling enables you to combine potentially hundreds of factors into a single decision metric (or a few key scores/clusters)
• Analytics are more powerful when tied to bottom line profitability

Agenda
• Data Mining
• Marketing Analytics
• Example

InterContinental Brand Reactivation Promotion
• Frequent travelers (points collectors) who had 1+ stays at InterContinental hotels in the US between Jan 1, 2001 and Jun 30, 2002.
• Frequent travelers (points collectors) who had 0 stays at InterContinental hotels in the US between Jul 1, 2002 and Dec 31, 2003.
• A set of activities used to find new, hidden, or unexpected patterns in data
• Accuracy in predicting the behavior of these consumers and reactivating them

SQL
    SELECT MBR.MEMBERSHIP_ID,
           MBR.FIRST_NAME,
           MBR.LAST_NAME,
           MBR.ADDR_LINE_1,
           MBR.ADDR_LINE_2,
           MBR.ADDR_LINE_3,
           MBR.ADDR_LINE_4,
           MBR.ADDR_LINE_5,
           MBR.CITY,
           MBR.STATE_DESTINATION,
           MBR.ZIP_CODE,
           MBR.TYPE,
           -- stays checked out in the qualifying window (Jan 1, 2001 to Jun 30, 2002)
           SUM(CASE WHEN EVENT.CHECK_OUT_DATE BETWEEN '01-01-2001' AND '06-30-2002' THEN 1 ELSE 0 END) AS ONE_PLUS_STAYS,
           -- stays checked out in the lapsed window (Jul 1, 2002 to Dec 31, 2003)
           SUM(CASE WHEN EVENT.CHECK_OUT_DATE BETWEEN '07-01-2002' AND '12-31-2003' THEN 1 ELSE 0 END) AS ZERO_STAYS
    FROM MBR, EVENT, PROPERTY, XREF
    WHERE (MBR.MEMBERSHIP_ID = XREF.MEMBERSHIP_ID)
      AND (PROPERTY.PROPERTY_ID = EVENT.PROPERTY_ID)
      AND (EVENT.MEMBERSHIP_ID = XREF.MEMBERSHIP_ID)
      AND (MBR.MARKET_REGION_CODE = '05388'
           AND MBR.TYPE IN ('BASE', 'GOLD', 'PLTNM')
           AND MBR.PREF_ALLIANCE_CODE = 'POINT'
           AND PROPERTY.BRAND_MAJOR_CODE = 'INTERCONTINENTAL'
           AND PROPERTY.MARKET_REGION = 'US')
    GROUP BY MBR.MEMBERSHIP_ID,
             MBR.FIRST_NAME,
             MBR.LAST_NAME,
             MBR.ADDR_LINE_1,
             MBR.ADDR_LINE_2,
             MBR.ADDR_LINE_3,
             MBR.ADDR_LINE_4,
             MBR.ADDR_LINE_5,
             MBR.CITY,
             MBR.STATE_DESTINATION,
             MBR.ZIP_CODE,
             MBR.TYPE
    -- keep members with at least one earlier stay and none in the later window
    HAVING ONE_PLUS_STAYS >= 1 AND ZERO_STAYS = 0

Cluster Analysis
• Definition: The identification and grouping of consumers that share similar characteristics
• Yields: better understanding of prospects/customers
• Translates into: improved business results through revised strategies
• Process:
    • Data Selection
    • Missing Values
    • Standardization
    • Removal of Outliers
    • Cluster Analysis Considerations

Cluster Analysis
Data Selection
• Only want a small subset of variables for clustering
• Weed out undesirable variables
• Can use PROC FACTOR, PROC CORR
• Can use expert system
• Consideration for observations, weighting
Missing Values
• Probably done with factor analysis
• If not, then two options
    • Set Missing to Mean of data
    • Set Missing to Value of Equivalent Performance
• No right or wrong answer
• Might do both, depending on variables

Clustering
[Figure: Prospect Base divided into clusters]
• Midscale / Business Traveler
• Midscale / Leisure Traveler
• Upscale / Leisure Traveler
• Country Club / Resort Set
• Upscale / Business Traveler – Prosperous Traveler
• Upscale / Business Traveler – Loan Dependent
• Other

Cluster Analysis

Attribute / Cluster Name                                                   A     B     C     D     E (ALL)
Age of Head of Household                                                  38    62    48    44    52    43
Length of Residence in high income group zip codes                         7    12     9     6     7     7
Household Income (,000)                                                   48    45   102    73    71    72
Weekday Check in                                                          13     1     3     6     2     3
Weekend Check in                                                          69     6    29    51     7    30
No. Stays (resort) between Jan 1, 2001 and Jun 30, 2002                    0     5     6     5     3     2
No. Stays (mid properties) between Jan 1, 2001 and Jun 30, 2002           11    55    21    15    32    16
No. Stays (upscale properties) between Jan 1, 2001 and Jun 30, 2002       24     2    10    15     8     7

Cluster Analysis

Cluster   Population %   Resp. Index   Avg. Profit
A                    6           250          (75)
B                   16            30             5
C                    5           110            48
D                    8           175            86
E                    7            80           (5)
. . .
All                100           100            35

Cluster Analysis
[Figure: selection process repeated for each cluster and for each offer, DM/Offer 1, DM/Offer 2, through DM/Offer N]
• Calculate Scores (ROI, Response, Utilization)
• Overlay Profitability Estimate
• Evaluate Risk-Return Tradeoff (by Offer and by Cluster), with RISK and RETURN each ranging from Low to High
• Make Final Selections: Mail or No-Mail
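The final selection step can be sketched in a few lines. The figures are the ones in the cluster table (Population %, Resp. Index, Avg. Profit), with the parenthesized profits read as losses. The decision rule itself, mailing a cluster only when its average profit is above a threshold, and the names `ClusterStats` and `select_for_mail` are assumptions made for illustration, not the deck's actual scoring method.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClusterStats:
    name: str
    population_pct: float   # share of the prospect base, in percent
    response_index: float   # response index (the All row shows 100)
    avg_profit: float       # average profit; negative values are losses


# Values transcribed from the cluster table; (75) and (5) are assumed to be losses.
CLUSTERS: List[ClusterStats] = [
    ClusterStats("A", 6, 250, -75),
    ClusterStats("B", 16, 30, 5),
    ClusterStats("C", 5, 110, 48),
    ClusterStats("D", 8, 175, 86),
    ClusterStats("E", 7, 80, -5),
]


def select_for_mail(clusters: List[ClusterStats], min_profit: float = 0.0) -> List[ClusterStats]:
    """Hypothetical rule: mail clusters whose average profit exceeds min_profit,
    ordered from most to least profitable."""
    chosen = [c for c in clusters if c.avg_profit > min_profit]
    return sorted(chosen, key=lambda c: c.avg_profit, reverse=True)


if __name__ == "__main__":
    mail_names = {c.name for c in select_for_mail(CLUSTERS)}
    for c in CLUSTERS:
        decision = "Mail" if c.name in mail_names else "No-Mail"
        print(f"Cluster {c.name}: resp. index {c.response_index}, "
              f"avg profit {c.avg_profit} -> {decision}")
```

Cluster A illustrates why the profitability overlay matters: it has the highest response index in the table (250) but the lowest average profit, so under this assumed rule it lands in the No-Mail group despite responding best.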