Market Basket Analysis & Neural Networks (chaps 7 & 11)

Retail Checkout Data

MARKET BASKET ANALYSIS
• INPUT: list of purchases by purchaser
  – purchaser names are not available
• Identify purchase patterns
  – what items tend to be purchased together
    • obvious: steak-potatoes; beer-pretzels
  – what items are purchased sequentially
    • obvious: house-furniture; car-tires
  – what items tend to be purchased by season

McGraw-Hill/Irwin ©2007 The McGraw-Hill Companies, Inc. All rights reserved

Market Basket Analysis
• Categorize customer purchase behavior
• Identify actionable information
  – purchase profiles
  – profitability of each purchase profile
  – use for marketing
    • layout or catalogs
    • select products for promotion
    • space allocation, product placement

Market Basket Analysis
• Affinity Positioning
  – coffee, coffee makers in close proximity
• Cross-Selling
  – cold medicines, tissue, orange juice
  – Monday Night Football kiosks on Monday p.m.

Possible Market Baskets
Customer 1: beer, pretzels, potato chips, aspirin
Customer 2: diapers, baby lotion, grapefruit juice, baby food, milk
Customer 3: soda, potato chips, milk
Customer 4: soup, beer, milk, ice cream
Customer 5: soda, coffee, milk, bread
Customer 6: beer, potato chips

Co-occurrence Table
Each cell counts the baskets above that contain both items; the diagonal counts the baskets that contain the item at all.

              Beer   Pot. Chips   Milk   Diapers   Soda
Beer           3         2          1       0        0
Pot. Chips     2         3          1       0        1
Milk           1         1          4       1        2
Diapers        0         0          1       1        0
Soda           0         1          2       0        2

• beer & potato chips - makes sense
• milk & soda - probably noise

Jaccard Coefficient
Ratio of cases together over total cases. The values below divide the number of baskets containing both items by the sum of the two items’ basket counts, e.g. beer & potato chips: 2 / (3 + 3) = 0.333.

              Beer    PotChip   Milk    Diapers
PotChip       0.333
Milk          0.143   0.143
Diapers       0       0         0.200
Soda          0       0.200     0.333   0
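The co-occurrence counts and the ratio on the Jaccard slide can be reproduced from the six baskets with a short script. This is a minimal sketch: the function names (`cooccurrence`, `slide_ratio`, `jaccard`) are illustrative and not from the slides. Note that the slide values match "together / (count A + count B)"; the standard Jaccard coefficient also subtracts the overlap from the denominator, so both are shown for comparison.

```python
from collections import Counter
from itertools import combinations

# The six baskets from the "Possible Market Baskets" slide.
BASKETS = [
    {"beer", "pretzels", "potato chips", "aspirin"},
    {"diapers", "baby lotion", "grapefruit juice", "baby food", "milk"},
    {"soda", "potato chips", "milk"},
    {"soup", "beer", "milk", "ice cream"},
    {"soda", "coffee", "milk", "bread"},
    {"beer", "potato chips"},
]
ITEMS = ["beer", "potato chips", "milk", "diapers", "soda"]


def cooccurrence(baskets, items):
    """Count baskets containing each item (diagonal) and each pair of items."""
    counts = Counter()
    for basket in baskets:
        present = [item for item in items if item in basket]
        for item in present:
            counts[(item, item)] += 1
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def slide_ratio(counts, a, b):
    """Ratio as tabulated on the slide: together / (count of a + count of b)."""
    return counts[(a, b)] / (counts[(a, a)] + counts[(b, b)])


def jaccard(counts, a, b):
    """Standard Jaccard coefficient: together / baskets containing a or b."""
    both = counts[(a, b)]
    return both / (counts[(a, a)] + counts[(b, b)] - both)


counts = cooccurrence(BASKETS, ITEMS)
for a, b in combinations(ITEMS, 2):
    print(f"{a:13s} {b:13s} together={counts[(a, b)]} "
          f"slide={slide_ratio(counts, a, b):.3f} jaccard={jaccard(counts, a, b):.3f}")
```

Either ratio ranks beer & potato chips and milk & soda highest here, which is why the slides add a judgment step: the first pair makes sense, the second is probably noise.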
Market Basket Analysis
• Steve Schmidt - president of ACNielsen US
• Market Basket Benefits
  – selection of promotions, merchandising strategy
    • sensitive to price: Italian entrees, pizza, pies, Oriental entrees, orange juice
  – uncover consumer spending patterns
    • correlations: orange juice & waffles
  – joint promotional opportunities

Market Basket Analysis
• Retail outlets
• Telecommunications
• Banks
• Insurance
  – link analysis for fraud
• Medical
  – symptom analysis

Market Basket Analysis
• Chain Store Age Executive (1995)
  1) Associate products by category
  2) What % of each category was in each market basket
• Customers shop on personal needs, not on product groupings

Purchase Profiles
• Beauty conscious
• Kids’ play
• Smoker
• Health conscious
• Casual drinker
• Pet lover
• Sports conscious
• New family
• Gardener
• Men’s image conscious
• Casual reader
• Hobbyist
• Convenience food
• Sentimental
• Illness (OTC)
• Home handyman
• Automotive
• Illness (prescription)
• TV/stereo enthusiast
• Photographer
• Personal care
• Seasonal/traditional
• Homemaker
• Men’s fashion
• Student/home office
• Home Comfort
• Kids’ fashion
• Fashion footwear
• Women’s fashion

Purchase Profiles
• Beauty conscious
  – cotton balls
  – hair dye
  – cologne
  – nail polish

Purchase Profile Use
• Each profile has an average profit per basket

Profile                 Avg. profit per basket   Action
Kids’ fashion           $15.24                   Push these
Men’s fashion           $13.41                   Push these
….
Smoker                  $2.88                    Don’t push these
Student/home office     $2.55                    Don’t push these
Market Basket Analysis
• LIMITATIONS
  – takes over 18 months to implement
  – market basket analysis only identifies hypotheses, which need to be tested
    • neural network, regression, decision tree analyses
  – measurement of impact needed
  – difficult to identify product groupings
  – complexity grows exponentially

Market Basket Analysis
• BENEFITS:
  – simple computations
  – can be undirected (don’t have to have hypotheses before analysis)
  – different data forms can be analyzed

Market Basket Software
• Market Basket Analysis is highly unstructured
• Most popular data mining software doesn’t support it
  – Clementine does
• Specialty software market for this specific purpose
  – DataSage Customer Analysis
  – Xaffinity

Neural Networks
Automatic Model Building (Machine Learning)
Artificial Intelligence

High-Growth Product
• Used for classifying data
  – target customers
  – bank loan approval
  – hiring
  – stock purchase
  – trading electricity
  – DATA MINING
• Used for prediction

Description
• Use network of connected nodes (in layers)
• Network connects input, output (categorical)
  – inputs like independent variable values in regression
  – outputs: {buy, don’t} {paid, didn’t} {red, green, blue, purple} {character recognition - alphabetic characters}
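The layered structure described above, with inputs feeding connected nodes and a categorical output such as {buy, don’t}, can be sketched as a forward pass. This is an illustrative sketch only: the layer sizes, the sigmoid and softmax functions, the random weights, and the input values are all assumptions chosen for the example, and the network is untrained, so the predicted label is arbitrary.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """One row of weights per node; the last entry in each row is the bias."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_output(weights, inputs):
    """Weighted sum of the inputs (plus bias) for every node in the layer."""
    extended = list(inputs) + [1.0]  # constant 1.0 multiplies the bias weight
    return [sum(w * x for w, x in zip(row, extended)) for row in weights]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Turn raw output scores into values that sum to 1, one per category."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(network, inputs, labels):
    """Pass inputs through the hidden layers, then pick the likeliest category."""
    activations = list(inputs)
    for weights in network[:-1]:
        activations = [sigmoid(z) for z in layer_output(weights, activations)]
    scores = softmax(layer_output(network[-1], activations))
    best = max(range(len(labels)), key=lambda k: scores[k])
    return labels[best], scores


rng = random.Random(0)
labels = ["buy", "don't"]
# Hypothetical shape: 3 inputs -> 4 hidden nodes -> 2 output categories.
network = [init_layer(3, 4, rng), init_layer(4, len(labels), rng)]
label, scores = predict(network, [0.2, 0.7, 0.1], labels)
print(label, [round(s, 3) for s in scores])
```

The inputs play the role of independent variable values in a regression, and the output layer has one node per category.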
Perceptron
[Figure: perceptron. Inputs I1, I2, I3, ..., In are scaled by synaptic weights W1, W2, W3, ..., Wn and, together with a bias, feed a neuron with transfer function F(x) that produces output O.]
• Basic building block
• Comprised of Synaptic Weights and Neuron
• Weights scale the input values
• Combination of weights and transfer function F(x) transforms inputs to needed output O
• Trained by changing weights until desired output is achieved

Network
[Figure: network with an Input Layer, Hidden Layers, and an Output Layer whose outputs are Good and Bad.]

Operation
• Randomly generate weights on model
  – based on brain neurons
    • input electrical charge transformed by neuron
    • passed on to another neuron
  – weight input values, pass on to next layer
  – predict which of the categorical outputs is true
• Measure fit
  – fine tune around best fit

Operation
• Useful for PATTERN RECOGNITION
• Can sometimes substitute for REGRESSION
  – works better than regression if relationships nonlinear
  – MAJOR RELATIVE ADVANTAGE OF NEURAL NETWORKS: YOU DON’T HAVE TO UNDERSTAND THE MODEL

Neural Network Testing
• Usually train on part of available data
  – package tries weights until it successfully categorizes a selected proportion of the training data
• When trained, test model on part of data
  – if given proportion successfully categorized, quits
  – if not, works some more to get better fit
• The “model” is internal to the package
• Model can be applied to new data
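The train-then-test loop above can be sketched with a single perceptron. This is a minimal illustration, not the procedure of any particular package: the synthetic data, the 80/20 split, the step transfer function, the learning rate, and the 95% target are all assumptions chosen for the example.

```python
import random


def predict(weights, bias, inputs):
    """Weights scale the inputs; a step transfer function gives the output."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def accuracy(weights, bias, rows):
    """Proportion of rows that are categorized correctly."""
    return sum(predict(weights, bias, x) == y for x, y in rows) / len(rows)


def train(rows, target=0.95, rate=0.1, max_epochs=5000, seed=0):
    """Start from random weights and adjust them until the selected
    proportion of the training data is categorized correctly."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in rows[0][0]]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(max_epochs):
        if accuracy(weights, bias, rows) >= target:
            break
        for x, y in rows:
            error = y - predict(weights, bias, x)  # 0 when already correct
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
            bias += rate * error
    return weights, bias


def make_data(n, rng):
    """Synthetic, linearly separable rows: label 1 when x1 + x2 > 1.
    Points too close to the boundary are skipped so training converges."""
    rows = []
    while len(rows) < n:
        x = [rng.random(), rng.random()]
        s = x[0] + x[1]
        if abs(s - 1.0) < 0.1:
            continue
        rows.append((x, 1 if s > 1.0 else 0))
    return rows


rng = random.Random(42)
data = make_data(100, rng)
train_rows, test_rows = data[:80], data[80:]  # train on part, test on the rest

weights, bias = train(train_rows)
train_acc = accuracy(weights, bias, train_rows)
test_acc = accuracy(weights, bias, test_rows)
print(f"training accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

If the test accuracy falls short of the required proportion, the natural next step is to train further or on more data, which corresponds to "works some more to get better fit".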
Business Application
• Best in classifying data
  – mortgage underwriting
  – bond rating
  – commodity trading
  – asset allocation
  – fraud prevention
• Predicting
  – interest rate, inventory
  – firm failure
  – bank failure
  – takeover vulnerability
  – stock price
  – corporate merger profitability

Neural Network Process
1. Collect data
2. Separate into training, test sets
3. Transform data to appropriate units
   • Categorical works better, but not necessary
4. Select, train, & test the network
   • Can set number of hidden layers
   • Can set number of nodes per layer
   • A number of algorithmic options
5. Apply (need to use system on which built)

Marketing Applications
• Direct marketing
  – database of prospective customers
    • age, sex, income, occupation, education, location
    • predict positive response to mail solicitations
• THIS IS HOW DATA MINING CAN BE USED IN MICROMARKETING

Neural Nets to Predict Bankruptcy
Wilson & Sharda (1994)
• Monitor firm financial performance
  – useful to identify internal problems, investment evaluation, auditing
• Predict bankruptcy - multivariate discriminant analysis of financial ratios (develop formula of weights over independent variables)
• Neural network
  – inputs were 5 financial ratios
  – data from Moody’s Industrial Manuals (129 firms, 1975-1982; 65 went bankrupt)
• Tested against discriminant analysis
  – neural network significantly better

CASE: Support CRM
Drew et al. (2001), Journal of Service Research
• Identify customers to target
• Customer hazard function:
  – Likelihood of leaving to a competitor (CHURN)
• Gain in Lifetime Value (GLTV)
  – NPV: weight EV by prob{staying}
  – GLTV: quantified potential financial effects of company actions to retain customers

Systems
A great many products
• general NN products, $59 to $2,000
  – @Brain
  – BrainMaker
  – Discover-It
• components: DATA MINING along with megadatabases, other products
• specialty products
  – construction bidding, stock trading, electricity trading

Potential Value
• THEY BUILD THEMSELVES
  – humans pick the data, variables, set test limits
• CAN DEAL WITH FAST-MOVING SITUATIONS
  – stock market
• CAN DEAL WITH MASSIVE DATA
  – data mining
• Problem - speed unpredictable
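The CRM case slide describes weighting expected value by the probability of staying and measuring the gain from retention actions. One plausible reading of that arithmetic is sketched below. It is not the formula from Drew et al. (2001): the function names, the constant value per period, the per-period hazard lists, and the discount rate are all hypothetical choices made for illustration.

```python
def lifetime_value(hazards, value_per_period, discount_rate):
    """Expected NPV of a customer: each period's value is weighted by the
    probability the customer has stayed through that period, then discounted.

    hazards[t] is the assumed probability of churning in period t + 1,
    given that the customer is still present at its start.
    """
    prob_staying = 1.0
    total = 0.0
    for period, hazard in enumerate(hazards, start=1):
        prob_staying *= 1.0 - hazard
        total += prob_staying * value_per_period / (1.0 + discount_rate) ** period
    return total


def gain_in_lifetime_value(hazards, reduced_hazards, value_per_period, discount_rate):
    """Gain from a retention action, modeled here as a lower hazard per period."""
    before = lifetime_value(hazards, value_per_period, discount_rate)
    after = lifetime_value(reduced_hazards, value_per_period, discount_rate)
    return after - before


# Hypothetical customer: 20% churn hazard per period, cut to 10% by an offer.
baseline = [0.20, 0.20, 0.20]
with_offer = [0.10, 0.10, 0.10]
gltv = gain_in_lifetime_value(baseline, with_offer, value_per_period=100.0,
                              discount_rate=0.05)
print(f"gain in lifetime value: {gltv:.2f}")
```

Under this reading, customers whose gain exceeds the cost of the retention action would be the ones to target.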