Download Marketing Mix Modelling

Document related concepts
no text concepts found
Transcript
Predictive methods
Understanding customer preferences
Agenda
• Introduction to predictive analytics
– Logistic regression
• Case study: Japanese car manufacturer exporting in
the US
• Modelling interdependent consumer preferences
• Causality estimation
– Propensity scores to estimate effectiveness of marketing
interventions of a pharmaceutical company
• Web Analytics
Predictive methods for marketing
• Predictive methods exploit patterns found on
historical data to estimate the probability for a
certain individual to make a decision
• Three main categories:
– Scoring models -rank customers by their probability of
making a decision
– Descriptive models -categorize customers by their
preferences and life style
– Decision models - describe the relationship between
all the elements of a decision
Predictive methods applications
• Marketing planning and campaign
optimization
• Customer relationship management
• Market basket analysis
• Customer retention
• Direct marketing
• Fraud detection
• Web click stream analysis
Questions to be answered
• What’s the probability of a given customer to
purchase a product?
• How can I categorize the customer base in
homogeneous groups?
• Which potential customer should a promotion
be offered to?
• Which website should I advertise on?
• Which search keywords should I invest in?
Logistic regression
• Is used for prediction of the probability of
occurrence of an event by fitting data to a
logistic curve
• Makes use of several explanatory variables
that can be either numerical or categorical
• Used to predict dichotomous (0 = event doesn’t
occur, 1 = event does occur) or categorical values
(0=event a occurs, 1=event b occurs, 2=event c occurs)
Logistic regression
• Logistic regression
– Linear relation between the predictors and the
Logistic function
– P(Yi=1) is the probability of an event to occur
 P(Yi  1) 
   0  1 xi1   2 xi 2  ...   n xin   i
ln 
 1  P(Yi  1) 
Logistic regression
• Logistic curve
– Input values: any real number
– Output values:
• (from 0 to 1)
z
e
P(Yi  1) 
z
1 e
z   0  1 x1   2 x2  ...   n xn  
Example
• A sample of PhD students are asked to decide
whether to stop a research project considered
unethical by an animal rights’ group
Dependent variable:
Explanatory variable:
student’s answer
gender
0 = “Stop research”
1 = “Continue research
0 = Female
1 = Male
Model summary
• -2 Log likelihood stat: the smaller the number
the better the model
• Cox & Snell and Nagelkerke R Squares: the
higher the number the better the model
Model Summary
Step
1
-2 Log
Cox & Snell
likelihood
R Square
399.913a
.078
Nagelkerke
R Square
.106
a. Estimation terminated at iteration number 3 because
parameter estimates changed by less than .001.
Model summary
• The model predicts that the probability for a woman
to decide to continue research is 30% while for a
man is 60%
• The ODDS of deciding to continue research are 3.4
times higher for men than for women
ODDS from model’s equation
P(continue _ research ) 
ODDSRatio  exp ̂
e
aˆ  bˆ Factor
1 e
aˆ  bˆ Factor
Classification
• Classify subjects with respect to what decision we
think they will make ex. predict that men will
continue research and women will stop
• 66% of correct predictions, false positive rate
41%, false negative 30%
Case study: a Japanese car
manufacturer exporting in the USA
• Modelling consumer preferences
– What drives US consumers to purchase a
Japanese car versus a non Japanese one?
– What is the probability for an individual to
buy a Japanese car?
– Which people should be targeted ?
– Are consumer preferences interdependent?
The data
• Purchase of mid-sized cars (1 = Japanese, 0 =
non Japanese)
– Difference in price (k$),Difference in options (k$)
– Age of buyer (years)
– Annual income of buyer ($)
– Ethnic origins(1=Asian, 0=non-Asian)
– Education(1=College, 0= below College)
– Latitude & Longitude
The model
1
f ( z) 
z
1 e
z  183  2.68  lat  2.34  long  ...  
Interpreting model’s output
•Young and Asian people from south west of the city
are more likely to buy a Japanese car
•Does price coefficient make sense ?
•People with higher education are less likely to buy a
Japanese car
Model diagnostics
Very strong in predicting Japanese cars purchases
but weak in predicting non-Japanese cars purchases
Model diagnostics
Area under the curve:
0.754
Gini coefficient = 0.5
Modelling interdependent consumer
preferences
• An individual preference can be influenced by
preferences of others
– Psychological benefits
– Social identification
• People who identify with a particular group
often adopt the preferences of the group
• Incorporate these dependences into the
model
Looking at models’ residuals
The residuals represents what is not explained by the
model
Group 1
P(person1
to buy)
Coeff*
Income
...
Residual 1
Coeff*
Income
..
Residual 2
R
P(person2
to buy)
Looking at models’ residuals
• The presence of interdependent networks
create preferences that are mutually
dependent resulting in covariance matrix with
non zero off diagonal elements.
• Residuals of people belonging to the same
group are positively correlated
– Correlation (Residual Person 1 , residual Person 2)>0
Looking at models’ residuals
• Creating groups by splitting individuals into
neighbours
– AGE (16-25 , 26-40, over 40)
– Demographic (Combination of Age, education,
ethnic)
– Geographic influence (Postal code)
• Analyze average residuals for each group
Looking at models’ residuals
Including customer interdependences
Adding a group dummy which is equal to 1 if the
individual fits in the group and 0 otherwise
Asian – 26-40
P(person1
to buy)
Coeff*
Income
Group
...
dummy
Residual 1
Coeff*
Income
Group
..
dummy
Residual 2
R
P(person2
to buy)
Spurious statistics
• A high correlation
between sales and TV
could mean:
Sales
Media
Income
– Either media causes
sales
– or sales causes media
– or a third variable causes
both sales and TV
What is the truth?
Taking causality seriously
• Using least squares regressions and data
mining could lead to unreliable results:
– Polishing the Ferraris rather than the Jeeps
can cause Ferraris to win more races than
Jeeps
• Propensity scores to estimate the casual
effects of marketing interventions
Propensity scores
• Pharmaceutical company is to promote a lifestyle drug and evaluate the market
• The scope is to rank a list of doctors
according to their likelihood to prescribe a
certain drug
• Marketing interventions:
– Visiting a doctor describing the drug
– Dining the doctor at a nice restaurant
– Offering free samples of the drug
Impact of marketing interventions
• The marketing interventions are designed to
increase the number of prescriptions written
by the doctors
• But how to quantify the number of
prescriptions generated by the intervention?
– Compare the number of prescription written
after been visited with the number of
prescriptions that would have been written
without the intervention
Prediction VS causal estimation
Causality
Doctor A
Visit
15 scripts
10 scripts
No visit
T=1
15 scripts
T=2
Prediction VS causal estimation
Causality
Doctor B
Visit
5 scripts
1 script
No visit
T=1
2 scripts
T=2
Propensity scores
• We should make investment decisions
comparing the expected returns when
making the investment and when not making
the investment
• Short stop: lack of data
– For a doctor who is visited by the salesperson,
the number of scripts after the visit is
measureable but we cannot measure the number
of scripts written if the doctor wouldn’t have
been visited
How do we do it?
• Finding clones: create matched pairs of
doctors where one member of the pair has
been exposed to the intervention and the
other has not. The doctors must be
“identical” or very similar before the time of
exposure
• Clones are found through the propensity
score method
The data
• 250000 Doctors and for each one:
– Number of prescription written at time 1
– Number of prescription at time 2
– Was the doctor visited by the salesperson from
time 1 to time 2?(Y/N)
– Doctor’s characteristics: specialty, region, date of
degree and more than a hundred of such factors
The data
Number of
scripts at
time 1
Doctor
Number of
characteristic scripts at
s
time 2 if
visited
Number of
scripts at
time 2 if not
visited
Causal effect
Doctor 1
ok
ok
5
missing
?
Doctor 2
ok
ok
7
missing
?
Doctor 3
ok
ok
missing
2
?
Doctor 4
ok
ok
missing
1
?
Cloning by propensity score
– Propensity score (Doctor A)= predicted
probability of logistic regression
– Dependent variable (1=“Visited by salesperson,
0=“Not Visited”
– Factors: Characteristics
P(beingvisit ed ) 
e
aˆ  bˆ Factor...
1 e
aˆ  bˆ Factor...
The data
Number of Doctor
Number of
scripts at
characteris scripts at
time 1
tics
time 2 if
visited
Number of Propensity Causal
scripts at
Score
effect
time 2 if
not visited
Doctor 1ok
ok
5
missing
0.67
OK
Doctor 2ok
ok
7
missing
0.30
?
Doctor 3ok
ok
missing
2
0.66
?
Doctor 4ok
ok
missing
1
0.80
?
What’s the
causal effect ?
Propensity score
Campaign execution
• Doctors are ranked accordingly to their causal
increase in prescriptions
• The first % of doctors in the list are then
contacted and/or offered free samples
– Priority is given to the doctors who have not been
contacted yet
• The % is chosen accordingly to company’s
budget
Propensity scores on e-commerce
• Online store with membership database
– Estimate the effectiveness of promotional mails
and identify people to be targeted
• Calculate the ROI of a free shipping initiative
– Individuals receive free shipping but pay an annual
fee for it
• A pharmaceutical company offering on its
website a coupon to encourage trial use of a
drug
Web analytics
Purchase path
Purchase path
In this example we drilled into the
AdWords > AdWords path to see
the specific ads that were clicked
on en route to purchase.
Timing
To further increase the
accuracy of attribution, an
advertiser is able to choose
the maximum log window.
Decision influencer
What we know
What we don’t know yet
Our Communications
┼Paid Search
┼Banner Ads
┼e-mail
┼Onsite Promotions
┼Comparison Shopping
┼Affiliate ad
┼
Consumer Search
┼Organic search
┼Site visits to us
Competitor Communications
┼Consumer search
┼ Site visits to competitors
┼ Product trials
┼ …….
┼
Other sources
┼ Social Media
┼ Word of mouth
┼ Opinion sites
┼ Expert opinions
┼ Traditional Mass Media
U
n
c
e
r
t
a
i
n
t
y
Consumer decision
• Build a model to predict consumer decisions
– Using data on influencers that we are able to track and
measure
– Representing data on influencers that we can’t yet track
and measure - our uncertainty - through a statistical
distribution
• Calibrate the model on observed consumer
decisions
– Purchase - yes/no , Purchase size - dollar volume, # of units
– Repeat purchases, word of mouth
Decision model
Consumer’s decision is a function of
Our communications,
Consumer Search,
Competitor
communications,
Other sources
Paid Search, Banner Ads, email, On-site Promotions,
Comparison shopping,
Affiliate ads
Site visits to us
Uncertainty
Decision model
Data example
Data example
Influence potential
Data availability
We are barely scratching the surface of
the potential of path data with the
attribution models!!!
Betting on keywords
Effectiveness of different campaigns
Agenda
• Introduction to predictive analytics
– Logistic regression
• Case study: Japanese car manufacturer exporting in
the USA
• Modelling interdependent consumer preferences
• Causality estimation
– Propensity scores to estimate effectiveness of marketing
interventions of a pharmaceutical company
• Web Analytics
References