New Chapter “99” Event Duration Models
This chapter covers models of elapsed duration.
- Customer Relationship Duration
- Loyalty Program Membership Duration
- Customer Retention Metrics
Mathematical
Marketing
This Section is 95% taken from
Helsen, Kristiaan and David C. Schmittlein (1993), "Analyzing Duration
Times in Marketing: Evidence for the Effectiveness of Hazard Rate
Models," Marketing Science, 12 (4), 395-414.
Slide 99d.1
Hazard Rate Models
Module Sequence
The sequence of coverage:
- Definitions
- The Hazard Function
- Truncation
- Censoring
- Non-Parametric Models
- Parametric Models
Slide 99d.2
Key Definitions
Define $T_i$ as a random variable representing the duration for individual i. Then
$$F(t) = \Pr(T_i < t)$$
is the cumulative distribution function of the duration (failure) times. The density, or unconditional failure rate, is
$$f(t) = F'(t) = \frac{dF(t)}{dt}$$
Slide 99d.3
More On Survivorship and Failureship
The cumulative failure function can now be written as an integral,
$$F(t) = \Pr(T_i < t) = \int_0^t f(u)\,du$$
The survivorship function is the complement of the failureship distribution,
$$S(t) = 1 - F(t) = \Pr(T_i > t) = \int_t^\infty f(u)\,du$$
Slide 99d.4
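The two integrals can be checked numerically. This is a minimal sketch, assuming an exponential density with an illustrative rate of 0.5; the function names and the trapezoid-rule integration are choices made for this example, not part of the source.

```python
import math

def density(t, lam=0.5):
    """Assumed example density: f(t) = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)

def cumulative_failure(t, f=density, steps=10_000):
    """F(t): integral of f(u) du from 0 to t, by the trapezoid rule."""
    if t <= 0:
        return 0.0
    width = t / steps
    total = 0.5 * (f(0.0) + f(t))
    for k in range(1, steps):
        total += f(k * width)
    return total * width

def survivorship(t, f=density):
    """S(t) = 1 - F(t)."""
    return 1.0 - cumulative_failure(t, f)

for t in (1.0, 2.0, 5.0):
    # For this density the exact survivorship is exp(-0.5 * t).
    print(t, round(survivorship(t), 4), round(math.exp(-0.5 * t), 4))
```

The numerical S(t) should agree with the closed form exp(-0.5 t) to several decimals.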
What Is a Hazard Function?
The hazard function, or conditional (age-specific) failure rate, is
$$h(t) = \frac{f(t)}{1 - F(t)} = \frac{f(t)}{S(t)}$$
Slide 99d.5
Elaboration on the Hazard Function
$$h(t) = \frac{f(t)}{1 - F(t)} = \frac{f(t)}{S(t)}$$
Numerator: Pr[failure at t]
Denominator: Pr[there has not been a failure up to t]
It is the instantaneous rate of failure given survival until now, or the imminent failure risk.
Slide 99d.6
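The "given survival until now" reading can be made concrete. The sketch below assumes, purely for illustration, that durations are uniform on [0, 10] months, so F(t) = t/10. It compares h(t) = f(t)/S(t) with the share of survivors who fail in a short window after t. All names are hypothetical.

```python
HORIZON = 10.0  # assumed: durations are uniform on [0, HORIZON] months

def cumulative_failure(t):
    """F(t) = Pr(T < t) for the assumed uniform durations (0 <= t < HORIZON)."""
    return t / HORIZON

def density(t):
    """f(t) = dF(t)/dt."""
    return 1.0 / HORIZON

def survivorship(t):
    """S(t) = 1 - F(t)."""
    return 1.0 - cumulative_failure(t)

def hazard(t):
    """h(t) = f(t) / S(t)."""
    return density(t) / survivorship(t)

def conditional_failure_rate(t, dt=1e-6):
    """Share of survivors at t who fail within [t, t + dt), per unit of time."""
    failed_in_window = cumulative_failure(t + dt) - cumulative_failure(t)
    return failed_in_window / survivorship(t) / dt

for t in (1.0, 5.0, 9.0):
    print(t, hazard(t), conditional_failure_rate(t))
```

Even though the density is flat here, the hazard rises with t, because the same failure mass is spread over fewer and fewer survivors.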
The Shape of the Hazard Function
$$h(t) = \frac{f(t)}{1 - F(t)} = \frac{f(t)}{S(t)}$$
The hazard function can take on any shape:
1. h(t) increases, $dh(t)/dt > 0$: snowballing (product adoption)
2. h(t) constant, $dh(t)/dt = 0$: no dynamics or memory
3. h(t) decreases, $dh(t)/dt < 0$: inertia (interpurchase times)
Slide 99d.7
Constant Hazard – No Memory
The exponential distribution
$$f(t) = \lambda e^{-\lambda t}$$
implies
$$h(t) = \lambda$$
and we have situation 2.
Slide 99d.8
The Two-Parameter Weibull
The Weibull distribution
$$f(t) = \lambda \gamma t^{\gamma - 1} e^{-\lambda t^{\gamma}}$$
implies
$$h(t) = \lambda \gamma t^{\gamma - 1}$$
and we can create any of the three situations.
Slide 99d.9
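A short sketch of how the Weibull hazard covers the three situations. It assumes the hazard has the form lam * gamma * t**(gamma - 1); the parameter names `lam` and `gamma` and the helper `describe_shape` are labels chosen for this example. A shape parameter above 1 gives an increasing hazard, exactly 1 a constant hazard, and below 1 a decreasing hazard.

```python
def weibull_hazard(t, lam, gamma):
    """h(t) = lam * gamma * t**(gamma - 1), for t > 0."""
    return lam * gamma * t ** (gamma - 1.0)

def describe_shape(values, tol=1e-12):
    """Classify a sequence of hazard values as increasing, constant, or decreasing."""
    steps = [later - earlier for earlier, later in zip(values, values[1:])]
    if all(step > tol for step in steps):
        return "increasing"
    if all(step < -tol for step in steps):
        return "decreasing"
    if all(abs(step) <= tol for step in steps):
        return "constant"
    return "mixed"

times = [0.5, 1.0, 2.0, 4.0]
shapes = {}
for gamma in (0.5, 1.0, 2.0):
    hazards = [weibull_hazard(t, lam=1.0, gamma=gamma) for t in times]
    shapes[gamma] = describe_shape(hazards)
    print(gamma, shapes[gamma])
```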
The Hazard Rate Impacts Average Retention
Since
$$h(t) = \frac{f(t)}{1 - F(t)} = \frac{f(t)}{S(t)}$$
we can solve for f(t),
$$f(t) = h(t)\,S(t) = h(t)\exp\left(-\int_0^t h(u)\,du\right)$$
and see that the hazard rate determines the mean duration implied by f(t).
So can we add independent variables to the model?
First, a digression on censoring.
Slide 99d.10
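To see the effect on average retention, one can integrate the survivorship function, since the mean duration equals the area under S(t). The sketch below assumes a Weibull hazard with illustrative parameters and a finite upper integration limit; both are assumptions of this example.

```python
import math

def survivorship(t, lam, gamma):
    """S(t) = exp(-integral of h) = exp(-lam * t**gamma) for a Weibull hazard."""
    return math.exp(-lam * t ** gamma)

def mean_duration(lam, gamma, upper=200.0, steps=100_000):
    """E[T] = integral of S(t) dt over [0, upper], by the trapezoid rule.

    `upper` stands in for infinity and must be large enough that S(upper) is negligible.
    """
    width = upper / steps
    total = 0.5 * (survivorship(0.0, lam, gamma) + survivorship(upper, lam, gamma))
    for k in range(1, steps):
        total += survivorship(k * width, lam, gamma)
    return total * width

# Doubling a constant hazard (gamma = 1) halves the mean duration.
for lam in (0.1, 0.2):
    print(lam, round(mean_duration(lam, 1.0), 3))
```

With a constant hazard the result matches the familiar 1/lam, so a higher hazard translates directly into shorter average retention.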
Truncation and Censoring
             Left                                    Right
Truncation   T_i is observed only if T_i > a         T_i is observed only if T_i < a
Censoring    If T_i ≤ a, then T_i = a:               If T_i ≥ a, then T_i = a:
             all values below a are observed as a    all values above a are observed as a
Slide 99d.11
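The four cases are easy to confuse, so here they are applied to one small list of durations. This sketch uses the usual convention that left truncation drops values below the threshold and right censoring caps values above it; the data and function names are made up for illustration.

```python
def left_truncate(durations, a):
    """Observations at or below a never enter the sample."""
    return [t for t in durations if t > a]

def right_truncate(durations, a):
    """Observations at or above a never enter the sample."""
    return [t for t in durations if t < a]

def left_censor(durations, a):
    """Every observation stays, but values below a are recorded as a."""
    return [max(t, a) for t in durations]

def right_censor(durations, a):
    """Every observation stays, but values above a are recorded as a."""
    return [min(t, a) for t in durations]

durations = [2, 5, 8, 11, 14]  # hypothetical durations
a = 6                          # hypothetical threshold

print("left truncation: ", left_truncate(durations, a))
print("right truncation:", right_truncate(durations, a))
print("left censoring:  ", left_censor(durations, a))
print("right censoring: ", right_censor(durations, a))
```

Truncation changes the sample size; censoring keeps the sample size and changes the recorded values.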
A Typical Relationship Between x and y
[Figure: y plotted against x.]
Slide 99d.12
Guest Survey at a Hospitality Establishment
There are no hotels cheaper than $0 per night
In a guest survey you get left truncation
Slide 99d.13
Truncated Dependent Variables
Assume $y_i$ is observed only if $y_i > a$.
Detection of the observation is therefore subject to a selection process.
This is called truncation.
Slide 99d.14
The Truncation Line Is y > a
[Figure: y plotted against x, with a horizontal line at y = a; the region y > a is marked.]
Slide 99d.15
Here Is What We Observe
[Figure: y plotted against x.]
Slide 99d.16
Note That a New Line Has a Different Slope
[Figure: y plotted against x.]
Slide 99d.17
Another View
[Figure: y plotted against x.]
Slide 99d.18
Here Is Our Selection Process
[Figure: y plotted against x, with the region y > a marked.]
Slide 99d.19
Here Is the Part We Observe
[Figure: y plotted against x, with the region y > a marked.]
Slide 99d.20
Here Is the (Wrong) Line We Estimate
[Figure: y plotted against x, with the region y > a marked.]
Slide 99d.21
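The wrong line from a truncated sample can be reproduced in a small simulation. This is a sketch under assumed values: a true slope of 0.5, normally distributed noise, and a truncation point a = 3.5. None of these numbers come from the source. It fits ordinary least squares to the full sample and to the part with y > a.

```python
import random

def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(42)  # fixed seed so the run is repeatable
xs = [rng.uniform(0, 10) for _ in range(5000)]
ys = [1.0 + 0.5 * x + rng.gauss(0, 2) for x in xs]  # assumed true relationship

a = 3.5  # assumed truncation point: only y > a is observed
kept = [(x, y) for x, y in zip(xs, ys) if y > a]

slope_all = ols_slope(xs, ys)
slope_truncated = ols_slope([x for x, _ in kept], [y for _, y in kept])

print("slope on full sample:     ", round(slope_all, 3))
print("slope on truncated sample:", round(slope_truncated, 3))
```

The truncated-sample slope comes out flatter than the true one, because low-x observations survive the selection only when their noise term is large and positive.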
A General Survey Does Not Save You
In a general survey you get right censoring.
Assume that if $y_i \ge a$, then $y_i = a$.
All values above a are observed as a.
Slide 99d.22
True Relationship of x and Duration
[Figure: duration plotted against x, with horizontal lines at $T_i = 0$ and $T_i = a$.]
Each dependent value above the horizontal line will be redefined as equal to the line, i.e., y = a.
Slide 99d.23
True Relationship of x and Duration
[Figure: duration plotted against x, with horizontal lines at $T_i = 0$ and $T_i = a$.]
Each dependent value above the horizontal line will be redefined as equal to the line, i.e., y = a.
How will the bias work?
Slide 99d.24
Customer Relationship Duration
[Figure: relationships along a Time axis, with the Time of Study marked. Ongoing relationships are right-censored.]
Slide 99d.25
Relationship Duration Is Generally Right Censored
For right-censored individuals, we know only that $T_i \ge a$.
Because the study is done at a certain point in time, all our current customers are right censored.
Slide 99d.26
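In practice the censoring flag is built from each customer's start and end dates relative to the study date. The sketch below assumes hypothetical records with times in months and `end` set to `None` for ongoing relationships; the field names are invented for this example.

```python
def observed_duration(start, end, study_time):
    """Return (duration, censored) for one customer.

    `end` is None when the relationship is still ongoing at `study_time`.
    A relationship that ends after the study is also censored at the study time.
    """
    if end is None or end > study_time:
        return study_time - start, True
    return end - start, False

# Hypothetical customer records; times are months since the first sign-up.
customers = [
    {"id": "A", "start": 0, "end": 14},
    {"id": "B", "start": 3, "end": None},
    {"id": "C", "start": 10, "end": 22},
    {"id": "D", "start": 18, "end": None},
]
study_time = 24

records = [observed_duration(c["start"], c["end"], study_time) for c in customers]
for customer, (duration, censored) in zip(customers, records):
    print(customer["id"], duration, "censored" if censored else "ended")
```

For customers B and D the recorded duration is only a lower bound on the true duration.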
Proportional Hazards
$$h(t) = h_0(t)\, h_x(t)$$
$h_0(t)$: this part is constant for all individuals.
$h_x(t)$: this part is a function of individual x values. It adjusts $h_0$ up or down as a function of marketing instruments.
Slide 99d.27
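"Proportional" means that the ratio of two individuals' hazards does not depend on t. The sketch below assumes a Weibull-type baseline and an exponential covariate term with hypothetical coefficients. These are illustrative choices; the argument works for any baseline.

```python
import math

def baseline_hazard(t, lam=0.05, gamma=1.3):
    """h0(t): shared by all individuals (an assumed Weibull-type baseline)."""
    return lam * gamma * t ** (gamma - 1.0)

def covariate_multiplier(x, beta):
    """h_x: depends on the individual's covariates only, here exp(beta . x)."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))

def hazard(t, x, beta):
    """h(t) = h0(t) * h_x."""
    return baseline_hazard(t) * covariate_multiplier(x, beta)

beta = [-0.4, 0.2]       # hypothetical coefficients
member = [1.0, 2.0]      # hypothetical covariates: loyalty member, 2 promotions
non_member = [0.0, 2.0]  # same promotions, not a member

ratios = [hazard(t, member, beta) / hazard(t, non_member, beta)
          for t in (1.0, 6.0, 24.0)]
print(ratios)  # the same value at every t, equal to exp(-0.4)
```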
Parametric Hazard Models
We combine this model with "partial" maximum likelihood estimation. The partial likelihood is the probability that individual i had duration $T_i$, given that someone out of the group had duration $T_i$. This model gets us two benefits:
1. This partial likelihood is a ratio of individual likelihoods, so $h_0(t)$ cancels.
2. Information from censored observations is appropriately taken into account.
We generally use
$$h_x(t) = e^{\beta x_i}$$
Slide 99d.28
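Both benefits show up in a direct computation of the partial log-likelihood. This is a minimal sketch for a single covariate, assuming no tied durations and using made-up data. The baseline hazard never appears in the code, and censored individuals enter only through the risk sets of earlier events.

```python
import math

def partial_log_likelihood(beta, durations, events, xs):
    """Partial log-likelihood for one covariate, assuming no tied durations.

    Each observed event at time T_i contributes
    beta * x_i - log(sum of exp(beta * x_j) over everyone still at risk at T_i).
    """
    total = 0.0
    for i, (t_i, event_i) in enumerate(zip(durations, events)):
        if not event_i:
            continue  # censored: counted only inside other individuals' risk sets
        at_risk = [j for j, t_j in enumerate(durations) if t_j >= t_i]
        denominator = sum(math.exp(beta * xs[j]) for j in at_risk)
        total += beta * xs[i] - math.log(denominator)
    return total

# Hypothetical data: four customers, the second one is right censored.
durations = [2.0, 3.0, 5.0, 7.0]
events = [True, False, True, True]
xs = [1.0, 0.0, 1.0, 0.0]

for beta in (-1.0, 0.0, 1.0):
    print(beta, round(partial_log_likelihood(beta, durations, events, xs), 4))
```

At beta = 0 every individual at risk is equally likely to fail, so the value reduces to -(log 4 + log 2 + log 1) for the three events.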
Two Parametric Functional Forms
$$h(t) = h_0(t)\, h_x(t)$$
Exponential distribution: $h(t) = \lambda e^{\beta x_i}$
Weibull distribution: $h(t) = \lambda \gamma t^{\gamma - 1} e^{\beta x_i}$
Can we make the Exponential a special case of the Weibull?
Slide 99d.29
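One way to answer the question is to set the Weibull shape parameter to 1. The worked step below assumes the Weibull hazard is written with scale $\lambda$ and shape $\gamma$; those symbol names are an assumption of this note.

```latex
\[
h(t) = \lambda \gamma\, t^{\gamma - 1} e^{\beta x_i}
\quad\text{with } \gamma = 1:\quad
h(t) = \lambda \cdot 1 \cdot t^{0}\, e^{\beta x_i} = \lambda e^{\beta x_i}
\]
```

So the exponential model is the Weibull model with the shape restricted to 1, and the restriction can be checked by testing whether the estimated shape differs from 1.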
ML Estimation
$$\ln l = \sum_i \delta_i \ln f(T_i \mid \beta) + \sum_i (1 - \delta_i) \ln S(T_i \mid \beta)$$
where f is the density function and S is the survivorship function, $\Pr(T_i > t)$, with
$$\delta_i = \begin{cases} 1 & \text{for uncensored observations} \\ 0 & \text{for censored observations} \end{cases}$$
Slide 99d.30
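The log-likelihood is short enough to write out directly. This sketch assumes the simplest case, an exponential model with no covariates and made-up data, so that the maximizer has a closed form (events divided by total observed time) and can be checked against the function.

```python
import math

def log_likelihood(rate, durations, uncensored):
    """ln l = sum_i [delta_i * ln f(T_i) + (1 - delta_i) * ln S(T_i)], exponential model."""
    total = 0.0
    for t, delta in zip(durations, uncensored):
        log_f = math.log(rate) - rate * t  # ln f(t) for f(t) = rate * exp(-rate * t)
        log_s = -rate * t                  # ln S(t) for S(t) = exp(-rate * t)
        total += delta * log_f + (1 - delta) * log_s
    return total

# Hypothetical durations; delta = 1 means the relationship ended, 0 means censored.
durations = [4.0, 9.0, 12.0, 12.0, 2.0]
uncensored = [1, 1, 0, 0, 1]

# Closed-form maximizer for this model: number of events / total observed time.
rate_hat = sum(uncensored) / sum(durations)
# Treating censored durations as completed ones overstates the hazard.
naive_rate = len(durations) / sum(durations)

print("censoring-aware rate:", round(rate_hat, 4))
print("naive rate:          ", round(naive_rate, 4))
```

Censored observations contribute only ln S, which is how the information "this customer lasted at least this long" enters the estimate.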
SAS PROC LIFEREG
proc lifereg data=input-data-set;
   class nominal-var;
   model y*flag-var(1) = iv1 iv2 / distribution=weibull;
run;
flag-var: this variable tracks whether the observation is right censored or not.
(1): if flag-var is equal to this value, the observation is censored.
Slide 99d.31