CE 552 Week 9
Crash statistical approaches
Identification of problem areas - High crash locations
• The purpose of the synthesis was to summarize current practice and research on statistical methods in highway safety analysis, useful for:
– establishing relationships between crashes and associated factors
– identifying locations for treatment
– evaluating the safety effect of engineering improvements
– Also useful for driver and vehicle safety analyses
The Empirical Bayes Method for Before and After Analysis
Key Reference
Hauer, E., D.W. Harwood, F.M. Council, M.S. Griffith, “The Empirical Bayes method for estimating safety: A tutorial.” Transportation Research Record 1784, pp. 126-131. National Academies Press, Washington, D.C., 2002.
Open this document and read through it as you go along with the slides.
Definition: expected number of crashes, by type and severity
• What is expected cannot be known
• Must be estimated
• Precision of estimates is measured by the standard deviation
• Need a high crash frequency at a site to have a precise estimate
The “imprecision” problem …
Assume 100 crashes per year and 3 years of data: we can reliably estimate the number of crashes per year, with a (Poisson) standard deviation of about…
or 5.7% of the mean
However, if there are relatively few crashes per time period (say, 1 crash per 10 years), the estimate varies greatly …
or 180% of the mean!
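The two percentages above can be checked with a short calculation. This is a sketch that assumes crash counts are Poisson (so the total over the study period has variance equal to its mean), that the yearly rate is estimated as the total count divided by the number of years, and that the low-frequency case also uses 3 years of data. The function name is illustrative.

```python
import math

def relative_sd_of_rate(crashes_per_year: float, years: float) -> float:
    """Standard deviation of the estimated yearly rate, as a fraction of the mean.

    Assumes a Poisson count: the total over `years` has mean and variance
    crashes_per_year * years, and the rate estimate is total / years.
    """
    expected_total = crashes_per_year * years
    sd_of_rate = math.sqrt(expected_total) / years
    return sd_of_rate / crashes_per_year

busy = relative_sd_of_rate(100, 3)    # 100 crashes/yr, 3 years of data
quiet = relative_sd_of_rate(0.1, 3)   # 1 crash per 10 years, 3 years of data
print(f"busy site:  {busy:.1%} of the mean")
print(f"quiet site: {quiet:.1%} of the mean")
```

Under these assumptions the ratios come out near 0.058 and 1.83, in line with the roughly 5.7% and 180% quoted on the slide.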
Regression to the mean problem …
• High crash locations are chosen for one reason (high number of crashes!)
• Even with no treatment, we would expect this high crash rate, on average, to decrease
• This needs to be accounted for but often is not, e.g., when crash rate reductions after treatment are reported by comparing before and after rates over short periods
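Regression to the mean can be seen in a small simulation. The sketch below is not from the slides: the site population (gamma-distributed true means averaging 2 crashes per period), the selection threshold of 6, and the function names are all hypothetical. No site is treated, yet the sites picked for their high "before" counts show fewer crashes "after".

```python
import math
import random

def poisson(rng: random.Random, mean: float) -> int:
    """Draw one Poisson variate with Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def simulate_rtm(n_sites: int = 5000, threshold: int = 6, seed: int = 1):
    """Return (mean before, mean after) counts at sites selected for high 'before' counts.

    No site is treated: each site's true mean is the same in both periods.
    """
    rng = random.Random(seed)
    # Hypothetical population: true means vary between sites (gamma, mean 2).
    true_means = [rng.gammavariate(2.0, 1.0) for _ in range(n_sites)]
    before = [poisson(rng, m) for m in true_means]
    after = [poisson(rng, m) for m in true_means]
    selected = [i for i in range(n_sites) if before[i] >= threshold]
    mean_before = sum(before[i] for i in selected) / len(selected)
    mean_after = sum(after[i] for i in selected) / len(selected)
    return mean_before, mean_after

mean_before, mean_after = simulate_rtm()
print(f"selected sites, before: {mean_before:.2f} crashes")
print(f"selected sites, after:  {mean_after:.2f} crashes (no treatment applied)")
```

A naive before-and-after comparison at these sites would credit the drop to a treatment that never happened.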
The Empirical Bayes Approach
Empirical Bayes: an approach to before and after crash studies that attempts to estimate more accurately the real impact of high crash location countermeasures
• Simply comparing what occurred before with what happened after is naïve and potentially very inaccurate
The Empirical Bayes Approach
Simple definition:
• Compares what safety was versus what safety would have been with no change.
• Uses that comparison as the basis for making an estimate of the real impact of a change.
Empirical Bayes
• Increases precision
• Reduces regression-to-the-mean (RTM) bias
• Uses information from the site, plus …
• Information from other, similar sites
• Mr. Smith had no crashes last year
• The average of similar drivers is 0.8 crashes per year
• What number of crashes do we expect Mr. Smith to have next year … 0? 0.8? … neither! (pretend you would like to insure Mr. Smith)
• Answer … use both pieces of information and weight the expectation
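The weighting idea can be sketched in a few lines. The slide does not give the weight for Mr. Smith, so the weights below are hypothetical and only show that any weight strictly between 0 and 1 gives an answer between 0 and 0.8. The function name is illustrative.

```python
def weighted_expectation(own_count: float, group_mean: float, weight: float) -> float:
    """Blend the reference-group mean with the individual's own record.

    `weight` is the share given to the group mean (0 to 1); the rest goes
    to the individual's own count.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return weight * group_mean + (1.0 - weight) * own_count

# Mr. Smith: 0 crashes last year; similar drivers average 0.8.
# The weights are hypothetical, chosen only to show the range of answers.
for w in (0.25, 0.5, 0.9):
    print(w, weighted_expectation(own_count=0, group_mean=0.8, weight=w))
```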
EB applications
• CHSIM (now Safety Analyst)
EB Procedures
• Last 2-3 years of data
• Traffic volume
• Can use more data
• Includes other factors
Empirical Bayes Procedure
Actual change or improvement in safety = B − A
• Where:
– B is the estimate of the expected number of crashes that would have occurred in a location in the before time period without a change or safety project
– A is the actual number of crashes reported in the after period
Empirical Bayes
Weight should be based on sound logic and real data
Maryland Modern Roundabout Conversion Data (Five Locations, 1994)
[Table columns: Empirical Bayes estimate of expected crashes (B); after-period crash count (A); improvement estimate. Recovered improvement values, one per location:]
22.71 (62%)
10.63 (43%)
12.38 (86%)
4.33 (30%)
11.16 (74%)
Treatment effects often vary considerably!
The SPF – Safety Performance Function
So what is the expected number of crashes for facilities of this type? Develop a (negative binomial) regression model to fit all the data – must have data to do this!
An example SPF:
μ=average crashes/km-yr (or /yr for intersections)
So, if ADT = 4000
Note: this SPF depends only on ADT … it needn’t
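The example SPF equation itself did not survive on this slide. The sketch below assumes the power-function form and coefficients used in the Hauer et al. tutorial cited earlier (μ = 0.0224 × ADT^0.564 crashes/km-yr); treat both as assumptions. They do reproduce the 2.41 crashes/km-yr figure that appears in Example 5.

```python
def spf_crashes_per_km_year(adt: float, a: float = 0.0224, b: float = 0.564) -> float:
    """Assumed SPF: mu = a * ADT**b, in crashes per km-year.

    The coefficients a and b are assumptions taken from the cited tutorial,
    not values stated on the slide.
    """
    return a * adt ** b

mu = spf_crashes_per_km_year(4000)
print(f"mu at ADT = 4000: about {mu:.2f} crashes/km-yr")
```

An SPF fitted to local data would replace `a` and `b`, and could include variables other than ADT.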
The overdispersion parameter
• The negative binomial is a generalized Poisson where the variance is larger than the mean (overdispersed)
• The “standard deviation-type” parameter of the negative binomial is the overdispersion parameter φ
• variance = η[1+η/(φL)]
• Where …
– μ = average crashes/km-yr (or /yr for intersections)
– η = μYL (or μY for intersections) = number of crashes/time, with Y the number of years and L the segment length in km
– φ = estimated by the regression (units must be complementary with L; for intersections, L is taken as one)
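The variance expression can be evaluated directly. The inputs below are illustrative assumptions (μ = 2.41 crashes/km-yr, one year, a 1.8 km segment, φ = 2.05 per km) and are not all stated on this slide. The sketch shows the variance exceeding the mean, and approaching the Poisson case as φ grows.

```python
def nb_variance(mu: float, years: float, phi: float, length_km: float = 1.0) -> float:
    """variance = eta * (1 + eta / (phi * L)), with eta = mu * Y * L."""
    eta = mu * years * length_km
    return eta * (1.0 + eta / (phi * length_km))

# Illustrative inputs (assumptions, see the note above).
mu, years, length_km, phi = 2.41, 1, 1.8, 2.05
eta = mu * years * length_km
variance = nb_variance(mu, years, phi, length_km)
print(f"mean (eta) = {eta:.2f}, variance = {variance:.2f}")

# With a very large phi the extra term vanishes and variance is close to the mean.
print(f"variance with phi = 1e12: {nb_variance(mu, years, 1e12, length_km):.2f}")
```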
Example 1:
How many crashes should we have expected last year???
Example 1: road segment, 1 yr. of data
Is this distribution “overdispersed”?
Example 1: computing the weight
What happens when Y is large (compared to φ/μ)? When μ is small compared to φ?
Example 1 (cont):
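The weight equation is missing from these slides. Assuming the form given in the Hauer et al. tutorial, w = 1/(1 + μY/φ), which matches the η/(φL) = μY/φ term in the variance expression, the two questions above can be explored numerically. The values of μ and φ are illustrative assumptions.

```python
def eb_weight(mu: float, years: float, phi: float) -> float:
    """Weight given to the SPF prediction: 1 / (1 + mu * Y / phi) (assumed form)."""
    return 1.0 / (1.0 + mu * years / phi)

MU, PHI = 2.41, 2.05  # assumed illustrative values (crashes/km-yr, per km)

# More years of data: smaller weight on the SPF, more on the site's own count.
for years in (1, 3, 10, 30):
    print(f"Y = {years:>2}: w = {eb_weight(MU, years, PHI):.3f}")

# Small mu relative to phi: weight near 1, so the SPF prediction dominates.
for mu in (2.41, 0.5, 0.05):
    print(f"mu = {mu}: w = {eb_weight(mu, 1, PHI):.3f}")
```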
Example 2:
3 years of data: 12, 7, 8
As before
• 4000 vpd
Note effect of more data
• Step 1:
• Step 2:
• Step 3:
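The three steps for Example 2 can be sketched as follows. The step equations are missing from the slides, so this assumes the SPF coefficients and φ = 2.05 per km from the Hauer et al. tutorial, the weight form w = 1/(1 + μY/φ), and the 1.8 km segment length that appears in Example 5. Under those assumptions the result agrees with the 23.9 quoted later for this problem.

```python
def eb_estimate(counts, length_km, adt, phi=2.05, a=0.0224, b=0.564, amf=1.0):
    """Return (weight, EB estimate) for a road segment.

    Step 1: SPF prediction per km-yr, adjusted by an optional AMF.
    Step 2: weight w = 1 / (1 + mu * Y / phi).
    Step 3: estimate = w * (SPF expectation) + (1 - w) * (observed count).
    Defaults for phi, a and b are assumptions, not values from the slide.
    """
    years = len(counts)
    mu = amf * a * adt ** b
    weight = 1.0 / (1.0 + mu * years / phi)
    expected = mu * years * length_km
    estimate = weight * expected + (1.0 - weight) * sum(counts)
    return weight, estimate

weight, estimate = eb_estimate(counts=[12, 7, 8], length_km=1.8, adt=4000)
print(f"w = {weight:.3f}, EB estimate = {estimate:.1f} crashes in 3 years")
```

With three years of data the weight on the SPF is smaller than it would be with one year, so the estimate sits closer to the observed total of 27.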
Example 3: AMFs (accident modification factors)
• 1.2 meter shoulders (instead of 1.5)
• AMF = 1.04 (4% increase in crashes)
• Step 1:
Why is weight lower?
• Step 2:
• Step 3:
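One way to see why the weight drops is to apply the AMF to the SPF prediction and recompute. This sketch assumes μ = 2.41 crashes/km-yr, φ = 2.05 per km, a 1.8 km segment, 27 crashes in 3 years, and the weight form w = 1/(1 + μY/φ); none of the step equations survived on the slide, so these are assumptions rather than the slide's own working.

```python
def eb_segment(mu, years, length_km, observed, phi=2.05):
    """Return (weight, EB estimate) for a segment, using the assumed weight form."""
    weight = 1.0 / (1.0 + mu * years / phi)
    expected = mu * years * length_km
    return weight, weight * expected + (1.0 - weight) * observed

MU_BASE = 2.41   # crashes/km-yr before adjustment (assumed)
AMF = 1.04       # from the slide: narrower shoulders, 4% more crashes

w_base, est_base = eb_segment(MU_BASE, years=3, length_km=1.8, observed=27)
w_adj, est_adj = eb_segment(AMF * MU_BASE, years=3, length_km=1.8, observed=27)

print(f"without AMF: w = {w_base:.3f}, estimate = {est_base:.1f}")
print(f"with AMF:    w = {w_adj:.3f}, estimate = {est_adj:.1f}")
```

A larger μ makes μY/φ larger, so the weight on the SPF falls slightly and the observed count gets a little more influence.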
Example 4: subsections
Total length = 1.5km, 11 crashes in 2 years
Example 5: Severity
2.41 × 0.019 = 0.046 … 1.8 × 3 × 0.046 = 0.247
Note: φ stays the same (multiplying the distribution by a constant)
Note: 20.357 ≠ 23.9 (from Example 2) … why? What is an ad hoc solution?
Large for fatals (helps you not to “chase” them)
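The arithmetic on this slide, and the likely meaning of "large for fatals", can be checked with a short sketch. It takes 0.019 to be the share of crashes in the severity class in question (fatal, judging by the note above), and assumes φ = 2.05 per km and the weight form w = 1/(1 + μY/φ); the slide states neither.

```python
MU_ALL = 2.41          # crashes/km-yr, all severities (from the slide)
SEVERITY_SHARE = 0.019 # proportion in this severity class (from the slide)
LENGTH_KM, YEARS = 1.8, 3
PHI = 2.05             # assumed overdispersion parameter, per km

mu_severity = MU_ALL * SEVERITY_SHARE
expected_severity = LENGTH_KM * YEARS * mu_severity

# Assumed weight form; phi is kept the same, as the slide notes.
weight_severity = 1.0 / (1.0 + mu_severity * YEARS / PHI)
weight_all = 1.0 / (1.0 + MU_ALL * YEARS / PHI)

print(f"mu for this severity = {mu_severity:.3f} crashes/km-yr")
print(f"expected in 3 years on 1.8 km = {expected_severity:.3f}")
print(f"weight: this severity = {weight_severity:.3f}, all crashes = {weight_all:.3f}")
```

Because μ is so small for this class, the weight on the SPF is close to 1, so a single rare crash at a site moves the estimate very little.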
Example 6: intersection
AMF = 1.27
7 crashes in 3 years
Step 1:
Step 2:
Step 3:
Example 7: group of intersections
11 crashes in 3 years
Step 1:
Step 2: (simplistic)
However, not clear what to use
Example 7: (cont)
Example 7: (cont)
Step 3: using w=0.088,
Why so much confidence in the actual number?
Is it because we have 3 yrs of data?
Is it because 11 is smaller than 20.7?
What would happen if 11 had been, say, 32?
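The last question can be answered by plugging numbers into the weighted average. This sketch takes 20.7 to be the SPF-based expectation for the group, which is an interpretation of the slide rather than something it states outright.

```python
def eb_blend(weight: float, spf_expected: float, observed: float) -> float:
    """EB estimate: weight on the SPF expectation, remainder on the observed count."""
    return weight * spf_expected + (1.0 - weight) * observed

W = 0.088            # weight from the slide
SPF_EXPECTED = 20.7  # taken here to be the SPF-based expectation (assumption)

for observed in (11, 32):
    estimate = eb_blend(W, SPF_EXPECTED, observed)
    print(f"observed = {observed}: EB estimate = {estimate:.2f}")
```

With w = 0.088 the estimate follows the observed count closely in both cases, whether the count is below or above the SPF expectation.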
The full procedure
Example 8
• 1.8 km, 9 yrs. Unchanged road
• ADT varies, AMF = 0.95, 74 total crashes
Example 8, cont.
(If all μ are equal)
Why so small???
Example 9: Secular Trends
Yearly multipliers can be used like AMFs to account for weather, technology changes (must be able to get them)
Example 10: Projections
Projections can be made by using a simple ratio of ADTs (raised to the appropriate power) multiplied by the corresponding ratio of AMFs or yearly multipliers
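The projection rule can be written as a one-line function. All inputs in the example call are hypothetical, including the ADT exponent, which would come from the fitted SPF in practice. The function and parameter names are illustrative.

```python
def project_crashes(current_estimate: float, adt_now: float, adt_future: float,
                    adt_exponent: float, amf_now: float = 1.0,
                    amf_future: float = 1.0) -> float:
    """Scale a current estimate by the ADT ratio (to a power) and the AMF ratio.

    The AMF arguments can equally hold yearly multipliers.
    """
    return (current_estimate
            * (adt_future / adt_now) ** adt_exponent
            * (amf_future / amf_now))

# Hypothetical inputs: 8 crashes/yr now, ADT growing from 4000 to 5000,
# ADT exponent 0.564 (assumed), no change in AMF.
projected = project_crashes(current_estimate=8.0, adt_now=4000,
                            adt_future=5000, adt_exponent=0.564)
print(f"projected crashes/yr: {projected:.2f}")
```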
Some thought questions
• Does EB eliminate RTM as stated?
• What happens if the SPF is not appropriate for your …
• What does appropriate mean?
Software for Homework
You will need some software to develop the NB regression model for your SPF – that is the “R project” program. Investigate that now (see HW).
• (info on “R”)
• (download “R”)
• R for Windows FAQ
Dr. Souleyrette, May I be excused? My brain is full.
Gary Larson, The Far Side, ©1986