CE 552 Week 9
Crash statistical approaches
Identification of problem areas - High crash locations

NCHRP 295
• The purpose of this synthesis was to summarize the current practice and research on statistical methods in highway safety analysis, useful for:
  – establishing relationships between crashes and associated factors
  – identifying locations for treatment
  – evaluating the safety effect of engineering improvements
• Also useful for driver and vehicle safety analyses

The Empirical Bayes Method for Before and After Analysis
Key Reference
Hauer, E., D.W. Harwood, F.M. Council, M.S. Griffith, “The Empirical Bayes method for estimating safety: A tutorial.” Transportation Research Record 1784, pp. 126-131. National Academies Press, Washington, D.C., 2002.
http://www.ctre.iastate.edu/educweb/CE552/docs/Bayes_tutor_hauer.pdf
Open this document and read through it as you go along in the PPT

Safety
• Definition: expected number of crashes, by type and severity
• What is expected cannot be known
• Must be estimated
• Precision of estimates is measured by the standard deviation
• Need a high frequency at a site to have a precise estimate

The “imprecision” problem …
• Assume 100 crashes per year and 3 years of data: we can reliably estimate the number of crashes per year, with a (Poisson) standard deviation of about … or 5.7% of the mean
• However, if there are relatively few crashes per time period (say, 1 crash per 10 years), the estimate varies greatly … or 180% of the mean!

Regression to the mean problem …
• High crash locations are chosen for one reason (high number of crashes!)
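A small simulation may make this selection effect concrete. The sketch below is illustrative only: the number of sites, the common true mean of 2 crashes per year, the selection threshold of 5, and the function names are all made-up assumptions, not values from the course material. Every simulated site has the same underlying safety, yet the sites flagged as "high crash" in one period tend to look much better in the next period with no treatment at all.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson-distributed count (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def simulate_rtm(n_sites=2000, true_mean=2.0, threshold=5, seed=1):
    """Return (before, after) mean counts for sites flagged as high-crash.

    All sites share the same true mean (an assumption of this sketch), so
    any drop between the two periods is pure regression to the mean and
    not a treatment effect.
    """
    rng = random.Random(seed)
    before = [poisson_draw(rng, true_mean) for _ in range(n_sites)]
    after = [poisson_draw(rng, true_mean) for _ in range(n_sites)]
    # "High crash locations" are chosen only because of a high before count.
    flagged = [i for i, count in enumerate(before) if count >= threshold]
    mean_before = sum(before[i] for i in flagged) / len(flagged)
    mean_after = sum(after[i] for i in flagged) / len(flagged)
    return mean_before, mean_after


if __name__ == "__main__":
    b, a = simulate_rtm()
    print(f"flagged sites: before = {b:.2f}, after = {a:.2f} crashes/yr")
```

A naive before/after comparison on the flagged sites would credit this drop to whatever treatment happened to be applied.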
• Even with no treatment, we would expect, on average, this high crash rate to decrease
• This needs to be accounted for, but often is not, e.g., when crash rate reductions after treatment are reported by comparing before and after rates over short periods

The Empirical Bayes Approach
• Empirical Bayes: an approach to before and after crash studies that attempts to estimate more accurately the real impact of high crash location countermeasures
• Simply comparing what occurred before with what happened after is naïve and potentially very inaccurate

The Empirical Bayes Approach
• Simple definition: compares what safety was with what safety would have been with no change
• Uses that comparison as the basis for estimating the real impact of a change

Empirical Bayes
• Increases precision
• Reduces RTM bias
• Uses information from the site, plus …
• Information from other, similar sites

Concept
• Mr. Smith had no crashes last year
• The average of similar drivers is 0.8 crashes per year
• How many crashes do we expect Mr. Smith to have next year … 0? 0.8? … neither! (pretend you would like to insure Mr. Smith)
• Answer … use both pieces of information and weight the expectation

EB applications
• IHSDM
• CHSIM (now Safety Analyst)

EB Procedures
• Abridged
  – Last 2-3 years of data
  – Traffic volume
• Full
  – Can use more data
  – Includes other factors

Empirical Bayes Procedure
• Actual change or improvement in safety = B − A
• Where:
  – B is the estimate of the expected number of crashes that would have occurred at the location in the after period without a change or safety project
  – A is the actual number of crashes reported in the after period

Empirical Bayes
• Weight should be based on sound logic and real data

Maryland Modern Roundabout Conversion Data (Five Locations 1994)

EB estimate of expected crashes    Actual after-period    Crash reduction estimate
without improvement (B)            crash count (A)        with treatment (B − A)
36.71                              14                     22.71 (62%)
24.63                              14                     10.63 (43%)
14.38                               2                     12.38 (86%)
14.33                              10                      4.33 (30%)
15.16                               4                     11.16 (74%)

Treatment effects often vary considerably!

The SPF – Safety Performance Function
• So what is the expected number of crashes for facilities of this type?
• Develop a (negative binomial) regression model to fit all the data – must have data to do this!
• An example SPF: μ = average crashes/km-yr (or /yr for intersections)
• So, if ADT = 4000
• Note: this SPF depends only on ADT … it needn’t

The overdispersion parameter
• The negative binomial is a generalized Poisson where the variance is larger than the mean (overdispersed)
• The “standard deviation-type” parameter of the negative binomial is the overdispersion parameter φ
• variance = η[1 + η/(φL)]
• Where …
  – μ = average crashes/km-yr (or /yr for intersections)
  – η = μYL (or μY for intersections) = number of crashes/time
  – φ = estimated by the regression (units must be complementary with L; for intersections, L is taken as one)

Example 1: How many crashes should we have expected last year???

Example 1: road segment, 1 yr. of data
• Is this distribution “over” dispersed???

Example 1: computing the weight
• What happens when Y is large (compared to φ/μ)?
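As a rough sketch of the weighting step, the functions below assume the form given in the cited Hauer et al. tutorial: weight = 1/(1 + μY/φ), with the EB estimate equal to weight × (crashes expected at similar sites) + (1 − weight) × (crashes counted at this site). The slide text does not spell these formulas out, so treat them as an assumption here. The values μ = 2.41 crashes/km-yr, a 1.8 km segment, and the counts 12, 7, 8 appear in the examples of this deck; φ = 2.05 per km is an assumed value, and the function names are hypothetical.

```python
def eb_weight(mu, years, phi):
    """Weight given to the SPF prediction: 1 / (1 + mu * years / phi).

    mu    -- SPF prediction in crashes per km-year (per year for intersections)
    years -- years of crash counts available at the site
    phi   -- overdispersion parameter (per km; length is 1 for intersections)
    """
    return 1.0 / (1.0 + mu * years / phi)


def eb_estimate(mu, years, length_km, phi, observed):
    """Blend the SPF prediction with the site's own count over `years`."""
    w = eb_weight(mu, years, phi)
    expected_similar = mu * years * length_km  # eta = mu * Y * L
    return w * expected_similar + (1.0 - w) * observed


if __name__ == "__main__":
    MU, PHI, LENGTH = 2.41, 2.05, 1.8  # PHI is an assumed value
    for years, observed in [(1, 12), (3, 12 + 7 + 8)]:
        w = eb_weight(MU, years, PHI)
        est = eb_estimate(MU, years, LENGTH, PHI, observed)
        print(f"Y = {years}: weight = {w:.3f}, EB estimate = {est:.2f}")
```

Under these assumptions the weight is about 0.46 with one year of data and about 0.22 with three, so the site's own count dominates as Y grows. The three-year estimate comes out near 23.9, which matches the figure this deck quotes from problem 2.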
• When μ is small compared to φ?

Example 1 (cont):

Example 2: 3 years of data: 12, 7, 8
• As before, 4000 vpd
• Note effect of more data
• Step 1:
• Step 2:
• Step 3:

Example 3: AMFs
• 1.2 meter shoulders (instead of 1.5)
• AMF = 1.04 (4% increase in crashes)
• Step 1: Why is weight lower?
• Step 2:
• Step 3:

Example 4: subsections
• Total length = 1.5 km, 11 crashes in 2 years

Example 5: Severity
• 2.41 × 0.019 = 0.046 … 1.8 × 3 × 0.046 = 0.247
• Note: φ stays the same (mult. dist. by constant)
• Note: 20.357 ≠ 23.9 (from prob. 2) … why? What is an ad hoc solution?
• Large for fatals (helps you not to “chase” them)

Example 6: intersection
• ADT = 4520, ADT = 230
• AMF = 1.27
• 7 crashes in 3 years
• Step 1:
• Step 2:
• Step 3:

Example 7: group of intersections
• 11 crashes in 3 years
• Step 1:
• Step 2: (simplistic) However, not clear what to use

Example 7: (cont)

Example 7: (cont)
• Step 3: using w = 0.088
• Why so much confidence in the actual number?
• Is it because we have 3 yrs of data?
• Is it because 11 is smaller than 20.7?
• What would happen if 11 had been, say, 32?

The full procedure

Example 8
• 1.8 km, 9 yrs.
• Unchanged road
• ADT varies, AMF = 0.95, 74 total crashes
• μ =

Example 8, cont.
• (If all μ are equal)
• Why so small???

Example 9: Secular Trends
• Yearly multipliers can be used like AMFs to account for weather, technology changes (must be able to get them)

Example 10: Projections
• Projections can be made by using a simple ratio of ADTs (raised to the appropriate power) multiplied by the corresponding ratio of AMFs or yearly multipliers

Some thought questions
• Does EB eliminate RTM as stated?
• What happens if the SPF is not appropriate for your site?
• What does appropriate mean?

Software for Homework
• You will need some software to develop the NB regression model for your SPF – that is the “R project” program. Investigate that now (see HW).
• http://www.r-project.org/ (info on “R”)
• http://cran.mtu.edu/bin/windows/base/R-2.8.1win32.exe (download “R”)
• R for Windows FAQ

Dr. Souleyrette, May I be excused? My brain is full.
Gary Larson, The Far Side, ©1986
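Returning to Example 10, the projection rule can be sketched as a simple scaling. This is an illustration under assumptions: the function name, the exponent of 0.5, and the crash, ADT and AMF values in the usage line are hypothetical, chosen only to show the mechanics. The appropriate power would be whatever exponent the fitted SPF places on ADT.

```python
def project_expected_crashes(estimate, adt_now, adt_future, adt_power,
                             amf_now=1.0, amf_future=1.0):
    """Scale an expected crash count to a future period.

    Multiplies the current estimate by (adt_future / adt_now) ** adt_power
    and by the ratio of AMFs (or of yearly multipliers).
    """
    adt_ratio = (adt_future / adt_now) ** adt_power
    return estimate * adt_ratio * (amf_future / amf_now)


if __name__ == "__main__":
    # Hypothetical inputs: 8 crashes/yr today, ADT growing from 4000 to 4400.
    print(project_expected_crashes(8.0, 4000, 4400, adt_power=0.5))
```

Keeping the exponent as an explicit argument avoids hard-coding a value that belongs to one particular SPF.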