THE SAS APPLICATION FOR ROAD SAFETY ANALYSIS

Marzena Nowakowska, Laboratory of Computer Science, Technical University of Kielce, Poland

1. INTRODUCTION

In Poland, details about each road accident are recorded by the police in a standard format called the Road Accident Card. The Card consists of a number of pieces of information that identify a variety of accident features, such as location, kind of accident, number of victims, and the number and types of vehicles and other road users involved. This information is collected and stored in computer accident database files for use by local, regional or national authorities. Each record in a database file has the same structure as the Road Accident Card. On the basis of this information, research on road traffic safety can be conducted by applying modern computational techniques.

Developing a road safety appraisal method is a complex task, as it involves an approach to multifactor accident events and the recognition of the interaction between different road, vehicle and road user aspects. One of the most important elements of the analysis is the identification of blackspots. A blackspot is a section of a road with high accident risk. Once blackspots have been identified, a traffic engineer can concentrate on the technical details of combining suitable remedial countermeasures to improve traffic safety.

The SAS System (Statistical Analysis System®) has turned out to be very useful in various research works on road traffic safety. This paper describes a SAS software application that enables the analysis and processing of accident databases in the investigation of accident occurrence patterns, in order to evaluate the level of road safety and to identify blackspot sections of a road. The application works in the Windows environment.
The SAS system was chosen for the following reasons:
- it can run on a variety of engines,
- there are no data size constraints, except for those of the hardware configuration,
- it offers a very wide range of statistical procedures as well as graphical tools,
- the data step and macro programming languages greatly ease database projection and selection,
- SAS/EIS®, SAS/FSP® and SAS/AF® software enable us to provide a flexible, interactive and user-friendly interface for the analysis.

Practical application facilities, which allow researchers to explore quickly a number of combinations of safety features, significantly enhance the quality of the analysis.

2. THE MODEL OF INFERENCE

The accident data are subject to stochastic variations as a consequence of the random nature of accidents, and they are scarce. Taking these facts into account, and consequently the uncertainty of statistical inference, the Bayesian theory has been adopted to do the research. This kind of statistical approach is used in safety analysis [2, 3, 6].

The idea of uncertainty brings together the randomness of an event X and the doubt as to the values of the parameters of the model that describes this randomness. Using the Bayesian approach it is possible to define the probability distribution of such a variable X with respect to the fact that the parameters of its distribution are also random variables. The new distribution $h(x)$, known as a compound distribution, can be written in the following form:

$$h(x) = \int_{\Lambda} g(x \mid \lambda)\, k(\lambda)\, d\lambda \qquad (1)$$

where:
$g(x \mid \lambda)$ is the conditional distribution of X, given the parameter $\lambda$,
$k(\lambda)$ is the distribution of $\lambda$.

The distribution (1) is also called a Bayesian distribution. The random variables considered here, which can stand for X, are the following accident events that occur along a certain section of a road: the number of accidents, the number of victims and the number of involved vehicles.
The numbers of accident events in different sections are independent, and each follows a Poisson distribution with an annual intensity parameter $\lambda$. The value of $\lambda$ follows a gamma distribution that characterises all road sections. The combination of the Poisson distribution for X given the intensity $\lambda$ and a gamma distribution for the parameter $\lambda$ is a negative binomial distribution $h(x)$ [4]. It describes the distribution of the number of events over the road sections. The parameters of this distribution depend on the parameters of the gamma distribution. Their evaluation is straightforward, as the Poisson and gamma distributions are natural conjugates [5].

If the intensity $\lambda$ has a prior distribution $k(\lambda)$ and the probability of x events given $\lambda$ is described by the conditional distribution $g(x \mid \lambda)$, then the posterior distribution $k^{*}(\lambda \mid x)$ is given by:

$$k^{*}(\lambda \mid x) = \frac{g(x \mid \lambda)\, k(\lambda)}{\int_{\Lambda} g(x \mid \tau)\, k(\tau)\, d\tau} \qquad (2)$$

The posterior distribution $k^{*}(\lambda \mid x)$, obtained from the modifying factor $g(x \mid \lambda)$ defined for each individual road section, is used in defining the blackspot classification criterion.

For the sections of a road, two sources of information can be used to describe the proposed accident-proneness model. The first one, $h(x)$, is the distribution (1) that describes the randomness of the number of events over these sections. The second, $h^{*}(y \mid x)$, is the distribution of the number of events Y which can occur at those road sections where x events have been recorded. The $h^{*}(y \mid x)$ function is defined by putting the posterior distribution (2) in the place of $k(\lambda)$ in formula (1).

The level of safety varies between different roads and road sections. Therefore, to describe the change in accident occurrence for each road section, the value of the cumulative probability of $h^{*}(y \mid x)$, calculated for the median of the $h(x)$ distribution, has been proposed.
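The Poisson-gamma model above can be illustrated with a minimal Python sketch. It is not the author's SAS code. It assumes that the gamma prior is parameterised by a shape and a rate, that the observed count x covers one unit observation period, and that the parameter values used in the example are purely hypothetical; all function names are illustrative. Under these assumptions the compound distribution (1) is negative binomial, the posterior (2) is again a gamma distribution with shape + x and rate + 1, and the quantity of interest is the posterior probability of exceeding the prior median.

```python
import math


def neg_binomial_pmf(x, shape, rate):
    """Compound distribution (1): Poisson(lam) mixed over lam ~ Gamma(shape, rate)."""
    log_p = (math.lgamma(x + shape) - math.lgamma(shape) - math.lgamma(x + 1)
             + shape * math.log(rate / (rate + 1.0))
             - x * math.log(rate + 1.0))
    return math.exp(log_p)


def neg_binomial_cdf(x, shape, rate):
    """P(X <= x) for the compound distribution."""
    return sum(neg_binomial_pmf(i, shape, rate) for i in range(x + 1))


def prior_median(shape, rate):
    """Smallest m with P(X <= m) >= 0.5 under the prior compound distribution."""
    m = 0
    cumulative = neg_binomial_pmf(0, shape, rate)
    while cumulative < 0.5:
        m += 1
        cumulative += neg_binomial_pmf(m, shape, rate)
    return m


def exceedance_probability(x_observed, shape, rate):
    """P(Y > prior median | x) under the posterior compound distribution.

    Conjugate update (assumed unit exposure): Gamma(shape, rate) prior and a
    Poisson count x give a Gamma(shape + x, rate + 1) posterior.
    """
    m = prior_median(shape, rate)
    post_shape = shape + x_observed
    post_rate = rate + 1.0
    return 1.0 - neg_binomial_cdf(m, post_shape, post_rate)


if __name__ == "__main__":
    shape, rate = 2.0, 0.5  # hypothetical prior parameters
    for x in (1, 4, 10):
        print(x, round(exceedance_probability(x, shape, rate), 4))
```

A section with many recorded events gets an exceedance probability close to 1, while a section with few events gets a value well below 0.5, which is what makes the quantity usable as a classification input.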
The median is considered as a threshold value that divides the range of possible event numbers along a section into two parts, so that the prior probabilities are 0.5 for each part. The posterior probabilities of $h^{*}(y \mid x)$ were used in the blackspot identification process.

3. THE PARTS OF DATA PROCESSING

The analysis is conducted on the basis of data collected from the rural roads of two neighbouring provinces during the last 3-year period. Only accidents with human victims are on record; the others, such as vehicle damage collisions, are not included. A packet of programs has been worked out in which the SAS language, SAS Elementary Statistics Procedures, SAS/GRAPH software and SAS/STAT software were extensively used to carry out the essential tasks. These are:
- regrouping and transforming the basic data to the form required by the application,
- selecting road sections with an "accident problem",
- evaluating the needed distribution parameters and then calculating the chosen characteristics of road accidents,
- identifying high-risk accident sections for further in-depth engineering analysis,
- presenting the final results.

Pic. 1. The main menu of the application for road safety analysis.

In the majority of cases the output of one or more programs is the input to another program; the SAS macro language was an excellent tool to enable a fluent flow of information and to join sequences of program tasks. The SAS software application for road safety analysis utilizes this packet of programs in the form of an interactive interface for the user. To build the application, SAS/AF, SAS/FSP, SAS/EIS and SAS Screen Control Language (SCL) were used. The first screen of the application is presented in picture 1.

The main parts of the application are: START-UP, ROAD SECTIONS selection and BLACKSPOTS identification. They were designed using the BUILD procedure and the features of FRAME entries, which considerably simplified the SCL programs for these parts.
At every stage of running the application, the HELP option is offered to clarify the current problem.

3.1. THE START-UP PART

To perform the analysis, road sections with an "accident problem" should be determined. In the present context these are sections where accidents are grouped; they have a specified allowable maximum length and a specified allowable minimum number of accidents along a section. However, according to the suggestions of a local road safety authority, these values should fall within certain intervals. The interval for the number of accidents is <4, ∞) and the interval for the section length is <0, 4> km. The default values are 4 and 4.0 respectively. The user can easily redefine these defaults; see picture 2.

Pic. 2. Starting up the calculations for the road safety analysis.

In the START-UP part, all the SAS language programs that take the longest time to prepare the data for the other parts of the application are run. Thus, the most time-consuming algorithms are processed once for the given initial defaults. When the OK push button is selected, the set of information about all roads from the area is investigated, resulting in new SAS data sets. Each of them contains the accident data from an individual road. Thus the number of new data sets is equal to the number of roads in the area under investigation. These sets become the essential source of information on the basis of which sections with an accident problem are then determined. This is the most important part of the whole application. Thanks to the iterative statements of Screen Control Language, a group of important macros can be called for each road. Here the index variable is a road identification number taken from a suitable SAS data set using the utilities of the Widget classes. The crucial point in the macro calculations is the CLUSTER procedure from SAS/STAT software; the variable used in the clustering algorithm is the road kilometreage at which an accident has occurred.
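The grouping of accidents into candidate sections can be illustrated with a minimal Python sketch. It is a stand-in for the idea, not a reproduction of the CLUSTER procedure or of the author's macros. The paper does not state which linkage method was used; complete linkage is assumed here because, for one-dimensional kilometreage data, the complete-linkage distance between two neighbouring clusters equals the length of the merged section, so stopping the merging at the maximum length bounds every section directly. The function name and the sample kilometreage values are hypothetical.

```python
def find_problem_sections(kilometreage, max_length_km=4.0, min_accidents=4):
    """Group accident locations along one road into candidate sections.

    Agglomerative clustering of 1-D points with complete linkage (assumed):
    neighbouring clusters are merged, shortest merged span first, until the
    next merge would exceed max_length_km. Sections with fewer than
    min_accidents accidents are discarded.
    Returns a list of (start_km, end_km, accident_count) tuples.
    """
    clusters = [[km] for km in sorted(kilometreage)]
    while len(clusters) > 1:
        # Length of the section that each pair of neighbours would form.
        spans = [clusters[i + 1][-1] - clusters[i][0]
                 for i in range(len(clusters) - 1)]
        i = min(range(len(spans)), key=spans.__getitem__)
        if spans[i] > max_length_km:
            break  # every remaining merge would be too long
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return [(c[0], c[-1], len(c)) for c in clusters if len(c) >= min_accidents]


if __name__ == "__main__":
    # Hypothetical accident locations (km) along a single road.
    accidents_km = [1.0, 1.5, 2.0, 2.2, 10.0, 10.1,
                    25.0, 25.5, 26.0, 27.0, 28.5]
    for start, end, count in find_problem_sections(accidents_km):
        print(f"section {start:.1f}-{end:.1f} km: {count} accidents")
```

The defaults mirror the values quoted in the text (at least 4 accidents, at most 4.0 km); both are ordinary arguments, in the same spirit as the user-adjustable defaults of the START-UP screen.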
The macros also enable the selection and projection of SAS data sets as well as the calculation of new variables to obtain the accident characteristics for a road section.

3.2. THE SELECTION OF ROAD SECTIONS WITH ACCIDENT PROBLEM

In the second part of the application the user chooses a road for investigation. As a result, a SAS data set created in the START-UP part is opened. It contains the information about the sections of the chosen road that have an accident problem. From these data, statistical accident characteristics are calculated and then presented in the form shown in picture 3.

Pic. 3. The information on road sections with accident problem.

It is also possible to view detailed information about every road section, such as the starting and ending kilometreage, the total number of killed and injured, and the number of involved vehicles together with their type classification. The SAS procedures FSVIEW and FSBROWSE made it possible to present the results either in full-screen mode or in one-record-at-a-time mode with a carefully redesigned display. Every time the user chooses a road, a new sequence of calculations is carried out.

3.3. BLACKSPOTS IDENTIFICATION

The traffic volume along a road affects the number of sections with an accident problem; there were several times more such sections on an interregional road than on a local one. In the BLACKSPOTS identification part, roads with at least 5 sections are analysed. For roads where sufficient data are not available, a broader grouping is used by putting their sections together into one SAS data set. To determine whether an accident problem section should be identified as a blackspot, not only the number of accidents but also the numbers of victims and involved vehicles were taken into account. Hence several random variables are investigated in the analysis.
The probabilities that these variables exceed their respective prior median values can be a significant issue in the inference process. These probabilities form a vector to be used in the classification process, which is again carried out by cluster analysis. The nearest centroid sorting method implemented in the FASTCLUS procedure was applied. To obtain a binary grouping (below or above the median), two initial cluster seeds are specified; these are the vectors with the minimum and maximum lengths respectively. A classified section can be indicated as a hazardous one (a blackspot) if it belongs to the cluster that has the "heavier" seed vector. It should be underlined here that the sections from the other set can in no way be treated as safe or without accident risk.

Pic. 4. The detailed information about blackspot diagnosis.

As in the ROAD SECTIONS part, the user chooses a road for which blackspots are to be identified. As a result, the total number of blackspots is displayed. In addition, for each road section, detailed information about the classification vector values and the blackspot diagnosis can be presented on request. An example of the one-at-a-time presentation is shown in picture 4.

4. REMARKS AND CONCLUSIONS

The presented SAS software application easily gives the required results, provided the initial information is satisfactory. On the one hand, each blackspot section can be considered individually in an in-depth safety analysis, while on the other hand blackspots can be investigated together to reveal accident similarities. Initial results obtained for an experimental database have shown that the application is relevant and useful. New information can easily be added to the databases, and the output is updated accordingly.
At every stage of building the application, many possibilities offered by the SAS system were exploited in order to obtain suitable subsets of the basic SAS data set: data set options, filtering operations (IF, WHERE), BY-group processing, and combining SAS data sets (MERGE, APPEND). The algorithms of fundamental importance for the analysis could be worked out thanks to the following procedures: MEANS, CLUSTER, FASTCLUS, TREE, CAPABILITY, FREQ. The organizing and auxiliary procedures SORT, DATASETS and CONTENTS considerably eased and sped up the work. The macro language made it possible to process iterative calculations for the same data but with various parameter values at each iteration step, and to organize communication between programs.

The SAS system is a powerful tool in research works. What is more, it inspires further investigations. Thus the application is designed as open and amenable to expansion, becoming a handy tool for a traffic engineer in work on road safety.

5. ACKNOWLEDGEMENTS

The SAS software application was built in the course of work on the author's doctoral thesis. The doctoral degree is being pursued under the supervision of Prof. Marian Tracz, Chair of Highway & Traffic Engineering, Cracow University of Technology, Poland. The author would like to thank him very much for his valuable suggestions and remarks, especially those concerning the model of inference.

6. REFERENCES

1. Benjamin J. R., Cornell C. A., "Probability, Statistics and Decision for Civil Engineers", McGraw-Hill, Inc., 1970.
2. Heydecker B. G., Wu J., "Using the Information in Road Accident Records", 19th Summer Annual Meeting of PTRC, 1991, Sussex, England.
3. Hauer E., "On the Estimation of Expected Number of Accidents", Accident Analysis and Prevention, Vol. 18, No. 1, 1-12.
4. Johnson N. L., Kotz S., "Discrete Distributions", Wiley & Sons, Inc.
5. Maritz J. S., "Empirical Bayes Methods", Methuen and Co. Ltd, London, 1970.
6. Mountain L., Fawaz B., "The Accuracy of Estimates of Expected Accident Frequencies Obtained Using an Empirical Bayes Approach", Traffic Engineering and Control, May 1991.
7. SAS is a registered trademark of SAS Institute Inc., Cary, NC, USA.
8. SAS/GRAPH, SAS/EIS, SAS/FSP, SAS/AF and SAS/STAT are registered trademarks of SAS Institute Inc., Cary, NC, USA.
9. SAS Guide to Macro Processing, Release 6.03 Edition, SAS Institute Inc., Cary, 1991.
10. SAS Language: Reference, Version 6, First Edition, SAS Institute Inc., Cary, 1991.
11. SAS Procedures Guide, Release 6.03 Edition, SAS Institute Inc., Cary, 1991.
12. SAS Screen Control Language, Version 6, First Edition, SAS Institute Inc., Cary, 1991.