SELECTION BIAS AND HETEROGENEITY IN SEVERITY MODELS – SOME INSIGHTS FROM AN INTERSTATE ANALYSIS
ABSTRACT

This paper addresses the potential effects of selection bias in the estimation of severity distributions in accident severity modeling. In particular, we address this issue in the context of frequency-by-severity models. Prior literature on frequency-by-severity models has focused on the use of discrete outcome models and count models as the baseline frameworks (Lord and Mannering 2010; Anastasopoulos and Mannering 2011; Milton, Shankar and Mannering 2008; Park and Lord 2007; Kweon and Kockelman 2003; Ye, Pendyala, Shankar and Konduri 2008). In discrete outcome models, the severity distribution is modeled as a proportions variable that is a function of geometric, traffic volume, and potentially environmental factors. In count models, the outcome variables are either univariate or multivariate counts of severity, modeled as functions of geometric, traffic volume, and potentially environmental effects (see, for example, Venkataraman et al. 2011). While these modeling efforts provide significant insight into the unconditional probability of severity occurrence in terms of segment-level measurements of geometry and volume, they ignore the impact of selection bias. The models are constructed on the observed crash histories of segments, which means that segments with no crash histories are omitted from the estimation procedure. Because of this omission, the estimated severity distributions may be biased. We provide a method to account for this selection bias via an examination of interstate crash histories in Washington State.

In conventional severity analysis, as summarized in the literature cited above, there are two main approaches. The first is the conditional analysis of severity, in which crash-specific factors related to collision types, vehicle types, and occupant information are mainly used to estimate severity models.
In this approach, roadway geometry enters as dummy variables that describe, for example, the presence or absence of curvature in the neighborhood of a crash site. The second approach is the unconditional analysis at the segment level, in which frequencies of severities are estimated as a function of roadway geometry and traffic volume. In this case, collision type is ignored because such information is not available in aggregate form. As a result, gaps exist in terms of hybrid data that include both geometrics and collision type, which restricts the formulation of comprehensive models of severity. This thesis addresses this gap by using hybrid data that include segmental geometry and collision information at the segmental aggregation level. Severity distributions are estimated as a function of interstate geometry and traffic volume factors, and observed collision type proportions. A comparison of parameters with and without the selection bias effect is provided. We also explore the effect of selection bias in terms of heterogeneity in the means of parameters, where the parameters are estimated as random.

We note in concluding that the extant literature on selection bias in the conditional context of severity modeling is scant (Tarko et al. 2010). It is a fruitful area for future research, which can lead to opportunities for integrating insights from conditional and unconditional severity analysis. Some evidence of this prospect can be seen in recently published papers (see, for example, Anastasopoulos and Mannering 2011; Shankar et al. 2006).
The original contribution of this thesis is two-fold: (a) it addresses a gap in the published literature on accommodating potential information from segments that are observed to have no crashes, whose omission can affect the estimated severity distributions; and (b) by accounting for such selection effects, it characterizes the nature of the impact of selection effects on the parameters of severity models. In particular, the severity models are formulated as two-stage models in which information on both geometrics and collision type is incorporated to provide a comprehensive analysis of segment-level severity distributions.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENT
Chapter 1 INTRODUCTION
Chapter 2 RELATED WORKS AND RESEARCH QUESTIONS
  2.1 Conventional Severity Analysis
  2.2 Statistical Modeling of Severity
  2.3 Panel Data and Random Parameters Framework
Chapter 3 ANALYSIS PROCESS AND EMPIRICAL SETTINGS
  3.1 Analysis Process
  3.2 Study Area
  3.3 Crash Cluster Construction
    3.3.1 Data Collection
    3.3.2 Descriptive Statistics
Chapter 4 STATISTICAL MODELING OF SELECTIVITY EFFECTS IN SEVERITY ANALYSIS
  4.1 Random Parameters Model Specification
  4.2 Random Parameters Approach for Modeling
  4.3 Estimation Results for Random Parameters Unconditional Severity Model
  4.4 Comparison of Parameters with and without the Selection Bias Effect
Chapter 5 CONCLUSION AND DIRECTIONS FOR FUTURE RESEARCH
  5.1 Discussion on Selection Effect
  5.2 Conclusions and Recommendations
References
APPENDIX

Chapter 1 INTRODUCTION
This thesis explores the impact of selection bias in severity analysis of crash data for traffic safety evaluation. Crash data are reported by state patrol or police personnel, and then assembled in raw form by state departments of transportation for statistical analysis and monitoring. Often, the state traffic safety commission is involved in this effort. Dedicated funding allows for the continual monitoring of crash severities so that key routes on the state transportation system do not trend toward high injury crash occurrences, which inflict heavy social costs. The average cost of a traffic fatality has risen to over 4 million dollars, and the total cost of traffic crashes has exceeded 300 billion dollars annually in the United States. While fatalities and higher-end severities such as disabling crashes inflict a majority of the burden from a social cost standpoint, the sheer number of low-severity crashes (in excess of 75%) contributes to the social cost burden as well. Therefore, it is imperative that an analysis of severity distributions include the expected social cost burden due to locations where crashes have not occurred. A major reason for this expectation is that crash occurrence distributions are probabilistic (see, for example, Shankar et al. 1996), and locations where crashes were not reported may have crashes in the future. In particular, if these locations are ignored in the analysis of severity, then estimates of severity model parameters could be biased. For example, Figure 1 illustrates how non-crash segments affect the severity distribution over all of the segments under investigation.

Figure 1: Sample Combined Severity Distribution with Crash and Non-Crash Segment

The rest of this thesis is organized as follows, moving toward a detailed assessment of the severity model in the later chapters, followed by conclusions and recommendations.
In Chapter 2, I discuss the background of severity studies, laying out the gaps in the state of knowledge and the research questions that remain in the area of accurate estimation of network-wide severity distributions. In Chapter 3, I discuss the materials and methods used to address the research questions, thereby providing a data-model foundation for the original contribution of my thesis. In Chapter 4, I discuss the results of the statistical models of injury severity, which help illustrate the effectiveness of my method of using hybrid geometric and collision data to estimate severity distributions. In Chapter 5, I discuss conclusions and recommendations, including the strengths and limitations of my thesis, and address the scope for further work that can build on its insights.
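
The selection problem that motivates this thesis can be illustrated with a small simulation. What follows is a minimal sketch and not the two-stage model developed in Chapter 4: it assumes hypothetical segments whose crash counts follow a Poisson distribution with an arbitrary mean of 0.8 crashes per segment, and all function and variable names are illustrative. It shows that a mean estimated only from segments with a crash history overstates the network-wide mean, which is the kind of distortion that omitting non-crash segments may introduce.

```python
import math
import random


def simulate_segment_counts(n_segments, mean_rate, seed=0):
    """Draw one Poisson crash count per hypothetical roadway segment."""
    rng = random.Random(seed)
    limit = math.exp(-mean_rate)
    counts = []
    for _ in range(n_segments):
        # Knuth's multiplication method; adequate for small mean rates.
        k = 0
        p = rng.random()
        while p > limit:
            k += 1
            p *= rng.random()
        counts.append(k)
    return counts


def mean(values):
    return sum(values) / len(values)


MEAN_RATE = 0.8  # assumed value, chosen only for illustration

counts = simulate_segment_counts(n_segments=20000, mean_rate=MEAN_RATE)

# Conventional practice in this sketch: keep only segments with crashes.
observed = [c for c in counts if c > 0]

full_mean = mean(counts)          # uses crash and non-crash segments
truncated_mean = mean(observed)   # ignores non-crash segments

# Mean of a zero-truncated Poisson: lambda / (1 - exp(-lambda)).
theoretical_truncated = MEAN_RATE / (1.0 - math.exp(-MEAN_RATE))

print(f"share of non-crash segments: {1 - len(observed) / len(counts):.3f}")
print(f"mean over all segments:      {full_mean:.3f}")
print(f"mean over crash segments:    {truncated_mean:.3f}")
print(f"zero-truncated Poisson mean: {theoretical_truncated:.3f}")
```

In this sketch the crash-only mean tracks the zero-truncated Poisson mean rather than the assumed rate, so any quantity scaled by it would inherit the upward bias. The severity models in this thesis are richer than a single Poisson rate, so the example only conveys the direction of the effect under these stated assumptions.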