Accepted Manuscript Title: Modeling work zone crash frequency by quantifying measurement errors in work zone length Authors: Hong Yang, Kaan Ozbay, Ozgur Ozturk, Mehmet Yildirimoglu PII: DOI: Reference:
S0001-4575(13)00080-8 http://dx.doi.org/doi:10.1016/j.aap.2013.02.031 AAP 3059
To appear in:
Accident Analysis and Prevention
Received date: Revised date: Accepted date:
31-5-2012 27-12-2012 15-2-2013
Please cite this article as: Yang, H., Ozbay, K., Ozturk, O., Yildirimoglu, M., Modeling work zone crash frequency by quantifying measurement errors in work zone length, Accident Analysis and Prevention (2013), http://dx.doi.org/10.1016/j.aap.2013.02.031 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Reference #: AAP_3059 Highlights The paper develops improved crash frequency models for work zone safety analysis.
!!
Measurement errors in work zone length have been carefully studied and addressed.
!!
The paper examines more potential casual factors using detailed work zone project data.
!!
Enhanced models provide better understanding of casual factors of work zone crashes.
Ac ce p
te
d
M
an
us
cr
ip t
!!
1
Page 1 of 22
Modeling work zone crash frequency by quantifying measurement errors in work zone length Hong Yang
a, 1
, Kaan Ozbay
a, 2
, Ozgur Ozturk
a, 3
, Mehmet Yildirimoglu
b, 4
a
1
us
Corresponding author. Tel.: +1 732 445 0576; fax: +1 732 445 0577. E-mail address:
[email protected] 2 E-mail address:
[email protected] 3 E-mail address:
[email protected] 4 E-mail address:
[email protected]
cr
ip t
Rutgers Intelligent Transportation Systems (RITS) Laboratory, Department of Civil and Environmental Engineering, Rutgers University, 96 Frelinghuysen Road, Piscataway, NJ 08854, United States b Urban Transport Systems Laboratory (LUTS), School of Architecture, Civil and Environmental Engineering (ENAC), Ecole Polytechnique Fédérale de Lausanne (EPFL), GC C2 406 (Bâtiment GC), Station 18, CH-1015, Lausanne, Switzerland
M
an
1
Abstract
d
Work zones are temporary traffic control zones that can potentially cause safety problems. Maintaining safety, while implementing necessary changes on roadways, is an important challenge traffic engineers
te
and researchers have to confront. In this study, the risk factors in work zone safety evaluation were identified through the estimation of a crash frequency (CF) model. Measurement errors in explanatory
Ac ce p
variables of a CF model can lead to unreliable estimates of certain parameters. Among these, work zone length raises a major concern in this analysis because it may change as the construction schedule progresses generally without being properly documented. This paper proposes an improved modeling and estimation approach that involves the use of a measurement error (ME) model integrated with the traditional negative binomial (NB) model. The proposed approach was compared with the traditional NB approach. Both models were estimated using a large dataset that consists of 60 work zones in New Jersey. Results showed that the proposed improved approach outperformed the traditional approach in terms of goodness-of-fit statistics. Moreover it is shown that the use of the traditional NB approach in this context can lead to the overestimation of the effect of work zone length on the crash occurrence. Keywords: Work Zone; Crash Frequency; Safety Analysis; Measurement Error; Negative Binomial Model; Full Bayesian
2
Page 2 of 22
1. Introduction Much of the U.S. highway infrastructure is aging and the need for maintenance, rehabilitation, and upgrading of the existing networks increases. Consequently, road users are increasingly exposed to work zone activities. Nationally, about 23,745 miles of federal-aid roadway improvement projects were
ip t
underway annually from 1997 to 2001 (FHWA, 2001). On average motorists encountered an active work zone for every 100 miles driven on the national highway system (Ullman et al., 2004b). The number can be much larger considering other work zones deployed on municipal, county, and state roads.
cr
The presence of so many work zones directly affects the safety of road users and highway workers. According to the latest safety statistics, 667 work zone fatalities occurred in the U.S. in 2009.
us
Approximately 85 percent of those killed in work zone were drivers or passengers and the remaining 15 percent were workers. In addition to these fatalities, more than 40,000 injuries resulted from motor vehicle crashes in work zones (FHWA, 2011). As shown by many studies (Graham et al., 1978; Paulsen et al.,
an
1978; Garber and Woo, 1990; Casteel and Ullman, 1992; Pal and Sinha, 1996; Khattak et al., 1999; Venugopal and Tarko, 2000; Khattak et al., 2002; Qi et al., 2005; Harb et al., 2008a; Ullman et al., 2008), crash rates increase in the presence of work zones compared to the normal road conditions. This rise can be attributable to the complexity of the work zone circumstance that interrupts continuing traffic flow and
M
creates many traffic conflicts. However, precise reasons why more crashes occur at work zones may still not be clear. A thorough understanding of the risk factors associated with work zone crash occurrence is essential for the development of effective countermeasures to reduce the number of fatalities and injuries,
d
and to enhance traffic operation and safety within work zones. However, many site- and state-specific
te
factors need to be further investigated to better understand the reasons for work zone crashes. Therefore, the purpose of this paper is to develop a statistical model to identify the potential relationship
Ac ce p
between crash frequency and a set of explanatory variables at work zones in New Jersey. A relative large amount of detailed work zone project files and crash data obtained from the New Jersey Department of Transportation (NJDOT) provided us with opportunities to explore extra factors that have not been addressed before. Considering the presence of measurement errors in variables, special models that extend the use of the traditional Negative Binomial (NB) model are developed and parameters are estimated through the full Bayesian approach. By addressing the potential errors in variables, the developed models provide more reliable understanding of the safety impact of different contributory factors on work zone crashes. 2. Literature Review
A number of studies have conducted work zone safety analysis. However, majority of them focused on the development of the descriptive statistics of work zone crash data to interpret the characteristics such as crash experience, consequences, temporal, and spatial distributions of crashes at work zones. (Nemeth and Migletz, 1978; Rouphail et al., 1988; Hall and Lorenz, 1989; Garber and Woo, 1990; Sorock et al., 1996; Lin et al., 1997; Bryden et al., 1998; Raub et al., 2001; Chambless et al., 2002; Garber and 3
Page 3 of 22
Zhao, 2002; Schrock et al., 2004; Bushman et al., 2005; Müngen and Gürcanli, 2005; Salem et al., 2006; Harb et al., 2008b; Jin et al., 2008; Li and Bai, 2008; Ullman et al., 2008; Dissanayake and Akepati, 2009; Akepati and Dissanayake, 2011). Generally, the literature indicated that the presence of work zones increases the likelihood of crash occurrence (Graham et al., 1978; Paulsen et al., 1978; Garber and Woo, 1990; Casteel and Ullman, 1992; Pal and Sinha, 1996; Khattak et al., 1999; Venugopal and Tarko, 2000;
ip t
Khattak et al., 2002; Qi et al., 2005; Harb et al., 2008a; Ullman et al., 2008). In addition, crashes were found to disproportionately occur across different segments of work zones. For instance, the work zone activity area was the predominant location of crashes and rear-end collision was the predominant type of
cr
crash (Nemeth and Migletz, 1978; Hargroves, 1981; Nemeth and Rathi, 1983; Pigman and Agent, 1990; Garber and Zhao, 2002; Schrock et al., 2004; Srinivasan et al., 2008; Xing et al., 2010; Zhu et al., 2010). Comparisons between daytime and nighttime work zone crashes suggest no clear evidence that crash
us
rate significantly increased at night (Casteel and Ullman, 1992; Daniel et al., 2000; Ullman et al., 2004a; Udoka, 2005; Ullman et al., 2006; Ullman et al., 2008).
an
Although CF models have been extensively used in road safety analysis, only a few studies that have specifically focused on modeling work zone crash occurrence (Pal and Sinha, 1996; Khattak et al., 1999; Venugopal and Tarko, 2000; Khattak et al., 2002; Qi et al., 2005; Srinivasan et al., 2011). Several statistical techniques have been employed to analyze CF among these existing studies. For instance, a
M
few studies developed negative regression (NB) models to predict the expected number of crashes (Khattak et al., 1999; Venugopal and Tarko, 2000; Khattak et al., 2002; Srinivasan et al., 2011). Pal and Sinha (1996) also modeled crashes at interstate work zones in Indiana and found that a normal
d
regression model outperformed the classical NB and Poisson models. Similarly, Qi et al. (2005) constructed the truncated NB regression model and truncated Poisson regression model to analyze the
te
rear-end crashes at work zones in New York. The truncated NB regression model was found to have better predictive power. Other than these empirical models, Elias and Herbsman (2000) used the Monte
Ac ce p
Carlo simulation approach to develop a crash rate probability distribution function that considered the intrinsic scarcity of work zone crash data. Despite of model differences, factors most commonly found to significantly affect work zone CF included the length of the work zone, duration, and average daily traffic (ADT). Generally, the modeling results showed that work zone CF increased with increasing ADT, duration, and work zone length.
A possible reason of why few studies explored the casual factors associated with work zone CF is the deficiencies of work zone data as stated in Pal and Sinha (1996), Wang et al. (1996), Zhao and Garber (2001), and Bourne et al. (2010). For instance, work zone crash data derived from police crash reports were usually subject to a number of uncertainties (Wang et al., 1996). Explanatory variables such as ADT, work zone length, and duration were also found to be subject to measurement errors. For example, the presence of significant bias was found when using the estimated ADT instead of the actual volume during work zone conditions (Khattak et al., 1999; Venugopal and Tarko, 2000; Khattak et al., 2002). Problems in defining the length of certain work zones such as bridge works and those involving detours were identified by several studies (Wang et al., 1996; Venugopal and Tarko, 2000). Moreover, the exact starting date or ending date of a specific work zone may not be readily available to calculate the duration 4
Page 4 of 22
of that work zone project (Venugopal and Tarko, 2000). If data with deficiencies were used to develop predictive models, this could clearly lead to biased estimates of the model parameters. For instance, ElBasyouny and Sayed (2010) found that the bias can increase with the magnitude of the measurement error in traffic volume when developing a safety performance function. Therefore, to be able to develop more reliable models, a limited number of safety studies applied models that can capture measurement
ip t
errors in variables and better results were obtained (Lundevaller, 2006; El-Basyouny and Sayed, 2010). Similarly, measurement errors associated with work zone data were not addressed in the literature and thus they should also be carefully considered. This paper thus specifically uses the measurement error
cr
models to account for the work zone crash modeling issues when the work zone length is not accurately obtained.
us
3. Data Description
As pointed out by previous researchers (Wang et al., 1996; Garber and Zhao, 2002; Qi et al., 2005), a
an
few agencies such as the New York State Department of Transportation (DOT), Florida DOT, Oregon DOT, and Texas DOT attempted to collect supplemental data specifically for work zone crash and/or worker accident (Bourne et al., 2010). Many others have little experience in collecting, analyzing, and using work zone crash measures to assess and track safety in their work zones (Ullman et al., 2009;
M
Bourne et al., 2010). In New Jersey, work zone crashes as well as related information were also not perfectly archived. To investigate casual relationship between work zone characteristics and CF, data data, (2) crash data, and (3) traffic data.
d
from three independent sources have been obtained and assembled in a unified database: (1) work zone
te
For work zone data, project files of 60 work zones completed between 2004 and 2010 were obtained from the NJDOT engineering document unit. Information about the proposed work zone length, mileposts of
Ac ce p
the work zone, and number of lanes operated were extracted from these individual project files. Crash data of these work zones were obtained from the crash database of the NJDOT. Information about the time, date, milepost, type of crash, road type, and posted speed was extracted for each crash record. The work zone length can be determined in two ways: (1) by using the length from the work zone project file and (2) by employing spatial-temporal diagrams of work zone crash data. Figure 1 shows examples of the spatial-temporal distributions of the observed work zone crashes for a specific work zone site. The solidline box indicates the proposed work zone length obtained from the project file; the dashed-line box shows the adjusted work zone length based on the reported work zone crashes. Various issues related to measuring the work zone length by these two methods were discussed in modeling section 4.
5
Page 5 of 22
ip t cr us
an
Figure 1 Spatial-Temporal distribution of work zone crashes
M
Apart from the crash records and roadway characteristics, traffic volume data are another important exposure factor required to develop crash frequency models (Ullman et al., 2009). Notably, traffic volume should be collected during the presence of work zones. However, these actual traffic counts were usually unavailable for the type of ex-post evaluation study conducted in current study (Ullman et al., 2009;
d
Bourne et al., 2010). Therefore, straight line diagrams from the NJDOT were used to collect traffic volume
te
data. These diagrams provided annual average daily traffic (AADT) data for each segment of roadways. To account for the seasonal and time fluctuation under work zone conditions, AADT data were adjusted according to the seasonal and hourly correction factors from NJDOT to estimate the traffic volume of the
Ac ce p
work zones.
Based on information collected from these three sources, relationship between the variables shown in Table 1 and work zone crash frequency was examined. As property damage only (PDO) and injury crashes may have different causal factors, they were aggregated by season and analyzed separately.
6
Page 6 of 22
Table 1 - Variables considered in the work zone crash frequency models Crash
Mean Sd.
Min.
Max.
Numerical Crash counts at work zone every three month
2.7
4.8
0.0
67.0
Length
Numerical Length of the work zone
2.4
2.1
0.5
11.4
Traffic volume
Numerical Adjusted volume/lane (by direction, season, hour) 9,566 3,876 2,019 18,962
Road system
Indicator
Interstate = 0, state = 1
0.8
Light condition
Indicator
Daytime = 0,
0.5
0.5
0.0
1.0
Number of intersection Numerical Number of intersections within a work zone
7.0
9.1
0.0
33.0
Operated lanes
1.7
0.7
1.0
3.0
6.8
25.0
55.0
Indicator
Description
nighttime = 1
Number of operating lanes in work zone
Work zone speed limit Numerical Work zone posted speed limit (mph) Numerical Reduction in posted speed limit (mph) Numerical Number of lanes closed in a work zone
Number of ramps
Numerical Number of ramps within a work zone
42.8
0.0
1.0
7.2
5.7
0.0
20.0
0.9
0.8
0.0
2.5
3.3
3.1
0.0
15.0
an
us
Speed reduction Dropped lanes
0.4
ip t
Type
cr
Variable
4. Statistical Modeling Methodology
Crash occurrences are considered non-negative count data and the outcome of a set of contributing
M
factors. To model such count data, the Poisson regression model was frequently used in crash data analysis (Miaou et al., 1992; Shankar et al., 1995). However, the Poisson model cannot address potential over-dispersion issues of the data. To deal with the problem, many researchers have suggested
d
extensions of the simple Poisson model (Hauer, 2001; Lord et al., 2005). Among the extensions, the NB model has becomes one of the extensively used alternatives to model crashes (Abdel-Aty and Radwan,
te
2000; Mitra and Washington, 2007). Considering the potential errors in modeled variables, we made an attempt to extend the use of the NB model by incorporating measurement errors. The following sections
Ac ce p
described the models used in this study. 4. 1. The negative binomial (NB) model
Yi denote the number of crashes at the ith work zone in a given time period. Assuming that Yi follows the Poisson distribution with the mean ! i , the probability of observing yi crashes in the work zone can be described by the basic Poisson regression model:
P ! Yi ! yi ! !
e ! λi λ iyi yi !
(1)
Location-specific work zone explanatory variables are incorporated into the model to specify the parameter ! i :
λ i ! exp(β'x i )
(2) 7
Page 7 of 22
where ! is the vector of estimated coefficients of the model and xi is the explanatory variable vector (such as duration, length, and traffic volume) for the ith work zone. To deal with unobserved heterogeneity and to allow for unexplained randomness, a noisy measurement
ip t
εi is introduced into Eq. (2): λ i ! exp(β'x i ! ε i )
(3)
cr
where εi is an error term which represents a random effect of omitted explanatory variables and
unmeasured heterogeneity. In the NB regression model, we assume that exp( εi ) is gamma distributed:
us
exp(ε i ) ~ Gamma(κ, κ)
(4)
where κ is an inverse dispersion parameter. Compared with the Poisson regression model, the NB
Γ ! yi ! κ !
κ κ λ i yi ( ) ( ) yi !Γ ! κ ! κ ! λ i κ ! λi
(5)
λ i2 κ
(6)
M
P ! Yi ! yi | λi , κ ! !
an
model has the additional parameter κ to be estimated and the model has the following properties:
d
E ! Yi ! ! λ i ; Var ! Yi ! ! λ i !
te
The variance Var !Yi ! shown in Eq. (6) is always larger than the mean E !Yi ! . Thus, the NB model is
Ac ce p
more appropriate for modeling the over-dispersed crash data. Based on the variables listed in Table 1, the expected number of crashes in a given period can be specified as follows:
ln ! λ i ! ! α 0 ! α1 ln ! Li ! ! α 2 ln ! Qi ! !
m
!
β j X ji , i ! 1, ! , n
(7)
j! 1
where ! i is the expected number of crashes in given time period; Li is the work zone length; Qi is the traffic volume during the period of study; X ij represents the j th explanatory variable at the ith work zone; and α0 , α1 , α2 and !
j
are the model parameters.
4. 2. Modeling measurement errors in work zone length Using the aforementioned methods, the length measurement of each work zone used has important accuracy issues as none of the methods described in Section 3 to obtain the work zone length was perfect. 8
Page 8 of 22
First, the length extracted from the project plans provided by the NJDOT engineering document unit is likely to be inaccurate. These project plans may not reflect the final work zone layouts, which may be slightly changed in the field. For instance, the length of one of the studied projects 1,368 meters. This length only indicates that many tasks such as widening, grading, and paving have to be done for that section. It does not represents the actual work zone setting which may include other components such as
ip t
buffer area, termination area, and so on. Moreover, work zone projects generally have multiple stages of tasks and the length of a work zone can vary according to the progress of the project. The length information for each stage was not clearly described in most of the project plans.
cr
Second, the length of a work zone identified using the spatial-temporal diagrams of work zone crashes might have estimation errors. As shown in Figure 1, the length of the work zone is estimated as the
us
difference between the lowest and highest mileposts in which work zone crashes are observed. The estimation relies on the locations of the observed crashes. Obviously, this brings a major bias into work zone length measurement, mainly because of the randomness of crashes locations. If the observed
an
crashes are not uniformly distributed along the entire work zone, the estimated work zone length is biased. In addition, the reported milepost of an observed crash might not be accurate. Therefore, these imperfect data sources imply that either the length based on spatial-temporal diagrams
M
or the length obtained from project plans are likely to have measurement errors. To consider the measurement errors in the work zone length, a classical measurement-error model has
d
been proposed. Specifically, we assume that the length on the log-scale is measured as the true value on
te
the log-scale plus an additive error. The model structure can be expressed as:
ln(Lit ) ! ln(Fi ) ! τit , i ! 1, ! , n; t ! 1, ! , Ti
(8)
Ac ce p
where Lit denotes the measured work zone length in each time period (in terms of season), Fi denotes the true length, and τit is the measurement error term. Ti is the total number of periods for the work zone
i . Assume that error term τit and Fi are independent and τit follows a normal distribution N ! 0, σ ε2 ! , the measured work zone length follows the log-normal distribution shows here: ln(Lit ) ~ N ! ln(Fi ), σ ε2 ! , i ! 1, ! , n; t ! 1, ! , Ti
(9)
Eq. (9) is a classical measurement error model that captures the relationship between measured work zone length Lit in a given season t and the (unknown) actual length Fi . As Fi is unknown, it can be assumed to be a latent variable that follows a log-normal distribution: ln(Fi ) ~ N ! ln(μ iF ), σ 2F ! , i ! 1, ! , n
(10)
9
Page 9 of 22
Eq. (10) represents the seasonal variation model describing the distribution of the unknown work zone length over the work zone duration. Because the dataset is not homogeneous, each work zone has its own expected length μiF . A common seasonal variation parameter, σ F2 , is assumed. Instead of using the measured work zone length, the true (unknown) length obtained from Eq. (10) is incorporated into Eq. (7)
m
ln ! λ i ! ! α 0 ! α1 ln ! Fi ! ! α 2 ln ! Qi ! !
!
β j X ji , i ! 1, ! , n
(11)
cr
j! 1
ip t
and the mean function can be rewritten as follows:
Eq. (8) through Eq. (11) represent the fundamental of NB model with a measurement error (MENB) in work zone length. The adjust work length based on spatial-temporal diagrams of work zone crashes was project files was assumed to be the expected length μiF .
an
5. Model Estimation
us
assumed to be the measured length Lit . The work zone length determined from the individual work zone
M
5. 1. Full Bayesian estimation
All model parameters are estimated using the Full Bayesian method, which uses Monte Carlo Markov Chain (MCMC) sampling. The WinBUGS statistical software package was used. This estimation methodology has been widely used in road safety studies (El-Basyouny and Sayed, 2009; Lan et al.,
d
2009; Yanmaz-Tuzel and Ozbay, 2010). MCMC approach repeatedly samples from the posterior
te
distribution and generates chains of random points. Once the distribution of the simulated chains is observed to converge to the target posterior distribution, full Bayesian estimates of the model parameters are obtained from the remaining iterations. Trace plots of the chains and the Brooks-Gelman-Rubin
Ac ce p
(BGR) statistic are used to monitor the convergence. The BGR diagnostic first proposed by Brooks and Gelman (1998) examines average widths of 80 percent intervals of the pooled sample as well as each individual sample. When the average width of the individual samples approach the average width of the pooled sample, BGR statistic approaches 1. In other words, convergence is assumed to occur when the ratio is close to 1 (less than 1.2 is often sufficient to indicate convergence). The iterations up to the convergence point are excluded as burn-in samples and the remaining iterations are then used for parameter estimation. An inference is considered to be reliable when the Monte Carlo error relative to the standard deviation of the estimated parameter is less than 0.05 (Burnham and Anderson, 2002). 5. 2. Prior distributions Full Bayesian estimates of the parameters can be obtained by specifying prior distributions when they are available. Prior distribution is used to capture some known information about the distribution of each parameter. In the absence of strong prior information, uninformative priors can be used to obtain the Full Bayesian estimates. The normal distribution with zero mean and a relative large variance is a frequently used diffuse prior distribution for the regression parameters (El-Basyouny and Sayed, 2010). In this study, 10
Page 10 of 22
since we did not have known prior distributions, we have also employed normal distribution as the prior distribution for the parameters in Eq. (7) and Eq. (11). Gamma (0.001, 0.001) was used as the uninformative priors for parameters κ , σ ε! 2 , and σ F! 2 mentioned in section 4.
ip t
5. 3. Model selection The Deviance Information Criterion (DIC) is used to compare alternative models as well as to determine which model outperforms the others. DIC is calculated as follows:
cr
DIC ! D ! pD
(12)
us
where D ! E[ D ( ! )] ! ! 2 log[ L(Y | ! )] and pD ! D ! D ( ! ) . D is the posterior mean; D ( ! ) is the point estimate which measures how well the model fits the data via a log-likelihood function and unknown parameters of the model ! ;
is the data; L (Y | ! ) is the likelihood function; pD is a measure of model
an
complexity, which defines the effective number of parameters; and ! is the posterior mean of ! . In the literature, the model with the lowest DIC value is assumed to be the best estimated model. A
M
difference in DIC that is larger than 10 suggests that the model with a smaller DIC should be favored. On the other hand, if the difference in DIC is less than 5 then it is recommended that it might be misleading to report the model with the lower DIC as the best model (Gelman et al., 1996; Spiegelhalter et al., 2002).
d
To compare the goodness-of-fit of different models, the procedure developed by Gelfand (1996) was
te
used. The procedure involves a leave-one-out cross-validation process: a data set of n observations is repeatedly split into a training set of size n-1 and a test set of size of 1 and a prediction of the conditional predictive ordinate (CPO) of the test set is made based on the remaining training set. Assuming i th
(13):
Ac ce p
observation namely, yi , obs is left out and the corresponding CPOi based on model M a is denoted as Eq.
CPOi ! f ( yi , obs | Y! i , obs , M a )
(13)
where CPOi is equal to the predictive density f ( yi , obs | Y! i ,obs , M a ) and Y! i ,obs represents the all but i th observation used to estimate the model M a . For comparison, the CPOi ratio is calculated as follows:
CRi !
f ( yi ,obs | Y! i , obs , M a ) f ( yi ,obs | Y! i ,obs , M b )
(14)
CRi ! 1 indicates i th observation yi , obs supports model M a whereas CRi ! 1 suggests that i th observation yi , obs supports model M b . The model supported by more observations is then preferred. See Gelfand
(1996) for extensive discussion of this approach from a Bayesian perspective. It should be noted that 11
Page 11 of 22
when n is relative large the leave-one-out approach can be computationally intensive. Therefore, rather than running the approach for all n observations, this cross-validation approach is applied to a number of randomly selected observations in this study. 6. Simulation Study
ip t
To investigate the bias that might be arising from using the NB model, a Monte Carlo simulation through WinBUGS was conducted before analyzing the actual dataset. In simulation, two variables including the work zone length with measurement error and traffic volume are assumed to affect work zone crashes. It
cr
is assumed that the traffic volume Qi follows the normal distribution ln ! Qi ! ~ N ! ln(8,000),[ln(500)]2 ! and
! iF ! 4 , !
F
iF
), σ F2 ! , where
us
the true work zone length Fi follows the lognormal distribution so that ln(Fi ) ~ N ! ln( !
! 0.4 . In addition, the measured work zone length Li is assumed to follow the lognormal
distribution so that ln( Li ) ~ N ! ln( Fi ), σ !2 ! . To explore the potential bias caused by the measured work
Smaller !
!
an
zone length with different level of errors, five scenarios including ! ! ! 0.1,0.2,0.3,0.4, and 0.5 are tested. represents the smaller variation in the work zone length. The magnitude of the measurement
M
error relative to the variation in the work zone length is assessed by using reliability ratio σ F2 / (σ ε2 ! σ F2 ) . Hence the reliability ratios of the five scenarios are calculated as 0.941, 0.800,0.640,0.500,and 0.390. A total of n ! 150 simulated crash counts yi (i ! 1, 2,..., n) are generated from a negative binomial model link
d
function with ln ! λi ! ! α 0 ! α1 ln ! Fi ! ! α 2 ln ! Qi ! and ! ! 2 , where α 0 ! ! 3.0 , α1 ! 1.0 , α 2 ! 0.25 . Once the dataset is created, Eq. (7) is used to re-estimate the parameters of the NB mode. Then Eq. (8) through
te
Eq. (11) are used to re-estimate the MENB model in WinBUGS. Two chains of 10,000 iteration each are used. The first 5,000 iterations are used as burn-in runs.
Ac ce p
Table 2 presents the modeling results for the simulated dataset using both the NB and the MENB models. This results shows that the MENB model was able to reproduce the “true” parameter values. All coefficients were statistically significant based on their 95-percent Bayesian credible interval (CI). However, when the !
!
increased the reliability ratio decreased and the bias of the estimated results
increased. The bias of the estimated parameter a1 ranges from 8.9 percent to 54.1 percent. For instance, the scenario 1 shows that the estimated parameter a1 of work zone length is 0.911. It is close to the true parameter a1 ! 1.0 but still has a bias of 8.9 percent. All the estimated results of a1 in scenarios 3, 4 and 5 are far from the true value. Moreover, their 95-percent CI values also do not cover the true value. These findings are consistent with the previous simulation study conducted by El-Basyouny and Sayed (2010) that showed that the presence of measurement errors will cause bias in parameter estimations. In the meantime, estimations of other parameters are also affected (no specific direction) due to the bias caused by the variable with measurement errors.
12
Page 12 of 22
Table 2 - Results of simulated study based on different models Scenario
NB
Scenario 3: (!
F
! 0 ! ! 3.0 ! 1 ! 1.0 ! 2 ! 0.25 κ=2 Scenario 4: (!
F
! 0 ! ! 3.0 ! 1 ! 1.0 ! 2 ! 0.25 κ=2 Scenario 5: (!
F
SD
2.50%
97.50%
-2.705
0.416
-3.565
-1.923
-2.881
0.466
-3.865
-2.064
0.911
0.200
0.546
1.302
1.020
0.247
0.578
1.516
0.018 0.397
0.197 1.272
0.270 2.842
0.235 1.946
0.016 0.401
0.204 1.292
0.269 2.863
-2.520
0.359
-3.227
-1.824
-2.956
0.485
-3.958
-2.085
0.755
0.176
0.421
1.125
1.047
0.244
0.611
1.550
0.016 0.356
0.206 1.192
0.271 2.587
0.238 1.933
0.018 0.393
0.203 1.287
0.274 2.808
-2.194
0.330
-2.884
-1.590
-3.006
0.407
-3.787
-2.145
0.571
0.161
0.266
0.908
1.077
0.213
0.661
1.478
0.016 0.346
0.202 1.173
0.262 2.512
0.238 1.944
0.016 0.390
0.204 1.300
0.268 2.812
-2.190
0.357
-2.923
0.566
0.162
0.247
0.017 0.337
0.200 1.156
0.237 1.787 ! 0.4,! ! ! 0.3)
0.230 1.745 ! 0.4,! ! ! 0.4)
0.234 1.716 ! 0.4,! ! ! 0.5)
cr
0.234 1.923 ! 0.4,! ! ! 0.2)
ip t
Mean
-1.518
-2.832
0.431
-3.728
-2.013
0.880
0.996
0.220
0.533
1.436
0.268 2.465
0.234 1.954
0.016 0.398
0.201 1.295
0.266 2.837
-2.011
0.361
-2.725
-1.263
-2.852
0.381
-3.692
-2.175
0.459
0.160
0.141
0.775
0.997
0.197
0.626
1.407
0.228 1.710
0.017 0.346
0.195 1.140
0.262 2.497
0.235 1.954
0.016 0.395
0.207 1.298
0.269 2.851
Ac ce p
! 0 ! ! 3.0 ! 1 ! 1.0 ! 2 ! 0.25 κ=2
97.50%
us
F
! 0 ! ! 3.0 ! 1 ! 1.0 ! 2 ! 0.25 κ=2
2.50%
an
Scenario 2: (!
SD
M
! 0 ! ! 3.0 ! 1 ! 1.0 ! 2 ! 0.25 κ=2
Mean ! 0.4,! ! ! 0.1)
d
F
te
Scenario 1: (!
MENB
7. Results and Discussion
Considering the performance of the MENB model in the simulation study, the actual work zone dataset is investigated. The posterior estimates of model parameters are obtained via WinBUGS using two independent Markov chains. The convergence of each model's parameters was monitored using the BGR statistic (below 1.2) and also using visual approaches such as observing trace plots. In addition, the ratio of the Monte Carlo error relative to the standard deviation of each parameter is about 0.012 to 0.084. Table 3 and Table 4 present the parameter estimation and the 95 percent Bayesian CI for modeling PDO crashes and injury crashes, respectively. Results of a conventional NB model and the alternative improved model proposed in this study (MENB) are summarized in Table 3 and Table 4. Note that these are the final results that only keep the significant 13
Page 13 of 22
variables after implementing the Gibbs variable selection in WinBUGS. Based on the results, longer work zone length and higher volume per lane are positively associated with PDO crash occurrence and injury crash occurrence. Both PDO and injury crash occurrences are found to be lower during nighttime, which may be attributable to the lower traffic volumes and less vehicle interactions at the work zones in the evening. Work zones with higher speed limit tend to have lower PDO and injury crash occurrence rate.
ip t
Work zones on state roadways have higher PDO and injury crash occurrence rate. Moreover, work zones at sites that have more lanes are also observed to have higher PDO and injury crash occurrence rate compared to locations with less number of operated lanes. This finding might be attributed to the
cr
complexity of the site and more interactions (for example, lane change) among vehicles traveling in
different lanes. Notably, the positive sign of the variable "dropped lanes" indicates that if more lanes are closed, the PDO and injury crash occurrence rate will increase. In case of no smooth transition, many
us
vehicles are forced to merge, and some of late merge attempts might be a possible reason for the higher crash frequency.
an
Comparing the NB and MENB models, the inverse dispersion parameter κ of the NB model is less than that of the MENB model. This significance of κ validates the assumption that negative binomial model is a better alternative for Poisson model to represent over-dispersed crash counts (Abdel-Aty and Radwan,
M
2000). In addition, the larger value of 1/κ in the NB model implies that apparent over-dispersion can be partially affected by measurement error in the work zone length. The reliability ratios for PDO and injury crash models are 0.776 and 0.781, respectively, which indicate the presence of a relatively high magnitude of measurement errors in the observed work zone length. Without addressing the
d
measurement error, the coefficient of work zone length in the NB model can also lead to a bias on other
te
parameters in the model.
Table 3 - Modeling results for work zone PDO crashes
κ
! !
NB CF Model Sd. 2.5% 0.588 -6.809 ----0.060 0.662 0.054 0.575 0.007 -0.034 0.113 0.116 0.047 0.442 ----0.143 1.601
Ac ce p
Model Variable constant light condition ln(length) ln(traffic) work zone speed limit road system operated lanes dropped lanes
mean -5.757 --0.781 0.674 -0.021 0.342 0.536 --1.861
97.5% -4.636 --0.902 0.770 -0.009 0.562 0.629 --2.161
mean -3.643 -0.391 0.847 0.538 -0.034 --0.443 0.201 2.180
MENB CF Model Sd. 2.5% 0.497 -4.438 0.140 -0.685 0.060 0.729 0.066 0.415 0.006 -0.049 ----0.089 0.269 0.074 0.060 0.204 1.815
97.5% -2.702 -0.115 0.965 0.634 -0.023 --0.629 0.342 2.621
!
---
---
---
---
0.401
0.014
0.375
0.428
F
--4326.19
-----
-----
-----
0.748 4313.93
0.020 ---
0.710 ---
0.787 ---
DIC
14
Page 14 of 22
Table 4 - Modeling results for work zone injury crashes 97.50% -3.499 -0.027 0.999 0.483 1.319 0.894 0.636 2.940
mean -4.241 -0.587 0.995 0.201 0.943 0.527 0.471 3.011
---
---
---
---
0.398
97.50% -2.846 -0.263 1.154 0.364 1.258 0.787 0.677 4.606
F
--2738.45
-----
-----
-----
0.752 2735.22
0.014
0.372
0.426
0.020 ---
0.714 ---
0.791 ---
us
!
DIC
MENB CF Model sd 2.50% 0.794 -5.669 0.176 -0.942 0.082 0.831 0.076 0.075 0.157 0.673 0.135 0.279 0.106 0.269 0.661 2.050
ip t
κ
! !
mean -4.927 -0.382 0.862 0.257 1.047 0.652 0.432 2.210
NB CF Model sd 2.50% 0.821 -6.792 0.177 -0.706 0.070 0.730 0.102 0.070 0.139 0.775 0.133 0.377 0.108 0.217 0.326 1.668
cr
Model Variable constant light condition ln(length) ln(traffic) road system operated lanes dropped lanes
an
The comparisons of the model diagnostic criteria estimated from each model are also shown in Table 3 and Table 4. For POD crash modeling, the DIC for the NB model is 4326.19 and 4313.93 for the MENB model. For injury crash model, the DICs are 2738.45 and 2735.22 for NB and MENB models,
M
respectively. For both PDO and injury crash modeling, allowing accounting for the measurement error in the work zone length produces smaller DIC values. Thus model selection based on DIC suggests that the MENB is preferred, particularly for the PDO crash model.
d
In addition, the leave-one-out cross validation approach is used to compare the two estimated models. It is well known that the computational requirements of the Full Bayesian (FB) estimation are very high even
te
for a single iteration given the fact that FB requires thousands of iterations to converge. We also have a relatively larger set of observations compared to some of previous similar studies. Thus, in order to be
Ac ce p
able to complete this specific validation in a reasonable amount of time, 50 randomly selected samples were used to conduct the leave-one out tests. It took around two days to complete the computations with this randomly generated set. It is safe to assume that the same exercise with the complete data set would have taken around 40 to 45 days to complete. Thus, we used this randomly selected data set approach as an alternative way to demonstrate the performance of our modeling approach through leave-one-out validation procedure. However, as a future task, it is also possible to try other more computationally efficient validation approaches such as k-fold cross validation that can be more practical when Bayesian approach is used as the estimation method in conjunction with a relatively large data set. According to the Eq. (13) and Eq. (4), the CPO and CR for each sample are estimated. Figure 2 shows the results of CR for 50 tested samples. The results suggest that the leave-one-out method prefers the PDO MENB model in 36 (green dots in Figure 2) out of the 50 cases based on the comparison of CPOs between the MENB and NB models. Similarly, the injury MENB model is also preferred in 33 (green dots in Figure 2) out of the 50 cases. Therefore, based on DIC and the cross validation tests, the MENB model performs better in terms of modeling work zone crashes in the presence of the measurement errors in work zone lengths. 15
Page 15 of 22
ip t cr us an
M
Figure 2 Preference of different models based on CPO ratio of each tested sample of observations
To better understand the marginal effects of the variables, elasticity of each variable is also examined. In
d
general, elasticity is calculated as:
! ! i xij ! xij ! i
(15)
te
Ex!iji !
where E represents the elasticity; ! i is the expected crash frequency of the work zone i ; and xij is the
where !
Ac ce p
j th explanatory variable associated with work zone i . Eq. (15) is applied to Eq. (3):
j
Ex!iji ! ! j xij
(16)
is the coefficient of the j th corresponding variable.
Inclusion of a logarithmic transformation of variables in the NB model gives the opportunity to test the proportionalities between the variables and crash counts. The estimated log-transformed model parameters directly indicate the elasticity of corresponding variables. Please note that Eq. (16) is valid only for the continuous explanatory variables, not for dummy variables. Pseudo-elasticity can be used to approximate the elasticity of these dummy variables (Lee and Mannering, 2002). In case where the covariates are dummy variables, pseudo-elasticity indicates the incremental change in the CF produced by change in the corresponding dummy variables. Pseudoelasticity can be calculated as follows: 16
Page 16 of 22
E λxiij !
exp !β j ! ! 1
(17)
exp !β j !
The elasticity for each explanatory variable is shown Table 5. The elasticity of the MENB model indicates that a one percent increase in the work zone length on the log-scale resulted in about a 0.75 percent
ip t
increase of PDO crashes and a 0.87 percent increase in injury crashes. Similarly, a one percent increase in traffic volume on the log-scale resulted in about a 4.93 percent and 1.84 percent increase in PDO
crashes and injury crashes, respectively. When the work zone speed limit increased by one percent, the
cr
PDO crashes were reduced by about 1.45 percent. This finding about work zone speed supports the practical conclusion that if the presence of a work zone does not seriously affect traffic conditions (in
terms of both operations and safety), a relatively higher speed limit or normal speed limit generally should
us
be posted in the work zone. In addition, other variables including operated lanes and number of dropped lanes within the work zone collectively are observed to affect both types of work zone crash occurrences.
an
For indicator variables such as light condition and road system type, the interpretation of elasticity is different. The estimated elasticity indicates the change in CF given the existence of a condition (indicator=1). For example, during nighttime, PDO and injury crash occurrences in MENB models will be
M
reduced by about 48 percent and 80 percent, respectively. Similarly, if the road system is a state highway or other lower level road, the expected number of injury crash counts will increase by about 61 percent as per the MENB model, and the PDO crash occurrence rate will not significant change.
te
PDO CF Model NB --0.690 6.181 -0.884 0.290 0.896 ---
Ac ce p
Model Variable light condition ln(length) ln(traffic) work zone speed limit road system operated lanes dropped lanes
d
Table 5 - Elasticity estimates for explanatory variables MENB -0.479 0.748 4.929 -1.452 --0.741 0.180
Injury CF Model NB -0.465 0.761 2.357 --0.649 1.090 0.386
MENB -0.799 0.878 1.843 --0.611 0.881 0.421
As shown in Table 5, the elasticity of the work zone length in the NB model is less than that in the MENB model. The impact of work zone length on work zone crash occurrences is somehow under emphasized. Such biased understanding of the impact of work zone length on safety can lead to biased selection of the optimal work zone length (Chien and Schonfeld, 2001), which in turn may result in unreliable decisions for estimation of work zone user (crash) costs, design of work zone project contracting strategies, and impact of other safety related management strategies.
17
Page 17 of 22
8. Conclusions In this study, statistical relationships between a set of explanatory variables and work zone crash frequencies were examined using extensive and detailed work zone data collected in New Jersey. We extended the traditional NB model by incorporating the effects that arise from measurement errors related
ip t
to the work zone length. A new model to estimate the work zone CF---namely, the MENB---is proposed and estimated.
The modeling results suggest that both work zone length and traffic volume are positively associated with
cr
crash occurrence in work zones. This finding confirmed outcomes of previous studies by Pal and Sinha (1996), Venugopal and Tarko (2000), Khattak et al. (2002), and others. The crash frequency during
us
nighttime was less than that of daytime. The elasticity of the coefficient in the MENB model suggested crash occurrence rate was 48 percent and 80 percent less at nighttime for PDO crash and injury crash, respectively. The detailed information obtained from individual work zone project files provided us with
an
opportunities to examine additional factors than the ones considered in previous studies. First, two parameters related to work zone speed were specially considered in the models and the results suggest that more variations in the speed in work zones can result in more crashes. In addition, parameters that represent the type of roadway and the complexity of the work zone (in terms of the number of operated
M
lanes, closed lanes, number of intersections, and ramps within work zones) were investigated. The results imply that work zones on state highways tend to have higher crash occurrence rate than that of attributable to more work zone crashes.
d
the interstate highways. Increased complexity of work zones in terms of above factors was also
te
Because work zone length or configuration may change as the construction schedule progresses, using a fixed-length measurement throughout the work zone duration in the NB model was found to lead to bias
Ac ce p
on the estimation of the impact of work zone length. It also affects the estimation of other explanatory variables such as traffic volume and number of lane closure. By considering the measurement errors, the MENB model provides a better fit to the data than the traditional NB model, as indicated by DIC and cross validation tests. Such results confirmed the findings of the simulation study. Despite better performance, the proposed MENB model is not meant to justify the collection of low-quality data. If the work zones were not subject to such errors in length measurements, both models can provide comparable findings. The CF model proposed in this study can be enhanced in several ways. The proposed CF model can be improved to account for measurement errors related to other explanatory variables. For instance, traffic volume used in the model might not be precise enough. The best way is to collect traffic data during the actual time period when the work zone exists. Comparisons with other approaches such as random effect model would be helpful to further examine the performance of the MENB model. Moreover, statistical distributions other than normal distribution can be tested to test the existence of better representations of the measurement errors in variables. In addition, other factors such as work zone setup (traffic control plans) are known to influence crashes and congestion. Therefore, in the future they should be carefully addressed given the availability of such valuable information. 18
Page 18 of 22
Acknowledgements The authors would like to thank the New Jersey Department of Transportation for providing the data necessary in this paper. The authors also appreciate the valuable comments from the anonymous reviewers to improve this paper. The contents of this paper reflect views of the authors who are responsible for the facts and accuracy of the data presented herein. The contents of the paper do not
ip t
necessarily reflect the official views or policies of the agencies.
cr
References
Ac ce p
te
d
M
an
us
Abdel-Aty, M.A., Radwan, A.E., 2000. Modeling traffic accident occurrence and involvement. Accident Analysis & Prevention 32 (6), 633-642. Akepati, S.R., Dissanayake, S., 2011. Characteristics and contributory factors of work zone crashes. Transportation Research Board 90th Annual Meeting, CD-ROM, Transportation Research Board of the National Academies, Washington, D.C. Bourne, J.S., Scriba, T.A., Eng, C., Lipps, R.D., Ullman, G.L., Markow, D.L., Matthews, K.C., Gomez, D., Holstein, D.L., Zimmerman, B., Stargell, R., 2010. Best practices in work zone assessment, data collection, and performance evaluation. Scan Team Report (Scan 08-04), NCHRP Project 20 68A U.S. Domestic Scan. National Cooperative Highway Research Program (NCHRP). Brooks, S.P., Gelman, A., 1998. Alternative methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics 7 (4), 434-455. Bryden, J.E., Andrew, L.B., Fortuniewicz, J.S., 1998. Work zone traffic accidents involving traffic control devices, safety features, and construction operations. Transportation Research Record: Journal of the Transportation Research Board No. 1650, 71-81. Burnham, K.P., Anderson, D.R., 2002. Model selection and multimodel inference: A practical information theoretic approach, 2nd ed. Springer, New York. Bushman, R., Chan, J., Berthelot, C., 2005. Characteristics of work zone crashes and fatalities in canada. In: Proceedings of the Canada Multidisciplinary Road Safety Conference XV, Fredericton, NB., June 5-8, 2005. Casteel, D.B., Ullman, G.L., 1992. Accidents at entrance ramps in long-term construction work zones. Transportation Research Record: Journal of the Transportation Research Board No. 1352, 46-55. Chambless, J., Chadiali, A.M., Lindly, J.K., Mcfadden, J., 2002. Multistate work-zone crash characteristics. ITE Journal, Institute of Transportation Engineers 72 (5), 46-50. Chien, S., Schonfeld, P., 2001. Optimal work zone lengths for four-lane highways. Journal of Transportation Engineering 127 (2), 124-131. Daniel, J., Dixon, K., Jared, D., 2000. Analysis of fatal crashes in georgia work zones. Transportation Research Record: Journal of the Transportation Research Board No. 1715, 18-23. Dissanayake, S., Akepati, S.R., 2009. Characteristics of work zone crashes in the swzdi region: Differences and similarities. In: Proceedings of the 2009 Mid-Continent Transportation Research Symposium, Ames, Iowa, August 2009. El-Basyouny, K., Sayed, T., 2009. Accident prediction models with random corridor parameters. Accident Analysis & Prevention 41 (5), 1118-1123. El-Basyouny, K., Sayed, T., 2010. Safety performance functions with measurement errors in traffic volume. Safety Science 48 (10), 1339-1344. Elias, A.M., Herbsman, Z.J., 2000. Risk analysis techniques for safety evaluation of highway work zones. Transportation Research Record: Journal of the Transportation Research Board No. 1715, 10-17. FHWA, 2001. Highway statistics 2001. Publication No. FHWA-PL-02-020. U.S. Department of Transportation, Federal Highway Administration, Washington, D.C. FHWA, 2011. Safe driving, safer work zones: National work zone awareness week 2011. FOCUS, March, 4-5. Garber, N.J., Woo, T.H., 1990. Accident characteristics at construction and maintenance zones in urban areas. Report VTRC 90-R12. Virginia Transportation Research Council, Charlottesville. 19
Page 19 of 22
Ac ce p
te
d
M
an
us
cr
ip t
Garber, N.J., Zhao, M., 2002. Distribution and characteristics of crashes at different work zone locations in virginia. Transportation Research Record: Journal of the Transportation Research Board No. 1794, 19-25. Gelfand, A.E., 1996. Model determination using sampling-based methods. Markov chain monte carlo in practice, chapter 9. Chapman & Hall, London, UK, pp. 145-161. Gelman, A., Meng, X.L., Stern, H., 1996. Posterior predictive assessment of model fitness via realized discrepancies (with discussion). Statistica Sinica 6 (4), 733-807. Graham, J.L., Paulsen, R.J., Glennon, J.C., 1978. Accident analyses of highway construction zones. Transportation Research Record: Journal of the Transportation Research Board No. 693, 25-32. Hall, J.W., Lorenz, V.M., 1989. Characteristics of construction-zone accidents. Transportation Research Record: Journal of the Transportation Research Board No. 1230, 20-27. Harb, R., Radwan, E., Yan, X., Abdel-Aty, M., Pande, A., 2008a. Environmental, driver and vehicle risk analysis for freeway work zone crashes. ITE Journal, Institute of Transportation Engineers 78 (1), 26-30. Harb, R., Radwan, E., Yan, X., Pande, A., Abdel-Aty, M., 2008b. Freeway work-zone crash analysis and risk identification using multiple and conditional logistic regression. Journal of Transportation Engineering 134 (5), 203-214. Hargroves, B.T., 1981. Vehicle accidents in highway work zones. Journal of Transportation Engineering 107 (5), 525-539. Hauer, E., 2001. Overdispersion in modelling accidents on road sections and in empirical bayes estimation. Accident Analysis & Prevention 33 (6), 799-808. Jin, T.G., Saito, M., L., E.D., 2008. Statistical comparisons of the crash characteristics on highways between construction time and non-construction time. Accident Analysis and Prevention 40 (6), 2015-2023. Khattak, A.J., Khattak, A.J., Council, F.M., 1999. Analysis of injury and non-injury crashes in california work zones. Transportation Research Board 78th Annual Meeting, CD-ROM, Transportation Research Board of the National Academies, Washington, D.C. Khattak, A.J., Khattak, A.J., Council, F.M., 2002. Effects of work zone presence on injury and non-injury crashes. Accident Analysis and Prevention 34 (1), 19-29. Lan, B., Persaud, B., Lyon, C., Bhim, R., 2009. Validation of a full bayes methodology for observational before-after road safety studies and application to evaluation of rural signal conversions. Accident Analysis & Prevention 41 (3), 574-580. Lee, J., Mannering, F., 2002. Impact of roadside features on the frequency and severity of run-offroadway accidents: An empirical analysis. Accident Analysis & Prevention 34 (2), 149-161. Li, Y., Bai, Y., 2008. Comparison of characteristics between fatal and injury accidents in the highway construction zones. Safety Science 46 (4), 646-660. Lin, H., Pignataro, L.J., He, Y., 1997. Analysis of truck accident reports in work zones in new jersey. Final Report. The National Center for Transportation and Industrial Productivity, Institute for Transportation, New Jersey Institute of Technology, New Jersey. Lord, D., Washington, S.P., Ivan, J.N., 2005. Poisson, poisson-gamma and zero inflated regression models of motor vehicle crashes: Balancing statistical fit and theory. Accident Analysis & Prevention 37 (1), 35-46. Lundevaller, E.H., 2006. Measurement errors in poisson regressions: A simulation study based on travel frequency data. Journal of Transportation and Statistics 9 (1), 39-47. Miaou, S., Hu, P., Wright, T., Rathi, A., Davis, S., 1992. Relationship between truck accidents and highway geometric design: A poisson regression approach. Transportation Research Record No. 1376, 10-18. Mitra, S., Washington, S.P., 2007. On the nature of over-dispersion in motor vehicle crash prediction models. Accident Analysis & Prevention 39 (3), 459-468. Müngen, U., Gürcanli, G.E., 2005. Fatal traffic accidents in the turkish construction industry. Safety Science 43 (5-6), 299-322. Nemeth, Z.A., Migletz, D.J., 1978. Accident characteristics before, during, and after safety upgrading projects on ohio’s rural interstate system. Transportation Research Record: Journal of the Transportation Research Board No. 672, 19-24. 20
Page 20 of 22
Ac ce p
te
d
M
an
us
cr
ip t
Nemeth, Z.A., Rathi, A., 1983. Freeway work zone accident characteristics. Transportation Quarterly 37 (1), 145-159. Pal, R., Sinha, K.C., 1996. Analysis of crash rates at interstate work zones in indiana. Transportation Research Record: Journal of the Transportation Research Board No. 1529, 43-53. Paulsen, R.J., Harwood, D.W., Graham, J.L., Glennon, J.C., 1978. Status of traffic safety in highway construction zones. Transportation Research Record: Journal of the Transportation Research Board No. 693, 6-12. Pigman, J.G., Agent, K.R., 1990. Highway accidents in construction and maintenance work zone. Transportation Research Record: Journal of the Transportation Research Board No. 1270, 12-21. Qi, Y., Srinivasan, R., Teng, H., Baker, R.F., 2005. Frequency of work zone accidents on construction projects. Report No: 55657-03-15. University Transportation Research Center City College of New York, New York. Raub, R., Sawaya, O., Schofer, J., Ziliaskopoulos, A., 2001. Enhanced crash reporting to explore workzone crash patterns. Transportation Research Board 80th Annual Meeting, CD-ROM, Transportation Research Board of the National Academies, Washington, D.C. Rouphail, N.M., Yang, Z.S., Fazio, J., 1988. Comparative study of short- and long-term urban freeway work zones. Transportation Research Record: Journal of the Transportation Research Board No. 1163, 4-14. Salem, O.M., Genaidy, A.M., Wei, H., N., D., 2006. Spatial distribution and characteristics of accident crashes at work zones of interstate freeways in ohio. In: Proceedings of the 2006 IEEE Intelligent Transportation Systems Conference, Toronto, Canada, September 17-20, pp.1642-1647. Schrock, S.D., Ullman, G.L., Cothron, A.S., Kraus, E., Voigt, A.P., 2004. An analysis of fatal work zone crashes in texas. Report No.: FHWA/TX-05/0-4028-1. Texas Transportation Institute, Texas. Shankar, V., Mannering, F., Barfield, W., 1995. Effect of roadway geometrics and environmental factors on rural freeway accident frequencies. Accident Analysis & Prevention 27 (3), 371-389. Sorock, G.S., Ranney, T.A., Lehto, M.R., 1996. Motor vehicle crashes in roadway construction workzones: An analysis using narrative text from insurance claims. Accident Analysis and Prevention 28 (1), 131-138. Spiegelhalter, D., Best, N., Carlin, B.P., Van Der Linde, A., 2002. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society Series B (Statistical Methodology) 64 (4), 583-639. Srinivasan, R., Ullman, G.L., Finley, M.D., Council, F.M., 2011. Use of empirical bayesian methods to estimate temporal-based crash modification factors for construction zones. Transportation Research Board 90th Annual Meeting, Transportation Research Board of the National Academies, Washington, D.C. Srinivasan, S., Carrick, G., Zhu, X., Heaslip, K., Washburn, S., 2008. Analysis of crashes in freeway work zone queues: A case study. Report No: 07-UF-R-S3. Transportation Research Center, University of Florida, Florida. Udoka, S.J., 2005. Evaluation of traffic safety in construction work zones. Urban Transit Institute, Transportation Institute, North Carolina Agricultural and Technical State University, Greensboro, NC. Ullman, G.L., Finley, M.D., Bryden, J.E., Srinivasan, R., Council, F.M., 2008. Traffic safety evaluation of nighttime and daytime work zones. NCHRP Report 627. Transportation Research Board, Washington, D.C. Ullman, G.L., Finley, M.D., Ullman, B.R., 2004a. Assessing the safety impacts of active night work zones in texas. Report No. FHWA/TX-05/0-4747-1. Texas Transportation Institute, The Texas A&M University System, College Station, Texas. Ullman, G.L., Holick, A.J., Scriba, T.A., Turner, S.M., 2004b. Estimates of work zone exposure on the national highway system in 2001. Transportation Research Record: Journal of the Transportation Research Board No. 1877, 62-68. Ullman, G.L., Porter, R.J., Karkee, G.J., 2009. Monitoring work zone safety and mobility impacts in texas. Report No.: FHWA/TX-09/0-5771-1. Texas Transportation Institute,The Texas A&M University System, College Station, Texas. 21
Page 21 of 22
Ac ce p
te
d
M
an
us
cr
ip t
Ullman, G.L., Ullman, B.R., Finley, M.D., 2006. Analysis of crashes at active night work zones in texas. Transportation Research Board 85th Annual Meeting, CD-ROM, Transportation Research Board of the National Academies, Washington, D.C. Venugopal, S., Tarko, A., 2000. Safety models for rural freeway work zones. Transportation Research Record: Journal of the Transportation Research Board No. 1715, 1-9. Wang, J., Hughes, W.E., Council, F.M., Paniati, J.F., 1996. Investigation of highway work zone crashes: What we know and what we don’t know. Transportation Research Record: Journal of the Transportation Research Board No. 1529, 54-62. Xing, J., Takahashi, H., Iida, K., 2010. Analysis of bottleneck capacity and traffic safety in japanese expressway work zones. Transportation Research Board 89th Annual Meeting, CD-ROM, Transportation Research Board of the National Academies, Washington, D.C. Yanmaz-Tuzel, O., Ozbay, K., 2010. A comparative full bayesian before-and-after analysis and application to urban road safety countermeasures in new jersey. Accident Analysis & Prevention 42 (6), 2099-2107. Zhao, M., Garber, N.J., 2001. Crash characteristics at work zones. Report No. UVA/29472/CE01/100. Center for Transportation Studies, University of Virginia, Charlottesville, VA. Zhu, X., Srinivasan, S., Carrick, G., Heaslip, K., 2010. Modeling the location of crashes within work zones: Methodology and a case-study application. Transportation Research Board 89th Annual Meeting, CD-ROM, Transportation Research Board of the National Academies, Washington, D.C.
22
Page 22 of 22