Dec 27, 2011 - Sumaila, AbdulGaniyu Femi. 2013. Road crashes .... Otedola. Estate,Ik eja. A car. Num b er killed/injured not stated. Lost con trol, sumersault.
With parameter $\gamma > 0$ and scale parameter $\rho > 0$, the log-logistic hazard function was as follows:

$$h(t) = \frac{\gamma\rho\,(\gamma t)^{\rho-1}}{1 + (\gamma t)^{\rho}} \tag{2.3.3}$$

The equation suggests that the hazard is monotone decreasing from infinity if $\rho < 1$, and is monotone decreasing from $\gamma$ if $\rho = 1$. If $\rho > 1$, the hazard first increases from 0 to a maximum at $t = (\rho - 1)^{1/\rho}/\gamma$ and then decreases monotonically toward 0.

The road networks with the highest frequencies of accidents, and the most dangerous as measured by the number of accidents per mile, were identified. The duration of each accident (defined as the amount of time between the time the officer receives a report of an accident and the time he or she leaves the scene) was also obtained. This paper demonstrated how appropriate multivariate statistical models of accident frequency and duration can be used to isolate key relationships between site and condition characteristics and the frequency and duration of accidents. Such relationships can serve as a guide to uncover the underlying strengths and weaknesses of existing accident management systems, and can provide important direction for management system improvement. In terms of applying the methodology to other metropolitan areas, the key concern is the acquisition of an appropriate and accurate data set. It is important to note that, in the absence of data on roadway geometrics, specific accident management characteristics (such as trooper allocations) and traffic volumes, transferability among study zones within and between metropolitan areas is problematic. The Poisson model of accident frequency provides important direction as to where and when detection and response should be enhanced in terms of resource allocation. The log-logistic duration models provide information of a more provocative and suggestive nature in relation to accident management and clearing procedures.

To assess the frequency of accident occurrence, an appropriate statistical modelling technique is needed. Within this context, a Poisson distribution is a reasonable description of the number of traffic accidents in a given day.
The Poisson regression model can effectively overcome the problems that discrete, non-negative observations cause in normal linear regression analysis (Mannering, 1989). Poisson regression had previously been applied to accident frequency, and the suitability of the technique had been demonstrated theoretically and empirically by Jovanis and Chang (1986) and Maher and Mountain (1988). More recently, Roque and Cardoso (2014) compared 95-percentile confidence intervals for one-kilometre segments of dual carriageway, focusing on the lower and upper boundaries for the Poisson mean (λ). This work was extended to a comparison of 95-percentile confidence intervals for a one-kilometre section of dual carriageway, using the upper boundary for the Gamma mean (m) and the upper boundary for the predictive response (y).
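Returning to the log-logistic hazard in (2.3.3), the following is a minimal sketch of how its shape can be checked numerically. The function names and the parameter values are hypothetical and chosen only for illustration; they are not taken from the studies reviewed here.

```python
import math


def loglogistic_hazard(t, gamma, rho):
    """Log-logistic hazard h(t) as written in (2.3.3)."""
    if t <= 0 or gamma <= 0 or rho <= 0:
        raise ValueError("t, gamma and rho must all be positive")
    u = gamma * t
    return gamma * rho * u ** (rho - 1) / (1.0 + u ** rho)


def hazard_peak_time(gamma, rho):
    """Time at which the hazard peaks; an interior peak exists only for rho > 1."""
    if rho <= 1:
        raise ValueError("no interior maximum unless rho > 1")
    return (rho - 1) ** (1.0 / rho) / gamma


# Hypothetical parameter values, chosen only for illustration.
gamma, rho = 0.05, 2.0
t_star = hazard_peak_time(gamma, rho)
h_star = loglogistic_hazard(t_star, gamma, rho)
print(f"peak at t = {t_star:.2f}, hazard = {h_star:.4f}")
```

Evaluating the hazard slightly before and after `t_star` gives smaller values, which is the rise-then-fall pattern described for $\rho > 1$.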

2.3.2 Multivariate Regression Poisson Model

Crash frequency data were typically analyzed by severity level and crash type. To account for the cross-model correlation among dependent variables, the multivariate regression Poisson model has been widely employed. Ye et al. (2009) proposed using a multivariate regression Poisson model to analyze different crash types simultaneously. Furthermore, Yu and Abdel-Aty (2013) employed a correlated random effects Poisson model together with the multivariate regression Poisson model to reveal the factors contributing to crash occurrence for weekday and weekend crashes. Crash counts per segment for weekdays and weekends were modelled jointly. A 15-mile mountainous freeway on I-70 in Colorado was chosen as the study area. Different data sets had been organized and prepared for the aggregate and disaggregate analyses for the crash frequency models. Two data sets were utilized: (a) crash data (from January 2006 to April 2011) provided by the Colorado Department of Transportation and (b) roadway geometric characteristics data obtained from the roadway characteristics inventory. The 15-mile section had been split into 120 homogeneous segments (60 in each direction). The detailed homogeneous segmentation method can be found in a previous study by Ahmed et al. (2011). A total of 1,239 crashes were documented within the period, and each crash was assigned to a homogeneous segment according to its mile marker. Crashes that happened between Friday 9 p.m. and Sunday 9 p.m. were labelled as weekend crashes, while all other crashes were defined as weekday crashes. For the geometric characteristics, longitudinal grades had been grouped into four categories (0-2%, 2%-4%, 4%-6% and 6%-8%).

The Poisson regression model with multivariate normal heterogeneity was set up as:

$$Y_{it} \sim \mathrm{Poisson}(\lambda_{it}), \qquad t = 1, 2 \tag{2.3.4}$$

$$\log \lambda_{i1} = X_{i1}\beta + \varepsilon_{i1} = X_{i1}\beta + \delta_1 u_{1i} \tag{2.3.5}$$

$$\log \lambda_{i2} = X_{i2}\beta + \varepsilon_{i2} = X_{i2}\beta + \delta_2 u_{1i} + \delta_3 u_{2i} \tag{2.3.6}$$

where $Y_{it}$ was the crash count at segment $i$ ($i = 1, 2, \ldots, 120$; 60 segments in each direction), with $t = 1$ for weekdays and $t = 2$ for weekends. $u_{1i}$ and $u_{2i}$ were independent random variables following the standard normal distribution. $X_{it}$ was a vector of explanatory variables, and $\beta$ and $\delta_j$ ($j = 1, 2, 3$) were coefficients to be estimated. The correlation coefficient of $\varepsilon_1$ and $\varepsilon_2$ was calculated as:

$$\mathrm{Corr}(\varepsilon_1, \varepsilon_2) = \frac{\delta_1\delta_2}{\sqrt{\delta_1^2\,(\delta_2^2 + \delta_3^2)}} \tag{2.3.7}$$
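The correlation in (2.3.7) arises because the weekday and weekend error terms share the component $u_{1i}$. A minimal simulation sketch, assuming the error structure of (2.3.5) and (2.3.6) with independent standard normal $u_{1i}$ and $u_{2i}$, can be used to check the closed form. The coefficient values and function names below are hypothetical.

```python
import math
import random


def implied_correlation(d1, d2, d3):
    """Closed-form Corr(eps1, eps2) for eps1 = d1*u1 and eps2 = d2*u1 + d3*u2."""
    return d1 * d2 / math.sqrt(d1 ** 2 * (d2 ** 2 + d3 ** 2))


def simulated_correlation(d1, d2, d3, n=50_000, seed=0):
    """Monte Carlo estimate of the same correlation from simulated errors."""
    rng = random.Random(seed)
    e1, e2 = [], []
    for _ in range(n):
        u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        e1.append(d1 * u1)            # weekday error term
        e2.append(d2 * u1 + d3 * u2)  # weekend error term shares u1
    m1, m2 = sum(e1) / n, sum(e2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(e1, e2)) / n
    v1 = sum((a - m1) ** 2 for a in e1) / n
    v2 = sum((b - m2) ** 2 for b in e2) / n
    return cov / math.sqrt(v1 * v2)


# Hypothetical coefficients, for illustration only.
d1, d2, d3 = 0.5, 0.3, 0.4
exact = implied_correlation(d1, d2, d3)
approx = simulated_correlation(d1, d2, d3)
print(f"closed form: {exact:.3f}, simulated: {approx:.3f}")
```

Setting `d3 = 0` makes the two error terms perfectly correlated, while `d2 = 0` makes them independent, which shows how the three coefficients control the strength of the weekday and weekend link.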

Furthermore, assuming each segment shares the same random error for weekday and weekend crashes, the correlated random effects Poisson model can be set up as (2.3.4) and

$$\log \lambda_{it} = X_{it}\beta + b_i \tag{2.3.8}$$

where the correlated random effects could be set to follow a normal distribution

$$b_i \sim N\!\left(0, \frac{1}{a}\right) \tag{2.3.9}$$

and $a$ represents the precision parameter, which is specified with a gamma prior as $a \sim \mathrm{Gamma}(0.001, 0.00\ldots)$.

As more explanatory variables (geometric characteristics) were included in the study, the log-linear formulation provided a more straightforward approach to unveiling the cause-effect relationships between crash frequency and the explanatory variables. This study provided a systematic approach to investigating the different characteristics of weekday and weekend crashes. It was concluded that weekday crashes were more likely to occur during congested periods, while weekend crashes mostly occurred under free-flow conditions. Finally, real-time crash prediction models were developed as random effects Bayesian logistic regression models incorporating microscopic traffic data. Results of the real-time crash prediction models were consistent with the crash time propensity analysis. Furthermore, results from these models shed some light on future geometric improvements and traffic management strategies to improve traffic safety.

2.3.3 Poisson-Lognormal, Poisson-Gamma and Poisson-Lognormal with Conditional Autoregressive Priors Models

Many researchers have found the Poisson-based models (Poisson-lognormal, Poisson-gamma and Poisson-lognormal with CAR priors) suitable for RTC count data, and these have been widely used in the literature. Examples include the works of Shankar et al. (1995), Milton and Mannering (1998), Abdel-Aty and Radwan (2000), Lord (2000), Amoros et al. (2003), Miaou et al. (2003), Kim et al. (2006), Lord and Miranda-Moreno (2007), Aguero-Valverde and Jovanis (2008) and Quddus (2008).

Wang et al. (2009) explored the impact of traffic congestion (TC) on the frequency of Road Accidents (RA) using a spatial analysis technique while controlling for other relevant factors that may affect RA. Increased travel time caused by TC imposes costs on road users, both in terms of economic loss and in reduced quality of life and mobility. The two externalities, TC and RA, both impose a burden on society, and it is therefore important to reduce their impacts. An ideal solution would be to reduce them simultaneously, but this may not be possible. It was speculated that there may be an inverse relationship between TC and RA (Shefer and Rietveld, 1997). The hypothesis states that in a less congested road network the average speed of traffic would normally be high, which is likely to result in more serious injuries or fatalities.

In a congested road network, by contrast, traffic would be slower, which may cause fewer fatalities and serious injuries. Increased TC may lead to more accidents because of increased traffic volume; however, those accidents may be less severe. This suggests that the total external cost of accidents may be lower in a congested situation than in an uncongested one. This poses a potential dilemma for transport policy makers: TC appears to improve road safety, yet it reduces mobility, which subsequently decreases economic productivity. It is therefore important to understand the association between TC and RA so that effective policies can be implemented to control both congestion and road safety.

The M25 London orbital motorway was used as a case study and disaggregated into 70 road segments. Accident data and traffic characteristic data, such as traffic delay, traffic flow and average travel speed, were obtained for each road segment for the year 2006. Because of errors in accident location, and because the accident data and the spatial motorway network data were obtained from different sources, mismatches were highly likely. The perpendicular distance and the link direction can be obtained from the coordinates of the start and end nodes of a segment, and the direction of the vehicle just before the accident can be obtained from STATS19 data. A segment is more likely to be the correct segment if the distance is short and the angular difference is small. A weighting score was developed based on these two factors:

$$WS_i = \frac{1}{d_i} + \cos(\Delta\theta_i), \qquad d_i \neq 0 \tag{2.3.10}$$

where $d_i$ was the perpendicular distance (in metres) from an accident point to road segment $i$, and $\Delta\theta_i$ was the angular difference between the direction of an accident and the direction of link $i$ ($0$ to $180^\circ$). The minimum value of $d_i$ was set to one metre, so the $WS_i$ for a segment ranged from $-1$ to $+2$. If the $WS$ for a segment was high, it was considered the correct segment. The equation detailed by Taylor et al. (2000) was used to estimate the segment-level traffic Congestion Index (CI),

$$CI = \frac{T - T_0}{T_0} \tag{2.3.11}$$

where $T$ was the actual travel time and $T_0$ was the free-flow travel time on a particular road segment. The CI was dimensionless and independent of road segment length or road geometry, so it could be compared between different road segments. The CI was also non-negative: the higher the value, the higher the level of congestion. Free-flow travel time was calculated as average travel time minus average vehicle delay (weighted by traffic flow) for 2006. Average travel speed was the weighted (by traffic flow) harmonic mean of hourly speed data. Direction was a dummy variable, with 0 representing the anti-clockwise and 1 the clockwise direction.

The main objective was to develop a series of models to investigate the relationship between TC and the frequency of different RA. The base form of the Poisson model was expressed as:

$$Y_i \sim \mathrm{Poisson}(\mu_i) \tag{2.3.12}$$

$$\log(\mu_i) = \alpha + \beta X_i + v_i + u_i \tag{2.3.13}$$

where $Y_i$ was the observed number of accidents on road segment $i$; $\mu_i$ was the expected Poisson accident rate at road segment $i$; $\alpha$ was the intercept; $X_i$ was the vector of explanatory variables for road segment $i$; $\beta$ was the vector of coefficients to be estimated; $v_i$ was a random term capturing the heterogeneity effects for road segment $i$; and $u_i$ was a random term capturing the spatially correlated effects for road segment $i$. The models were estimated under a full hierarchical Bayesian framework using WinBUGS (Spiegelhalter et al., 2003). Models were differentiated by the specification of the random terms (i.e. $v_i$ and $u_i$). The specification of each model was as follows.

Poisson-lognormal model: the spatially correlated effects term $u_i$ was excluded in this model, to test the model with heterogeneity effects only. A uniform prior distribution was assigned to $\alpha$; a highly non-informative normal prior with zero mean and variance 100,000 was assigned to all $\beta$'s. The prior distribution for the uncorrelated heterogeneity term $v_i$ was a normal prior $N(0, \tau_v^2)$, where $\tau_v^2$ was the precision (1/variance) with a vague gamma prior Gamma(0.001, 0.001). A gamma distribution

$$\theta \sim \mathrm{Gamma}(a, b) \tag{2.3.14}$$

was defined with mean $E(\theta) = a/b$ and variance $\mathrm{var}(\theta) = a/b^2$.

Poisson-gamma model: the spatial correlation term ($u_i$) was excluded in this model. The term $\exp(v_i)$ was assigned a gamma prior, that is,

$$\exp(v_i) \sim \mathrm{Gamma}(\phi, \phi) \tag{2.3.15}$$

where $\phi$ was assigned a non-vague hyper-prior Gamma(0.1, 1.0), as suggested by Lord and Miranda-Moreno (2007). The same prior distributions were also assigned to $\alpha$ and the $\beta$'s. In earlier studies, Schluter et al. (1997) and Lord (2006) analyzed crash frequency data using the Poisson-gamma model.

Poisson-lognormal conditional autoregressive model: this model accommodated both heterogeneity and spatial correlation effects (i.e. $v_i$ and $u_i$). The same priors were assigned to $\alpha$, the $\beta$'s and $v_i$ as in the Poisson-lognormal model. The spatial correlation term $u_i$ was modelled with a CAR model proposed by Besag (1974):

$$u_i \mid u_j,\ i \neq j \;\sim\; N\!\left(\frac{\sum_j u_j w_{ij}}{w_{i+}},\ \frac{\tau_u^2}{w_{i+}}\right) \tag{2.3.16}$$

where $w_{ij}$ denoted the weight between road segments $i$ and $j$,

$$w_{i+} = \sum_j w_{ij} \tag{2.3.17}$$

and $\tau_u^2$ was a scale parameter assigned a gamma prior Gamma(0.5, 0.0005).

There were several methods to define the weights ($w_{ij}$) between road segments, depending on the neighbour structure considered. The weighting scheme could use contiguity-based weights, for example $w_{ij} = 1$ if spatial units $i$ and $j$ are adjacent (that is, they share a border and/or vertex) and $w_{ij} = 0$ otherwise. Alternatively, distance-based weights could be used: for instance, the shorter the distance between $i$ and $j$, the larger the weight ($w_{ij}$). As suggested by Aguero-Valverde and Jovanis (2008), two different neighbouring structures were considered: first-order neighbours and second-order neighbours. First-order neighbours were road segments $j$ directly connected to segment $i$, with $w_{ij} = 1$; second-order neighbours were road segments $j$ connected to first-order neighbours of segment $i$, with $w_{ij} = 1/2$; and $w_{ij} = 0$ if segments $i$ and $j$ were not neighbours of each other (first or second order).

All models discussed in this section were estimated using the Markov Chain Monte Carlo (MCMC) method under the full hierarchical Bayesian framework. The Deviance Information Criterion (DIC), which can be thought of as a generalization of the Akaike Information Criterion (AIC), was used to compare the goodness of fit and complexity of different models estimated under the Bayesian framework (Spiegelhalter et al., 2002). In terms of model fit and complexity, the lower the DIC, the better the model. For each of fatal, serious and slight injury accidents, two models were estimated for each of the four specifications: Poisson-lognormal; Poisson-gamma; Poisson-lognormal with CAR priors (first-order neighbours); and Poisson-lognormal with CAR priors (second-order neighbours). For fatal and serious injury accidents, CI showed the expected negative sign, suggesting that an increased level of congestion is associated with a decreased level of fatal and serious injury accidents.
However, this variable was found to be statistically insignificant in all forms of Poisson models for both categories of accidents. This means that the level of TC had no impact on the frequency of road accidents according to the data on the M25 London orbital motorway. Therefore, spatial differences in congestion among the road segments cannot explain the variation in RA. This result was in line with the findings of Noland and Quddus (2005), who investigated the association between congestion and RA in London. Their study was based on area-wide data and did not support the hypothesis of Shefer and Rietveld (1997). However, one limitation of the study of Wang et al. (2009) was that only road segments from the M25 London orbital motorway were included in the analysis, whereas many other motorways and major roads are connected to the M25. It is reasonable to believe that a study including all roads connected to the M25 motorway may provide a better understanding of the impact of TC on RA, as there would be more spatio-temporal variation in the level of TC and the frequency of RA. Moreover, the effects of RA at junctions on accidents should have been explored. Data for multiple years may be collected and a spatio-temporal analysis employed to ensure that time effects are controlled for.

The features of the Poisson distribution that made it so attractive in the past are less relevant now, given the widespread use of computers. Analysis of RA count data indicates that there are grounds for doubting the general validity of the Poisson assumption. It has been found that, at many locations, the pattern of RA occurrence is either too regular or too irregular to be well described by the Poisson process. There is too much variation in the variance/mean ratio for it to be attributed completely to chance. It may well be that the mechanism governing accident occurrence is not the same at all locations. An accident count probability distribution appropriate to each particular situation should be used. The procedure for analyzing temporal variations in RA occurrence at particular locations should take account of variations in the variability of accident counts. Nicholson (1985) therefore suggested that, rather than using the Poisson distribution in all situations, the choice should vary with the circumstances. This should depend upon the practical implications of using the Poisson distribution when the observed variance/mean ratio differs from unity. Two convenient alternative distributions are the Binomial distribution, if the variance/mean ratio is below 1.0, and the Negative Binomial (NB) distribution, if the variance/mean ratio is above 1.0. The most appropriate of the three distributions could then be fitted to the observations using the method of moments.
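The variance/mean rule attributed to Nicholson (1985) above can be sketched as follows. This is an illustrative outline only: the `tolerance` band around 1.0, the function names and the example counts are all hypothetical, and the NB moment fit assumes the common parameterization in which the variance equals mean + mean²/k.

```python
def variance_mean_ratio(counts):
    """Sample variance divided by sample mean for a list of accident counts."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("mean count is zero; the ratio is undefined")
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


def suggest_distribution(counts, tolerance=0.1):
    """Pick a candidate family from the variance/mean ratio.

    `tolerance` is a hypothetical band around 1.0 within which the Poisson
    assumption is kept; it is not a value taken from the literature.
    """
    ratio = variance_mean_ratio(counts)
    if ratio < 1.0 - tolerance:
        return "binomial"
    if ratio > 1.0 + tolerance:
        return "negative binomial"
    return "poisson"


def nb_moment_estimates(counts):
    """Method-of-moments NB fit, assuming variance = mean + mean**2 / k."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    if variance <= mean:
        raise ValueError("NB moments require variance > mean")
    return mean, mean ** 2 / (variance - mean)


# Hypothetical yearly counts at two sites, for illustration only.
irregular_site = [0, 1, 0, 7, 2, 0, 9, 1, 0, 5]
regular_site = [2, 2, 3, 2, 2, 3, 2, 2]
print(suggest_distribution(irregular_site), suggest_distribution(regular_site))
```

In practice the cut-off would need a formal dispersion test rather than a fixed band, since the sample ratio itself varies by chance.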

2.3.4 Negative Binomial Model

Poisson regression models were first introduced to analyze crash frequency data because crash counts are non-negative integers (Jovanis and Chang, 1986). Researchers benefited from the Poisson regression models because they are easy to estimate and straightforward to interpret. However, these models were criticized for their inability to handle over-dispersion. Extensions of the Poisson regression models, such as NB models (otherwise referred to as random effects Poisson models), were therefore employed to analyze crash frequency data. Shankar et al. (1998), Miaou and Lord (2003), Guo et al. (2010), Ahmed et al. (2011) and Yu et al. (2013) assumed a log-linear relationship between crash frequency and the explanatory variables.

Fridstrom and Ingebrigtsen (1991) studied the determinants of personal injury RA and their severity using monthly data for 18 Norwegian counties from January 1974 to December 1986. They examined the relationships between the incidence and severity of personal injury RA and a set of potentially relevant explanatory variables. The risk compensation hypothesis was central to their study. The theory of risk compensation states that drivers adjust their behaviour in response to exogenous changes in safety, in such a way that drivers exercise less care if more safety is built into their environment. Thus, the ex post (equilibrium) change in accident risk may be quite different from the ex ante (engineering) effect that would have occurred in the absence of behavioural adjustment; indeed, the two effects need not even have the same sign.

Let

$$y(r, t) \tag{2.3.18}$$

denote the number of accidents occurring within area $r$ during period $t$. Assume that the probability that an accident occurs in area $r$ during a given short time interval is constant throughout period $t$. In this case, the accident number can be shown to follow the Poisson probability law, with expected value (and variance) equal to $\lambda(r, t)$. In other words, the probability that $m$ accidents occur was given by the formula

$$P[y(r, t) = m] = \frac{[\lambda(r, t)]^{m}\, e^{-\lambda(r, t)}}{m!}, \qquad m = 0, 1, 2, \ldots \tag{2.3.19}$$

$x(r, t)$ represented a row vector of explanatory variables $x_i(r, t)$ ($i = 1, 2, \ldots, I$), with $\alpha$ as a column vector of parameters $\alpha_i$. It was assumed that the expected number of accidents depended on the explanatory variables in the following way:

$$\lambda(r, t) = \exp[x(r, t)\cdot\alpha] \tag{2.3.20}$$

$$\phantom{\lambda(r, t)} = \prod_i [z_i(r, t)]^{\alpha_i} \tag{2.3.21}$$

where $z_i(r, t) = e^{x_i(r, t)}$. The exponential formulation (2.3.20) was a natural choice to ensure that the expected number of accidents is always positive. From (2.3.21), it also follows that the expected number of accidents is a multiplicative function of the transformed independent variables $z_i(r, t)$. If $x_i(r, t)$ is measured on a logarithmic scale, then $z_i(r, t)$ is a linear measure and the parameter $\alpha_i$ is interpretable as a constant accident elasticity.

The term $\lambda(r, t) = \exp[x(r, t)\cdot\alpha]$ in (2.3.20), being indexed by $r$ and $t$, varied across the data set and contained the systematic variation in casualty counts, that is, the variation in the expected number of casualties, or in the long-term average number of casualties given constant exposure and risk. When all sources of systematic variation are identified, so that all the relevant variables are included in the vector $x(r, t)$ of (2.3.20) and the correct values have been assigned to the parameter vector $\alpha$, all the units of observation for which $x(r, t)$ takes on a given value will be homogeneous with respect to the expected number of casualties. The disturbance term $y(r, t) - \lambda(r, t)$ then consists of pure random variation, such that the Poisson probability law applies to $y(r, t)$.

One is seldom in the fortunate situation that all risk factors have been identified and their effects correctly estimated. A more realistic probability model was obtained by specifying (Gourieroux et al., 1984)

$$\lambda(r, t) = \exp[x(r, t)\cdot\alpha + u(r, t)] \tag{2.3.22}$$

where $u(r, t)$ was a random variable. It can be shown that when $\exp[u(r, t)]$ or, equivalently, $\lambda(r, t)$ follows a gamma distribution, the accident number $y(r, t)$ has an NB distribution with expected value

$$\mu(r, t) = \exp[x(r, t)\cdot\alpha] \tag{2.3.23}$$

and variance

$$\sigma^2(r, t) = \mu(r, t)\left[1 + \mu(r, t)/\xi\right] \tag{2.3.24}$$

$\xi$ being the shape parameter of the gamma distribution (Greenwood and Yule, 1920). Note that in the NB model $\sigma^2(r, t) > \mu(r, t)$, that is, the variance exceeds the mean.

The study took advantage of a method suggested by Breslow (1984) and elaborated by Bennett (1989). By means of the GLIM computer package, a quasi-likelihood estimate was computed based on the first- and second-order moments. Starting from an initial ML estimate derived under the standard Poisson assumption, the authors exploited the over-dispersion present in the data set to derive an estimate of $\xi$, and hence of $\sigma^2(r, t)$, which in turn was used in an iterative reweighting procedure yielding a consistent estimate of $\alpha$ based on the NB specification. Probability models based on the NB distribution were estimated for six different dependent variables. A seventh model was a binomial logit model for the share of personal injury accidents that are fatal (a severity measure).

Independent variables describing exposure levels or the road variable in general were assumed to exhibit almost constant accident elasticities and were hence specified as logarithmic transformations. Variables exerting their effect through interaction with only a smaller part of the road users were, however, believed to exhibit increasing elasticities. Thus, for precipitation, an almost reciprocal functional form was found to have the largest explanatory power, whereas for daylight and alcohol consumption the optimal functional form turned out to be nearly linear. For other variables, such as alcohol sales and law enforcement measures, where there was reason to strongly believe that the cross-sectional effect would differ from the time-series effect, or at least that their interpretations would be different, the pooled effect was decomposed into cross-sectional and time-series components, such that equality between the two effects was contained as a testable special case.
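The Poisson-gamma construction behind (2.3.22) to (2.3.24) can be checked with a small simulation. The sketch below assumes that $\exp[u(r, t)]$ follows a gamma distribution with shape $\xi$ and mean 1 (an assumption consistent with the stated mean and variance, not a detail confirmed by the source), and uses hypothetical values for $\mu$ and $\xi$. The Poisson sampler is a simple textbook method suited only to small rates.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson variate by Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_nb_counts(mu, xi, n, seed=0):
    """Simulate counts y ~ Poisson(mu * g), with g ~ Gamma(shape=xi, mean 1)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        g = rng.gammavariate(xi, 1.0 / xi)  # shape xi, scale 1/xi, so E[g] = 1
        counts.append(poisson_draw(mu * g, rng))
    return counts


# Hypothetical values, for illustration only.
mu, xi = 3.0, 2.0
counts = simulate_nb_counts(mu, xi, n=50_000)
sample_mean = sum(counts) / len(counts)
sample_var = sum((c - sample_mean) ** 2 for c in counts) / (len(counts) - 1)
theory_var = mu * (1.0 + mu / xi)  # variance formula (2.3.24)
print(f"mean {sample_mean:.2f} vs {mu}, variance {sample_var:.2f} vs {theory_var}")
```

As $\xi$ grows, the gamma term concentrates around 1 and the simulated variance approaches the mean, which recovers the plain Poisson case.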

This study showed that the elasticity of road casualties with respect to exposure was of the order of 0.8 to 0.9. Although improved fuel economy over the observation period may account for a certain overestimation of the true exposure effect, the cross-sectional variation in gasoline sales dominated the time-series variation. On weather, snowfall appeared to have a negative effect on traffic casualties, that is, a favourable traffic safety effect. The expected number of casualties appeared to increase significantly with the amount of rainfall, suggesting that road users did not perceive the real increase in accident risk involved. This risk factor seemed to strike bicyclists and pedestrians harder than car occupants.

The numerical value of the parameter obtained for daylight suggested that as the amount of daylight during the rush hours increased from 0 to 240 minutes, the expected number of personal injury accidents was lowered by a factor which was equal to a percent, other things being equal. The road network measures showed that as the roads in a given county got more congested, fewer casualties seemed to occur. Fatalities decreased more than injuries, and non-occupant injuries more than occupant injuries. More heavily congested counties seemed to have fewer traffic deaths, but more non-occupant injuries. Neither extensions nor improvements of the existing national network seemed to have had the desired beneficial overall safety effect. For the independent variable accident reporting, the study identified only the combined effect of reporting and legislation. There was reason to believe that the reporting effect was by far the more important. The effects of vehicle inspection on traffic safety were generally not discernible in the data set. For roadside surprise vehicle inspections, the sign pattern was, in fact, such as to suggest that drivers do trade improved mechanical fitness for increased speed or reduced attention.

For law enforcement, results were more ambiguous. The time-series effect of convictions not due to drinking and driving was significant with the expected sign only with respect to fatal accidents and deaths. As for convictions due to alcohol use, no time-series effects were statistically significant. As expected, the cross-sectional effects came out with positive signs whenever significant. On driver experience, the proportion of new drivers on the road adversely affected traffic casualties, although on average the severity of accidents appeared inversely related to the number of inexperienced drivers. The analysis revealed a favourable and quite substantial effect of seat belts. The rise from, say, the 30% level typical of 1974 to the 85% level observed in 1980 was consistent with an approximately 21% reduction in personal injury accidents. The estimated effect on pedestrians and bicyclists was close to zero and statistically insignificant. Thus, with respect to seat belt use, there was no sign of a behavioural response on the part of drivers that would adversely affect the safety of other road users. Time-series data on the alcohol variable suggested that liquor consumption adversely affects the number of personal injury accidents, while the opposite was true of wine. The overall beer effect was insignificant, although positive in terms of fatal accidents. Results pertaining to cross-sectional variation were hard to interpret.

The approach of Peltzman (1975), used as a rigorous test of the risk compensation hypothesis, gave somewhat ambiguous results; it was applicable only to the seat belt use variable, where the evidence was clearly against compensation. As for roadside vehicle inspection, the sign pattern was, in fact, such as to suggest that drivers do trade improved mechanical fitness for increased speed or reduced attention. The credibility of this interpretation was, however, weakened by the fact that an almost opposite sign pattern was found for the more or less periodic vehicle inspection program, by which car owners were given notice to present their car at the premises of the vehicle inspectorate.

The authors concluded that accident data are, as a rule, eminently non-experimental, and the best any researcher can do is to mimic an experiment by means of a multivariate model. The more variables included in the analysis, the lower the probability of omitted variable bias. However, one must be careful not to treat endogenous factors as exogenous. Accidents appear to be random in a more fundamental sense than the behaviour of economic agents. Also, by changing our behaviour or our physical environment we can influence the probability that an accident will happen. The analyst working with multivariate statistical accident models will be in the very privileged position, compared with most econometric practitioners, of having excellent a priori knowledge of the shape of his error distribution. Provided he has been able to identify all sources of systematic variation, his disturbance terms should be Poisson distributed.
Otherwise, the over-dispersion visible in his data would warn him that one or more relevant independent variables were being omitted. Lastly, in many countries there is an abundance of accident data waiting to be analyzed by means of multivariate statistical models. The proper interpretation of such data can be of great help in the design of efficient countermeasures and is, literally speaking, a life-saving enterprise.

Among the more obvious objections to the approach taken in this study was the problem of aggregation error. Since the proportion of accidents in which alcohol may have played a role or the proportion of vehicle miles driven by drinking drivers was not known, any inference made about such a connection was largely unsubstantiated. Also, not all variables influencing the expected numbers of accidents had been included in the model. Some of such variables were non occupant exposure, variation in accident reporting, age and sex composition of the road user, population and vehicle crash worthiness. The Poisson or NB specification inherently took account of heteroscedasticity of accident counts but the problem of auto correlation was not dealt with. More recently, the NB model was extended to include mixed effects. Chun et al. (2014) summarized findings on road safety performance and bus-involved accidents in Melbourne along roads where bus priority measures had been applied. Results from an empirical analysis of the accident types revealed significant reduction in the proportion of accidents involving buses hitting stationary objects and vehicles, which suggests the effect of bus priority in addressing manoeuvrability issues for buses. A Mixed-Effects Negative Binomial (MENB) regression and Back-Propagation Neural Network (BPNN) modelling of bus accidents considering wider influences on accident rates at a route section level also revealed significant safety benefits when bus priority is provided. Sensitivity analyses done on the BPNN model showed general agreement in the predicted accident frequency between both models. The slightly better performance recorded by the MENB model results suggested merits in adopting a mixed effects modelling approach for accident count prediction in practice given its capability to account for unobserved location and time-specific factors. 
A major implication of this research was that bus priority in Melbourne’s context acted to improve road safety and should be a major consideration for road management agencies when implementing bus priority and road schemes.


2.3.5

Bayes Models

Apart from the diverse model formulations utilized to analyze crash frequency data, Bayesian inference techniques have become popular in recent RTC studies. Compared to the conventional frequentist inference approach, the Bayesian inference method provides a complete and coherent way to balance the empirical data and prior expectations, which can enhance the model's goodness of fit (Yu and Abdel-Aty, 2013). Another merit of the Bayesian inference approach is its ability to deal with multilevel traffic safety data (Huang and Abdel-Aty, 2010). Building on earlier studies and attempting to account for spatial autocorrelation, Aguero-Valverde and Jovanis (2008) developed models of road crash frequency for the state of Pennsylvania at the county level while controlling for socioeconomic, transportation related and environmental factors. They compared the results from Full Bayes (FB) hierarchical models with the more traditional approach using the NB distribution to model crash frequency. The problem of group estimation, namely estimating the parameters of a common distribution thought to underlie a collection of outcomes for similar types of units, had motivated much research in Bayesian statistics. One seeks to make conditional estimates of the true outcome rate in each unit of observation (e.g. fatal crash rate by county), given the parameters of the common density. Fatal and injury crash data were obtained from PennDOT (Bureau of Highway Safety and Traffic Engineering, 1991-2001). Risk factors, all measured at the county level, were obtained for the years 1996-2000. These were divided into three main categories, namely socioeconomic, transportation infrastructure related and environmental factors. The base model was:

$$ y_i \sim \text{Poisson}(e_i \theta_i) \tag{2.3.25} $$

where $y_i$ was the number of fatal crashes in county $i$, $\theta_i$ was the risk in county $i$, and $e_i$ the exposure in county $i$; in this case the exposure was the total daily vehicle-miles travelled by county. The log risk was modelled as

$$ \log(\theta_i) = \alpha + x_i \beta + v_i + u_i \tag{2.3.26} $$

where $x_i$ represented a vector of explanatory variables, or covariates, $\beta$ a vector of fixed effect parameters, $v_i$ the uncorrelated heterogeneity or unstructured error and $u_i$ the correlated heterogeneity. A uniform prior distribution was assigned to $\alpha$ and a highly non-informative normal distribution with mean 0 and variance 1000 was assigned to the $\beta$'s, corresponding to vague prior beliefs given the scale of the covariates. On the other hand, the prior distribution for the uncorrelated heterogeneity ($v_i$) was $N(0, \tau_v^2)$, where $\tau_v^2$ is the precision (precision = 1/variance) with a prior gamma distribution Ga(0.5, 0.0005) as suggested by Wakefield et al. (2000). As in the case of the $\beta$'s, this prior represents a highly non-informative distribution. A pair of areas was considered neighbours if they were adjacent; $u_i$ was the correlated heterogeneity reflecting a shared border. For the correlation term, the CAR model proposed by Besag (1974) was used:

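$$ u_i \mid u_j,\ i \neq j,\ \tau_u^2 \sim N\left(\bar{u}_i, \tau_i^2\right) \tag{2.3.27} $$

where

$$ \bar{u}_i = \frac{1}{\sum_j w_{ij}} \sum_j u_j w_{ij} \quad \text{and} \quad \tau_i^2 = \frac{\tau_u^2}{\sum_j w_{ij}}, $$

with $w_{ij} = 1$ if $i$ and $j$ were adjacent and 0 otherwise. As in the unstructured case, $\tau_u^2$ controls the variability of $u$, and its prior was selected as Ga(0.5, 0.0005). Extending the model of Bernardinelli et al. (1995) to include the effect of covariates:

$$ y_{ij} \sim \text{Poisson}(e_{ij} \theta_{ij}) \tag{2.3.28} $$

$$ \log(\theta_{ij}) = \alpha + \sum_k \beta_k x_{ijk} + v_i + u_i + (\phi + \delta_i)\, t_j \tag{2.3.29} $$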

where $y_{ij}$ was the observed number of crashes for the $i$th area, $i = 1, ..., N$, and the $j$th time interval, $j = 1, ..., T$, $\alpha$ the constant term, $x_{ijk}$ the $k$th covariate for the $i$th area in the $j$th interval, $\beta_k$ the regression coefficient, $v_i$ the uncorrelated heterogeneity, $u_i$ the correlated heterogeneity, $\phi$ the mean linear time trend over all areas, $t_j$ the time interval $j$ and $\delta_i$ the interaction between time and area effect. For comparison of the different Bayesian hierarchical models, the Deviance Information Criterion (DIC) proposed by Spiegelhalter et al. (2002) was used. DIC was defined as an estimate of fit plus twice the effective number of parameters:

$$ \text{DIC} = D(\bar{\theta}) + 2P_D = \bar{D} + P_D \tag{2.3.30} $$

where $D(\bar{\theta})$ was the deviance evaluated at $\bar{\theta}$, the posterior means of the parameters of interest, $P_D$ the effective number of parameters in the model, and $\bar{D}$ the posterior mean of the deviance statistic $D(\theta)$.

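The DIC in equation (2.3.30) can be computed directly from posterior draws. The following is a minimal sketch, assuming the deviance is taken as minus twice the Poisson log-likelihood and that posterior draws of the risks $\theta_i$ are already available; the function names and the small data set are hypothetical and serve only as illustration.

```python
import math


def poisson_deviance(y, mu):
    """Deviance taken as -2 * Poisson log-likelihood (an assumption of this sketch)."""
    loglik = sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
                 for yi, mi in zip(y, mu))
    return -2.0 * loglik


def dic_summary(y, exposure, theta_draws):
    """Return D-bar, D(theta-bar), P_D and DIC for y_i ~ Poisson(e_i * theta_i).

    theta_draws is a list of posterior draws; each draw is a list of risks theta_i.
    """
    n_draws = len(theta_draws)
    n_units = len(y)
    # Posterior mean of the deviance, D-bar
    deviances = [
        poisson_deviance(y, [e * t for e, t in zip(exposure, draw)])
        for draw in theta_draws
    ]
    d_bar = sum(deviances) / n_draws
    # Deviance evaluated at the posterior means of the risks, D(theta-bar)
    theta_bar = [sum(draw[i] for draw in theta_draws) / n_draws
                 for i in range(n_units)]
    d_at_mean = poisson_deviance(y, [e * t for e, t in zip(exposure, theta_bar)])
    p_d = d_bar - d_at_mean          # effective number of parameters
    dic = d_at_mean + 2.0 * p_d      # equals d_bar + p_d
    return {"D_bar": d_bar, "D_theta_bar": d_at_mean, "P_D": p_d, "DIC": dic}


# Hypothetical crash counts, exposures and three posterior draws of the risks
y = [3, 5, 2]
exposure = [1.0, 2.0, 1.5]
theta_draws = [[2.5, 2.0, 1.0], [3.5, 3.0, 1.8], [3.0, 2.5, 1.4]]
result = dic_summary(y, exposure, theta_draws)
```

Models with a smaller DIC are preferred; both forms of (2.3.30) give the same value because $P_D = \bar{D} - D(\bar{\theta})$.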

Results concerning the effects of the covariates on fatal and injury crash risk were mostly consistent in direction and magnitude for the NB and FB models. In general, highly significant variables in the NB models were also significant in the FB models. On the other hand, variables that were only marginally significant in the NB models were generally non-significant in the FB models. Because the FB models took into consideration all sources of uncertainty, the authors believed that the FB models more accurately associated covariates with crash risk and were better suited for this type of data. Focusing on the Bayes approach, Jiang et al. (2014) employed random effect Poisson Log-Normal models for crash risk hotspot identification. Potential for Safety Improvement was adopted as a measure of the crash risk. Using the fatal and injury crashes that occurred on urban 4-lane divided arterials from 2006 to 2009 in the Central Florida area, the random effect approaches were compared to the traditional Empirical Bayes (EB) method and the conventional Bayesian Poisson Log-Normal (BPLN) model. A series of method examination tests were conducted to evaluate the performance of the different approaches. These tests included the previously developed site consistence test, method consistence test, total rank difference test, and the modified total score test, as well as the newly proposed total safety performance measure difference test. Results showed that the Bayesian Poisson model accounting for both Temporal and Spatial Random Effects (BPTSRE) outperformed the model with only a temporal random effect, and both were superior to the conventional BPLN model and the EB model in the fitting of crash data. Additionally, the method evaluation tests indicated that the BPTSRE model was significantly superior to the BPLN model and the EB model in consistently identifying hotspots during successive time periods.
The results suggested that the BPTSRE model was a superior alternative for road site crash risk hotspot identification.

2.3.6

Random Effects Logistic Regression Model

Random effects logistic regression models were utilized by Yu and Abdel-Aty (2013) to identify different occurrence mechanisms for weekday and weekend crashes. Suppose the weekday and weekend crashes have the outcomes y = 1 and y = 0 with respective probabilities p and 1 − p. The random effects logistic regression was set up as follows:

$$ y \sim \text{Binomial}(p) \tag{2.3.31} $$

$$ \text{logit}(p) = \log\left(\frac{p}{1-p}\right) = X\beta + u_{j(i)} \tag{2.3.32} $$

where $X$ was the vector of the explanatory variables, and $\beta$ was the vector of coefficients for the explanatory variables. $u_j$ was the random effects variable defined in the model, which represented the segment-specific or grade-group-specific random error in this study. These random effects would account for the unobserved geometric factors (especially for the grades). Random effects were set to follow a normal distribution:

$$ u_j \sim N\left(0, \frac{1}{\tau}\right) \tag{2.3.33} $$

where τ was the precision parameter, which was assigned a gamma prior, τ ∼ Gamma (0.001, 0.001). For the coefficients of the explanatory variables, non-informative priors were set to follow a normal distribution (normal (0, 0.001)). In the crash time propensity model, random effects were formulated based on segments (120 segments), while in the real-time crash prediction models, random effects were defined by longitudinal grade group (4 grade groups). The findings complement the results in Section 2.3.2.
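The structure of equations (2.3.31) to (2.3.33) can be illustrated with a short simulation. This is only a sketch under stated assumptions: the coefficient values, the covariate vector and the precision $\tau = 4$ are hypothetical, and the random effects are drawn with standard deviation $1/\sqrt{\tau}$ because $\tau$ is a precision.

```python
import math
import random


def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def simulate_random_effects(n_groups, tau, rng):
    """Draw u_j ~ N(0, 1/tau); tau is a precision, so sd = 1/sqrt(tau)."""
    sd = 1.0 / math.sqrt(tau)
    return [rng.gauss(0.0, sd) for _ in range(n_groups)]


def crash_probability(x, beta, u_j):
    """p from logit(p) = X beta + u_j."""
    eta = sum(b * xi for b, xi in zip(beta, x)) + u_j
    return inv_logit(eta)


rng = random.Random(42)
u = simulate_random_effects(n_groups=4, tau=4.0, rng=rng)  # e.g. 4 grade groups
beta = [-0.5, 0.8]   # hypothetical intercept and slope
x = [1.0, 0.3]       # hypothetical covariate vector (leading 1 for the intercept)
probs = [crash_probability(x, beta, u_j) for u_j in u]
```

Each group shares the same fixed effects but receives its own shift $u_j$ on the log-odds scale, which is how unobserved segment or grade-group characteristics enter the model.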

2.3.7

Multiple Regression Model

The multiple regression model had been employed by several researchers to investigate RTC. The model was set as follows:

$$ y = X\beta + \varepsilon \tag{2.3.34} $$

where $y$ represented the number of accidents, $X$ represented the explanatory variables, $\beta$ represented the parameter estimates and $\varepsilon$ represented the disturbance term.

A peculiar feature of road transport development in Nigeria is the attendant increase in road mishaps. Amidst visible development in road networks, and an apparent increase in vehicular traffic, it has become evident that the incidence of motor accidents is also on the increase. One major aspect of RTC in Nigeria that needs to be brought into focus is the wide disparity in not only the frequency but also the fatality of accidents occurring over the country's geographical space. Jegede (1988) examined the general features of RTC occurrence in Oyo State and undertook a critical analysis of both the temporal and spatial dimensions of the problem. According to the existing police operational formations, there were 25 divisional police posts serving the 24 Local Government Areas (LGA) of Oyo State. Each LGA constituted a police division except Obokun LGA, which had two police divisions. This accident data collection system by the police on a divisional basis provided the opportunity to consider each police division as a traffic zone, following a comparative spatial analysis of occurrence of accidents

among the traffic zones. Accident data from 1980 to 1984 were obtained. Accident statistics in Oyo state showed spectacular differences in not only the total number of accidents but also in casualties recorded among the traffic zones. An analysis of variance (ANOVA) statistics proved that the variations in both the number of accidents recorded by types and number of accidents were significant even at 0.001 probability levels. Spatial distribution specifically showed high rate of motor accidents in Moniya, Oyo and Ogbomoso. An observation of the spatial patterns of casualties recorded from accidents, however, showed a slightly different picture. Generally speaking, the western part of the state, comprising Kisi, Saki, Okeho, Iseyin and Eruwa traffic zones, recorded lower number of accidents and accident casualties relative to the Eastern part. Specifically in terms of deaths on highways, Ibadan, Ife and Oyo constituted the danger zones for the state while Ikire and Ogbomoso were also very unsafe. In an attempt to explore the processes that accounted for the observed spatial pattern of RTC in Oyo State for the period under investigation a multiple regression analysis was carried out on selected accident related variables. Twelve explanatory variables were analyzed in relation to the number of accidents recorded in each traffic zone. These variables were population size, land area, total length of motorable road, length of trunk A roads, length of trunk B roads, length of bituminous roads, length of earth roads, average daily traffic, number of motor vehicles registered, number of industrial establishments and level of urbanization (number of villages). The result of the regression analysis on these variables showed that 11 of the 12 variables contributed to the explanation of RA. The number of industrial establishments in each traffic zone contributed the highest value to the observed pattern. 
This was closely followed by population size, then the length of trunk C roads. The Specific Seasonal Index (SSI) for RTC occurrence in Oyo State showed all the indices for the month of December to be relatively higher than those of other months. The monthly returns of RTC were used to analyze seasonal variations through the computation of an SSI by the ratio-to-moving-average method. It was noted that seasonal indices for the months of March, April and September were also relatively high. All months other than December, March, April and September had fluctuating seasonal indices of high and low values. This study showed that there exists a significant variation in the frequency and fatality of road traffic accidents among the 24 zones in the state, as well as a strong positive correlation between the number of accidents recorded and, respectively, the number of industrial establishments, average daily traffic, and the number of vehicles registered or licensed in each traffic zone. One policy implication is that greater road safety attention should be focused on the traffic zones designated as accident black spots in the state. A spatial model to account for individual specific effects and temporal effects was suggested for future research studies.
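The ratio-to-moving-average computation of a seasonal index mentioned above can be sketched as follows. This is a generic textbook version, not necessarily the exact procedure used in the study: it assumes a monthly series that starts in January, a centred 12-month moving average, and indices scaled to average 100. The monthly counts are hypothetical.

```python
def centred_moving_average(series, period=12):
    """Centred moving average for an even period (half weights on the two end points)."""
    half = period // 2
    cma = {}
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]   # period + 1 values
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        cma[t] = total / period
    return cma


def seasonal_indices(series, period=12):
    """Seasonal index per month: mean of 100 * observation / centred moving average."""
    cma = centred_moving_average(series, period)
    ratios = {m: [] for m in range(period)}
    for t, avg in cma.items():
        ratios[t % period].append(100.0 * series[t] / avg)
    raw = [sum(ratios[m]) / len(ratios[m]) for m in range(period)]
    # Rescale so that the indices average exactly 100
    scale = 100.0 * period / sum(raw)
    return [r * scale for r in raw]


# Hypothetical monthly crash counts (January to December), repeated for three years
one_year = [100, 95, 110, 108, 98, 96, 97, 99, 107, 100, 102, 130]
monthly_counts = one_year * 3
indices = seasonal_indices(monthly_counts)
peak_month = indices.index(max(indices))   # 0 = January, 11 = December
```

An index above 100 marks a month with more crashes than the annual average; at least two full years of data are needed so that every month has a ratio.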

A similar study is the work of Aderamo (2012). In a bid to proffer measures to reduce the scourge of road traffic casualties, the spatial variation of road traffic casualties in Nigeria was examined. The multiple regression models developed relate the total number of RA, population estimates, length of roads and number of registered vehicles for Nigeria. The regression result revealed that motor vehicle deaths had a positive association with population estimate and length of roads. Also, population estimate and length of roads had a significant effect on motor vehicle injuries.

The Tokyo-Kobe expressway is the oldest and busiest intercity motorway in Japan and is 500 km in length. A total of approximately 30,000 traffic accidents took place on the expressway from 1970 to 1975. The accident rates as well as the geometric design elements of each 100 m section of the road had been recorded. The total stretch of 1,000 km (500 km in each direction) of the expressway was divided into segments to which the regression analysis was applied. Regression analysis was first applied directly to the set of 100 m segments, which were the original data units of geometric design and accidents, but the random errors of the accident rates were too large. Okamoto and Koshi (1989) examined the method of linear regression for obtaining relationships between accident rates and the geometric design of roads. Their paper discussed how to evaluate the random error contained in the accident rate of a road segment and showed that the random error depended on the number of accidents and the vehicle-kilometrage of the segment. The main issue was to seek the practical optimum of the trade-off between the random errors of the accident rates of the individual segments and the explanatory power of the set of segments. Also, an investigation was made into how road segment divisions were made in an earlier study.
Koshi and Ohkura (1978) conducted regression analyses of various sets of road segments which were made in more or less intuitive ways, dividing the whole stretch into segments that were longer than 100 m. The dependent variable was the accident rate, defined as the number of accidents per vehicle-kilometre travelled, and the explanatory variables were the geometric design elements of a set of road segments. Supposing $m$ vehicles passed over a road segment of length $l$ and $k_0$ accidents took place, the accident rate $p_0$ was defined as

$$ p_0 = \frac{k_0}{ml} \tag{2.3.35} $$

where $p_0$ was the accident rate, $k_0$ was the number of accidents in the segment, $m$ was the number of vehicles that had passed over the segment and $l$ was the segment length. This number of accidents per vehicle-kilometre was the unbiased MLE of $p$, the unknown true accident involvement probability per unit length of travel of a vehicle. The random error, or observation error, of the accident rate, denoted by $e$, was written as

$$ e = p - p_0 \tag{2.3.36} $$

$$ = \frac{\lambda - k_0}{ml} \tag{2.3.37} $$

where $p$ was the true accident involvement probability per unit length of travel in the segment, $e$ was the random (observation) error and $\lambda = mlp$ was the expectation of the number of accidents in the segment. The confidence interval of $p$ can be estimated by Neyman's method of interval estimation. When the Poisson distribution can be approximated by the normal distribution $N(\lambda, \lambda)$ for a large value of $k_0$, the following equation will hold:

$$ \text{Prob}\left\{ \lambda - z_\alpha\sqrt{\lambda} < k_0 < \lambda + z_\alpha\sqrt{\lambda} \right\} = 1 - \alpha \tag{2.3.38} $$

where $1 - \alpha$ was the confidence level and $z_\alpha$ was the value satisfying

$$ \int_{-z_\alpha}^{z_\alpha} f(z)\,dz = 1 - \alpha \tag{2.3.39} $$

where $f(z)$ was the p.d.f. of the standard normal distribution. The following quantity, corresponding to the upper confidence limit, was defined as the error interval $e_w$ and used as a measure for evaluating the random error:

$$ e_w = \frac{z_\alpha^2 + z_\alpha\sqrt{4k_0 + z_\alpha^2}}{2ml} \tag{2.3.40} $$

Further, an error ratio $e_r$ was defined as the relative magnitude of the random error:

$$ e_r = \frac{e_w}{p_0} \tag{2.3.41} $$

$$ = \frac{z_\alpha^2 + z_\alpha\sqrt{4k_0 + z_\alpha^2}}{2k_0} \tag{2.3.42} $$

where er was a function of number of accidents. The relationship between the true but unknown accident involvement probability per unit length of a road and the geometric design elements was expressed as:


$$ p = \beta_1 + \beta_2 x_2 + \dots + \beta_j x_j + u \tag{2.3.43} $$

where $p$ was the accident involvement probability per unit length of the segment, $x_2, ..., x_j$ were the explanatory variables representing geometric design elements, $\beta_1, ..., \beta_j$ were regression coefficients and $u$ was the error term representing the fraction of $p$ that was not explained by the linear combination of $x_2, ..., x_j$. The primary assumptions were $E(u) = 0$ and $E(uu') = \sigma^2 I_n$. Since $p$ was unknown, its estimator $p_0$, the observed accident rate, was used. Thus,

$$ p_0 = p - e \tag{2.3.44} $$

where $e$ was a random vector representing the observation errors. Since $p_0$ was the unbiased MLE of $p$, we have $E(e) = 0$. Assuming that no correlation exists among the observation errors, $E(ee') = \Phi_n$. Assuming the absence of correlation between $u$ and $e$, then

$$ p_0 = X\beta + w \tag{2.3.45} $$

$$ w = u - e \tag{2.3.46} $$

with $E(w) = 0$ and $E(ww') = W = \sigma^2 I_n + \Phi_n$. Using the method of Ordinary Least Squares (OLS), the least squares estimator $\hat{\beta}$ was obtained as a linear unbiased estimator of $\beta$:

$$ \hat{\beta} = (X'X)^{-1} X' p_0 \tag{2.3.47} $$

Since the error term $w$ did not have a homogeneous variance, $\hat{\beta}$ could not be guaranteed to have the minimum variance. If $W$ was known, the method of generalized least squares could be applied and the best linear unbiased estimator of $\beta$ obtained as follows:

$$ b = \left(X' W^{-1} X\right)^{-1} X' W^{-1} p_0 \tag{2.3.48} $$

where $b$ is the best linear unbiased estimator of $\beta$ and

$$ W^{-1} = \left(\sigma^2 I_n + \Phi_n\right)^{-1} \tag{2.3.49} $$

This means that each segment should be weighted by the inverse of the sum of $\sigma^2$ and $\phi_i^2$ to obtain the best estimator of $\beta$. It was suggested that a way to obtain a better estimate with smaller variance by the method of OLS is to make the $\phi_i^2$ values as homogeneous as possible. This implies that the random errors of the accident rates of the segments should be made as equal to each other as possible. The $R^2$ was given as follows:

$$ s_p = s_r + s_e \tag{2.3.50} $$

$$ R^2 = \frac{s_r}{s_p} \tag{2.3.51} $$


$$ = 1 - \frac{s_e}{s_p} \tag{2.3.52} $$

where $s_p$ was the sum of squares of the accident rates, $s_r$ was the sum of squares of the values of the regression equation and $s_e$ was the sum of squares of the residuals. Three alternative criteria on the random errors for division into segments were discussed: the error interval criterion, the error ratio criterion and the error interval-error ratio combined criterion.

Error interval criterion:

$$ e_w = \frac{z_\alpha^2 + z_\alpha\sqrt{4k_0 + z_\alpha^2}}{2ml} \tag{2.3.53} $$

$$ = e_{wa} \tag{2.3.54} $$

that is,

$$ ml = \frac{z_\alpha^2\left(e_{wa} + p_0\right)}{e_{wa}^2} \tag{2.3.55} $$

where $e_{wa}$ was the allowable error interval. This indicated that the vehicle-kilometres travelled, $ml$, should be increased in linear relation to the observed accident rate to maintain the approximate homogeneity of the random errors.

Error ratio criterion:

$$ e_r = \frac{e_w}{p_0} \tag{2.3.56} $$

$$ = \frac{z_\alpha^2 + z_\alpha\sqrt{4k_0 + z_\alpha^2}}{2k_0} \tag{2.3.57} $$

$$ = e_{ra} \tag{2.3.58} $$

that is,

$$ k_0 = \frac{\left(1 + e_{ra}\right) z_\alpha^2}{e_{ra}^2} \tag{2.3.59} $$

where $e_{ra}$ was the allowable error ratio. This showed that every segment should have the same number of accidents. With this criterion, high accident locations can be isolated as individual segments. On the other hand, it may impose too severe a precision requirement on low accident locations, resulting in overly long segments for locations with low accident rates.

Error interval-error ratio combined criterion: Another alternative for avoiding the shortcomings of the above two criteria is a combination of both criteria, that is, to use as the error limit whichever is larger of the error interval criterion or the error ratio criterion:

$$ e_w = \frac{z_\alpha^2 + z_\alpha\sqrt{4k_0 + z_\alpha^2}}{2ml} \tag{2.3.60} $$

$$ = \max\left(e_{wa},\ p_0 \times e_{ra}\right) \tag{2.3.61} $$

This criterion avoids bunching a high accident location with the adjacent low accident sections just to satisfy homogeneity of random errors. This was the authors' proposal and was supported using the following numerical examples. The data used were 28,983 accidents that occurred on the 499.7-km stretch of the motorway from Tokyo to Kobe during the six years from January 1970 to December 1975. Accidents that were not considered related to the geometric design of through lanes, such as vehicle fires and merging or diverging accidents, were excluded from the analysis. The number of accidents, the traffic volumes, and the road geometric design elements were filed for each 100-m road element. Multiple regression analyses were applied to six segment sets, and for each set the multiple correlation coefficient was obtained with four variables selected stepwise to give the highest multiple correlation coefficient for that set. It was shown that the results of regression depended on the method of dividing the road into segments. The segment set based on the error interval-error ratio combined criterion resulted in the highest correlation coefficient. The four explanatory variables used were as follows: the centrifugal acceleration index, which represented the potential lateral acceleration force at the prevailing speed when a vehicle makes a turn at a radius equal to the smaller of the radius of that spot and 1000 m; the difference of curvature; the combination of difference of curvature and downgrade; and the linearity index. It was shown that the efficiency of regression analysis for accident rate versus road design elements largely depended on the random errors as well as the variances of the observed accident rates of the road segments to which the regression is applied. The paper proposed that the random error of the accident rate of a road segment could be evaluated with the error interval and the error ratio, which are functions of the length and the number of accidents of the segment. The random errors therefore depended on the way in which the segment was formed.
Based on the examination of several alternatives through numerical examples, the authors proposed a practical and useful method for dividing a road stretch into segments. The error interval-error ratio combined criterion proposed was that a road stretch be divided in such a way that the error of each segment is equal to whichever is larger of certain values of the allowable error interval or error ratio. However, no theoretical method has yet been found for optimizing the values of the allowable error interval and the allowable error ratio of the observed accident rates.
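The error measures in equations (2.3.35), (2.3.40) and (2.3.42), and the segment-size conditions (2.3.55) and (2.3.59), are simple enough to check numerically. The sketch below is an illustration only; the function names and the numerical values are hypothetical, and `z` stands for $z_\alpha$.

```python
import math


def accident_rate(k0, m, l):
    """Observed accident rate p0 = k0 / (m * l), equation (2.3.35)."""
    return k0 / (m * l)


def error_interval(k0, m, l, z):
    """Error interval e_w, equation (2.3.40)."""
    return (z ** 2 + z * math.sqrt(4 * k0 + z ** 2)) / (2 * m * l)


def error_ratio(k0, z):
    """Error ratio e_r = e_w / p0, equation (2.3.42); depends on k0 only."""
    return (z ** 2 + z * math.sqrt(4 * k0 + z ** 2)) / (2 * k0)


def exposure_for_error_interval(p0, ewa, z):
    """Vehicle-kilometres m*l needed for e_w = ewa, equation (2.3.55)."""
    return z ** 2 * (ewa + p0) / ewa ** 2


def accidents_for_error_ratio(era, z):
    """Number of accidents k0 needed for e_r = era, equation (2.3.59)."""
    return (1 + era) * z ** 2 / era ** 2


z = 1.96                      # hypothetical choice of z_alpha
k0, m, l = 40, 2.0e6, 1.5     # hypothetical accidents, vehicles, segment length (km)
p0 = accident_rate(k0, m, l)
ew = error_interval(k0, m, l, z)
er = error_ratio(k0, z)

# Round trips: the segment-size conditions reproduce the allowable errors
k_needed = accidents_for_error_ratio(0.5, z)
ml_needed = exposure_for_error_interval(p0, ew, z)
```

Here `error_ratio(k_needed, z)` returns the allowable ratio 0.5, and `ml_needed` recovers the exposure `m * l` of the segment, which confirms that the two conditions are the inverses of the error formulas.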

2.3.8

Random Parameters Tobit Regression Model

A large body of literature has used a variety of count data modelling techniques to study the factors that affect the frequency of highway accidents over some time period on roadway segments of a specified length. An alternative approach to this problem views vehicle accident rates (accidents per mile driven) directly instead of their frequencies. Viewing the problem as continuous data instead of count data creates a problem, in that roadway segments that do not have any observed accidents over the identified time period create continuous data that are left-censored at zero. Anastasopoulos et al. (2011) addressed the possibility of unobserved heterogeneity in accident rates by estimating a random parameters tobit model and comparing the estimation results with a traditional fixed parameters tobit model. When accident rates (such as the number of accidents per 100 million Vehicle Miles Travelled (VMT)) are considered instead of accident frequencies, the data are continuous. However, some highway segments will have no accidents reported during the analysis period over which accidents are observed, so the data will be left-censored at zero. Anastasopoulos et al. (2008) identified the tobit regression as an appropriate approach to the censoring problem. Their study used a statistical approach that constrained the estimated parameters to be fixed across observations. In the presence of unobserved heterogeneity across observations, such a fixed parameter approach could potentially result in biased parameter estimates and incorrect inferences (Washington et al., 2011). Given the potential heterogeneity in accident-rate data, a random (as opposed to a fixed) parameter approach may be appropriate. Because accident frequency/rate data do not contain residual environmental effects, socio-economic and behavioural characteristics of drivers, or vehicle specific information (such data elements are available only after an accident has occurred and thus cannot be used to predict the likelihood of an accident or an accident rate), significant unobserved heterogeneity across observations was likely to be present.
Using a left-censored limit of zero, the tobit model was expressed as

$$ Y_i^* = \beta' X_i + \varepsilon_i, \quad i = 1, 2, ..., N \tag{2.3.62} $$

$$ Y_i = \begin{cases} Y_i^*, & \text{if } Y_i^* > 0 \\ 0, & \text{if } Y_i^* \le 0 \end{cases} $$

where $N$ was the number of observations, $Y_i$ was the dependent variable, $X_i$ was a vector of independent variables (pavement, traffic, and roadway segment characteristics), $\beta$ was a vector of estimable parameters, and $\varepsilon_i$ was a randomly and independently distributed error term with zero mean and constant variance $\sigma^2$. The above model showed there was an implicit, stochastic index (latent variable) equal to $Y_i^*$ which was observed only when positive.


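The corresponding likelihood function for the tobit model was:

$$ L = \prod_{0} \left[1 - \Phi\left(\beta X / \sigma\right)\right] \prod_{1} \sigma^{-1} \phi\left[\left(Y_i - \beta X\right) / \sigma\right] \tag{2.3.63} $$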

where $\Phi$ was the standard normal distribution function and $\phi$ was the standard normal density function. The expected value of the dependent variable for all cases was:

$$ E(Y) = \beta X F(z) + \sigma f(z) \tag{2.3.64} $$

where $z = \beta X / \sigma$ was the Z-score for an area under the normal curve, $F(z)$ was the cumulative normal distribution function associated with the proportion of cases above zero, $f(z)$ was the unit normal density and $\sigma$ the standard deviation of the error term. To determine the effect of an independent variable on the expected value, the first-order partial derivative of $E(Y)$ was used. The expected value for observations above zero was $\beta X$ plus the expected value of the truncated normal error term:

$$ E(Y \mid Y > 0) = \beta X + \frac{\sigma f(z)}{F(z)} \tag{2.3.65} $$
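Equations (2.3.63) to (2.3.65) can be evaluated with the standard library alone. The following sketch assumes fixed parameters and left-censoring at zero; the data and parameter values are hypothetical, and the function names are chosen here only for illustration.

```python
import math


def norm_pdf(z):
    """Standard normal density f(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_log_likelihood(y, X, beta, sigma):
    """Log of the likelihood (2.3.63) for data left-censored at zero."""
    ll = 0.0
    for yi, xi in zip(y, X):
        xb = sum(b * v for b, v in zip(beta, xi))
        if yi <= 0:
            # Censored observation: probability that the latent variable is <= 0
            ll += math.log(1.0 - norm_cdf(xb / sigma))
        else:
            # Uncensored observation: normal density of the residual
            ll += math.log(norm_pdf((yi - xb) / sigma) / sigma)
    return ll


def expected_value_all(xb, sigma):
    """E(Y) over all cases, equation (2.3.64)."""
    z = xb / sigma
    return xb * norm_cdf(z) + sigma * norm_pdf(z)


def expected_value_positive(xb, sigma):
    """E(Y | Y > 0), equation (2.3.65)."""
    z = xb / sigma
    return xb + sigma * norm_pdf(z) / norm_cdf(z)


# Hypothetical accident rates (one censored at zero) and covariates
y = [0.0, 1.2, 3.4]
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 2.0]]
beta, sigma = [0.2, 1.5], 1.0
ll = tobit_log_likelihood(y, X, beta, sigma)
```

The two expectations are linked by $E(Y) = F(z)\,E(Y \mid Y > 0)$, which is a useful check on any implementation.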

The simulated MLE technique was adopted for incorporating random parameters in tobit regression, and the approach of Green (2007) was used to account for heterogeneity. This approach viewed estimable parameters as:

$$ \beta_i = \beta + \varphi_i, \quad i = 1, 2, ..., N \tag{2.3.66} $$

where $\varphi_i$ was a randomly distributed term. With this equation, the latent variable became:

$$ Y_i^* \mid \varphi_i = \beta_i X_i + \varepsilon_i \tag{2.3.67} $$

The likelihood function was written as the log-likelihood

$$ LL = \sum_{\forall i} \ln \int_{\varphi_i} g(\varphi_i)\, p\left(Y_i^* \mid \varphi_i\right) d\varphi_i \tag{2.3.68} $$

where $g(.)$ was the probability density function of the $\varphi_i$.

A simulation-based ML method was employed using Halton draws (Halton, 1960; Train, 1999; Bhat, 2003). Motor vehicle accident data from urban interstate roads in Indiana were collected over a nine-year period (1 January 1999 to 31 December 2007) to investigate the effect on accident rates per 100 million vehicle miles travelled of pavement characteristics (pavement roughness, rutting, surface deflection, and pavement condition rating), road geometrics (number of lanes, presence and characteristics of horizontal/vertical curves, junctions, barriers, gores, medians and shoulders) and traffic characteristics (speed limits, and passenger car and truck traffic volumes). A total of 200 roadway segments were defined and the number of police-reported motor vehicle accidents occurring on each segment over the nine-year period was obtained. Of these, 65 had no accidents while 135 had at least one accident. For model estimation, the data included the aggregated number of accidents on each roadway segment and the accident rate (number of accidents per 100 million VMT) calculated as

$$ \text{Accident rate}_i = \frac{\sum_{yr=1}^{n} \text{Accidents}_{yr,i}}{\left[\sum_{yr=1}^{n} \text{AADT}_{yr,i} \times L_i \times 365\right] / 100{,}000{,}000} \tag{2.3.69} $$

where $\text{Accident rate}_i$ was the number of accidents per 100 million VMT on roadway segment $i$, $yr$ denoted the year (from 1 to $n$), $\text{Accidents}_{yr,i}$ was the number of accidents in year $yr$ on segment $i$, $\text{AADT}_{yr,i}$ was the annual average daily traffic in year $yr$ on segment $i$, and $L_i$ the length of roadway segment $i$ in miles. A likelihood ratio test was conducted to compare models. The test statistic was

$$ \chi^2 = -2\left[LL(\beta_F) - LL(\beta_{RP})\right] \tag{2.3.70} $$

where $LL(\beta_F)$ was the log-likelihood at convergence of the fixed-parameters tobit model, and $LL(\beta_{RP})$ was the log-likelihood at convergence of the random-parameters tobit model (Washington et al., 2011). Estimation results showed that the random-parameters tobit model had a better log-likelihood at convergence than the fixed-parameters tobit model. A number of variables in the fixed-parameters model were found to be statistically insignificant, whereas they were significant in the random-parameters model. This was attributed to the flexibility of the random-parameters model, which relaxed the restriction of the fixed-parameters model that the effect of the covariates must be constant across the observations. The empirical results showed that the random-parameters tobit model outperformed its fixed-parameters counterpart and had the potential to provide a fuller understanding of the factors determining accident rates on specific roadway segments. The presence of a junction was statistically insignificant in the fixed-parameters model, but was found to be a fixed parameter that increased accident rates in the random-parameters model. The study was exploratory in nature and suggested that random parameters have considerable potential for analyzing accident rates using tobit regression. The approach could therefore be applied to other geographic areas and to non-interstate road segments to provide more information on the effect that pavement, geometric and traffic characteristics have on accident rates.
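The accident rate in equation (2.3.69) and the likelihood ratio statistic in equation (2.3.70) reduce to a few lines of arithmetic. The sketch below uses hypothetical inputs (two years of data on a 2-mile segment and made-up log-likelihood values); none of these numbers come from the study.

```python
def accident_rate_per_100m_vmt(accidents_by_year, aadt_by_year, length_miles):
    """Accidents per 100 million vehicle miles travelled, equation (2.3.69)."""
    total_accidents = sum(accidents_by_year)
    # Vehicle miles travelled: daily traffic * segment length * 365 days, summed over years
    vmt = sum(aadt * length_miles * 365 for aadt in aadt_by_year)
    return total_accidents / (vmt / 100_000_000)


def likelihood_ratio_statistic(ll_fixed, ll_random):
    """Chi-square statistic of equation (2.3.70)."""
    return -2.0 * (ll_fixed - ll_random)


# Hypothetical segment: 3 and 2 accidents in two years, AADT of 50,000, 2 miles long
rate = accident_rate_per_100m_vmt([3, 2], [50_000, 50_000], 2.0)

# Hypothetical log-likelihoods at convergence of the two tobit models
lr = likelihood_ratio_statistic(ll_fixed=-510.0, ll_random=-500.0)
```

The statistic is compared with a chi-square distribution whose degrees of freedom equal the number of additional parameters estimated in the random-parameters model.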

2.3.9

Fuzzy Linear Regression Model

Fuzzy linear regression has been extensively investigated by many researchers. Tanaka et al. (1982) proposed fuzzy regression with the minimization of fuzziness as the optimality criterion. The method was further developed by minimizing the total spread of the output by Chang et al. (1996), Heshmati and Kandel (1985) and Peters (1994). Abdullah and Zamri (2012) modelled RTC with two threshold levels, h = 0.5 and h = 0.9. A fuzzy regression model was developed to describe RA in Malaysia over the period 1974 to 2007 using three predictors. The model structure was developed after considering the linearity assumption for the model's variables. The algorithm of fuzzy regression analysis was partly reproduced as:

$$ \tilde{y} = \tilde{A}_0 + \tilde{A}_1 x_1 + \tilde{A}_2 x_2 + \dots + \tilde{A}_j x_j + \dots + \tilde{A}_N x_N = \tilde{A} x \tag{2.3.71} $$

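where $\tilde{y}$ was the fuzzy output, $[x_1, x_2, ..., x_N]^T$ was the real-valued input vector of independent variables and $\tilde{A} = \{\tilde{A}_0, \tilde{A}_1, \tilde{A}_2, ..., \tilde{A}_N\}$ was a vector of the model's fuzzy parameters. The fuzzy parameters were of the form

$$ \tilde{A}_j = (\alpha_j, c_j), \quad j = 0, 1, ..., N \tag{2.3.72} $$

with the membership function shown below:

$$ \mu_{\tilde{A}_j}(a_j) = \begin{cases} 1 - \dfrac{|\alpha_j - a_j|}{c_j}, & \alpha_j - c_j \le a_j \le \alpha_j + c_j \\ 0, & \text{otherwise} \end{cases} \tag{2.3.73} $$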

where $\alpha_j$ was the centre value of the fuzzy number and $c_j$ the spread. Hence, the fuzzy linear regression model was rewritten as follows:

$$ \tilde{y} = (\alpha_0, c_0) + (\alpha_1, c_1) x_1 + (\alpha_2, c_2) x_2 + \dots + (\alpha_N, c_N) x_N \tag{2.3.74} $$

The estimated output $\tilde{y}$ was obtained by using the extension principle. The derived membership function of the fuzzy number $\tilde{y}$ was:

$$ \mu_{\tilde{y}}(y) = \begin{cases} 1 - \dfrac{|y - \alpha^T x|}{c^T |x|}, & x \neq 0 \\ 1, & x = 0,\ y = 0 \\ 0, & \text{otherwise} \end{cases} \tag{2.3.75} $$


where $|x| = (|x_1|, |x_2|, \dots, |x_N|)^T$, the central value of $\tilde{y}$ was $\alpha^T x$, and the spread of $\tilde{y}$ was $c^T |x|$. To determine the fuzzy coefficients (2.3.72), the following linear programming problem was formulated:

Minimize

$$J = \sum_{j=0}^{N} c_j \sum_{i=1}^{M} |x_{ij}| \tag{2.3.76}$$

Subject to

$$\sum_{j=0}^{N} \alpha_j x_{ij} + (1 - h) \sum_{j=0}^{N} c_j |x_{ij}| \ge y_i \tag{2.3.77}$$

and

$$\sum_{j=0}^{N} \alpha_j x_{ij} - (1 - h) \sum_{j=0}^{N} c_j |x_{ij}| \le y_i \tag{2.3.78}$$

with $c_j \ge 0$, $\alpha_j \in \mathbb{R}$, $j = 0, 1, 2, \dots, N$, $x_{i0} = 1$, $i = 1, 2, \dots, M$ and $0 \le h \le 1$, where $J$ was the total fuzziness of the fuzzy regression model. The value $h$, which lay between 0 and 1, was a threshold level to be chosen by the decision maker. This term was referred to as the degree of fit of the fuzzy linear model to its data. Each observation $y_i$ had at least degree $h$ of belonging to $\tilde{y}_i$:

$$\mu_{\tilde{y}_i}(y_i) \ge h, \quad i = 1, 2, \dots, M \tag{2.3.79}$$

Therefore, the objective of solving the LP problem was to determine the fuzzy parameters $\tilde{A}_j$ such that the total vagueness $J$ was minimized subject to (2.3.79). The fuzzy regression contained all samples within its range, which indicates that fuzzy linear regression expresses all the possibilities that the samples embody for the system under consideration. The exogenous variables were registered vehicles, population and road length. The model described the behaviour of RA through both a crisp output and an output range. The upper and lower bounds of each model expressed the fuzziness of the accident data. However, the trend lines with centres and spreads were far from conclusive about the performance of the models, so the models needed to be examined further. Error analysis is one of the popular ways of testing the performance of linear predictive models. The models were compared with the observed data to see the magnitude of the errors, and the performances of the models at the two threshold levels were scrutinized. To indicate the models' performance, $R^2$ was computed. It was found that the two threshold levels yielded different performances. The results showed that, by applying a multi-variable fuzzy linear regression, the models provided not only a crisp output but also an output range for the number of road accidents in Malaysia. The model with threshold level 0.5 outperformed the model with threshold level 0.9. Registered vehicles and population were notable predictors of the number of road accidents in Malaysia. Policy makers could perhaps consider these two variables in managing road accidents in Malaysia.
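The role of the threshold h in (2.3.75) to (2.3.79) can be illustrated with a minimal Python sketch. It does not solve the linear programme; it only evaluates the centre, spread, membership and total fuzziness for a given set of fuzzy coefficients and checks the h-level constraints. The data and coefficient values are hypothetical and are not taken from Abdullah and Zamri (2012).

```python
def fuzzy_output(alpha, c, x):
    """Centre and spread of the fuzzy output for one input row x (x[0] is 1)."""
    centre = sum(a * xj for a, xj in zip(alpha, x))
    spread = sum(cj * abs(xj) for cj, xj in zip(c, x))
    return centre, spread


def membership(y, alpha, c, x):
    """Triangular membership of an observed y in the fuzzy output, as in (2.3.75)."""
    centre, spread = fuzzy_output(alpha, c, x)
    if spread == 0:
        return 1.0 if y == centre else 0.0
    return max(0.0, 1.0 - abs(y - centre) / spread)


def total_fuzziness(c, X):
    """Objective J of (2.3.76): spreads weighted by the absolute inputs."""
    return sum(cj * sum(abs(row[j]) for row in X) for j, cj in enumerate(c))


def satisfies_h_level(alpha, c, X, y, h):
    """Check constraints (2.3.77) and (2.3.78) for every observation."""
    for row, yi in zip(X, y):
        centre, spread = fuzzy_output(alpha, c, row)
        if not (centre - (1 - h) * spread <= yi <= centre + (1 - h) * spread):
            return False
    return True


# Hypothetical data: an intercept column and one predictor.
X = [[1, 1], [1, 2], [1, 3]]
y = [2.0, 4.5, 6.0]
alpha = [0.0, 2.0]   # assumed centres
c = [1.0, 0.0]       # assumed spreads

print(total_fuzziness(c, X))
print(membership(4.5, alpha, c, X[1]))
print(satisfies_h_level(alpha, c, X, y, 0.5), satisfies_h_level(alpha, c, X, y, 0.9))
```

In this sketch the same coefficients are feasible at h = 0.5 but not at h = 0.9, which shows why a higher threshold forces wider spreads and hence a larger J.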

2.3.10 Geographically Weighted Regression

Erdogan (2009) described the inter-province differences in traffic accidents and mortality on the roads of Turkey. The study examined regional differences in traffic accidents and traffic accident mortality and their underlying determinants. Regional disparities hidden behind the national statistics on RA and fatalities were also part of the focus. Aggregated data reported in the provinces of Turkey from 2001 to 2006 were used for the study. The author observed an explosion in immigration and population and a corresponding increase in vehicle numbers. Meanwhile, GDP and income per capita had grown rapidly by 2007. Although population increased by 14.7%, motor vehicle ownership increased by over 75% between 1995 and 2007. During that period, the number of traffic accidents (167.98%) and injuries (30.45%) also increased. The number of deaths, however, decreased by 42.39%. The material loss owing to traffic accidents was approximately $1.1 billion for 2007, excluding health expenses and loss of labour. Also, the rapid expansion of road construction and the increased number of vehicles meant that Road Traffic Accidents (RTA) were becoming an increasingly serious public health problem in Turkey.

Geographic Information System (GIS) and spatial analyses were used for the study. The study described a preliminary investigation of RA and mortality in Turkey at the province level, adjusting for population and number of registered motor vehicles owing to socio-economic differences. The number of accidents and the mortality ratio were modelled with these variables through Geographically Weighted Regression (GWR). For visualization and spatial analyses, ArcGIS 9.2, developed by ESRI, GeoDa 0.9.51, developed by Luc Anselin, Crimestat 3.1 and GWR 3.0 were utilized. Population by census year, annual intercensal rate of increase and mid-year population forecasts, number of traffic accidents and records of road deaths in the provinces were taken. The socio-economic index values of the provinces, calculated by principal component analysis, included 58 factors regarding demographic, education, employment, health, industrial, economic, agricultural and construction status. Analyses were based on the province level of spatial aggregation.


A temporal equilibrium state was examined by means of the time series of mortality and number of accident cases; 2001-2006 emerged as a relatively stable period. Population density and the number of registered vehicles were used as standardization factors in the study. An excess risk map of the numbers of RA was plotted. Each map was a choropleth map, where the natural break method for classification of the data had been applied to reflect the distribution as accurately as possible. The relation between the accident rates and the level of socio-economic development of provinces was explored by means of bivariate spatial autocorrelation. Many factors could explain the observed differences in accident statistics; population density and the number of registered motor vehicles were two of them. Given the data availability at the province level, the main variables affecting the level of provincial accident and death rates were selected through stepwise regression. The number of accidents and the number of deaths, as dependent variables, were therefore linked to a set of independent variables, with one of the main outputs of the regression being the estimates of the parameters that link each independent variable to the dependent variables. These independent variables were length of highways, length of province roads, and number of registered cars, buses, minibuses, trucks, small trucks and motorcycles in the provinces. The OLS method assumes that the processes under examination are constant over space. To overcome this problem, accident and death rates were modelled by GWR, in which the relationship between the variables was expressed as:

$$Z(x_i) = \beta_0(x_i) + \beta_1(x_i)\,y_1 + \beta_2(x_i)\,y_2 + \dots + \beta_n(x_i)\,y_n + \varepsilon \tag{2.3.80}$$

where $x_i$ indicated the location at which the parameters $\beta_k(x_i)$ were estimated, and $\varepsilon$ was the random error term. To test the relative performance of the GWR and OLS models in replicating the observed data, an approximate likelihood ratio test based on the F-test was used. The significance of the spatial variability in the local parameters was assessed by conducting a Monte Carlo test. The study identified safety-deficient provinces, giving information about the distribution of accidents and mortality at the province level. The excess risk map provided a solid reference for relative risk levels within the provinces. Although accident rates were significantly autocorrelated, the autocorrelation was quite unstable with the aggregation level. According to the results of the local spatial analyses, the presence of hot spots with high rates of mortality and accidents showed that accidents and deaths caused by traffic were significant public health problems in Turkey.


The F-test suggested that the GWR model significantly improved model fit over the OLS model. In principle, the intercept term referred to the fundamental level of regional accident rates, excluding the effects of all other factors on regional accident statistics across Turkey. The length of highways parameter had a great effect on both the number of accidents and the number of deaths. Length of province roads also had a positive effect on the number of accidents and the number of deaths. The parameters for number of buses, mini trucks and trucks had no significant influence on either the number of accidents or the number of deaths. The population variable had a negative effect on both accident rates and mortality.
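The idea behind (2.3.80), that coefficients are re-estimated at each location with nearby observations weighted more heavily, can be sketched in Python as follows. This is a minimal illustration with a single predictor, and the Gaussian kernel, the bandwidth values, the coordinates and the data are all assumptions made for the example; it is not the GWR 3.0 procedure used by Erdogan (2009).

```python
import math


def gaussian_weight(distance, bandwidth):
    """Kernel weight that decays with distance from the target location."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_fit(target, coords, x, z, bandwidth):
    """Weighted least squares fit of z = b0 + b1 * x at one target location."""
    w = [gaussian_weight(math.dist(target, p), bandwidth) for p in coords]
    sw = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    z_mean = sum(wi * zi for wi, zi in zip(w, z)) / sw
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxz = sum(wi * (xi - x_mean) * (zi - z_mean) for wi, xi, zi in zip(w, x, z))
    b1 = sxz / sxx
    b0 = z_mean - b1 * x_mean
    return b0, b1


# Hypothetical provinces: three in the west (slope 1) and three in the east (slope 3).
coords = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0)]
x = [1, 2, 3, 1, 2, 3]   # e.g. a road length measure
z = [1, 2, 3, 3, 6, 9]   # e.g. number of accidents

west = local_fit((0, 0), coords, x, z, bandwidth=1.0)
east = local_fit((10, 0), coords, x, z, bandwidth=1.0)
pooled = local_fit((0, 0), coords, x, z, bandwidth=1e6)  # near-equal weights, close to OLS
print(west, east, pooled)
```

With a very large bandwidth all observations receive almost equal weight and the fit approaches the single global OLS slope, which hides the difference between the two groups of provinces.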

2.3.11 Spatial Regression

Levine et al. (1995a) developed an alternative method for identifying blackspots. Spatial patterns and the application of a spatial analysis program for describing point distributions were described. Police crash reports compiled by the state of Hawaii for 1990 were used for the study. Hawaii law required that all crashes involving fatalities, injuries or property damage in excess of $300 ($1000 after 20 June 1990) be reported. Unique incidents were examined, with the location of the individual crash being the unit of analysis. The focus was on the City and County of Honolulu, the island of Oahu. Honolulu accounted for 75% of the population and 74% of all motor vehicle collisions in 1990. There were 19,598 police-reported crashes, of which 74 involved fatalities and 6733 involved injuries.

To establish the location of crashes, a standardized dictionary of street names was developed using the AutoStan software (Jaro, 1992). These street names were matched to the 1990 TIGER files, but a set of alternative names was identified. Specialized software was developed for matching street names (Khoo, 1993). For every intersection, the program took the standardized TIGER name and searched through the street segments to select every link with the name, placing them in two lists, A and B. The program compared the latitude and longitude of each link in the A list with that of each link in the B list for both nodes. The latitude and longitude of the matched node were then assigned to the accident location. About 2% of the crashes were located manually because not all streets in the TIGER system had street names. A desktop GIS program, ATLAS* GIS, was used for displaying the point locations and for subsequent spatial analyses. Hawaii Pointstat, a program for deriving different indices of spatial point pattern developed by Levin et al. (1994), took a list of latitudes and longitudes for each crash location and produced four measures of concentration for the point distribution which were relevant for the analysis.


Descriptive indices of crash location were computed. The geo-referencing of accident locations, a procedure similar to blackspot analysis but with a more sophisticated geo-referencing framework, was carried out. Four measures of concentration were used for the point distribution. The first was the mean center (the mean latitude and mean longitude of the list, also called the center of gravity). The second was the standard distance deviation, based on the Great Circle distance of each point from the mean center. The third was the standard deviational ellipse, which calculates two standard deviations, one along a transformed axis of maximum concentration and one along an axis orthogonal to it. The fourth was the nearest neighbour index, which measured the average distance from each point to its nearest point and compared it to the distance that would be expected on the basis of chance. Three measures of collision concentration, the mean center, the standard distance deviation and the standard deviational ellipse, were shown. The standard deviational ellipse was a more elegant measure of spatial concentration than the standard distance deviation. Although the standard distance deviation covered more cases than the standard deviational ellipse, the ellipse described the concentration in crash location more concisely. In terms of accident density (accidents per unit area), the standard deviational ellipse had more than twice the density of the standard distance deviation. It was also a more focused measure and a better graphical tool. For the nearest neighbour index, the distance from each point in the distribution to every other point was calculated and the shortest distance was selected. These shortest distances were then averaged and compared to the nearest neighbour distance that would be expected on the basis of chance. The ratio of the observed to the random nearest neighbour distance indicated a very high and highly significant degree of concentration. Accidents were not spatially independent events.
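Three of the measures described above can be sketched in a few lines of Python. The sketch assumes planar coordinates and Euclidean distance rather than the Great Circle distance used in the study, and it takes the expected nearest neighbour distance under complete spatial randomness as 1/(2*sqrt(n/A)) for n points in an area A. The point coordinates and the study area are hypothetical.

```python
import math


def mean_center(points):
    """Mean x and mean y of the point list (the center of gravity)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def standard_distance(points):
    """Root mean squared distance of the points from the mean center."""
    cx, cy = mean_center(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points))


def nearest_neighbour_index(points, area):
    """Ratio of observed to expected mean nearest neighbour distance."""
    n = len(points)
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 / math.sqrt(n / area)  # assumes a random pattern over the area
    return observed / expected


# Hypothetical crash locations clustered in one corner of a 100 x 100 study area.
crashes = [(0, 0), (0, 1), (1, 0), (1, 1)]
center = mean_center(crashes)
sdd = standard_distance(crashes)
nni = nearest_neighbour_index(crashes, area=100 * 100)
print(center, sdd, nni)
```

An index well below 1, as in this example, indicates points that lie closer together than a random pattern would produce; a value near 1 indicates no clustering.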
The distributions of different types of crashes were examined using these tools, particularly the ellipse. These pointed toward two different spatial patterns of crashes, one more geographically concentrated and most likely related to traffic density, the other more dispersed. Variations in automobile crashes were examined for each hour of the 24-hour day and for two different time periods, weekdays (Monday to Friday) and weekends (Saturday and Sunday). The area of the ellipse for the population distribution was calculated by taking the centroid of each block and weighting it by its 1990 population. The area of the ellipse for the employment distribution was calculated by taking the centroid of census block groups and weighting by a 1990 estimate of employment. Spatial variations in crash characteristics were examined through the distribution of the ellipse for all crashes involving serious injuries, alcohol-related crashes and fatalities. Each of the accident ellipses was compared to the ellipses for population and employment. A test was constructed which compared each ellipse to that of population and employment on five criteria: differences in mean latitude, differences in mean longitude, differences in the standard deviation of the transformed Y-axis, differences in the standard deviation of the transformed X-axis and differences in the area of the ellipse. The ellipses of population and employment were significantly different from each other on four of the five criteria, that is, statistically indistinguishable on only one of the five criteria. The ellipses which were more similar to the ellipse for residential population were those for fatal crashes, serious injury crashes, alcohol-related crashes, single vehicle crashes, head-on crashes and opposite side-swipe crashes. The ellipses which were more similar to the employment ellipse were those for all accidents, two-vehicle crashes, rear-end crashes, broadside crashes and opposite side angular crashes. The addresses of the drivers were not known; as such, the distances between their houses and the locations of the crashes could not be determined. But it appeared, at least from the overall distribution, that the majority of crashes were occurring near employment centers, whereas those closer to residential areas, while fewer, tended to be more severe. A limitation of the spatial tools was that they assumed a monocentric spatial plane.

Levine et al. (1995b) further described a method for examining spatial variations in motor vehicle accidents aggregated into small geographical areas. An attempt was made to explain the spatial patterns by a number of population, employment and road characteristic variables in order to show that activities which generate trips also indirectly predict crashes. In the modelling literature, there was a range of different applications in which expected traffic levels were predicted from different activities and land uses. On the other hand, there were no examples of using a trip generation model to predict an expected level of crashes, spatially dispersed through a series of zones in Honolulu, even though such a model would be consistent with the assumption that concentrations of trips would be associated with higher levels of crashes.
Yet without such a prediction, it becomes difficult to know whether particular areas have a higher collision risk than would be expected on the basis of existing traffic levels. Motor vehicle crashes for Oahu were geo-coded. A distribution of point locations, such as that of automobile crashes, suggested a density distribution. Crashes occurring spatially close together were the products of higher levels of traffic, which in turn were a function of more concentrated social activities, whether residential, employment or employment-related, such as shopping and entertainment. Census block groups were used as the unit of analysis. The number of crashes occurring in each block group was calculated using a desktop mapping program, ATLAS* GIS. This became the dependent variable for the regression models. Characteristics were then assigned to zones which were then used to predict accidents; here the characteristics directly predicted the accidents. Daily trip generation estimates for 1990 were obtained, organized by traffic analysis zones, of which there were 284. These included separate estimates of trip productions and trip attractions for each zone. A rough index of traffic volume was created by adding, for each zone, the number of trip productions to the number of trip attractions and subtracting the number of trips produced within a zone which also ended in that zone. These trips were then allocated to census block groups using a disaggregation function in ATLAS* GIS. From the 1990 census, data were obtained on the population of each census block group. Estimates of employment were obtained for each of the traffic analysis zones. Ten categories of employment were included: manufacturing, retail trade, services, financial, hotel, military, government, agriculture, transportation and construction. Using a logic similar to that described for population, employment was allocated to block groups. Four summary variables were created for road characteristics: the existence of a freeway crossing a block group, the number of miles of major arterial or highway roads within each block group, the number of miles of minor arterial or access roads within each block group, and the number of miles of freeway access roads or ramps within each block group. The spatial lag model was defined as

$$y = \lambda W y + X\beta + \varepsilon \tag{2.3.81}$$

where $y$ was an $N \times 1$ vector of observations on the dependent variable for all locations, $Wy$ was an $N \times 1$ vector of weighted values of the dependent variable summed over all locations (the spatial lag), $\lambda$ was the spatial autoregressive coefficient, $X$ was an $N \times K$ matrix of observations on the explanatory variables, $\beta$ was a $K \times 1$ vector of regression coefficients and $\varepsilon$ was an $N \times 1$ vector of normally distributed random error terms with mean 0 and constant variance $\sigma^2$ (Anselin, 1992). The specialized spatial statistics regression package Spacestat was used. A number of different models were developed and tested. A simple indicator of collision risk was obtained by calculating the difference between the actual number of crashes occurring per block group and the expected number of block group crashes. This difference was then expressed as a percentage of the expected number. There was a significant relationship between the estimated traffic volume produced and the number of crashes occurring within a block group. The simple correlation between the number of crashes and the predicted number of trips was 0.54. Traffic crashes spatially paralleled traffic volume. The overall $R^2$ was moderately high (0.55). The spatial lag variable, as measured by an inverse distance matrix, was highly significant, indicating that crashes tended to be more clustered by block group than would be expected from a random distribution. The area of the block group was significant, with more accidents in smaller block groups than in larger ones. Financial employment and military employment were negatively related. Three of the road characteristics variables also produced positive and significant coefficients. The results supported the notion that traffic crashes tend to follow traffic patterns, in the sense that variables predicting trips also predict crashes.


Several of the variables showed meaningful fluctuations over time, indicating that the risk of accidents also changes. The map of collision risks for the island of Oahu showed areas which had low population densities and very little employment and yet a high excess of actual over expected collisions. Introducing the spatial lag effect increased the predictability significantly. Improvements can be made to the model by using more precisely measured variables. The zone-type model can be improved by adding traffic volume by zone, preferably for freeways, highways and major arterials. The effect of specific land uses on accident generation must be examined. Also, a method could be developed for allocating the expected accidents to particular streets.
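The quantities used in this study, namely the spatial lag term Wy of (2.3.81), the rough traffic volume index and the percentage excess of actual over expected crashes, can be sketched in Python as follows. Row-standardizing the inverse distance weights is an assumption made for the example (the source states only that an inverse distance matrix was used), and all coordinates and counts are hypothetical.

```python
import math


def inverse_distance_weights(coords):
    """Row-standardized inverse distance weights with a zero diagonal."""
    n = len(coords)
    W = []
    for i in range(n):
        row = [0.0 if i == j else 1.0 / math.dist(coords[i], coords[j]) for j in range(n)]
        total = sum(row)
        W.append([w / total for w in row])  # each row sums to 1 (an assumption)
    return W


def spatial_lag(W, y):
    """Wy: the weighted average of the dependent variable at other locations."""
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in W]


def traffic_volume_index(productions, attractions, intrazonal):
    """Trip productions plus attractions, minus trips that start and end in the zone."""
    return productions + attractions - intrazonal


def excess_risk_percent(actual, expected):
    """Difference between actual and expected crashes as a percentage of expected."""
    return 100.0 * (actual - expected) / expected


# Hypothetical block group centroids and crash counts.
centroids = [(0, 0), (1, 0), (3, 0)]
crashes = [10, 20, 40]

W = inverse_distance_weights(centroids)
lag = spatial_lag(W, crashes)
print(lag)
print(traffic_volume_index(500, 300, 100))
print(excess_risk_percent(30, 20))
```

A positive excess risk marks a block group with more crashes than its traffic level would suggest, which is the kind of area the collision risk map was meant to highlight.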

2.3.12 Spatial Analysis

Black and Thomas (1998) described a method of assessing the extent to which the value of a variable on a given segment of a network influenced values of that variable on contiguous segments. The usefulness of network autocorrelation analysis in analyzing accidents distributed along the segments of a highway system was examined. The focus was on the extent to which accidents on a segment covaried with those on contiguous segments of the Belgian motorway network, which had a total of 16000 km2. Approximately 5% of all RA with casualties occurred on the motorways. The study illustrated the utility of network autocorrelation analysis using Moran's coefficient as an indicator of no autocorrelation (independence) or autocorrelation (dependence) for accident rates on segments of a network. The coefficient obtained could be converted to a standard score and evaluated for departure from random expectation. The method adopted was network autocorrelation analysis, a network variant of spatial autocorrelation analysis, and Moran's I statistic was used to make the assessment. Illustrations of positive and negative network autocorrelation were given and interpreted for several simple linear networks. An empirical sampling distribution for the case of a ten-link network was derived from 100000 samples. This distribution was compared with a normal distribution and found not to be significantly different. The use of network autocorrelation analysis with more complex networks was demonstrated using 1991 motor vehicle accident rates for a portion of the motorway network of Belgium. A significant level of positive network autocorrelation was observed.
It was further demonstrated that the source of the observed positive covariation could be identified using a secondary analysis that focused on the motorway and ring road components of the overall system, in order to identify those portions of the network that were the major sources of the positive autocorrelation. A further analysis confirmed that the major sources of this positive covariation had been identified. An index developed by Moran (1948) was used to assess the level of network autocorrelation.


The index, represented by $M$, was calculated as follows:

$$M = \frac{n}{\sum_i \sum_j w_{ij}} \cdot \frac{\sum_i \sum_j w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2} \tag{2.3.82}$$

where $x_i$ = the value of variable $x$ on segment $i$, $\bar{x}$ = the mean of variable $x$, $n$ = the number of segments, and $w_{ij}$ = a weight indicating whether segment $i$ was connected to segment $j$ (e.g. 1) or not (e.g. 0). The summation operators were for $i = 1, 2, \dots, n$ and $j = 1, 2, \dots, n$ in all cases. The expected value of $M$ was $E(M) = -1/(n - 1)$. The variance of $M$ under the assumption of normally distributed data was:

$$\operatorname{Var}(M) = \frac{n^2 s_1 - n s_2 + 3\left(\sum_i \sum_j w_{ij}\right)^2}{\left(\sum_i \sum_j w_{ij}\right)^2 (n^2 - 1)} \tag{2.3.83}$$

where $s_1 = \frac{1}{2} \sum_i \sum_j (w_{ij} + w_{ji})^2$ and $s_2 = \sum_i \left(\sum_j w_{ij} + \sum_j w_{ji}\right)^2$. If two segments were connected, a value of 1 represented this connection, and if not, 0 was entered in the matrix $W$. For any set of $n$ segments of a linear route, there were $2(n - 1)$ joins. If the focus of the analysis was not a single linear route but an entire network, then the connectivity of segments to each other might need to be identified by inspection. Once identified, a binary connection matrix of segments represented the presence or absence of connections. The null hypothesis was that the distribution of accidents or accident rates on segments was not autocorrelated. Rejection of this null hypothesis implied that the distribution of accident rates was positively or negatively autocorrelated. Empirically, positive network autocorrelation would indicate similar accident levels or rates on neighbouring segments of a highway. This might indicate causal factors such as regional weather conditions (e.g. fog, rainfall, drifting snow) or design problems that increase accidents on neighbouring segments. Negative network autocorrelation would indicate a tendency for neighbouring segments of a highway to have extremely different accident levels. It is a possible outcome, but would rarely occur empirically unless every other segment had a point of access or egress that might cause additional accidents on those segments. The presence of no autocorrelation will probably not indicate a complete absence of positive or negative autocorrelation, but may indicate that these tendencies cancel out each other's influence. The standard normal curve was used to check whether the positive network autocorrelation was significantly greater than expectation at the 0.01 and 0.05 levels. To further check the significance of these analyses, the authors randomly generated 100000 samples of the same variable, motor vehicle accidents, in different orders, as illustrated in part by the ten linear networks. Based on these 100000 samples, a sampling distribution was created. The analysis indicated a lack of independence in accident rates for a 1287-segment network. The network and motorway analyses undertaken could be viewed as global and local analyses. Time was not incorporated into the data; future work may therefore include time.

In a similar vein, Erdogan (2009) performed global and local spatial autocorrelation analyses to show whether the provinces with high rates of deaths and accidents were clustered or were located close together by chance. Two different risk indicators were used to evaluate the road safety performance of the provinces in Turkey. These indicators were the ratio between the number of persons killed in RTA and the number of accidents, and the ratio between the number of persons killed in RTA and their exposure to traffic risk. The weight matrices $W$ were calculated by the criterion of contiguity, according to the centroids of the nearest 5 to 10 neighbours, and then according to the criterion of general social distance. Smoothed maps for the provinces between 2001 and 2006 were also shown. To identify and measure the strength of spatial patterns, showing how the accident statistics were correlated in the country, Moran's I (2.3.82) and Geary's C values were calculated with three $W$ matrices. Geary's index is given as follows:

$$C = \frac{(N - 1) \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}\,(x_i - x_j)^2}{2 \left(\sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}\right) \sum_{i=1}^{N} (x_i - \bar{x})^2} \tag{2.3.84}$$
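The indices in (2.3.82) and (2.3.84) can be sketched in Python for a single linear route with a binary connection matrix. The reshuffling test below follows the idea of generating many reorderings of the same values, but the number of reorderings, the random seed and the accident rates are assumptions chosen for the example rather than values from either study.

```python
import random


def linear_route_weights(n):
    """Binary connection matrix for n consecutive segments of one route."""
    W = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        W[i][i + 1] = 1
        W[i + 1][i] = 1
    return W


def morans_i(x, W):
    """Moran's index M as in (2.3.82)."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    s0 = sum(sum(row) for row in W)
    num = sum(W[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * num / sum(d * d for d in dev)


def gearys_c(x, W):
    """Geary's index C as in (2.3.84)."""
    n = len(x)
    mean = sum(x) / n
    s0 = sum(sum(row) for row in W)
    num = sum(W[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n))
    return (n - 1) * num / (2 * s0 * sum((xi - mean) ** 2 for xi in x))


def permutation_p_value(x, W, n_perm=999, seed=0):
    """Share of random reorderings with an index at least as large as observed."""
    rng = random.Random(seed)
    observed = morans_i(x, W)
    shuffled = list(x)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, W) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Hypothetical accident rates rising steadily along a ten-segment route.
rates = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
W = linear_route_weights(len(rates))
print(morans_i(rates, W), gearys_c(rates, W), permutation_p_value(rates, W))
```

For these rates Moran's index is well above its expected value of -1/(n - 1) and Geary's index is well below 1, both pointing to positive autocorrelation between neighbouring segments.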

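The general G statistic was used to give an understanding of the clustering of high or low values, to show either hot spots or cold spots in the region. The formulae of the G statistic, its Z-score and the expected G statistic were written as follows:

$$G(d) = \frac{\sum_i \sum_j w_{ij}\,x_i x_j}{\sum_i \sum_j x_i x_j} \tag{2.3.85}$$

$$Z_{G(d)} = \frac{G(d)_O - G(d)_E}{SD_{G(d)}} \tag{2.3.86}$$

and

$$G_E(d) = \frac{w}{n(n - 1)} \tag{2.3.87}$$

where $G(d)_O$ and $G(d)_E$ were the observed and expected values of the statistic and $w$ was the sum of the weights $w_{ij}$.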

Local Moran's I (LISA; Anselin, 1995) and the $G_i^*$ statistics of Getis and Ord (1992) were used to explore where the accident ratios were clustered. Local Moran's I and its Z-score were given as below, where $\bar{x}$ was the mean value and $s^2$ was the variance:

$$M_i = \frac{(x_i - \bar{x})}{s^2} \sum_j w_{ij}\,(x_j - \bar{x}) \tag{2.3.88}$$

$$Z(M_i) = \frac{M_i - E(M_i)}{\sqrt{\operatorname{Var}(M_i)}} \tag{2.3.89}$$
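A minimal Python sketch of the local index in (2.3.88) is given below. It assumes that the variance is computed with divisor n and that the weights form a binary chain of neighbouring provinces; the labelling of each location by its own deviation and that of its neighbours is added for illustration only, and the rates are hypothetical.

```python
def chain_weights(n):
    """Binary weights linking each location to its immediate neighbours."""
    W = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        W[i][i + 1] = 1
        W[i + 1][i] = 1
    return W


def neighbour_lags(x, W):
    """Deviations from the mean and their weighted sums over neighbours."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    lags = [sum(W[i][j] * dev[j] for j in range(n) if j != i) for i in range(n)]
    return dev, lags


def local_morans_i(x, W):
    """Local Moran's M_i as in (2.3.88), with variance divisor n (an assumption)."""
    dev, lags = neighbour_lags(x, W)
    s2 = sum(d * d for d in dev) / len(x)
    return [d / s2 * lag for d, lag in zip(dev, lags)]


def cluster_labels(x, W):
    """Label each location by the sign of its own and its neighbours' deviations."""
    dev, lags = neighbour_lags(x, W)
    return [
        ("high" if d > 0 else "low") + "-" + ("high" if lag > 0 else "low")
        for d, lag in zip(dev, lags)
    ]


# Hypothetical accident ratios for five provinces arranged in a line.
ratios = [8, 8, 2, 2, 0]
W = chain_weights(len(ratios))
local = local_morans_i(ratios, W)
labels = cluster_labels(ratios, W)
print(local)
print(labels)
```

Positive local values mark locations surrounded by similar values (high-high or low-low clusters), while negative values mark locations that differ from their neighbours.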

$G_i^*$ statistics were used to detect local pockets of dependence that may not show up with global spatial autocorrelation methods, as suggested by Getis and Ord (1992). The $G_i^*$ statistic and its Z-score were written as follows, where $w_{ij}$ was the weight for the target-neighbour pair (Mitchell, 2005):

$$G_i^*(d) = \frac{\sum_j w_{ij}(d)\,x_j}{\sum_j x_j} \tag{2.3.90}$$

$$Z(G_i^*) = \frac{G_i^* - E(G_i^*)}{\sqrt{\operatorname{Var}(G_i^*)}} \tag{2.3.91}$$

Multivariate Moran analyses indicated significant correlation between accident rates and the socio-economic status index values declared by the State Planning Agency in 2003.

2.3.13 Time Series Analysis

Ohakwe et al. (2011) argued that the increasing level of road traffic accidents in Imo State, and the consequent injuries and deaths, strengthened the case for their regular analysis. Data on recorded cases of road traffic accidents were collected from the Motor Traffic Division of the Nigerian Police Force, Divisional Headquarters Umuguma, Owerri West, Imo State Police Command. Using the method of time series decomposition, road traffic accidents were characterized as having an upward trend and significant seasonal influences. Using the chi-square test of significance, it was discovered that there were significant differences among the various causes of accidents and the accident cases (minor, fatal and serious) with respect to the types of vehicles involved over the years. Out of 5921 accident cases, reckless driving, inexperience, and mechanical fault and road defects accounted for 30.3, 21.5 and 21.1 percent respectively. Motorcycle-motorcycle, motorcycle-vehicle and vehicle-vehicle crashes were the leading types and resulted in 38.9, 37.5 and 14.9 percent respectively of the total of 855 deaths recorded within the period of study. Furthermore, it was found that private cars, minibuses and taxis accounted for most of the accidents, with 94.7 percent of the total.

Sumaila (2013) was concerned with traffic accidents and safety management in Nigeria. He focused on trends in road crashes and carried out a critical review of current road safety approaches with a view to identifying their defects and deficiencies in tackling the traffic accident problem in the country. Accident records and details of current safety measures obtained from relevant agencies such as the Federal Road Safety Corps, the Nigeria Police and the Department of Road Transport Services, in addition to maps and photographs, provided the basic data for the study. The results showed the generally high rate of accidents in Nigeria with the driver as the main culprit, the functional limitations of FRSC as the lead agency for road safety matters, and the practical difficulties of implementing the driver licence and vehicle registration schemes. The poor driving culture of Nigerians was seen as a result of weak traffic education, public awareness and enforcement programmes. Restructuring and re-tooling the lead agency, declaring traffic accidents a national health problem and instituting a driver identity management system, among others, were proposed to improve safe motoring in Nigeria.

Agbeboh and Osabuohien-Irabor (2013) studied the trend of road accidents in Kogi State from January 1997 to December 2010, using univariate time series data collected from the Federal Road Safety Commission, Lokoja, Kogi State. The model for the data was found to be Y = 22.062 + 0.252T. Tests for the existence of trend and seasonal variation were conducted at the 0.05 level of significance, and a four-year forecast of road accidents was made for 2012, 2013, 2014 and 2015. It was found that there was no seasonal variation, but the trend showed a steady increase in the Kogi State accident rate.
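A linear trend of the kind reported for Kogi State can be fitted by ordinary least squares on a time index, as sketched below in Python. The series used for fitting is hypothetical, and the time unit of T in the reported model is not stated in this review, so the evaluation of Y = 22.062 + 0.252T at T = 10 is purely illustrative and is not one of the published forecasts.

```python
def fit_linear_trend(series):
    """Least squares intercept and slope for y against T = 1, 2, ..., n."""
    n = len(series)
    t = list(range(1, n + 1))
    t_mean = sum(t) / n
    y_mean = sum(series) / n
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, series)) / sum(
        (ti - t_mean) ** 2 for ti in t
    )
    intercept = y_mean - slope * t_mean
    return intercept, slope


def reported_trend(T):
    """Trend line reported by Agbeboh and Osabuohien-Irabor (2013)."""
    return 22.062 + 0.252 * T


# Hypothetical accident counts that lie exactly on a line, used to check the fit.
counts = [22 + 0.25 * t for t in range(1, 9)]
intercept, slope = fit_linear_trend(counts)
print(intercept, slope)
print(reported_trend(10))
```

A positive slope corresponds to the steady increase noted in the study; a seasonal component would appear as a systematic pattern in the residuals around this line.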

2.4 Trends in the Methods for Handling Road Traffic Crashes Data

Wolfe (1982) discussed the potential utility of exposure data for developing safety counter measures; an overview of various exposure measures and data collection methods was given. Exposure to the risk of a road traffic accident is a concept that had been frequently talked about in road safety research with great divergences in opinion as to its basic meaning. Wolfe (1982) observed two limitations. First the expression ’driving exposure’ ignored the exposure to the risk of accident of pedestrians as well as of fixed objects such as bridge abutments, utility poles and parked cars. In road safety research it seemed desirable to define a road traffic accident broadly to include any interaction between a non-fixed track land vehicle and another vehicle or a pedestrian, or a fixed object or the ground which causes any property damage or personal injury, whether it takes place on or along a public roadway or in a parking lot or driveway. Usually only accidents involving motor vehicle are entered into police accident files, road safety research tends to neglect accidents involving two bicyclists or a bicyclist and a pedestrian, or a bicyclist and a fixed object. But these types of accidents are important in understanding the totality of road traffic accident risks, and the definition of the concept of exposure should be broad to encompass RTA not involving the driver of a vehicle. Secondly, exposure usually implied an active process; while in reality one can be exposed to the possibility of being hit by a moving vehicle just sitting in a parked car or even sleeping in


a ground floor bedroom. Thus a broader but less active definition of the concept of exposure might simply be: being in a situation which has some risk of involvement in an RTA, a risk which theoretically can be measured for both active and passive elements of the traffic system. An operational definition of exposure for road safety researchers might be: a measure of the frequency of being in a given traffic situation, which can be used as the denominator in a fraction with the number of accidents which take place in that situation as the numerator, thus producing an accident rate, or risk of being in an accident when in that situation. However, there must be a common measure of exposure (time, mileage or whatever) as the denominator for each rate. As Chapman (1973) pointed out, there are two ways of viewing exposure to the risk of accident in the road traffic network: one can determine exposure and accident rates for the vehicle or road user as it moves along in the system, or one can seek to determine exposure and accident rates for particular sites or fixed objects as the road users go past. A direct count of road user movements seemed the most appropriate measure. Others include counts of the number of conflicts and the sum of entering flows at intersections, amongst several others. The two basic approaches to exposure data collection are obtaining data while trips are in process and obtaining data on completed trips. These two methods are sometimes combined by beginning the data collection while the trip is in process and then obtaining more data after the trip is completed. Mechanical counters and more sophisticated equipment can record vehicle length and weight, turning movements, exact time of day, moisture conditions, etc. More detailed information can be obtained by human observation or automatic cameras.
Even more information can be obtained by actually stopping a vehicle along the roadside and interviewing the occupants, or similarly stopping a pedestrian. Another approach involved installing special instruments in a sample of vehicles to obtain information such as acceleration and deceleration rates, distance travelled, fuel consumed, etc. For exposure data on completed trips, three obvious methods are the in-person interview, the telephone interview and the mail-back questionnaire. Such surveys seek data on all relevant travel behaviour during one or more days. Distance travelled seemed generally the most appropriate exposure measure (Carroll, 1973). Others suggested include duration of travel, number of discrete trips and number of road crossings. A combination of the trip-in-process and completed-trip approaches involved a short roadside interview plus handing out a mail-back form for the road user to complete later, or obtaining a telephone number for a subsequent telephone interview about the completed trip. Another combined method used automatic cameras or human observation. Very little research had been carried out concerning the relative reliability, validity and costs of the exposure data collected by the various methods discussed. Comparative research on appropriate exposure data collection methodologies in the near future will support road safety researchers, transportation and energy

consumption planners. Nicholson (1999) discussed the need for spatial distributions of accidents to be analysed, as an aid to selecting the most appropriate type of accident reduction programme and to assessing the effectiveness of such plans after implementation. A new classification scheme for spatial distributions was proposed such that detection of a particular pattern indicates which type of accident reduction programme is likely to be the most effective. The three types of accident reduction programme considered were site, route and area plans. It was argued that current practice for assessing spatial distributions of accidents was insufficiently objective, and that statistical analysis techniques for spatial data needed to be reviewed. Two issues were addressed: which plan type should be selected initially, and when one should progress to a different plan type. Current practice for the analysis of spatial distributions of accidents relied almost entirely upon visually examining a map showing the location of accidents, superimposed upon a map of the road network. Cressie (1991) noted that, due to the subjective nature of this approach, observers may well disagree as to the existence and nature of any pattern in the accident distribution. There was thus a need for techniques for quantifying the extent of deviations from complete spatial randomness, using statistical techniques that take account of the variability inherent in accident occurrence. The existing approaches involved measuring the extent of either accident clustering or spatial autocorrelation (Hoque and Andreassen, 1986). The extent of accident clustering can be represented pictorially using accident count profiles, frequency and cumulative distributions and Lorenz curves (Nicholson, 1989).
Cowell (1977) and Nicholson (1990) suggested that a number of qualitative indices of inequality of wealth developed by economists be adapted to the analysis of accident clustering, whereas Loveday (1989, 1991) proposed investigating the extent of spatial autocorrelation using Moran's index. The traditional classification of the spatial distribution of accidents included complete spatial randomness, accidents not clustered but arranged regularly, accidents clustered at randomly distributed points, and accidents clustered along lines. An alternative classification system suggested by Nicholson (1999) was termed stationary and isotropic, stationary and anisotropic, and non-stationary and anisotropic respectively. Given the disadvantages of the quadrat approach, it was decided to analyse both distances and directions to nearest neighbours, to assist detection of clustering at sites or along routes. For the analysis of distances, an alternative was to use the K function instead of the nearest neighbour distance statistics (Cressie, 1991). The directions to the nearest neighbours are bearings in the range (0°, 360°), and special techniques were required for the description and analysis of such circular data (Upton and Fingleton,

1989). If the distribution of directions to the nearest neighbours was sufficiently non-uniform, one might conclude that an accident process was not isotropic. Pearson and Hartley (1972) noted that the Kolmogorov-Smirnov statistics were not suitable for circular data. Upton and Fingleton (1989) indicated three potentially useful tests: the Rayleigh-Wilkie, the Kuiper-Stephens and the Watson-Stephens tests. To correct for edge effects, Cressie (1991) proposed three general approaches: constructing a guard area; assuming that the region is the centre plot of a 3 x 3 grid of plots identical to the region; and obtaining empirical, finite-sample corrections for statistics or indices. Nicholson (1999) suggested that when the region is rectangular the second method be adopted, whereas if there is a strong linear pattern of events within the region, the first method may well be better. If the region is irregular, the guard area approach would be the best method. No distinction had been made between the analysis of accidents on a continuum and accidents on a lattice. Road accident data can be considered to be lattice data, and the lattice might be regular or irregular. Nicholson (1999) proposed that there would be no need to modify the tests of directions for implementation with lattice data, but that the distance distribution testing should be modified. Nicholson (1999) suggested that, if one had the spatial distribution of accidents for a large area, it may well be that different types of accident reduction plans will be best in different parts of the area, and the problem is to identify those sub-areas with spatial distributions distinctly different from those in other sub-areas. Also, GIS had been concerned with such matters as computational geometry, spatial languages, user interfaces, systems designs and architectures for data integration, etc.,
while little effort had gone into incorporating statistical analysis techniques for spatial data into information processing subsystems (Cressie, 1991). Further work to thoroughly evaluate the statistical analysis techniques for spatial data with more realistic accident distributions was said to be required. Savolainen et al. (2011) summarized the evolution of research and current thinking as it related to the statistical analysis of motor-vehicle injury severities and provided a discussion of future methodological directions. Reducing the severity of injuries resulting from motor-vehicle crashes had long been a primary emphasis of highway agencies and motor-vehicle manufacturers. While progress could be simply measured by the reduction in injury levels over time, insight into the effectiveness of injury reduction technologies, policies and regulations required a more detailed empirical assessment of the complex interactions that vehicle, roadway and human factors had on resulting crash-injury severities. To gain such an understanding, safety researchers had applied a wide variety of methodological techniques over the years. While these various methodological applications

have undoubtedly provided new insights, the fundamental characteristics of crash data often resulted in methodological limitations that were not fully understood or accounted for. A summary of the characteristics of crash injury severity data was given. Injury severity data were said to be generally represented by discrete categories such as fatal injury or killed, incapacitating injury, non-incapacitating injury, possible injury and property damage. Statistical models were generally developed under the assumption that the sample data were randomly selected from a population and that each crash or crash-involved individual had an equal probability of being sampled. However, in traditional crash databases not all crashes were reported, hence the problem of under-reporting of crashes. Savolainen et al. (2011) suggested that, rather than a random sample from the population, crash data were more appropriately described as an outcome-based sample. Recognition of the ordinal nature of crash and injury severity data should be an important consideration in selecting an appropriate methodological approach. In addition, because injury severity categories are ordered and sometimes closely related, there may be shared unobserved effects among adjacent injury categories. Savolainen et al. (2011) therefore suggested accounting for such correlation, because failing to do so could result in biased parameter estimates and incorrect inferences. Also, the majority of the existing literature on injury-severity models comprised models with fixed parameters, which restricted the effects of explanatory variables to be the same across individual injury observations, whereas unobserved heterogeneity was likely to exist among the population of crash-involved road users as a result of differences in risk-taking behaviour, physiological factors, etc.
This suggests that the parameters may vary across injury observations; such effects need to be accounted for, otherwise potential bias and erroneous statistical inference may result. Other limitations included the omission of relevant variables, owing to the limited amount of information available from crash reports or to other factors. This is often a necessary limitation when drawing injury-severity insights to guide safety decisions. However, Savolainen et al. (2011) suggested using simple models, such as those which assume fixed parameters, when the volume of crash data available was small. Another area of consideration was endogeneity, a problem where explanatory variables are correlated with the disturbance terms (unobserved heterogeneity), which makes it far more subtle and difficult to handle. Other characteristics discussed were within-crash correlation and spatial and temporal correlations.


2.5 Summary and Conclusion

Aggregated and disaggregated RTC data are collected across spatial units of observation located in geographic space. Despite the locational attributes of RTC data and the long existence of spatial models, no study known to the researcher has investigated RTC through modifications of the SAR model. In the literature, the Poisson and classical regression based models are known to be widely applied to RTC data, while spatial models are widely applied to crime and homicide data collected across geographic regions (Anselin, 1988; Drukker et al., 2013). Similarly, in most empirical exercises, data are obtained for observations which are ordered in space or time. In these situations, the observations can be characterized by their absolute location, using a coordinate system, or by their relative location, based on a particular distance metric. In other words, data are organized by spatial units of observation, in the most general sense. Familiar examples of this empirical situation are the use of data on population, employment and other economic activity collected for administrative units such as states, provinces, countries or census tracts located in geographic space. Therefore, the importance of space as a fundamental concept underlying the essence of data analysis is unquestionable. In most circumstances, the questionable assumption in cross-sectional studies is that cross-sectional units are mutually independent. This is not always true, and generally spatial effects are not taken into account. For instance, when the cross-sectional units are geographical regions with arbitrarily drawn boundaries, such as the states of the United States, we would not expect this assumption to be well satisfied (Kmenta, 1971). Ultimately, ignoring spatial dependence when it is actually present leads, at best, to inefficient OLS estimators and biased statistical inference and, at worst, to biased and inconsistent OLS estimators.
Table 2.2 summarizes the literature according to date of publication, while Table 2.3 presents an overview of the literature according to topical contributions and the intended contributions of the present work.


Table 2.2: Summary of Literature According to Date of Publication

Date: 2001-2014
Authors: Anselin (2001), Spiegelhalter et al. (2002), Spiegelhalter et al. (2003), Amoros et al. (2003), Miaou et al. (2003), Miaou and Lord (2003), Bhat (2003), Peden et al. (2004), WHO (2004), Griffith (2005), Mitchell (2005), Noland and Quddus (2005), Kim et al. (2006), Lord (2006), Aguero-Valverde and Jovanis (2006), Green (2007), Artur (2007), Lord and Miranda-Moreno (2007), Quddus (2008), Aguero-Valverde and Jovanis (2008), Anastasopolous et al. (2008), Erdogan (2009), Wang et al. (2009), Ye et al. (2009), Guo et al. (2010), Huang and Abdel-Aty (2010), Anastasopolous et al. (2011), Ahmed et al. (2011), Washington et al. (2011), Ogbodo and Nduoma (2011), Ohakwe et al. (2011), Savolainen et al. (2011), Sukhai et al. (2011), Ugwu (2011), Ki-moon (2012), Abdullah and Zamri (2012), Aderamo (2012), The Africa Report (2013), Agbeboh and Osabuohien-Irabor (2013), Drukker et al. (2013), Drukker et al. (2013a), Drukker et al. (2013b), Global Status Report on Road Safety (2013), Sumaila (2013), Yu and Abdel-Aty (2013), Chun et al. (2014), Roque and Cardoso (2014), Jiang et al. (2014), NBS (2014).

Date: 1991-2000
Authors: Cressie (1991), Case (1991), Fridstrom and Ingebrigtsen (1991), Jones et al. (1991), Loveday (1991), Anselin (1992), Getis and Ord (1992), Kelejian and Robinson (1992), Kelejian and Robinson (1993), Levine et al. (1994), Peters (1994), Anselin (1995), Kelejian and Robinson (1995), Shanker et al. (1995), Bernardinelli et al. (1995), Levine et al. (1995a and b), Chang et al. (1996), Anselin and Smirnov (1996), Black and Thomas (1998), Shefer and Rietveld (1997), Schluter et al. (1997), Kelejian and Prucha (1998), Anselin and Bera (1998), Shanker et al. (1998), Milton and Mannering (1998), Nicholson (1999), Train (1999), Wakefield et al. (2000), Taylor et al. (2000), Lord (2000), Abdel-Aty and Radwan (2000).

Date: 1981-1990
Authors: Ripley (1981), Tanaka et al. (1982), Wolfe (1982), Bloomestein (1983), Gourieroux et al. (1984b), Breslow (1984), Bloomestein (1985), Heshmati and Kandel (1985), Nicholson (1985), Hoque and Andreassen (1986), Jovanis and Chang (1986), Anselin (1988), Jegede (1988), Maher and Mountain (1988), Bennet (1989), Loveday (1989), Mannering (1989), Nicholson (1989), Okamoto and

Table 2.3: Overview of Literature According to Topical Contributions and Intended Contribution of the Present Research Work

Columns: S/N; Research Area; Models; Authors; Proposed Contributions; Intended Contributions.

Intended Contributions (the same entry appears in every row): Modify the spatial autoregressive model to incorporate spatial autocorrelation in the endogenous variable and the disturbance term, and introduce instrumental variable.

S/N: 1
Research Area: Poisson distribution/regression analysis
Models: Poisson regression model
Authors: Jovanis and Chang (1986), Maher and Mountain (1988), Jones et al. (1991), Roque and Cardoso (2014)
Proposed Contributions: Proposed model for accident duration. The Poisson regression model can effectively overcome the problems caused by discrete and non-negative values of observations that would be found in normal regression analysis. Reasonable description of the number of traffic accidents in a given duration.

S/N: 2
Research Area: Poisson distribution/regression analysis/Bayes approach
Models: Multivariate Poisson regression model; Poisson based regression
Authors: Ye et al. (2009), Ahmed et al. (2011), Yu and Abdel-Aty (2013)
Proposed Contributions: The Poisson regression model with multivariate normal heterogeneity was set up to analyze crash frequency by severity levels and crash types. Provides a systematic approach to investigate the different characteristics of RTC with a precision parameter which was specified to a gamma prior.

S/N: 3
Research Area: Poisson distribution/regression analysis/Bayes approach
Models: Poisson-lognormal, Poisson-gamma, conditional Poisson-lognormal with autoregressive priors models
Authors: Shanker et al. (1995), Shefer and Rietveld (1997), Schluter et al. (1997), Milton and Mannering (1998), Abdel-Aty and Radwan (2000), Amoros et al. (2003), Miaou et al. (2003), Noland and Quddus (2005), Kim et al. (2006), Lord (2006), Lord and Miranda-Moreno (2007), Aguero-Valverde and Jovanis (2008), Wang et al. (2009)
Proposed Contributions: Neighbourhood characteristics are accounted for in the form of spatial contiguity based weights. Models were estimated under the Full hierarchical Bayesian framework.

S/N: 4
Research Area: Negative binomial distribution/regression analysis
Models: Negative binomial model
Authors: Jovanis and Chang (1986), Fridstrom and Ingebrigtsen (1991), Shankar et al. (1998), Miaou and Lord (2003), Guo et al. (2010), Ahmed et al. (2011), Yu et al. (2013), Chun et al. (2014)
Proposed Contributions: Proposed model can handle over-dispersion problems found in the Poisson regression model. Assumed log-linear relationship between crash frequency and the explanatory variables.

S/N: 5
Research Area: Bayesian framework/Poisson distribution/regression analysis
Models: Full Bayes hierarchical models
Authors: Aguero-Valverde and Jovanis (2008), Yu and Abdel-Aty (2013), Jiang et al. (2014)
Proposed Contributions: Modelled crash frequency by applying the Bayesian inference technique. Include the effects of covariates using a Poisson based formulation.

S/N: 6
Research Area: Binomial distribution/normal distribution/regression analysis
Models: Random effects logistic regression model
Authors: Yu and Abdel-Aty (2013)
Proposed Contributions: Random effects variable is set to follow a normal distribution, with random effects logistic regression set to follow a binomial distribution, and the random effects variable which accounts for the unobserved geometric factors is set to follow a normal distribution.

S/N: 7
Research Area: Regression analysis
Models: Multiple regression model
Authors: Koshi and Ohkura (1978), Jegede (1988), Okamoto and Koshi (1989), Aderamo (2012)
Proposed Contributions: Modelled RTC using the multiple regression model. ANOVA statistics proved that the variations in both the number of accidents recorded by types and number of accidents are significant.

S/N: 8
Research Area: Regression analysis
Models: Random parameters tobit regression model
Authors: Anastasopolous et al. (2008, 2011), Washington et al. (2011)
Proposed Contributions: Addressed the possibility of unobserved heterogeneity in accident counts. Also addressed the censoring problem that occurs when some highway segments do not have accidents recorded during the analysis period.

S/N: 9
Research Area: Regression analysis
Models: Fuzzy regression model
Authors: Abdullah and Zamri (2012)
Proposed Contributions: Fuzzy regression model was developed to describe RTC at two threshold levels.

S/N: 10
Research Area: Regression analysis
Models: Geographically weighted regression model
Authors: Erdogan (2009)
Proposed Contributions: F tests suggest that the GWR model significantly improved model fitting over the Ordinary Least Squares model. Length of highways parameter was shown to have a significant effect on both number of accidents and number of deaths.

S/N: 11
Research Area: Spatial analysis
Models: Spatial contiguity weights matrix
Authors: Black and Thomas (1998), Erdogan (2009)
Proposed Contributions: Measure spillover effects.

S/N: 12
Research Area: Spatial regression
Models: Spatial regression model, weights matrices
Authors: Levine et al. (1995a, 1995b)
Proposed Contributions: The regression model took account of neighbourhood characteristics in the form of spatial autocorrelation.

S/N: 13
Research Area: Time series analysis
Models: Time series models
Authors: Ohakwe et al. (2011), Agbeboh and Osabuohien-Irabor (2013), Sumaila (2013)
Proposed Contributions: Time series decomposition and univariate time series.

Source: Produced by authors


Chapter 3

METHODOLOGY

3.1 Introduction to the Chapter

This section outlines variants of SAR models, the underlying assumptions and the estimation techniques intended for use in the present research work. Various criteria for accounting for contiguity are examined, while forms of measure of proximity between spatial units are discussed. The indices used to measure spatial dependence are also presented. This section is exclusively drawn from Anselin (1988) and Drukker et al. (2013a, 2013b).

3.2 Spatial Autoregressive Models and the Variants

A simple spatial model considers spatial dependence either in the dependent variable or in the disturbance term. When the focus is on the dependent variable, the spatial dependence is modelled by including a Right-Hand-Side (RHS) variable known as a spatial lag. Each observation of the spatial-lag variable is a weighted average of the values of the dependent variable observed for the other cross-sectional units. This model is usually referred to as a SAR model. A generalized version of this model also allows for the disturbances to be generated by a SAR process. The combined SAR model with SAR Disturbances is referred to as the SAR-SARD model. The SAR-SARD model allows for additional endogenous RHS variables. In such a case, the model is a linear cross-sectional SAR model with additional endogenous regressors, exogenous variables and SAR disturbances. This can be described as the SAR model with SAR disturbances and additional endogenous variables (SAR-SARD-IV). Spatial interactions are modelled through spatial lags, and the model allows for spatial interactions in the dependent variable, the exogenous variables and the disturbances (Drukker et al., 2013).


3.2.1 Spatial Autoregressive Model

$$ y = \lambda W y + X\beta + \varepsilon \qquad (3.2.1) $$

where
$y$ is an $N \times 1$ vector of observations on the dependent variable for all locations
$Wy$ is an $N \times 1$ vector of spatially lagged values of the dependent variable, each element being a weighted sum over the other locations
$W$ is a known $N \times N$ spatial weights matrix whose diagonal elements are zero
$\lambda$ is the scalar spatial autoregressive coefficient
$X$ is an $N \times K$ matrix of observations on the $K$ explanatory variables
$\beta$ is a $K \times 1$ vector of $K$ parameters
$\varepsilon$ is an $N \times 1$ vector of normally distributed random error terms, with zero mean and constant variance

In other words, the value of the dependent variable $y_i$ is a function of the values $y_j$ at all other locations as well as a function of the independent variables. In practice, the SAR model is too general and has to be defined specifically. $W$ is defined by using an interaction between the geographical location of each observation and all observations at every other location.

Assumptions
$X$ is non-singular, that is, there is no perfect multicollinearity, and is independently and identically distributed. It is also assumed to be of full column rank and its elements are assumed to be asymptotically bounded in absolute value
$\varepsilon$ is independently and identically distributed
$W$ satisfies the condition that $(I_N - \lambda W)$ is non-singular for all $|\lambda| < 1$
$I_N$ is an identity matrix of dimension $N$
$E(\varepsilon \mid X) = 0$, indicating strict exogeneity

Residuals are the estimates of the model error term, $(I_N - \hat{\lambda} W)y - X\hat{\beta}$

The predicted values are the estimates $\hat{y} = (I_N - \hat{\lambda} W)^{-1} X\hat{\beta}$

The prediction errors are the estimates of $y - \hat{y}$
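As an illustration of how the quantities above fit together, the following is a minimal NumPy sketch, not an estimation routine. The four-unit weights matrix, the parameter values and the variable names are all hypothetical, and the true parameters are used in place of the estimates $\hat{\lambda}$ and $\hat{\beta}$ purely to show the algebra.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: four units on a line, neighbours share a border.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
W = W / W.sum(axis=1, keepdims=True)   # row-standardise so each row sums to one

N = W.shape[0]
X = np.column_stack([np.ones(N), rng.normal(size=N)])  # intercept plus one regressor
beta = np.array([1.0, 0.5])            # assumed parameter values
lam = 0.4                              # assumed SAR coefficient, |lam| < 1
eps = rng.normal(scale=0.1, size=N)

A = np.eye(N) - lam * W                # (I_N - lambda W)
y = np.linalg.solve(A, X @ beta + eps) # solves y = lam W y + X beta + eps for y

# With the true parameters standing in for the estimates:
residuals = A @ y - X @ beta           # (I_N - lam W) y - X beta, recovers eps
y_hat = np.linalg.solve(A, X @ beta)   # predicted values
pred_err = y - y_hat                   # prediction errors
```

In this sketch the residuals reproduce the innovations exactly, while the prediction errors equal $(I_N - \lambda W)^{-1}\varepsilon$, so they carry the spatial spillover of the errors.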


3.2.2 The Spatial Autoregressive Model with Spatial Autoregressive Disturbances Model

The SAR model with SAR disturbances (SAR-SARD) is specified as follows:

$$ y = \lambda W_1 y + X\beta + u \qquad (3.2.2) $$

$$ u = \rho W_2 u + \varepsilon \qquad (3.2.3) $$

where
$y$ is an $N \times 1$ vector of observations on the dependent variable
$\lambda$ and $\rho$ are the corresponding scalar parameters, typically referred to as SAR parameters
$W_1$ and $W_2$ are $N \times N$ spatial weighting matrices with 0 diagonal elements
$W_1 y$ and $W_2 u$ are $N \times 1$ vectors, typically referred to as spatial weighted lags
$X$ is an $N \times K$ matrix of observations on $K$ RHS exogenous variables
$\beta$ is the corresponding $K \times 1$ parameter vector
$u$ is an $N \times 1$ vector of disturbances
$\varepsilon$ is an $N \times 1$ vector of innovations which indicate the nature of the distribution

Assumptions
$X$, the strict exogeneity condition and $I_N$ are as previously defined.
The spatial weighting matrices $W_1$ and $W_2$ are taken to be known and non-stochastic. In most cases $W_1 = W_2$.
Suppose $\bar{y} = W y$; then

$$ \bar{y}_i = \sum_{j=1}^{n} w_{ij} y_j , $$

which clearly shows the dependence of $y_i$ on neighbouring outcomes via the spatial lag $\bar{y}_i$.
The innovations $\varepsilon$ are assumed to be independently and identically distributed (IID), or independent but heteroskedastically distributed, where the heteroskedasticity is of unknown form.

The model in (3.2.2) and (3.2.3) is a SAR-SARD model with exogenous regressors. Spatial interactions are modelled through spatial lags. The model allows for spatial interactions in the dependent variable, the exogenous variables and the disturbances. By construction, the spatial lag $W_1 y$ is an endogenous variable. The weights $w_{ij}$ will typically be modelled as inversely related to some measure of distance between the units. The SAR parameters $\lambda$ and $\rho$ measure the extent of these interactions. The model in (3.2.2) and (3.2.3) is a first-order SAR model with first-order SAR disturbances. It is also referred to as a SAR-SARD(1,1) model, which is a special case of the more general SAR-SARD(p, q) model. We refer to a SAR-SARD(1,1) model as a SAR-SARD model. When $\rho = 0$, the model in equations (3.2.2) and (3.2.3) reduces to the SAR model $y = \lambda W_1 y + X\beta + \varepsilon$. When $\lambda = 0$, the model reduces to $y = X\beta + u$ with $u = \rho W_2 u + \varepsilon$, which is sometimes referred to as the SAR error model. Setting $\lambda = 0$ and $\rho = 0$, the model reduces to a linear regression model with exogenous variables (see equation 2.2.34). If the SAR parameters are significant, whether positive or negative in sign, it means there is a relationship between what obtains in a particular spatial unit and what obtains in its neighbours.
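The special cases described above can be checked numerically. The sketch below is illustrative only: the function name `simulate_sar_sard`, the ring-shaped neighbour structure and all parameter values are assumptions introduced here, and the code generates data from equations (3.2.2) and (3.2.3) rather than estimating anything.

```python
import numpy as np

def simulate_sar_sard(X, beta, W1, W2, lam, rho, eps):
    """Generate y and u from equations (3.2.2)-(3.2.3) for given parameters."""
    n = eps.shape[0]
    I = np.eye(n)
    u = np.linalg.solve(I - rho * W2, eps)           # u = rho W2 u + eps
    y = np.linalg.solve(I - lam * W1, X @ beta + u)  # y = lam W1 y + X beta + u
    return y, u

rng = np.random.default_rng(1)
n = 6
# Hypothetical ring layout: each unit's neighbours are the units on either side,
# each given weight 0.5 so that every row sums to one.
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([2.0, -1.0])
eps = rng.normal(size=n)

# Here W1 = W2 = W, as the text notes is common in practice.
y_full, u_full = simulate_sar_sard(X, beta, W, W, lam=0.3, rho=0.5, eps=eps)
y_lag, u_lag = simulate_sar_sard(X, beta, W, W, lam=0.3, rho=0.0, eps=eps)  # SAR model
y_err, u_err = simulate_sar_sard(X, beta, W, W, lam=0.0, rho=0.5, eps=eps)  # SAR error model
y_ols, u_ols = simulate_sar_sard(X, beta, W, W, lam=0.0, rho=0.0, eps=eps)  # linear regression
```

With $\rho = 0$ the disturbance collapses to the innovation, and with both parameters at zero the outcome is simply $X\beta + \varepsilon$, matching the reductions stated in the text.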

3.2.3 The Spatial Autoregressive Model with Spatial Autoregressive Disturbances with Additional Endogenous Variables Model

$$ y = Y\pi + X\beta + \lambda W_1 y + u \qquad (3.2.4) $$

$$ u = \rho W_2 u + \varepsilon \qquad (3.2.5) $$

where
$y$ is as previously defined
$Y$ is an $N \times P$ matrix of observations on the additional endogenous variables
$\pi$ is the corresponding $P \times 1$ parameter vector
$X$ is an $N \times K$ matrix of observations on $K$ RHS exogenous variables (where some of the variables may be spatial lags of exogenous variables)
$\beta$, $W_1$, $W_2$, $W_1 y$, $W_2 u$, $\lambda$, $\rho$ and $\varepsilon$ are as previously defined

Assumptions
$X$, $W_1$, $W_2$, $\bar{y}$, $\bar{y}_i$, $\varepsilon$ and the strict exogeneity condition are as previously defined.

The model in equations (3.2.4) and (3.2.5) is a SAR-SARD model with exogenous regressors and additional endogenous regressors, that is, the SAR model with SAR disturbances and Instrumental Variables (SAR-SARD-IV). Spatial interactions are modelled through spatial lags, and the model allows for spatial interactions in the dependent variable and the disturbances. The weights $w_{ij}$ will typically be modelled as inversely related to some measure of distance between the units. The SAR parameters $\lambda$ and $\rho$ measure the extent of these interactions.
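As a worked step, and assuming both $(I_N - \lambda W_1)$ and $(I_N - \rho W_2)$ are non-singular, equations (3.2.4) and (3.2.5) can be solved for $u$ and $y$. This is only a rearrangement of the two equations: it is conditional on $Y$, which is itself endogenous, so it is not a full reduced form.

```latex
\begin{align*}
u &= (I_N - \rho W_2)^{-1}\,\varepsilon, \\
y &= (I_N - \lambda W_1)^{-1}\left[\, Y\pi + X\beta + (I_N - \rho W_2)^{-1}\varepsilon \,\right].
\end{align*}
```

The second line makes explicit why $W_1 y$ is correlated with the innovations: every element of $y$ depends on every element of $\varepsilon$ through the two inverse matrices.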


3.3 Rook, Bishop and Queen Contiguity

The ultimate objective of the use of W in the specification of spatial econometric models is to relate a variable at one point in space to the observations for that variable in the other spatial units in the system. In a time series context, this is achieved by using a lag operator, which shifts the variable by one or more periods in time (Box and Jenkins, 1976; Dhrymes, 1981). For instance: (3.3.1) yt−k = Lk yt Equation (3.3.1) shows the variable y shifted k periods back from t, as a k − th power of the lag operator L. In space, matters are not this straightforward, due to the many directions in which the shift can take place. As an illustration, consider the regular lattice structure in Figure 3.1. The variable x, observed at location i, j can be shifted in the following ways , using the simple contiguity criteria: 1. Rook Criterion Using a rook criterion of contiguity, to: xi−1,j ; xi,j−1 ; xi+1,j ; xi,j+1 2. Bishop Criterion Using a bishop criterion of contiguity, to: xi−1,j−1 ; xi+1,j−1 ; xi+1,j+1 ; xi−1,j+1 3. Queen Criterion For a queen type of contiguity, the number of possible locations increases to a total of eight, that is, xi−1,j ; xi,j−1 ; xi+1,j ; xi,j+1 and xi−1,j−1 ; xi+1,j−1 ; xi+1,j+1 ; xi−1,j+1 In most applied situations, there are no strong a priori motivations to guide the choice of the relevant direction of dependence. This problem is compounded when the spatial arrangement of observations is irregular, since then an infinite number of directional shifts becomes possible. Clearly, the number of parameters associated with all shifted positions quickly would become unwieldy and preclude any regular way. Therefore, the remaining degrees of freedom would be insufficient to allow an efficient estimation of parameters. This problem is resolved by considering a weighted sum of all values belonging to a given contiguity class, rather than taking each of them individually. 
The terms of the sum are obtained by multiplying the observations in question by the associated weights from $W$. Formally,

$$L^s x_i = \sum_j w_{ij} x_j, \quad \forall j \in J_i \tag{3.3.2}$$

where $L^s$ is the lag operator associated with contiguity class $s$, $j$ is the index of the observations belonging to the contiguity class $s$ for $i$, and $w_{ij}$ are the spatial weights. In matrix notation, with all observations in the system contained in a vector $x$, this becomes, for contiguity class $s$:

$$L^s x = W_s x \tag{3.3.3}$$





Figure 3.1: Spatial Lags on a Regular Lattice. Source: Anselin (1988)


Clearly, the resulting notion of a spatially lagged variable is not the same as in time series analysis, but instead is similar to the concept of a distributed lag. The joint determination of the weights and measures of statistical association, such as correlation or regression coefficients, becomes a non-linear problem. However, by fixing the weights a priori, this is reduced to a more manageable linear problem, at the risk of imposing a potentially wrong structure. A less restrictive spatially lagged variable can be constructed from the notion of potential or accessibility as:

$$f_i = \sum_j q(d_{ij}, \theta)\, x_j \tag{3.3.4}$$

where $f_i$ is the potential at $i$, and $q$ is a function of the distance $d_{ij}$ between $i$ and the other spatial units $j$, parameterized in terms of a vector of coefficients $\theta$. Since the resulting expression is non-linear, estimation and hypothesis testing will be more complex. The actual weight matrix $W$ used in the spatial lag is often standardized such that the elements of each row sum to one. Although there is no mathematical or statistical requirement for this, in many instances it facilitates the interpretation of the model coefficients. On the other hand, the standardized matrix is usually not symmetric, which has implications for the numerical complexity of estimation and testing procedures. The standardization of $W$ should therefore not be carried out automatically. In fact, when the weights are based on an inverse distance function or a similar concept of distance decay, which has a meaningful economic interpretation, scaling the rows so that the weights sum to one may result in a loss of that interpretation.
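The contiguity criteria and the weighted-sum form of the spatial lag can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function names `contiguity_weights`, `row_standardize` and `spatial_lag` are hypothetical, the lattice is regular, and cells are numbered row by row.

```python
def contiguity_weights(n_rows, n_cols, criterion="rook"):
    """Binary contiguity matrix W for a regular n_rows x n_cols lattice.

    Cell (i, j) is stored at index i * n_cols + j (an assumed ordering).
    """
    rook = [(-1, 0), (0, -1), (1, 0), (0, 1)]
    bishop = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
    offsets = {"rook": rook, "bishop": bishop, "queen": rook + bishop}[criterion]
    n = n_rows * n_cols
    w = [[0.0] * n for _ in range(n)]
    for i in range(n_rows):
        for j in range(n_cols):
            for di, dj in offsets:
                k, l = i + di, j + dj
                if 0 <= k < n_rows and 0 <= l < n_cols:
                    w[i * n_cols + j][k * n_cols + l] = 1.0
    return w


def row_standardize(w):
    """Scale each row so that its weights sum to one (rows of zeros stay zero)."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


def spatial_lag(w, x):
    """Spatial lag: for each i, the weighted sum of x_j over j, as in (3.3.2)."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


# Usage on a 3 x 3 lattice with x = 1, ..., 9
w_rook = row_standardize(contiguity_weights(3, 3, "rook"))
lagged = spatial_lag(w_rook, list(range(1, 10)))
```

On this lattice a corner cell has two rook neighbours and an edge cell has three, so the row-standardized weights between the same pair of cells differ by direction, which is the loss of symmetry noted above.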


3.3.1

Coordinate Point Distances

This section specifies how distances are measured from each feature to its nearest neighbouring feature. These distances are usually measured in meters or kilometres.

Centroid Distance: Calculates the average distance of all coordinates in all of the coordinate directions.

Threshold Distance: Calculates the distance between points given by two different latitude/longitude reports. The procedure involves calculating the distance between the latest location report and a previous report. Subsequently, this information is updated when the distance exceeds a given threshold.

Euclidean Distance: Calculates, for each spatial unit, the Euclidean distance to the closest spatial unit. This is the straight-line distance between two points, "as the crow flies".

Manhattan Distance: The distance between two points measured along axes at right angles. It is calculated by summing the absolute differences between the X and Y coordinates.

Standard Distance: Measures the degree to which features are concentrated or dispersed around the geographic mean center.

Inverse Distance: All features impact/influence other features, but the farther away something is, the smaller the impact it has.

Inverse Distance Squared: Same as inverse distance except that the slope is sharper, so influence drops off more quickly and only a target feature's closest neighbours will exert substantial influence in computations for that feature.

Fixed Distance Band: Each feature is analyzed within the context of those neighboring features within some specified critical distance. Features outside the critical distance of a target feature do not influence calculations for that feature.

Point Distance: Determines the distances between point features in the input features to all points in the near feature within the search radius.

The GeoDa package has provision for threshold and centroid distances, while the ArcGIS 10.1 software has provision for Manhattan, Euclidean, inverse and inverse squared distances. This study used the centroid distance for the calculations, and the unit of measurement is meters.
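Several of the distance concepts listed above reduce to short formulas. The following sketch is illustrative only: the function names are hypothetical, coordinates are assumed to be planar (projected) rather than latitude/longitude, and it does not reproduce the GeoDa or ArcGIS implementations.

```python
import math


def euclidean(p, q):
    """Straight-line distance between points p = (x, y) and q = (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def manhattan(p, q):
    """Distance measured along axes at right angles."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def distance_weights(points, power=1.0, band=None):
    """Weights w_ij = d_ij ** (-power) for i != j.

    power=1 gives inverse distance and power=2 inverse distance squared.
    If band is given, pairs farther apart than band get weight 0
    (a fixed distance band). Coincident points are left at weight 0,
    which is an assumption of this sketch.
    """
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = euclidean(points[i], points[j])
            if d == 0 or (band is not None and d > band):
                continue
            w[i][j] = d ** (-power)
    return w


# Usage: three collinear points, distances 5 and 10 from the first
pts = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
w_inverse = distance_weights(pts)
w_inverse_sq = distance_weights(pts, power=2.0)
w_banded = distance_weights(pts, band=6.0)
```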


3.4

Estimation Techniques

The estimation methods include Maximum Likelihood (ML), Quasi-Maximum Likelihood (QML), Ordinary Least Squares (OLS), Generalized Least Squares (GLS), Generalized Spatial Two-Stage Least Squares (GS2SLS) and Generalized Method of Moments (GMM) techniques.

3.4.1

Estimation of Spatial Autoregressive Model

The estimation technique for the spatial regression involves two stages. First, the ML technique was used to filter the spatial dependence. The OLS and GLS estimators were then used to obtain the values of the coefficients of the SAR model (Anselin, 1988). The specification follows the spatial process model in (3.2.2) and (3.2.3), with

$$\varepsilon \sim N(0, \Omega) \tag{3.4.1}$$

and the diagonal elements of the error covariance matrix $\Omega$ as:

$$\Omega_{ii} = h_i(z\alpha), \quad h_i > 0 \tag{3.4.2}$$

In this specification, $\beta$ is a $K \times 1$ vector of parameters associated with the exogenous (not lagged dependent) variables $X$ ($N \times K$), $\lambda$ is the coefficient of the spatially lagged dependent variable and $\rho$ is the coefficient in a spatial autoregressive structure for the disturbance $u$. The disturbance $\varepsilon$ is taken to be normally distributed with a general diagonal covariance matrix $\Omega$. The diagonal elements allow for heteroscedasticity as a function of $P + 1$ exogenous variables $z$, which include a constant term. The $P$ parameters $\alpha$ are associated with the non-constant terms, such that for $\alpha = 0$, it follows that $h = \sigma^2$ (the classic homoscedastic situation). The two $N \times N$ matrices $W_1$ and $W_2$ are standardized or unstandardized spatial weight matrices, associated respectively with a spatial autoregressive process in the dependent variable and in the disturbance term. This allows the two processes to be driven by different spatial structures. In all, the model has $3 + K + P$ unknown parameters, in vector form:

$$\theta = [\lambda, \beta', \rho, \sigma^2, \alpha']' \tag{3.4.3}$$

This model can also be expressed in a non-linear form, which facilitates the illustration of the relevant results. Let

$$A = I - \lambda W_1 \tag{3.4.4}$$

$$B = I - \rho W_2 \tag{3.4.5}$$

which gives, for (3.2.2) and (3.2.3):

$$Ay = X\beta + u \tag{3.4.6}$$

$$Bu = \varepsilon \tag{3.4.7}$$

Also, since the error covariance matrix $E(\varepsilon\varepsilon') = \Omega$ is diagonal, there exists a vector of homoskedastic random disturbances $v$, as

$$v = \Omega^{-1/2}\varepsilon \tag{3.4.8}$$

or, alternatively,

$$\varepsilon = \Omega^{1/2} v \tag{3.4.9}$$

and the disturbance in (3.4.7) becomes

$$u = B^{-1}\Omega^{1/2} v \tag{3.4.10}$$

Substituting (3.4.10) in (3.4.6) gives $Ay = X\beta + B^{-1}\Omega^{1/2} v$ or, alternatively,

$$\Omega^{-1/2} B (Ay - X\beta) = v \tag{3.4.11}$$

In this nonlinear-in-parameters expression, $v$ is a vector of standard normal and independent error terms. Consequently, (3.4.11) conforms to the usual expression for the implicit form of nonlinear models,

$$f(y, X, \theta) = v \tag{3.4.12}$$

where $f$ is a general nonlinear functional form relating $y$, $X$ and a vector of parameters $\theta$, and $v$ is the disturbance term. Although the error term $v$ has a well-behaved joint distribution, it cannot be observed, and the likelihood function has to be based on $y$. Therefore, it is necessary to introduce the concept of a Jacobian, which allows the joint distribution for $y$ to be derived from that for $v$, through the functional relationship expressed in (3.4.11). The Jacobian for the transformation of the vector of random variables $v$ into the vector of random variables $y$ is:

$$J = \det\left(\partial v / \partial y\right) \tag{3.4.13}$$

which, using (3.4.11), becomes:

$$\left|\Omega^{-1/2} B A\right| = \left|\Omega^{-1/2}\right| \cdot |B| \cdot |A| \tag{3.4.14}$$

Based on a joint standard normal distribution for the error term $v$, and using (3.4.14), the log-likelihood function for the joint vector of observations $y$ is obtained as:

$$L = -\frac{N}{2}\ln(\pi) - \frac{1}{2}\ln|\Omega| + \ln|B| + \ln|A| - \frac{1}{2}v'v \tag{3.4.15}$$

with

$$v'v = (Ay - X\beta)' B' \Omega^{-1} B (Ay - X\beta) \tag{3.4.16}$$

as a sum of squares of appropriately transformed error terms. From (3.4.15), it follows that a maximization of the likelihood function is equivalent to a minimization of a sum of squared transformed errors, corrected by the determinants from the Jacobian. This correction, and in particular its spatial terms in $A$ and $B$, will keep the least squares estimate from being equivalent to ML. The extent of the difference between the two estimators is largely a function of the magnitude of these two determinants. Specifically, for standardized weight matrices $W_1$ and $W_2$, as either $\lambda \to +1$ or $\rho \to +1$, the adjustment becomes infinitely large. The essential part of the log-likelihood consists of a quadratic form in the error terms, which leads to a well-behaved optimization problem. However, the determinants $|\Omega|$, $|A|$ and $|B|$ in (3.4.15) can cause problems in this respect. Indeed, the asymptotic properties for the ML estimates will only hold if the regularity conditions for the log-likelihood function are satisfied. In the current context, both $A$ and $B$ can lead to explosive behaviour for particular parameter values, and $\Omega$ may fail to be positive definite. It therefore becomes necessary to ensure that the following general condition holds for the Jacobian:

$$\left|\Omega^{-1/2} A B\right| > 0 \tag{3.4.17}$$

which is satisfied by the partial requirements:

$$|I - \lambda W_1| > 0 \tag{3.4.18}$$

$$|I - \rho W_2| > 0 \tag{3.4.19}$$

$$h_i(z\alpha) > 0, \quad \forall i \tag{3.4.20}$$

The constraint (3.4.20) is a familiar one in random coefficient models. Constraints (3.4.18) and (3.4.19) result in restrictions on the values that the SAR coefficients can take. For standardized weights matrices this usually means that the parameter should be less than one. The first order conditions for the MLE in the model (3.4.11) are obtained by taking the partial derivatives of the log-likelihood (3.4.15) with respect to the parameter vector. This involves a tedious but fairly straightforward application of matrix calculus to derive the score vector and information matrix. In the notation below, tr stands for the trace of a matrix, and $\alpha_p$ stands for the $p$-th element of the vector $\alpha$, with $p = 0, 1, \ldots, P$. $H_p$ stands for the diagonal matrix with elements $\partial h / \partial \alpha_p$, where $h$ is $h(z\alpha)$, or explicitly, $(\partial h / \partial s) \cdot z_p$, where $s = z\alpha$ and $z_p$ is the $p$-th element of the vector $z$.

For the derivation of the first partial derivatives of the log-likelihood (3.4.15) with respect to the elements of the parameter vector, the derivative of the natural log of a determinant is needed to deal with the partials of $|\Omega|$, $|A|$ and $|B|$ with respect to their parameters. Also, the rules for the partial derivative of a matrix product and of a quadratic form have to be applied several times to take the partial of the $v'v$ term with respect to each parameter. In the derivation of the information matrix, the partial derivative of the inverse of a matrix with respect to a scalar is needed.


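The various rules are:

$$\frac{\partial (\lambda W_1)}{\partial \lambda} = W_1 \tag{3.4.21}$$

$$\frac{\partial A}{\partial \lambda} = \frac{\partial (I - \lambda W_1)}{\partial \lambda} \tag{3.4.22}$$

$$= \frac{\partial I}{\partial \lambda} - \frac{\partial (\lambda W_1)}{\partial \lambda} \tag{3.4.23}$$

$$= -W_1 \tag{3.4.24}$$

$$\frac{\partial \ln|A|}{\partial \lambda} = \mathrm{tr}\, A^{-1} \frac{\partial A}{\partial \lambda} \tag{3.4.25}$$

$$= \mathrm{tr}\, A^{-1} (-W_1) \tag{3.4.26}$$

$$\frac{\partial v}{\partial \lambda} = \frac{\partial \left[\Omega^{-1/2} B (Ay - X\beta)\right]}{\partial \lambda} \tag{3.4.27}$$

$$= \Omega^{-1/2} B \left(\frac{\partial A}{\partial \lambda}\right) y \tag{3.4.28}$$

$$= \Omega^{-1/2} B (-W_1)\, y \tag{3.4.29}$$

$$\frac{\partial v'v}{\partial \lambda} = v' \left(\frac{\partial v}{\partial \lambda}\right) + \left(\frac{\partial v}{\partial \lambda}\right)' v \tag{3.4.30}$$

$$= 2 v' \left(\frac{\partial v}{\partial \lambda}\right) \tag{3.4.31}$$

$$= 2 v' \Omega^{-1/2} B (-W_1)\, y \tag{3.4.32}$$

$$\frac{\partial A^{-1}}{\partial \lambda} = -A^{-1} \left(\frac{\partial A}{\partial \lambda}\right) A^{-1} \tag{3.4.33}$$

$$= A^{-1} W_1 A^{-1} \tag{3.4.34}$$

$$\frac{\partial\, \mathrm{tr}(A^{-1} W_1)}{\partial \lambda} = \mathrm{tr}\, \frac{\partial (A^{-1} W_1)}{\partial \lambda} \tag{3.4.35}$$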

The resulting vector of first partial derivatives, the score vector, is set equal to zero and needs to be solved for the parameter values (the first order conditions):

$$d = \left[\frac{\partial L}{\partial \theta}\right] = 0 \tag{3.4.36}$$

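with the elements of $d$ as:

$$\frac{\partial L}{\partial \beta} = v' \Omega^{-1/2} B X \tag{3.4.37}$$

$$\frac{\partial L}{\partial \lambda} = -\mathrm{tr}\, A^{-1} W_1 + v' \Omega^{-1/2} B W_1 y \tag{3.4.38}$$

$$\frac{\partial L}{\partial \rho} = -\mathrm{tr}\, B^{-1} W_2 + v' \Omega^{-1/2} W_2 (Ay - X\beta) \tag{3.4.39}$$

$$\frac{\partial L}{\partial \alpha_p} = -\frac{1}{2} \mathrm{tr}\, \Omega^{-1} H_p + \frac{1}{2} v' \Omega^{-3/2} H_p B (Ay - X\beta) \tag{3.4.40}$$

for $p = 1, \ldots, P$. This system of highly nonlinear equations does not have an analytic solution and needs to be solved by numerical methods.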


The Asymptotic Variance Matrix

Under the usual regularity conditions, the ML estimates that are found as solutions to the system (3.4.37) - (3.4.40) will be asymptotically efficient. This means that they achieve the Cramer-Rao lower variance bound, given by the inverse of the information matrix:

$$[I(\theta)]^{-1} = \left\{-E\left[\frac{\partial^2 L}{\partial \theta\, \partial \theta'}\right]\right\}^{-1} \tag{3.4.41}$$

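The elements of the information matrix are found by taking the second partial derivatives with respect to the elements of the parameter vector $\theta$ (see equation 3.4.3) and by using the structure for the disturbance terms given in (3.2.3) and (3.4.7) - (3.4.10) to derive the relevant expected values. The first step in the derivation of the elements of the information matrix for the general model consists of obtaining the second partial derivatives of the log-likelihood with respect to the elements of the parameter vector $\theta$. This is a fairly straightforward but tedious application of the matrix calculus principles used in the derivation of the first partial derivatives in the earlier part of this section.

For the diagonal elements:

$$\frac{\partial^2 L}{\partial \beta\, \partial \beta'} = -X' B' \Omega^{-1} B X \tag{3.4.42}$$

$$\frac{\partial^2 L}{\partial \lambda^2} = -\mathrm{tr}\left(A^{-1} W_1 A^{-1} W_1\right) - (B W_1 y)' \Omega^{-1} B W_1 y \tag{3.4.43}$$

$$\frac{\partial^2 L}{\partial \rho^2} = -\mathrm{tr}\left(B^{-1} W_2 B^{-1} W_2\right) - [W_2 (Ay - X\beta)]' \Omega^{-1} W_2 (Ay - X\beta) \tag{3.4.44}$$

For the cross-product terms:

$$\frac{\partial^2 L}{\partial \beta\, \partial \lambda} = \frac{\partial^2 L}{\partial \lambda\, \partial \beta'} \tag{3.4.45}$$

$$= -(BX)' \Omega^{-1} B W_1 y \tag{3.4.46}$$

$$\frac{\partial^2 L}{\partial \beta\, \partial \rho} = \frac{\partial^2 L}{\partial \rho\, \partial \beta'} \tag{3.4.47}$$

$$= -(BX)' \Omega^{-1} W_2 (Ay - X\beta) - v' \Omega^{-1/2} W_2 X \tag{3.4.48}$$

$$\frac{\partial^2 L}{\partial \beta\, \partial \alpha_p} = \frac{\partial^2 L}{\partial \alpha_p\, \partial \beta'} \tag{3.4.49}$$

$$= -\frac{1}{2} (BX)' \Omega^{-2} H_p B (Ay - X\beta) - \frac{1}{2} v' \Omega^{-3/2} H_p B X \tag{3.4.50}$$

$$\frac{\partial^2 L}{\partial \lambda\, \partial \rho} = \frac{\partial^2 L}{\partial \rho\, \partial \lambda} \tag{3.4.51}$$

$$= -[W_2 (Ay - X\beta)]' \Omega^{-1} B W_1 y - v' \Omega^{-1/2} W_2 W_1 y \tag{3.4.52}$$

$$\frac{\partial^2 L}{\partial \lambda\, \partial \alpha_p} = \frac{\partial^2 L}{\partial \alpha_p\, \partial \lambda} \tag{3.4.53}$$

$$= -\frac{1}{2} (B W_1 y)' \Omega^{-2} H_p B (Ay - X\beta) - \frac{1}{2} v' \Omega^{-3/2} H_p B W_1 y \tag{3.4.54}$$

$$\frac{\partial^2 L}{\partial \rho\, \partial \alpha_p} = \frac{\partial^2 L}{\partial \alpha_p\, \partial \rho} \tag{3.4.55}$$

$$= -\frac{1}{2} [W_2 (Ay - X\beta)]' \Omega^{-2} H_p B (Ay - X\beta) - \frac{1}{2} v' \Omega^{-3/2} H_p W_2 (Ay - X\beta) \tag{3.4.56}$$

$$\begin{aligned}
\frac{\partial^2 L}{\partial \alpha_p\, \partial \alpha_q} = {} & \frac{1}{2} \mathrm{tr}\, \Omega^{-2} H_p H_q - \frac{1}{2} \mathrm{tr}\, \Omega^{-1} H_{pq} - \frac{1}{4} [B (Ay - X\beta)]' \Omega^{-3} H_q H_p B (Ay - X\beta) \\
& - \frac{3}{4} v' \Omega^{-5/2} H_q H_p B (Ay - X\beta) + \frac{1}{2} v' \Omega^{-3/2} H_{pq} B (Ay - X\beta)
\end{aligned} \tag{3.4.57}$$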

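To obtain the expected values, the following definitions and relations between the error terms are used:

$$u = Ay - X\beta \tag{3.4.58}$$

$$\varepsilon = B (Ay - X\beta) \tag{3.4.59}$$

$$= Bu \tag{3.4.60}$$

$$v = \Omega^{-1/2} B (Ay - X\beta) \tag{3.4.61}$$

$$= \Omega^{-1/2} \varepsilon \tag{3.4.62}$$

$$= \Omega^{-1/2} B u \tag{3.4.63}$$

It follows that, in terms of expected values:

$$E[u] = E[\varepsilon] = E[v] = 0 \tag{3.4.64}$$

$$E[uu'] = B^{-1} \Omega B^{-1\prime} \tag{3.4.65}$$

$$E[\varepsilon\varepsilon'] = \Omega \tag{3.4.66}$$

$$E[vv'] = I \tag{3.4.67}$$

and for $y$,

$$y = A^{-1} X\beta + A^{-1} B^{-1} \Omega^{1/2} v \tag{3.4.68}$$

$$= A^{-1} X\beta + A^{-1} B^{-1} \varepsilon \tag{3.4.69}$$

$$E[y] = A^{-1} X\beta \tag{3.4.70}$$

$$E[yy'] = \left(A^{-1} X\beta\right) \left(A^{-1} X\beta\right)' + A^{-1} B^{-1} \Omega B^{-1\prime} A^{-1\prime} \tag{3.4.71}$$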

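An application of these properties to the above partial derivatives, in combination with a judicious use of the trace operator, yields the elements of the information matrix given below. For the various combinations of parameters, the following results are obtained:

$$I_{\beta\beta'} = X' B' \Omega^{-1} B X \tag{3.4.72}$$

$$I_{\beta\lambda} = (BX)' \Omega^{-1} B W_1 A^{-1} X\beta \tag{3.4.73}$$

$$I_{\beta\rho} = 0 \tag{3.4.74}$$

$$I_{\beta\alpha} = 0 \tag{3.4.75}$$

$$\begin{aligned}
I_{\lambda\lambda} = {} & \mathrm{tr}\left(W_1 A^{-1}\right)^2 + \mathrm{tr}\, \Omega \left(B W_1 A^{-1} B^{-1}\right)' \Omega^{-1} B W_1 A^{-1} B^{-1} \\
& + \left(B W_1 A^{-1} X\beta\right)' \Omega^{-1} \left(B W_1 A^{-1} X\beta\right)
\end{aligned} \tag{3.4.76}$$

$$I_{\lambda\rho} = \mathrm{tr}\left(W_2 B^{-1}\right)' \Omega^{-1} B W_1 A^{-1} B^{-1} \Omega + \mathrm{tr}\, W_2 W_1 A^{-1} B^{-1} \tag{3.4.77}$$

$$I_{\lambda\alpha_p} = \mathrm{tr}\, \Omega^{-1} H_p B W_1 A^{-1} B^{-1} \tag{3.4.78}$$

$$I_{\rho\rho} = \mathrm{tr}\left(W_2 B^{-1}\right)^2 + \mathrm{tr}\, \Omega \left(W_2 B^{-1}\right)' \Omega^{-1} W_2 B^{-1} \tag{3.4.79}$$

$$I_{\rho\alpha_p} = \mathrm{tr}\, \Omega^{-1} H_p W_2 B^{-1} \tag{3.4.80}$$

$$I_{\alpha_p\alpha_q} = \frac{1}{2} \mathrm{tr}\, \Omega^{-2} H_p H_q \tag{3.4.81}$$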

The asymptotic variance matrix is obtained by substituting the ML estimates for the parameters in expressions (3.4.72) - (3.4.81) and taking the inverse of the information matrix. Since the dimension of this matrix is $3 + K + P$, no analytical results are available. The estimated asymptotic variance matrix can then be used as the basis for various hypothesis tests. Recall the log-likelihood function for the general spatial process model:

$$L = -\frac{N}{2}\ln(\pi) - \frac{1}{2}\ln|\Omega| + \ln|B| + \ln|A| - \frac{1}{2}(Ay - X\beta)' B' \Omega^{-1} B (Ay - X\beta) \tag{3.4.82}$$

with, as before, $A = I - \lambda W_1$ and $B = I - \rho W_2$. For the SAR model, $B = I$ and $\Omega = \sigma^2 I$. Writing $W$ for $W_1$, so that $A = I - \lambda W$, the log-likelihood becomes:

$$L = -\frac{N}{2}\ln\pi - \frac{N}{2}\ln\sigma^2 + \ln|A| - \frac{1}{2\sigma^2}(Ay - X\beta)'(Ay - X\beta) \tag{3.4.83}$$

The application of the general first order conditions (3.4.37) - (3.4.40) to this special case yields $b$ as the estimator for $\beta$, for which:

$$b = (X'X)^{-1} X' A y \tag{3.4.84}$$

or,

$$b = (X'X)^{-1} X' y - \lambda (X'X)^{-1} X' W y \tag{3.4.85}$$

$$= b_0 - \lambda b_L \tag{3.4.86}$$

The OLS estimators $b_0$ and $b_L$ are obtained from regressions of $y$ on $X$ and of $Wy$ on $X$, respectively. Clearly, the MLE for $\beta$ is a function of these auxiliary regression coefficients as well as of $\lambda$. However, whereas the estimate for $\lambda$ cannot be expressed analytically, neither $b_0$ nor $b_L$ is a function of any other parameters. Therefore, the estimate for $\beta$ can be found directly, once the value for $\lambda$ has been determined. The two coefficient vectors $b_0$ and $b_L$ lead to two sets of residuals, $e_0$ and $e_L$, which depend on $y$, $X$ and $Wy$ only:

$$e_0 = y - X b_0 \tag{3.4.87}$$

$$e_L = Wy - X b_L \tag{3.4.88}$$

Further application of the first order conditions, taking into account the auxiliary residuals, yields the estimate for the error variance $\sigma^2$ as:

$$\sigma^2 = \frac{1}{N} (e_0 - \lambda e_L)' (e_0 - \lambda e_L) \tag{3.4.89}$$

Again, this estimate can be readily obtained once a value for $\lambda$ has been determined. Substitution of the estimates for $\beta$ and $\sigma^2$ into the likelihood results in a concentrated likelihood of the following form:

$$L_C = C - \frac{N}{2} \ln\left[\frac{1}{N} (e_0 - \lambda e_L)' (e_0 - \lambda e_L)\right] + \ln|I - \lambda W| \tag{3.4.90}$$

where $C$ is the usual constant. This expression is a nonlinear function of one parameter only, namely $\lambda$, and can be easily maximized by means of numerical techniques such as a direct search approach to optimization, the steepest descent method, the Gauss-Newton approach, or a Davidon-Fletcher-Powell procedure. Consequently, the estimation process can proceed according to the following steps:

Step 1: Carry out OLS of $y$ on $X$: yields $b_0$

Step 2: Carry out OLS of $Wy$ on $X$: yields $b_L$

Step 3: Compute residuals $e_0$ and $e_L$

Step 4: Given $e_0$ and $e_L$, find $\lambda$ that maximizes $L_C$

Step 5: Given $\lambda$, compute $b = b_0 - \lambda b_L$ and $\sigma^2 = \frac{1}{N} (e_0 - \lambda e_L)' (e_0 - \lambda e_L)$

These steps can be carried out with a standard regression package. This study employed the GeoDa software regression package. The nonlinear optimization technique uses the Gauss-Newton approach.
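The five steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the GeoDa routine: the names `sar_concentrated_ml` and `ring_weights` are hypothetical, the weights matrix is assumed to be row-standardized so that $\lambda$ can be searched on $(-1, 1)$, a plain grid search stands in for the Gauss-Newton optimizer, and $\ln|I - \lambda W|$ is evaluated from the eigenvalues of $W$.

```python
import numpy as np


def ring_weights(n):
    """Row-standardized weights for n units on a circle (illustrative only)."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx + 1) % n] = 0.5
    W[idx, (idx - 1) % n] = 0.5
    return W


def sar_concentrated_ml(y, X, W, grid_size=1999):
    """Concentrated ML for y = lam * W y + X beta + eps, following Steps 1 to 5."""
    n = y.shape[0]
    Wy = W @ y
    b0 = np.linalg.lstsq(X, y, rcond=None)[0]    # Step 1: OLS of y on X
    bL = np.linalg.lstsq(X, Wy, rcond=None)[0]   # Step 2: OLS of Wy on X
    e0 = y - X @ b0                              # Step 3: auxiliary residuals
    eL = Wy - X @ bL
    eigvals = np.linalg.eigvals(W)               # ln|I - lam W| = sum ln(1 - lam * w_i)

    def conc_loglik(lam):
        resid = e0 - lam * eL
        log_det = np.sum(np.log(1.0 - lam * eigvals)).real
        return -0.5 * n * np.log(resid @ resid / n) + log_det

    # Step 4: grid search over lam (assumed stand-in for Gauss-Newton)
    candidates = np.linspace(-0.999, 0.999, grid_size)
    lam = candidates[np.argmax([conc_loglik(c) for c in candidates])]

    # Step 5: recover beta and sigma^2 given lam
    resid = e0 - lam * eL
    return lam, b0 - lam * bL, resid @ resid / n
```

For a usage example, data can be simulated from the reduced form `y = solve(I - lam * W, X @ beta + eps)` on `ring_weights(n)` and passed to `sar_concentrated_ml(y, X, W)`. A grid search is slower than a gradient-based method but cannot settle on a local maximum between grid points that are far from the global one, which makes it a convenient check.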


3.4.2

Estimation of the Spatial Autoregressive Model with Spatial Autoregressive Disturbances Model

Recall that the SAR-SARD model under consideration is given by (3.2.2) and (3.2.3). In the following section, the log-likelihood function is presented under the assumption that $\varepsilon \sim N(0, \sigma^2 I)$. Here, the maximizer of the likelihood function when the innovations are not normally distributed is referred to as the Quasi-Maximum Likelihood (QML) estimator. Lee (2004) gives results concerning the consistency and asymptotic normality of the QML estimator when $\varepsilon$ is IID but not necessarily normally distributed. Violations of the assumption that the innovations $\varepsilon$ are IID can cause the QML estimator to produce inconsistent results. In particular, this may be the case if the innovations $\varepsilon$ are heteroskedastic, as discussed by Arraiz et al. (2010). The STATA software was used to run the analyses.

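Likelihood Function

The reduced form of the model in (3.2.2) and (3.2.3) is given by

$$y = (I - \lambda W_1)^{-1} X\beta + (I - \lambda W_1)^{-1} (I - \rho W_2)^{-1} \varepsilon \tag{3.4.91}$$

The unconcentrated log-likelihood function is

$$\begin{aligned}
\ln L(y \mid \beta, \sigma^2, \lambda, \rho) = {} & -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 + \ln|I - \lambda W_1| + \ln|I - \rho W_2| \\
& - \frac{1}{2\sigma^2} \{(I - \lambda W_1) y - X\beta\}^T (I - \rho W_2)^T (I - \rho W_2) \{(I - \lambda W_1) y - X\beta\}
\end{aligned} \tag{3.4.92}$$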

It is possible to concentrate the log-likelihood function by first maximizing (3.4.92) with respect to $\beta$ and $\sigma^2$, yielding the maximizers

$$\hat\beta(\lambda, \rho) = \left\{X^T (I - \rho W_2)^T (I - \rho W_2) X\right\}^{-1} X^T (I - \rho W_2)^T (I - \rho W_2) (I - \lambda W_1) y \tag{3.4.93}$$

$$\hat\sigma^2(\lambda, \rho) = \frac{1}{n} \left\{(I - \lambda W_1) y - X\hat\beta(\lambda, \rho)\right\}^T (I - \rho W_2)^T (I - \rho W_2) \left\{(I - \lambda W_1) y - X\hat\beta(\lambda, \rho)\right\} \tag{3.4.94}$$

Substitution of the above expressions (3.4.93) and (3.4.94) into (3.4.92) yields the concentrated log-likelihood function

$$L_C(y \mid \lambda, \rho) = -\frac{n}{2}\{\ln(2\pi) + 1\} - \frac{n}{2}\ln\hat\sigma^2(\lambda, \rho) + \ln|I - \lambda W_1| + \ln|I - \rho W_2| \tag{3.4.95}$$

The QML estimates for the autoregressive parameters $\hat\lambda$ and $\hat\rho$ can now be computed by maximizing the concentrated log-likelihood function. Once the QML estimates $\hat\lambda$ and $\hat\rho$ are obtained, the QML estimates for $\beta$ and $\sigma^2$ can be calculated as $\hat\beta = \hat\beta(\hat\lambda, \hat\rho)$ and $\hat\sigma^2 = \hat\sigma^2(\hat\lambda, \hat\rho)$.
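Expressions (3.4.93) to (3.4.95) translate directly into code once the two spatial filters are applied to $y$ and $X$. The sketch below is illustrative only: the function name `sarar_concentrated_loglik` is hypothetical, dense matrices are assumed, and it is not the STATA implementation. It returns minus infinity when either determinant is not positive, so that an optimizer stays inside the admissible parameter region.

```python
import numpy as np


def sarar_concentrated_loglik(lam, rho, y, X, W1, W2):
    """Concentrated log-likelihood (3.4.95) with beta-hat (3.4.93), sigma2-hat (3.4.94)."""
    n = y.shape[0]
    A = np.eye(n) - lam * W1
    B = np.eye(n) - rho * W2
    X_star = B @ X            # (I - rho W2) X
    y_star = B @ (A @ y)      # (I - rho W2)(I - lam W1) y
    # OLS on the filtered data reproduces (3.4.93)
    beta = np.linalg.solve(X_star.T @ X_star, X_star.T @ y_star)
    resid = y_star - X_star @ beta
    sigma2 = resid @ resid / n                      # (3.4.94)
    sign_a, logdet_a = np.linalg.slogdet(A)
    sign_b, logdet_b = np.linalg.slogdet(B)
    if sign_a <= 0 or sign_b <= 0:
        return -np.inf, beta, sigma2
    loglik = (-0.5 * n * (np.log(2.0 * np.pi) + 1.0)
              - 0.5 * n * np.log(sigma2) + logdet_a + logdet_b)
    return loglik, beta, sigma2
```

Maximizing the first returned value over $(\lambda, \rho)$, for instance with a two-dimensional grid followed by a local optimizer, would give the QML estimates, and the other two returned values are then the corresponding $\hat\beta$ and $\hat\sigma^2$.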

Initial Values

As noted in Anselin (1988), poor initial starting values for $\rho$ and $\lambda$ in the concentrated likelihood may result in the optimization algorithm settling on a local, rather than the global, maximum. To prevent this problem, the spregml estimation process performs a grid search to find suitable initial values for $\rho$ and $\lambda$. To override the grid search, researchers may specify their own initial values.

Generalized Spatial Two-Stage Least Squares Estimator

Arraiz et al. (2010) and Drukker et al. (2013a, 2013b) built on Kelejian and Prucha (1998, 1999, 2010). The GS2SLS estimator requires instruments. Kelejian and Prucha (1998, 1999) suggest using as instruments $H$, the linearly independent columns of

$$X, W_1 X, \ldots, W_1^q X, W_2 X, W_2 W_1 X, \ldots, W_2 W_1^q X \tag{3.4.96}$$

where q = 2 has worked well in Monte Carlo simulations over a wide range of reasonable specifications. The choice of those instruments provides a computationally convenient approximation of the ideal instruments (Lee, 2003; Kelejian et al., 2004). At a minimum, the instruments should include the linearly independent columns of X and W2 X. When there is a constant in the model and thus X contains a constant term, the constant term is only included once in H.
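The instrument set in (3.4.96) can be assembled mechanically. The sketch below is an illustration under assumptions: the function name `instrument_matrix` is hypothetical, and linear independence is checked numerically with a rank test, scanning columns from left to right, which is one possible way (not necessarily the one used by any particular package) of keeping a repeated constant column only once.

```python
import numpy as np


def instrument_matrix(X, W1, W2, q=2):
    """Linearly independent columns of X, W1 X, ..., W1^q X, W2 X, ..., W2 W1^q X."""
    blocks = [X]
    lagged = X
    for _ in range(q):
        lagged = W1 @ lagged
        blocks.append(lagged)                    # W1 X, ..., W1^q X
    blocks += [W2 @ block for block in blocks]   # W2 X, W2 W1 X, ..., W2 W1^q X
    candidates = np.hstack(blocks)

    keep = []
    for j in range(candidates.shape[1]):
        trial = candidates[:, keep + [j]]
        # keep column j only if it raises the rank of the columns kept so far
        if np.linalg.matrix_rank(trial) == len(keep) + 1:
            keep.append(j)
    return candidates[:, keep]
```

With a row-standardized weights matrix, every spatial lag of the constant column is again the constant, so the rank check retains it once, in line with the remark above about the constant term.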


3.4.3

Estimation of the Spatial Autoregressive Model with Spatial Autoregressive Disturbances with Additional Endogenous Variables Model

In this section, we present a detailed description of the calculations for the SAR-SARD-IV model. First, the estimation of the general model as specified in (3.2.4) and (3.2.5) is discussed, both under the assumption that the innovations $\varepsilon$ are homoskedastic and under the assumption that the innovations $\varepsilon$ are heteroskedastic of unknown form. The STATA software was used to run the analyses. Next, the two special cases $\rho = 0$ and $\lambda = 0$ are examined in turn. It is helpful to rewrite the model in (3.2.4) and (3.2.5) as

$$y = Z\delta + u \tag{3.4.97}$$

$$u = \rho W_2 u + \varepsilon \tag{3.4.98}$$

where $Z = (Y, X, W_1 y)$ and $\delta = (\pi', \beta', \lambda)'$. The two-step GMM and IV estimation approach is reviewed here (see Drukker et al., 2013b; Arraiz et al., 2010). These articles build on and specialize the estimation theory developed in Kelejian and Prucha (1998, 1999, 2004, 2010). A full set of assumptions, formal consistency and asymptotic normality theorems, and further details and discussions are given in that literature. The IV estimators of $\delta$ depend on the choice of a set of instruments, say $H$. Suppose that in addition to the included exogenous variables $X$, we also have excluded exogenous variables $X_e$, allowing us to define

$$X_f = (X, X_e) \tag{3.4.99}$$

If we do not have excluded exogenous variables, then

$$X_f = X \tag{3.4.100}$$

Following the above literature, the instruments $H$ may then be taken as the linearly independent columns of

$$(X_f, W_1 X_f, \ldots, W_1^q X_f, W_2 X_f, W_2 W_1 X_f, \ldots, W_2 W_1^q X_f) \tag{3.4.101}$$

The motivation for the above instruments is that they are computationally simple while facilitating an approximation of the ideal instruments under reasonable assumptions. Taking $q = 2$ has worked well in Monte Carlo simulations over a wide range of specifications. At a minimum, the instruments should include the linearly independent columns of $X_f$ and $W_2 X_f$, and the rank of $H$ should be at least the number of variables in $Z$. For the following discussion, it proves convenient to define the instrument projection matrix:

$$P_H = H (H'H)^{-1} H' \tag{3.4.102}$$

When there is a constant in the model, it is only included once in $H$. The GMM estimators for $\rho$ are motivated by quadratic moment conditions of the form:

$$E(\varepsilon' A_s \varepsilon) = 0, \quad s = 1, \ldots, S \tag{3.4.103}$$

where the matrices $A_s$ satisfy $\mathrm{tr}(A_s) = 0$. Specific choices for these matrices will be given below. Note that under heteroskedasticity, it is furthermore assumed that the diagonal elements of the matrices $A_s$ are 0. This assumption simplifies the formula for the asymptotic Variance-Covariance (VC) matrix; in particular, it avoids the VC matrix having to depend on third and fourth moments of the innovations in addition to second moments. Next, the steps involved in computing the GMM and IV estimators and an estimate of their asymptotic VC matrix are described. The second step operates on a spatial Cochrane-Orcutt transformation of the above model given by:

$$y(\rho) = Z(\rho)\delta + \varepsilon \tag{3.4.104}$$

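with $y(\rho) = (I_N - \rho W_2) y$ and $Z(\rho) = (I_N - \rho W_2) Z$.

Step 1a: Two-stage least-squares estimator (2SLS)

In the first step, we apply 2SLS to the untransformed model by using the instruments $H$. The 2SLS estimator of $\delta$ is then given by:

$$\tilde\delta = (\tilde Z' Z)^{-1} \tilde Z' y \tag{3.4.105}$$

where $\tilde Z = P_H Z$.

Step 1b: Initial GMM estimator of $\rho$

The initial GMM estimator of $\rho$ is given by

$$\tilde\rho = \arg\min_\rho \left[\tilde\Gamma \begin{pmatrix} \rho \\ \rho^2 \end{pmatrix} - \tilde\tau\right]' \left[\tilde\Gamma \begin{pmatrix} \rho \\ \rho^2 \end{pmatrix} - \tilde\tau\right] \tag{3.4.106}$$

where $\tilde u = y - Z\tilde\delta$ are the 2SLS residuals, $\bar{\tilde u} = W_2 \tilde u$,

$$\tilde\Gamma = n^{-1} \begin{bmatrix} \tilde u' (A_1 + A_1') \bar{\tilde u} & -\bar{\tilde u}' A_1 \bar{\tilde u} \\ \vdots & \vdots \\ \tilde u' (A_S + A_S') \bar{\tilde u} & -\bar{\tilde u}' A_S \bar{\tilde u} \end{bmatrix} \tag{3.4.107}$$

and

$$\tilde\tau = n^{-1} \begin{bmatrix} \tilde u' A_1 \tilde u \\ \vdots \\ \tilde u' A_S \tilde u \end{bmatrix} \tag{3.4.108}$$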

Writing the GMM estimator in this form shows that we can calculate it by solving a simple nonlinear least-squares problem. By default, $S = 2$ and homoskedasticity is specified. In this case,

$$A_1 = \left\{1 + \left[n^{-1} \mathrm{tr}(W_2' W_2)\right]^2\right\}^{-1} \left\{W_2' W_2 - n^{-1} \mathrm{tr}(W_2' W_2) I_N\right\} \tag{3.4.109}$$

and

$$A_2 = W_2 \tag{3.4.110}$$

If heteroskedasticity is specified, then by default,

$$A_1 = W_2' W_2 - \mathrm{diag}(W_2' W_2) \tag{3.4.111}$$

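and $A_2 = W_2$ as in (3.4.110).

Step 2a: Generalized Spatial Two-Stage Least-Squares Estimator of $\delta$

In the second step, we first estimate $\delta$ by 2SLS from the transformed model by using the instruments $H$, where the spatial Cochrane-Orcutt transformation uses $\tilde\rho$. The resulting GS2SLS estimator of $\delta$ is now given by

$$\hat\delta(\tilde\rho) = \left[\hat Z(\tilde\rho)' Z(\tilde\rho)\right]^{-1} \hat Z(\tilde\rho)' y(\tilde\rho) \tag{3.4.112}$$

where $y(\tilde\rho) = (I_N - \tilde\rho W_2) y$, $Z(\tilde\rho) = (I_N - \tilde\rho W_2) Z$ and $\hat Z(\tilde\rho) = P_H Z(\tilde\rho)$.

Step 2b: Efficient Generalized Method of Moments Estimator of $\rho$

The efficient GMM estimator of $\rho$ corresponding to GS2SLS residuals is given by

$$\hat\rho = \arg\min_\rho \left[\hat\Gamma \begin{pmatrix} \rho \\ \rho^2 \end{pmatrix} - \hat\tau\right]' \left\{\hat\psi^{\rho\rho}(\tilde\rho)\right\}^{-1} \left[\hat\Gamma \begin{pmatrix} \rho \\ \rho^2 \end{pmatrix} - \hat\tau\right] \tag{3.4.113}$$

where $\hat u = y - Z\hat\delta$ denotes the GS2SLS residuals, $\bar{\hat u} = W_2 \hat u$,

$$\hat\Gamma = n^{-1} \begin{bmatrix} \hat u' (A_1 + A_1') \bar{\hat u} & -\bar{\hat u}' A_1 \bar{\hat u} \\ \vdots & \vdots \\ \hat u' (A_S + A_S') \bar{\hat u} & -\bar{\hat u}' A_S \bar{\hat u} \end{bmatrix} \tag{3.4.114}$$

and

$$\hat\tau = n^{-1} \begin{bmatrix} \hat u' A_1 \hat u \\ \vdots \\ \hat u' A_S \hat u \end{bmatrix} \tag{3.4.115}$$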

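and where $\hat\psi^{\rho\rho}(\tilde\rho)$ is an estimator for the VC matrix of the normalized sample moment vector based on GS2SLS residuals, say, $\psi^{\rho\rho}$. The estimator $\hat\psi^{\rho\rho}(\tilde\rho)$ and $\psi^{\rho\rho}$ differ for the cases of homoskedastic and heteroskedastic errors. When homoskedasticity is specified, the $(r, s)$ element of $\hat\psi^{\rho\rho}(\tilde\rho)$ is given by ($r, s = 1, 2$):

$$\begin{aligned}
\hat\psi^{\rho\rho}_{r,s}(\tilde\rho) = {} & \left[\hat\sigma^2(\tilde\rho)\right]^2 (2n)^{-1} \mathrm{tr}\{(A_r + A_r')(A_s + A_s')\} + \hat\sigma^2(\tilde\rho)\, n^{-1} \hat a_r(\tilde\rho)' \hat a_s(\tilde\rho) \\
& + n^{-1} \left\{\hat\mu^{(4)}(\tilde\rho) - 3\left[\hat\sigma^2(\tilde\rho)\right]^2\right\} \mathrm{vec}_D(A_r)' \mathrm{vec}_D(A_s) \\
& + n^{-1} \hat\mu^{(3)}(\tilde\rho) \left\{\hat a_r(\tilde\rho)' \mathrm{vec}_D(A_s) + \hat a_s(\tilde\rho)' \mathrm{vec}_D(A_r)\right\}
\end{aligned} \tag{3.4.116}$$

where

$$\begin{aligned}
\hat a_r(\tilde\rho) &= \hat T(\tilde\rho)\, \hat\alpha_r(\tilde\rho) \\
\hat T(\tilde\rho) &= H \hat P(\tilde\rho) \\
\hat P(\tilde\rho) &= \hat Q_{HH}^{-1} \hat Q_{HZ}(\tilde\rho) \left[\hat Q_{HZ}(\tilde\rho)' \hat Q_{HH}^{-1} \hat Q_{HZ}(\tilde\rho)\right]^{-1} \\
\hat Q_{HH} &= n^{-1} H'H \\
\hat Q_{HZ}(\tilde\rho) &= n^{-1} H' Z(\tilde\rho) \\
Z(\tilde\rho) &= (I - \tilde\rho W_2) Z \\
\hat\alpha_r(\tilde\rho) &= -n^{-1} \left[Z(\tilde\rho)' (A_r + A_r')\, \hat\varepsilon(\tilde\rho)\right] \\
\hat\varepsilon(\tilde\rho) &= (I - \tilde\rho W_2)\, \hat u \\
\hat\sigma^2(\tilde\rho) &= n^{-1} \hat\varepsilon(\tilde\rho)' \hat\varepsilon(\tilde\rho) \\
\hat\mu^{(3)}(\tilde\rho) &= n^{-1} \sum_{i=1}^{n} \hat\varepsilon_i(\tilde\rho)^3 \\
\hat\mu^{(4)}(\tilde\rho) &= n^{-1} \sum_{i=1}^{n} \hat\varepsilon_i(\tilde\rho)^4
\end{aligned}$$

When heteroskedasticity is specified, the $(r, s)$ element of $\psi^{\rho\rho}$ is estimated by

$$\hat\psi^{\rho\rho}_{r,s}(\tilde\rho) = (2n)^{-1} \mathrm{tr}\left\{(A_r + A_r')\, \hat\Sigma(\tilde\rho)\, (A_s + A_s')\, \hat\Sigma(\tilde\rho)\right\} + n^{-1} \hat a_r(\tilde\rho)' \hat\Sigma(\tilde\rho)\, \hat a_s(\tilde\rho) \tag{3.4.117}$$

where $\hat\Sigma(\tilde\rho)$ is a diagonal matrix whose $i$th diagonal element is $\hat\varepsilon_i^2(\tilde\rho)$, and $\hat\varepsilon(\tilde\rho)$ and $\hat a_r(\tilde\rho)$ are as defined above. The last two terms in (3.4.116) do not appear in (3.4.117) because the $A_s$ matrices used in the heteroskedastic case have diagonal elements equal to 0.

Having computed the estimator $\hat\theta = (\hat\delta', \hat\rho)'$ in steps 1a, 1b, 2a, and 2b, we next compute a consistent estimator for its asymptotic VC matrix, say, $\Omega$. The estimator is given by

$$n\hat\Omega \tag{3.4.118}$$

where

$$\hat\Omega = \begin{bmatrix} \hat\Omega_{\delta\delta} & \hat\Omega_{\delta\rho} \\ \hat\Omega_{\delta\rho}' & \hat\Omega_{\rho\rho} \end{bmatrix}$$

$$\begin{aligned}
\hat\Omega_{\delta\delta} &= \hat P(\hat\rho)' \hat\psi^{\delta\delta}(\hat\rho) \hat P(\hat\rho) \\
\hat\Omega_{\delta\rho} &= \hat P(\hat\rho)' \hat\psi^{\delta\rho}(\hat\rho) \left[\hat\psi^{\rho\rho}(\hat\rho)\right]^{-1} \hat J \left[\hat J' \left[\hat\psi^{\rho\rho}(\hat\rho)\right]^{-1} \hat J\right]^{-1} \\
\hat\Omega_{\rho\rho} &= \left[\hat J' \left[\hat\psi^{\rho\rho}(\hat\rho)\right]^{-1} \hat J\right]^{-1} \\
\hat J &= \hat\Gamma \begin{pmatrix} 1 \\ 2\hat\rho \end{pmatrix}
\end{aligned}$$

In the above, $\hat\psi^{\rho\rho}(\hat\rho)$ and $\hat P(\hat\rho)$ are as defined in (3.4.116) and (3.4.117) with $\tilde\rho$ replaced by $\hat\rho$. The estimators $\hat\psi^{\delta\delta}(\hat\rho)$ and $\hat\psi^{\delta\rho}(\hat\rho)$ are defined as follows. When homoskedasticity is specified,

$$\hat\psi^{\delta\delta}(\hat\rho) = \hat\sigma^2(\hat\rho)\, \hat Q_{HH} \tag{3.4.119}$$

$$\hat\psi^{\delta\rho}(\hat\rho) = \hat\sigma^2(\hat\rho)\, n^{-1} H' \{\hat a_1(\hat\rho), \hat a_2(\hat\rho)\} + \hat\mu^{(3)}(\hat\rho)\, n^{-1} H' \{\mathrm{vec}_D(A_1), \mathrm{vec}_D(A_2)\} \tag{3.4.120}$$

When heteroskedasticity is specified,

$$\hat\psi^{\delta\delta}(\hat\rho) = n^{-1} H' \hat\Sigma(\hat\rho) H \tag{3.4.121}$$

$$\hat\psi^{\delta\rho}(\hat\rho) = n^{-1} H' \hat\Sigma(\hat\rho) \{\hat a_1(\hat\rho), \hat a_2(\hat\rho)\} \tag{3.4.122}$$

We note that the expression for $\hat\Omega_{\rho\rho}$ has the simple form given above because the estimator in step 2b is the efficient GMM estimator.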
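Steps 1a and 1b described earlier in this section can be sketched compactly for the default homoskedastic case. The code is an illustration under assumptions, not the STATA routine: the function names `two_sls`, `moment_matrices`, `gmm_moments` and `gmm_rho` are hypothetical, $\rho$ is searched on a grid over $(-1, 1)$ instead of by nonlinear least squares, and the weighting matrix of step 2b is omitted.

```python
import numpy as np


def two_sls(y, Z, H):
    """2SLS estimator (Z~' Z)^{-1} Z~' y with Z~ = P_H Z, as in (3.4.105)."""
    Z_tilde = H @ np.linalg.lstsq(H, Z, rcond=None)[0]   # P_H Z without forming P_H
    return np.linalg.solve(Z_tilde.T @ Z, Z_tilde.T @ y)


def moment_matrices(W2):
    """A_1 and A_2 for the homoskedastic case, (3.4.109) and (3.4.110)."""
    n = W2.shape[0]
    WtW = W2.T @ W2
    c = np.trace(WtW) / n
    A1 = (WtW - c * np.eye(n)) / (1.0 + c ** 2)
    return [A1, W2]


def gmm_moments(u, W2):
    """Sample moment matrices Gamma (3.4.107) and tau (3.4.108) from residuals u."""
    n = u.shape[0]
    u_bar = W2 @ u
    A_list = moment_matrices(W2)
    Gamma = np.array([[u @ (A + A.T) @ u_bar, -(u_bar @ A @ u_bar)]
                      for A in A_list]) / n
    tau = np.array([u @ A @ u for A in A_list]) / n
    return Gamma, tau


def gmm_rho(u, W2, grid_size=1999):
    """Initial (unweighted) GMM estimator of rho, (3.4.106), by grid search."""
    Gamma, tau = gmm_moments(u, W2)
    rhos = np.linspace(-0.99, 0.99, grid_size)
    crit = [np.sum((Gamma @ np.array([r, r * r]) - tau) ** 2) for r in rhos]
    return rhos[int(np.argmin(crit))]
```

In use, one would compute `delta = two_sls(y, Z, H)`, form the residuals `u = y - Z @ delta`, and pass them to `gmm_rho(u, W2)`. Steps 2a and 2b repeat the same two calculations on the Cochrane-Orcutt transformed data, with the criterion weighted by the inverse of $\hat\psi^{\rho\rho}$.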

Spatial Autoregressive Model without Spatially Correlated Errors

Consider the case $\rho = 0$, that is, the case where the disturbances are not spatially correlated. In this case, only step 1a is necessary, and spivreg (in STATA) estimates $\delta$ by 2SLS using as instruments $H$ the linearly independent columns of $\{X_f, W_1 X_f, \ldots, W_1^q X_f\}$. The 2SLS estimator is given by:

$$\tilde\delta = (\tilde Z' Z)^{-1} \tilde Z' y \tag{3.4.123}$$

where $\tilde Z = P_H Z$. When homoskedasticity is specified, the asymptotic VC matrix of $\tilde\delta$ can be estimated consistently by

$$\tilde\sigma^2 (\tilde Z' \tilde Z)^{-1} \tag{3.4.124}$$

where $\tilde\sigma^2 = n^{-1} \sum_{i=1}^{n} \tilde u_i^2$ and $\tilde u = y - Z\tilde\delta$ denotes the 2SLS residuals.

When heteroskedasticity is specified, the asymptotic VC matrix of $\tilde\delta$ can be estimated consistently by the standard form

$$(\tilde Z' \tilde Z)^{-1} \tilde Z' \tilde\Sigma \tilde Z (\tilde Z' \tilde Z)^{-1} \tag{3.4.125}$$

where $\tilde\Sigma$ is the diagonal matrix whose $i$th element is $\tilde u_i^2$.

Spatially Correlated Errors without a Spatial Autoregressive Term

Consider the case $\lambda = 0$, that is, the case where there is no spatially lagged dependent variable in the model. In this case, we use the same formulas as in section 3.4.3 after re-defining $Z = (Y, X)$ and $\delta = (\pi', \beta')'$, and we take $H$ to be composed of the linearly independent columns of $(X_f, W_2 X_f)$.

No Spatial Autoregressive Term or Spatially Correlated Errors

When the model does not contain a SAR term or spatially correlated errors, the 2SLS estimator provides consistent estimates, and we obtain our results by using ivregress. When homoskedasticity is specified, the conventional estimator of the asymptotic VC is used. When heteroskedasticity is specified, the vce (robust) estimator of the asymptotic VC is used. When no endogenous variables are specified, we obtain our results by using regress.

3.5

Measures of Spatial Dependence

In geography, the best known statistic for measuring spatial dependence is Moran's I, denoted by $M$ (Moran, 1948), and, to a lesser extent, Geary's C (Cliff and Ord, 1973). The G statistics are a family of statistics with a number of attributes that make them attractive for measuring dependence in a spatially distributed variable, especially when used in conjunction with the $M$ statistic. They deepen the knowledge of the processes that give rise to spatial dependence and enhance detection of local 'pockets' of dependence that may not show up when using the global statistic. In the following section, we outline the characteristics and attributes of the statistic that make it attractive for this study and then compare it with the $M$ statistic as illustrated in Getis and Ord (1992).

3.5.1 Moran's Index

This global measure of spatial dependence was developed by Moran (1948). The index measures spatial dependence based on feature locations and attribute values, and evaluates whether the pattern is clustered, dispersed or random. The null hypothesis states that the feature values are randomly distributed across the study area. When the z-score or p-value indicates statistical significance, a positive M value indicates a tendency towards clustering, while a negative M value indicates a tendency towards dispersion. The M statistic is structured as a Pearson product-moment correlation coefficient, with the addition of W, the contiguity weights matrix. The covariance term, that is, the relation between spatial units i and j, is calculated as $(y_i - \bar{y})(y_j - \bar{y})$. The obtained measure is scaled by

$$\frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} W_{ij} \, \sum_{i=1}^{n} (y_i - \bar{y})^2}.$$

By convention, $i \neq j$. As a result,

$$M = \frac{n \sum_{i=1}^{n}\sum_{j=1}^{n} W_{ij} (y_i - \bar{y})(y_j - \bar{y})}{\sum_{i=1}^{n}\sum_{j=1}^{n} W_{ij} \, \sum_{i=1}^{n} (y_i - \bar{y})^2}, \quad i \neq j \tag{3.5.1}$$

where $y_i$ is the value of variable y on segment i, $\bar{y}$ is the mean of variable y, n is the number of segments, and $W_{ij}$ is a weight indicating whether segment i is connected to segment j (e.g. 1) or not (e.g. 0). The summation operators run over i = 1, 2, ..., n and j = 1, 2, ..., n in all cases.
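As a rough illustration of equation (3.5.1), the following Python sketch computes the global M statistic for a small binary weights matrix. The function name `morans_m`, the four-segment linear route and the attribute values are assumptions made for illustration only; they are not part of the study data.

```python
def morans_m(y, w):
    """Global Moran statistic M for values y and an n x n weights matrix w.

    Terms with i == j are skipped, following the convention i != j.
    """
    n = len(y)
    y_bar = sum(y) / n
    dev = [v - y_bar for v in y]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    w_total = sum(w[i][j] for i, j in pairs)                   # sum of all weights
    cross = sum(w[i][j] * dev[i] * dev[j] for i, j in pairs)   # weighted cross products
    sum_sq = sum(d * d for d in dev)                           # sum of squared deviations
    return n * cross / (w_total * sum_sq)


# Hypothetical linear route of four segments: consecutive segments are connected.
w_route = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(morans_m([1, 2, 3, 4], w_route))  # smooth trend along the route: positive M
print(morans_m([1, 0, 1, 0], w_route))  # alternating values: negative M
```

In this toy case, similar values on neighbouring segments give a positive M and alternating values give a negative M, matching the clustering and dispersion interpretation above.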


The expected value of M is

$$E(M) = -\frac{1}{n - 1} \tag{3.5.2}$$

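The variance of M under the assumption of normally distributed data is

$$\operatorname{Var}(M) = \frac{n^2 s_1 - n s_2 + 3\left(\sum_i \sum_j w_{ij}\right)^2}{(n^2 - 1)\left(\sum_i \sum_j w_{ij}\right)^2} - [E(M)]^2 \tag{3.5.3}$$

where

$$s_1 = \frac{1}{2} \sum_i \sum_j (w_{ij} + w_{ji})^2 \quad \text{and} \quad s_2 = \sum_i \left(\sum_j w_{ij} + \sum_j w_{ji}\right)^2.$$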
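A minimal sketch of how the moments of M could be evaluated under the normality assumption is given below. It takes the variance as the second moment written in terms of $s_1$ and $s_2$ minus the squared expectation, which is the usual Cliff and Ord form. The function names and the four-segment weights matrix are hypothetical.

```python
import math


def moran_moments_normal(w):
    """Expected value and variance of M under normality for a weights matrix w
    with a zero diagonal."""
    n = len(w)
    idx = range(n)
    s0 = sum(w[i][j] for i in idx for j in idx)                       # sum of all weights
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in idx for j in idx)
    s2 = sum(
        (sum(w[i][j] for j in idx) + sum(w[j][i] for j in idx)) ** 2  # (row sum + column sum)^2
        for i in idx
    )
    expected = -1.0 / (n - 1)
    second_moment = (n * n * s1 - n * s2 + 3 * s0 ** 2) / ((n * n - 1) * s0 ** 2)
    return expected, second_moment - expected ** 2


def moran_z(m, w):
    """Standard normal deviate of an observed M value."""
    expected, variance = moran_moments_normal(w)
    return (m - expected) / math.sqrt(variance)


# Hypothetical linear route of four connected segments.
w_route = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

expected, variance = moran_moments_normal(w_route)
print(expected, variance, moran_z(1.0 / 3.0, w_route))
```

With only four segments the normal approximation is of course not reliable; the example is meant only to make the arithmetic of the formula concrete.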

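The local M and its Z-score are given below, where $\bar{y}$ is the mean value and $s^2$ is the variance of y:

$$M_i = \frac{(y_i - \bar{y})}{s^2} \sum_j w_{ij} (y_j - \bar{y}) \tag{3.5.4}$$

while

$$Z(M_i) = \frac{M_i - E(M_i)}{\sqrt{\operatorname{Var}(M_i)}} \tag{3.5.5}$$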

When two segments connect, this is represented by a value of 1 in W; if they do not connect, 0 is entered. For any set of n segments of a linear route, there will be 2(n − 1) joins, because each of the n − 1 connections between consecutive segments is recorded in both directions. If the focus of the analysis is not a single linear route but an entire network, then the connectivity of segments to each other may need to be identified by inspection. Once identified, a binary connection matrix of segments represents the presence or absence of connections.
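The join count for a linear route can be checked with a short sketch. The helper name `linear_route_weights` is an assumption for illustration; it simply builds the binary connection matrix described above for n consecutive segments.

```python
def linear_route_weights(n):
    """Binary connection matrix for n consecutive segments of a linear route."""
    w = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1  # segment i connects to the next segment
        w[i + 1][i] = 1  # and the connection is symmetric
    return w


w = linear_route_weights(5)
joins = sum(sum(row) for row in w)
print(joins)  # 2 * (5 - 1) = 8
```

For a full road network the same matrix would have to be filled in by inspection of the segment connections rather than by this simple rule.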


3.5.2 Getis and Ord General Statistic $G_i(d)$

This statistic measures the degree of association that results from the concentration of weighted points (or areas represented by a weighted point) and all other weighted points included within a radius of distance d from the original weighted point. The basis is now

$$\Gamma_i = \sum_j W_{ij} Y_{ij}, \quad i \neq j \tag{3.5.6}$$

We assume an area subdivided into n regions, i = 1, 2, ..., n, where each region is identified with a point whose Cartesian coordinates are known. Each i has associated with it a value y (a weight) taken from variable Y. The variable has a natural origin and is positive. The Getis and Ord $G_i(d)$ statistic developed below allows for tests of hypotheses about the spatial concentration of the sum of y values associated with the j points within d of the ith point. The statistic is

$$G_i(d) = \frac{\sum_{j=1}^{n} w_{ij}(d) \, y_j}{\sum_{j=1}^{n} y_j}, \quad j \neq i \tag{3.5.7}$$

where $\{w_{ij}\}$ is a symmetric one/zero spatial weights matrix W with ones for all links defined as being within distance d of a given i; all other links are zero, including the link of point i to itself. The numerator is the sum of all $y_j$ within d of i but not including $y_i$. The denominator is the sum of all $y_j$ not including $y_i$. We may fix the value $y_i$ for the ith point and consider the set of (n − 1)! random permutations of the remaining y values at the j points. Under the null hypothesis of no spatial dependence, these permutations are equally likely. That is, let $Y_j$ be the random variable describing the value assigned to point j; then

$$P(Y_j = y_r) = \frac{1}{n - 1}, \quad r \neq i \tag{3.5.8}$$

and

$$E(Y_j) = \sum_{r \neq i} \frac{y_r}{n - 1} \tag{3.5.9}$$

Thus,

$$E(G_i) = \sum_{j \neq i} w_{ij}(d) \, E(Y_j) \Big/ \sum_{j \neq i} y_j = \frac{W_i}{n - 1} \tag{3.5.10}$$

where

$$W_i = \sum_j w_{ij}(d) \tag{3.5.11}$$

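Similarly,

$$E(G_i^2) = \frac{1}{\left(\sum_j y_j\right)^2} \left[ \sum_j w_{ij}^2(d) \, E(Y_j^2) + \sum_{j \neq k} w_{ij}(d) \, w_{ik}(d) \, E(Y_j Y_k) \right] \tag{3.5.12}$$

Since

$$E(Y_j^2) = \sum_{r \neq i} \frac{y_r^2}{n - 1} \tag{3.5.13}$$

$$E(Y_j Y_k) = \sum_{r \neq s \neq i} \frac{y_r y_s}{(n - 1)(n - 2)} \tag{3.5.14}$$

$$= \left\{ \left( \sum_{r \neq i} y_r \right)^2 - \sum_{r \neq i} y_r^2 \right\} \Big/ \left[ (n - 1)(n - 2) \right] \tag{3.5.15}$$

Recalling that the weights are binary,

$$\sum_{j \neq k} w_{ij} w_{ik} = W_i^2 - W_i \tag{3.5.16}$$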

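and so

$$E(G_i^2) = \frac{1}{\left(\sum_j y_j\right)^2} \left[ \frac{W_i \sum_j y_j^2}{n - 1} + \frac{W_i (W_i - 1)}{(n - 1)(n - 2)} \left\{ \left( \sum_j y_j \right)^2 - \sum_j y_j^2 \right\} \right] \tag{3.5.17}$$

Thus,

$$\operatorname{Var}(G_i) = E(G_i^2) - E^2(G_i) \tag{3.5.18}$$

$$= \frac{1}{\left(\sum_j y_j\right)^2} \left[ \frac{W_i (n - 1 - W_i) \sum_j y_j^2}{(n - 1)(n - 2)} \right] + \frac{W_i (W_i - 1)}{(n - 1)(n - 2)} - \frac{W_i^2}{(n - 1)^2} \tag{3.5.19}$$

If we set

$$\frac{\sum_j y_j}{n - 1} = X_{i1} \tag{3.5.20}$$

and

$$\frac{\sum_j y_j^2}{n - 1} - X_{i1}^2 = X_{i2} \tag{3.5.21}$$

then

$$\operatorname{Var}(G_i) = \frac{W_i (n - 1 - W_i)}{(n - 1)^2 (n - 2)} \left( \frac{X_{i2}}{X_{i1}^2} \right) \tag{3.5.22}$$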

As expected,

$$\operatorname{Var}(G_i) = 0 \tag{3.5.23}$$

when $W_i = 0$ (no neighbours are within d), or when $W_i = n - 1$ (all n − 1 observations are within d), or when $X_{i2} = 0$ (all n − 1 observations are equal). Note that $W_i$, $X_{i1}$ and $X_{i2}$ depend on i. Since $G_i$ is a weighted sum of the variables $Y_j$ and the denominator of $G_i$ is invariant under random permutations of $\{y_j, j \neq i\}$, it follows, provided $W_i/(n - 1)$ is bounded away from 0 and from 1, that the permutation distribution of $G_i$ under the null hypothesis approaches normality as n → ∞. When d, and thus $W_i$, is small, normality is lost; when d is large enough to encompass the whole study area, so that $(n - 1 - W_i)$ is small, normality is also lost. It is important to note that the conditions must be satisfied separately for each point if its $G_i$ is to be assessed via the normal approximation.
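The quantities in equations (3.5.7), (3.5.10) and (3.5.20) to (3.5.22) can be put together in a short sketch. The function name `getis_ord_gi`, the five collinear points and their y values are hypothetical, and "within d" is taken here to mean a Euclidean distance of at most d.

```python
import math


def getis_ord_gi(coords, y, i, d):
    """Return G_i(d), its expectation, its variance and the standard normal
    deviate for point i, excluding point i itself from all sums."""
    n = len(y)
    others = [j for j in range(n) if j != i]
    # Binary weights: 1 if point j lies within distance d of point i.
    w = {j: 1 if math.dist(coords[i], coords[j]) <= d else 0 for j in others}
    w_i = sum(w.values())
    total = sum(y[j] for j in others)
    g = sum(w[j] * y[j] for j in others) / total
    x1 = total / (n - 1)
    x2 = sum(y[j] ** 2 for j in others) / (n - 1) - x1 ** 2
    expected = w_i / (n - 1)
    variance = w_i * (n - 1 - w_i) / ((n - 1) ** 2 * (n - 2)) * x2 / x1 ** 2
    z = (g - expected) / math.sqrt(variance) if variance > 0 else float("nan")
    return g, expected, variance, z


# Hypothetical points on a line, one unit apart, with made-up weights y.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
y = [1, 5, 5, 1, 1]

# For i = 0 and d = 1.5 only point 1 is within d:
# G_i = 5/12, E(G_i) = 1/4, Var(G_i) = 1/36, so Z is 1 up to rounding.
print(getis_ord_gi(coords, y, 0, 1.5))
```

The guard on a zero variance mirrors the three degenerate cases listed above, in which the normal approximation cannot be used.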


Getis and Ord Specific Statistic $G_i^*(d)$

The characteristic equations for $G_i(d)$ and the related Getis and Ord specific statistic $G_i^*(d)$, which measures association in cases where the j equal to i term is included, are given as follows. For j not equal to i,

$$G_i(d) = \frac{\sum_j w_{ij}(d) \, y_j}{\sum_j y_j} \tag{3.5.24}$$

$$W_i = \sum_j w_{ij}(d) \tag{3.5.25}$$

Here,

$$X_{i1} = \frac{\sum_j y_j}{n - 1} \tag{3.5.26}$$

$$X_{i2} = \frac{\sum_j y_j^2}{n - 1} - X_{i1}^2 \tag{3.5.27}$$

Thus,

$$E[G_i(d)] = \frac{W_i}{n - 1} \tag{3.5.28}$$

and

$$V[G_i(d)] = \frac{W_i (n - 1 - W_i)}{(n - 1)^2 (n - 2)} \, \frac{X_{i2}}{X_{i1}^2} \tag{3.5.29}$$

Now, suppose j may equal i:

$$G_i^*(d) = \frac{\sum_j w_{ij}(d) \, y_j}{\sum_j y_j} \tag{3.5.30}$$

$$W_i^* = \sum_j w_{ij}(d) \tag{3.5.31}$$

Now,

$$X_{i1}^* = \frac{\sum_j y_j}{n} \tag{3.5.32}$$

while

$$X_{i2}^* = \frac{\sum_j y_j^2}{n} - (X_{i1}^*)^2 \tag{3.5.33}$$

and

$$E[G_i^*(d)] = \frac{W_i^*}{n} \tag{3.5.34}$$

while

$$V[G_i^*(d)] = \frac{W_i^* (n - W_i^*)}{n^2 (n - 1)} \, \frac{X_{i2}^*}{(X_{i1}^*)^2} \tag{3.5.35}$$

For the $G_i^*(d)$ statistic, this implies that any concentration of the y values includes the y value at i. Note that the distribution is evaluated under the null hypothesis that all n! random permutations are equally likely.
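The following sketch mirrors equations (3.5.30) to (3.5.35) for the statistic that includes the value at i. As before, the function name, the five collinear points and the y values are hypothetical, and the weight of point i on itself is set to one.

```python
import math


def getis_ord_gi_star(coords, y, i, d):
    """Return G*_i(d), its expectation, its variance and the standard normal
    deviate for point i, with point i itself included in all sums."""
    n = len(y)
    # Binary weights; the distance from i to itself is 0, so w_ii = 1.
    w = [1 if math.dist(coords[i], coords[j]) <= d else 0 for j in range(n)]
    w_star = sum(w)
    total = sum(y)
    g_star = sum(wj * yj for wj, yj in zip(w, y)) / total
    x1 = total / n
    x2 = sum(v * v for v in y) / n - x1 ** 2
    expected = w_star / n
    variance = w_star * (n - w_star) / (n ** 2 * (n - 1)) * x2 / x1 ** 2
    z = (g_star - expected) / math.sqrt(variance) if variance > 0 else float("nan")
    return g_star, expected, variance, z


# Hypothetical points on a line, one unit apart, with made-up weights y.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
y = [1, 5, 5, 1, 1]

# For i = 0 and d = 1.5, points 0 and 1 are included:
# G*_i = 6/13 and E(G*_i) = 2/5.
print(getis_ord_gi_star(coords, y, 0, 1.5))
```

Compared with the version that excludes i, the low value at point 0 now enters the numerator, which lowers the standardized score in this toy example.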

Attributes of Getis and Ord Specific Statistic $G_i$

It is important to note that the Getis and Ord specific statistic $G_i$ is scale-invariant, in that $X_i = bY_i$ yields the same scores as $Y_i$, but not location-invariant, in that $X_i = a + Y_i$ gives different results from $Y_i$. The statistic is intended for use only with variables that possess a natural origin. As with all other such statistics, transformations like $X_i = \log Y_i$ will change the results. $G_i(d)$ measures the concentration or lack of concentration of the sum of values associated with variable Y in the region under study. $G_i(d)$ is the proportion of the sum of all $y_j$ values that lies within d of i. For example, if high-value $y_j$s are within d of point i, then $G_i(d)$ is high. Whether the $G_i(d)$ value is statistically significant depends on the statistic's distribution. In typical circumstances, the null hypothesis is that the set of y values within d of location i is a random sample drawn without replacement from the set of all y values. The estimated $G_i(d)$ is computed from

$$G_i(d) = \frac{\sum_j w_{ij}(d) \, y_j}{\sum_j y_j} \tag{3.5.36}$$

using the observed $y_j$ values. Assuming that $G_i(d)$ is approximately normally distributed, when

$$Z_i = \frac{G_i(d) - E[G_i(d)]}{\sqrt{\operatorname{Var} G_i(d)}} \tag{3.5.37}$$

is positively or negatively greater than some specified level of significance, we say that positive or negative spatial dependence exists. A large positive $Z_i$ implies that large values of $y_j$ (values above the mean $y_j$) are within d of point i. A large negative $Z_i$ means that small values of $y_j$ are within d of point i. A special feature of this statistic is that the pattern of data points is neutralized when the expectation is that all y values are the same. Suppose data point densities are high in the vicinity of point i and d is just large enough to contain the area of the clustered points. Theoretical $G_i(d)$ values are high because $W_i$ is high. However, only if the observed $y_j$ values in the vicinity of point i differ systematically from the mean is there the opportunity to identify significant spatial concentration of the sum of $y_j$s. That is, as the data points become more clustered in the vicinity of point i, the expectation of $G_i(d)$ rises, neutralizing the effect of the dense cluster of j values. In addition, the value of d can be interpreted as a distance that incorporates specified cells in a lattice. It is to be expected that neighbouring $G_i$ will be correlated if d includes neighbours.

Attributes of Getis and Ord General Statistic G

The G statistic is general in the sense that it is based on all pairs of values $(y_i, y_j)$ such that i and j are within distance d of each other. No particular location i is fixed in this case. The statistic is

$$G(d) = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}(d) \, y_i y_j}{\sum_{i=1}^{n} \sum_{j=1}^{n} y_i y_j}, \quad j \neq i \tag{3.5.38}$$

Now, $W = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}(d)$, j not equal to i, so that $E[G(d)] = W/[n(n - 1)]$, while the variance of G follows from Cliff and Ord (1973):

$$E(G^2) = \frac{1}{(m_1^2 - m_2)^2 \, n^{(4)}} \left[ B_0 m_2^2 + B_1 m_4 + B_2 m_1^2 m_2 + B_3 m_1 m_3 + B_4 m_1^4 \right] \tag{3.5.39}$$

where $m_j = \sum_{i=1}^{n} y_i^j$, j = 1, 2, 3, 4, and $n^{(r)} = n(n - 1)(n - 2) \cdots (n - r + 1)$.

The coefficients B are

$B_0 = (n^2 - 3n + 3) S_1 - n S_2 + 3W^2$;
$B_1 = -[(n^2 - n) S_1 - 2n S_2 + 3W^2]$;
$B_2 = -[2n S_1 - (n + 3) S_2 + 6W^2]$;
$B_3 = 4(n - 1) S_1 - 2(n + 1) S_2 + 8W^2$;
and $B_4 = S_1 - S_2 + W^2$,

where $S_1 = \frac{1}{2} \sum_i \sum_j (w_{ij} + w_{ji})^2$, j not equal to i, and $S_2 = \sum_i (w_{i.} + w_{.i})^2$, with $w_{i.} = \sum_j w_{ij}$, j not equal to i. Thus,

$$\operatorname{Var}(G) = E(G^2) - \{ W / [n(n - 1)] \}^2 \tag{3.5.40}$$
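A minimal sketch of the general statistic and its expectation is shown below; the variance is left out because it requires the full set of B coefficients. The function name `general_g`, the weights matrix and the y values are assumptions for illustration.

```python
def general_g(y, w):
    """Return G(d) and E[G(d)] for values y and a binary weights matrix w
    whose entries already encode 'within distance d'."""
    n = len(y)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    numerator = sum(w[i][j] * y[i] * y[j] for i, j in pairs)
    denominator = sum(y[i] * y[j] for i, j in pairs)
    w_total = sum(w[i][j] for i, j in pairs)
    return numerator / denominator, w_total / (n * (n - 1))


# Hypothetical example: four points, each linked only to its immediate neighbours.
w_d = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

g, expected = general_g([1, 2, 3, 4], w_d)
print(g, expected)  # 4/7 against an expectation of 1/2
```

Here the observed G exceeds its expectation because the larger values sit next to each other at one end of the sequence.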

3.5.3 Getis and Ord and Moran Statistics

The G(d) statistic measures the overall concentration or lack of concentration of all pairs of $(y_i, y_j)$ such that i and j are within d of each other. From

$$G(d) = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}(d) \, y_i y_j}{\sum_{i=1}^{n} \sum_{j=1}^{n} y_i y_j} \tag{3.5.41}$$

one finds the value by taking the sum of the products of each $y_i$ with all $y_j$s within d of all i as a proportion of the sum of all $y_i y_j$. M, on the other hand, is often used to measure the correlation of each $y_i$ with all $y_j$s within d of i. Therefore, it is based on the degree of covariance within d of all $y_i$. Consider $K_1$ and $K_2$ as constants invariant under random permutations. Then,

$$G(d) = K_1 \sum \sum w_{ij} y_i y_j \tag{3.5.42}$$

and

$$M(d) = K_2 \sum \sum w_{ij} (y_i - \bar{y})(y_j - \bar{y}) \tag{3.5.43}$$

$$= (K_2 / K_1) \, G(d) - K_2 \bar{y} \sum_i (w_{i.} + w_{.i}) y_i + K_2 \bar{y}^2 W \tag{3.5.44}$$

where $w_{i.} = \sum_j w_{ij}$ and $w_{.i} = \sum_j w_{ji}$.

Since both G(d) and M(d) can measure the association among the same set of weighted points or areas represented by points, they may be compared. They will differ when the weighted sums $\sum w_{i.} y_i$ and $\sum w_{.i} y_i$ differ from $W\bar{y}$, that is, when the patterns of weights are unequal. The basic hypothesis is of a random pattern in each case. We may compare the performance of the two measures by using their equivalent Z values of the approximate normal distribution.

3.5.4 Multivariate Spatial Correlation

A multivariate coefficient of spatial autocorrelation between two standardized random variables $z_k$ and $z_l$ is defined as:

$$m_{kl} = z_k' W^s z_l \tag{3.5.45}$$

where $z_k = [x_k - \bar{x}_k]/\sigma_k$ and $z_l = [x_l - \bar{x}_l]/\sigma_l$ have been standardized such that the mean is zero and the standard deviation equals one, and $W^s$ is a doubly standardized (or stochastic) spatial weights matrix. The weights matrix defines the neighbour set for each observation (with non-zero elements for neighbours, zero for others) and has zeros on the diagonal by convention. This yields a multivariate counterpart of a Moran-like spatial autocorrelation statistic as:

$$M_{kl} = \frac{z_k' W z_l}{z_k' z_k} \tag{3.5.46}$$

or

$$M_{kl} = \frac{z_k' W z_l}{n} \tag{3.5.47}$$

with n as the number of observations, and W as the familiar row-standardized spatial weights matrix.
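A small sketch of equation (3.5.47) is given below. It assumes the population standard deviation is used for standardization, so that $z_k' z_k = n$ and the two forms (3.5.46) and (3.5.47) coincide. The helper names and the toy data are hypothetical.

```python
import math


def standardize(x):
    """Standardize x to mean zero and (population) standard deviation one."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return [(v - mean) / sd for v in x]


def row_standardize(w):
    """Scale each row of w so that it sums to one (rows of zeros stay zero)."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def bivariate_moran(x_k, x_l, w):
    """M_kl = z_k' W z_l / n with a row-standardized W."""
    z_k, z_l = standardize(x_k), standardize(x_l)
    w_rs = row_standardize(w)
    n = len(z_k)
    lag_l = [sum(w_rs[i][j] * z_l[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(z_k, lag_l)) / n


# Hypothetical chain of four areas; x_l is high exactly where the neighbours of
# high x_k areas are located.
w_chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(bivariate_moran([1, 0, 1, 0], [0, 1, 0, 1], w_chain))  # strongly positive
print(bivariate_moran([1, 0, 1, 0], [1, 0, 1, 0], w_chain))  # strongly negative
```

The second call uses the same variable twice and therefore reduces to the univariate case on standardized data.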

3.5.5 Generalized Moran Scatterplot

The Moran scatterplot visualizes a spatial autocorrelation statistic as the slope of the regression line in a scatterplot with the spatial lag on the vertical axis and the original variable on the horizontal axis (using the variables in standardized form). This follows from the structure of the M statistic, which has a cross product between z and Wz in the numerator, and the sum of squares of z in the denominator. For standardized variates, this corresponds to the slope of a regression line of Wz on z. A multivariate generalization of this plot follows by using $W z_l$ on the vertical axis and $z_k$ on the horizontal axis. The slope of the linear regression through this scatterplot equals the statistic

$$M_{kl} = \frac{z_k' W z_l}{z_k' z_k} \tag{3.5.48}$$

In addition, the four quadrants of the scatterplot correspond to four types of multivariate spatial association, depending on how the value for $z_k$ at i compares to the corresponding spatial lag for $z_l$. Relative to the mean (all values are standardized), this suggests two classes of positive spatial correlation, or spatial clusters (high-high and low-low), and two classes of negative spatial correlation, or spatial outliers (high-low and low-high). Points in each of the quadrants can be linked with their location on a map or on any of the other statistical graphs, such as a non-spatial scatterplot between $z_l$ and $z_k$. Inference can be based on a permutation approach.
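The quadrant classification can be sketched as follows. The inputs are assumed to be already standardized values of $z_k$ and the corresponding spatial lags of $z_l$; because $z_k$ has mean zero, the regression slope reduces to the ratio of cross products to the sum of squares. The function name and the numbers are hypothetical.

```python
def moran_scatterplot(z_k, lag_l):
    """Return the slope of the regression of lag_l on z_k (z_k has mean zero)
    and the quadrant label of each observation."""
    slope = sum(a * b for a, b in zip(z_k, lag_l)) / sum(a * a for a in z_k)
    labels = []
    for a, b in zip(z_k, lag_l):
        if a >= 0 and b >= 0:
            labels.append("high-high")   # spatial cluster of high values
        elif a < 0 and b < 0:
            labels.append("low-low")     # spatial cluster of low values
        elif a >= 0:
            labels.append("high-low")    # spatial outlier
        else:
            labels.append("low-high")    # spatial outlier
    return slope, labels


# Made-up standardized values and spatial lags covering all four quadrants.
slope, labels = moran_scatterplot([1.0, -1.0, 1.0, -1.0], [0.5, -1.0, -0.5, 1.0])
print(slope, labels)
```

In practice the labels would be joined back to the map so that clusters and outliers can be located.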


3.5.6 Summary and Conclusion

The standard reasoning behind accounting for spatial dependence is that it controls for all space-specific, time-invariant variables whose omission could bias the estimates in a typical cross-sectional study.


Chapter 4

ANALYSIS AND RESULTS

4.1 Introduction to the Chapter

This chapter describes the data used for this study and their sources. It explains the GIS procedure used for data generation based on the variables of the secondary data set obtained. A succinct explanation of the spatial analysis procedure used for hotspot analysis is also presented. Each section is followed by the results of the data analysis and a discussion of the findings. In addition, two modifications of the SAR model are presented.

4.2 Data Description

In February 1988, the Federal Government created the Federal Road Safety Commission (FRSC) through Decree No. 45 of 1988, as amended by Decree 35 of 1992, referred to in the statute books as the FRSC Act Cap 141, Laws of the Federation of Nigeria. Prior to this time, the Nigerian Police Force, established in 1930, was saddled with the responsibility of keeping RTC records. In most instances, especially when an accident is not serious, such events are not reported. Thus, it is possible that the data do not contain all the accidents that occurred within the study period and area. Nonetheless, the data and their sources are reliable, and the possible omissions are not expected to negatively influence the findings.

The FRSC RS11.3 Oyo sector command, with headquarters at Eleyele, Ibadan in Oyo state, comprises ten unit commands, each with designated service routes within the LGAs. The unit commands and the LGAs they oversee are as follows:
RS11.30 Eleyele: Ibadan North West and Ibadan South West
RS11.31 Ogbomoso: Ogbomoso South, Ogbomoso North, Ogo-Oluwa and Surulere
RS11.32 Oluyole: Ibadan South East, Oluyole and Ona Ara
RS11.33 Iddo: Ibarapa North, Ibarapa Central, Ibarapa East and Ido
RS11.34 Mokola: Ibadan North and Ibadan North East
RS11.35 Egbeda: Egbeda and Lagelu
RS11.36 Saki: Saki West, Saki East, Orelope, Atisbo, Iwajowa, Kajola and Itesiwaju
RS11.37 Kisi: Irepo, Olorunsogo and Orire
RS11.38 Atiba: Atiba, Afijio, Oyo East, Oyo West and Iseyin
RS11.39 Moniya: Akinyele

Data on RTC for 2011 and 2012 obtained from the RS11.3 FRSC Oyo sector command were used for this study, together with the 2012 traffic volume for the command. This research enjoyed the cooperation of the FRSC, which supports the quality of the results. The study focussed on the land area of each LGA, the total length of major roads within each LGA, the travel density within each LGA and the residential population of every LGA, the last of which was sourced from the National Bureau of Statistics (NBS) bulletin. For each unit command, the road or route, location and time of each RTC were obtained. The type of vehicle involved in each crash was reported as government, diplomatic, commercial or private. The vehicle registration numbers, the number of passengers, the cause of the RTC and the total casualties, namely the number of adults injured, the number of children injured and the gender of injured persons, were obtained. The number of adults killed, the number of children killed and the gender of persons killed were also included in the RTC data. The possible cause of each accident was recorded along with other information.

The possible causes of RTC reported were grouped into four categories for the purpose of this study, as follows. Dangerous Driving (DVD): dangerous driving, wrongful overtaking and dangerous overtaking; Speed Limit Violation (SPV): speed limit violation and loss of control; Mechanical Fault (MF): brake failure and tyre burst; and Human Factors (HF): driving under alcohol influence, overloading, obstruction and bad road. The approximate locations of the RTC were estimated from the record obtained from the RS11.3 Oyo FRSC command.
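The grouping of reported causes into the four categories can be expressed as a simple lookup. The sketch below is only an illustration of that grouping; the function name `categorise` and the sample list of reported causes are hypothetical and do not come from the FRSC record.

```python
from collections import Counter

# Reported cause -> study category, following the four groups described above.
CAUSE_CATEGORY = {
    "dangerous driving": "DVD",
    "wrongful overtaking": "DVD",
    "dangerous overtaking": "DVD",
    "speed limit violation": "SPV",
    "loss of control": "SPV",
    "brake failure": "MF",
    "tyre burst": "MF",
    "driving under alcohol influence": "HF",
    "overloading": "HF",
    "obstruction": "HF",
    "bad road": "HF",
}


def categorise(causes):
    """Count reported causes per category (DVD, SPV, MF, HF)."""
    return Counter(CAUSE_CATEGORY[c.strip().lower()] for c in causes)


# Made-up sample of reported causes.
sample = ["Tyre burst", "Loss of control", "Bad road", "brake failure"]
print(categorise(sample))
```

A cause that is not in the lookup raises a `KeyError`, which makes unexpected spellings in a record visible instead of silently dropping them.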
Generally, the record provides an indicative description of where each RTC occurred; in some cases the site was described using the nearest landmarks, such as filling stations, roundabouts, stores, markets, garages, institutions or road intersections. Therefore, using the landmark or locality information provided in the record together with the Google Earth image, the existing digital road network and knowledge of the study area, it was possible to place points on the approximate locations where the crashes occurred. The Google Earth image was particularly helpful because it provides a photographic view of the study area together with the associated landmarks and road networks. Using the Global Positioning System (GPS) and the information on crash locations for each of the unit commands, the coordinates of the crash locations were obtained in the world geographic coordinate system. The coordinates were subsequently exported into ArcGIS, where they were plotted as point locations. The point locations, which represent RTC locations, were overlaid on the road network of the LGAs of Oyo state. This made it possible to clip the Oyo state geo-referenced map within the ArcGIS environment to ascertain where each RTC location falls. The number of RTC points within each LGA was then counted and recorded. The GIS was used to create a polygon shapefile for the study area, and the shapefile was used to create spatial contiguity weights matrices based on the rook and queen criteria. The lengths of the roads and the area of each LGA were calculated using the measuring tool in the software environment.

The variables are defined as follows. Population (pop) is the population figure for each LGA as reported by the NBS based on the 2006 population census in Nigeria. Travel density (tdensity) is the traffic count for each FRSC unit command divided by the total major road length in kilometres within each LGA. Major road length (mlength) is the total length of roads in kilometres from each settlement to another within each LGA. Area (area) is the area of each LGA in square kilometres. Accident is the total number of RTC cases recorded in each LGA whose location was identified by taking the longitudes and latitudes. The logarithms of all the variables were taken for the purpose of this study. The lags of the population figures were used as instrumental variables.
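A minimal sketch of how the variables for one LGA could be assembled from these definitions is shown below. The function name, the use of the natural logarithm and all the numbers are assumptions; the text does not state the base of the logarithm, and all inputs are assumed to be strictly positive.

```python
import math


def lga_variables(accidents, pop, traffic_count, mlength_km, area_km2):
    """Build the log-transformed variables for one LGA.

    tdensity is the traffic count divided by the total major road length (km).
    """
    tdensity = traffic_count / mlength_km
    raw = {
        "accident": accidents,
        "pop": pop,
        "tdensity": tdensity,
        "mlength": mlength_km,
        "area": area_km2,
    }
    # Natural logarithm of every variable (assumed base).
    return {name: math.log(value) for name, value in raw.items()}


# Hypothetical figures for a single LGA.
example = lga_variables(accidents=20, pop=100000, traffic_count=5000,
                        mlength_km=50, area_km2=400)
print(example["tdensity"])  # log(5000 / 50) = log(100)
```

An LGA with zero recorded accidents would need separate handling, since the logarithm of zero is undefined.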

4.2.1 Procedure for Geographic Information System Data Generation

The first stage of the data generating process using the GIS includes georeferencing the study area. This stage involves creating shape files for the study area and the roads under study within the study area. The next stage is to collect data on length of routes and area of LGA encompassing the routes under study. ArcGIS 10.1 software was used to generate data.

Georeferencing

The procedure for georeferencing includes the following steps:
Step 1: Obtain the Nigerian map showing the LGA divisions of the states, scan it and save it in JPEG format in a folder.
Step 2: Save the folder to the C drive of your personal computer.
Step 3: From the ArcGIS 10.1 software interface, use the add data menu to call up the map from the C drive.
Step 4: Establish control points on the map.
   i. Pick the zoom in tool to highlight the coordinates of the Eastings and Northings in focus.
   ii. Select the add control points tool to locate the Eastings and Northings.
   iii. Click on the focus point.
   iv. Enter the X and Y coordinates in decimal degrees.
   v. From the menu, click the return to full extent tool.
   Repeat sub-steps i to v for three other coordinates within the Northern hemisphere.
Step 5: Go to georeferencing on the main menu. Select rectify to complete georeferencing.
Step 6: Click OK. Choose a name for the map and save it in image format in a folder on your C drive (see Figure 4.1).

Shapefiles for Study Area

The second stage is to create shapefiles for the study area and the roads under study within the study area. The following procedure is used:
Step 1: Open the ArcCatalog window from the main menu.
Step 2: Right-click on the content. Select new.
Step 3: Select shapefile to create a new shapefile.
Step 4: In the create new shapefile window, give the shapefile a name: oyo. Select feature type: polygon.
Step 5: To make a spatial reference, select edit to describe the coordinate system.
Step 6: Choose select to browse for the XY coordinate system.
Step 7: Select geographic coordinate systems.
Step 8: Look in geographic coordinate systems. Select world.
Step 9: Look in world. Select WGS 1984.prj (World Geodetic System, 1984 projection).
Step 10: From the spatial reference properties window, select apply. Select OK.
Step 11: In the create new shapefile window, click OK.

Figure 4.1: The Oyo State Map
Source: Produced by authors

The next phase is to edit the shapefiles for the study area.
Step 12: From the ArcGIS 10.1 software interface, select add data to call up the Oyo polygon shapefiles.
Step 13: Go to editor on the main menu.
Step 14: Select editor, then go to start editing.
Step 15: Choose task, create new feature.
Step 16: Choose target, Oyo polygon.
Step 17: Go to the sketch tool box and select the sketch tool. Use the sketch tool to create the Oyo state shapefiles.
Step 18: On the editor tool in the menu, open the drop box and click save edits (see Figure 4.2).

The next phase is to edit the shapefiles for the roads under study within the study area using polylines. The service road networks for the Oyo FRSC unit commands and the cadastral maps for Oyo state served as a guide.
Step 19: Go to editor on the main menu.
Step 20: Select editor, then go to start editing.
Step 21: Choose task, create new feature.
Step 22: Choose target, Oyo polyline.
Step 23: Go to the sketch tool box and select the sketch tool. Use the sketch tool to create the Oyo state roads shapefiles.
Step 24: On the editor tool in the menu, open the drop box and click save edits.

The next phase is to group roads into distinct units with special focus on the LGAs and the FRSC service units.
Step 25: From the main menu, click the add data tool. Call up the Oyo state polygon and polyline shapefiles.
Step 26: Go to editor on the main menu. Select start editing.
Step 27: Choose task, cut polygon features.
Step 28: Choose target.
Step 29: Go to the sketch tool box and start digitizing the polygon covering the commands.
Step 30: Save edits.
Repeat steps 25 to 30 for all road segments.




Figure 4.2: The Oyo State shapefile
Source: Produced by authors

Measuring Road Lengths

The third stage is to calculate the length of routes in each sector and to get the area of the LGAs encompassing the roads.
Step 1: Go to the measure tool box on the main menu.
Step 2: Select the draw a line tool and use it to measure the length of the roads in meters for all roads within each segment.
Step 3: Use the sum tool to get the total length of the roads in meters for all roads within each segment.
Step 4: Select the measure an area tool from the measure tool box on the main menu.
Step 5: Calculate the area in square meters of the LGAs encompassing the roads.
Repeat the process for all road segments.

Map Embellishment

The next stage of data generation for this study is to embellish the map for the study area.
Step 1: Select view from the ArcGIS 10.1 main menu.
Step 2: Select layout view.
Step 3: Right-click on layers, select properties.
Step 4: Select new grid on the data frame properties window.
Step 5: Create grids from the grids and graticules wizard window.
Step 6: Select grid name, graticule to divide maps by meridians and parallels.
Step 7: Select next.
Step 8: Create the appearance by selecting graticules and labels from the create a graticule window. Choose intervals. Place parallels every 1° 0′ 0″ of latitude. Place meridians every 1° 0′ 0″ of longitude. Select next.
Step 9: Choose axes from the axes and labels window: major division ticks or minor division ticks. Click next.
Step 10: From the create a graticule window, create a graticule border. Select place a simple border at the edge of graticule. For graticule properties, select store as a fixed grid that updates with changes to the data frame. Click finish.
Step 11: At the data frame properties window, click apply. Click OK.
Step 12: From the main menu go to insert. Select legend.
Step 13: From the legend wizard window, choose the layers to include in the legend. Select next.
Step 14: Give the legend title and legend title font properties. Click next. Choose the legend frame border, background and gap. Click next. To change the size and shape of the symbol patch in the legend, select the legend items whose patches are to be changed. Click next. Click finish.
Step 15: From the main menu go to insert. Select north arrow.
Step 16: From the north arrow selector window, select a north arrow. Click OK.
Step 17: From the main menu go to insert. Select scale bar.
Step 18: From the scale bar selector window, choose a scale for distance in meters. Click OK.

To save the maps and data generated in Microsoft Word format, the following procedure is used: from the main menu select file, select export map, and select save in desktop. Select the file name Research Data and save as type JPEG. Copy from the desktop to Microsoft Word.

To save the attributes of the shapefiles existing in the C drive, right-click on the shapefiles. Open the attribute table and select the portion you want to export. Select export. In the export data window, select the output table name. Select OK. Save the attributes of the map into Microsoft Excel from the C drive.

4.3 Contiguity Weights Matrices

The queen and rook criteria were used to determine neighbours in the study area. The OpenGeoDa software developed by Anselin (2005) was used to obtain the results.
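The difference between the two criteria is easiest to see on a regular grid, where rook neighbours share an edge and queen neighbours share an edge or a corner. The sketch below is an illustration on such a hypothetical grid only; the LGA polygons of the study are irregular and their neighbours were obtained from the shapefile, not from this function.

```python
def grid_neighbours(rows, cols, criterion="queen"):
    """Neighbour lists for the cells of a rows x cols grid.

    Cells are numbered row by row starting from 0.
    """
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]           # shared edges (rook)
    if criterion == "queen":
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]    # shared corners as well
    elif criterion != "rook":
        raise ValueError("criterion must be 'rook' or 'queen'")
    neighbours = {}
    for r in range(rows):
        for c in range(cols):
            cell = r * cols + c
            neighbours[cell] = sorted(
                (r + dr) * cols + (c + dc)
                for dr, dc in offsets
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            )
    return neighbours


rook = grid_neighbours(3, 3, "rook")
queen = grid_neighbours(3, 3, "queen")
print(rook[4])   # centre cell, edge neighbours only: [1, 3, 5, 7]
print(queen[4])  # centre cell, edges and corners: [0, 1, 2, 3, 5, 6, 7, 8]
```

Every rook neighbour is also a queen neighbour, so the queen matrix always has at least as many non-zero entries as the rook matrix.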

4.3.1 Queen Contiguity Weights Matrices

Corner and common border neighbours for each of the 33 LGAs across Oyo state are presented in the matrix in Table 4.1, while the corresponding LGA identities are presented in Table 4.2.


Table 4.1: 33 x 33 Queen weights matrix for Local Government Areas of Oyo State

Source: Produced by authors


Table 4.2: Queen Neighbours for each of the 33 Local Government Areas in Oyo State

S/N   Local Government Area   Neighbours
1     Irepo                   Orelope and Olorunsogo.
2     Saki West               Atisbo and Saki East.
3     Ibarapa Central         Ibarapa North and Ibarapa East.
4     Ogbomoso North          Surulere, Orire and Ogbomoso South.
5     Kajola                  Itesiwaju, Iseyin and Iwajowa.
6     Egbeda                  Ona Ara, Lagelu and Ibadan North East.
7     Saki East               Saki West, Atisbo, Atiba and Orelope.
8     Orelope                 Irepo, Olorunsogo, Saki East and Atiba.
9     Olorunsogo              Orire, Atiba, Orelope and Irepo.
10    Ibarapa North           Iwajowa, Iseyin, Ibarapa East and Ibarapa Central.
11    Ogbomoso South          Ori-ire, Ogbomoso North, Surulere and Ogo-Oluwa.
12    Surulere                Ori-ire, Ogo-Oluwa, Ogbomoso North and Ogbomoso South.
13    Ibadan North            Akinyele, Ibadan North East, Ibadan North West and Lagelu.
14    Ibadan South East       Ibadan North East, Ona Ara, Oluyole and Ibadan South West.
15    Lagelu                  Akinyele, Ibadan North, Egbeda and Ibadan North East.
16    Ona Ara                 Ibadan North East, Ibadan South East, Egbeda and Oluyole.
17    Oluyole                 Ibadan South East, Ona Ara, Ibadan North East, Ibadan South West and Ido.
18    Ibadan South West       Ido, Ibadan North East, Ibadan South East, Oluyole and Ibadan North West.
19    Ibadan North West       Ibadan South West, Ibadan North East, Ido, Akinyele and Ibadan North.
20    Akinyele                Afijio, Ido, Lagelu, Ibadan North and Ibadan North West.
21    Ogo-Oluwa               Surulere, Ogbomoso South, Orire, Oyo East and Afijio.
22    Oyo East                Ogo-Oluwa, Orire, Oyo West, Afijio and Atiba.
23    Oyo West                Afijio, Oyo East, Atiba, Itesiwaju and Iseyin.
24    Atisbo                  Saki West, Iwajowa, Atiba, Itesiwaju and Saki East.
25    Iwajowa                 Atisbo, Itesiwaju, Kajola, Iseyin and Ibarapa North.
26    Ibarapa East            Ibarapa North, Ibarapa Central, Iseyin, Ido and Afijio.
27    Itesiwaju               Atisbo, Kajola, Iwajowa, Atiba, Iseyin and Oyo West.
28    Ido                     Ibarapa East, Iseyin, Afijio, Akinyele, Oluyole, Ibadan North West and Ibadan South West.
29    Afijio                  Oyo West, Oyo East, Ogo-Oluwa, Akinyele, Ido, Iseyin and Ibarapa East.
30    Ori-ire                 Olorunsogo, Atiba, Oyo East, Ogo-Oluwa, Ogbomoso South, Ogbo-

4.3.2 Rook Contiguity Weights Matrices

Common border neighbours for each of the 33 LGAs across Oyo state are presented in the matrix in Table 4.3, while the corresponding LGA identities are presented in Table 4.4.

Table 4.3: 33 x 33 Rook weights matrix for Local Government Areas of Oyo State

Source: Produced by authors

Table 4.4: Rook Neighbours for each of the 33 Local Government Areas in Oyo State

S/N | Local Government Area | Neighbours
1 | Irepo | Orelope and Olorunsogo.
2 | Saki West | Atisbo and Saki East.
3 | Ibarapa Central | Ibarapa North and Ibarapa East.
4 | Ogbomoso North | Surulere, Ori-ire and Ogbomoso South.
5 | Kajola | Itesiwaju, Iseyin and Iwajowa.
6 | Egbeda | Ona Ara, Lagelu and Ibadan North East.
7 | Saki East | Saki West, Atisbo, Atiba and Orelope.
8 | Orelope | Irepo, Olorunsogo, Saki East and Atiba.
9 | Olorunsogo | Ori-ire, Atiba, Orelope and Irepo.
10 | Ibarapa North | Iwajowa, Iseyin, Ibarapa East and Ibarapa Central.
11 | Ogbomoso South | Ori-ire, Ogbomoso North, Surulere and Ogo-Oluwa.
12 | Surulere | Ori-ire, Ogo-Oluwa, Ogbomoso North and Ogbomoso South.
13 | Ibadan North | Akinyele, Ibadan North East, Ibadan North West and Lagelu.
14 | Ibadan South East | Ibadan North East, Oluyole and Ibadan South West.
15 | Lagelu | Akinyele, Ibadan North, Egbeda and Ibadan North East.
16 | Ona Ara | Ibadan North East, Egbeda and Oluyole.
17 | Oluyole | Ibadan South East, Ona Ara, Ibadan South West and Ido.
18 | Ibadan South West | Ido, Ibadan North East, Ibadan South East, Oluyole and Ibadan North West.
19 | Ibadan North West | Ibadan South West, Ibadan North East, Ido, Akinyele and Ibadan North.
20 | Akinyele | Afijio, Ido, Lagelu, Ibadan North and Ibadan North West.
21 | Ogo-Oluwa | Surulere, Ogbomoso South, Ori-ire, Oyo East and Afijio.
22 | Oyo East | Ogo-Oluwa, Ori-ire, Oyo West, Afijio and Atiba.
23 | Oyo West | Afijio, Oyo East, Atiba, Itesiwaju and Iseyin.
24 | Atisbo | Saki West, Iwajowa, Atiba, Itesiwaju and Saki East.
25 | Iwajowa | Atisbo, Itesiwaju, Kajola, Iseyin and Ibarapa North.
26 | Ibarapa East | Ibarapa North, Ibarapa Central, Iseyin and Ido.
27 | Itesiwaju | Atisbo, Kajola, Iwajowa, Atiba, Iseyin and Oyo West.
28 | Ido | Ibarapa East, Afijio, Akinyele, Oluyole, Ibadan North West and Ibadan South West.
29 | Afijio | Oyo West, Oyo East, Ogo-Oluwa, Akinyele, Ido and Iseyin.
30 | Ori-ire | Olorunsogo, Atiba, Oyo East, Ogo-Oluwa, Ogbomoso South, Ogbomoso North and Surulere.

Source: Produced by authors
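A neighbour list of this kind can be turned into a binary contiguity weights matrix and checked for consistency. The following is a minimal sketch, not the procedure used to produce Table 4.3: the function names `build_weights` and `asymmetric_pairs` are hypothetical, and the example is restricted to three LGAs from Table 4.4 so that it stays self-contained. Because sharing a border is a mutual relation, any pair reported by `asymmetric_pairs` would point to a row of the table that needs checking.

```python
from collections import Counter

def build_weights(neighbours):
    """Return (names, W) with W[i][j] = 1 when j is listed as a neighbour of i."""
    names = sorted(neighbours)
    index = {name: k for k, name in enumerate(names)}
    w = [[0] * len(names) for _ in names]
    for name, adjacent in neighbours.items():
        for other in adjacent:
            if other in index:  # ignore LGAs outside the subset being examined
                w[index[name]][index[other]] = 1
    return names, w

def asymmetric_pairs(names, w):
    """List (i, j) pairs where i lists j as a neighbour but j does not list i."""
    n = len(names)
    return [(names[i], names[j]) for i in range(n) for j in range(n)
            if w[i][j] == 1 and w[j][i] == 0]

# Three rows of Table 4.4; neighbours outside this subset are skipped.
neighbours = {
    "Irepo": ["Orelope", "Olorunsogo"],
    "Orelope": ["Irepo", "Olorunsogo", "Saki East", "Atiba"],
    "Olorunsogo": ["Ori-ire", "Atiba", "Orelope", "Irepo"],
}
names, w = build_weights(neighbours)
mismatches = asymmetric_pairs(names, w)
# Connectivity histogram: number of neighbours -> number of LGAs.
connectivity = Counter(sum(row) for row in w)
print(names, w, mismatches, dict(connectivity))
```

Applied to all rows of the table, the same `Counter` gives the counts that a connectivity histogram displays.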

4.4 Hotspots Analysis

ArcGIS 10.1 was used for the spatial analysis. The following steps describe the point analysis of crashes within the RS11.3 Oyo command. The process for digitizing the map was explained in the preceding section. The steps below place the crash points on the map so that their spatial distribution can be examined.

Step 1: Collect the coordinates using GPS.
Step 2: Type the points into Microsoft Excel and save the file in a folder on the C drive for easy access.
Step 3: Go to File in the main menu and click Add XY Data. The Add XY Data dialog box will appear. Browse to the Excel file containing the XY data and select the table.
Step 4: Specify the X and Y fields.
Step 5: Choose Edit to set the coordinate system of the input coordinates.
Step 6: The Spatial Reference Properties dialog will come up. Choose Select. The Browse for Coordinate System dialog will appear.
Step 7: Choose Geographic Coordinate Systems, select World, then select WGS 1984.prj. The name of the coordinate system comes up automatically.
Step 8: Select Apply. After choosing the coordinate system, choose OK on the Add XY Data dialog, then click OK.
Step 9: Right-click the sheet events layer, choose Data, then Export Data. The Export Data dialog will appear.
Step 10: Use Output feature class to name the file. Click Browse to save it in the working folder.
Step 11: Look in the working folder on the C drive for the results.
Step 12: Select the working folder.
Step 13: Give the folder a name and save.

Next, we proceed to the spatial and blackspot analyses.
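The steps above were carried out through the ArcGIS interface. Purely as an illustration of what they achieve, the sketch below converts a small table of GPS readings into point features in the WGS 1984 geographic coordinate system, written as GeoJSON using only the Python standard library. It is not the authors' workflow: the function name `xy_rows_to_geojson`, the column names `longitude` and `latitude`, and the two sample rows are all assumptions made for the example.

```python
import csv
import io
import json

def xy_rows_to_geojson(csv_text, x_field="longitude", y_field="latitude"):
    """Convert a table of GPS readings into GeoJSON point features (WGS 1984)."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lon, lat = float(row[x_field]), float(row[y_field])
        # Reject values that cannot be geographic coordinates in degrees.
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            raise ValueError(f"coordinate out of range: {lon}, {lat}")
        # Every other column is carried along as an attribute of the point.
        properties = {k: v for k, v in row.items() if k not in (x_field, y_field)}
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}

# Hypothetical sample readings (not taken from the crash records).
sample = "crash_id,longitude,latitude\nA1,3.9337,7.5886\nA2,3.9000,7.3900\n"
collection = xy_rows_to_geojson(sample)
print(json.dumps(collection, indent=2))
```

The range check plays the role of Steps 5 to 7: it makes explicit that the X and Y fields are interpreted as longitude and latitude in degrees.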

4.4.1 Spatial Distribution of Population and Road Traffic Crashes Frequencies

Figures 4.3 and 4.4 show the population density (population/area) and RTC density (number of RTC/area) for Oyo state. 80.7% of LGAs with high RTC density have very high population density. Spatial units with dense population include Ibadan North East, Ibadan North, Ibadan South East, Ibadan South West, Ibadan North West, Egbeda, Ogbomoso South, Ogbomoso North, Oyo East, Ona Ara, Akinyele, Kajola, Oluyole and Oyo West LGAs (Figure 4.3). Next, we examine the RTC density. The clustering of spatial units by population appears to be closely related to the RTC density. Spatial units with high RTC density include Ibadan North East, Ibadan North, Ibadan South East, Ibadan South West, Ibadan North West, Egbeda, Ogbomoso South, Oyo West, Oluyole, Akinyele, Ogbomoso North, Lagelu, Ido and Afijio LGAs (Figure 4.4).



Figure 4.3: Population density for Oyo State based on 2006 population figures Source: Produced by authors




Figure 4.4: Road traffic crashes (accident) density for Oyo State (2012 RTC record) Source: Produced by authors

Figure 4.5 shows the mean center of the RTC distribution along with the standard distance deviation and the standard deviational ellipse. The geographic coordinates of the mean center are 3.93367° (longitude) and 7.58862° (latitude). This implies that the mean center, that is, the center of gravity of the concentration of RTC in Oyo state, lies in Akinyele LGA. As a check, QuickBird satellite imagery of a portion of Oyo state was obtained and the RTC geographic coordinates were overlaid on it (see Figure 4.6). This showed some RTC points falling on road networks in the area.
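The three centrographic measures named here can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the ArcGIS implementation: coordinates are treated as planar, the ellipse axes are taken as one standard deviation along the principal axes of the point cloud (software packages may apply a different scaling), and the rotation is measured counter-clockwise from the x-axis. The function name `centrographic_summary` and the three sample points are hypothetical.

```python
import math

def centrographic_summary(points):
    """Mean center, standard distance and a one-sigma deviational ellipse."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Standard distance: root of the summed coordinate variances.
    standard_distance = math.sqrt(sxx + syy)
    # Eigenvalues of the 2x2 covariance matrix give the ellipse axes.
    half_trace = (sxx + syy) / 2.0
    spread = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    major = math.sqrt(half_trace + spread)
    minor = math.sqrt(max(half_trace - spread, 0.0))
    # Orientation of the major axis, counter-clockwise from the x-axis.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return {
        "mean_center": (mean_x, mean_y),
        "standard_distance": standard_distance,
        "major_axis": major,
        "minor_axis": minor,
        "rotation_radians": theta,
        "ellipse_area": math.pi * major * minor,
    }

# Hypothetical points lying on the line y = x.
summary = centrographic_summary([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)])
print(summary)
```

For points on a straight line the ellipse collapses to a segment, so the minor axis and the area are zero and the rotation is 45 degrees, which makes the example easy to check by hand.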




Figure 4.5: The mean center, standard distance deviation and standard deviational ellipse of RTC (accident) points in Oyo State of Nigeria Source: Produced by authors

Figure 4.6: Portion of the Oyo State QuickBird satellite imagery indicating some RTC locations Source: Produced by authors

As seen, the standard deviational ellipse is a more refined measure of spatial concentration than the standard distance deviation. The standard distance deviation covers fewer RTC cases than the standard deviational ellipse. Although it covers more localities, it extends to areas outside the state where RTC cases were not considered. The ellipse contains 82.11% of the total RTC cases, while the standard distance deviation covers 76.21%.

For the standard deviational ellipse, the center longitude is 3.933668°, the center latitude is 7.588624°, the standard distance in the longitude direction is 0.215829° and the standard distance in the latitude direction is 0.520708°. The angle of rotation is 4.7811°. The area is 4,328.952433 square kilometers. The standard deviational ellipse concisely describes the concentration of RTC locations to be within Oyo West, Oyo East, Afijio, Akinyele, Lagelu, Egbeda, Ona Ara, Oluyole, Ido, Ibadan North, Ibadan North East, Ibadan North West, Ibadan South East and Ibadan South West LGAs of Oyo state.

The standard distance deviation has a center longitude of 3.93366° and a center latitude of 7.588524°. The standard distance is 0.398572°, while the area is 6,090.305873 square kilometers. The standard distance deviation shows the concentration of RTC locations to be within Oyo West, Oyo East, Afijio, Akinyele, Lagelu, Egbeda, Ona Ara, Oluyole, Ido, Ibadan North, Ibadan North East, Ibadan North West, Ibadan South East, Ibadan South West, Iseyin and Ibarapa East LGAs of Oyo state.

The spatial dependence of RTC was tested next. Moran's I index is 0.19, with a z-score of 2.61 and a p-value of 0.01. The null hypothesis of spatial randomness is therefore rejected: there is less than 1% likelihood that this clustered pattern is the result of random chance. We conclude that the spatial pattern is clustered, indicating statistically significant spatial dependence across the study area.

The G statistic gives a similar result. The G index is 0.36, with a z-score of 3.77 and a p-value of 0.01. The null hypothesis of randomness is again rejected, with less than 1% likelihood that the clustered pattern is the result of random chance, confirming spatial dependence across the study area.

As seen in Table 4.5, the results reveal high-high (HH) concentration of accident clustering within Egbeda, Oluyole and Akinyele LGAs. The low-high (LH) classification for Ona Ara LGA indicates a unit with a low value surrounded by neighbours with high values. At present the concentration of accidents there is low, but there is a high tendency for the spatial pattern to shift towards high clustering, since Ona Ara LGA is contiguous to Oluyole and Egbeda LGAs. The geographic representation of the local Moran's index is given in Figure 4.7.
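To make the global test concrete, the sketch below computes Moran's I for a binary contiguity matrix and attaches a pseudo p-value from random permutations. This is a swapped-in inference method chosen for brevity: the z-scores quoted in the text come from the ArcGIS report, not from permutations. The function names, the four-unit chain of neighbours and the values are hypothetical.

```python
import random

def morans_i(values, w):
    """Global Moran's I: (n / S0) * sum_ij w_ij d_i d_j / sum_i d_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

def permutation_p_value(values, w, permutations=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = random.Random(seed)
    observed = morans_i(values, w)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if morans_i(shuffled, w) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (permutations + 1)

# Four units in a row: 0-1, 1-2 and 2-3 share a border.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], chain)    # similar values are adjacent
alternating = morans_i([1, 5, 1, 5], chain)  # dissimilar values are adjacent
observed, p_value = permutation_p_value([1, 1, 5, 5], chain)
print(clustered, alternating, p_value)
```

Positive values of the index indicate that similar counts sit next to each other, as reported for RTC in Oyo state, while negative values indicate a checkerboard pattern.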

Table 4.5: Moran's Spatial Dependence among Local Government Areas: Frequency of Road Traffic Crashes in Oyo State

Local Government Area | LMi Index | LMi Z Score | LMi P value | Concentration
Egbeda | 10.597 | 4.487 | 0.000007 | HH
Oluyole | 7.246 | 3.385 | 0.000712 | HH
Akinyele | 5.187 | 2.131 | 0.033104 | HH
Ona Ara | -5.542 | -2.170 | 0.030010 | LH

Source: Produced by authors
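The concentration labels in the last column follow from the sign of a unit's own deviation from the mean and the sign of the average deviation of its neighbours. The sketch below shows that logic using a common textbook form of the local Moran statistic with row-standardised weights. It is an assumption-laden illustration: the ArcGIS index is scaled differently and labels only statistically significant units, significance testing is omitted here, and the function name and sample values are hypothetical.

```python
def local_moran(values, w):
    """Return (local index, quadrant label) for every spatial unit."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n  # variance used to scale the index
    results = []
    for i in range(n):
        row_sum = sum(w[i])
        # Spatial lag: average deviation of the neighbours of unit i.
        lag = sum(w[i][j] * dev[j] for j in range(n)) / row_sum if row_sum else 0.0
        local_i = dev[i] * lag / m2
        label = ("H" if dev[i] > 0 else "L") + ("H" if lag > 0 else "L")
        results.append((local_i, label))
    return results

# Four units in a row: 0-1, 1-2 and 2-3 share a border.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
results = local_moran([9, 1, 8, 10], chain)
labels = [label for _, label in results]
print(results)
```

In this toy example the second unit has a low count between two high-count neighbours, so it receives a negative index and the label LH, the same combination reported for Ona Ara in Table 4.5.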

Figure 4.7: Map showing concentration level of Road Traffic Crashes in Oyo State (Moran's) Source: Produced by authors

From Table 4.6, the most negative normalized z-values of the G statistic were observed for Iseyin, Kajola, Itesiwaju, Ibarapa Central, Ibarapa North, Orelope, Olorunsogo and Irepo LGAs. This suggests a tendency towards low association of accident occurrences across these LGAs, although none of these z-scores is significant at the 0.05 level (the p-values range from 0.135 to 0.228). High positive normalized z-values of the G statistic were observed for nine LGAs (Table 4.7): Ibadan North, Ibadan North East, Ibadan North West, Ibadan South East, Ibadan South West, Egbeda, Ona Ara, Oluyole and Akinyele. There is a tendency of high association of RTC occurrences across these LGAs at the 0.05 level of significance. The concentration of RTC in these localities is high, indicating spatial dependence. These localities have more neighbours than the localities with negative values. The geographic representation of the G index is given in Figure 4.8.

Table 4.6: Getis and Ord Spatial Dependence among Local Government Areas: Frequency of Road Traffic Crashes in Oyo State (Highest Negative Values)

Local Government Area | Gi Z Score | Gi P value
Iseyin | -1.49327 | 0.135366
Kajola | -1.48729 | 0.136938
Itesiwaju | -1.40531 | 0.159930
Ibarapa Central | -1.35946 | 0.174002
Ibarapa North | -1.35946 | 0.174002
Orelope | -1.26867 | 0.204559
Olorunsogo | -1.20433 | 0.228461
Irepo | -1.20433 | 0.228461

Source: Produced by authors

Table 4.7: Getis and Ord Spatial Dependence among Local Government Areas: Frequency of Road Traffic Crashes in Oyo State (Highest Positive Values)

Local Government Area | Gi Z Score | Gi P value
Ibadan North | 2.99858 | 0.002712
Ibadan North East | 2.97677 | 0.002913
Ibadan North West | 2.97677 | 0.002913
Ibadan South East | 2.97677 | 0.002913
Ibadan South West | 2.97677 | 0.002913
Egbeda | 2.52179 | 0.011676
Ona Ara | 2.52179 | 0.011676
Oluyole | 2.27168 | 0.023106
Akinyele | 2.10861 | 0.034978
Afijio | 1.30374 | 0.192324
Oyo East | 0.84089 | 0.400406

Source: Produced by authors
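The Gi z-scores and p-values in Tables 4.6 and 4.7 can be related through a short sketch. It uses the standard Getis-Ord Gi* form with binary contiguity weights in which each unit counts as its own neighbour, and a two-sided p-value from the normal distribution. The function names, the five-unit chain and the values are hypothetical; the weighting scheme behind the published tables is an assumption here.

```python
import math

def gi_star(values, w):
    """Getis-Ord Gi* z-scores for binary weights, with self-neighbours added."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        weights = list(w[i])
        weights[i] = 1  # Gi* includes the unit itself
        sum_w = sum(weights)
        sum_w2 = sum(x * x for x in weights)
        weighted = sum(wt * v for wt, v in zip(weights, values))
        denom = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append((weighted - mean * sum_w) / denom)
    return scores

def two_sided_p(z):
    """Two-sided p-value of a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Five units in a row; the high counts sit at one end.
chain = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
scores = gi_star([1, 1, 1, 10, 10], chain)
print(scores, [round(two_sided_p(z), 4) for z in scores])
```

The helper `two_sided_p` reproduces the pairing of z-scores and p-values shown in the tables, for example a z-score of 2.99858 maps to a p-value of about 0.0027.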

Figure 4.8: Map showing concentration level of Road Traffic Crashes in Oyo State (Getis and Ord) Source: Produced by authors

From the analysis above, the spatial pattern of RTC is clustered. The concentration of RTC is across Ibadan North, Ibadan North East, Ibadan North West, Ibadan South East, Ibadan South West, Egbeda, Oluyole, Ona Ara and Akinyele LGAs of Oyo state. The hotspots include Ibadan North, Ibadan North East, Ibadan North West, Ibadan South East, Ibadan South West, Egbeda, Oluyole and Akinyele LGAs of Oyo state.

4.4.2 Spatio-Temporal Analysis of Characteristics and Causes of Road Traffic Crashes

Figure 4.9 shows the number of RTC cases in 2011 (yellow) and 2012 (blue). Figure 4.10 shows the total casualties in 2011 (yellow) and 2012 (blue).



Figure 4.9: Spatio-Temporal presentation of Road Traffic Crashes (accidents) for 2011 and 2012 (Oyo State) Source: Produced by authors




Figure 4.10: Spatio-Temporal presentation of Road Traffic Crashes Casualties for 2011 and 2012 (Oyo State) Source: Produced by authors

Next, we consider the categories of persons injured in RTC. Figure 4.11 shows the number of males (blue) and females (yellow) injured in 2011. Figure 4.12 shows the number of males (blue) and females (yellow) injured in 2012.



Figure 4.11: Geovisual presentation of Males/Females Injured in Road Traffic Crashes in Oyo State in 2011 Source: Produced by authors




Figure 4.12: Geovisual presentation of Males/Females Injured in Road Traffic Crashes in Oyo State in 2012 Source: Produced by authors


Figure 4.13 shows the number of adults (blue) and children (yellow) that sustained injuries as a result of RTC in 2011. Figure 4.14 shows the number of adults (blue) and children (yellow) that sustained injuries as a result of RTC in 2012.




Figure 4.13: Geovisual presentation of Children/Adults Injured in Road Traffic Crashes in Oyo State in 2011 Source: Produced by authors



Figure 4.14: Geovisual presentation of Children/Adults Injured in Road Traffic Crashes in Oyo State in 2012 Source: Produced by authors

In the following section, we consider the categories of persons killed in RTC. Figure 4.15 shows the number of males (blue) and females (yellow) killed in 2011. Figure 4.16 shows the number of males (blue) and females (yellow) killed in 2012.




Figure 4.15: Geovisual presentation of Males/Females killed in Road Traffic Crashes in Oyo State in 2011 Source: Produced by authors




Figure 4.16: Geovisual presentation of Males/Females Killed in Road Traffic Crashes in Oyo State in 2012 Source: Produced by authors

Figure 4.17 shows the number of children (yellow) and adults (blue) killed in 2011. Figure 4.18 shows the number of children (yellow) and adults (blue) killed in 2012.




Figure 4.17: Geovisual presentation of Children/Adults Killed in Road Traffic Crashes in Oyo State in 2011 Source: Produced by authors




Figure 4.18: Geovisual presentation of Children/Adults Killed in Road Traffic Crashes in Oyo State in 2012 Source: Produced by authors

The spatial dependence values associated with the causes of RTC are given in the following subsections.

Dangerous Driving

Spatial Dependence (Moran)

The null hypothesis states that the spatial pattern is random across the study area. In 2011, Moran's I index was 0.29, with a z-score of 3.51 and a p-value of 0.01. For 2012, Moran's I index was 0.36, with a z-score of 4.37 and a p-value of 0.01. In both years there is less than 1% likelihood that the clustered pattern could be the result of random chance (Table 4.8).

Table 4.8: Moran's Spatial Dependence for Dangerous Driving among Local Government Areas: Number of Dangerous Driving in Oyo State (2011-2012)

Year | LGA | LMi Index | LMi Z Score | LMi P value | Concentration
2011 | Oyo West | 13.587 | 7.342 | 0.000 | HH
2011 | Ibarapa East | 4.876 | 2.678 | 0.007 | HH
2011 | Afijio | 5.693 | 2.663 | 0.008 | HH
2011 | Oyo East | 4.808 | 2.419 | 0.016 | HH
2012 | Oyo West | 20.031 | 11.085 | 0.000 | HH
2012 | Ibarapa East | 10.616 | 5.385 | 0.000 | HH
2012 | Afijio | 7.630 | 3.627 | 0.0003 | HH
2012 | Oyo East | 3.412 | 2.621 | 0.009 | HH

Key: HH = High High
Source: Produced by authors

The hotspots resulting from dangerous driving in 2011 (Figure 4.19) were on major roads within Oyo West, Oyo East, Afijio and Ibarapa East LGAs, while the hotspots in 2012 (Figure 4.20) were on major roads within Atiba, Oyo East, Oyo West and Afijio LGAs. The results reveal high-high concentration of RTC clustering due to dangerous driving within Oyo West, Ibarapa East, Afijio and Oyo East LGAs in 2011, while Oyo West, Oyo East, Afijio and Atiba LGAs had the highest concentration in 2012.



Figure 4.19: Geovisual presentation of Hotspots for Dangerous Driving in Oyo State in 2011 (Morans) Source: Produced by authors




Figure 4.20: Geovisual presentation of Hotspots for Dangerous Driving in Oyo State in 2012 (Morans) Source: Produced by authors

Spatial Dependence (Getis and Ord)

The null hypothesis states that the spatial pattern is random across the study area. For 2011, the G statistic gives a G index of 0.21 with a z-score of 1.12. This indicates that while there is some clustering, the pattern may be due to random chance. For 2012, the G index was 0.27 with a z-score of 1.82, indicating a less than 10% likelihood that the clustering of high values could be the result of random chance (Table 4.9).

Table 4.9: Getis and Ord Spatial Dependence for Dangerous Driving among Local Government Areas: Number of Dangerous Driving in Oyo State, 2011-2012 (HIGHEST POSITIVE VALUES)

LGA (2011) | Gi Z Score | Gi P value | LGA (2012) | Gi Z Score | Gi P value
Oyo West | 4.474 | 0.000008 | Oyo West | 5.432 | 0.0000
Ibarapa East | 2.845 | 0.004 | Oyo East | 3.151 | 0.0016
Afijio | 2.100 | 0.036 | Afijio | 2.405 | 0.0161
Oyo East | 2.009 | 0.045 | Atiba | 2.278 | 0.0230
Ibarapa North | 1.824 | 0.068 | None | None | None
Ibarapa Central | 1.824 | 0.068 | None | None | None

Source: Produced by authors

Hotspots (Getis and Ord)

The hotspots for RTC resulting from dangerous driving in 2011 (Figure 4.21) were on major roads within Oyo West and Ibarapa East LGAs, while those in 2012 (Figure 4.22) were on major roads within Oyo West and Oyo East LGAs. The hotspots in 2011 include Oyo West, Ibarapa East, Afijio and Oyo East LGAs. In 2012 the hotspots include Oyo West, Oyo East, Afijio and Atiba LGAs.



Figure 4.21: Geovisual presentation of Hotspots for Dangerous Driving in Oyo State in 2011 (Getis and Ord) Source: Produced by authors




Figure 4.22: Geovisual presentation of Hotspots for Dangerous Driving in Oyo State in 2012 (Getis and Ord) Source: Produced by authors

Speed Limit Violation

Spatial Dependence (Moran)

The null hypothesis states that the spatial pattern is random across the study area. In 2011, Moran's I index was 0.14, with a z-score of 1.95 and a p-value of 0.10, indicating a less than 10% likelihood that this clustered pattern could be the result of random chance. For 2012, Moran's I index was 0.17, with a z-score of 2.38 and a p-value of 0.05, indicating a less than 5% likelihood that this clustered pattern is the result of random chance (Table 4.10).

Table 4.10: Moran's Spatial Dependence for Speed Limit Violation among Local Government Areas (LGA): Number of Speed Limit Violation in Oyo State, 2011-2012

Year | LGA | LMi Index | LMi Z Score | LMi P value | Concentration
2011 | Oluyole | 11.250 | 5.156 | 0.000 | HH
2011 | Ona Ara | 9.891 | 4.163 | 0.000 | HH
2011 | IBSE | 8.013 | 3.287 | 0.001 | HH
2012 | Oluyole | 11.725 | 5.369 | 0.000 | HH
2012 | Ona Ara | 11.142 | 4.675 | 0.000 | HH
2012 | IBSE | 9.518 | 3.882 | 0.0001 | HH

Key: HH = High High; IBSE = Ibadan South East
Source: Produced by authors

Hotspots (Moran)

The hotspots resulting from speed limit violation in both 2011 (Figure 4.23) and 2012 (Figure 4.24) were on major roads within Oluyole, Ona Ara and Ibadan South East LGAs. The results reveal high-high concentration of RTC clustering within these three LGAs for 2011 and 2012.



Figure 4.23: Geovisual presentation of Hotspots for Speed Limit Violation in Oyo State in 2011 (Morans) Source: Produced by authors




Figure 4.24: Geovisual presentation of Hotspots for Speed Limit Violation in Oyo State in 2012 (Morans) Source: Produced by authors

4.5 Spatial Dependence (Getis and Ord)

The null hypothesis states that the spatial pattern is random across the study area. In 2011, the G index was 0.29 with a z-score of 2.55, indicating a less than 5% likelihood that the clustering of high values is the result of random chance. For 2012, the G index was 0.31 with a z-score of 3.01, indicating a less than 1% likelihood that the clustering of high values could be the result of random chance (Table 4.11).

Table 4.11: Getis and Ord Spatial Dependence for Speed Limit Violation among Local Government Areas (LGAs): Number of Speed Limit Violation in Oyo State, 2011-2012 (HIGHEST POSITIVE VALUES)

LGA (2011) | Gi Z Score | Gi P value | LGA (2012) | Gi Z Score | Gi P value
Oluyole | 2.746 | 0.006 | Oluyole | 2.831 | 0.005
Ona Ara | 2.395 | 0.017 | Ona Ara | 2.562 | 0.010
Egbeda | 2.395 | 0.017 | Egbeda | 2.562 | 0.010
Ibadan North | 2.212 | 0.027 | Ibadan North | 2.436 | 0.014
Ibadan North East | 2.091 | 0.037 | Ibadan North East | 2.287 | 0.022
Ibadan North West | 2.091 | 0.037 | Ibadan North West | 2.287 | 0.022
Ibadan South West | 2.091 | 0.037 | Ibadan South West | 2.287 | 0.022
Ibadan South East | 2.091 | 0.037 | Ibadan South East | 2.287 | 0.022

Source: Produced by authors

Hotspots (Getis and Ord)

The hotspots for RTC resulting from speed limit violation in 2011 (Figure 4.25) were on major roads within Oluyole LGA. The concentration is less significant in Ona Ara, Ibadan North, Ibadan North East, Ibadan North West, Ibadan South East and Ibadan South West LGAs. In 2012 the hotspots are the same (Figure 4.26). The hotspots include Oluyole, Ona Ara and Egbeda LGAs in 2011 and 2012.



Figure 4.25: Geovisual presentation of Hotspots for Speed Limit Violation in Oyo State in 2011 (Getis and Ord) Source: Produced by authors




Figure 4.26: Geovisual presentation of Hotspots for Speed Limit Violation in Oyo State in 2012 (Getis and Ord) Source: Produced by authors

Mechanical Fault

Spatial Dependence (Moran)

The null hypothesis states that the spatial pattern is random across the study area. In 2011, Moran's I index was 0.28, with a z-score of 3.52 and a p-value of 0.01. For 2012, Moran's I index was 0.47, with a z-score of 5.54 and a p-value of 0.01. In both years there is less than 1% likelihood that this clustered pattern could be the result of random chance (Table 4.12).

Table 4.12: Moran's Spatial Dependence for Mechanical Fault among Local Government Areas (LGA): Number of Mechanical Fault in Oyo State, 2011-2012

Year | LGA | LMi Index | LMi Z Score | LMi P value | Concentration
2011 | Oluyole | 14.825 | 6.071 | 0.000 | HH
2011 | IBSE | 13.884 | 5.648 | 0.000 | HH
2011 | Ona Ara | 12.621 | 5.514 | 0.000 | HH
2011 | Egbeda | 11.496 | 4.733 | 0.000 | HH
2011 | Lagelu | 6.119 | 2.498 | 0.013 | HH
2012 | Egbeda | 14.728 | 5.908 | 0.000 | HH
2012 | Ona Ara | 14.728 | 5.908 | 0.000 | HH
2012 | IBSE | 13.702 | 5.342 | 0.000 | HH
2012 | Oluyole | 11.455 | 5.010 | 0.000 | HH
2012 | Lagelu | 8.378 | 3.313 | 0.001 | HH

Key: HH = High High; IBSE = Ibadan South East
Source: Produced by authors

4.6 Hotspots (Moran)

The hotspots resulting from mechanical fault in 2011 were on major roads within Oluyole, Ona Ara, Lagelu, Ibadan South East and Egbeda LGAs (Figure 4.27). The same LGAs were hotspots in 2012, with Lagelu LGA becoming more concentrated with accidents resulting from mechanical fault (Figure 4.28). The results reveal high-high concentration of RTC clustering due to mechanical fault within Oluyole, Ibadan South East, Ona Ara, Egbeda and Lagelu LGAs for 2011 and 2012.



Figure 4.27: Geovisual presentation of Hotspots for Mechanical Fault in Oyo State in 2011 (Morans) Source: Produced by authors




Figure 4.28: Geovisual presentation of Hotspots for Mechanical Fault in Oyo State in 2012 (Morans) Source: Produced by authors

Spatial Dependence (Getis and Ord)

The null hypothesis states that the spatial pattern is random across the study area. For 2011, the G index was 0.34 with a z-score of 3.91. For 2012, the G index was 0.5 with a z-score of 5.56. In both years there is less than 1% likelihood that the clustering of high values could be the result of random chance (Table 4.13).

Table 4.13: Getis and Ord Spatial Dependence for Mechanical Fault among Local Government Areas (LGAs): Number of Road Traffic Crashes Resulting from Mechanical Faults in Oyo State, 2011-2012

LGA (2011) | Gi Z Score | Gi P value | LGA (2012) | Gi Z Score | Gi P value
Egbeda | 3.176 | 0.002 | Egbeda | 3.568 | 0.000
Ona Ara | 3.176 | 0.002 | Ona Ara | 3.568 | 0.000
Oluyole | 3.053 | 0.002 | Ibadan North East | 3.287 | 0.001
Ibadan South East | 2.960 | 0.003 | Ibadan North West | 3.287 | 0.001
Ibadan South West | 2.960 | 0.003 | Ibadan South East | 3.287 | 0.001
Ibadan North East | 2.960 | 0.003 | Ibadan South West | 3.287 | 0.001
Ibadan North West | 2.960 | 0.003 | Oluyole | 3.155 | 0.002
Ibadan North | 2.824 | 0.005 | Ibadan North | 2.964 | 0.003

Source: Produced by authors

Hotspots (Getis and Ord)

The hotspots for RTC resulting from mechanical fault in 2011 (Figure 4.29) were on major roads within Egbeda, Ona Ara, Oluyole, Ibadan South East, Ibadan South West, Ibadan North East, Ibadan North West and Ibadan North LGAs. The hotspots in 2012 were in the same LGAs (Figure 4.30). For both periods, the concentration of RTC attributable to mechanical fault is therefore high in these eight LGAs.

Figure 4.29: Geovisual presentation of Hotspots for Mechanical Fault in Oyo State in 2011 (Getis and Ord) Source: Produced by authors


Figure 4.30: Geovisual presentation of Hotspots for Mechanical Fault in Oyo State in 2012 (Getis and Ord) Source: Produced by authors

Human Factor

Spatial Dependence (Moran)

The null hypothesis states that the spatial pattern is random across the study area. In 2011, Moran's I index was 0.18, with a z-score of 2.28 and a p-value of 0.05, indicating a less than 5% likelihood that this clustered pattern could be the result of random chance. For 2012, Moran's I index was 0.7, with a z-score of 8.02 and a p-value of 0.01, indicating a less than 1% likelihood that this clustered pattern could be the result of random chance (Table 4.14).

Table 4.14: Moran's Spatial Dependence for Human Factors among Local Government Areas: Number of Human Factors in Oyo State, 2011-2012

Year | LGA | LMi Index | LMi Z Score | LMi P value | Concentration
2011 | Oyo West | 5.721 | 3.092 | 0.002 | LL
2011 | Oyo East | 5.552 | 2.749 | 0.006 | LL
2011 | Atiba | 2.861 | 2.113 | 0.035 | LL
2012 | Oluyole | 18.848 | 8.138 | 0.000 | HH
2012 | Ona Ara | 17.961 | 7.145 | 0.000 | HH
2012 | Ibadan South East | 16.054 | 6.212 | 0.000 | HH
2012 | Ibadan South West | 11.023 | 4.302 | 0.000 | HH
2012 | Ibadan North East | 11.023 | 4.302 | 0.000 | HH
2012 | Ibadan North West | 11.023 | 4.302 | 0.000 | HH
2012 | Ibadan North | 9.823 | 3.761 | 0.000 | HH
2012 | Egbeda | 5.398 | 2.225 | 0.026 | HH

Key: HH = High High; LL = Low Low
Source: Produced by authors

Hotspots (Moran)

In 2011, the significant clusters associated with human factors were on major roads within Oyo West and Oyo East LGAs (Figure 4.31), where the concentration was low (low-low). In 2012, the concentration became higher in Oluyole, Ona Ara, Ibadan North, Ibadan North East, Ibadan North West, Ibadan South East and Ibadan South West LGAs (Figure 4.32). The results reveal high-high concentration of accident clustering due to human factors within Oluyole, Ona Ara, Ibadan South East, Ibadan South West, Ibadan North East, Ibadan North West, Ibadan North and Egbeda LGAs in 2012.



Figure 4.31: Geovisual presentation of Hotspots for Human Factor in Oyo State in 2011 (Morans) Source: Produced by authors


Figure 4.32: Geovisual presentation of Hotspots for Human Factor in Oyo State in 2012 (Morans) Source: Produced by authors

Spatial Dependence (Getis and Ord)

The null hypothesis states that the spatial pattern is random across the study area. For 2011, the G index was 0.19 with a z-score of 0.93, so no apparent clustering was detected at this scale. For 2012, the G index was 0.42 with a z-score of 6.21, indicating a less than 1% likelihood that the clustering of high values could be the result of random chance (Table 4.15).

Table 4.15: Getis and Ord Spatial Dependence for Human Factors among Local Government Areas: Number of Human Factors in Oyo State, 2011-2012 (HIGHEST POSITIVE VALUES)

LGA (2011) | Gi Z Score | Gi P value | LGA (2012) | Gi Z Score | Gi P value
Oluyole | 1.548 | 0.122 | Oluyole | 4.661 | 0.000
None | None | None | Ona Ara | 4.177 | 0.000
None | None | None | Egbeda | 4.177 | 0.000
None | None | None | Ibadan North East | 3.718 | 0.000
None | None | None | Ibadan North West | 3.718 | 0.000
None | None | None | Ibadan South East | 3.718 | 0.000
None | None | None | Ibadan South West | 3.718 | 0.000
None | None | None | Ibadan North | 3.296 | 0.000
None | None | None | Lagelu | 2.630 | 0.008

Source: Produced by authors

Hotspots (Getis and Ord)

There were no hotspots for RTC resulting from human factors in 2011 (Figure 4.33). Only Oluyole LGA had a notable positive z-score in that year (1.548), and it is not significant at the 0.05 level (p = 0.122). In 2012, the hotspots for RTC resulting from human factors include Oluyole, Ona Ara, Egbeda, Lagelu, Ibadan North, Ibadan North East, Ibadan North West, Ibadan South East and Ibadan South West LGAs (Figure 4.34), all of which had high positive normalized z-values of the G statistic.

Figure 4.33: Geovisual presentation of Hotspots for Human Factor in Oyo State in 2011 (Getis and Ord) Source: Produced by authors


Figure 4.34: Geovisual presentation of Hotspots for Human Factor in Oyo State in 2012 (Getis and Ord) Source: Produced by authors

4.6.1

Multivariate Associations

Multivariate Moran Scatterplot Figures 4.35 and 4.36 show the familiar bell-like shape characteristic of a normally distributed random variable, with the values following a continuous colour ramp. The histogram is classified into 7 categories. The figures by the side indicate the number of observations falling in each category. The observations themselves are shown in bracket. For instance, in Figure 4.35, for the first category 2(3); three observations have two neighbours. The significance of taking cognizance of neighbourhood characteristics is illustrated in these figures, even though the shape file is the same, the distribution of the two histograms is not exactly the same. The only corresponding observations are 2(3), 4(10) and 7(3). This depicts the distribution of a variable for a selected subset of locations on the map, possibly suggesting the existence of spatial heterogeneity. A particular approach is likely to yield different result. To know the most effective approach for dispensing intervention measures, there is a need to investigate the relationships that exist between localities, namely, spatial units (LGAs) before remedial effects would be administered.


For instance, from Table 4.1, which corresponds to Figure 4.35, Irepo LGA has two neighbours, Orelope and Olorunsogo. Technically, the three LGAs must be treated as a single unit when administering any form of safety and security measure; otherwise, a significant remedial effect cannot be achieved. Table 4.3 corresponds to Figure 4.36.
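The difference between the queen and rook criteria can be made concrete with a small sketch. The code assumes a hypothetical 3 x 3 grid of square units rather than the 33 LGAs, and prints each connectivity histogram in the "neighbours(observations)" notation used for Figures 4.35 and 4.36.

```python
from collections import Counter

def grid_neighbours(rows, cols, criterion):
    """Neighbour lists for the cells of a rows x cols grid.

    criterion = "rook"  : cells sharing an edge are neighbours
    criterion = "queen" : cells sharing an edge or a corner are neighbours
    """
    if criterion == "rook":
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif criterion == "queen":
        steps = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0)]
    else:
        raise ValueError("criterion must be 'rook' or 'queen'")
    result = {}
    for r in range(rows):
        for c in range(cols):
            result[(r, c)] = [(r + dr, c + dc) for dr, dc in steps
                              if 0 <= r + dr < rows and 0 <= c + dc < cols]
    return result

def connectivity_histogram(neighbours):
    """Map 'number of neighbours' -> 'number of units with that many neighbours'."""
    return Counter(len(adjacent) for adjacent in neighbours.values())

rook_hist = connectivity_histogram(grid_neighbours(3, 3, "rook"))
queen_hist = connectivity_histogram(grid_neighbours(3, 3, "queen"))

for name, hist in (("rook", rook_hist), ("queen", queen_hist)):
    categories = ", ".join(f"{k}({hist[k]})" for k in sorted(hist))
    print(name, categories)
```

On the same map the two criteria produce different histograms because the queen criterion also counts units that touch only at a corner.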


Figure 4.35: Connectivity histogram (Queen) Source: Produced by authors




Figure 4.36: Connectivity histogram (Rook) Source: Produced by authors


The degree of linear association is examined based on the queen criterion. As illustrated in Figure 4.37 for the variables RTC and major road lengths, travel density, residential population and land area encompassing each LGA, the multivariate Moran scatterplot relates the values of each variable at a location to the average RTC for the neighbouring locations. Figure 4.38 shows the corresponding empirical reference distribution for the statistics under spatial randomness, constructed from 999 random permutations. This suggests that each of the observed values of -0.236, 0.280, 0.242 and -0.333 is highly significant and not compatible with the notion of spatial randomness. The existence of a freeway link crossing a spatial unit, the traffic generated within each spatial unit, the residential population and the area of administration encompassing each LGA will lead, respectively, to a reduction in the number of RTC, higher RTC, more RTC and a decrease in RTC. The position of the latter, however, indicates the existence of other inhibiting factors such as spatial heterogeneity. These results give a clear indication that the explanatory variables, namely major road lengths, travel density, residential population and area of administration, are significant issues to be considered in the policy design for RTC intervention measures in any geographical region. RTC are best managed within small geographical regions, and intervention measures will yield significant results only when the remedial measures are not administered to the spatial units (LGAs) in isolation. A good account of neighbourhood characteristics will identify spatial units, that is, localities within a geographical area, that need to be treated simultaneously on the same scale and those that need not be considered for intervention at a particular point in time.
Also, based on an understanding of the linear associations that exist amongst variables, governments can be better informed about factors that could be responsible for RTC and thus be scientifically guided to intervene appropriately.
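The logic of a bivariate Moran statistic with a permutation-based reference distribution can be sketched as follows. This is a minimal illustration under stated assumptions, not the GeoDa implementation: weights are row-standardised, the statistic is taken as the slope of the spatial lag of one standardised variable on another, and the ten-unit line and both data series are hypothetical.

```python
import random

def standardise(values):
    """Centre and scale a series using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]

def spatial_lag(z, neighbours):
    """Row-standardised lag: the average of z over each unit's neighbours."""
    return [sum(z[j] for j in neighbours[i]) / len(neighbours[i])
            for i in range(len(z))]

def bivariate_moran(x, y, neighbours):
    """Slope of the spatial lag of standardised y on standardised x."""
    zx, zy = standardise(x), standardise(y)
    lag_y = spatial_lag(zy, neighbours)
    return sum(a * b for a, b in zip(zx, lag_y)) / len(x)

def permutation_test(x, y, neighbours, permutations=999, seed=12345):
    """Observed statistic and pseudo p-value from random reshuffles of y."""
    rng = random.Random(seed)
    observed = bivariate_moran(x, y, neighbours)
    shuffled = list(y)
    extreme = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if abs(bivariate_moran(x, shuffled, neighbours)) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (permutations + 1)

# Hypothetical data: ten units along a line
line = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
road_length = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # illustrative explanatory variable
crashes = [1, 3, 2, 4, 6, 5, 7, 9, 8, 10]       # illustrative outcome
moran_i, pseudo_p = permutation_test(road_length, crashes, line)
print(round(moran_i, 3), pseudo_p)
```

With 999 permutations the smallest attainable pseudo p-value is 0.001, which is the usual reason for choosing that number of reshuffles.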


Figure 4.37: Generalized Moran scatterplot for major road length, travel density, population and area using the queen contiguity weights matrices Source: Produced by authors


Figure 4.38: Empirical reference distribution for major road length, travel density, population and area using the queen contiguity weights matrices Source: Produced by authors


4.7 Model Estimation

Results of estimation using the SAR, SAR-SARD, SAR-SARD-IV and the traditional classical linear regression models are presented.

4.7.1 Spatial Autoregressive Model Estimation

The estimation of the SAR model is carried out in the Open-GeoDa environment (see Anselin, 2005). Load the sample data set with crash events and related variables, namely travel density (tdensity), population (pop), major road length (mlength) and encompassing area (area), for RS11.3 Oyo Command. The procedure is as follows:

Step 1: Click Regress on the main menu. The title and output file dialog is identical for all regression analyses. This allows for the specification of an output file name.
Step 2: Click OK. The familiar regression specification dialog comes up. Enter the dependent and explanatory variables and specify the spatial weights file.
Step 3: Instead of the default Classic, check the radio button next to Spatial Lag.
Step 4: Invoke the estimation routine as before, by clicking the Run button. A progress bar indicates when the estimation is complete.
Step 5: Click the Save button and select OK. This brings up the dialog to specify variable names for adding the residuals and predicted values to the data table. There are three options: LAG_PREDIC for the predicted value, LAG_PRDERR for the prediction error and LAG_RESIDU for the residual.
Step 6: Select all three check boxes and keep the default variable names.
Step 7: Click OK to get back to the regression dialog.
Step 8: Select OK again to bring up the estimation results and diagnostics.
Step 9: Save the results. Go to the main menu, click File and save in Word format.

The long output is similar to that obtained for the classical regression model, except that the lag coefficient is included.


Spatial Autoregressive Regression Estimates

Adopting the rook contiguity based W, the SAR model was built (Table 4.16). The number of observations is 33 and the number of variables is 6, giving 27 degrees of freedom. The R-squared is 27%. The mean of the dependent variable (RTC) is 0.82 and its standard deviation is 0.65. The residual variance (sigma-square) is 0.31, while the standard error of the regression is 0.56. A limited number of diagnostics are provided with the ML lag estimation. The first is the Breusch-Pagan test for heteroskedasticity in the error terms. The highly insignificant value of 1.74 (p-value 0.78) suggests that heteroskedasticity is not a serious problem. The second is a likelihood ratio test, which is an alternative to the asymptotic significance test on the SAR coefficient; it is not a test on remaining spatial autocorrelation. The value 2.90 (p-value 0.09) indicates that the SAR coefficient is significant at the 10% level. The estimated λ is positive and significant at the 10% level, indicating moderate SAR dependence in RTC. In other words, RTC tend to be more clustered by LGA than would be expected under a random distribution. The number of RTC cases for a given LGA is affected by the RTC cases of the neighbouring LGAs. After removing the effect of the SAR variable from the dependent variable (number of RTC), the area of each LGA, residential population, major road lengths and travel densities were used to predict areas with a higher than expected future likelihood of RTC.
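The likelihood ratio diagnostic can be checked by hand from the log-likelihoods of the two fitted models. The sketch below assumes the values reported in Table 4.21, with -29.63 taken as the classical regression and -28.18 as the SAR model (the text states that the SAR log-likelihood is the larger of the two), and one restriction, so the statistic is referred to a chi-square distribution with one degree of freedom.

```python
import math

def likelihood_ratio_test(loglik_restricted, loglik_unrestricted):
    """LR statistic and p-value for a single restriction (chi-square, 1 df)."""
    statistic = 2.0 * (loglik_unrestricted - loglik_restricted)
    # For 1 degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

loglik_classical = -29.63   # Table 4.21 (assumed to be the classical model)
loglik_sar = -28.18         # Table 4.21 (assumed to be the SAR model)
lr, p = likelihood_ratio_test(loglik_classical, loglik_sar)
print(round(lr, 2), round(p, 2))
```

Under these assumptions the calculation agrees with the reported statistic of 2.90 and p-value of 0.09.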


Table 4.16: Spatial Autoregressive Model

Exogenous Variables    Coefficient    Probability
SAR λ                   0.37          0.06∗
Intercept              -2.31          0.50
Log(area)               0.02          0.96
Log(pop)                0.74          0.23
Log(mlength)           -0.44          0.43
Log(tdensity)          -0.10          0.58

Source: Produced by authors
Note: ∗, ∗∗ and ∗∗∗ denote statistical significance at 10%, 5% and 1% respectively.


The parameter estimates for area, population, road length characteristics and travel densities are not significant. This means that each of these exogenous variables does not contribute significantly to the incidence of RTC in Oyo state. However, the signs of the coefficients suggest the following.

Controlling for the spatial lag and the LGAs, population is positively related to the number of RTC occurring within the localities, indicating that population generates a certain level of RTC. All other things being equal, LGAs with larger residential populations tend to have more RTC. Because the variables enter in logarithms, the coefficient is read as an elasticity: a one percent increase in population will generate a 0.74 percent increase in the number of RTC cases within each LGA.

The road length characteristic produces a negative coefficient. This means that the existence of a freeway link crossing an LGA inversely impacts the incidence of RTC cases for the period. A one percent increase in major road length will generate a 0.44 percent decrease in the number of RTC cases within each LGA.

Travel densities are negatively related to the number of RTC. A one percent increase in travel densities will generate a 0.1 percent decrease in the number of RTC cases. This suggests inhibiting factors, in the sense that the traffic generated tends to be associated with fewer crashes. There are a number of possible reasons for this negative coefficient. The high intensity of vehicles in urban areas as a result of economic activities, the low concentration of vehicles in less populated areas and the existence of few major roads in rural areas could be responsible for this result.

LGAs with a higher than expected future likelihood of RTC include Ibadan North East, Ibadan South East, Ibadan North, Ibadan South West, Egbeda and Ibadan North West.
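The elasticity reading of these coefficients can be verified numerically. The sketch below assumes a log-log specification, as the percentage interpretation implies, and uses the point estimates from Table 4.16; it ignores the feedback through the spatial lag, so it describes the coefficient itself rather than a full marginal effect.

```python
def percent_change_in_outcome(elasticity, percent_change_in_regressor):
    """Exact percentage change in y implied by a log-log coefficient."""
    factor = (1.0 + percent_change_in_regressor / 100.0) ** elasticity
    return (factor - 1.0) * 100.0

# Point estimates from Table 4.16 (none is statistically significant)
sar_coefficients = {
    "Log(area)": 0.02,
    "Log(pop)": 0.74,
    "Log(mlength)": -0.44,
    "Log(tdensity)": -0.10,
}

for name, beta in sar_coefficients.items():
    one_percent = percent_change_in_outcome(beta, 1.0)
    ten_percent = percent_change_in_outcome(beta, 10.0)
    print(name, round(one_percent, 2), round(ten_percent, 2))
```

For a one percent change the result is practically equal to the coefficient, while for larger changes the approximation "coefficient times percentage change" becomes less accurate.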

4.7.2 Spatial Autoregressive Model with Spatial Autoregressive Disturbances Model Estimation

The estimation of the model follows the steps below. Stata 12 software was used for the spatial regression analysis (see Drukker et al., 2013 and Drukker et al., 2013a).

Step 1: Install the commands (shp2dta, spreg, spivreg). Connect to the internet and launch Stata. Go to the Help menu and click “Search”; type in the commands one after the other, and Stata will offer options to install each of them.

• Go to File, Set working directory
• Go to File, Log, Begin and save
• shp2dta using oyo_st, database(oyodb) coordinates(oyocoord) genid(id)
OR
• shp2dta using oyo_st, database(oyodb) coordinates(oyocoord) genid(id) gencentroids(c)
• use oyodb, clear
• describe
• list id NAME_2 in 1/5
Create stats.dta (scode column) and trans.dta (scode and id columns).
• use stats
• merge scode using trans, sort unique
• tabulate _merge
• drop _merge
• merge id using oyodb, sort unique
• tabulate _merge
• drop if _merge != 3
• spmap RTC using oyocoord if id != 13 & id != 56, id(id) fcolor(Blues)

The map is then displayed. Our dependent variable is the number of RTC cases per LGA. Figure 4.39 shows the distribution of RTC across LGAs, with darker colours representing higher values of the dependent variable. Spatial patterns of RTC are clearly visible. The highest concentration of RTC is across Oluyole, Ido, Akinyele, Egbeda, Atiba, Oyo East and Ogbomosho South LGAs. This is followed by Ibadan North, Ibadan North West, Ibadan South East, Ibadan South West, Lagelu, Afijio, Oyo West, Ori-Ire and Ogbomosho North LGAs.

Step 2: Create Weights Matrices
• use oyo_st, clear
• spmat contiguity coyo_st using oyocoord, id(ID) normalize(minmax)
• spmat summarize coyo_st, links

The content of W is summarized in Table 4.17. Some basic information about the normalized contiguity matrix, including the dimensions of the matrix and its storage, is displayed. The number of neighbour links found is 156, with each LGA having about 5 neighbours on average (mean 4.73). Each LGA has a minimum of 2 neighbours and a maximum of 8 neighbours.
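The link summary and the minmax normalization can be illustrated on a toy example. The sketch below is not the Stata implementation: the four-unit neighbour list is hypothetical, and the normalization follows our understanding of the minmax option, in which every element is divided by the smaller of the largest row sum and the largest column sum.

```python
def summarise_links(neighbours):
    """Total, minimum, mean and maximum number of neighbours per unit."""
    counts = [len(adjacent) for adjacent in neighbours.values()]
    return {
        "total": sum(counts),
        "min": min(counts),
        "mean": sum(counts) / len(counts),
        "max": max(counts),
    }

def minmax_normalise(neighbours):
    """Binary contiguity matrix divided by min(max row sum, max column sum)."""
    units = sorted(neighbours)
    index = {unit: k for k, unit in enumerate(units)}
    n = len(units)
    w = [[0.0] * n for _ in range(n)]
    for unit, adjacent in neighbours.items():
        for other in adjacent:
            w[index[unit]][index[other]] = 1.0
    max_row = max(sum(row) for row in w)
    max_col = max(sum(w[i][j] for i in range(n)) for j in range(n))
    scale = min(max_row, max_col)
    return [[element / scale for element in row] for row in w]

# Hypothetical layout: four units along a line
toy = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
summary = summarise_links(toy)
w_toy = minmax_normalise(toy)
print(summary)
print(w_toy[1])

# The mean reported in Table 4.17 is the total number of links over the 33 LGAs
print(round(156 / 33, 6))
```

Unlike row normalization, this scaling divides every element by the same constant, so a symmetric contiguity matrix stays symmetric.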




Figure 4.39: Number of Road Traffic Crashes for Local Government Areas in Oyo State, Nigeria Source: Produced by authors


Table 4.17: Summary of Spatial Contiguity Weighting Matrix

Matrix        Description
Dimensions    33 x 33
Stored as     33 x 33
Links
  Total       156
  Min         2
  Mean        4.727273
  Max         8

Source: Produced by the authors


• spmat save coyo_st using coyo_st.spmat
• spmat drop coyo_st
• spmat use coyo_st using coyo_st.spmat
• spmat note coyo_st
• spmat export coyo_st using nlist.txt, nlist
• use oyopop
• spmat use coyopop using coyo_st.spmat

This yields the contiguity matrix discussed in section 4.2. Next are the spatial regression commands.

Step 3: Perform Regression Analysis
• spreg gs2sls RTC_LOG POP_LG MJRD_LG TRD_LG AR_LG, id(id) dlmat(coyo_st) elmat(coyo_st) nolog
• spreg ml RTC_LOG POP_LG MJRD_LG TRD_LG AR_LG, id(id) dlmat(coyo_st) elmat(coyo_st) nolog

Spatial Autoregressive Model with Spatial Autoregressive Disturbances Regression Estimates

The GS2SLS and MLE parameter estimates for the SAR model with SAR disturbances (SAR-SARD) are shown in Table 4.18. The results for the GS2SLS and MLE estimators are broadly similar, indicating moderate and significant SAR dependence in the dependent variable and the error term, except in the case of area, where the sign of the coefficient differs.


Table 4.18: Spatial Autoregressive Model with Spatial Autoregressive Disturbances Model: Generalized Spatial Two Stage Least Squares and Maximum Likelihood Estimates

Variables        GS2SLS    Probability    MLE       Probability
Log(pop)          0.378    0.503           0.630    0.321
Log(mlength)     -0.263    0.594          -0.253    0.660
Log(tdensity)    -0.076    0.613          -0.069    0.694
Log(area)         0.018    0.951          -0.098    0.781
Lambda (λ)        1.369    0.000∗∗∗        0.708    0.001∗∗∗
Rho (ρ)          -1.431    0.071∗∗        -0.452    0.050∗∗
Sigma2 (σ²)       None     None            0.304    0.000∗∗∗

Source: Produced by authors
Note: ∗, ∗∗ and ∗∗∗ denote statistical significance at 10%, 5% and 1% respectively.


However, the GS2SLS estimator produces consistent estimates when the heteroskedastic option is specified (Kelejian and Prucha, 1998, 1999, 2010; Arraiz et al., 2010; Drukker et al., 2013). The MLE estimator produces consistent estimates in the IID case but generally not in the heteroskedastic case (Lee, 2004; see also Arraiz et al., 2010). Given the normalization of the spatial-weighting matrix, the parameter space for λ and ρ is taken to be the interval (−1, 1) (Kelejian and Prucha, 2010).

The estimated λ is positive and significant, indicating moderate SAR dependence in RTC cases. In other words, the RTC rate for a given LGA is affected by the RTC rates of the neighbouring LGAs. The estimated ρ coefficient is negative, moderate and significant, indicating moderate SAR dependence in the error term. In other words, an exogenous shock to one LGA will cause moderate changes in the RTC cases in the neighbouring LGAs. Including a spatial lag of the dependent variable implies that the outcomes are determined simultaneously. Thus, the estimated β vector does not have the same interpretation as in a simple linear model.

The parameter estimates for area, population, road length characteristics and travel densities are not significant. This means that each of these exogenous variables does not contribute significantly to the incidence of RTC in Oyo state. However, the signs of the coefficients (GS2SLS estimates) suggest the following.

The sign of the coefficient for area is positive: there are more RTC in larger LGAs. Every one percent increase in the area (in square kilometres) encompassing an LGA will generate a 0.02 percent increase in the number of RTC cases. This means that an increase in the area of administration of LGAs will lead to more RTC in each LGA.

Controlling for the spatial lag and the LGAs, population is positively related to the number of RTC occurring within the localities, indicating that population generates a certain level of RTC. All other things being equal, LGAs with larger residential populations tend to have more RTC. Because the variables enter in logarithms, the coefficient is read as an elasticity: a one percent increase in population will generate a 0.38 percent increase in the number of RTC cases within each LGA.

The major road length characteristic produces a negative coefficient. This means that the existence of a freeway link crossing an LGA inversely impacts the incidence of RTC cases for the period. A one percent increase in major road length will generate a 0.26 percent decrease in the number of RTC cases within each LGA.

Travel densities are negatively related to the number of RTC. A one percent increase in travel densities will generate a 0.08 percent decrease in the number of RTC cases. This suggests inhibiting factors, in the sense that the traffic generated tends to be associated with fewer crashes. There are a number of possible reasons for this negative coefficient. The high intensity of vehicles in urban areas as a result of economic activities, the low concentration of vehicles in less populated areas and the existence of few major roads in rural areas could be responsible for this result. This is similar to the work of Shefer and Rietveld (1997), where an inverse relationship between traffic congestion and road accidents exists (see section 2.3.3).
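Why the estimated β vector cannot be read as in a simple linear model can be seen from the reduced form, in which the regressors act on the outcome through the spatial multiplier (I − λW)^(-1). The sketch below is an illustration under stated assumptions: the three-unit weights matrix is hypothetical, the multiplier is approximated by a truncated power series (valid here because λ times the largest eigenvalue of W is below one), and λ and the population coefficient are the MLE values from Table 4.18.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def spatial_multiplier(w, lam, terms=200):
    """Approximate (I - lam*W)^-1 by I + lam*W + lam^2*W^2 + ..."""
    n = len(w)
    lam_w = [[lam * element for element in row] for row in w]
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = mat_mul(power, lam_w)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

# Hypothetical minmax-normalised weights for three units along a line
w_toy = [[0.0, 0.5, 0.0],
         [0.5, 0.0, 0.5],
         [0.0, 0.5, 0.0]]
lam = 0.708        # MLE estimate of lambda, Table 4.18
beta_pop = 0.630   # MLE coefficient on Log(pop), Table 4.18

multiplier = spatial_multiplier(w_toy, lam)
# Effect on each unit (rows) of a unit change in Log(pop) in each unit (columns)
effects = [[beta_pop * m for m in row] for row in multiplier]
for row in effects:
    print([round(e, 3) for e in row])
```

The diagonal entries (own-unit effects) exceed the raw coefficient because of feedback through neighbours, and the off-diagonal entries are the spillovers to other units.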

4.7.3 Spatial Autoregressive Model with Spatial Autoregressive Disturbances and Additional Endogenous Variable Model Estimation

Steps 1 and 2 in section 4.7.2 apply. The instrumental variable is the lag of the population figures for each LGA. We assume that traffic movement is a function of human population. In addition, earlier results suggest a close relationship between RTC and population in Oyo state (see section 4.4.1: Figures 4.3 and 4.4). The next step is to use the following command:
. spivreg RTC_LOG (POP_LG = elect) MJRD_LG TRD_LG AR_LG, id(id) dlmat(coyo_st) elmat(coyo_st) nolog
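The role of the instrument can be illustrated with a deliberately simplified, non-spatial simulation. Everything in the sketch is hypothetical (the data are simulated, the variable names are ours and the true slope is set to 2); it shows only why an instrument that is related to the regressor but not to the unobserved factor recovers the slope, whereas the spatial lag and spatial error terms handled by the command above are left out.

```python
import random

def covariance(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)

def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    return covariance(x, y) / covariance(x, x)

def iv_slope(z, x, y):
    """Instrumental-variable slope with a single instrument z for x."""
    return covariance(z, y) / covariance(z, x)

rng = random.Random(2013)
n = 5000
true_slope = 2.0
instrument = [rng.gauss(0, 1) for _ in range(n)]
hidden = [rng.gauss(0, 1) for _ in range(n)]   # unobserved factor driving x and y
x = [z + h + rng.gauss(0, 1) for z, h in zip(instrument, hidden)]
y = [true_slope * xi + 3.0 * h + rng.gauss(0, 1) for xi, h in zip(x, hidden)]

ols_estimate = ols_slope(x, y)
iv_estimate = iv_slope(instrument, x, y)
print(round(ols_estimate, 2), round(iv_estimate, 2))
```

In this simulation the ordinary regression slope is pulled away from the true value by the hidden factor, while the instrumented slope stays close to it.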

Spatial Autoregressive Model with Spatial Autoregressive Disturbances and Additional Endogenous Variable Model Regression Estimates

The GS2SLS parameter estimates for the SAR-SARD-IV model are shown in Table 4.19. Given the normalization of W, the parameter space for λ and ρ is taken to be the interval (−1, 1) (see Kelejian and Prucha, 2010, for further discussion of the parameter space). The estimate of λ is positive, large and significant, indicating strong SAR dependence in RTC cases. In other words, the RTC rate for a given LGA is strongly affected by the RTC rates in the neighbouring LGAs. One possible explanation for this may be the existence of freeways linking LGAs. Another may be the high travel density on the major road networks of particular LGAs.

The estimated ρ is negative and moderate, indicating moderate spatial autocorrelation in the innovations. In other words, an exogenous shock to one LGA will cause moderate changes in the RTC rate in the neighbouring LGAs. The estimated β vector does not have the same interpretation as in a simple linear model, because including a spatial lag of the dependent variable implies that the outcomes are determined simultaneously.

The parameter estimates for area, population, road length characteristics and travel densities are not significant. This means that each of these exogenous variables does not contribute significantly to the incidence of RTC in Oyo state. However, the signs of the coefficients may suggest the following.

Controlling for the spatial lag and the LGAs, population is positively related to the number of RTC occurring within the localities. All other things being equal, LGAs with larger residential populations tend to have more RTC. Because the variables enter in logarithms, the coefficient is read as an elasticity: a one percent increase in population will generate a 0.42 percent increase in the number of RTC cases within each LGA.


Table 4.19: Spatial Autoregressive Model with Spatial Autoregressive Disturbances and Additional Endogenous Variable Model: Generalized Spatial Two Stage Least Squares Estimates

Variables        Coefficient    Probability
Log(pop)          0.417         0.624
Log(mlength)     -0.104         0.849
Log(tdensity)     0.009         0.962
Log(area)        -0.175         0.606
Lambda (λ)        1.205         0.001∗∗∗
Rho (ρ)          -1.179         0.018∗∗

Source: Produced by authors
Note: ∗, ∗∗ and ∗∗∗ denote statistical significance at 10%, 5% and 1% respectively.

The major road length characteristic produces a negative coefficient. This means that the existence of a freeway link crossing an LGA inversely impacts the incidence of RTC cases for the period. A one percent increase in major road length will generate a 0.10 percent decrease in the number of RTC cases within each LGA.

Travel densities are positively related to the number of RTC. A one percent increase in travel densities will generate a 0.01 percent increase in the number of RTC cases. This suggests that the traffic generated tends to be associated with more crashes. The high intensity of vehicles in urban areas as a result of economic activities and the existence of few major roads in rural areas could be responsible for this result. This is similar to the findings of Noland and Quddus (2005) and Wang et al. (2009) (see section 2.3.3).

The sign of the coefficient for area is negative: there are fewer RTC in larger LGAs. Every one percent decrease in the area (in square kilometres) encompassing an LGA will generate a 0.18 percent increase in the number of RTC cases. This means that an increase in the area of administration of LGAs will lead to fewer RTC in each LGA.


4.7.4 Classical Linear Regression Model

To allow for comparison of model fit, we also modelled RTC with the traditional classical linear regression model, estimated by OLS. The software used for this analysis is Open-GeoDa. In Open-GeoDa, the regression functionality can be invoked from the main interface after the shapefile has been loaded. The following procedure applies:

Step 1: Select Methods.
Step 2: From the drop-down menu, select Regress. This brings up the default regression title and output dialog.
Step 3: Enter the regression output file name.
Step 4: To specify the long output options, check the Predicted Value and Residual and the Coefficient Variance Matrix boxes.
Step 5: Click the OK button in the regression title and output dialog. This brings up the regression model specification box.
Step 6: Select the dependent variable and the independent variables to specify the regression model. First select RTC as the dependent variable by clicking on the variable name in the Select Variables column, then click the > button next to Dependent Variable. The explanatory variables, namely travel density, population, major road length and encompassing area, are selected in a similar manner.
Step 7: Create a spatial weights matrix. Select the rook or queen criterion and name the file.
Step 8: Click the Run button to run the regression. The option to include a constant term is checked by default. A progress bar appears and indicates when the estimation process is complete.
Step 9: Click OK to bring up the regression results window.
Step 10: Select Save. This brings up a dialog to specify the variable names for residuals and/or predicted values. Check the box next to Predicted Value and the box next to Residual, and overwrite the default names with meaningful names.
Step 11: Click OK to add the selected columns to the data table.
Step 12: Click OK in the regression variable dialog to bring up the results window.
Step 13: To save the regression results, go to File on the main menu and paste the results into a word processor.

The top part of the window contains several summary statistics of the model as well as measures of fit. This is followed by a list of variable names, with the associated coefficient estimates, standard errors, t-statistics and probabilities (p-values).

Classical Regression Estimates

For the classical regression (OLS), which does not incorporate spatial effects, the estimates shown in Table 4.20 were obtained. The summary statistics are: number of observations (33), number of variables (5), degrees of freedom (28), mean of the dependent variable RTC (-0.82), standard deviation (0.65), R-squared (0.16), adjusted R-squared (0.06), sum of squared residuals (11.639), sigma-square (0.42), standard error of regression (0.64), sigma-square ML (0.35) and standard error of regression ML (0.59).

The parameter estimates for area, population, road length characteristics and travel densities are not significant. This means that each of these exogenous variables does not contribute significantly to the incidence of RTC in Oyo state. However, the signs of the coefficients suggest the following. The sign of the coefficient for area is negative: there are more RTC in smaller LGAs. Every one percent decrease in the area (in square kilometres) encompassing an LGA will generate a 0.18 percent increase in the number of RTC cases. Population is positively related to the number of RTC occurring within the localities. The road length characteristic produces a negative coefficient: a one percent increase in major road length will generate a 0.24 percent decrease in the number of RTC cases within each LGA. Travel densities are negatively related to the number of RTC: a one percent increase in travel densities will generate a 0.05 percent decrease in the number of RTC cases.

It can be deduced that the signs and magnitudes of the coefficients for the SAR model and the classical regression model are not the same. For major road lengths and travel densities the sign is negative in both models, and for population it is positive in both, but the sign for the area of land encompassing each LGA differs between the two models. The magnitudes of all the coefficients also differ.


Table 4.20: Classical Regression Model: Ordinary Least Squares Estimates

Exogenous Variables    Coefficient    Probability
Intercept              -2.39          0.55
Log(area)              -0.18          0.66
Log(pop)                0.82          0.26
Log(mlength)           -0.24          0.71
Log(tdensity)          -0.05          0.81

Source: Produced by authors


4.8 Model Diagnostics

In the Open-GeoDa environment, after obtaining the regression estimates as in section 4.4.4, the next item is the list of variables for model diagnostics. The summary characteristics of the model listed at the top include the name of the data set, the dependent variable, and its mean and standard deviation. In addition, the number of observations, the number of variables included in the model (including the constant term) and the degrees of freedom are displayed.

In the left-hand column of the standard output are the traditional measures of fit, including the R2 and the adjusted R2, the sum of squared residuals, and the residual variance and standard error estimate, both with adjustment for the loss in degrees of freedom (sigma-square and standard error of regression) and without (sigma-square ML and standard error of regression ML). The difference between the two measures is that the first divides the sum of squared residuals by the degrees of freedom, the second by the total number of observations. The second measure will therefore always be smaller than the first, but for large data sets the difference becomes negligible.

In the right-hand column are listed the F-statistic on the null hypothesis that all regression coefficients are jointly zero, and the associated probability. This test statistic is included for completeness' sake, since it typically will reject the null hypothesis and is therefore not useful. Finally, this column contains three measures that are included to maintain comparability with the fit of the SAR model. They are the log-likelihood, AIC and SC. These three measures are based on an assumption of multivariate normality and the corresponding likelihood function for the standard regression model. The higher the log-likelihood, the better the fit, that is, higher on the real line, so less negative is better. For the information criteria the direction is opposite: the lower the measure, the better the fit.

AIC = −2L + 2K, where L is the log-likelihood and K is the number of parameters in the model.
SC = −2L + K ln(N), where ln is the natural logarithm and N is the number of observations.

When the long output options are checked in the regression dialog, an additional set of results is included in the output window. These are the full covariance matrix for the regression coefficient estimates and/or the predicted values and residuals for each observation. These results are listed after the diagnostics. The variable names are given at the top of the columns of the covariance matrix. In addition, for each observation, the observed dependent variable is listed, as well as the predicted value and the residual (observed less predicted).
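The fit measures described above are simple functions of a few reported quantities. The sketch below assumes the figures reported for the classical regression in section 4.7.4 (sum of squared residuals 11.639, 33 observations, 5 variables) and the log-likelihoods in Table 4.21, with K taken as the number of variables in each model (5 for the classical model and 6 for the SAR model); the function names are ours.

```python
import math

def residual_variance(ssr, n_obs, n_params):
    """Residual variance with and without the degrees-of-freedom adjustment."""
    adjusted = ssr / (n_obs - n_params)   # sigma-square
    ml = ssr / n_obs                      # sigma-square ML
    return adjusted, ml

def aic(loglik, n_params):
    """Akaike Information Criterion: -2L + 2K."""
    return -2.0 * loglik + 2.0 * n_params

def schwarz(loglik, n_params, n_obs):
    """Schwarz Criterion: -2L + K ln(N)."""
    return -2.0 * loglik + n_params * math.log(n_obs)

sigma2, sigma2_ml = residual_variance(11.639, 33, 5)
print(round(sigma2, 2), round(math.sqrt(sigma2), 2))        # adjusted variance, std. error
print(round(sigma2_ml, 2), round(math.sqrt(sigma2_ml), 2))  # ML variance, std. error

aic_classical = aic(-29.63, 5)   # log-likelihood assumed to be the classical model's
aic_sar = aic(-28.18, 6)         # log-likelihood assumed to be the SAR model's
print(round(aic_classical, 2), round(aic_sar, 2))
```

Under these assumptions the residual-variance figures and the two AIC values agree with those reported in the text and in Table 4.21; the Schwarz Criterion follows the same pattern through `schwarz`, with the penalty growing in ln(N) rather than being fixed at 2 per parameter.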

4.8.1 Classical Regression Model versus Spatial Autoregressive Model

In Table 4.21, three measures based on the assumption of multivariate normality and the corresponding likelihood functions for the standard regression model are used for comparability with the fit of the SAR regression model. The higher the log-likelihood, the better the fit, whereas the lower the AIC and SC, the better the fit. The log-likelihood of the SAR model is greater than that of the classical linear regression (CLR) model, indicating that the SAR model is better than the CLR model. The AIC and SC values for the SAR model are smaller than those for the CLR model, which also indicates that the SAR model is better than the CLR model.


Table 4.21: Comparability of Model Specification

Measure                         Classical Regression Model    SAR Model    Remark
Log-Likelihood                  -29.63                        -28.18       SAR model is better than classical linear regression model
Akaike Information Criterion     69.26                         68.36       SAR model is better than classical linear regression model
Schwarz Criterion                77.34                         76.74       SAR model is better than classical linear regression model

Source: Produced by authors

4.9 Summary and Conclusion

The framework provides a means for linking RTC with neighbourhoods and exogenous variables and allows for post-estimation analysis. The study concludes that the number of RTC cases in a given LGA is affected by the number of RTC cases in neighbouring LGAs; that is, there is spatial dependence. The policy implication of our result is that, for the LGAs identified as having a higher than expected future likelihood of RTC, safety and security measures must be administered within these LGAs along with their neighbouring LGAs in order to achieve a significant remedial effect. The area of administration of each LGA and of the FRSC should be reduced to give room for effective service delivery, which in turn will lead to a reduction in the number of RTC cases within each LGA. The resulting models can be used to calculate predictions, which in turn can be used to compute marginal effects. That is, we may examine how a change in one exogenous variable potentially changes the predicted values for all the observations of the dependent variable in LGAs across the State. In addition, to reduce RTC, improve safety and security measures and achieve maximum remedial effect, policy design and decision making on transportation and on road network construction and maintenance should incorporate spatial effects and take cognizance of exogenous variables across the state.


Chapter 5

SUMMARY AND CONCLUSION

5.1 Introduction to the Chapter

This is the concluding chapter; the section contains a summary of the research work and explains the contribution of the present study to existing knowledge. Also, the chapter presents gaps in the study, identifies areas requiring attention and alternative techniques for further investigations.

5.2 Summary

There is a growing recognition among philosophers of science that certain causal laws are irreducibly statistical in nature (Salmon, 1984), or, at least, that social mechanisms are epistemically (if not objectively) random in character and should be understood in terms of probabilistic rather than deterministic theories (Bunge, 1961; Suppes, 1970; Papineau, 1978; Cartwright, 1989). The concept of randomness is, therefore, by no means at odds with the idea that RTC are explicable in terms of causal relations (Elvik, 1989). Indeed, the use of a statistically formulated conceptual framework seems virtually unavoidable if an understanding of RTC generating mechanisms is to be reached. Such a formulation allows us to separate the systematic variation in RTC counts from the pure chance variation. The present study blends statistical principles and GIS to understand the dynamics of RTC.


The present approach provides more information about the dynamics of RTC. Moreover, it makes possible a better understanding of the behaviour of the exogenous variables and of spatial dependence. This thesis has five chapters. Chapter one gave a brief overview of the need to model RTC using variants of the SAR model that account for spatial effects. Three overarching research questions, raised from an examination of the existing literature, were stated. With a focus on the distinct gap identified for investigation, the four objectives of this research work were presented plainly. In addition, the chapter set out stylized facts showing that RTC are a greater risk factor than most of the diseases that are the focus of individuals, governments, non-governmental organizations and international bodies. It also presented evidence that the decline in RTC required to bring about a significant change in total casualties is yet to be attained for the country and the study area. Thus, methodological issues and the implications of RTC at global and local levels formed the justification for the study. Chapter two reviewed notable research works on RTC modelling. The main lesson from these past works is that excluding spatial dependence from regression analysis when it is actually present leads to inefficient, biased and/or inconsistent OLS estimators, which invariably leads to biased statistical inference. It was also gathered that accounting for spatial effects could improve model fit, statistical efficiency and parameter estimates. This led to chapter three, where the methodology was discussed in detail.
First, we examined measures of spillover effects and then adopted the index of Moran (1948) for measuring the levels of concentration and hotspots. To deepen the understanding of the spatial autocorrelation, a complementary statistic, the G statistic developed by Getis and Ord (1992), was used. Spatial dependence was then incorporated into the classical linear regression model using the approach of Anselin (1988) and Drukker et al. (2013), through the disturbance term, the dependent variable and/or the exogenous variables. In chapter four, we identified the hotspots and levels of concentration of RTC for the study area. In addition, the causes of RTC and their spatio-temporal variation were investigated. This should enable higher levels of precision when administering road safety and security measures. The results also showed the gains from using SAR models over the Poisson and classical linear regression based models. The policy implications of the results are that RTC modelling could aid reduction in the current trend of RTC on road networks across the globe.
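As an illustration of the concentration measure adopted in chapter three, the global Moran's I can be computed directly from a vector of crash counts and a spatial weights matrix. The sketch below is a minimal version that assumes a binary contiguity matrix; the function name `morans_i`, the four areas and their counts are hypothetical and are not the Oyo State data.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for one variable and an n x n spatial weights matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                 # deviations from the mean
    s0 = w.sum()                     # sum of all weights
    return (x.size / s0) * (z @ w @ z) / (z @ z)

# Hypothetical example: four areas along a line, each contiguous to the next.
chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

clustered = morans_i([10, 9, 2, 1], chain)     # similar counts sit side by side
alternating = morans_i([10, 1, 10, 1], chain)  # high and low counts alternate

print(f"clustered:   {clustered:.3f}")
print(f"alternating: {alternating:.3f}")
```

A positive value indicates that similar counts cluster in neighbouring areas, while a negative value indicates that high and low counts alternate. Judging significance would additionally require a permutation test or the normal approximation, which this sketch omits.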

5.3 Summary of Results and Consequences

1. Concentration Levels of Road Traffic Crashes. The results reveal a high concentration of road traffic crashes within Egbeda, Oluyole and Akinyele Local Government Areas of Oyo State; that is, these Local Government Areas record a high number of road traffic crashes. Road safety remedial measures need to be administered to them immediately to cut down the incidence of road traffic crashes in Oyo State.

2. Hotspots of Road Traffic Crashes. The hotspots for road traffic crashes include Ibadan North, Ibadan North East, Ibadan North West, Ibadan South East, Ibadan South West, Egbeda, Oluyole and Akinyele Local Government Areas of Oyo State. Egbeda, Oluyole and Akinyele had the highest concentration of road traffic crashes at the time of this study, while Ibadan North, Ibadan North East, Ibadan North West, Ibadan South East and Ibadan South West form statistically significant clusters with them. Thus, if current practices and policies on administering road safety remedial measures continue without novel intervention, the concentration of road traffic crashes will spill over to these five Ibadan Local Government Areas. In the long run, the level of concentration of road traffic crashes may increase in all eight Local Government Areas. Clearly, there is a need for urgent intervention measures in the Local Government Areas with the highest concentration, as indicated in 1 above.

3. Spatial Dependence between Road Traffic Crash Occurrences. The null hypothesis of spatial randomness was rejected. We therefore conclude that the spatial pattern is clustered and that there is strong spatial dependence across Oyo State. Consequently, what happens in any Local Government Area of Oyo State is a function of what happens in the contiguous Local Government Areas. This indicates that road safety measures cannot be administered in isolation if maximum remedial effect is expected. Here, Egbeda, Oluyole and Akinyele Local Government Areas must be given road safety remedial measures simultaneously because they have the same concentration of road traffic crashes. Otherwise, the effect of any remedial measure will not be significant and may, in fact, amount to a waste of resources. Ultimately, there should be a model that identifies the Local Government Areas contiguous to every Local Government Area in Oyo State, to enable effective delivery of road safety remedial measures. In line with 1 above, the usual method, which is based on historical frequencies and political influence, is not effective for the delivery of road safety remedial measures.

4. Spatial Autoregressive Model with Spatial Autoregressive Disturbances. The estimated λ is positive and significant, indicating moderate spatial autoregressive dependence in road traffic crashes. In other words, the road traffic crash rate for a given Local Government Area is affected by the road traffic crash rates of the contiguous Local Government Areas. The consequence is as indicated in 3 above. The estimated ρ coefficient is negative, moderate and significant, indicating moderate spatial autoregressive dependence in the error term. In other words, an exogenous shock to one Local Government Area will cause moderate changes in road traffic crashes in the contiguous Local Government Areas. If a road safety remedial measure is applied to a Local Government Area, the effect will be seen in the contiguous Local Government Areas in Oyo State. The parameter estimates for area, population, road lengths characteristic and travel densities are not significant. This means that none of these exogenous variables individually contributes significantly to the incidence of road traffic crashes in Oyo State, although there may be a joint effect of the factors responsible for the occurrence of road traffic crashes. The signs of the coefficients nevertheless suggest the following. The coefficient for area was positive: there were more road traffic crashes in larger Local Government Areas, which suggests that an increase in the area of administration would be associated with more road traffic crashes in each Local Government Area. Population was positively related to the number of road traffic crashes occurring within the localities, indicating that population generates a certain level of road traffic crashes. All other things being equal, Local Government Areas with larger residential populations tend to have more road traffic crashes. In addition, the major road lengths characteristic produced a negative coefficient.
This means that the existence of a freeway link crossing a Local Government Area was inversely related to the incidence of road traffic crashes for the period. Travel densities were negatively related to the number of road traffic crashes. This suggests inhibiting factors, in the sense that the traffic generated tends to be associated with fewer crashes. There are a number of possible reasons for this negative coefficient. The high intensity of vehicles in urban areas as a result of economic activities, the low concentration of vehicles in less populated areas and the existence of few major roads in rural areas could be responsible for this result. Also, heavy traffic forces drivers to travel at low speed, which reduces the probability of a road crash. By modifying the spatial autoregressive model to include spatial dependence in the disturbance term, the study obtained more information on the effect of administering remedial measures on a particular Local Government Area and its contiguous Local Government Areas. These results suggest that stakeholders and policy makers need a proper understanding of spatial dependence and must take cognizance of exogenous variables in order to maximize information for the effective delivery of road safety remedial measures.

5. Spatial Autoregressive Model with Spatial Autoregressive Disturbances and Additional Endogenous Variable. With the population lags as an additional endogenous variable, the results were consistent with the estimates from the spatial autoregressive model with spatial autoregressive disturbances. Here the spatial autoregressive model was modified to include both spatial dependence in the disturbance term and an instrumental variable. These results confirm that the parameter estimates in 4 above are consistent. In summary, the dynamics of road traffic crashes were investigated, and alternative models were introduced by modifying the spatial autoregressive model. These results should help orient strategies and policies targeted at cutting down the number of road traffic crashes.
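Points 3 and 4 above rest on knowing which Local Government Areas are contiguous and on averaging crash counts over those neighbours. The sketch below shows one way this could be set up, assuming a simple adjacency mapping as input. The helper names `contiguity_matrix` and `row_standardise`, the area labels A to D and the counts are all hypothetical; the actual adjacency of the Oyo State Local Government Areas is not reproduced here.

```python
import numpy as np

def contiguity_matrix(neighbours):
    """Build a symmetric binary contiguity matrix from an adjacency mapping.

    Every area named as a neighbour must also appear as a key.
    """
    names = sorted(neighbours)
    index = {name: i for i, name in enumerate(names)}
    w = np.zeros((len(names), len(names)))
    for area, adjacent in neighbours.items():
        for other in adjacent:
            w[index[area], index[other]] = 1.0
            w[index[other], index[area]] = 1.0  # contiguity is mutual
    return names, w

def row_standardise(w):
    """Scale each row to sum to one; rows without neighbours stay zero."""
    row_sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

# Hypothetical areas A-D and crash counts, for illustration only.
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
names, w = contiguity_matrix(neighbours)
w_std = row_standardise(w)

crashes = np.array([10.0, 9.0, 2.0, 1.0])  # ordered as in `names`
spatial_lag = w_std @ crashes               # mean count among each area's neighbours

for name, own, lag in zip(names, crashes, spatial_lag):
    print(f"{name}: own={own:.0f}, neighbours' mean={lag:.1f}")
```

With row-standardised weights, the spatial lag of an area is simply the average crash count of its contiguous areas, which is the quantity that the spatial autoregressive term relates to the area's own count.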

5.4 Contribution to Knowledge

In a nutshell, this study developed alternative modelling techniques for road traffic crashes using modifications of the spatial autoregressive model, namely the spatial autoregressive model with spatial autoregressive disturbances and the spatial autoregressive model with spatial autoregressive disturbances and an instrumental variable. The study revealed that excluding spatial dependence when modelling road traffic crashes could impact negatively on model fit, statistical efficiency and parameter estimates. This in turn affects significance tests, and the measure of fit may be misleading. One of the benefits of this study is that it reveals the relationship between modelling, policy issues and reduction in road traffic crashes. This study revealed that identification of concentration levels, hotspots, and measures of association between road traffic crash occurrences and exogenous variables is a basic prerequisite for policy formulation on safety and injury prevention strategies if maximum remedial effect is to be achieved. Furthermore, this study forms an evidence-based framework for road traffic crash management. The models could serve as a technical guide for the government, road maintenance agencies, road users, business owners, stakeholders and the public to identify localities with a higher than expected future likelihood of road traffic crashes, in order to promote safety and security on road networks across the globe. In addition, the findings suggest that, to achieve the first and most important Millennium Development Goal (MDG), a main catalyst is the reduction of road traffic crashes, which can reduce deaths of young adults, who are often the breadwinners of their families, and thereby help to eradicate extreme poverty and hunger. In the long run, family income will improve; this will in turn stimulate economic growth and ultimately improve the country's GDP. Finally, this study is a novel intervention in the current policies and practices of road traffic crash management, and its implementation will complement existing efforts by governments to achieve the United Nations General Assembly's target of bringing down the expected rise in deaths from road traffic crashes by 50% within the 2011-2020 decade.

5.5 Suggestions for Further Studies

The following are possible areas in which to improve this study and extend investigations:
1. There is a need to investigate more exogenous variables.
2. Investigations could be carried out on how a change in one exogenous variable effects changes in the predicted values for all observations of the dependent variable, while simultaneously examining its effect on policy issues.
3. Alternative approaches could be explored in the context of the Bayesian framework.
4. The study can be applied across borders on small geographical scales to promote safety and security on the world's road networks.
5. Incorporating spillover effects in the form of spatial heterogeneity could be the focus of future studies.
6. Furthermore, the joint impact of the two spillover effects, namely spatial autocorrelation and spatial heterogeneity, could be investigated simultaneously.
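Suggestion 2 can be made concrete. In a model with a spatially lagged dependent variable, y = λWy + Xβ + u, a change in one exogenous variable in one area feeds through to the expected outcome in every area via the matrix (I − λW)⁻¹β. The sketch below computes average direct, indirect and total effects from that matrix, in the spirit of the summary measures discussed by LeSage and Pace (2009). The function name `average_impacts`, the weights matrix and the values of `lam` and `beta` are hypothetical and are not estimates from this study.

```python
import numpy as np

def average_impacts(w, lam, beta):
    """Average direct, indirect and total effects of one regressor.

    In y = lam*W*y + X*beta + u, the n x n matrix of partial derivatives of
    E[y] with respect to that regressor is inv(I - lam*W) * beta.
    """
    n = w.shape[0]
    effects = np.linalg.inv(np.eye(n) - lam * w) * beta
    direct = np.trace(effects) / n   # own-area effect, averaged over areas
    total = effects.sum() / n        # effect on all areas, averaged
    return direct, total - direct, total

# Hypothetical row-standardised weights for four areas along a line.
w = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])

# lam and beta are made-up values, not estimates from this study.
direct, indirect, total = average_impacts(w, lam=0.4, beta=2.0)
print(f"direct={direct:.3f}, indirect={indirect:.3f}, total={total:.3f}")
```

The indirect effect is the part of the total that falls on other areas, which is the spillover that a policy analysis would need to account for. The disturbance parameter ρ does not enter these effects because it does not change the conditional mean of y.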


5.6 Summary and Conclusion

In conclusion, this thesis presents alternative modelling techniques and an evidence-based framework for road traffic crash management that can promote safety and security on road networks and reduce the number of deaths and injuries, while simultaneously reducing the frequency of road traffic crashes. The spatial autoregressive model with spatial autoregressive disturbances, and its extension with an instrumental variable, are more informative and appropriate for modelling road traffic crashes than the spatial autoregressive model alone.
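For reference, the three specifications compared in this conclusion can be written side by side. This is a sketch that assumes the standard Cliff-Ord form described by Drukker et al. (2013a, 2013b), with λ and ρ as used in section 5.3. W and M denote spatial-weighting matrices, and the symbols Y and π for the additional endogenous regressors and their coefficients are notation assumed here for illustration.

```latex
\begin{align}
\text{SAR:} \quad
  & y = \lambda W y + X\beta + \varepsilon \\
\text{SAR with SAR disturbances:} \quad
  & y = \lambda W y + X\beta + u, \qquad u = \rho M u + \varepsilon \\
\text{with additional endogenous variables:} \quad
  & y = Y\pi + \lambda W y + X\beta + u, \qquad u = \rho M u + \varepsilon
\end{align}
```

Written this way, the models are nested: setting ρ = 0 in the second specification recovers the first, and dropping the term Yπ from the third recovers the second.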


REFERENCES Abdel-Aty, M.A. and Radwan, A.E. 2000. Modelling traffic accident occuerence and involvement. Accident Analysis and Prevention 32.5: 633-642. Abdullah L. and Zamri N. 2012. Road accident models with two threshold levels of fuzzy linear regression. Journal of Emerging Trends in Computing and Information Sciences 3.2: 225-230 Aborisade, S., 2010. Collapse of innocent dreams: how 28 kids died in road mishap. Sunday punch. Mar.28: 6. Abuh, A.2011. Anger over Tamburawa river sallah tragedy. The Guardian. Nov. 11: 12. Aderamo, A.J. 2012. Spatial pattern of road traffic accident casualties in Nigeria. Mediterranean Journal of Social Sciences 3.2: 61-72. Adepegba, A. 2013. Auto fatalities rise as FRSC grapples with convoy accidents. Saturday Punch. Feb.9:38 Agbeboh G. U. and Osabuohien-Irabor Osarumwense. 2013. Empirical analysis of road traffic accidents: a case study of Kogi State, North-Central Nigeria. International Journal of Physical Sciences 8.40:1923-1933. Aguero-Valverde, J. and Jovanis, P.P. 2006. Spatial analysis of fatal and injury crashes in Pennsylvania. Accident Analysis and Prevention 38: 618-625. 2008. Analysis of road crash frequency using spatial model. Paper Presented at the 87th Annual Meeting of the Prevention, 31.1-2: 161-168. Ahmed, M., Huang, H., Abdel-Aty, M. and Guevara, B. 2011. Exploring a bayesian hierarchical approach for developing safety performance functions for a mountainous freeway. Accident Analysis and Prevention 43: 1581-1589. Akinkuotu, E., and Adebayo, B. 2014. Lagos tanker fire burns 15 to death, 11 vehicles, 60 shops and bank razed. Retrieved Feb. 05, 2014 from http://www.punch.ng.com/news/lagos-tanker-fire-burns-15-to-death/ Amoros, E., Martin, J.L. and Laumon, B. 2003. Comparison of road crashes incidence and severity between some French countries. Accident Analysis and Prevention 35. 4: 537-547. Anastasopoulos, P. Ch., Tarko, A.P. and Mannering, F.L. 2008. 
Tobit analysis of vehicle accident rates on interstate highways. Accident Analysis and Prevention 40.2: 768-775. , Mannering F.L., Shankar V.N. and Haddock J.E. 2011. A study of factors affecting highway accident rates using the random-parameters tobit model. Accident Analysis and Prevention doi:10.1016/j.aap.2011.09.015. Anon. 2013. Accidents claim 47 lives,121 injured. Saturday Tribune. Feb.9:8. . 2013. Eight perish as trailer, car crash. Saturday Tribune. Mar. 2: 8. . 2013. 3 Die in Ibadan expressway explosion. Retrieved Mar. 17, 2013, from http://www.enownow.com/news/story.php?sno=91 . 2013. 7 Five dead as fire rages on Lagos-Ibadan expressway. Retrieved Mar. 17, 2013, from http://www.enownow.com/news/story.php?sno=10995

. 2013.42 people die in 78 road accidents. Retrieved Mar. 17, 2013, from http://premiumtimesng.com/news/121915-42people-died-in-78-roadaccidents-in-nigeria-last-week-frsc.html


. 2014. A scene of an accident opposite Odetola estate, Ikeja, on the Lagos-Ibadan expressway. The Punch. Jan. 29:38 Anselin, L. 1988. Spatial econometrics: methods and models. Kluwer Academic Publishers, Dordrecht: The Netherlands. . 1990. Some robust approaches to testing and estimation in spatial econometrics. Regional Science and Urban Economics 20: 1-17. . 1992. SpaceStat: A Program for the Statistical Analysis of Spatial Data. Santa Barbara, CA: National Center for Geographic Information and Analysis . 1995. Local indicator of spatial association- LISA. Geography Analysis 27: 93-115. . 2001. Raos score tests in spatial econometrics. Journal of Statistical Planning and Inference 97: 113-139. . 2005. Exploring spatial data with Geoda: a workbook. Centre for Spatially Integrated Social Science. University of Illinois, Urbana-Champaign. Retrieved Jan.20, 2013, from http://www.csiss.org/ or http://sal.agecon.uiuc. edu/ , D. A. Griffith 1988. Do spatial effects really matter in regression analysis? Papers of the Regional Science Association 65: 19-33. , Smirnov, O. 1996. in Anselin Luc. 2002. Under the hood issues in the specification and interpretation of spatial regression models. Agricultural Economics 27: 247-267. , A.K. Bera. 1998. Spatial dependence in linear regression models with an introduction to spatial econometrics, in A. Ullah and D.E.A. Giles, (eds.): Handbook of Applied Economic Statistics, Marcel Dekker, New York. Arraiz, I., D. M. Drukker, H. H. Kelejian, and I. R. Prucha. 2010. A spatial Cliff-Ord type model with heteroskedastic innovations: small and large sample results. Journal of Regional Science 50: 592614. Arthur G. 2007. Reflections on spatial autocorrelation. Regional Science and Urban Economics 37: 491-496. Bennett, S. 1989. An extension of Williams method for overdispersion models. GLIM Newsletter 17:12-16. Bernardinelli, L., Clayton, D., Pascutto, C., Montomoli, C., Ghislandi, M. and Songini. 1995. 
Bayesian analysis of space time variation in disease risk. Statistical Medicine 14: 2433 2443 Besag. J. 1974 . Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society Series B (Methodological) 36. 2: 192-236 Bhat, C. 2003. Simulation estimation of mixed discrete choice models using randomized and scrambled halton sequences. Transportation Research Part B 37. 1: 837-855. Black, W.R. and Thomas, I. 1998. Accidents on Belgiums motorways: a network autocorrelation analysis. Journal of Transport Geography 6.1: 23-31. Blommestein, H. 1983. Specification and estimation of spatial econometric models: A Discussion of Alternative Strategies for Spatial Economic Modeling. Regional Science and Urban Economics 13: 251-270. . 1985. Elimination of circular routes in spatial dynamic regression equations. Regional Science and Urban Economics 15: 121-130. Box, G. and Jenkins, G. 1976. Time series analysis, forecasting and control. San Francisco: Holden Day. Brandsma, A. and Ketellapper, R. 1979. Further evidence on alternative procedures for testing of spatial autocorrelation among regression disturbances in Exploratory and Explanatory Analysis in Spatial Data, ed. C. Bartels and R. Ketellapper, Boston: Martinus Nijhoff : 11-36. Breslow, N. E. 1984. Extra-poisson variation in log - linear models. Applied Statistics 33:38-44. Bunge, M. 1961. Causality, chance and law. American Science 49:432-448. Burridge, P. 1980. On the Cliff-Ord test for spatial autocorrelation. Journal of the Royal Statistical Society B 42: 107-108. Carroll, P.S. 1973. Symposium on Driving Exposure. National Technical Information Service, Springfield: V.A.


Cartwright, N. 1989. Nature's capacities and their measurement. Oxford: Clarendon Press. Case, A. 1991. Spatial patterns in household demand. Econometrica 59: 953-965. Chang T., Konz S.A. and Lee E.S. 1996. Applying fuzzy linear regression to VDT legibility. Fuzzy Sets and Systems 80: 197-204. Chapman R. 1973. The Concept of Exposure. Accident Analysis and Prevention 5: 95-110. Chun, K., Goh, K., Currie G., Sarvi M. and Logan D. 2014. Bus accident analysis of routes with/without bus priority. Accident Analysis and Prevention 65: 18-27. Cliff, A.D. and Ord, J.K. 1973. Spatial autocorrelation. London: Pion. Cowell, F.A. 1977. Measuring Inequality. Oxford: Philip. Cressie, N.A.C. 1991. Statistics for Spatial Data. New York: Wiley. Davis, J. C. 1986. Statistics and Data Analysis in Geology. New York: Wiley. Deolu, 2014. Accident Leaves One Dead, Others Injured In Sokoto. Information Nigeria. Oct. 6. Retrieved Oct. 9, 2014, from http://www.informationng.com/2014/10/accident-leaves-one-dead-othersinjured-in-sokoto.html Dhrymes, P. 1981. Distributed lags: problems of estimation and formulation. Amsterdam: North Holland. Drukker, D. M., H. Peng, I. R. Prucha, and R. Raciborski. 2013. Creating and managing spatial-weighting matrices with the spmat command. Stata Journal 13: 242-286. Drukker, D. M., Prucha I. R. and Raciborski R. 2013a. Maximum likelihood and generalized spatial two-stage least-squares estimators for a spatial-autoregressive model with spatial-autoregressive disturbances. Stata Journal 13: 221-241. Drukker, D. M., Prucha I. R. and Raciborski R. 2013b. A command for estimating spatial-autoregressive models with spatial-autoregressive disturbances and additional endogenous variables. Stata Journal 13: 287-301.

Elvik, R. 1989. Explaining road accidents: Three mechanisms generating randomness in accident counts. Unpublished manuscript. Oslo Institute of Transport Economics. Erdogan, S. 2009. Explorative spatial analysis of traffic accident statistics and road mortality among the provinces of Turkey. Journal of Safety Research 40: 341-351. Ezekiel, E. 2010. Grappling with the scourge of road accidents. Sunday Punch. Nov. 7: 12. Federal Road Safety Corps 2013. Federal Republic of Nigeria. Road traffic crashes data. 2012. RS11.3 Oyo Sector Command. Federal Road Safety Commission. 2007. Summary of reported traffic crashes trends in Nigeria (1960-2006). Retrieved Oct. 9, 2013, from www.frsc.gov.ng Fridstrom, L. and Ingebrigtsen, S. 1991. An aggregate accident model based on pooled regional time series data. Accident Analysis and Prevention 23.5: 363-378. Geary, R. 1954. The contiguity ratio and statistical mapping. The Incorporated Statistician 5: 115-145. Getis, A. and Ord, J.K. 1992. The analysis of spatial association by use of distance statistics. Geographical Analysis 24: 189-206. Global Status Report on Road Safety. 2013. Supporting a decade of action. [pdf] Retrieved Oct. 20, 2013, from www.who.int/iris/bitstream/10665/78256/1/9789241564564 eng. Gourieroux, C., Monfort, A., Trognon, A. 1984. Pseudo maximum likelihood method: application to Poisson models. Econometrica 52: 701-720. Greene, W. 2007. Limdep, version 9.0. Econometric Software, Inc., Plainview, NY. Greenwood, M. and Yule, G.U. 1920. An enquiry into the nature of frequency distributions of multiple happenings, with particular reference to the occurrence of multiple attacks of disease or repeated accidents. Journal of the Royal Statistical Society, Series A 83: 239-262 in Fridstrom, L. and Ingebrigtsen, S. 1991. An aggregate accident model based on pooled, regional time series data. Accident Analysis and Prevention 23.5: 363-378.


Griffith,D. A. 2005. Spatial autocorrelation. Encyclopedia of social measurement 3: 581-590. . 2009. Spatially autoregressive model. University of Texas:Dallas. Guo, F., Wang, X., and Abdel-Aty, M. 2010. Modelling signalized intersection safety with corridor-level spatial correlations. Accident Analysis and Prevention 42: 84-92. Halton, J. 1960. On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals. Numerische Mathematik 2: 84-90. Heshmati B. and Kandel A. 1985. Fuzzy linear regression and its applications to forecasting in uncertain environment. Fuzzy Sets and Systems 15: 159-191. Hoque, M.M. and Andreassen, D.C. 1986. Pedestrian accidents: an examination by road class, with reference to accident cluster. Traffic Engineering and Control 27. 7/8: 391-395. Huang H. and Abdel-Aty, M. 2010. Multilevel data and bayesian analysis in traffic safety. Accident Analysis and Prevention 42: 1556-1565. Jaro, M.A. 1992. AutoMatch: Generalized Record Linkage System (version 2.0). Silver Springs, MD: Data Star, Inc. Jeffrey P. C. 2007. Economic benefits of investments in transport infrastructure. Discussion paper No. 2007-13. Jegede, F.J. 1988. Spatio-temporal analysis of road traffic accidents in Oyo State, Nigeria. Accident Analysis and Prevention 20. 3 : 227-243. Jiang X., Abdel-Aty M. and Alamili S.2014. Application of poisson random effect models for highway network screening. Accident Analysis and Prevention 63: 74-82. Jones, B., Janssen, L. and Mannering, F. 1991. Analysis of the frequency and duration of freeway accidents in Seattle. Accident Analysis & Prevention 23.4: 239-255. Jones, A.P., Haynes, R., Harvey I.M. and Jewell, T. 2011. Road traffic crashes and the protective effect of road curvature over small areas. Health and Place doc: 10.1016/j. healthplace 2011. 10.008. Jovanis, P., and Chang, H. 1986. Modelling the relationship of accidents to miles travelled. Transportation Research Record 1068: 42-51. Kelejian, H. 
and D. P. Robinson. 1992. Spatial autocorrelation: a New computationally simple test with an application to per capita county police expenditures. Regional Science and Urban Economics 22: 317-333. . 1993. A Suggested method of estimation for spatial interdependent models with autocorrelated errors, and an application to a country expenditure model. Papers in Regional Science 72: 297-312. .1995. Spatial correlation: A suggested alternative to the autoregressive model in L. Anselin and R. Florax (Eds), New Directions in Spatial Econometrics. Springer-Verlag, Berlin. , I. R. Prucha. 1998. A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. Journal of Real Estate Finance and Economics 17: 99-121. .1999. A generalized moments estimator for the autoregressive parameter in a spatial model. International Economic Review 40: 509-533. . 2004. Estimation of simultaneous systems of spatially interrelated cross sectional equations. Journal of Econometrics 118:27-50. . 2010. Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances. Journal of Econometrics 157: 53-67. ., I.R. Prucha, and E. Yuzefovich. 2004. Instrumental variable estimation of a spatial autoregressive model with autoregressive disturbances: large and small sample results in: LeSage, J. and Pace, R.K. (eds.), Spatial and spatiotemporal econometrics (Advances in Econometrics) 18: Elsevier, New York :163-198.


Khoo, C. 1993. A Program for Matching Street Intersections.Special software. Department of Urban and Regional Planning, University of Hawaii, Manoa. Ki-moon, B. 2012. Improving global road safety. Roadsafe. Retrieved Mar.17, 2013, from http://www.roadsafe.com/news/article.aspx Kim, K., Brunner, I. and Yamashita, E. 2006. Influence of land use, population, employment, and economic activity on accidents. Journal of Transport Research Board : 56-64. Kmenta, J. 1971. Elements of econometrics. New York: Macmillan. Koshi, M. and Okura, I. 1978. Statistical analysis of accident proneness of roadway sections.Tokyo: Japan Society of Traffic Engineers. Lee, L.F. 2003. Best spatial two stage least squares estimators for a spatial autoregressive model with autoregressive disturbances. Econometric Reviews 22: 307-335. . 2004. Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica 72: 1899-1925. LeSage, J., and R. K. Pace. 2009. Introduction to spatial econometrics. Boca Raton: Chapman and Hall/CRC. Levine,N., Khoo, C.,Okazaki, D., Kim, K. and Nitz, L.1994. Pointstat: A Statistical Program for Analyzing the Spatial Distribution of Point Locations (Version 1.0). Honolulu, HI: University of Hawaii at Manoa. , Kim, K.E. and Nitz L.H. 1995a. Spatial Analysis of Honolulu Motor Vehicle Crashes: I. Spatial Patterns. Accident Analysis and Prevention, 27.5: 663-674. . 1995b. Spatial Analysis of Honolulu Motor Vehicle Crashes: II. Zonal Generators. Accident Analysis and Prevention 27.5 :675-685. Liem, T.C., Dagenais and M.G. Gaudry. 1987. M.L-1.2: A Program for Box-Cox transformations in regression models with heteroskedastic and autoregressive residuals. Publication # 510.Montreal: Centre de Re cherche sur les Transports, Universite de Montreal. Lord, D. 2000. The prediction of accidents on digital networks: Characteristics and issues related to the application of accident prediction model. Ph.D. Dissertation. 
Department of Civil Engineering. University of Toronto, Toronto. 2006. Modelling motor-vehicle crashes using poisson-gamma models: examining the effect of low sample mean values and small sample size on the estimation of the fixed dispersion parameter. Accident Analysis and Prevention 38: 751-766. , Miranda Moreno, L.F. 2007. Effects of low sample mean values and small sample size on the estimation of the fixed dispersion parameter of poisson-gamma models for modelling motor vehicle crashes: a bayesian perspective. Loveday, J. 1989. Spatial Analysis of Road Accident Data and the Accident Migration Hypothesis. 21st Annual Conference of Universities Transport Studies Group, Napier University, Edinburgh. . 1991 Spatial Autocorrelation and Road Accident Migration. 23rd Annual Conference of Universities Transport Studies Group, University of Nottingham. Maher, M.J. and Mountain, L.J. 1988. The identification of accident blackspots: a comparison of current methods. Accident Analysis and Prevention 20.2: 143-151. Mannering, F. L. 1989. Poisson analysis of commuter flexibility in changing routes and departures times. Transport Research Record 23B: 53-60. Miaou, S., and Lord, D. 2003. Modelling traffic crash-flow relationship for intersections: dispersion parameter, functional form, and Bayes versus empirical Bayes. Transportation Research Record 1840: 31-40. , Song, J.J. and Mallick, B. 2003. Roadway traffic crash mapping: a space-time modeling approaches. Journal of Transportation and Statistics 6.1: 33-57. Milton, J. and Mannering, F. 1998. The relationship among highway geometrics, traffic-related elements and motor-vehicle accident frequencies. Transportation 25: 395-413.


Moran, P. 1948. The interpretation of statistical maps. Journal of the Royal Statistical Society B 10: 243-251. Mitchell, A. 2005. The ESRI guide to GIS analysis. 2: Spatial Measurements. California: ESRI press. National Bureau of Statistics. 2007. Social statistics in Nigeria. [pdf] Federal Republic of Nigeria. Retrieved Mar.17, 2013, from http://www.nigerianstat.gov.ng/ext/latest release/ssd09.pdf. . 2009. Reported cases of notifiable diseases [2002-2006].Retrieved Mar.17, 2013, from www.nigerianstat.gov.ng/ext/l release/ssd09.pdf . 2014. Transport. Retrieved Feb.05, 2014 from http://www.nigerianstat.gov.ng/sectorstat/sectors/Transport. Nicholson, A.J. 1985. The variability of accident counts. Accident Analysis and Prevention 17.1: 47-56. . 1989. Accident Clustering: Some Simple Measures. Traffic Engineering and Control 30.5 :241-246. . 1990. Measures of Accident Clustering. In: Koshi, M. (Ed), Transportation and Traffic Theory, Elsevier, New York. . 1999. Analysis of Spatial Distributions of Accidents. Safety Science 31: 71-91. Nigerian Police. 2009. Summary of reported traffic crashes trends in Nigeria (2000-2007). National Bureau of Statistics(2009) Social Statistics in Nigeria. Federal Republic of Nigeria.Retrieved Feb.05, 2014, from www.nigerianstat.gov.ng/ext/latest release/ssd09.pdf. Nigerian Police/ Federal Road Safety Commission. 2007. Summary of reported traffic crashes trends in Nigeria (1960-2006). Retrieved Feb.05, 2014 from, www.frsc.gov.ng. Noland, R.B. and Quddus, M.A. 2005. Congestion and safety: a spatial analysis of London. Transportation Research Part A: Policy and Practice 39. 7-9 : 737-754. Nwogu, S. 2010. One feared dead, others injured in Ogun accident. The Punch. Nov.10:11. Obe, E. 2011. 78 died in road crashes in christmas week-FRSC. The Punch. Dec.27:8. Ogbodo, D. and Nduoma, E. Nov. 17, 2011. FRSC: Nigerian roads second worst in the world. Thisdaylive. 
Retrieved Mar.17, 2013, from http://www.thisdaylive.com/articles/frsc-nigerian-roads-second-worst-in-theworld/103012/ Ohakwe J., Iwueze I.S. and Chikezie, D.C. 2011. Analysis of road traffic accidents in Nigeria: a case study of Obinze/Nekede/Iheagwa road in Imo state, Southeastern, Nigeria. Asian Journal of Applied Sciences 4: 166-175. Okamoto, H. and Koshi, M. 1989. A method to cope with the random errors of observed accident rates in regression analysis. Accident Analysis and Prevention 21.4:317-332. Olufowobi, S., Okon, A. and Emhonyon, E. 2009. 5 feared dead in molue accident in Lagos. Saturday Punch. Jun. 20: 13. Ord, J. 1975. Estimation methods for models of spatial interaction. Journal of the American Statistical Association 70: 120-26. Paelinck, J. and L. Klaassen. 1979. Spatial econometrics. Farnborough: Saxon House. Papineau, D. 1978. For science in the social sciences. London: Macmillan Press Ltd. . 1985. Probabilities and causes. Journal of philosophy. 82.2 :57-74 Pearson, E.S. and Hartley, H.O. 1972. Biometrika Tables for Statisticians, Vol. 2, Cambridge University Press; Cambridge. Peden, M., Scurfield, R., Sleet, D., Mohan, D., Hyder, A.A., Jarawan, E. and Mathers, C. 2004. World report on road traffic injury prevention. World Health Organization. Retrieved Mar. 17, 2013, from http://www.siteresources.worldbank.org/EXTTOPGLOROASAF/Resou rces/ WHO full report en pdf Peltzman, S. 1975. The effects of automobile safety regulations. Journal of Political Economy 83: 677-725


Peters, G. 1994. Fuzzy linear regression with intervals. Fuzzy Sets and Systems 63: 45-55.

Quddus, M.A. 2008. Modelling area-wide count outcomes with spatial correlation and heterogeneity: an analysis of London crash data. Accident Analysis and Prevention 40.4: 1486-1497.

Ripley, B. 1981. Spatial statistics. New York: John Wiley and Sons.

Road Safety Reports. 2011. Federal Road Safety Commission, Nigeria. Prince Michael International Road Safety Awards, International Awards, 2008. Retrieved Mar. 05, 2013, from www.roadsafetyawards.com/international/view.aspx?winnerid=181.

Roque, C. and Cardoso, J.L. 2014. Investigating the relationship between run-off-the-road crash frequency and traffic flow through different functional forms. Accident Analysis and Prevention 63: 121-132.

Salmon, W.C. 1984. Scientific explanation and the causal structure of the world. Princeton University Press.

Savolainen, P.T., Mannering, F.L., Lord, D. and Quddus, M.A. 2011. The Statistical Analysis of Highway Crash Injury Severities: A Review and Assessment of Methodology Alternatives. Accident Analysis and Prevention 43: 1666-1676.

Schluter, P., Deely, J. and Nicholson, A. 1997. Ranking and selecting motor accident sites by using a hierarchical Bayesian model. Journal of the Royal Statistical Society Series D 46: 293-316.

Shanker, V., Albin, R., Milton, J. and Mannering, F. 1998. Evaluating median crossover likelihoods with cluster accident counts. An empirical inquiry using the random effect negative binomial model. Transportation Research Record 16-35.

Shanker, V., Mannering, F. and Barfield, W. 1995. Effect of roadway geometrics and environmental factors on rural accident frequencies. Accident Analysis and Prevention 27: 371-389.

Shefer, D. and Rietveld, P. 1997. Congestion and safety on highways: towards an analytical model. Urban Studies 34.4: 679-692.

Sukhai, A., Jones, A.P., Love, B.S. and Haynes, R. 2011. Temporal variations in road traffic fatalities in South Africa. Accident Analysis and Prevention 43: 421-428.

Sumaila, AbdulGaniyu Femi. 2013. Road crashes trends and safety management in Nigeria. Journal of Geography and Regional Planning 6.3: 53-62.

Suppes, P. A. 1970. Probabilistic theory of causality. Amsterdam: North-Holland.

Spiegelhalter, D., Best, N., Carlin, B.P. and Linde, A. 2002. Bayesian measures of model complexity and fit. Journal of Royal Statistical Society B 64.4: 583-639.

Spiegelhalter, D., Thomas, A., Best, N. and Lunn, D. 2003. WinBUGS user manual version 1.4.

Tanaka, H., Uejima, S. and Asai, K. 1982. Linear regression analysis with fuzzy model. IEEE Transactions on Systems, Man and Cybernetics 12: 903-907.

Taylor, M.A.P., Woolley, J.E. and Zito, R. 2000. Integration of the global positioning system and geographical information systems for traffic congestion studies. Transportation Research Part C: Emerging Technologies 8.1-6: 257-285.

The Africa Report. 2013. Worlds un-safest roads in Nigeria unacceptable. Retrieved Oct. 20, 2013, from http://www.theafricareport.com/West-Africa/worlds-un-safest-roads-innigeria-unacceptable.html.

Tobler, W. 1979. Cellular Geography. In Philosophy in geography, edited by S. Gale and G. Olsson. Dordrecht: Reidel: 379-386.

Train, K. 1999. Halton sequences for mixed Logit. Working Paper. University of California, Department of Economics, Berkeley.

Ugwu, C. June 17, 2011. Africa: UN predicts more traffic accidents. Daily Champion. Retrieved Mar. 17, 2013, from http://www.allafrica.com/stories/201106170708.html

Upton, G.J.G. and Fingleton, B. 1989. Spatial data analysis by example, vol. 2. Categorical and directional data. New York: Wiley.

Wakefield, J.C., Best, N.G. and Waller, L. 2000. Bayesian approaches to disease mapping, on spatial epidemiology. Methods and Applications. Oxford University Press.


Wang, C., Quddus, M.A. and Ison, S.G. 2009. Impact of traffic congestion on road accidents: a spatial analysis of the M25 motorway in England. Accident Analysis and Prevention 41: 798-808.

Washington, S.P., Karlaftis, M.G. and Mannering, F.L. 2011. Statistical and Econometric Methods for Transportation Data Analysis. 2nd ed., Chapman & Hall/CRC.

Wolfe, A.C. 1982. The concept of exposure to the risk of a road traffic accident and an overview of exposure data collection methods. Accident Analysis and Prevention 14.5: 337-340.

World Health Organization. 2004. World report on road traffic injury prevention. Summary. World Health Organization. Retrieved Mar. 17, 2013, from http://www.who.int/violence injury.../road.../world.../summary en rev.pdf

Ximaio Jiang, Mohamed Abdel-Aty and Samer Alamili. 2014. Application of Poisson random effect models for highway network screening. Accident Analysis and Prevention 63: 74-82.

Ye, X., Pendyala, R., Washington, S., Konduri, K. and Oh, J. 2009. A simultaneous equations model of crash frequency by collision type for rural intersections. Safety Science 47: 443-452.

Yu, R. and Abdel-Aty, M. 2013. Investigating different approaches to develop information priors in hierarchical Bayesian safety performance functions. Accident Analysis and Prevention 56: 51-58.

Yu, R., Abdel-Aty, M. and Ahmed, M. 2013. Bayesian random effect models incorporating real-time weather and traffic data to investigate mountainous freeway hazardous factors. Accident Analysis and Prevention 50: 371-376.


APPENDICES


Sunday Punch

Sunday Punch

The Punch

28/03/2010

07/11/2010

10/11/2010

The Guardian

Saturday Punch

20/06/2009

11/11/2011

Source

Date of Report

Warewa bus stop, along Lagos-Ibadan Expressway. 15 kilometers from Kano Metropolis

Ajayi bus stop along the Lagos-Abeokuta expressway. Olupitan area of Ore, Ondo-Ore road. FCT.

Location

6 persons, a Mercedes-Benz.

Two buses, two motorcycles, about 29 passengers

Bus (Ondo XC 536 REE) loaded with 42 pupils, 3 persons and a trailer (Katsina XD 919 KTN). 289 cases

A fully loaded 911 bus (XC 869 GGE), pedestrians and an okada man.

Number of Victims/Vehicles Involved

A family of 5 killed and 1 person

53 killed and 586 injured between Jan. and June 2010. 1 dead, 20 injured

No fewer than 28 kids died.

Number Killed/Injured

5 feared dead, others injured.

Appendix 1: Some Road Traffic Crash Reports and Deaths on Nigerian Roads

Car skidded off the road and tumbled into the Tamburawa river at the point where the bridge had no guardrail.

Head on collision

No specific details

Overtaking/Head on collision

Overspeeding

Cause of Accidents


Saturday Punch

Saturday Tribune

The Punch

The Punch

9/2/2013

2/3/2013

9/1/2014

29/1/2014

Information Nigeria

Saturday Tribune

9/2/2013

06/10/2014

The Punch

27/12/2011

Lagos-Ibadan Expressway, opposite Otedola Estate, Ikeja. Sabon-garin Kwannawa Junction, Sokoto

Ibadan-Lagos Expressway and some other parts of the country. Kirikiri, Lagos

Ibadan-Lagos Expressway and some other parts of the country. Report on convoy accidents

Iganmu Bridge, Lagos

Source: Compiled by authors

A Pick-up van, marked Edo XF 739 BEN.

Tanker, parked vehicles, bus, 12 heavy-duty engines. A car

13,000 vehicles booked for various traffic offences out of 36,000 flagged down by FRSC officials during a survey centered on 23 road corridors and 65 routes in a 26-day special operation by FRSC. 23 vehicles

39 vehicles, 1 motorcycle

Multiple accidents at U-turn

Number killed/injured not stated. 1 dead, 17 injured

Lost control, somersault

Lost control, somersault

Lost control

Tanker crash and other factors not listed.

25 killed, 51 injured

15 dead, 5 injured

Recklessness, overspeeding and various other traffic offences.

Tyre burst and other factors not listed

Ignorance on the part of drivers and heavy traffic

26 lives lost in governors' and other VIPs' convoys in the past three years

78 died across the country between Dec. 20 and Dec. 27, 2011. 47 killed, 121 injured

Appendix 2

3 Die In Ibadan Expressway Explosion. Retrieved Mar. 17, 2013, from http://www.enownow.com/news/story.php?sno=9151

Appendix 3

Five Dead As Fire Rages On Lagos-Ibadan Expressway. Retrieved Mar. 17, 2013, from http://www.enownow.com/news/story.php?sno=10995


Appendix 4

42 People Die in 78 Road Accidents. Retrieved Mar. 17, 2013, from http://premiumtimesng.com/news/121915-42-people-died-in-78-road-accidents-in-nigeria-last-week-frsc.html


Appendix 5

Lagos Tanker Fire Burns 15 to Death. Retrieved Feb. 05, 2014, from http://www.punchng.com/news/lagos-tanker-fire-burns-15-to-death/


5.7 Data for the study


ROAD TRAFFIC CRASHES COORDINATES

POINT_X     POINT_Y
3.82242     7.44859
3.90155     7.08165
3.900947    7.0824
3.900622    7.106681
3.8898      7.13893
3.77437     7.159909
3.774588    7.160084
3.774751    7.160214
3.775093    7.160487
3.775512    7.160823
3.776318    7.161466
3.777499    7.1624
3.782777    7.16531
3.78475     7.166386
3.87852     7.16811
3.787921    7.16892
3.788769    7.16989
3.789235    7.170482
3.789273    7.170536
3.789309    7.170586
3.789374    7.170679
3.78997     7.171553
3.79692     7.187021
3.797578    7.187855
3.797706    7.188017
3.798002    7.188392
3.798297    7.188767
3.79946     7.19042
3.80048     7.191229
3.804814    7.195878
3.804976    7.196266
3.805136    7.196611
3.805484    7.1973
3.805748    7.197978




3.878427    7.198275
3.87855     7.19961
3.80652     7.202106
3.877214    7.20281
3.806368    7.204992
3.808088    7.215682
3.809166    7.217516
3.813142    7.228957
3.813203    7.229539
3.81329     7.229731
3.8145      7.23207
3.817765    7.238703
3.818281    7.239835
3.818621    7.240365
3.818975    7.240761
3.862106    7.241273
3.819782    7.241661
3.863176    7.250363
3.831564    7.251005
3.831892    7.251453
3.863655    7.251667
3.832248    7.251939
3.832486    7.252265
3.836145    7.258984
3.836855    7.260944
3.838952    7.266355
3.839174    7.266843
3.843175    7.272418
3.845541    7.274889
3.846634    7.275869
3.847395    7.276493
3.847565    7.276629
3.847902    7.2769
3.848197    7.277136
3.848677    7.277501
3.849452    7.278087
3.850152    7.278457
3.850953    7.278834
3.852624    7.279661
3.86616     7.28019
3.854501    7.280631
3.85502     7.280899
3.855649    7.281224
3.864082    7.284281
3.864092    7.287281
3.864956    7.288695


3.865726    7.290187
3.866133    7.290989
3.86625     7.291224
3.866892    7.293051
3.866939    7.293207
3.867242    7.294392
3.867414    7.296912
3.6663      7.29695
3.867485    7.298426
3.867733    7.301054
3.868965    7.306988
3.869175    7.307906
3.869736    7.309713
3.868344    7.310886
3.86836     7.310949
3.868377    7.311021
3.868397    7.311103
3.868413    7.311169
3.869359    7.31609
3.870635    7.319082
3.86935     7.323976
3.87496     7.324772
3.875148    7.325294
3.875237    7.325407
3.875235    7.325666
3.875288    7.325886
3.869055    7.3284
3.875345    7.331475
3.881504    7.333742
3.882106    7.333852
3.882832    7.333868
3.891378    7.336599
3.891923    7.336832
3.89217     7.336931
3.892302    7.336984
3.892417    7.33703
3.892566    7.33709
3.892766    7.33717
3.893257    7.337368
3.9294      7.33757
3.87789     7.33801
3.902606    7.338991
3.903047    7.339204
3.903398    7.339219
3.903758    7.339428
3.903758    7.339428


3.916725    7.344701
3.87765     7.34825
3.86991     7.35025
3.920354    7.354078
3.920473    7.354257
3.920628    7.354492
3.921767    7.356155
4.122175    7.36064
4.11671     7.36199
4.1117      7.36396
3.86422     7.36399
3.925938    7.364239
3.926047    7.364381
3.926073    7.364537
3.926176    7.364762
4.070023    7.367403
4.105492    7.367468
4.069477    7.367608
4.069165    7.367724
4.068779    7.367869
4.077599    7.368526
4.077972    7.368661
4.096256    7.368723
4.095986    7.368761
4.07829     7.368762
4.064735    7.369406
4.082329    7.369656
4.059592    7.371048
4.058771    7.371318
4.057493    7.371742
4.057264    7.371824
3.929479    7.371836
4.057039    7.371905
4.056799    7.371991
4.05417     7.372887
4.053941    7.372976
4.053565    7.373085
4.053565    7.373085
3.85961     7.37325
4.051387    7.373712
4.051387    7.373712
4.051287    7.373741
4.05113     7.373788
4.050967    7.373836
4.050714    7.373911
4.045805    7.375722


4.044776    7.37598
4.043798    7.376244
4.040053    7.376751
4.038086    7.376906
4.037774    7.376908
4.037774    7.376908
4.037415    7.37691
4.037151    7.376911
4.036887    7.376913
4.034374    7.376919
4.03844     7.376931
3.857527    7.377612
4.029459    7.377656
3.849367    7.377789
4.028372    7.378162
4.027977    7.378425
4.027977    7.378425
4.027855    7.378481
4.027662    7.378674
4.027504    7.378765
3.5282      7.37944
3.86371     7.37962
3.83305     7.38001
4.02117     7.38078
3.840766    7.381175
4.021512    7.381241
3.840545    7.381256
3.996984    7.381315
3.840299    7.381346
3.996547    7.381369
3.99617     7.381431
3.995869    7.381493
3.995932    7.381513
3.995499    7.381579
4.00547     7.382249
4.005759    7.382286
3.837625    7.382511
4.011516    7.382953
4.011214    7.382968
4.012428    7.383063
4.011856    7.383074
4.012689    7.383094
4.01212     7.383115
4.013402    7.383175
3.9339      7.383504
3.98815     7.384478


3.986407    7.385303
3.829199    7.385349
3.828984    7.385459
3.82869     7.385536
3.828526    7.385579
3.98541     7.38585
3.985078    7.385873
3.984589    7.386083
3.98437     7.386173
3.742267    7.386588
3.983134    7.386726
3.74158     7.386728
3.982248    7.387152
3.982587    7.38718
3.747946    7.387397
3.748136    7.38751
3.748406    7.387688
3.87222     7.38784
3.980743    7.38811
3.980374    7.3883
3.87253     7.3884
3.87253     7.3884
3.978874    7.388841
3.977879    7.389211
3.87917     7.38942
3.97664     7.38979
3.97664     7.38979
3.88028     7.38996
3.975225    7.390562
3.819037    7.390645
3.97444     7.39092
3.974143    7.391233
3.972938    7.391743
3.815652    7.39199
3.815514    7.392046
3.814719    7.392597
3.813451    7.393511
3.966305    7.394106
3.812423    7.394251
3.88015     7.39426
3.88124     7.39493
3.88755     7.39522
3.9712      7.39589
3.88733     7.39606
3.757327    7.396299
3.757451    7.396301


3.757518    7.396303
3.781061    7.396364
3.786028    7.396449
3.786306    7.396472
3.78931     7.396732
3.88233     7.39691
3.807063    7.398483
3.798271    7.398792
3.765429    7.398935
3.765878    7.399156
3.9576      7.399395
3.800886    7.399801
3.774912    7.400094
3.803432    7.400338
3.77447     7.40046
3.89149     7.40059
3.89121     7.40064
3.770216    7.401448
3.770858    7.401827
3.94559     7.40334
3.94409     7.40386
3.94409     7.40386
3.89024     7.40486
3.90478     7.40631
3.89008     7.40725
3.94747     7.40754
3.944       7.40942
3.89147     7.40957
3.86748     7.40991
3.86748     7.40991
3.86735     7.41011
3.86735     7.41011
3.8672      7.41042
3.8672      7.41042
3.86714     7.41051
3.89284     7.41102
3.89355     7.41152
3.91229     7.41231
3.89569     7.41483
3.96233     7.41562
3.9117      7.41599
3.85639     7.41683
3.89715     7.41826
3.8599      7.41835
3.8706      7.41845
3.86014     7.41916


3.87861     7.42008
3.94001     7.42044
3.97702     7.42135
3.91019     7.42332
3.9955      7.426
3.9353      7.42768
3.90401     7.42843
3.837833    7.428592
3.8381      7.42908
3.90717     7.4313
4.00537     7.43207
3.83047     7.43539
4.00602     7.43593
3.93338     7.43604
3.825526    7.439647
4.00971     7.44105
4.0238      7.44627
3.821524    7.447608
3.82106     7.448471
3.90811     7.45451
3.91332     7.46771
3.9126      7.468321
3.761339    7.472926
3.7619      7.473142
4.07678     7.47342
3.780193    7.475719
4.078       7.47648
3.91434     7.488291
3.914345    7.488396
3.914351    7.48855
3.916294    7.494984
3.91629     7.495539
3.916385    7.496068
3.916266    7.496689
4.08222     7.49958
3.913343    7.500539
3.913158    7.50103
3.912489    7.511906
3.9124      7.513097
3.912376    7.513414
3.918542    7.513486
3.911731    7.526776
3.911731    7.526776
4.09195     7.53007
3.916639    7.53299
3.916609    7.533289


3.916532    7.534038
3.916469    7.534657
3.9164      7.535327
3.911346    7.537363
3.91609     7.538362
3.90009     7.54048
3.910519    7.548074
3.914672    7.551479
3.914405    7.553918
3.88064     7.57644
3.87938     7.57815
3.915032    7.579797
3.915216    7.58306
3.914129    7.589664
3.916652    7.600589
3.917327    7.605436
3.918177    7.610524
3.919534    7.619306
3.920145    7.622935
3.920981    7.629103
3.921664    7.648273
3.919214    7.658319
3.919419    7.659194
3.919251    7.661648
3.91136     7.678438
3.915645    7.703295
3.915659    7.71298
3.915558    7.713327
3.91535     7.71404
3.914962    7.747988
3.915045    7.748083
3.915134    7.748201
3.915468    7.748572
3.916124    7.749612
3.917169    7.752026
3.919041    7.754393
3.919342    7.757868
3.919235    7.758226
3.918766    7.761669
3.918767    7.76178
3.918776    7.761848
3.918796    7.761937
3.921347    7.766434
3.916237    7.801032
3.914126    7.808119
3.914112    7.808228


3.914103    7.808302
3.920483    7.826405
3.926121    7.832883
3.925772    7.832929
3.936169    7.835984
3.93997     7.838289
3.940154    7.838399
3.941573    7.839283
3.945803    7.843199
3.946048    7.843347
3.946332    7.843519
3.951184    7.847388
3.951861    7.848106
3.955606    7.851711
3.958304    7.853208
3.958521    7.853326
3.958753    7.853451
3.958836    7.853496
3.970777    7.856763
3.96992     7.856801
3.968295    7.85715
3.967703    7.857284
3.967376    7.857358
3.96722     7.857394
3.966976    7.85745
3.966645    7.857525
3.966238    7.85758
3.972522    7.857663
3.972626    7.857723
3.979374    7.867159
3.975498    7.867362
3.975546    7.867531
3.980914    7.875667
3.981176    7.875732
3.981332    7.875771
3.981515    7.875807
3.981811    7.875842
3.990726    7.880008
3.994743    7.881764
3.995114    7.881888
3.995495    7.882016
3.995826    7.882127
4.002867    7.886098
3.400747    7.89276
4.010231    7.901634
4.010496    7.902053


4.010659    7.902292
4.014161    7.910768
4.014355    7.911621
4.012863    7.918986
4.012849    7.919123
4.014947    7.926637
4.015578    7.926844
4.015631    7.926879
4.018488    7.928899
4.01174     7.92936
4.019182    7.931159
4.019247    7.931305
4.027143    7.93413
4.027388    7.93431
4.02753     7.934439
4.027615    7.934515
4.027726    7.934627
4.035054    7.941624
4.03769     7.945412
4.042031    7.94769
4.055568    7.954075
4.071767    7.961556
4.072349    7.962311
4.075189    7.965111
4.086008    7.981042
4.08923     7.9835
4.088377    7.984198
4.08921     7.985495
4.10175     8.005257
4.108226    8.012996
4.12165     8.02302
4.133991    8.044349
4.135776    8.045254
4.151642    8.055318
4.1519      8.055435
4.152174    8.055488
4.152439    8.055675
4.152716    8.055897
4.159723    8.061033
4.165636    8.06159
4.356202    8.069364
4.180133    8.081092
4.180728    8.081859
4.188096    8.088778
4.188466    8.089142
4.188724    8.089395


4.189269    8.089932
4.189269    8.089932
4.189847    8.0905
4.1903      8.090946
4.196784    8.097102
4.197214    8.097515
4.197741    8.098052
4.197741    8.098052
4.198313    8.098637
4.198968    8.099286
4.199509    8.099657
4.201181    8.100555
4.202749    8.101088
4.205207    8.102067
4.205207    8.102067
4.205402    8.102222
4.20557     8.102362
4.205954    8.102687
4.20649     8.103223
4.209118    8.105947
4.211466    8.108292
4.212095    8.108876
4.291926    8.11003
4.214192    8.110745
4.21589     8.11076
4.2193      8.11325
4.22519     8.11913
4.22519     8.11913
3.50853     8.12527
4.23841     8.13562
4.2512      8.14555
4.250212    8.145886
4.25464     8.15195
4.22835     8.15604
4.255364    8.156131
4.227032    8.156871
4.24374     8.163811
4.26912     8.18041
4.278778    8.200445
3.47019     8.20212
4.283907    8.209566
3.46119     8.21555
4.299323    8.235503
4.30525     8.2441
4.30525     8.24411
3.43289     8.25602


4.31477     8.26332
4.314009    8.264245
3.41193     8.29556
3.3997      8.32489
3.40145     8.42602
3.408269    8.551551
3.486717    8.59902
3.47466     8.60637
3.40931     8.60653
3.5946      8.61613
3.43701     8.63167
3.40464     8.63741
3.4022      8.64218
3.40215     8.6581
3.37835     8.67078
3.39685     8.67918
3.584662    8.744011
3.89044     8.970343
3.89564     8.97521
3.89564     8.97521
3.901277    8.982225
3.916513    9.006554
3.874759    9.042777
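As an illustration of how the coordinate listing above might be read into an analysis script, the following is a minimal sketch only. It assumes POINT_X is longitude and POINT_Y is latitude in decimal degrees, which the appendix does not state, and the function names (parse_points, mean_centre, bounding_box) are hypothetical helpers introduced here, not part of the study's workflow. The four sample rows are copied from the listing.

```python
from statistics import mean

# A few rows copied from the coordinate listing (POINT_X, POINT_Y).
# Assumption: X is longitude and Y is latitude, in decimal degrees.
SAMPLE = """
POINT_X POINT_Y
3.82242 7.44859
3.90155 7.08165
3.900947 7.0824
3.874759 9.042777
"""


def parse_points(text):
    """Return a list of (x, y) float pairs from whitespace-separated rows."""
    points = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank lines and stray single values
        try:
            points.append((float(parts[0]), float(parts[1])))
        except ValueError:
            continue  # skip the header row or any non-numeric row
    return points


def mean_centre(points):
    """Arithmetic mean of the X and Y values (a simple mean centre)."""
    xs, ys = zip(*points)
    return mean(xs), mean(ys)


def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for the point set."""
    xs, ys = zip(*points)
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    pts = parse_points(SAMPLE)
    print("points read:", len(pts))
    print("mean centre:", mean_centre(pts))
    print("bounding box:", bounding_box(pts))
```

The parser skips any row that is not exactly two numeric values, so a header line or a leftover page number in a pasted listing would be ignored rather than misread as a coordinate.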


