Applying Branch-and-Bound Technique to Route Choice Set Generation
Carlo Giacomo Prato and Shlomo Bekhor
The application of the branch-and-bound algorithm to a real case study is illustrated here. The procedure is compared with different deterministic techniques with respect to the ability to reproduce actual route choices of individuals driving habitually from home to work in an urban network. A choice set for modeling purposes resulting from the application of the method is constructed by considering path set consistency with the observed behavior and is used for comparing estimates and performance of different model specifications.
An algorithm to solve explicitly the path enumeration problem is proposed. This algorithm is based on the branch-and-bound technique and belongs to the class of deterministic methods along with existing approaches that combine heuristic or randomization procedures with shortest-path search. The branch-and-bound algorithm is formulated, and a methodology is designed for the application of deterministic approaches to a real case study. Path sets generated with different methods are compared for behavioral consistency, namely, the ability to reproduce actual routes chosen by individuals driving habitually from home to work. Choice set compositions for modeling purposes are determined for the consistency of the path generation process with the observed behavior. Further, model estimates and performance for different route choice specifications are examined for both path set compositions. Results suggest that the proposed branch-and-bound algorithm generates realistic and heterogeneous routes, reproduces better the observed behavior of the interviewed drivers, and produces a good choice set for route choice model estimation and performance comparison.
Route choice behavior modeling generally considers separately the individuation of available alternative routes and the calculation of the probability of choosing a certain route from the generated choice set. Enumeration of alternatives is not a straightforward problem since the actual size of real networks creates problems in the choice set definition. The generated set should exclude unrealistic paths that no traveler would ever consider and highly similar paths that no traveler would ever differentiate between and include relevant and heterogeneous routes that different travelers would choose. Several approaches found in the literature to generate routes are based on variations to the shortest-path search. Definition of objective functions to be minimized, formulation of heuristic rules, and implementation of randomization techniques are combined with K-shortest-path algorithms in order to create a path set. The analysis of relevance and heterogeneity of paths generated with methods relying on the shortest-path search suggests a different approach to the problem.
An algorithm based on the branch-and-bound technique is proposed in this study, which intends to explicitly generate realistic and heterogeneous routes while limiting computational costs. This procedure constructs a connection tree between the origin and destination of a trip by processing sequences of links according to a branching rule that accounts for logical constraints formulated to increase route likelihood and heterogeneity. Each sequence of links connecting origin and destination and satisfying all the constraints enters the choice set as a feasible solution to the path enumeration problem.
Transportation Research Institute, Technion–Israel Institute of Technology, Haifa 32000, Israel. Transportation Research Record: Journal of the Transportation Research Board, No. 1985, Transportation Research Board of the National Academies, Washington, D.C., 2006, pp. 19–28.

DETERMINISTIC APPROACHES FOR PATH GENERATION
The exhaustive approach to path generation assumes unrealistically that all physical routes connecting origin and destination of a trip are considered. Selective approaches account for deterministic and probabilistic procedures, depending on the methods used to generate the paths. The proposed branch-and-bound algorithm enters the class of deterministic approaches reviewed in this study. A review of the probabilistic methods for choice set generation may be found elsewhere (1). Prashker and Bekhor review the route choice models estimated for comparing the different generated choice sets (2).
The most straightforward path generation approach searches for the first K-shortest paths that minimize the generalized path costs. Shortest-path algorithms, for example, that of Dijkstra (3), assume implicit awareness of all the link attributes. Van der Zijpp and Fiorenzo-Catalano (4) present a constrained method to generate routes by finding directly feasible K-shortest paths and exploiting a wide class of constraints.
Ben-Akiva et al. (5) propose an approach for generating possible paths by labeling each route according to a criterion for which the path is optimum. This approach assumes that travelers may have different objective functions. Each criterion corresponds to a different preferred route, and each route can be labeled according to a different objective function. Dial (6) generalizes the labeling method by constructing a set of efficient paths in which being efficient means minimizing a linear combination of label costs.
Azevedo et al. (7) define an approach in which all the shortest-path links are removed from the network to find the next best path. The main problem with the link elimination approach is related to network disconnection, since removing centroid connectors and major junctions does not guarantee the existence of more paths between the origin and the destination. A variant to this approach obviates the problem by eliminating individual links or a combination of links from the shortest path.
De la Barra et al. (8) illustrate the link penalty approach, in which impedances of the shortest-path links are increased in order to calculate the next best path. The process continues until no more new paths are produced. Park and Rilett (9) modify this approach by not increasing the impedance on links within a certain distance from the origin or the destination. Scott et al. (10) optimize the program for determining the penalizing factor for impedances on shortest-path links in order to generate a next best path that overlaps with the shortest path by no more than a given number of links.
Simulation methods assume that travelers erroneously perceive link attributes, and therefore extraction of random draws from a distribution that might represent drivers’ perceptions appears to be suitable. Sheffi and Powell (11) apply the Monte Carlo technique with multinomial probit to the traffic assignment problem. Since the assignment algorithm does not present the network loading steps, the application presents analogies with path generation procedures. Fiorenzo-Catalano and Van der Zijpp (12) implement a version of the Monte Carlo technique by gradually increasing the variance of the random components in the model in order to keep the frequency with which new paths are found at a constant rate. Bekhor et al. (13) verify the suitability of simulation methods to produce paths similar to those observed for a real case study in Boston.
The aspect common to the foregoing approaches is the shortest-path search. The method proposed in this study provides an alternative to the shortest-path-based methods by applying the branch-and-bound technique to solve the path enumeration problem. In transportation-related problems, Friedrich et al. (14) apply a branch-and-bound assignment procedure for transit networks by using a timetable-based search algorithm. Hoogendoorn-Lanser (15) adapts the same procedure for choice set generation in the analysis of multimodal transport networks. In these transit applications, the method exploits predefined route sections. In the following section, a different approach is designed for a road network.

BRANCH-AND-BOUND ALGORITHM FOR PATH GENERATION
The branch-and-bound algorithm enumerates paths by generating a tree of routes connecting an origin node o to a destination node d.
The preprocessing phase of the algorithm creates arrays containing the network elements to be processed:
• An array lists the generic links Li,j by tracing initial node i, final node j, predecessors of i, successors of j, length D(Li,j), travel time T(Li,j), and straight linear distance Dj,d from the final node j of the link to the destination node d;
• Defining a path segment Po,k as the sequence of links connecting the origin node o to the initial node k of the next link processed, an array records path segments Po,k by registering length D(Po,k), travel time T(Po,k), number of left turns LT(Po,k), and straight linear distance Dk,d from the final node k of the path segment to the destination node d; and
• An array catalogs the set Ck of path segments Po,k arriving at node k.
The processing phase of the algorithm represents the centroid node of the origin zone as the root of the tree and the centroid connector to the origin node o as the only outgoing branch. Starting from the origin node o, given a path segment Po,x from the current tree level, all possible successors of node x are considered. Let L*x,y be the currently processed link starting at node x and terminating at node y. Let P*o,y be the processed path segment between origin node o and node y, formed by adding the link L*x,y to the path segment Po,x arriving at node x. Let Cy be the set of all the connections to node y. L*x,y is inserted into the tree as an appendix of the connection set C*x if and only if all the following conditions hold:
• Directional constraint. Consider the straight linear distance D*y,d from the final node y of the processed link L*x,y to the destination d and the straight linear distance D*x,d from the arriving node x of the processed connection set C*x to the destination d. The processed link enters the connection set Cy if
$$D^*_{y,d} \le \Delta_D \, D^*_{x,d} \tag{1}$$
where ΔD is a distance factor larger than 1. This directional constraint excludes from consideration links that take the driver significantly farther from the destination and closer to the origin. The distance factor introduces a tolerance because it considers drivers that move slightly farther from the destination to reach a faster road, for example, a highway.
• Temporal constraint. Consider the travel time T(P*o,y) of the processed path segment P*o,y and the minimum travel time Tmin(Po,y) among the path segments belonging to the connection set Cy. The connection set Cy includes the processed path segment if
$$T(P^*_{o,y}) \le \Delta_T \, T_{\min}(P_{o,y}) \tag{2}$$
where ΔT is a time factor larger than 1. This temporal constraint rejects path segments that travelers would consider unrealistic since their travel time is excessively high for connecting the origin node o to the arrival node y.
• Loop constraint. Consider each link L*i,j belonging to the set of links {L*o,(o+1), L*(o+1),(o+2), . . . , L*(o+n),y} that form the processed path segment P*o,y. Define the functions Ns(Li,j) and Ne(Li,j) that map the initial and the final node of each link Li,j. Consider a subsequence of links {L*r,(r+1), L*(r+1),(r+2), . . . , L*(r+m),s} that compose a path subsegment P*r,s connecting two generic nodes r and s on the processed path segment P*o,y. Consider the length D(L*r,s) of each link belonging to the path subsegment P*r,s and the minimum distance D[Ns(L*r,(r+1)), Ne(L*(r+m),s)] between nodes r and s. The path segment P*o,y is excluded from the connection set Cy if there exists at least one path subsegment P*r,s for which
$$\sum_{P^*_{r,s}} D(L^*_{r,s}) > \Delta_L \, D\left[N_s(L^*_{r,(r+1)}),\, N_e(L^*_{(r+m),s})\right] \qquad \forall P^*_{r,s} \in P^*_{o,y} \tag{3}$$
where ΔL is a detour factor larger than 1. This loop constraint discards path segments that travelers would not consider because they constitute a detour larger than an acceptable value.
• Similarity constraint. Considering the same definitions as for the previous constraint, the path segment P*o,y is not taken into consideration in the connection set Cy if there exists at least one path subsegment P*r,s for which
$$\begin{cases} \sum_{P^*_{r,s}} D(L^*_{r,s}) > D\left[N_s(L^*_{r,(r+1)}),\, N_e(L^*_{(r+m),s})\right] \\ \sum_{P^*_{r,s}} D(L^*_{r,s}) < \Delta_o \end{cases} \qquad \forall P^*_{r,s} \in P^*_{o,y} \tag{4}$$
where Δo is an overlap factor smaller than 1. This similarity constraint removes highly overlapping path segments that travelers would not consider as separate alternatives. The sequence of links {L*r,(r+1), L*(r+1),(r+2), . . . , L*(r+m),s} is actually a detour and the overlap factor is the minimum required detour length in order to individuate a different path segment.
• Movement constraint. Consider the number of left turns LT(P*o,y) in the processed path segment P*o,y. The connection set Cy includes the processed path segment if
$$LT(P^*_{o,y}) \le \Delta_{LT} \tag{5}$$
where ΔLT is the maximum number of left turns for each route. Especially in urban networks where traffic light regulation does not reserve green time for left turns, these movements markedly increase travel time by adding long waiting times at junctions before a left turn. This movement constraint removes unrealistic path segments, causing delay in terms of travel time and apprehension in drivers approaching the junction.
The algorithm processes a tree level completely before path segments of the next level are considered. Two queues store unprocessed segments of the current and the next level, and the level is completed when the queues are empty. The algorithm completes the connection search when all the tree levels are processed and node y corresponds to destination node d for all the branches. Figure 1 outlines the structure of the generic connection tree. The tree width depends on the network structure and may be wider than the usual shortest-path tree. According to operational research theory, the speed of the branch-and-bound algorithm depends linearly on the width of the connection tree but exponentially on its depth. Consequently the proposed technique behaves conveniently from the computational perspective.

APPLICATION OF PATH GENERATION ALGORITHMS
For modeling purposes, choice sets are considered consistent with the observed behavior if they contain the actual chosen route among paths produced with generation techniques. For this reason, different deterministic approaches to path enumeration are tested with respect to the ability to reproduce the observed behavior of individuals driving habitually from home to work in an urban network. Actual route choices were collected among faculty and staff members of Turin Polytechnic, in Italy, who participated voluntarily in a Web-based survey. Prato et al. (16) describe the details of the questionnaire, which in the first part collected information about spatial abilities and driving attitudes of the respondents and in the second part recorded the chosen route and possible alternatives to reach their workplace. The application of path generation algorithms exploits the network of the city of Turin, built on the basis of the urban traffic plan designed by the municipality in 2001 and composed of 419 nodes and 1,427 links (17).
Different path generation methods are evaluated with respect to the coverage of the collected routes. The coverage is the percentage of observations for which an algorithm generates a route that satisfies a threshold for the overlap measure:
$$\max_r \sum_{n=1}^{N} I(O_{nr} \ge \delta) \tag{6}$$
where I() is the coverage function, equal to 1 when its argument is true and zero when its argument is false; Onr is the overlap percentage; and δ is a threshold for the overlap measure. The overlap measure evaluates the consistency of a path enumeration method with respect to the observed behavior by considering the length of the links shared between generated and collected routes:
$$O_{nr} = \frac{L_{nr}}{L_n} \tag{7}$$
where Lnr is the overlapping length between the path generated by algorithm r and the observed path for driver n, and Ln is the length of the observed path for driver n.
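The overlap measure and the coverage count can be illustrated with a minimal Python sketch. The function names, the representation of a route as a list of link identifiers, and the toy link lengths are assumptions made for illustration only; they are not taken from the study. When an algorithm produces several paths for one observation, the sketch assumes that the best overlap among them is the one compared with the threshold.

```python
def overlap(generated, observed, length):
    """Overlap O_nr (Eq. 7): shared length divided by observed route length."""
    shared = set(generated) & set(observed)
    return sum(length[a] for a in shared) / sum(length[a] for a in observed)


def best_overlap(path_set, observed, length):
    """Maximum overlap reached by any generated path for one observation."""
    return max(overlap(p, observed, length) for p in path_set)


def coverage(best_overlaps, delta):
    """Fraction of observations whose best overlap meets threshold delta (Eq. 6)."""
    return sum(o >= delta for o in best_overlaps) / len(best_overlaps)


# Hypothetical toy data: link lengths in km and one observed route.
length = {"a": 2.0, "b": 3.0, "c": 5.0, "x": 4.0}
observed = ["a", "b", "c"]
path_set = [["a", "x"], ["a", "c", "x"]]

o = best_overlap(path_set, observed, length)      # 7 km shared out of 10 km
share = coverage([1.0, o, 0.85, 0.2], delta=0.8)  # two of four observations pass
```

Multiplying the returned fraction by 100 gives the percentage form used for the coverage in the text.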
FIGURE 1 Structure of generic connection tree (root at node o, branches terminating at node d; tree width measured across branches, tree depth measured over levels 1 to 5).
The ideal algorithm would reproduce perfectly the observed behavior by replicating link by link all the routes collected in the survey and would result in 100% coverage for a 100% overlap threshold. The actual techniques partially reproduce the observed behavior, and an index measures the behavioral consistency of path generation methods with respect to the ideal algorithm by accounting for total overlap over all the observations:
$$CI_r = \frac{\sum_{n=1}^{N} O_{nr,\max}}{N \, O_{\max}} \tag{8}$$
where CIr is the consistency index of algorithm r, Onr,max is the maximum overlap measure obtained with the paths generated by algorithm r for the observed choice of each driver n, and Omax is the 100% overlap over all the N observations for the ideal algorithm.
Five different deterministic algorithms are applied with the objective of maximizing the coverage function:
• The branch-and-bound algorithm is implemented by defining the parameters for the constraints of the branching rule: the distance factor is 1.10, the time factor is 1.50, the loop factor is 1.20, the overlap factor is 0.80, and the maximum number of left turns is 4. The first factor introduces tolerance with respect to links that take the driver farther from the destination. The following three factors guarantee heterogeneity among paths, since routes that are too similar and overcircuitous are excluded from the generated path set.
• The labeling approach is applied by calculating the shortest path with respect to route attributes such as distance, free-flow time, travel time, and delay, which measures the level of congestion by evaluating the difference between travel times in congested and in free-flow conditions.
• The link elimination approach is modified from the original formulation described by Azevedo et al. (7) by repeating for 10 iterations the following three-step method: (a) computation of the shortest path by considering the travel time, (b) elimination of a link belonging to the current shortest path, and (c) computation of the next-shortest path. Shortest-path links are eliminated if they take the driver farther from the destination or compel the driver to turn from a high hierarchical road to a low hierarchical road.
• The link penalty approach is adapted from the original method proposed by De la Barra et al. (8) by replicating for 15 iterations the following three-step procedure: (a) computation of the shortest path by considering the travel time, (b) penalization of the links belonging to the current shortest path with a factor equal to 5% of the travel time, and (c) computation of the next-shortest path. The penalizing factor is a compromise between a low value for which the same path is identified repeatedly and a high value for which longer paths are generated before routes that are more similar to the shortest path.
• Two simulation approaches are implemented by computing the shortest path for each draw of impedances of the links belonging to the network. The two approaches exploit the same procedure to draw impedances from a truncated normal distribution characterized by the following parameters: (a) mean equal to the travel time, (b) variance equal to a percentage of the mean, (c) left truncation limit equal to the free-flow time, and (d) right truncation limit equal to the travel time calculated for a minimum speed assumed equal to 10 km/h. The first simulation approach sets the variance equal to 20% of the mean and extracts 25 draws. The second simulation approach defines the variance equal to the mean and extracts 35 draws.
For all algorithms based on the shortest-path search, the number of iterations or draws is determined by the asymptotically decreasing ability of each technique to generate unique routes with an increasing number of repetitions, and the postprocessing step consists of the elimination of duplicated paths.
The generated path sets are compared with respect to the observed routes, and the resulting coverage measures the goodness of fit of each method. Considering the number of reproduced routes and the different nature of each technique and looking for consistency of path sets with the observed behavior, two different choice sets are built considering the same reproduced observations: the first contains all the paths generated by the approaches relying on the shortest-path search, and the second consists of all the paths generated by the branch-and-bound algorithm.
Six route choice models are estimated and compared within the different choice sets: multinomial logit (MNL), C-logit, path-size logit (PSL), generalized-nested logit (GNL), cross-nested logit (CNL), and link-nested logit (LNL).
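The branching rule with the parameter values listed above for the branch-and-bound algorithm can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it applies only the directional, temporal, and movement constraints (Equations 1, 2, and 5), replaces the loop and similarity constraints with a plain guard against revisiting a node, and compares travel times with the minimum found so far at each node. The names `enumerate_paths`, `coords`, `links`, and `left_turns`, as well as the toy network, are hypothetical.

```python
import math
from collections import deque


def enumerate_paths(coords, links, origin, dest, left_turns=frozenset(),
                    dist_factor=1.10, time_factor=1.50, max_left_turns=4):
    """Level-by-level connection tree from origin to dest.

    coords: node -> (x, y); links: (i, j) -> travel time;
    left_turns: set of (i, j, k) node triples counted as left turns.
    """
    def dist_to_dest(node):
        return math.dist(coords[node], coords[dest])

    successors = {}
    for i, j in links:
        successors.setdefault(i, []).append(j)

    best_time = {origin: 0.0}              # minimum time found so far per node
    queue = deque([((origin,), 0.0, 0)])   # (nodes, travel time, left turns)
    found = []
    while queue:                           # FIFO order visits the tree by level
        nodes, t, lt = queue.popleft()
        x = nodes[-1]
        if x == dest:
            found.append((nodes, t))
            continue
        for y in successors.get(x, []):
            if y in nodes:                 # simplification: no node revisits
                continue
            if dist_to_dest(y) > dist_factor * dist_to_dest(x):
                continue                   # directional constraint (Eq. 1)
            t_new = t + links[(x, y)]
            if t_new > time_factor * best_time.get(y, t_new):
                continue                   # temporal constraint (Eq. 2)
            turn = len(nodes) >= 2 and (nodes[-2], x, y) in left_turns
            lt_new = lt + int(turn)
            if lt_new > max_left_turns:
                continue                   # movement constraint (Eq. 5)
            best_time[y] = min(best_time.get(y, t_new), t_new)
            queue.append((nodes + (y,), t_new, lt_new))
    return found


# Hypothetical toy network: node c lies behind the origin, away from d.
coords = {"o": (0, 0), "a": (1, 1), "b": (1, -1), "c": (-1, 0), "d": (2, 0)}
links = {("o", "a"): 1.0, ("o", "b"): 1.0, ("o", "c"): 1.0, ("a", "d"): 1.0,
         ("a", "b"): 1.0, ("b", "d"): 1.5, ("c", "d"): 10.0}
routes = enumerate_paths(coords, links, "o", "d")
```

In this toy network the link toward c is rejected by the directional constraint and the segment o, a, b is rejected by the temporal constraint, so two routes reach the destination. A single first-in, first-out queue stands in for the two level queues described in the text, since it processes a tree level completely before the next one.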
RESULTS OF PATH GENERATION ALGORITHMS
The application of the described methodology allows comparison of coverage results from path sets generated with different methods and model performance within different choice sets.
Coverage Measures of Path Generation Algorithms
The database of the Web-based survey responses contains 276 observations. Incomplete observations, presenting either missing values in the attitudinal section or any incorrectly coded route in the path-recording section, are excluded from consideration. Of the 236 observations entering the clean data set, 90% contain information not only about the chosen route but also about the alternative routes considered for reaching the workplace. A total of 236 actual chosen routes and 339 possible alternatives, covering 182 different origin–destination pairs, constitute the comparison term for measuring the coverage of the path generation algorithms described in the methodological section. Table 1 presents coverage results according to different overlap thresholds varying from complete replication to the reproduction of 70% of the collected routes.

TABLE 1 Coverage Results of Applied Algorithms (coverage in percent, by overlap threshold)

Algorithm                                   100%     90%     80%     70%
Chosen routes
  Labeling approach (least length)         26.69   26.69   30.93   34.32
  Labeling approach (least free flow time) 26.27   26.27   27.54   29.24
  Labeling approach (least travel time)    17.80   17.80   18.22   22.46
  Labeling approach (least delay)          21.19   21.19   22.88   23.73
  Link elimination approach                58.47   58.47   69.92   81.78
  Link penalty approach                    53.81   53.81   62.29   68.22
  Simulation with small variance           49.15   49.15   54.24   59.32
  Simulation with large variance           61.44   61.86   71.19   81.36
  Branch-and-bound algorithm               91.10   91.53   96.61   97.88
Alternative routes
  Labeling approach (least length)         13.27   13.57   22.42   26.25
  Labeling approach (least free flow time) 21.53   21.53   22.42   30.68
  Labeling approach (least travel time)    15.93   15.93   16.22   19.17
  Labeling approach (least delay)          17.40   17.40   17.99   20.35
  Link elimination approach                54.87   54.87   66.37   76.11
  Link penalty approach                    43.95   43.95   51.62   61.65
  Simulation with small variance           42.77   42.77   48.08   61.06
  Simulation with large variance           55.16   55.16   64.90   76.11
  Branch-and-bound algorithm               82.60   82.60   89.38   95.58

Each of the single labels performs weakly, which suggests that the actual behavior of habitual drivers does not usually correspond to the shortest-path selection. The analysis of the combined effect of the four labels shows that only 40% of the chosen routes are replicated, whereas almost 45% are reproduced with an 80% overlap threshold. The link elimination approach duplicates almost 60% and covers almost 70% of the observed routes with an 80% overlap threshold. It also outperforms the link penalty technique and the simulation technique with a small variance for the probability distribution. The simulation method with large variance surpasses the thresholds of 60% of the replicated routes and 70% of at least 80% overlapping routes. The branch-and-bound algorithm largely outperforms each single generation algorithm by replicating over 90% and reproducing over 96% of the observed routes. The branch-and-bound algorithm also slightly outperforms the path set resulting from the combination of all
the other methods since the coverage of the merged path set duplicates 87% of the routes and covers 95% with an 80% overlap threshold. Results show the tendency of all algorithms to perform better with respect to the actual choices than to the possible alternatives considered by the respondents. More than 700 generated routes are completely inconsistent with the observed behavior, presenting null overlap, mainly among the alternatives rather than among the chosen paths and mainly for shortest-path-based techniques rather than for the branch-and-bound algorithm.
Behavioral consistency constitutes the most desirable property of the generation techniques, but guidelines for selecting the preferred method should also consider the trade-off with computational performance. Table 2 shows the computational costs for the techniques applied to this case study and allows examination of the trade-off between implementation time and consistency. For each origin–destination pair, minimizing each label requires time for shortest-path calculation. The link elimination and link penalty approaches demand longer times for rewriting the network by removing one link or updating costs at each iteration. The simulation approaches require additional time for drawing random travel times before the shortest path is calculated. The branch-and-bound method exploits partial path subsegments calculated for different origin–destination pairs, with consequent reduction of computational costs while additional origin–destination pairs are processed.

TABLE 2 Computational Performance of Applied Algorithms

Algorithm                            Computational Time (h)   Number of Unique Routes   Consistency Index
Labeling approach (length)                     1.5                      182                   53.54
Labeling approach (free flow time)             1.5                      182                   49.36
Labeling approach (travel time)                1.5                      182                   43.34
Labeling approach (delay)                      1.5                      182                   44.46
Link elimination approach                     36                        958                   87.16
Link penalty approach                         54                       1164                   81.29
Simulation with small variance                42                       1097                   75.49
Simulation with large variance                58                       3305                   88.12
Branch-and-bound method                       40                       2038                   97.91
NOTE: Computations performed using Borland Pascal and Microsoft Excel XP on an Intel Pentium IV 3.06 GHz with 512 MB RAM running Windows XP Home.

The labeling approach shows excessive inconsistency with respect to the observed behavior and provides limited increase in coverage with additional labels, and consequently further implementation of the method appears inadvisable. The link penalization, link elimination, and simulation techniques present asymptotical behavior with respect to the ability to produce unique routes, and consequently additional iterations give reduced gain in coverage. The branch-and-bound technique gives excellent results in terms of coverage, has computational costs similar to those of the techniques it largely outperforms, and shows a good trade-off between behavioral consistency and implementation time. For example, with time comparable with that of the simulation method with small variance, the increase of about 40 percentage points in replicated routes indicates that the proposed algorithm performs excellently.

Choice Set Composition for Model Estimation
Reproduction of the actual chosen route indicates the consistency of each algorithm with respect to the observed behavior of each respondent. Inconsistent observations for which path enumeration methods fail to reproduce the actual choice are excluded from choice sets prepared for model estimation. To visualize the consistency of the applied algorithms regarding the observed behavior, Figure 2 represents the distribution of coverage over the cumulative percentage of observations.

FIGURE 2 Distribution of coverage over 236 observations (x-axis: cumulative percentage of the observations; y-axis: overlap threshold; series: branch & bound, simulation with large variance, simulation with small variance, link elimination, link penalty, labeling approach).

The ideal algorithm would replicate all the observed routes and draw a horizontal line at the 100% overlap threshold. The area below
each line representing the distribution of the coverage measures the consistency of each algorithm with respect to the rectangular area that the ideal algorithm would delimit. The consistency of the applied methods varies from 67.2% for the labeling approach to 97.9% for the branch-and-bound method.
With the objective of constructing choice sets for model estimation, the 80% overlap threshold identifies the observations to be included for each algorithm. With the objective of increasing the reliability of choice set comparison by considering the same observations and a high number of observations, two choice sets are constructed. The first choice set merges the path sets generated with the algorithms based on the shortest-path search, since each single algorithm reproduces a limited number of observations. The second choice set corresponds to the path set obtained with the branch-and-bound technique. Both choice sets account for the same 223 observations and are consistent with the same observed behavior.
Figure 3 illustrates differences between the two choice sets in terms of number of alternatives and number of links for each observation. For the choice set obtained by merging paths generated with different algorithms, the median size is 32 routes, more than one quarter of the observations contain at least 40 paths, and the maximum number of alternatives reaches 55. For the choice set generated with the branch-and-bound technique, the median size is 17 routes, only 6% of the observations consist of 40 paths or more, and the maximum number of alternatives is 44. Both choice sets include a high number of alternatives and presumably contain routes that drivers would not consider. Considering the number of links for each observation with respect to the merged path set, the branch-and-bound path set presents a higher ratio between links and routes. This finding indicates that the paths share fewer links and are most likely to be more heterogeneous.
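The link between the consistency index of Equation 8 and the area under each coverage distribution can be checked with a small Python sketch. The function names and the overlap values are hypothetical, and overlaps are assumed to be expressed as fractions, so that the 100% overlap of the ideal algorithm corresponds to `o_max = 1.0`.

```python
import math


def consistency_index(best_overlaps, o_max=1.0):
    """CI_r (Eq. 8): total best overlap relative to the ideal total N * O_max."""
    return sum(best_overlaps) / (len(best_overlaps) * o_max)


def area_under_distribution(best_overlaps):
    """Area under the sorted overlap curve, each observation having width 1/N."""
    ordered = sorted(best_overlaps, reverse=True)
    return sum(o * (1.0 / len(ordered)) for o in ordered)


# Hypothetical best overlaps for four observations.
best = [1.0, 1.0, 0.5, 0.5]
ci = consistency_index(best)
area = area_under_distribution(best)
same = math.isclose(ci, area)  # both equal the mean best overlap
```

Under these assumptions the index is simply the mean best overlap, which is why the area below each curve, taken relative to the unit rectangle of the ideal algorithm, gives the same value.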
Route Choice Model Estimation
Data sets for model estimation account for
• Level-of-service variables, such as distance, free-flow time, and travel time;
• Landmark dummy variables, equal to 1 if the route crosses the landmark and zero otherwise; and
• Behavioral variables, measured at the individual level by applying factor analysis to the behavioral indicators (16).
In contrast to the MNL model, different model specifications account for similarity among alternatives and require the estimation of additional parameters. The C-logit and PSL models maintain the logit structure by including a correction term within the deterministic part of the utility function. The following formulations for commonality factor and path size are applied for C-logit and PSL estimation (18, 19):
$$CF_k = \beta_0 \ln\left[1 + \sum_{\substack{l \in C_n \\ l \ne k}} \left(\frac{L_{kl}}{\sqrt{L_k L_l}}\right)\left(\frac{L_k - L_{kl}}{L_l - L_{kl}}\right)\right] \tag{9}$$
where
CFk = commonality factor of route k,
Lk = length of route k,
Ll = length of each route l in choice set Cn,
Lkl = common length between routes k and l, and
β0 = 1.
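A short Python sketch can make the commonality factor concrete. It follows Equation 9 as written above, with routes represented as sets of link identifiers; the function name, the data layout, and the toy lengths are assumptions for illustration. The sketch also assumes that no route is entirely contained in route k, since the second ratio is undefined when Ll equals Lkl.

```python
import math


def commonality_factor(k, routes, length, beta0=1.0):
    """CF_k (Eq. 9); routes maps a route id to its set of link ids."""
    def route_length(r):
        return sum(length[a] for a in routes[r])

    L_k = route_length(k)
    total = 0.0
    for l in routes:
        if l == k:
            continue
        L_l = route_length(l)
        L_kl = sum(length[a] for a in routes[k] & routes[l])  # common length
        if L_l == L_kl:
            raise ValueError("route %r is fully contained in route %r" % (l, k))
        total += (L_kl / math.sqrt(L_k * L_l)) * ((L_k - L_kl) / (L_l - L_kl))
    return beta0 * math.log(1.0 + total)


# Hypothetical choice set: r1 and r2 share link a, r3 is disjoint.
length = {"a": 2.0, "b": 3.0, "c": 5.0, "d": 4.0}
routes = {"r1": {"a", "b"}, "r2": {"a", "c"}, "r3": {"d"}}
cf_r1 = commonality_factor("r1", routes, length)
cf_r3 = commonality_factor("r3", routes, length)  # no shared links
```

A route that shares no link with any other alternative gets a commonality factor of zero, so its utility is left uncorrected.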
Prato and Bekhor
25
FIGURE 3 Distribution of number of alternatives in choice sets (x-axis: cumulative percentage of the observations; y-axis: number of unique routes; series: shortest path–based methods and branch-and-bound algorithm).

$$PS_k = \sum_{a \in \Gamma_k} \frac{L_a}{L_k} \, \frac{1}{\sum_{l \in C_n} \left(\frac{L_k}{L_l}\right)^{\gamma} \delta_{al}} \qquad (10)$$

where

$PS_k$ = path size of route $k$,
$\Gamma_k$ = set of links of route $k$,
$L_a$ = length of link $a$,
$\delta_{al}$ = link–path incidence dummy (equal to 1 if route $l$ uses link $a$, and zero otherwise), and
$\gamma$ = positive parameter.
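A minimal sketch of the path size computation in Equation 10 follows. It assumes routes are lists of link identifiers without repeated links; the function name and the toy data are hypothetical.

```python
def path_sizes(routes, link_len, gamma=1.0):
    """Path size of each route, following the form of Equation 10."""
    lengths = [sum(link_len[a] for a in r) for r in routes]
    link_sets = [set(r) for r in routes]
    sizes = []
    for k, route_k in enumerate(routes):
        total = 0.0
        for a in route_k:
            # Sum over all routes l in the choice set that use link a
            # (route k itself is included and contributes 1).
            denom = sum(
                (lengths[k] / lengths[l]) ** gamma
                for l in range(len(routes))
                if a in link_sets[l]
            )
            total += (link_len[a] / lengths[k]) / denom
        sizes.append(total)
    return sizes


# Toy data (hypothetical): three routes, two of which share link "a".
link_len = {"a": 2.0, "b": 3.0, "c": 5.0, "d": 4.0}
routes = [["a", "b"], ["a", "c"], ["d"]]
ps = path_sizes(routes, link_len, gamma=1.0)
ps_gamma0 = path_sizes(routes, link_len, gamma=0.0)
```

A route that shares no link with other routes has a path size of 1. With γ = 0 the length ratios drop out and each shared link is simply divided by the number of routes that use it.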
Generalized nested structures relate the model coefficients to the network topology for adaptation to route choice. The GNL, CNL, and LNL models consider the following functional relationship for the inclusion coefficients (20):

$$\alpha_{mk} = \frac{L_m}{L_k} \, \delta_{mk} \qquad (11)$$

where

$\alpha_{mk}$ = inclusion coefficients (with $0 \le \alpha_{mk} \le 1$),
$L_m$ = length of link $m$,
$L_k$ = length of route $k$, and
$\delta_{mk}$ = link–path incidence dummy (equal to 1 if route $k$ uses link $m$ and zero otherwise).
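The inclusion coefficients of Equation 11 can be computed with a few lines, sketched here under the assumption that each link defines a nest and that routes are lists of link identifiers without repeated links. The function name and the toy data are hypothetical.

```python
def inclusion_coefficients(routes, link_len):
    """Inclusion coefficients alpha[(m, k)] following Equation 11.

    Pairs (m, k) that are absent from the result correspond to
    a link m not used by route k, that is, a coefficient of zero.
    """
    alpha = {}
    for k, route in enumerate(routes):
        route_length = sum(link_len[m] for m in route)
        for m in route:
            alpha[(m, k)] = link_len[m] / route_length
    return alpha


# Toy data (hypothetical): three routes, two of which share link "a".
link_len = {"a": 2.0, "b": 3.0, "c": 5.0, "d": 4.0}
routes = [["a", "b"], ["a", "c"], ["d"]]
alpha = inclusion_coefficients(routes, link_len)
row_sums = [sum(alpha[(m, k)] for m in route) for k, route in enumerate(routes)]
```

Because the link lengths of a route add up to the route length, the coefficients of each route sum to 1 across its links.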
For GNL estimation, nesting coefficients are considered unique for each nest and are expressed with the following parameterized formulation (21):
$$\mu_m = \left(1 - \frac{\sum_{l \in C_n} \alpha_{ml}}{\sum_{l \in C_n} \delta_{ml}}\right)^{\gamma} \qquad (12)$$
where $\mu_m$ are the nesting coefficients (with $0 \le \mu_m \le 1$) and $\gamma$ is a parameter to be estimated. CNL is a particular case of the GNL model, in which all the links share a common nesting coefficient to be estimated. LNL is a particular case of the CNL model, in which the common nesting coefficient approaches zero and is not estimated (22).

Tables 3 and 4 illustrate the best estimates for route choice models considering both choice sets. The same interpretation of the results is possible for both choice sets. Parameter estimates suggest that choices are influenced by experience, since habit and familiarity negatively influence the utility whereas landmarks positively affect drivers' behavior. The same conclusions about the goodness of fit across models can be reached for both choice sets. The LNL model largely outperforms all other models, and the PSL and C-logit models also produce very good results. The GNL and CNL models tend to collapse to MNL and present worse results than MNL.

Statistical comparison across data sets is not possible, because the covariance across the data sets, which are correlated since they include similar routes, cannot be measured. The better likelihood ratio index obtained for the branch-and-bound choice set is explained by the lower number of alternatives in that choice set, which leads to higher probabilities of choosing the observed route and consequently to higher likelihood values. Since the choice sets include the same observations and the same conclusions are drawn in terms of result interpretation and model comparison, the analysis of the choice sets suggests that the branch-and-bound algorithm performs better from the behavioral perspective by reproducing the actual chosen routes while including more heterogeneous alternatives in smaller choice sets.

TABLE 3 Model Estimation with Merged Choice Set

| Variable | MNL Estimate | MNL t-Stat. | C-Logit Estimate | C-Logit t-Stat. | PSL (γ = 9) Estimate | PSL (γ = 9) t-Stat. | GNL Estimate | GNL t-Stat. | CNL Estimate | CNL t-Stat. | LNL Estimate | LNL t-Stat. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Distance | −0.882 | −3.58 | −1.025 | −4.36 | −1.028 | −4.41 | −0.803 | −3.84 | −1.215 | −2.91 | −0.929 | −2.97 |
| Travel time | −0.422 | −6.29 | −0.412 | −6.44 | −0.416 | −6.46 | −0.305 | −3.72 | −0.572 | −3.38 | −0.234 | −2.83 |
| Sabotino square dummy | 2.692 | 6.10 | 2.544 | 5.85 | 2.156 | 5.17 | 2.154 | 4.53 | 3.837 | 3.53 | 2.357 | 4.37 |
| Adriano square dummy | 1.113 | 2.51 | 1.113 | 2.46 | 1.148 | 2.58 | 0.816 | 2.35 | 1.418 | 1.99 | 1.214 | 2.12 |
| Sommeiller bridge dummy | 4.083 | 9.06 | 4.040 | 8.69 | 3.659 | 8.47 | 3.211 | 5.18 | 5.640 | 3.83 | 3.684 | 6.64 |
| Dante bridge dummy | 3.256 | 4.97 | 3.335 | 5.06 | 3.054 | 4.88 | 2.689 | 4.16 | 4.509 | 3.31 | 3.279 | 3.91 |
| Rivoli square dummy | 1.061 | 2.89 | 1.449 | 3.92 | 1.214 | 3.45 | 0.843 | 2.72 | 1.490 | 2.32 | 0.968 | 1.92 |
| Bernini square dummy | −0.648 | −1.34 | −0.379 | −0.81 | −0.545 | −1.14 | −0.528 | −1.44 | −0.805 | −1.07 | −0.653 | −0.90 |
| Acaja square dummy | −0.527 | −1.31 | −0.195 | −0.50 | −0.309 | −0.79 | −0.279 | −0.92 | −0.467 | −0.77 | −0.432 | −0.76 |
| Habit | −0.482 | −2.73 | −0.514 | −2.97 | −0.530 | −3.14 | −0.264 | −2.33 | −0.996 | −1.88 | −0.478 | −2.94 |
| Spatial ability | 0.225 | 1.23 | 0.239 | 1.34 | 0.241 | 1.36 | 0.077 | 0.90 | 0.222 | 0.61 | 0.082 | 0.50 |
| Familiarity | −0.154 | −0.97 | −0.163 | −1.09 | −0.199 | −1.34 | −0.176 | −2.75 | −0.310 | −0.93 | −0.649 | −4.46 |
| Commonality factor | | | −1.197 | −4.45 | | | | | | | | |
| Ln of path size | | | | | 4.245 | 5.89 | | | | | | |
| Common nesting coefficient | | | | | | | | | 0.986 | 3.49 | | |
| Gamma (unique nesting coefficient) | | | | | | | 4.245 | 1.35 | | | | |
| Log likelihood at estimates | −592.30 | | −583.46 | | −575.45 | | −597.19 | | −596.49 | | −442.58 | |
| Likelihood ratio index | 0.175 | | 0.188 | | 0.199 | | 0.169 | | 0.170 | | 0.384 | |

N = 223; log likelihood for all coefficients at zero is −718.35.

TABLE 4 Model Estimation with Branch-and-Bound Choice Set

| Variable | MNL Estimate | MNL t-Stat. | C-Logit Estimate | C-Logit t-Stat. | PSL (γ = 9) Estimate | PSL (γ = 9) t-Stat. | GNL Estimate | GNL t-Stat. | CNL Estimate | CNL t-Stat. | LNL Estimate | LNL t-Stat. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Distance | −1.544 | −6.29 | −1.271 | −5.13 | −1.033 | −4.04 | −1.617 | −5.52 | −1.548 | −4.24 | −1.299 | −4.46 |
| Travel time | −0.355 | −5.29 | −0.335 | −5.13 | −0.332 | −5.18 | −0.302 | −4.06 | −0.347 | −3.63 | −0.160 | −2.07 |
| Sabotino square dummy | 1.930 | 4.33 | 1.746 | 3.99 | 1.465 | 3.34 | 1.775 | 3.66 | 1.854 | 3.23 | 1.632 | 3.17 |
| Adriano square dummy | 1.138 | 2.56 | 1.169 | 2.69 | 1.159 | 2.71 | 1.070 | 2.30 | 1.165 | 2.16 | 1.112 | 1.82 |
| Sommeiller bridge dummy | 3.089 | 8.00 | 2.903 | 7.47 | 2.720 | 7.08 | 2.748 | 5.76 | 2.979 | 4.20 | 2.478 | 4.54 |
| Dante bridge dummy | 1.846 | 3.27 | 1.936 | 3.43 | 1.903 | 3.41 | 1.630 | 2.83 | 1.592 | 2.44 | 2.049 | 2.49 |
| Rivoli square dummy | 0.345 | 1.13 | 0.745 | 2.57 | 0.555 | 1.99 | 0.141 | 0.43 | 0.103 | 0.23 | 0.338 | 0.82 |
| Bernini square dummy | −1.006 | −2.02 | −0.619 | −1.31 | −0.686 | −1.48 | −1.458 | −2.59 | −1.434 | −2.05 | −0.883 | −1.34 |
| Acaja square dummy | −0.605 | −1.45 | −0.250 | −0.64 | −0.288 | −0.75 | −0.830 | −1.86 | −0.811 | −1.54 | −0.500 | −0.91 |
| Habit | −0.319 | −2.34 | −0.343 | −2.53 | −0.338 | −2.51 | −0.294 | −2.12 | −0.445 | −1.76 | −0.296 | −2.10 |
| Spatial ability | 0.270 | 1.86 | 0.272 | 1.87 | 0.297 | 2.05 | 0.219 | 1.50 | 0.147 | 0.59 | 0.212 | 1.28 |
| Familiarity | −0.209 | −1.76 | −0.227 | −1.95 | −0.309 | −2.61 | −0.114 | −0.94 | −0.165 | −0.77 | −0.352 | −2.88 |
| Commonality factor | | | −1.218 | −2.88 | | | | | | | | |
| Ln of path size | | | | | 3.455 | 4.33 | | | | | | |
| Common nesting coefficient | | | | | | | | | 0.941 | 4.63 | | |
| Gamma (unique nesting coefficient) | | | | | | | 0.091 | 0.06 | | | | |
| Log likelihood at estimates | −466.27 | | −462.28 | | −457.19 | | −470.79 | | −463.22 | | −364.89 | |
| Likelihood ratio index | 0.223 | | 0.230 | | 0.238 | | 0.215 | | 0.228 | | 0.392 | |

N = 223; log likelihood for all coefficients at zero is −600.07.
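The likelihood ratio indices reported in Tables 3 and 4 can be cross-checked from the reported log likelihoods. The sketch below assumes the usual definition of the index, 1 − LL(estimates) / LL(zero), which is not stated explicitly in this section; the dictionary names are illustrative only.

```python
# Log likelihood with all coefficients at zero, as reported in the table notes.
LL_ZERO = {"merged": -718.35, "branch_and_bound": -600.07}

# Log likelihood at estimates, as reported in Tables 3 and 4.
LL_EST = {
    "merged": {
        "MNL": -592.30, "C-Logit": -583.46, "PSL": -575.45,
        "GNL": -597.19, "CNL": -596.49, "LNL": -442.58,
    },
    "branch_and_bound": {
        "MNL": -466.27, "C-Logit": -462.28, "PSL": -457.19,
        "GNL": -470.79, "CNL": -463.22, "LNL": -364.89,
    },
}


def likelihood_ratio_index(ll_estimates, ll_zero):
    """Assumed definition: one minus the ratio of the two log likelihoods."""
    return 1.0 - ll_estimates / ll_zero


rho = {
    choice_set: {
        model: round(likelihood_ratio_index(ll, LL_ZERO[choice_set]), 3)
        for model, ll in models.items()
    }
    for choice_set, models in LL_EST.items()
}
```

Under this assumption the computed values agree with the tabulated indices to three decimals, for example 0.175 (MNL) and 0.384 (LNL) for the merged choice set and 0.223 (MNL) and 0.392 (LNL) for the branch-and-bound choice set.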
SUMMARY AND CONCLUSIONS

A path enumeration algorithm based on the branch-and-bound technique is proposed. The method is applicable to any urban network, since its implementation requires only existing resources and its computational speed depends more on the tree depth than on the tree width. Implementation in a real case study shows the applicability of the method and allows the performance of the algorithm to be evaluated.

Labeling, link elimination, link penalty, simulation, and branch-and-bound approaches are applied to an urban network. A comparison with the actual routes chosen by individuals driving habitually from home to work shows that the proposed algorithm comes significantly closer to the ideal algorithm in terms of behavioral efficiency. In parallel, the designed technique shows a good trade-off between computational cost and efficiency compared with the methods that show only a small performance increase with increasing implementation time.

Construction of two choice sets characterized by similar behavioral efficiency (one consisting of a path set resulting from the combination of methods relying on the shortest-path search and the other resulting from the application of the branch-and-bound algorithm) enables evaluation of results and comparison of model estimation. The two choice sets produce estimates that are qualitatively comparable and suggest similar conclusions in terms of model comparison.

Results from model estimation suggest the need for further investigation into generalized nested structures, since the GNL and CNL models tend to collapse to the MNL model and the LNL model largely outperforms all other route choice models. Another area for further investigation is the route choice mechanism, since the influence of habit and landmarks on the utility suggests that distance and travel time are not the only elements considered in choosing a route.
The parameters for defining the constraints in the bounding rule were arbitrarily defined in this study on the basis of common sense. Further research is needed to verify the sensitivity of the branch-and-bound algorithm to the constraints' parameters and to measure the effectiveness of the proposed method with respect to different data sets.
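The kind of sensitivity check called for above can be tried on a toy network with a sketch such as the following. It is not the authors' algorithm: it assumes a depth-first tree search with only two hypothetical constraints, namely loop-free paths and a cost bound equal to `max_ratio` times the shortest-path cost. The graph, the function names, and the parameter values are all illustrative.

```python
import heapq


def shortest_cost(graph, origin, destination):
    """Dijkstra shortest-path cost on a dict-of-dicts graph."""
    best = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == destination:
            return cost
        if cost > best.get(node, float("inf")):
            continue
        for nxt, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return float("inf")


def enumerate_paths(graph, origin, destination, max_ratio=1.5):
    """Enumerate loop-free paths whose cost stays within the assumed bound."""
    bound = max_ratio * shortest_cost(graph, origin, destination)
    found = []

    def branch(node, path, cost):
        if node == destination:
            found.append((list(path), cost))
            return
        for nxt, weight in graph.get(node, {}).items():
            if nxt in path:            # hypothetical loop constraint
                continue
            if cost + weight > bound:  # hypothetical cost bound: prune branch
                continue
            path.append(nxt)
            branch(nxt, path, cost + weight)
            path.pop()

    branch(origin, [origin], 0.0)
    return found


# Toy network (hypothetical): node -> {successor: link cost}.
graph = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"C": 1.0, "D": 5.0},
    "C": {"D": 1.0},
    "D": {},
}
tight = enumerate_paths(graph, "A", "D", max_ratio=1.8)
loose = enumerate_paths(graph, "A", "D", max_ratio=2.0)
```

On this toy network the tighter bound keeps two of the three loop-free paths and the looser bound keeps all three, which is the type of dependence on the constraint parameters that a sensitivity analysis would quantify on real data.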
ACKNOWLEDGMENTS

The authors are grateful to the anonymous reviewers, who provided many insightful comments and corrections to improve this paper.
REFERENCES

1. Cascetta, E., F. Russo, and A. Vitetta. Stochastic User Equilibrium Assignment with Explicit Path Enumeration: Comparison of Models and Algorithms. Proc., 8th IFAC Symposium on Transportation Systems, Technical University of Crete, Chania, 1997, pp. 1078–1084.
2. Prashker, J. N., and S. Bekhor. Route Choice Models Used in the Stochastic User Equilibrium Problem: A Review. Transport Reviews, Vol. 24, 2004, pp. 437–463.
Transportation Research Record 1985
3. Dijkstra, E. W. Note on Two Problems in Connection with Graphs. Numerische Mathematik, Vol. 1, 1959, pp. 269–271.
4. Van der Zijpp, N. J., and S. Fiorenzo-Catalano. Path Enumeration by Finding the Constrained K-Shortest Paths. Transportation Research, Vol. 39B, 2005, pp. 545–563.
5. Ben-Akiva, M., M. J. Bergman, A. J. Daly, and R. Ramaswamy. Modeling Inter-Urban Route Choice Behaviour. Proc., Ninth International Symposium on Transportation and Traffic Theory, VNU Science Press, Utrecht, Netherlands, 1984, pp. 299–330.
6. Dial, R. B. Bicriterion Traffic Equilibrium: T2 Model, Algorithm, and Software Overview. In Transportation Research Record: Journal of the Transportation Research Board, No. 1725, Transportation Research Board of the National Academies, Washington, D.C., 2000, pp. 54–62.
7. Azevedo, J. A., M. E. O. Santos Costa, J. J. E. R. Silvestre Madeira, and E. Q. Vieira Martins. An Algorithm for the Ranking of Shortest Paths. European Journal of Operational Research, Vol. 69, 1993, pp. 97–106.
8. De la Barra, T., B. Perez, and J. Anez. Multidimensional Path Search and Assignment. Proc., 21st PTRC Summer Annual Meeting, PTRC Education and Research Services Ltd., London, 1993.
9. Park, D., and L. R. Rilett. Identifying Multiple and Reasonable Paths in Transportation Networks: A Heuristic Approach. In Transportation Research Record 1607, TRB, National Research Council, Washington, D.C., 1997, pp. 31–37.
10. Scott, K., G. Pabon-Jimenez, and D. Bernstein. Finding Alternatives to the "Best" Path. Presented at 76th Annual Meeting of the Transportation Research Board, Washington, D.C., 1997.
11. Sheffi, Y., and W. B. Powell. An Algorithm for the Equilibrium Assignment Problem with Random Link Times. Networks, Vol. 12, 1982, pp. 191–207.
12. Fiorenzo-Catalano, S., and N. J. Van der Zijpp. A Forecasting Model for Inland Navigation Based on Route Enumeration. Proc., European Transport Conference, PTRC Education and Research Services Ltd., London, 2001, pp. 1–11.
13. Bekhor, S., M. E. Ben-Akiva, and S. Ramming. Route Choice: Choice Set Generation and Probabilistic Choice Models. Proc., 4th TRISTAN Conference, Azores, Portugal, 2001.
14. Friedrich, M., I. Hofsäss, and S. Wekeck. Timetable-Based Transit Assignment Using Branch and Bound Technique. In Transportation Research Record: Journal of the Transportation Research Board, No. 1752, TRB, National Research Council, Washington, D.C., 2001, pp. 100–107.
15. Hoogendoorn-Lanser, S. Modelling Travel Behaviour in Multi-modal Networks. TRAIL Research School, Netherlands, 2005.
16. Prato, C. G., S. Bekhor, and C. Pronello. Methodology for Exploratory Analysis of Latent Factors Influencing Drivers' Behavior. In Transportation Research Record: Journal of the Transportation Research Board, No. 1926, Transportation Research Board of the National Academies, Washington, D.C., 2005, pp. 115–125.
17. Prato, C. G. Latent Factors and Route Choice Behaviour. Ph.D. thesis. Turin Polytechnic, Italy, 2005.
18. Cascetta, E., A. Nuzzolo, F. Russo, and A. Vitetta. A Modified Logit Route Choice Model Overcoming Path Overlapping Problems: Specification and Some Calibration Results for Interurban Networks. Proc., Thirteenth International Symposium on Transportation and Traffic Theory, Pergamon, Lyon, France, 1996, pp. 697–711.
19. Ramming, S. Network Knowledge and Route Choice. Ph.D. thesis. Massachusetts Institute of Technology, Cambridge, Mass., 2001.
20. Prashker, J. N., and S. Bekhor. Investigation of Stochastic Network Loading Procedures. In Transportation Research Record 1645, TRB, National Research Council, Washington, D.C., 1998, pp. 94–102.
21. Bekhor, S., and J. N. Prashker. Stochastic User Equilibrium Formulation for Generalized Nested Logit Model. In Transportation Research Record: Journal of the Transportation Research Board, No. 1752, Transportation Research Board of the National Academies, Washington, D.C., 2001, pp. 84–90.
22. Vovsha, P., and S. Bekhor. Link-Nested Logit Model of Route Choice: Overcoming the Route Overlapping Problem. In Transportation Research Record 1645, TRB, National Research Council, Washington, D.C., 1998, pp. 133–142.

The Traveler Behavior and Values Committee sponsored publication of this paper.