
Transportation Planning and Technology, June 2003 Vol. 26, No. 3, pp. 213–238

TRIP DISTRIBUTION MODELLING USING FUZZY LOGIC AND A GENETIC ALGORITHM

MILICA KALIĆ a and DUŠAN TEODOROVIĆ b

a Faculty of Transport and Traffic Engineering, University of Belgrade, Vojvode Stepe 305, 11000 Belgrade, Serbia and Montenegro; b The Charles E. Via, Jr. Department of Civil and Environmental Engineering, Virginia Polytechnic Institute and State University, 7054 Haycock Road, Falls Church, VA 22043, USA

(Received 20 September 2001; Revised 10 August 2002; In final form 27 August 2003)

This article examines possibilities for the application of soft computing techniques for the prediction of travel demand. The model, based on fuzzy logic and a genetic algorithm, successfully solves the trip distribution problem. The possibilities of using the proposed model in solving trip generation, modal split and route choice problems have also been indicated. The model has been tested on a real numerical example. Exceptionally good correspondences between estimated and real values of passenger flows have been obtained.

Keywords: Transportation planning; Trip distribution; Soft computing; Fuzzy logic; Genetic algorithm

ISSN 0308-1060 print; ISSN 1029-0354 online. © 2003 Taylor & Francis Ltd. DOI: 10.1080/0308106032000154575

1. INTRODUCTION

The prediction of travel demand constitutes an important part of transportation planning. Every transportation planner-analyst seeks to predict travel demand as accurately as possible. Decisions based on projects that involve travel demand predictions affect the development of any transportation system. Therefore, transportation planners bear an exceptionally heavy responsibility.

According to the traditional view, the transportation planning process consists of four stages: trip generation, trip distribution, modal split and route choice. In the sequential modelling of travel demand, the value of the output variable from one phase represents the value of the input variable in the next phase. Besides the sequential modelling approach there is also simultaneous modelling, which refers to the simultaneous prediction of trip generation, trip distribution and modal split. In this article only one phase of the sequential modelling of travel demand is examined: trip distribution.

This article attempts to explore the possibilities of using more recent techniques, such as soft computing techniques (for example, fuzzy logic, neural networks and genetic algorithms), in the sequential four-stage modelling of travel demand. Therefore, a trip distribution model based on soft computing techniques is presented here. The model has been tested by estimating air passenger flows between particular cities in a region.

This article attempts to provide at least partial answers to the following questions: Can soft computing techniques be used in predicting travel demand? What is the ‘quality’ of the results achieved by these techniques? Are the reasons for continuing to develop models in this area by means of fuzzy logic and genetic algorithms well grounded? What are the advantages and disadvantages of these techniques? Despite the fact that soft computing techniques are widely used, the authors hold the view that their application in transportation planning has not been sufficiently well investigated to date.

The application of fuzzy concepts in solving different kinds of problems is based on two aspects. The first aspect is related to the fact that fuzzy set theory is a suitable mathematical tool for treating subjectivity, ambiguity, uncertainty and imprecision [1–10].
Namely, we often do not have sufficiently precise input data, or the data are based on the subjective feeling of an expert and are most often described in linguistic terms. An expert or a decision maker often makes decisions according to experience, intuition and a subjective estimation of particular parameters. Over the last three decades a number of models based on the concept of fuzziness, which treat subjectivity, uncertainty and imprecision, have been developed.

The second aspect of fuzzy logic relates to problems in which uncertainty, subjectivity or imprecision are not present. In the last few years fuzzy logic has proved to be a good tool for tackling problems involving a mapping of inputs into outputs. Fuzzy systems can map inputs into outputs in the regions in which they are defined [7], without knowing the analytical relations between inputs and outputs. In recent years, significant theoretical results have been achieved in the field of fuzzy logic systems. Wang [11,12], Wang and Mendel [13] and Kosko [4] showed that fuzzy logic systems could be treated as universal approximators. This means that a fuzzy logic system can uniformly approximate any real continuous nonlinear function to an arbitrary degree of accuracy [5]. The idea that fuzzy logic can be applied in transportation planning, or more precisely in predicting travel demand based on the number of trips generated and attracted in the observed zones, came from the fact that fuzzy systems were found to be universal approximators.

This article is organized as follows. The statement of the problem is given in Section 2, while in Section 3 the proposed solution is presented with a description of the model and an explanation of the possible ways of using the fuzzy logic and genetic algorithm techniques. Section 4 presents a numerical example together with the parameters used and the results obtained. The model is then tested on real data. The conclusions based on the correspondence of the results of the model with reality are given in Section 5.

2. STATEMENT OF THE PROBLEM

Trip distribution constitutes the second stage in the traditional transportation planning process. Trip distribution models are used to determine the number of trips between pairs of zones when the number of trips generated and attracted by particular zones is known. Thus, the prediction of trip distribution involves the prediction of flows in a network regardless of a possible transportation mode or travel route. However, a model used to distribute trips to traffic zones can also be applied after the modal split stage. In that case, flows are determined for a particular mode of transportation. A number of models for destination choice developed in the past use theories based on travellers’ decision making processes. The logit model [14, 16] is the most widely used model of this type.
During the last decade a number of trip distribution models have been developed. Carpio and Gomes [17] discussed the trip distribution problem within a game-theoretical framework. Lam and Huang [18] presented a combined trip distribution and assignment model with multiple user classes. In their model, the entropy-type (or gravity-type) trip distribution submodel is combined with the user equilibrium assignment problem for multiclass user transportation networks. Two new algorithms are developed and appropriate computational results are reported. Easa [19] has reviewed the state-of-the-art in urban passenger trip distribution modelling and has discussed preparing input data, selecting a trip distribution model, calibrating the model, validating the model and forecasting. Easa [20] has also reviewed the status of quick-response analysis (simplified techniques, traffic-count-based models, a self-calibrating gravity model, partial matrix techniques and heuristic methods). In addition, the implementation of PC-based quick-response techniques as well as special topics (including geographic information system (GIS) applications, use of census data, freeway trip distribution, pedestrian trip distribution, other trip distribution models and theoretical features) were discussed. Oppenheim [21] developed a user-equilibrium trip distribution/assignment model when analysing a variety of issues concerning the effects of road pricing policies on urban travel demand. Goncalves and Uysseaneto [22] have developed a new gravity-opportunity model for trip distribution and shown that the conventional gravity model is a particular case of this new gravity-opportunity model. A practical application of the model for estimating intermunicipal passenger flows by public transport in Southern Brazil was reported. Roy [23] analysed the method proposed by Goncalves and Uysseaneto [22]. Arasan et al. [24] have made a detailed analysis of home-based trip distribution for three trip purposes and trips made by five different modes, using two types of gravity model formulations. Lam and Huang [25] studied the situation where a prior estimate of an origin-destination matrix is to be updated on the basis of recently acquired traffic counts and presented three methods for this type of problem. The convergence and computational efficiency of the proposed algorithms were also investigated.
Johnston and Ceerla [26] critiqued the policy of the many regional agencies that model travel demand without feeding assigned travel times back to the trip distribution model. They have developed a method that feeds back assigned travel times to the trip distribution stage. Tamin [27] attempted to develop a technique for modelling public transport demand (trip distribution and modal split) using traffic (passenger) count information and other simple zonal-planning data. The model developed represents a combination of the gravity model and the multinomial logit (MNL) model. The approach was tested using public transport data for Bandung (Indonesia). The model was found to provide a reasonably good fit. Mozolin et al. [28] estimated the commuter trip distribution. They compared their model based on multilayer perceptron neural networks with maximum-likelihood doubly-constrained models. They showed that the predictive accuracy of neural network spatial interaction models is inferior to that of maximum-likelihood doubly-constrained models with an exponential function of distance decay.

The majority of the models developed are based on analytical relations through which traffic flows are determined. Unlike statistical methods, artificial neural networks and fuzzy systems estimate the functions without specifying the mathematical model that describes the manner in which output results depend on input data (artificial neural networks and fuzzy systems are often described as ‘models without a model’). The basic research task in this article is to develop a trip distribution model based on soft computing techniques (fuzzy systems, neural networks and genetic algorithms) that makes high-quality predictions. The model developed should be able to make the appropriate prediction without knowing the functional relationships in effect between individual variables. As in other intelligent systems, the ‘intelligent’ trip distribution model should be able to generalize, adapt and learn based on new knowledge and new information. Therefore, this article demonstrates that the set of techniques used to predict travel demand could be broadened by soft computing techniques.

3. PROPOSED SOLUTION TO THE PROBLEM

The trip distribution model is based on fuzzy logic and a genetic algorithm. The model consists of two parts. In the first part, the fuzzy rule base used to determine the number of trips between particular zones is generated. The fuzzy rule base obtained has a corresponding accuracy regarding the fitness between estimated and actual flows. In the second part of the model, in order to increase the degree of accuracy, the initial fuzzy rule base is modified using a genetic algorithm.

3.1. Fuzzy Logic in Travel Demand Prediction

Fuzzy logic has been used in solving various traffic and transportation problems.
A detailed classification and analysis of the results achieved by fuzzy logic in modelling different traffic and transportation processes can be found in Teodorović [8,9] and Teodorović and Vukadinović [10]. Even without being acquainted with the problem of planning and predicting traffic flows, it can logically be assumed that an industrial town with a large population and a substantial income per capita will generate a large number of trips. In the same vein, famous tourist resorts attract a large number of trips. The idea was to apply fuzzy logic with the following type of rules:

If THE NUMBER OF PASSENGERS DEPARTING FROM A TOWN TO TOURIST AIRPORTS is large,
and THE NUMBER OF PASSENGERS FROM INDUSTRIAL TOWNS ARRIVING IN A TOURIST RESORT is large,
then THE NUMBER OF PASSENGERS BETWEEN THE OBSERVED PAIR TOWN-TOURIST RESORT is large.

A global description of the trip distribution model is as follows: according to the number of passengers departing from a town to tourist airports, o, and the number of passengers from industrial towns arriving in a tourist resort, d, the flows between the observed towns and tourist resorts, f, are determined by using fuzzy logic as a universal approximator. In order to make a universal model of trip distribution for a network consisting of industrial towns, tourist resorts and flows between them, the input and output values are normalized.

In order to generate fuzzy rule bases from numerical examples, the procedure proposed by Wang and Mendel [13] was used. First, fuzzy sets are established for all the antecedents and the consequences. At the very beginning, we establish the domain intervals for all input and output variables (Fig. 1). Each domain interval is divided into a prespecified number of overlapping regions. The number of overlapping regions need not be the same for each variable. The lengths of these overlapping regions are usually equal, but not necessarily so. In the next step each overlapping region is labelled and one membership function is assigned to it. Different types of membership function can also be used for different variables.

Let us assume that we have the following set of input-output data pairs: (o(1), d(1); f(1)), (o(2), d(2); f(2)), …, (o(n), d(n); f(n)), where o and d represent the inputs while f represents the output. Values o, d and f belong to intervals [o_min, o_max], [d_min, d_max] and [f_min, f_max], respectively. Each of the intervals is divided into subintervals within which membership functions of fuzzy sets are defined. Membership functions of fuzzy sets can assume different shapes, but for the sake of simplicity triangular shapes are most often used. Fig. 2 displays fuzzy sets for input (o and d) and output variables (f). As seen from Fig. 1, different shapes of membership functions for the input and output variables were used (triangle and Gaussian curves).

FIGURE 1 Fuzzy sets for input and output variables: (a) triangular fuzzy sets; (b) Gaussian curves.

FIGURE 2 Assignment of membership degrees in corresponding fuzzy sets for an input-output vector of data.

Having defined fuzzy sets we proceed to generate fuzzy rules. Let us explain how the first fuzzy rule is made from the first pair of data (o(1), d(1); f(1)) ⇒ (0.368, 0.035; 0.006). For value o(1), found in the interval [o_min, o_max], we determine the fuzzy set in which it has the largest membership degree. Let it be fuzzy set O5 and let o(1) belong to that fuzzy set with the membership degree of 0.75. The same procedure is applied to d(1) and f(1). Let D1 and F2 be the fuzzy sets in which the given values have the greatest membership degree:


\[
\left(o^{(1)}, d^{(1)}; f^{(1)}\right) \Rightarrow \left(o^{(1)} \in O_5 \text{ with } \mu_{O_5} = 0.75,\; d^{(1)} \in D_1 \text{ with } \mu_{D_1} = 0.71,\; f^{(1)} \in F_2 \text{ with } \mu_{F_2} = 0.85\right) \Rightarrow \text{Rule 1}
\]

Rule 1 reads as follows: If o = O5 and d = D1 then f = F2.

Fig. 2 illustrates the assignment of membership degrees of input and output variables in the corresponding fuzzy sets when generating rules according to the first vector of data (o(1), d(1); f(1)) ⇒ (0.368, 0.035; 0.006).

The existence of conflicting rules, that is, rules having the same premise but a different consequence, suggests the need to compute the degree of a rule. The degree of a rule is defined as the product of the membership degrees of o, d and f in the corresponding fuzzy sets:

\[
D(\text{Rule}) = \mu_O(o) \cdot \mu_D(d) \cdot \mu_F(f)
\]

For example, the degree of rule 1 is:

\[
D(\text{Rule 1}) = \mu_{O_5}\!\left(o^{(1)}\right) \cdot \mu_{D_1}\!\left(d^{(1)}\right) \cdot \mu_{F_2}\!\left(f^{(1)}\right) = 0.75 \cdot 0.71 \cdot 0.85 = 0.45
\]

The problem of conflict is resolved by discarding those conflicting rules having a smaller degree. After resolving the problem of conflict between certain rules, a set of fuzzy rules is obtained which is frequently incomplete. Fig. 3(a) shows an incomplete fuzzy rule base. If a fuzzy system has two premises, the fuzzy rule base can easily be completed by logical inference. The remaining fields are completed by the analyst, who should consider the logic of the given matrix. It can be logically assumed that with an increase in the number of passengers departing from a particular industrial zone there is an increase in the value of flow towards the observed tourist resort. It can similarly be assumed that as the number of passengers arriving in a particular tourist resort increases so does the number of passengers arriving in that tourist resort from a particular zone. In other words, ordinal numbers of fuzzy sets representing flows increase from left to right and from top downwards. Fig. 3(b) shows a complete fuzzy rule base, completed by the analyst.

FIGURE 3 Fuzzy rule bases: (a) incomplete; (b) complete.

Having formed a fuzzy rule base, we proceed to apply fuzzy logic using inference methods (reasoning techniques). The output of fuzzy logic is a fuzzy set. If a ‘crisp’ value is needed, a method of defuzzification is employed. Defuzzification is thus interpreted as the process of computing a numerical value from the resulting fuzzy set representing the output of fuzzy logic. In this article the most commonly used method, the centre of gravity, has been applied.

Two fuzzy logic systems have been used. The first fuzzy system is based on singleton fuzzification, membership functions of input and output variables having a triangular shape, MAX-MIN fuzzy reasoning and defuzzification by the centre of gravity. The second fuzzy system involves singleton fuzzification, membership functions of input and output variables in the shape of the Gaussian curve, MAX-DOT fuzzy reasoning and defuzzification by the centre of gravity. In the latter case, the assumptions of the proven mathematical theorem that fuzzy systems are universal approximators were satisfied [11,12].
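The rule-generation and inference steps of Section 3.1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes inputs and outputs already normalized to [0, 1], evenly spaced triangular sets, and the first fuzzy system (MAX-MIN reasoning, centre of gravity computed on a discrete grid). The function names `triangle`, `memberships`, `wang_mendel` and `infer` are hypothetical.

```python
def triangle(x, center, width):
    """Triangular membership: 1 at the centre, 0 at one width away."""
    return max(0.0, 1.0 - abs(x - center) / width)

def memberships(x, centers):
    """Degrees of x in every set of an evenly spaced triangular partition."""
    width = centers[1] - centers[0]
    return [triangle(x, c, width) for c in centers]

def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)

def wang_mendel(data, c_o, c_d, c_f):
    """Build {(i, j): k} rules; conflicts keep the rule with the larger degree."""
    best = {}  # premise (i, j) -> (rule degree, consequent index k)
    for o, d, f in data:
        mo, md, mf = memberships(o, c_o), memberships(d, c_d), memberships(f, c_f)
        i, j, k = argmax(mo), argmax(md), argmax(mf)
        degree = mo[i] * md[j] * mf[k]  # product of the three membership degrees
        if (i, j) not in best or degree > best[(i, j)][0]:
            best[(i, j)] = (degree, k)
    return {premise: k for premise, (_, k) in best.items()}

def infer(o, d, rules, c_o, c_d, c_f, n_grid=1001):
    """MAX-MIN reasoning followed by centre-of-gravity defuzzification."""
    mo, md = memberships(o, c_o), memberships(d, c_d)
    width_f = c_f[1] - c_f[0]
    num = den = 0.0
    for g in range(n_grid):
        y = g / (n_grid - 1)
        # MIN clips each consequent at the rule's firing strength; MAX aggregates.
        mu = max((min(mo[i], md[j], triangle(y, c_f[k], width_f))
                  for (i, j), k in rules.items()), default=0.0)
        num += y * mu
        den += mu
    return num / den if den > 0 else None  # None when no rule fires

centers = [i / 4 for i in range(5)]               # five sets on [0, 1] (assumed)
data = [(0.25, 0.50, 0.75), (0.30, 0.50, 0.50)]   # two conflicting data vectors
rules = wang_mendel(data, centers, centers, centers)
estimate = infer(0.25, 0.50, rules, centers, centers, centers)
```

Both data vectors produce the same premise, so only the one with the larger degree survives. Replacing `min` on the consequent with a product would give MAX-DOT reasoning; Gaussian membership functions would replace `triangle`.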

3.2. Genetic Algorithm for Modifying a Fuzzy Rule Base

A fuzzy system is better if fuzzy-estimated values are closer to actual values of the output variable. Consequently, we are faced with the problem of determining the fuzzy rule base with the best fitness between fuzzy-estimated and actual values. The problem of searching for ‘the best’ fuzzy rule base was solved by using a genetic algorithm.

Genetic algorithms were developed by analogy with Darwin’s theory of evolution and the basic principle of the ‘survival of the fittest.’ Holland [29] and Goldberg [30] achieved the most significant results in the field of genetic algorithms. In the first step, various solutions to the maximization (or minimization) problem are generated. In the next step, these solutions are evaluated, that is, the objective (cost) function is estimated. Some of the ‘good’ solutions yielding a better ‘fitness’ (objective function value) are further considered. The remaining solutions are eliminated from further consideration. The solutions selected undergo the phases of reproduction, crossover and mutation. After that, a new generation of solutions is produced, to be followed by a new one, and so on. Each new generation is expected to be ‘better’ than the previous one. The production of new generations is stopped when a prespecified stopping condition is satisfied. The final solution of the problem is thus the best solution generated during the search.

Each fuzzy rule consists of a premise and a conclusion. If the premise contains two variables, the fuzzy rule base can be represented in the form of a matrix. The fuzzy rule base depicted in Fig. 4 represents the initial base, which is the basic individual in the proposed genetic algorithm. In the initialization, the number of generations and the number of individuals (strings) in a population are assigned. The number of individuals is an even number and is the same in each succeeding generation. Other individuals in the first generation are created from the initial base (individual). Each individual is characterized by the matrix of rules and the value of the fitness function. The fitness function is the reciprocal of the sum of squared deviations between the real values of the output variable and the values estimated by fuzzy logic.

The individuals are formed in the following way. First, the number of changes in the basic individual is randomly generated. This number of changes is a random integer from the interval [1, K] (where K is set in advance) and denotes the number of fields in a matrix where a change will occur. For instance, if number 9 has been randomly generated, a change will occur in 9 fields of the basic matrix.
The fields in which a change will occur are also generated in a random fashion. Only the fields obtained according to linguistic information (fields filled in by an expert) are considered for potential change. The generator of random numbers is also used to determine the conclusion, that is, an element from the set of possible conclusions for the field considered. A field in a matrix corresponds to a fuzzy rule. If the conclusion in one field is changed, the logic of the matrix should be checked. In this manner, based on the initial fuzzy rule base, the first generation of individuals (matrices of rules) is formed. The number of individuals is previously defined and determined in the initialization. To obtain the estimated values of the output variable for each individual (fuzzy rule base), the MAX-MIN or MAX-DOT reasoning technique is applied, followed by defuzzification performed by the centre of gravity method. Fig. 4 depicts the generation of the initial population from the first generation of individuals.

FIGURE 4 Generation of initial population of individuals.

The probability for each individual to be selected for mating is calculated as the ratio between the value of the fitness function of the given individual and the sum of values of the fitness functions of all individuals in one generation. Based on these probabilities a set of individuals is produced from which mating is later performed. Thus, an individual with a higher selection probability can appear in the set of individuals for mating in a number of copies, while an individual with a small selection probability may not appear at all. Reproduction, being selection for mating, is illustrated in Fig. 5.

In the first step of the crossover operation, parents are chosen randomly from a reproduction set of individuals (fuzzy rule bases). In the next step, genetic material is exchanged, producing two new individuals (offspring). This involves the exchange of a rectangle which may be composed of at least one field, having dimensions of 1 × 1, up to m × n fields (if the dimensions of the matrix of rules are m × n). When the exchange is completed, the so-called ‘logic’ of the matrix (the fuzzy rule base) has to be checked. The rules in a base must never be considered separately but simultaneously. Each rule depends on its neighbourhood in the matrix of rules. If the ‘logic’ of the matrix is not satisfied, a new exchange of genetic material will be attempted. The maximum number of iterations is a previously determined number. If genetic material has successfully been exchanged, the fitness function for both individuals is computed. If genetic material has not been successfully exchanged, the fitness function is computed for the individuals in which the ‘logic’ of the matrix has been violated and is then penalized. Penalization involves a percentage decrease in the value of the fitness function. Fig. 6 represents some of the possible crossovers of individuals (fuzzy rule bases).
Mutation is performed with a probability of the order of magnitude $10^{-3}$. In our example, mutation is understood to be the change of a conclusion in one field in the matrix of rules. A field is changed according to the set of possible conclusions for the given field. These sets are defined previously according to the initial fuzzy rule base in the first step of the genetic algorithm: initialization and assignment of the initial values of variables. The condition to stop the genetic algorithm is a prespecified maximum number of generations. The solution is taken to be the individual (the fuzzy rule base from one of the generated populations) having the largest value of the fitness function.
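The fitness function and the reproduction step described in this section can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, and Python's `random.Random.choices` stands in for whatever sampling routine the authors used to draw the mating set in proportion to fitness.

```python
import random

def fitness(real, estimated):
    """Reciprocal of the sum of squared deviations (larger is better)."""
    sse = sum((r - e) ** 2 for r, e in zip(real, estimated))
    return 1.0 / sse if sse > 0 else float("inf")

def selection_probabilities(fitnesses):
    """Each individual's fitness divided by the generation's total fitness."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def select_mating_pool(population, fitnesses, rng):
    """Fitness-proportional sampling with replacement: fit individuals may
    appear several times, weak ones may not appear at all."""
    return rng.choices(population, weights=fitnesses, k=len(population))

rng = random.Random(0)                           # fixed seed, for repeatability
population = ["base_A", "base_B", "base_C"]      # placeholders for rule matrices
scores = [1.0, 3.0, 4.0]                         # illustrative fitness values
pool = select_mating_pool(population, scores, rng)
```

Sampling with replacement keeps the population size constant from one generation to the next, as the text requires. An exact fit (infinite fitness) would need special handling before being passed as a weight.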

FIGURE 5 Reproduction – selection of individuals for mating.

FIGURE 6 Some possible crossovers of individuals.
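The rectangle crossover with its ‘logic’ check might look like the sketch below. Several points are assumptions rather than details given in the article: the ‘logic’ of the matrix is taken to mean that consequent indices never decrease from left to right or from top downwards, penalization is taken to be multiplication of the fitness by the penalization parameter, and all function names are hypothetical.

```python
import random

def is_logical(m):
    """Assumed 'logic': entries never decrease along rows or down columns."""
    rows, cols = len(m), len(m[0])
    across = all(m[r][c] <= m[r][c + 1] for r in range(rows) for c in range(cols - 1))
    down = all(m[r][c] <= m[r + 1][c] for r in range(rows - 1) for c in range(cols))
    return across and down

def swap_rectangle(a, b, top, left, height, width):
    """Exchange one rectangular block between copies of two rule matrices."""
    child_a, child_b = [row[:] for row in a], [row[:] for row in b]
    for r in range(top, top + height):
        for c in range(left, left + width):
            child_a[r][c], child_b[r][c] = b[r][c], a[r][c]
    return child_a, child_b

def crossover(a, b, rng, max_attempts=50):
    """Retry random rectangles until both children are logical or attempts run out."""
    rows, cols = len(a), len(a[0])
    for _ in range(max_attempts):
        height, width = rng.randint(1, rows), rng.randint(1, cols)
        top, left = rng.randint(0, rows - height), rng.randint(0, cols - width)
        child_a, child_b = swap_rectangle(a, b, top, left, height, width)
        if is_logical(child_a) and is_logical(child_b):
            return child_a, child_b, True
    return child_a, child_b, False   # caller should penalize these children

def penalize(fitness_value, factor=0.7):
    """Assumed form of penalization: scale the fitness by a fixed factor."""
    return fitness_value * factor

parent_a = [[1, 1], [1, 2]]
parent_b = [[2, 2], [2, 3]]
kid_a, kid_b, ok = crossover(parent_a, parent_b, random.Random(0))
```

Swapping only the top-left field of these two parents yields `[[2, 1], [1, 2]]`, which violates the assumed ordering and would trigger a retry.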

4. NUMERICAL EXAMPLE

The trip distribution stage in the air transportation planning process is investigated as an illustration of the methodology proposed. The model predicts the flows between industrial towns and tourist resorts in an observed region. The flows are estimated according to the number of passengers generated by industrial towns and the number of passengers attracted by tourist resorts (Fig. 7). The example considered is based on real data collected in 1989 and 1990. The data from the first year were used as a training set while the data from the second year were used as a testing set. The air transportation network consists of four industrial towns, seven tourist resorts and 18 links (Fig. 7). The variables used in these models refer to the number of passengers departing from a town to tourist airports, the number of passengers arriving from towns in a tourist resort and the number of passengers between a town and a tourist resort, denoted respectively by o, d and f.

FIGURE 7 Industrial towns and tourist resorts between which flows are to be estimated.

In spite of the different shapes of membership functions of the input and output variables (triangle and the Gaussian curve), applying the method of Wang and Mendel produced the identical incomplete fuzzy rule base shown in Fig. 3(a). As can be seen, the incomplete fuzzy rule base comprises eight rules. So, from a total of 18 generated rules a fuzzy rule base with eight rules was formed, which means that some data vectors yielded equal rules, or some of the rules were mutually conflicting. The remaining fields were completed by the authors according to the elementary logic that with an increase in the number of passengers departing from industrial towns and the number of passengers arriving in a tourist resort the flow between the pair industrial town/tourist resort also increases. The completed fuzzy rule base representing the initial base in the genetic algorithm is given in Fig. 3(b).

After carrying out a number of experiments we arrived at the following genetic algorithm parameters for the example:
• number of individuals in a generation, 300
• maximum number of changes in the basic individual, 25
• maximum number of changes when performing a crossover between two individuals, 50
• penalization, 0.7
• probability of mutation, 0.005
• number of generations, 50

By applying the algorithm, a new fuzzy rule base was obtained after 50 generations. Fig. 8(a) represents a new fuzzy rule base in the case of a fuzzy system with singleton fuzzification, membership function in the shape of a triangle, MAX-MIN fuzzy reasoning and defuzzification by the centre of gravity. Fig. 8(b) represents a new fuzzy rule base in the case of a fuzzy system with singleton fuzzification, membership function in the shape of Gaussian curves, MAX-DOT fuzzy reasoning and defuzzification by the centre of gravity.

FIGURE 8 New fuzzy rule base in the case of: (a) fuzzy logic with triangular fuzzy sets, MAX-MIN fuzzy reasoning and centre of gravity; (b) fuzzy logic with Gaussian curves, MAX-DOT fuzzy reasoning and centre of gravity.

Fig. 9 shows values of the fitness function for different generations. By employing the genetic algorithm, we obtained a fuzzy rule base resulting from the optimization of the fitness function, which is the reciprocal of the sum of squared deviations between real and estimated values of the output variable. The values of flows from the training set obtained by fuzzy logic can be seen in Table I, while a graphical representation of the results obtained is given in Fig. 10. The model developed has been tested on the data collected during the second year, and the obtained results are given in Table II and Fig. 11.
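The parameter list above, together with the initialization described in Section 3.2, can be collected into a small sketch. The `GAParams` field names, the `editable` list of expert-filled fields and the `choices` mapping of admissible conclusions per field are hypothetical. Changed fields are drawn without repetition and the subsequent ‘logic’ check is omitted, both simplifying assumptions.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class GAParams:
    population_size: int = 300        # individuals in a generation
    max_initial_changes: int = 25     # K: upper bound on changes to the base
    max_crossover_changes: int = 50   # bound used during crossover
    penalization: float = 0.7
    mutation_probability: float = 0.005
    generations: int = 50

def make_individual(base, editable, choices, params, rng):
    """Copy the base rule matrix and change a random number of expert-filled fields."""
    child = [row[:] for row in base]
    upper = min(params.max_initial_changes, len(editable))
    n_changes = rng.randint(1, upper)             # random integer from [1, K]
    for r, c in rng.sample(editable, n_changes):  # fields chosen at random
        child[r][c] = rng.choice(choices[(r, c)])
    return child

def initial_population(base, editable, choices, params, rng):
    """First generation: population_size variants of the basic individual."""
    return [make_individual(base, editable, choices, params, rng)
            for _ in range(params.population_size)]

params = GAParams()
base = [[1, 1], [1, 2]]                         # toy 2 x 2 rule matrix
editable = [(0, 1), (1, 0)]                     # fields filled in by the expert
choices = {(0, 1): [1, 2], (1, 0): [1, 2]}      # admissible conclusions per field
population = initial_population(base, editable, choices, params, random.Random(1))
```

Fields derived directly from data (here the two diagonal entries) are never touched, which mirrors the restriction that only expert-filled fields are candidates for change.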

FIGURE 9 Value of fitness function for different generations in the case of: (a) fuzzy logic with triangular fuzzy sets, MAX-MIN fuzzy reasoning and centre of gravity; (b) fuzzy logic with Gaussian curves, MAX-DOT fuzzy reasoning and centre of gravity.

What can be concluded from Tables I and II and from Figs 10 and 11? On one hand, by simple visual inspection and according to the results achieved, it can be said that by using fuzzy logic and the genetic algorithm ‘considerably good’ predictions of passenger flows can be obtained. From the figures illustrating a comparison of real and estimated passenger flows it can also be seen that the differences are

FIGURE 10

Comparison of real and estimated passenger flows (training set).

TRIP DISTRIBUTION MODELLING 231

97819 67434 3799 10404 2512 1171 590 20313 3560 12348 72017 65273 5015 4585 24573 3922 10252 10390

(C, (B, (A, (C, (B, (A, (D, (C, (B, (C, (C, (B, (A, (D, (C, (B, (C, (B,

5) 5) 5) 7) 7) 7) 7) 1) 1) 2) 4) 4) 4) 4) 6) 6) 3) 3)

Real values of passenger flows

Pair of zones

TABLE 1

94463 67172 5279 17529 3479 1210 1210 19166 3484 17030 78620 62397 4584 4129 19921 3477 18705 3490

92750 65882 4041 16182 3311 1175 1175 19802 3476 15314 79221 62797 2937 2933 21175 3552 18766 3398

Estimated values of passenger flows obtained by fuzzy logic and genetic algorithm Model 2 (Gaussian curve, MAX-DOT and center of gravity) 0.03356 0.00262 0.0148 0.07125 0.00967 0.00039 0.0062 0.01147 0.00076 0.04682 0.06603 0.02876 0.00431 0.00456 0.04652 0.00445 0.08453 0.069

K value Model 1 0.05069 0.01552 0.00242 0.05778 0.00799 0.00004 0.00585 0.00511 0.00084 0.02966 0.07204 0.02476 0.02078 0.01652 0.03398 0.0037 0.08514 0.06992

K value Model 2

Comparison of real and estimated passenger flows (training set) Estimated values of passenger flows obtained by fuzzy logic and genetic algorithm Model 1 (triangle, MAX-MIN and center of gravity) ⫺ ⫺ ⫹ ⫹ ⫹ ⫹ ⫹ ⫹ ⫺ ⫹ ⫺ ⫹ ⫺ ⫺ ⫹ ⫹ ⫺ ⫺

Sign


FIGURE 11

Comparison of real and estimated passenger flows (testing set).

TRIP DISTRIBUTION MODELLING 233

TABLE II Comparison of real and estimated passenger flows (testing set)

Estimated values of passenger flows are obtained by fuzzy logic and the genetic algorithm. Model 1: triangle, MAX-MIN and center of gravity. Model 2: Gaussian curve, MAX-DOT and center of gravity.

| Pair of zones | Real values of passenger flows | Estimated flow, Model 1 | Estimated flow, Model 2 | K value Model 1 | K value Model 2 | Sign |
|---|---|---|---|---|---|---|
| (C, 5) | 106332 | 99279 | 98839 | 0.07053 | 0.07493 | − |
| (B, 5) | 67607 | 67152 | 67975 | 0.00455 | 0.00368 | + |
| (A, 5) | 4619 | 4791 | 4110 | 0.00172 | 0.00509 | − |
| (C, 7) | 16296 | 20421 | 19639 | 0.04125 | 0.03343 | + |
| (B, 7) | 3180 | 3752 | 2902 | 0.00572 | 0.00278 | + |
| (A, 7) | 468 | 1367 | 1264 | 0.00899 | 0.00796 | + |
| (D, 7) | 585 | 1367 | 1264 | 0.00782 | 0.00679 | + |
| (C, 1) | 24006 | 21498 | 21996 | 0.02508 | 0.0201 | + |
| (B, 1) | 3524 | 3747 | 3448 | 0.00223 | 0.00076 | + |
| (C, 2) | 14752 | 19325 | 17196 | 0.04573 | 0.02444 | + |
| (C, 4) | 88770 | 84164 | 84320 | 0.04606 | 0.0445 | + |
| (B, 4) | 58372 | 67152 | 66995 | 0.0878 | 0.08623 | + |
| (A, 4) | 6254 | 4991 | 3088 | 0.01263 | 0.03166 | − |
| (D, 4) | 4093 | 4359 | 3084 | 0.00266 | 0.01009 | − |
| (C, 6) | 22234 | 21152 | 21299 | 0.01082 | 0.00935 | + |
| (B, 6) | 3049 | 3751 | 3291 | 0.00702 | 0.00242 | + |
| (C, 3) | 12300 | 21011 | 20994 | 0.08711 | 0.08694 | + |
| (B, 3) | 11690 | 3752 | 3221 | 0.07938 | 0.08469 | − |




insignificant when different shapes of membership functions of input and output variables and different reasoning techniques are used. These statements must, however, be supported by a statistical analysis of the results obtained.

In the example considered, real passenger flow values belong to the interval [0, 100000]. The smallest value in the training set is 590, while the greatest value is 97819. The mean passenger flow value is 23110 and the standard deviation is 30231. Our task was to ‘hit’ the values 590, 1171, 67434 and 97819 with the same model. The problem considered here is thus characterized by great variability in the data. When evaluating results, it is therefore extremely important to take into account the width of the interval to which all passenger flow values belong (in the numerical example the width of the interval is 100000). One possible evaluation measure of the model is the following ratio:

$$K = \frac{|\text{Real value} - \text{Estimated value}|}{\text{Interval width}}$$
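The ratio K can be illustrated with a short Python sketch. The function name `k_value` and the constant `INTERVAL_WIDTH` are illustrative choices, not part of the original model; the three data rows are the first three testing-set rows of Table II.

```python
# Interval width used in the article's numerical example.
INTERVAL_WIDTH = 100_000


def k_value(real, estimated, interval_width=INTERVAL_WIDTH):
    """Absolute estimation error scaled by the width of the flow interval."""
    return abs(real - estimated) / interval_width


# First three testing-set rows of Table II: (real, Model 1, Model 2).
rows = [
    (106332, 99279, 98839),
    (67607, 67152, 67975),
    (4619, 4791, 4110),
]

k_model_1 = [k_value(real, m1) for real, m1, _ in rows]
k_model_2 = [k_value(real, m2) for real, _, m2 in rows]

for k1, k2 in zip(k_model_1, k_model_2):
    print(f"{k1:.5f}  {k2:.5f}")
```

Because the error is divided by the interval width rather than by the real value, a fixed absolute error yields the same K for a small flow as for a large one, which suits data with this much variability.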

Obviously, the lower the K value, the better the model. The corresponding values for K are also shown in Tables I and II. In the case of the testing set, the mean passenger flow value is 24896, while the standard deviation equals 32503. As Tables I and II show, the K values are below 0.09 in all the cases considered.

We have also calculated the correlation coefficients between the real and the estimated values of passenger flows. In the case of Model 1 the correlation coefficients were:

$r_1 = 0.991041$ (for training set)
$r_1 = 0.993146$ (for testing set)

In the case of Model 2, the correlation coefficients were:

$r_2 = 0.991289$ (for training set)
$r_2 = 0.990821$ (for testing set)
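As a minimal sketch of how such a coefficient can be computed, the following Python code evaluates the Pearson correlation between the real testing-set flows and the Model 1 estimates from Table II. The article does not state which correlation formula was used, so the Pearson product-moment coefficient is an assumption here, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Testing set (Table II): real flows and Model 1 estimates.
real = [106332, 67607, 4619, 16296, 3180, 468, 585, 24006, 3524,
        14752, 88770, 58372, 6254, 4093, 22234, 3049, 12300, 11690]
model_1 = [99279, 67152, 4791, 20421, 3752, 1367, 1367, 21498, 3747,
           19325, 84164, 67152, 4991, 4359, 21152, 3751, 21011, 3752]

r1_testing = pearson_r(real, model_1)
print(round(r1_testing, 6))
```

The same function can be applied to the Model 2 column, or to the training set, to compare against the coefficients reported in the text.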

These correlation coefficients are also good indicators of the quality of the proposed approach.

Are the results from the two models (Model 1 and Model 2) significantly different? To answer this question properly we performed a statistical test for paired observations (the sign test). We recorded a plus (+) whenever the K value produced by Model 2 was smaller than the K value produced by Model 1, a minus (−) when it was greater, and a zero when it was the same. The total number, N, of pluses and minuses equals 18. The number of + signs in the case of the training set is equal to 10. In the case of the testing set, the number of + signs equals 13. The corresponding tabled value for N = 18 is 4 at the 5% significance level [29]. Since the number of + signs in both cases is greater than the tabled value (as is the number of − signs, 8 and 5 respectively), we conclude that there is no significant difference between Model 1 and Model 2.
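The sign test described above can be sketched in Python as follows. The article compares the sign counts with a tabled critical value; the exact two-sided binomial p-value computed here is an alternative formulation added for illustration, and the helper names `sign_counts` and `sign_test_p_value` are hypothetical. The K values are those of the testing set in Table II.

```python
from math import comb


def sign_counts(k_model_1, k_model_2):
    """Count pluses (Model 2 has smaller K) and minuses (Model 2 has larger K)."""
    plus = sum(1 for k1, k2 in zip(k_model_1, k_model_2) if k2 < k1)
    minus = sum(1 for k1, k2 in zip(k_model_1, k_model_2) if k2 > k1)
    return plus, minus


def sign_test_p_value(plus, minus):
    """Exact two-sided sign-test p-value under H0: P(plus) = P(minus) = 0.5."""
    n = plus + minus  # ties (zeros) are excluded from n
    k = min(plus, minus)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# K values for the testing set (Table II).
k1_testing = [0.07053, 0.00455, 0.00172, 0.04125, 0.00572, 0.00899,
              0.00782, 0.02508, 0.00223, 0.04573, 0.04606, 0.0878,
              0.01263, 0.00266, 0.01082, 0.00702, 0.08711, 0.07938]
k2_testing = [0.07493, 0.00368, 0.00509, 0.03343, 0.00278, 0.00796,
              0.00679, 0.0201, 0.00076, 0.02444, 0.0445, 0.08623,
              0.03166, 0.01009, 0.00935, 0.00242, 0.08694, 0.08469]

plus, minus = sign_counts(k1_testing, k2_testing)
p_value = sign_test_p_value(plus, minus)
print(plus, minus, round(p_value, 4))
```

For the testing set this gives 13 pluses and 5 minuses, and the p-value works out to roughly 0.096, which is above 0.05 and therefore consistent with the conclusion that the two models do not differ significantly.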

5. CONCLUSION

This article has demonstrated how soft computing techniques such as fuzzy logic and genetic algorithms can successfully be used in predicting travel demand. The possibility of and justification for using fuzzy logic have gained in importance in the light of a mathematical proof that fuzzy systems are universal approximators.

The trip distribution model developed here, as part of a sequential transportation planning process, has been illustrated by a numerical example based on real data. It has the ability to generalize, adapt and learn based on new knowledge and new information. Graphical comparisons and a statistical analysis of real values and values of the output variable obtained by the suggested techniques were presented. According to the results achieved, good predictions of passenger flows can be obtained by using fuzzy logic and the genetic algorithm. The graphs comparing real and estimated passenger flows, as well as the corresponding statistical tests, also show that the differences are insignificant when different shapes of membership functions of input and output variables and different reasoning techniques are used.

The results obtained suggest that soft computing techniques can be used in predicting travel demand. This preliminary research shows the high quality of the results achieved by these techniques. There appear to be good reasons for continuing to develop models in this area by means of fuzzy logic and genetic algorithms.

Acknowledgement

The authors would like to thank the anonymous referees whose comments and suggestions have improved this article.



References

[1] Zadeh, L. (1965) ‘Fuzzy sets’, Information and Control 8, 338–353.
[2] Klir, G. and Folger, T. (1988) Fuzzy Sets, Uncertainty and Information (Prentice Hall, Englewood Cliffs, NJ).
[3] Zimmermann, H.-J. (1991) Fuzzy Set Theory and its Applications (Kluwer, Boston, MA).
[4] Kosko, B. (1992) ‘Fuzzy systems as universal approximators’, Proc. IEEE International Conference on Fuzzy Systems, San Diego, 1153–1162.
[5] Mendel, J. (1995) ‘Fuzzy logic systems for engineering: a tutorial’, Proceedings of the IEEE 83, 345–377.
[6] Ross, T.M. (1995) Fuzzy Logic with Engineering Applications (McGraw-Hill, New York).
[7] Kasabov, N.K. (1996) Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering (MIT, Cambridge, MA).
[8] Teodorović, D. (1994) ‘Invited review: fuzzy sets theory applications in traffic and transportation’, European Journal of Operational Research 74, 379–390.
[9] Teodorović, D. (1999) ‘Fuzzy logic systems for transportation engineering: the state of the art’, Transportation Research A 33A, 337–364.
[10] Teodorović, D. and Vukadinović, K. (1998) Traffic Control and Transport Planning: A Fuzzy Sets and Neural Networks Approach (Kluwer, Dordrecht).
[11] Wang, L.X. (1992) ‘Fuzzy systems are universal approximators’, Proc. IEEE International Conference on Fuzzy Systems, San Diego, 1163–1170.
[12] Wang, L.X. (1994) Adaptive Fuzzy Systems and Control: Design and Stability Analysis (PTR Prentice Hall, Englewood Cliffs, NJ).
[13] Wang, L.X. and Mendel, J.M. (1992) ‘Generating fuzzy rules by learning from examples’, IEEE Transactions on Systems, Man and Cybernetics 22, 1414–1427.
[14] McFadden, D. (1981) ‘Econometric models of probabilistic choice’. In: C.F. Manski and D. McFadden (eds) Structural Analysis of Discrete Data with Econometric Applications (MIT, Cambridge, MA), pp. 198–273.
[15] Kanafani, A. (1983) Transportation Demand Analysis (McGraw Hill, New York).
[16] de Ortuzar, J.D. and Willumsen, L.G. (1990) Modelling Transport (John Wiley & Sons, New York).
[17] Carpio, L. and Gomes, L. (1991) ‘Modeling trip distribution through game-theory’, Systems Analysis Modelling Simulation 8, 515–521.
[18] Lam, W.H.K. and Huang, H.J. (1992) ‘A combined trip distribution and assignment model for multiple user classes’, Transportation Research B B26, 275–287.
[19] Easa, S. (1993a) ‘Urban Trip Distribution in Practice: 1. Conventional Analysis’, Journal of Transportation Engineering-ASCE 119, 793–815.
[20] Easa, S. (1993b) ‘Urban Trip Distribution in Practice: 2. Quick Response and Special Topics’, Journal of Transportation Engineering-ASCE 119, 816–834.
[21] Oppenheim, N. (1993) ‘Equilibrium Trip Distribution Assignment With Variable Destination Costs’, Transportation Research B 27, 207–217.
[22] Goncalves, M.B. and Uysseaneto, I. (1993) ‘The Development Of A New Gravity Opportunity Model For Trip Distribution’, Environmental Planning A 25, 817–826.
[23] Roy, J.R. (1993) ‘The Development Of A New Gravity-Opportunity Model For Trip Distribution – Comment’, Environmental Planning A 25, 1689–1691.
[24] Arasan, V.T., Wermuth, M. and Srinivas, B.S. (1996) ‘Modeling of stratified urban trip distribution’, Journal of Transportation Engineering-ASCE 122, 342–349.
[25] Lam, W.H.K. and Huang, H.J. (1996) ‘Link choice proportions from trip distribution and assignment models: An overview and comparison’, Journal of Advanced Transportation 30, 1–21.
[26] Johnston, R.A. and Ceerla, R. (1996) ‘Travel modeling with and without feedback to trip distribution’, Journal of Transportation Engineering-ASCE 122, 83–86.
[27] Tamin, O.Z. (1997) ‘Public transport demand estimation by calibrating a trip distribution mode choice (TDMC) model from passenger counts: A case study in Bandung, Indonesia’, Journal of Transportation Engineering 31, 5–18.
[28] Mozolin, M., Thill, J.C. and Usery, E.L. (2000) ‘Trip distribution forecasting with multilayer perceptron neural networks: A critical evaluation’, Transportation Research B 34, 53–73.
[29] Holland, J.H. (1975) Adaptation in Natural and Artificial Systems (University of Michigan Press, Ann Arbor, MI).
[30] Goldberg, D. (1989) Genetic Algorithms in Search, Optimization and Machine Learning (Addison Wesley, Reading, MA).
[31] Crow, E.L., Davis, F.A. and Maxfield, M.W. (1999) Statistics Manual (Dover, New York).
