VT-2005-00823
Dynamic Origin-Destination Flow Estimation Using Cellular Communication System
Keemin Sohn and Daehyun Kim
Abstract— The present investigation aims at developing a new approach for the estimation of dynamic origin-destination (O-D) flows using cell phones as traffic probes. A state-space model based on the autoregressive dynamics of O-D flows and the time series of link volume counts was adopted. Unlike a direct approach using sample O-D flows extracted from the cell-based location data as additional observations, an indirect approach is proposed wherein the assignment map in the model is derived from the passing time at observation locations and the path choice proportion. A probe phone's passing time at a certain point in a cell was approximated from its entry and exit times at the cell boundaries. The average path choice proportion was also estimated using cell-based trajectories of probe phones. The simulation experiments confirmed that the approach was applicable to a real-world freeway network. The results suggest that the O-D flows estimated with the present approach are promising in that the mean absolute error ratio (MAER) was smaller than in the case wherein only historical O-D flows are considered. A sensitivity analysis revealed that the present approach met the requirements for an urban area encompassing a huge number of cell phone users in a micro cell system.
Index Terms— Transportation Networks, Personal Communication Networks, Kalman Filtering
I. INTRODUCTION
Mobile cellular networks for communication are capable of providing, directly or potentially, a huge amount of frequently updated information on the location of their users. This information can be used to estimate real-time traffic states. The aim of this investigation was to identify the applicability of cell phone probes to the estimation of dynamic origin-destination (O-D) flows. Dynamic O-D flows play an important role in determining forthcoming traffic states. The more accurate the O-D flow data, the more reliable the traffic forecasting. Thus, both the provision of traffic information and real-time traffic management depend entirely on the capability of tracking dynamic O-D flows.

Manuscript received December 18, 2005. This work is supported by grants from the Seoul Metropolitan Government as a 2004 basic research project of Seoul Development Institute. Keemin Sohn is with the Department of Urban Transportation, Seoul Development Institute, 391 Seocho-dong Seocho-gu, Seoul Korea (corresponding author; phone: +82-2149-1110; fax: +82-2149-1120; e-mail: kmsohn@sdi.re.kr). Daehyun Kim is with the Department of Urban Transportation, Seoul Development Institute, 391 Seocho-dong Seocho-gu, Seoul Korea (e-mail: kdh0922@sdi.re.kr).

Over the past decade, there has been considerable interest in estimating dynamic O-D flows from link traffic volume counts. Various researchers have inquired into the estimation of dynamic O-D flows [1] - [8]. They employed a model based on the assumption that either the O-D flows themselves or the deviations of O-D flows from historical O-D flows have an autoregressive relationship. With this assumption, the complication associated with insufficient observations could be sorted out. Several statistical approaches such as the generalized least squares (GLS) estimator, the Kalman filter, and the Bayesian estimator were employed in order to acquire the solution.

The models for obtaining dynamic O-D flows can be classified into two categories based on whether or not the model is governed by dynamic traffic assignment (DTA). DTA determines the relationship between O-D flows and traffic volume counts on observation links. DTA-based approaches require a common assumption that reliable information for network traffic assignment is available. Reference [2] proposed various methods for constituting the assignment map. One of the methods was to employ a fine-tuned micro traffic simulator for the DTA. The development of a simulator that describes the real traffic situation and the acquisition of reliable historical O-D flows are challenging issues in the case of a DTA-based approach. In the case of a non-DTA-based approach, the parameters accounting for the relationship between O-D flows and link traffic volumes are also set as unknown variables. These unknown variables should be estimated only from time series measurements of traffic volume counts. Whereas early studies [3] - [5] considered the estimation of O-D flows for a single intersection or freeway corridor, [6] expanded the approach to the network level by taking screen-line flows and a path choice function into account.
Both the huge number of unknown variables and the non-linear relations between them remain problematic in this case. The other issue in the estimation of dynamic O-D flows is whether the measurements of traffic volumes and the a priori information on O-D flows are sufficient to make the solution of the model robust. Observability has recently been improved by incorporating multiple sources of observation into the estimation of O-D flows. Based on automatic vehicle identification (AVI) technology, [7] added vehicle trajectory information from automated license-plate surveys to
observation data related to traffic volume counts and then obtained a solution by means of several algorithms such as Bayesian updating methods, a Kalman filter, and least squares. Reference [8] also developed a model in which AVI data and link counts were combined, and proposed three ways of incorporating the AVI information into the model. The first method was to append the AVI information as new observations, and the second was to incorporate it through the assignment map as link choice proportions in the form of travel time-based estimates. The third was to incorporate the AVI information through the measurement equations by substituting the most recently completed set of O-D flow observations for the corresponding estimates in the deterministic component of the equation. From these previous works, there is no doubt that AVI information has great potential for improving the performance of dynamic O-D estimation.

The present investigation uses information from a cellular mobile network instead of conventional AVI technology. The cell phone probing system in the space-based approach, which will be described in the next section, could fall into the category of AVI technologies. On the other hand, the system differs greatly from existing AVI in that it has area detectors rather than point detectors, and it therefore entails an unavoidable location error. In this respect, it is obvious that an existing AVI system with point detectors outperforms the proposed approach. However, the proposed approach has an advantage in market penetration due to the huge user base. In addition, the proposed approach would be very useful where no other AVI system is available or where the AVI detectors are installed sparsely. There is another rationale for investigating cell phone probing.
The area detector will become more efficient because technologies based on advanced cellular communication networks, such as mobile LAN (local area network) and WiBro (wireless broadband), which have a much smaller cell dimension, will prevail in the near future. In this context, this study focuses on whether cell-based probing can enhance the accuracy of O-D flow estimation.

Section II gives an overview of the existing mobile cellular network, wherein a method for tracing probe phones on a cell basis is described. Complications in qualifying cell phones as traffic probes are discussed in Section III. In Section IV, the adopted model for O-D flow estimation is discussed and an approach to incorporating data from cell phone probing into the model is developed. Section V is concerned with simulation experiments, through which the validity of the proposed approach was confirmed and the sensitivity of the estimation accuracy was investigated in terms of market penetration, cell dimension, and level of cell duplication. The conclusions are presented in Section VI.
II. CELL POSITIONING OVERVIEW
According to [9], there are three possible approaches by which probe vehicles can transmit information. The first approach is space-based, in which probe vehicles transmit
information to roadside devices as they pass observation points. This type of approach, which is known as AVI technology, is the main concern of this study. The second is time-based, in which probe data are reported at every specific time instant regardless of the position of the probe vehicle; the data can be obtained from a GPS-based system or a beacon-based system. The last is event-based, in which traffic information is reported when a particular event occurs, for example, traffic accident reports made by drivers using cellular phones.

Some researchers have recently become interested in the possibility of using cell phones as traffic probes [10] - [13]. Most of them adopted only the time-based approach to collect traffic information from cell phones and cellular mobile networks. Reference [13] introduced several methods for tracking the position of cell phones in a mobile cellular network, namely the signal profiling technique, the angle-of-arrival technique, and the timing measurement technique. It is impossible to rule out location errors in location data obtained by the time-based approach using cell phones. The imperfect locations should then be subjected to a map matching process in order to derive useful traffic information from them [14].

This study focuses on the space-based approach using cell phones as traffic probes because of the good compatibility between the space-based approach and the conventional estimation model of O-D flows. The conventional estimation model has fixed locations for reference, such as origins, destinations, and detecting locations. Hereafter, such locations are referred to as reference points (or locations), and cells including reference points are referred to as reference cells.
The space-based approach makes it possible to save the battery consumption of probe phones and to relieve the additional system load on mobile cellular networks because a probe phone does not require the location to be updated frequently. A location update is needed only when the probe phones enter and leave a cell, the coverage of which includes locations for observation and reference. The passing time at an observation point on a road can then be estimated from both the entry and exit times at boundaries of a cell that includes the location. This approximation also entails a location error as is the case for the time-based approach. Conventional mobile communication service consists of two operations, location-updating and paging. A brief description of the two operations has been given by [15]. The network coverage area is partitioned into a number of location areas (LAs). Each LA consists of a group of cells, each cell of which is served by a base station (BS). All base stations of a location area periodically broadcast an identifier associated with that area. When active cell phones enter a new cell, they compare the identifier stored with the broadcast identifier of the location area. If the two identifiers differ, the cell phone recognizes that it has entered a new location area and a location update is performed. As an incoming call arrives, the network locates the called cell phone by paging all cells within an LA. Therefore,
the current location of a cell phone is identified only at the LA level in a conventional cellular operation. If a cell phone is to serve as a traffic probe, its location information should be reported when it enters and leaves reference cells, that is, cells whose coverage contains a reference point on the transportation network of concern.

Each moving cell phone with its power on in a mobile cellular network performs an idle hand-over operation, which switches its jurisdictional base station, whenever it enters a new cell. However, the idle hand-over operation does not entail a location update [16]. An additional system is required so that the idle hand-over is reported to a mobile switching center (MSC) and the reported information is then processed further. This system revision is not required for every phone and cell, but is necessary only for probe phones and reference cells. The staff in charge of network management at a company providing mobile communication services concluded that such a system revision is not technically difficult. For the investment decision on that revision, however, what matters most is a consensus between the party adopting phone probes and the company owning the mobile cellular network.

In fact, the general hand-over technology was developed in order to secure uninterrupted phone calls while a cell phone in use is moving. On the other hand, the purpose of the idle hand-over is to maintain operation channels associated with at least one base station while a cell phone is on but not in use. The idle hand-over is a hard hand-over, in which the link to the prior base station is terminated before or as a cell phone is transferred to a new cell's base station. Reference [17] described the properties of hard hand-over in detail. There may be many criteria on which one can judge the initiation of idle hand-over.
Cell phones generally receive radio signals from multiple base stations simultaneously, but they have only one jurisdictional base station. The cell phone is set such that the idle hand-over occurs when the current signal falls sufficiently below the threshold and another signal is the strongest. In utilizing a cell phone as a traffic probe, it is very important to know where the idle hand-over takes place and how that location varies. The strength of the signal emitted from a base station is not deterministic but may vary according to both the internal fluctuations of the radio wave itself and the external environment, such as weather conditions, the amount of communication traffic, and the condition of facilities or buildings around the cell boundary area. It should be noted that there has been no test of the variation in idle hand-over locations. In fact, conducting such a field survey would be complex because the cellular network is owned by private companies and the network configuration is protected as an operational secret. In this study, the locations are assumed to vary stochastically according to a normal distribution. Any other stochastic distribution would be equally acceptable, so long as it is calibrated from field data.
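The hand-over criterion and the stochastic boundary assumption described above can be illustrated with a minimal Python sketch. It is not taken from any operator's implementation: the function names, the `margin` parameter standing for "sufficiently less than the threshold", and the signal values are all hypothetical, and the normal distribution is the assumption stated in this section rather than a calibrated result.

```python
import random

def idle_handover_target(serving, signals, threshold, margin=0.0):
    """Return the base station to switch to, or None to stay.

    signals maps a base-station id to its received signal strength.
    The hand-over is triggered only when the serving signal is below
    (threshold - margin) AND some other station is the strongest.
    All names and units here are illustrative assumptions.
    """
    strongest = max(signals, key=signals.get)
    if signals[serving] < threshold - margin and strongest != serving:
        return strongest
    return None

def sample_handover_position(nominal_boundary_m, sigma_m, rng):
    """Draw one hand-over location along the road, in meters.

    The location is assumed to be normally distributed around the
    nominal cell boundary, as assumed in this study.
    """
    return rng.gauss(nominal_boundary_m, sigma_m)

# Hypothetical usage: a phone served by "A" approaching cell "B".
rng = random.Random(0)
target = idle_handover_target("A", {"A": -100.0, "B": -80.0}, threshold=-95.0)
position = sample_handover_position(500.0, 30.0, rng)
```

In a simulation, the sampled position divided by the vehicle speed would give the recorded boundary crossing time; the standard deviation `sigma_m` would have to be calibrated from field data.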
III. QUALIFICATIONS OF PROBE PHONES
There are several issues that should be discussed before the proposed approach is applied to the estimation of O-D flows. Even though cell-based probing has great potential for market penetration owing to the huge user base, there might be complications associated with privacy protection. It is not desirable to choose anonymous tracking. Frequent travelers such as taxi drivers and delivery service workers could be good candidates for probes, provided that their agreement is obtained. Those who agree to be a probe should carry their cell phone and keep it powered on whenever they travel by car. They could be compensated by receiving traffic information or a discount on their communication fees. The agreement of volunteers is also very important because the probe phone requires a modification so that the idle hand-over is reported to the MSC when the phone enters and leaves a reference cell.

Even if a vocational group with a high trip frequency were to be provided with probe phones, the probe phones would not always be moving. When a probe phone remains in a reference cell for a period, the duration becomes unrealistic in comparison with historical data and hence should be eliminated. When a probe phone is passing along a road located in a reference cell but outside the transportation network of concern, the erroneous result could be eliminated by comparing the current cell trajectory with predefined cell-based paths. When a probe phone shows up on a road that is parallel to the road of concern but not contained in the network, the erroneous result could be eliminated if the current path travel time of the probe phone deviates significantly from the historical average travel time. The thresholds for these judgements should be determined by examining field data. However, the above resolution is not perfect, especially in the latter case.
Further research employing advanced classification technologies is required to completely sort out this problem, and it will be carried out with actual field data. Advanced technologies for binary classification, such as the support vector machine (SVM) [18] and Bayesian filtering [19], could be used.

The cell coverage is confined to several hundred meters in an urban area. For instance, a personal communication service (PCS) operation in Seoul, South Korea, from which the test bed of this investigation was extracted, adopts a cell coverage of less than approximately 500 meters in radius. As mentioned earlier, new cell-based communication services with smaller cell dimensions will prevail in the near future. It is expected that the location accuracy will improve with these advanced services. Ultimately, traffic probing will end up with RFID (radio frequency identification), which can detect a vehicle's exact location anytime, anywhere. Therefore, the detection technology using the cell-based network, which is suggested in this investigation, could be a tentative solution before RFID is realized.
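The three elimination rules discussed in this section (unrealistic duration in a reference cell, a cell trajectory that matches no predefined cell-based path, and a path travel time far from the historical average) could be combined as in the following Python sketch. The function name, argument names, and threshold values are hypothetical; the section states only that the thresholds should be determined from field data.

```python
def is_valid_probe_trip(cell_sequence, dwell_times_s, path_travel_time_s,
                        known_paths, max_dwell_s, hist_mean_s, max_ratio):
    """Return True if a probe trip passes all three screening rules.

    cell_sequence      : ordered cell ids visited by the probe phone
    dwell_times_s      : time spent in each reference cell (seconds)
    path_travel_time_s : travel time over the whole path (seconds)
    known_paths        : set of tuples, the predefined cell-based paths
    max_dwell_s, max_ratio : illustrative thresholds (assumptions)
    hist_mean_s        : historical average path travel time (seconds)
    """
    # Rule 1: a phone that lingers in a reference cell is not travelling.
    if any(d > max_dwell_s for d in dwell_times_s):
        return False
    # Rule 2: the trajectory must match a predefined cell-based path.
    if tuple(cell_sequence) not in known_paths:
        return False
    # Rule 3: the travel time must stay close to the historical average.
    deviation = abs(path_travel_time_s - hist_mean_s) / hist_mean_s
    return deviation <= max_ratio

# Hypothetical usage with made-up cells and thresholds.
paths = {("c1", "c2", "c3")}
ok = is_valid_probe_trip(["c1", "c2", "c3"], [40.0, 55.0, 35.0], 130.0,
                         paths, max_dwell_s=300.0, hist_mean_s=120.0,
                         max_ratio=0.5)
```

As noted above, the third rule is the weakest one, since a parallel road may produce a similar travel time; a trained classifier would replace this simple threshold test.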
IV. MODELING FRAMEWORK
The present approach adopted a conventional state-space model for O-D flow estimation based on a time series of link volume counts. What matters most in the model is a reliable relationship between link traffic volume counts and O-D flows. Unlike a direct approach wherein sample O-D flows are derived from the cell-based location data and then applied to the model as additional observations, an indirect approach is proposed. In the proposed approach, the assignment map of the model was obtained through the passing time at observation locations and the path choice proportion, both of which are derived from cell phone probes. Using the cell boundary crossing time obtained from the mobile cellular network, the passing time at a certain point in the cell could be approximated. The path choice proportion could be derived from the cell-based trajectories of probe phones. The conventional model for O-D flow estimation is introduced briefly, and then the way the assignment map is constituted is described in detail.

A. Model Description
The model proposed by [2] was adopted as the basic model in the present investigation. It describes well the effect of historical O-D flows on the O-D estimation. Equations (1) and (2) are a typical form of state-space model consisting of two equations: the transition equation, indicating the autoregressive characteristics of the state variables (deviations of O-D flows from historical O-D flows), and the measurement equation, representing the relationship between observed traffic volume counts and O-D flows. The deviations, rather than the actual O-D flows, were adopted as state variables because both the seasonal variation and the systematic trend could be automatically incorporated into the historical O-D flows. Only short-term variations of dynamic O-D flows were taken into account in the model, whereas long-term variations such as seasonal and day-of-the-week variations were included in the historical O-D flows. The transition matrix implies the within-day dynamics, or evolutionary process, that the deviations of O-D flows follow. The use of deviations as state variables can also justify the assumption of normal distributions for the random terms in (1). It can be problematic to use the actual O-D flows as state variables directly, because the distribution of traffic flow is generally known to be skewed [2].

$$X_{h+1} = \sum_{t=h-p}^{h} F_t X_t + U_{h+1} \qquad (1)$$

$$Y_h = \sum_{t=h-q}^{h} A_t^h X_t + V_h \qquad (2)$$

where
$n$ = the number of O-D pairs;
$X_h$ = $n \times 1$ vector of state variables which represents deviations of actual O-D flows from historical O-D flows during time interval $h$ ($X_h = X_h^R - X_h^H$);
$X_h^R$ = $n \times 1$ vector representing the actual O-D flows during time interval $h$, the element of which is $x_{ih}$ ($X_h^R = (x_{1h}, x_{2h}, \ldots, x_{nh})^T$);
$x_{ih}$ = $i$th element (O-D flow) of vector $X_h^R$ representing the number of vehicles that departed their origin (O) during time interval $h$ towards their destination (D);
$X_h^H$ = $n \times 1$ vector representing the historical O-D flows during time interval $h$, the element of which is $x_{ih}^H$ ($X_h^H = (x_{1h}^H, x_{2h}^H, \ldots, x_{nh}^H)^T$);
$x_{ih}^H$ = $i$th element (O-D flow) of vector $X_h^H$ representing the historical O-D flows for time interval $h$;
$F_t$ = $n \times n$ transition matrix;
$p$ = autoregressive order of state variables;
$m$ = the number of traffic volume counting points;
$q$ = the maximum number of time intervals required for crossing the entire network;
$Y_h$ = $m \times 1$ measurement vector indicating the difference between the observed link traffic volumes and the link traffic volumes assigned from historical O-D flows ($Y_h = Y_h^R - \sum_{t=h-q}^{h} A_t^h X_t^H$);
$Y_h^R$ = $m \times 1$ vector of observed traffic volumes;
$A_t^h$ = $m \times n$ matrix of the assignment map indicating the relationship between link traffic volumes during time interval $h$ and the O-D flows departing during time interval $t$ ($t \le h$);
$\alpha_{it}^{jh}$ = the element of $A_t^h$ representing the fraction of $x_{it}$ that departed its origin during time interval $t$ ($t \le h$) and crossed the counting point on link $j$ during time interval $h$;
$U_h$ = $n \times 1$ vector of random variables for time interval $h$;
$V_h$ = $m \times 1$ vector of random variables for time interval $h$.

The random terms in (1) and (2) are assumed to be white noise complying with (3). The assumption of no serial correlation in the transition equation can be justified by the fact that the unobserved factors correlated over time are captured by the historical O-D flows.

$$E(U_h) = 0, \quad E(V_h) = 0, \quad E(U_h U_l^T) = P_h \delta_{hl}, \quad E(V_h V_l^T) = Q_h \delta_{hl} \qquad (3)$$

where
$\delta_{hl}$ = 0 if $h \ne l$, and 1 otherwise;
$P_h$ = $n \times n$ variance-covariance matrix for time interval $h$, whose diagonal elements are the variances of O-D flow deviations and whose other elements are the covariances between each pair of O-D flow deviations;
$Q_h$ = $m \times m$ variance-covariance matrix for time interval $h$, whose diagonal elements are the variances of link traffic volume counts and whose other elements are the covariances between each pair of the volume counts.

In order to apply the GLS and Kalman filter algorithms directly to (1) and (2), the equations should be converted into state-augmented forms as proposed by [1]. The post-augmentation formulations have been explained in detail by [1], [2], and [20]. The state-augmented form of the equations has more variables and thus increases the computational burden. A simple model could be adopted to alleviate the computational burden at the expense of estimation accuracy [2], [8]. To develop a more simplified model, the first term on the right-hand side of (1) and (2) was altered: the components prior to time interval $h$ are treated as constants by substituting previous estimates for the variables of preceding time intervals. Thus, the simple model has fewer variables than the basic model in augmented form. The results from the simple model are included in this investigation.

$$X_{h+1} = F_h X_h + \sum_{t=h-p}^{h-1} F_t \hat{X}_t + U_{h+1} \qquad (4)$$

$$Y_h = A_h^h X_h + \sum_{t=h-q}^{h-1} A_t^h \hat{X}_t + V_h \qquad (5)$$
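The deterministic parts of (4) and (5) can be illustrated with a small Python sketch. It is a minimal illustration under stated assumptions, not the estimation algorithm itself: the random terms are omitted, the helper names (`mat_vec`, `accumulate`, `predict_state`, `predict_measurement`) and all numerical values are hypothetical, and matrices are plain nested lists to keep the example self-contained.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def accumulate(base, mats, vecs):
    """Add sum_t mats[t] @ vecs[t] to the vector base."""
    total = list(base)
    for M, v in zip(mats, vecs):
        total = [s + c for s, c in zip(total, mat_vec(M, v))]
    return total

def predict_state(F_h, x_h, F_prev, xhat_prev):
    """Deterministic part of (4): F_h X_h plus the constant term
    built from previously estimated deviations of earlier intervals."""
    return accumulate(mat_vec(F_h, x_h), F_prev, xhat_prev)

def predict_measurement(A_hh, x_h, A_prev, xhat_prev):
    """Deterministic part of (5): A_h^h X_h plus the contribution of
    earlier departures, again treated as known constants."""
    return accumulate(mat_vec(A_hh, x_h), A_prev, xhat_prev)

# Hypothetical example: n = 2 O-D pairs, m = 1 counting point,
# one preceding interval (p = q = 1).
x_h = [4.0, 2.0]                      # current deviations (unknowns)
xhat_prev = [[4.0, 8.0]]              # deviations already estimated
F_h = [[0.5, 0.0], [0.0, 0.5]]
F_prev = [[[0.25, 0.0], [0.0, 0.25]]]
A_hh = [[0.5, 0.5]]
A_prev = [[[0.5, 0.25]]]

x_next = predict_state(F_h, x_h, F_prev, xhat_prev)        # [3.0, 3.0]
y_h = predict_measurement(A_hh, x_h, A_prev, xhat_prev)    # [7.0]
```

Because the terms for earlier intervals are constants, only the current deviation vector remains unknown in each interval, which is the source of the reduced computational burden mentioned above.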
In (4) and (5), $\hat{X}_t = \tilde{X}_t - X_t^H$, where $\tilde{X}_t$ denotes the estimates of O-D flows for prior time interval $t$ ($t \le h$). Representative algorithms such as the GLS and Kalman filter were used to solve the model. Various reports dealing with dynamic O-D flow estimation, such as [1]-[8], have employed these algorithms and verified their performance based on synthetic or real data.

B. Derivation of Assignment Map
The assignment map was considered to be exogenously provided in the present model. Several methods are available for constituting an assignment map; however, only [21] derived an assignment map analytically using the passing time at reference points and the path choice proportion. The method is as follows:

$$\alpha_{it}^{jh} = \sum_{\forall \kappa \in K_i} \tilde{\alpha}_{\kappa t}^{jh} \, q_{\kappa t} \qquad (6)$$

$\tilde{\alpha}_{\kappa t}^{jh} = 1$, if $(h-1)H < \eta_{\kappa t}^{\prime j} < \eta_{\kappa t}^{\prime\prime j} < hH$
$\tilde{\alpha}_{\kappa t}^{jh} = (hH - \eta_{\kappa t}^{\prime j}) / (\eta_{\kappa t}^{\prime\prime j} - \eta_{\kappa t}^{\prime j})$, if $(h-1)H < \eta_{\kappa t}^{\prime j} < hH$