

Ensemble of Multiple Pedestrian Representations

Loris Nanni and Alessandra Lumini

Abstract—In this paper, a new approach for pedestrian detection is presented. We design an ensemble of classifiers that employs different feature representations of the pedestrian images: Laplacian EigenMaps, Gabor filters, and invariant local binary patterns. Each ensemble is obtained by varying the patterns used to train the classifiers and by extracting, for each feature extraction method, two feature vectors from each image: one for the upper part of the image and one for the lower part. A different radial basis function support vector machine (SVM) classifier is trained on each feature vector; finally, these classifiers are combined by the “sum rule.” Experiments are performed on a large data set consisting of 4000 pedestrian and more than 25 000 nonpedestrian images captured in outdoor urban environments. The experimental results confirm that the different feature representations provide complementary information, which is exploited by the fusion rule, and that our method outperforms state-of-the-art pedestrian detectors.

Index Terms—Gabor filters, invariant local binary patterns (LBPs), Laplacian EigenMaps (LEMs), multiclassifier, pedestrian detection.


I. INTRODUCTION

In the European Union, pedestrian accidents represent the second largest source of traffic-related injuries. Over the past few years, several pedestrian classification approaches have been proposed.
1) A feedforward neural network with local receptive fields is trained in [11] using gray-level images.
2) A fully connected feedforward neural network is trained in [12] using high-pass filtered images.
3) Reference [13] is one of the first papers to study overcomplete sets of Haar wavelet features; these features are used to train a support vector machine (SVM) classifier. This approach was adapted by Elzein et al. [14] and others.
4) To reduce the computation time, Viola et al. [15] proposed a detector cascade, where simpler detectors are placed earlier in the cascade and more complex detectors are placed later.
5) Reference [16] proposes a component-based approach as a way to reduce the complexity of pedestrian detection; it extracts a feature vector from each of nine fixed subregions.
6) Mohan et al. [17] extend [13] and use four classifiers for detecting the head, legs, and left/right arms of the pedestrian. The four results are combined by a classifier after enforcing proper geometrical constraints.
7) In [18], the authors show that a novel combination of SVMs with local receptive field features performs better than several other methods (e.g., Haar wavelets [13] and the detector cascade [15]).

In the literature, it has been shown that the use of infrared images can facilitate the recognition process [20], [23], and interesting works based on infrared cameras have recently been proposed (e.g., [21] and [22]). In [19], the authors present a pedestrian-detection method that employs far-infrared images; it is based on the localization of warm symmetrical objects with a specific aspect ratio and size. Obviously, the results obtained by a pedestrian detector based on warm symmetrical object detection (e.g., [19]) and a pedestrian detector based on artificial vision techniques (e.g., [18] and this paper) may be combined to further reduce the errors.

Several works have shown that a classification method based on a single image representation does not perform as well as a method where several image representations are combined [28]. For example, a personal authenticator based on palm images has been proposed [27] in which three feature vectors are extracted from each palm image; three different matchers are trained, each using a different feature vector, and then combined at the score level.

In this paper, we present a component-based pedestrian detector. The images are divided into two parts: the upper and the lower. Then, for each subimage, several feature vectors are extracted, and a different radial basis function SVM¹ is trained for each feature vector. Finally, these classifiers are combined by the “sum rule” [2]. Moreover, to reduce the computational time, each classifier is trained using a different subset of the training images.

Manuscript received November 23, 2006; revised April 29, 2007, July 31, 2007, and November 26, 2007. The Associate Editor for this paper was H. Takahashi. The authors are with the Department of Electronics, Computer Sciences, and Systems (DEIS), University of Bologna, 40136 Bologna, Italy (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TITS.2008.922882

II. PROPOSED SYSTEM

The block diagram of the proposed pedestrian detector is shown in Fig. 1. First, the images are normalized to reduce lighting effects using the method proposed in [4]. Then, each image is divided into two subimages: the lower and upper parts. From each part, we extract several feature vectors, which are used to train a pool of SVM classifiers combined by the sum rule. Considering the similarity score estimated by each classifier, the sum rule takes, as the final score, the sum of the scores of the two classifiers [2]. We obtain the best performance by combining the SVMs trained on the Laplacian EigenMap (LEM) features and on the invariant local binary patterns (LBPs) extracted from the images convolved with the Gabor filters [Gabor local binary patterns (GLB)] [29]. Several feature extraction techniques have been tested and compared; the experimental results (reported in Table II) show that LEMs outperform principal component analysis (PCA), the discrete cosine transform (DCT), independent component analysis (ICA), Gabor filters, and the invariant LBPs extracted from the images convolved with the Gabor filters. In Fig. 2, we show a pedestrian image and a nonpedestrian image together with their features extracted by LEM projection and by the invariant LBPs extracted from the images convolved with the Gabor filters.

¹Implemented as in the OSU SVM Matlab Toolbox; the parameters used in all the tests are C = 1000 and gamma = 1.
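For concreteness, the following is a minimal sketch of the part-based training and sum-rule fusion just described, written with scikit-learn rather than the OSU SVM Matlab toolbox used by the authors; the feature arrays and the use of the SVM decision value as the similarity score are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of part-based RBF-SVM training and sum-rule fusion:
# one SVM per image half, scores added at test time.
# Assumptions: X_upper/X_lower are (n_samples, n_features) arrays of features
# already extracted from the upper/lower halves; y holds 0/1 labels.
import numpy as np
from sklearn.svm import SVC

def train_part_classifiers(X_upper, X_lower, y):
    # Parameters C = 1000 and gamma = 1 are those reported in footnote 1.
    clf_up = SVC(kernel="rbf", C=1000, gamma=1).fit(X_upper, y)
    clf_lo = SVC(kernel="rbf", C=1000, gamma=1).fit(X_lower, y)
    return clf_up, clf_lo

def sum_rule_score(classifiers, feature_sets):
    # Sum rule: the final similarity score is the sum of the SVM scores.
    scores = [clf.decision_function(X) for clf, X in zip(classifiers, feature_sets)]
    return np.sum(scores, axis=0)
```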



Fig. 1. Our system.

Fig. 3. Original images. (Left) Pedestrian. (Right) Nonpedestrian.

A. Feature Extraction: Invariant LBPs of Gabor-Filtered Images²

The images are first convolved with a bank of 16 Gabor filters [8] with different wavelengths (standard deviation σ = 10, 20, 30, and 40) and orientations (0, π/4, π/2, and 3π/4). From the images resulting from the convolution with the Gabor filters, we calculate the LBPs. The LBP is a grayscale-invariant texture measure obtained as a binary code computed by thresholding the neighborhood of a pixel with the gray value of its center. This feature extraction method has been shown [9] to be invariant to grayscale variations and capable of discriminating a large range of rotated textures (see [9] or the Matlab code² for details). In this paper, we extract an invariant LBP histogram of ten bins from each image convolved with the Gabor filters.
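The following sketch shows one way to compute this kind of descriptor with scikit-image; the mapping of the reported σ values to the filter frequency, the use of the real part of the response, and the 8-bit rescaling are our own assumptions for illustration, not the authors' exact settings.

```python
# Illustrative sketch: 16 Gabor responses (4 sigmas x 4 orientations), each
# summarized by a 10-bin rotation-invariant uniform LBP histogram.
import numpy as np
from skimage.filters import gabor
from skimage.feature import local_binary_pattern

def gabor_lbp_features(image, sigmas=(10, 20, 30, 40),
                       thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    feats = []
    for s in sigmas:
        for t in thetas:
            # Assumption: frequency is tied to sigma as 1/s (the paper reports
            # only sigma); the real part of the complex response is kept.
            real, _ = gabor(image, frequency=1.0 / s, theta=t,
                            sigma_x=s, sigma_y=s)
            # Rescale to 8 bits before the LBP computation.
            resp = real - real.min()
            resp = np.uint8(255 * resp / (resp.max() + 1e-12))
            # 'uniform' = rotation-invariant uniform LBP with P + 2 = 10 codes.
            lbp = local_binary_pattern(resp, P=8, R=1, method="uniform")
            hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
            feats.append(hist)
    return np.concatenate(feats)  # 16 filters x 10 bins per subimage
```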

Fig. 2. Original images, their LEM projections, and their invariant LBPs extracted from the images convolved with the Gabor filters.

B. Feature Extraction: LEMs³ [5]

LEMs are a nonlinear dimensionality reduction approach aimed at constructing a representation for data lying on a low-dimensional manifold embedded in a high-dimensional space. The basic idea of the LEM algorithm is to use a local similarity measure to construct a graph in which each node corresponds to a high-dimensional data point and the edges represent connections to neighboring data points (see [5] for a detailed explanation). In this paper, we use LEM to project each image onto a lower 100-D space.
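Below is a minimal sketch of such a projection using scikit-learn's SpectralEmbedding, which implements Laplacian Eigenmaps; the neighborhood size is an assumed value, and, unlike the setup in the paper, this estimator embeds only the fitted samples, so an out-of-sample extension (e.g., a learned regression) would be needed for test images in practice.

```python
# Sketch: Laplacian Eigenmaps (LEM) projection of vectorized images to 100-D.
# Assumptions: X is an (n_images, n_pixels) float array; n_neighbors = 10.
import numpy as np
from sklearn.manifold import SpectralEmbedding

def lem_project(X, n_components=100, n_neighbors=10):
    """Embed the training images on a 100-D manifold via Laplacian Eigenmaps."""
    lem = SpectralEmbedding(n_components=n_components,
                            affinity="nearest_neighbors",
                            n_neighbors=n_neighbors)
    return lem.fit_transform(X)   # (n_images, 100) low-dimensional features
```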

III. EXPERIMENTAL RESULTS

For our experiments, we use the difficult DaimlerChrysler data set used in [18], which can be downloaded from http://www.science.uva.nl/research/isla/dc-ped-class-benchmark.html. The nonpedestrian examples of this data set are extracted from video images that do not contain any pedestrians and also include those images for which the shape-based pedestrian detector [24] produced a low-confidence match. A description of the detection component used to build the DaimlerChrysler data set can be found in [18] and [31]. Fig. 3 shows some examples of pedestrian (left) and nonpedestrian (right) images from the data set used in this paper.

The whole data set is divided into five fully disjoint sets: three for training and two for testing. Each set contains 4800 pedestrian examples and 5000 nonpedestrian examples. For each experiment, three different training sets are generated, each by selecting two out of the three training sets. Testing all three combinations of training sets on both test sets yields six different receiver operating characteristic (ROC) curves. The ROC curve quantifies the tradeoff between the detection rate (the percentage of positive examples correctly classified) and the false positive rate (the percentage of negative examples incorrectly classified). The classification performance indicator used in this paper is the error under the ROC curve [area under curve (AUC)]. To reduce the computation time, we train a different classifier for each of the two training sets, and for each training set, we train one classifier for each of the two feature vectors extracted (related to the upper and lower parts of the images).
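As an illustration of the performance indicator, the sketch below computes the error under the ROC curve (1 − AUC) and the detection rate at a given false alarm rate for one train/test combination; the label and score arrays are placeholders, not the paper's evaluation code.

```python
# Sketch: error under the ROC curve (1 - AUC), the indicator used in the
# paper, for one of the six train/test combinations.
# Assumptions: y_true holds 0/1 labels, fused_scores are sum-rule outputs.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def roc_error(y_true, fused_scores):
    auc = roc_auc_score(y_true, fused_scores)   # area under the ROC curve
    return 1.0 - auc                            # "error under the ROC curve"

def detection_rate_at_fa(y_true, fused_scores, fa_target=0.05):
    """Detection rate at a given false-alarm rate (e.g., 5%)."""
    fpr, tpr, _ = roc_curve(y_true, fused_scores)
    return np.interp(fa_target, fpr, tpr)
```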

²The Matlab implementation is available at the Outex site (http://www.outex.oulu.fi).

³Matlab code shared by M. Belkin and P. Niyogi (http://www.cs.uchicago.edu/research/publications/techreports/[email protected]).



TABLE I COMPARISON AMONG SEVERAL METHODS TESTED IN THIS PAPER

In Table I, we compare the performance of our new approach for pedestrian detection, i.e., fusion (FUS), with other methods proposed in the literature.
• PC: a standard application of PCA on the whole image, where a single SVM classifier is trained;
• LE: application of LEMs on the whole image, where a single SVM classifier is trained;
• GL: application of the invariant LBPs extracted from the images convolved with the Gabor filters on the whole image, where a single SVM classifier is trained;
• HAAR: SVMs trained using the overcomplete dictionary of Haar wavelets [13];
• LRF: SVMs trained using local receptive field features (the best method proposed in [18]);
• M-PC: a simpler version of our ensemble where only the PCA feature transform is used for feature extraction;
• M-LE: a simpler version of our ensemble where only the LEM feature transform (our best feature-extraction method) is used for feature extraction;
• M-GLB: a simpler version of our ensemble where only the invariant LBPs extracted from the images convolved with the Gabor filters are used for feature extraction;
• FUS: our complete approach, consisting of the sum-rule fusion of the SVMs trained on the LEM features and of those trained on the invariant LBPs extracted from the images convolved with the Gabor filters;
• G-FUS: a fusion between the methods LE and GL.

In our opinion, it is very interesting to note that the AUC obtained by the ensemble based on PCA (M-PC) is considerably lower than that of a single classifier trained on PCA features (PC). This result confirms that subdividing the image into an upper and a lower subwindow is useful for this classification problem. Moreover, adopting two different feature-extraction methods allows our ensemble to also outperform the best method proposed in the literature [18].

To confirm the benefit of our method, the detection error tradeoff (DET) curve has also been plotted. The DET curve [26] is a 2-D measure of classification performance that plots the miss probability against the false alarm probability. In Fig. 4, the DET curves obtained by M-LE (our best single feature-extraction method) and FUS are plotted. From this plot, it can be noted that FUS obtains a detection rate of 90% (hence, a miss probability of 10%) at a false alarm rate of 5%. In [18], the best method obtains a detection rate of 90% at a false alarm rate of 10%; to reach a detection rate of 90% at a false alarm rate of 5%, Munder and Gavrila [18] had to increase the training set size with a bootstrap strategy. The resulting ROC curves are given in Fig. 5 for LRF and FUS. From this figure, it is clear that FUS outperforms LRF.

In Table II, we report the performance obtained by our ensemble trained on different pairs of extracted features, chosen among the following: DCT, PCA, LEMs, ICA, Gabor filters (GAB), and the invariant LBPs extracted from the images convolved with the Gabor filters (GLB). From Table II, it is clear that three combinations (DCT+GLB, PCA+GLB, and LEM+GLB) outperform the LRF method [18].
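A minimal sketch of how such a DET curve can be traced from the fused scores follows; the arrays are placeholders, and the log axes are the usual convention rather than necessarily the paper's plotting choice.

```python
# Sketch: DET curve (miss probability vs. false alarm probability) from scores.
# Assumptions: y_true holds 0/1 labels, fused_scores are the sum-rule outputs.
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

def plot_det(y_true, fused_scores):
    fpr, tpr, _ = roc_curve(y_true, fused_scores)
    miss = 1.0 - tpr                       # miss probability = 1 - detection rate
    plt.plot(fpr, miss)
    plt.xlabel("False alarm probability")
    plt.ylabel("Miss probability")
    plt.xscale("log"); plt.yscale("log")   # DET curves are usually on log axes
    plt.show()
```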

Fig. 4. DET curve obtained by (black line) FUS and (red line) M-LE.

Fig. 5. ROC curves of LRF and FUS.

TABLE II AUC OBTAINED BY THE FUSION BETWEEN TWO FEATURE-EXTRACTION METHODS

As a further experiment, we investigated the relationship (i.e., the error independence) between the classifiers trained using the different feature-extraction methods. Error independence is an essential property when combining classifiers [30]. In this paper, the independence of the classifiers is measured using Yule's Q-statistic [30], which varies between −1 and 1; for two statistically independent classifiers i and k, Q_{i,k} = 0. It is very interesting to note that the Q-statistic between GLB and the other methods is very low. For this reason, the fusion of GLB with the other methods permits obtaining a low AUC (Table III).

In Fig. 6, we show two examples of pedestrians classified as nonpedestrians and two examples of nonpedestrians classified as pedestrians. From these images, it is clear that an image-based method that works on low-resolution images cannot obtain an AUC ∼ 0.
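For reference, Yule's Q-statistic between two classifiers i and k can be computed from their per-sample correctness as sketched below; the formula follows [30], and the variable names are illustrative.

```python
# Sketch: Yule's Q-statistic from two classifiers' correctness vectors
# (True where the classifier labels a sample correctly).
import numpy as np

def yule_q(correct_i, correct_k):
    a = np.asarray(correct_i, dtype=bool)
    b = np.asarray(correct_k, dtype=bool)
    n11 = np.sum(a & b)       # both correct
    n00 = np.sum(~a & ~b)     # both wrong
    n10 = np.sum(a & ~b)      # only classifier i correct
    n01 = np.sum(~a & b)      # only classifier k correct
    num = float(n11 * n00 - n01 * n10)
    den = float(n11 * n00 + n01 * n10)
    return num / den if den else 0.0   # Q ~ 0 means nearly independent errors
```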


TABLE III Q-STATISTIC BETWEEN THE CLASSIFIERS TRAINED USING THE DIFFERENT FEATURE-EXTRACTION METHODS

Fig. 6. (a) Pedestrian classified as nonpedestrian. (b) Nonpedestrian classified as pedestrian.

In our opinion, to obtain an AUC ∼ 0, we would need to combine standard image-based methods with methods based on infrared images [23]. Finally, we tried to improve the performance of our fusion method by combining three or four feature-extraction methods. Our tests show that increasing the number of combined feature-extraction methods does not improve the performance of the system: the best combination of three feature-extraction methods obtains an AUC of 2.15, whereas the best combination of four feature-extraction methods obtains an AUC of 2.4. We also compared the testing time of PC (the standard method based on the whole image) and FUS: PC classifies a test image in 0.03 s, whereas FUS classifies a test image in 0.18 s. These results were obtained on a P4 2-GHz computer with 1 GB of random access memory running Matlab 7.0.

IV. CONCLUSION

The objective of this paper was to investigate the integration of pedestrian classifiers to achieve performance that may not be achievable with a stand-alone classifier. This paper has suggested a new pedestrian-detection method based on the combination of multiple pedestrian representations. The main contribution is a new method for creating an ensemble of pedestrian classifiers, which has been tested and validated on a difficult data set whose nonpedestrian examples include images matched as pedestrians by a shape-based pedestrian detector. As future work, we plan to study bootstrap methods for increasing the training set size and the performance improvement that a larger training set may yield.

REFERENCES

[1] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2000.
[2] J. Kittler, M. Hatef, R. Duin, and J. Matas, “On combining classifiers,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 3, pp. 226–239, Mar. 1998.
[3] K. Baek, B. Draper, R. Beveridge, and K. She, “PCA vs ICA: A comparison on the FERET data set,” in Proc. Joint Conf. Inf. Sci., Durham, NC, Mar. 2002, pp. 824–827.
[4] T. Connie, A. T. B. Jin, M. G. K. Ong, and D. N. C. Ling, “An automated palmprint recognition system,” Image Vis. Comput., vol. 23, no. 5, pp. 501–515, May 2005.

[5] X. He, S. Yan, Y. Hu, P. Niyogi, and H.-J. Zhang, “Face recognition using Laplacianfaces,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 3, pp. 328–340, Mar. 2005.
[6] A. Lumini and L. Nanni, “Combining classifiers to obtain a reliable method for face recognition,” Multimed. Cyberscape J., vol. 3, no. 3, pp. 47–53, Jun. 2005.
[7] M. S. Bartlett, H. M. Lades, and T. J. Sejnowski, “Independent component representations for face recognition,” in Proc. SPIE Conf. Hum. Vis. Electron. Imag. III, San Jose, CA, 1998, pp. 528–539.
[8] J. G. Daugman, “Complete discrete 2-D Gabor transforms by neural networks for image-analysis and compression,” IEEE Trans. Acoust., Speech, Signal Process., vol. 36, no. 7, pp. 1169–1179, Jul. 1988.
[9] T. Ojala, M. Pietikainen, and T. Maenpaa, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 971–987, Jul. 2002.
[10] Z. Pan, A. Rust, and H. Bolouri, “Image redundancy reduction for neural network classification using discrete cosine transforms,” in Proc. Int. Joint Conf. Neural Netw., Como, Italy, Jul. 24–27, 2000, vol. 3, pp. 149–154.
[11] C. Wohler and J. Anlauf, “An adaptable time-delay neural-network algorithm for image sequence analysis,” IEEE Trans. Neural Netw., vol. 10, no. 6, pp. 1531–1536, Nov. 1999.
[12] L. Zhao and C. Thorpe, “Stereo- and neural network-based pedestrian detection,” IEEE Trans. Intell. Transp. Syst., vol. 1, no. 3, pp. 148–154, Sep. 2000.
[13] C. Papageorgiou and T. Poggio, “A trainable system for object detection,” Int. J. Comput. Vis., vol. 38, no. 1, pp. 15–33, Sep. 2000.
[14] H. Elzein, S. Lakshmanan, and P. Watta, “A motion and shape-based pedestrian detection algorithm,” in Proc. IEEE Intell. Veh. Symp., 2003, pp. 500–504.
[15] P. Viola, M. Jones, and D. Snow, “Detecting pedestrians using patterns of motion and appearance,” in Proc. Int. Conf. Comput. Vis., 2003, pp. 734–741.
[16] A. Shashua, Y. Gdalyahu, and G. Hayun, “Pedestrian detection for driving assistance systems: Single-frame classification and system level performance,” in Proc. IEEE Intell. Veh. Symp., 2004, pp. 1–6.
[17] A. Mohan, C. Papageorgiou, and T. Poggio, “Example-based object detection in images by components,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 4, pp. 349–361, Apr. 2001.
[18] S. Munder and D. M. Gavrila, “An experimental study on pedestrian classification,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 11, pp. 1863–1868, Nov. 2006.
[19] M. Bertozzi, A. Broggi, A. Fascioli, T. Graf, and M.-M. Meinecke, “Pedestrian detection for driver assistance using multiresolution infrared vision,” IEEE Trans. Veh. Technol., vol. 53, no. 6, pp. 1666–1678, Nov. 2004.
[20] H. Nanda and L. Davis, “Probabilistic template based pedestrian detection in infrared videos,” in Proc. IEEE Intell. Veh. Symp., Paris, France, Jun. 2002, pp. 15–20.
[21] Y. L. Guilloux and J. Lonnoy, “PAROTO project: The benefit of infrared imagery for obstacle avoidance,” in Proc. IEEE Intell. Veh. Symp., Paris, France, Jun. 2002, pp. 81–86.
[22] Y. Fang, K. Yamada, Y. Ninomiya, B. Horn, and I. Masaki, “Comparison between infrared-image-based and visible-image-based approaches for pedestrian detection,” in Proc. IEEE Intell. Veh. Symp., Columbus, OH, Jun. 2003, pp. 505–510.
[23] M. Bertozzi, A. Broggi, T. Graf, P. Grisleri, and M. Meinecke, “Pedestrian detection in infrared images,” in Proc. IEEE Intell. Veh. Symp., Columbus, OH, Jun. 2003, pp. 662–667.
[24] D. M. Gavrila and V. Philomin, “Real-time object detection for ‘Smart’ vehicles,” in Proc. Int. Conf. Comput. Vis., 1999, pp. 87–93.
[25] G. Borgefors, “Distance transformations in digital images,” Comput. Vis. Graph. Image Process., vol. 34, no. 3, pp. 344–371, Jun. 1986.
[26] A. Martin et al., “The DET curve in assessment of decision task performance,” in Proc. EuroSpeech, 1997, pp. 1895–1898.
[27] A. Kong, D. Zhang, and M. Kamel, “Palmprint identification using feature-level fusion,” Pattern Recognit., vol. 39, no. 3, pp. 478–487, Mar. 2006.
[28] A. Lumini and L. Nanni, “Detector of image orientation based on Borda count,” Pattern Recognit. Lett., vol. 27, no. 3, pp. 180–186, Feb. 2006.
[29] W. Zhang, S. Shan, W. Gao, X. Chen, and H. Zhang, “Local Gabor binary pattern histogram sequence (LGBPHS): A novel non-statistical model for face representation and recognition,” in Proc. 10th IEEE Int. Conf. Comput. Vis., Oct. 17–21, 2005, vol. 1, pp. 786–791.


[30] L. I. Kuncheva and C. J. Whitaker, “Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy,” Mach. Learn., vol. 51, no. 2, pp. 181–207, May 2003.
[31] [Online]. Available: http://www.gavrila.net/Computer_Vision/Research/Pedestrian_Detection/pedestrian_detection.html

A Sampling Theorem Approach to Traffic Sensor Optimization

W. L. Leow, Daiheng Ni, and Hossein Pishro-Nik

Abstract—With the objective of minimizing the total cost, which includes both sensor and congestion costs, the authors adopted a novel sampling theorem approach to address the problem of sensor spacing optimization. This paper presents the analysis and modeling of the power spectral density of traffic information as a 2-D stochastic signal using highly detailed field data captured by the Next Generation SIMulation (NGSIM) program in 2005. To the best of the authors' knowledge, field data with such a level of detail were previously unavailable. The resulting model enables the derivation of a characterization curve that relates sensor error to sensor spacing. The characterization curve, which concurs in general with the observations of a previous work, provides much more detail to facilitate sensor deployment. Based on the characterization curve and a formulation relating sensor error to congestion cost, the optimal sensor spacing that minimizes the total cost can be determined.

Index Terms—Sampling theorem, sensor optimization, spectral domain analysis, traffic congestion, traffic sensing.

I. INTRODUCTION

The United States has 47 000 mi (75 640 km) of Interstate Highways [1]. If half of these roads were monitored by traffic sensors placed every one third of a mile, as is typically adopted in practice, approximately 70 000 sensors would be required. Assuming that each sensor has a lifetime cost of $30 000 [2], the total cost would amount to about $2 billion. If, somehow, one determines that 20% of the sensors are unnecessary, in the sense that their existence does not provide additional information, a savings of roughly $400 million is expected.

Little research has been conducted on the subject of optimal traffic sensor deployment or on the potential savings from such optimization. Current practice is mainly based on rudimentary studies or none at all. For example, Georgia Navigator [3], which is Georgia's Intelligent Transportation System (ITS), installs sensors every one third of a mile along its major highways. The rationale is that this is the distance that a vehicle traverses, assuming an average traffic speed of 60 mi/h (96.6 km/h), over a data aggregation interval of 20 s. Therefore, the strategic problem that this paper attempts to address is traffic sensing optimization, including optimizing sensor deployment strategies and minimizing the uncertainty of the system of interest. To achieve this goal, we sought an interdisciplinary collaboration and developed an analytical approach based on the sampling theorem.

Manuscript received January 29, 2007; revised April 29, 2007, June 19, 2007, November 29, 2007, and December 24, 2007. This work was supported in part by the University of Massachusetts, Amherst, under Grant A090900055. The Associate Editor for this paper was D. Srinivasan. W. L. Leow and H. Pishro-Nik are with the Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA 01003 USA (e-mail: [email protected]; [email protected]). D. Ni is with the Department of Civil and Environmental Engineering, University of Massachusetts, Amherst, MA 01003 USA (e-mail: [email protected]). Digital Object Identifier 10.1109/TITS.2008.922925
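The cost figures quoted above follow from simple arithmetic, reproduced in the short sketch below; the rounding to 70 000 sensors, $2 billion, and $400 million matches the text, and the exact outputs are only approximations.

```python
# Back-of-the-envelope check of the deployment-cost figures quoted above.
interstate_miles = 47_000
monitored_miles = interstate_miles / 2          # half of the network
sensors = monitored_miles * 3                   # one sensor every 1/3 mile
total_cost = sensors * 30_000                   # $30,000 lifetime cost each
savings = 0.20 * total_cost                     # if 20% of sensors are redundant

print(f"{sensors:,.0f} sensors")                # ~70,500  (quoted as ~70,000)
print(f"${total_cost / 1e9:.2f} billion total") # ~$2.12B  (quoted as ~$2B)
print(f"${savings / 1e6:.0f} million savings")  # ~$423M   (quoted as ~$400M)
```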


In this approach, traffic information such as flow, speed, and density is obtained from actual detailed vehicle trajectories collected by Cambridge Systematics, Inc., under the auspices of the Next Generation SIMulation (NGSIM) program [4]. The NGSIM program has collected data sets of vehicle trajectories from live traffic video footage; advances in technology have enabled the program to capture vehicle trajectories at a level of detail that was previously impossible. Six data sets, each containing detailed trajectories of automobiles traveling on two different freeways under real-life driving conditions during the morning and evening peak periods, were used in this paper. The morning and evening peak periods correspond to the times of day when traffic information is of greatest importance; this paper therefore focuses on the times that matter most to traffic management rather than on worst-case scenarios or extreme cases.

Signal processing techniques and the multidimensional sampling theorem are used to glean an understanding of traffic information, which is treated as a 2-D stochastic signal in the space–time domain. A model of the power spectral density (PSD) of traffic information as a 2-D stochastic signal is then derived. Using the derived model, we attempt to determine a characterization curve that relates sensor spacing to the sensor error. With this relationship, we determine the optimal sensor spacing. Our contributions include the following:
1) a novel sampling theorem approach to address the optimal sensor deployment problem;
2) an analytical model of the PSD of traffic information as a 2-D stochastic signal;
3) the normalized mean-square error (NMSE) and sensor spacing characterization curve;
4) a procedure and formulation to determine the optimal sensor spacing that minimizes the total cost.

The next section provides a review of related work. Section III details the spectral characteristics and the modeling of traffic information and presents the NMSE and sensor spacing characterization curve. Section IV discusses the optimization of sensor spacing. This is followed by a conclusion.
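To make the spectral treatment concrete, the following is a minimal sketch, under our own simplifying assumptions rather than the authors' estimation procedure, of how the 2-D PSD of a space–time speed field built from trajectory data could be estimated with a periodogram.

```python
# Sketch: periodogram estimate of the 2-D PSD of a space-time traffic field.
# Assumptions: `speed` is an (n_space, n_time) array of mean speeds on a grid
# with spacing dx (ft) and dt (s), built elsewhere from NGSIM trajectories.
import numpy as np

def psd_2d(speed, dx, dt):
    field = speed - speed.mean()                    # remove the DC component
    spec = np.fft.fftshift(np.fft.fft2(field))      # 2-D DFT, centered
    psd = (np.abs(spec) ** 2) / field.size          # periodogram estimate
    f_space = np.fft.fftshift(np.fft.fftfreq(field.shape[0], d=dx))  # cycles/ft
    f_time = np.fft.fftshift(np.fft.fftfreq(field.shape[1], d=dt))   # cycles/s
    return psd, f_space, f_time
```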
II. REVIEW OF RELATED WORK

In 1979, the Federal Highway Administration published a guideline for locating freeway sensors [5] based on empirical studies. The report concluded the following: 1) any sensor spacing below 1000 ft (304.8 m) generally produces relatively little or no increase in effectiveness; 2) sensor spacing over 2500 ft (762 m) produces unsatisfactory performance according to the criteria defined in the report; and 3) there exists a cost-effectiveness tradeoff for sensor spacings between 1000 ft (304.8 m) and 2500 ft (762 m). This report is similar to our research in that both address the sensor optimization problem and lead to consistent findings. The differences between the two are as follows: 1) the report is based on empirical studies, whereas our approach is analytical; 2) the report employed a different set of criteria than ours to evaluate sensor deployment strategies; and 3) the report gives categorical recommendations, whereas our approach yields more detail over the whole spectrum of sensor spacings, which might be of greater interest to practitioners.

A few studies [6]–[9] addressed the sensor location problem in a traffic system, i.e., the minimum number of sensors and their associated locations needed to facilitate the estimation of origin–destination (O–D) matrices. These studies differ from our research in several ways. First, the objective of these studies was to estimate O–D
