This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS


Multidimensional Compression of ITS Data Using Wavelet-Based Compression Techniques

Shaurya Agarwal, Student Member, IEEE, Emma E. Regentova, Member, IEEE, Pushkin Kachroo, Senior Member, IEEE, and Himanshu Verma

Abstract: This paper explores the use of wavelet transform-based methods for ITS data compression. A methodology for structuring data and applying wavelet transform-based algorithms is proposed. The methodology provides the option of controlling the compression ratio at the cost of an acceptable distortion, and of visualizing data at different detail levels. With proper database management, this methodology will also allow faster access to the data without fully decompressing it. Traffic data are highly correlated, and image data are known to compress very well because of the inherent correlation of image pixels. The idea here is therefore to restructure the traffic data so that the efficient compression methods underlying modern image compression standards can be used. Three data structures are discussed: 1-D, 2-D, and 3-D. For the 1-D arrangement, different wavelets and decomposition levels were tested and analyzed for distortion levels in the data after decompression. The 2-D and 3-D data arrangements were compressed using the embedded zerotree wavelet (EZW) and set partitioning in hierarchical trees (SPIHT) algorithms, which are well-established algorithms for compressing image data. A case study was performed using traffic flow data from freeways in Las Vegas, Nevada. As could be expected, the 3-D scheme showed the best compression results. The 2-D and 3-D approaches yielded reduction ratios of 91% and 95.2%, respectively.

Index Terms: Wavelet transform, data compression, urban traffic data, ITS, EZW, SPIHT.

Manuscript received October 30, 2015; revised March 4, 2016 and August 6, 2016; accepted September 23, 2016. The Associate Editor for this paper was S. Sun. S. Agarwal is with California State University at Los Angeles, Los Angeles, CA 90032 USA. E. E. Regentova and P. Kachroo are with the Department of Electrical and Computer Engineering, University of Nevada at Las Vegas, Las Vegas, NV 89154 USA. H. Verma is with HomeAway, Inc., Austin, TX 78703 USA. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TITS.2016.2613982

I. INTRODUCTION

With the rapid increase in the use of Intelligent Transportation Systems (ITS) for urban traffic planning and management, the amount of traffic data being sensed and captured has also increased manyfold. These data are crucial to several ITS applications, such as navigation systems, real-time traffic control methodologies, identification of crash-prone zones, and incident management [1]. A huge amount of real-time as well as historical data is used by modern-day transportation managers for traffic planning, management, safety, and operations [2]–[4]. Traffic researchers need these data for pattern matching, data mining, database queries, prediction, and trend analysis [5]. Hence we can argue that ITS data form the backbone of a modern-day transportation system. This massive amount of data poses a challenge for transportation agencies in terms of efficient and optimal storage, retrieval, and analysis.

The data obtained by traffic flow detectors take the form of a large spatio-temporal series. Redundant information often exists in the data files, which forms the basis for lossless compression. Lossless compression is the most commonly used data compression technique and enables complete retrieval of the data without loss of any information. Examples include RAR and WinZip, which can reduce the data size by up to 10 times without any loss of information [6]. However, traffic data contain intrinsic redundancy that can be exploited to compress the data considerably further. For example, traffic flow at a particular point is expected to be more or less the same during peak hours on weekdays. It might be different on weekends or at other hours of the day. Hence, one can exploit the correlation of the data in the time and space domains for more efficient storage and retrieval. In pursuit of higher compression ratios, lossy techniques are used, which introduce a trade-off between information loss and the compression ratio.

Modern-day transportation data vary considerably in terms of time resolution, space resolution, parameters collected, sensor types, sensitivity to noise, etc. Likewise, ITS applications vary considerably in their data needs and outputs. For instance, an application used for analyzing daily or weekly traffic trends on a particular road stretch might not require the full spatio-temporal resolution, and it might give acceptable results even after a 'permissible' loss of information during compression. There is a need to analyze the 'permissible' limits of information loss or data distortion for various ITS applications.
This can lead to large savings in data storage and transmission costs [7]–[9]. Data management systems must give ITS managers control over the desired compression ratio against the permissible loss of information. This way, managers can decide the level and method of compression based on the ITS application, leading to better data management. Wavelet-based methods are well known for their image/video applications, including powerful compression standards such as JPEG2000 and MPEG. Wavelet techniques are referred to as lossy techniques. However, with wavelets in the framework, one can achieve high compression without much affecting the quality of the data, in which case they are referred to as near lossless. Additionally, they offer another attractive property for intelligent data processing, i.e., progressive

1524-9050 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


refinement of data. That is, one can attain almost exact reconstruction of the original data by successive refinement through levels of decoding, starting at a low-resolution approximation and proceeding up to fine details. These properties of wavelet-based transform methods motivate us to look into their possible application to ITS data. The motivation behind choosing wavelet analysis and the properties of wavelets are discussed in detail in the following sections.

In this paper we focus on analyzing and investigating data compression techniques best suited for ITS applications. The main research goal is to provide ITS managers with a ready-to-use methodology that gives them control over the compression ratio against data accuracy. We start the analysis by exploring the application of the discrete wavelet transform to traffic data in one dimension, that is, using the original data format. We then extend the study to two- and three-dimensional data rearrangements using the EZW and SPIHT algorithms. An efficient data arrangement scheme in multidimensional array formats is also discussed. In the end, a detailed case study is performed using actual traffic data; results are analyzed and compared for various data arrangements and compression techniques. To the best of the authors' knowledge, this is the first attempt at applying these wavelet compression algorithms to ITS (sensor flow) data.

The rest of the paper is structured as follows. Section II reviews previous research on ITS data compression, followed by the key takeaways that motivated the choice of wavelet techniques for ITS applications. Section III provides the background of wavelet-based compression algorithms, whereas Section IV describes the data used in the study. Sections V, VI, and VII describe the 1-D, 2-D, and 3-D methodologies, respectively, and finally Section VIII concludes the paper.

II. LITERATURE SURVEY

Data compression techniques can be divided into two major groups: lossless and lossy compression. Lossless compression techniques can regenerate an exact copy of the original, whereas lossy techniques trade some loss of accuracy for a higher compression ratio. Most lossy data compression algorithms transform the data mathematically to a new domain in which the data are decorrelated. Wavelet-based compression does this by capturing correlation in time-frequency space, to an extent that makes it possible to attain compression of up to 100 times or beyond at near-lossless image quality.

A. Data Compression in ITS

Traditional transportation data compression approaches include coding algorithms (lossless compression) such as WinZip and RAR [6]. Some of the advanced techniques use wavelet-based compression [10], [11]; Principal Component Analysis based compression [12]–[14]; and correlation-based methods [15]–[17]. Some of these techniques and their results are briefly discussed below.

Li et al. proposed a method based on Principal Component Analysis (PCA) for compression of flow-volume data of a

traffic network [12]. PCA is applied to the dataset to break it into several principal components (PCs). The PCs are observed to have far fewer dimensions than the original data. The data were recovered with a compression ratio of 6.2% and a recovery error of 13%.

Ding et al. proposed a method for urban traffic data compression based on Wavelet Principal Component Analysis [13]. In the proposed algorithm, the data were decomposed using wavelets. Then a multi-scale PCA was applied to reduce the dimension of the dataset. A simulation was performed to test the algorithm; results showed that this wavelet-based PCA outperformed the conventional PCA algorithm for data compression. Asif et al. also explored the use of PCA and DCT for urban data compression in [18]. Mitrovic et al. proposed an algorithm based on CUR matrix decomposition, which leads to low-dimensional models. The authors claimed that the resulting models are easy to interpret and can be used for compressed sensing of the traffic network [19].

Guitton, Skordylis, and Trigoni utilized correlations to compress time-series data in traffic monitoring sensor networks. They examined the relative performance of Fourier- and wavelet-based algorithms for compressing traffic data locally at the sensor nodes [15], [16]. Results showed that exploiting temporal correlations yielded 14–30% savings in the cost of data transmission, while exploiting spatial correlations yielded 10–35% savings, for a tolerated error of 5–15 cars per 5 min. Cheng et al. explored the use of a mining method to cluster the traffic flow series on different road links [20]. They found the discrete wavelet transform highly useful for flow feature extraction and for identifying temporal-spatial similarities in an urban transportation network.

Qiao et al. proposed a method that incorporates the one-dimensional discrete wavelet compression approach for ITS data [21].
In this method, three compression indices were constructed, and one threshold selection algorithm was proposed. The identified threshold balanced both the compression ratio and the signal distortion. They conducted a case study on traffic data from San Antonio, Texas, which showed that the proposed method achieves a compression ratio of 8.12% of what the existing system provided. The overall compression ratio was observed to be ...

... T_f do
        for i, j ← 0; i < m, j < n; i ← i + 1, j ← j + 1 do    ▷ W: m × n
            CODE(w_ij, T)
        end for
        T ← T/2
    end while
end procedure

Algorithm 2 CODE Function for EZW
procedure CODE(w_ij, T)
    if w_ij ≥ T then
        w_ij ← P
    else if −w_ij ≥ T then
        w_ij ← N
    else if |w_ij| < T and allchild(|w_ij|) < T then
        w_ij ← T
    else
        w_ij ← Z
    end if
end procedure
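The symbol assignment of Algorithm 2 and the threshold-halving loop of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the coefficient matrix is taken to be square with a dyadic parent-child layout and a single coarsest coefficient at (0, 0), `allchild` is read as "all descendants", the initial threshold is taken as the largest power of two not exceeding the largest magnitude, and the loop is assumed to run while T ≥ T_f. The helper names (`children`, `all_descendants_below`, `dominant_passes`) are hypothetical. The sketch returns symbol maps instead of overwriting the coefficients, and it omits the subordinate (refinement) pass of a complete EZW coder.

```python
import math


def children(i, j, n):
    """Children of (i, j) in an n x n dyadic wavelet layout (assumed layout)."""
    if (i, j) == (0, 0):
        # The single coarsest coefficient parents the three coarsest details.
        return [(0, 1), (1, 0), (1, 1)] if n > 1 else []
    kids = [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]
    return [(a, b) for a, b in kids if a < n and b < n]


def all_descendants_below(w, i, j, T):
    """True if every descendant of (i, j) has magnitude below T."""
    stack = children(i, j, len(w))
    while stack:
        a, b = stack.pop()
        if abs(w[a][b]) >= T:
            return False
        stack.extend(children(a, b, len(w)))
    return True


def code(w, i, j, T):
    """Symbol for one coefficient, following the branches of Algorithm 2."""
    c = w[i][j]
    if c >= T:
        return "P"  # significant, positive
    if -c >= T:
        return "N"  # significant, negative
    if all_descendants_below(w, i, j, T):
        return "T"  # zerotree root
    return "Z"      # isolated zero


def dominant_passes(w, t_final):
    """Symbol maps for thresholds T0, T0/2, ... down to t_final (assumed >=)."""
    n = len(w)
    max_abs = max(abs(c) for row in w for c in row)
    T = 2 ** math.floor(math.log2(max_abs))  # assumed initial threshold
    passes = []
    while T >= t_final:
        symbols = [[code(w, i, j, T) for j in range(n)] for i in range(n)]
        passes.append((T, symbols))
        T = T / 2
    return passes


if __name__ == "__main__":
    # Made-up 4 x 4 coefficient matrix, for illustration only.
    w = [[63, -34, 49, 10],
         [-31, 23, 14, -13],
         [15, 14, 3, -12],
         [-9, -7, -14, 8]]
    for T, symbols in dominant_passes(w, t_final=8):
        print(T, symbols[0])  # first pass prints: 32 ['P', 'N', 'P', 'T']
```

Each pass halves the threshold, so more coefficients become significant and the symbol map carries progressively finer detail, which is the embedded property the text relies on.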

at various levels of accuracy to be viewed and analyzed. Depending on the needs of the application, these methods can be easily customized. We next explain these algorithms using flowcharts and pseudocode. Figure 2 shows the overall steps involved, Algorithms 1 and 2 provide the pseudocode, and Figure 3 shows the implementation flowchart for the EZW algorithm.

The SPIHT algorithm was developed by Said and Pearlman as an improved codec over EZW. The main difference is the partitioning of sets, which consist of the coefficients of whole subtrees. SPIHT organizes the wavelet coefficients into three lists: the list of insignificant pixels (LIP), the list of significant pixels (LSP), and the list of insignificant sets (LIS). The important set notations used in Algorithm 3 are listed below: O(i, j) is the set of all offspring of (i, j); D(i, j) is the set of coordinates of all descendants of (i, j); H(i, j) is the set of all tree roots; L(i, j) = D(i, j) − O(i, j) is the set of all descendants except the offspring. The SPIHT technique can be divided into three phases (see Algorithm 3): the first phase initializes the lists and computes and outputs the initial threshold; the second phase is the sorting phase (LIP and LIS passes); and the third phase is the refinement phase.
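The set notation above can be made concrete with a short Python sketch. It is an illustration under assumptions rather than the codec itself: the coefficients are assumed to sit in a square n × n dyadic layout with a single root at (0, 0), and the significance test S_k is assumed to return 1 when any coefficient in the set has magnitude of at least 2^k. The function names are hypothetical.

```python
def offspring(i, j, n):
    """O(i, j): direct children in an n x n dyadic layout (assumed layout)."""
    if (i, j) == (0, 0):
        return [(0, 1), (1, 0), (1, 1)] if n > 1 else []
    kids = [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]
    return [(a, b) for a, b in kids if a < n and b < n]


def descendants(i, j, n):
    """D(i, j): offspring, their offspring, and so on."""
    found, stack = [], offspring(i, j, n)
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(offspring(node[0], node[1], n))
    return found


def grand_descendants(i, j, n):
    """L(i, j) = D(i, j) - O(i, j): descendants that are not direct offspring."""
    direct = set(offspring(i, j, n))
    return [p for p in descendants(i, j, n) if p not in direct]


def significance(coeffs, coords, k):
    """S_k: 1 if any coefficient in coords has magnitude >= 2**k, else 0."""
    return int(any(abs(coeffs[a][b]) >= 2 ** k for a, b in coords))


if __name__ == "__main__":
    n = 8
    coeffs = [[0] * n for _ in range(n)]
    coeffs[3][5] = 20  # one made-up coefficient deep in the tree under (0, 1)
    print(len(offspring(0, 1, n)), len(descendants(0, 1, n)))
    print(significance(coeffs, descendants(0, 1, n), 4))
    print(significance(coeffs, offspring(0, 1, n), 4))
```

Testing a whole set with one bit is what lets SPIHT describe large insignificant regions cheaply: a single 0 for S_k(D(i, j)) covers every coefficient below (i, j).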


Fig. 3. EZW coding main flowchart.

Fig. 4. Data location on I-15 in Las Vegas.

IV. DATA DESCRIPTION AND VISUALIZATION

For evaluation of the proposed compression approach, a case study was conducted using freeway traffic data from Las Vegas, Nevada; the data were obtained from the Freeway and Arterial System of Transportation (FAST). Flow detectors are installed along the freeways in the Las Vegas area, placed every one third of a mile. The data contain the time stamp, sensor detector ID, traffic volume, average speed, and lane occupancy. The data used in the case study were collected from loop detectors on I-15 NB between Tropicana and Flamingo in Las Vegas (as shown in Figure 4). This location was chosen based on inputs from FAST. The location ensures a wide range of data in terms of density, speed, and traffic flow, which is important for testing the robustness of the compression algorithm. The data were collected from freeway detectors between roadway id: 59, segment id: 2 and roadway id: 72, segment id: 1. They were collected from 6 A.M. to 12 P.M. for several days in March 2011. The detectors at the location are loop detectors, and the counts are polled every 5 min. The data are then aggregated and reported every 15 min. Figure 5 shows a visualization of traffic volume data on I-15 NB in Las Vegas during a day using wavelet coefficients. In this color-coded visualization, green indicates the free flow


condition on the freeway, whereas red indicates the jam condition. This kind of visualization helps managers quickly assess the daily/weekly traffic patterns.

Algorithm 3 SPIHT Algorithm
 1: procedure INITIALIZATION
 2:     compute and output k = ⌊log2(max |I_i,j|)⌋;
 3:     set LSP = ∅ and LIP = H(# of Levels)
 4:     add (i, j) ∈ H(# of Levels) with D(i, j) ≠ ∅ to LIS as type A;
 5: end procedure
 6: procedure SORTING PHASE
 7:     for (i, j) ∈ LIP do
 8:         output S_k(i, j);
 9:         if S_k(i, j) = 1 then
10:             move (i, j) to LSP; output sign of I_i,j
11:         end if
12:     end for
13:     for (i, j) ∈ LIS do
14:         if (i, j) is of type A then
15:             output S_k(D(i, j));
16:             if S_k(D(i, j)) = 1 then
17:                 for (e, f) ∈ O(i, j) do
18:                     output S_k(e, f);
19:                     if S_k(e, f) = 1 then
20:                         add (e, f) to LSP; output sign of I_e,f
21:                     else append (e, f) to LIP;
22:                     end if
23:                 end for
24:                 if L(i, j) ≠ ∅ then
25:                     move (i, j) to end of LIS as type B; goto (13)
26:                 else remove (i, j) from LIS
27:                 end if
28:             end if
29:         end if
30:         if (i, j) is of type B then
31:             output S_k(L(i, j));
32:             if S_k(L(i, j)) = 1 then
33:                 append each (e, f) ∈ O(i, j) to LIS as type A; remove (i, j) from LIS;
34:             end if
35:         end if
36:     end for
37: end procedure
38: procedure REFINEMENT PHASE
39:     for (i, j) ∈ LSP except those included in the last sorting phase do
40:         output the k-th bit of |I_i,j|;
41:     end for
42: end procedure
43: procedure QUANTIZATION-STEP UPDATE
44:     decrement k;
45:     if k < 0 then
46:         stop;
47:     else goto step (6);
48:     end if
49: end procedure
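The initialization and refinement steps of Algorithm 3 rest on bit-plane arithmetic, which the following Python sketch illustrates. It assumes integer-valued coefficients and uses hypothetical helper names; it shows only how the starting bit plane k is obtained and how a magnitude sharpens as successive bits are output, not the sorting phase. The decoder here simply accumulates the bits received so far, whereas practical decoders often reconstruct at the midpoint of the remaining uncertainty interval.

```python
import math


def initial_bitplane(coeffs):
    """k = floor(log2(max |I_ij|)), the most significant bit plane in use."""
    max_abs = max(abs(c) for row in coeffs for c in row)
    if max_abs < 1:
        raise ValueError("needs at least one coefficient with magnitude >= 1")
    return math.floor(math.log2(max_abs))


def kth_bit(value, k):
    """The k-th bit of |value|, as output during the refinement phase."""
    return (abs(int(value)) >> k) & 1


def progressive_magnitudes(value, k_start):
    """Magnitude known to a decoder after each bit plane, k_start down to 0."""
    approx, history = 0, []
    for k in range(k_start, -1, -1):
        approx += kth_bit(value, k) << k
        history.append(approx)
    return history


if __name__ == "__main__":
    coeffs = [[45, -7], [3, 12]]  # made-up integer coefficients
    k = initial_bitplane(coeffs)
    print(k)                              # 5
    print(progressive_magnitudes(45, k))  # [32, 32, 40, 44, 44, 45]
```

Stopping the loop early yields a coarser value with fewer bits, which is how decrementing k trades compression ratio against accuracy.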


TABLE I DATA FROM THE FLOW DETECTOR

Fig. 5. Data visualization on I-15 in Las Vegas.

Fig. 6. Framework and methodology.

Figure 6 summarizes the flow of the method employed in this paper and explains the framework for data compression.

V. 1-D DATA COMPRESSION

In this section, we conduct a preliminary analysis of the wavelet methods used for data compression. We start by describing the data and their arrangement for analysis, followed by a brief description of the performance measures for compression. We perform our analysis using various wavelets and various decomposition levels. The procedures for the 1-D approach follow the case studies in [12] and [21], which used data from the Texas Department of Transportation (TxDOT).

A. Data Description and Arrangement

Data are extracted from a particular detector for a single day and arranged in a one-dimensional array format. From now on we call this the 1-D data arrangement. Speed, volume, and occupancy are separated and stored in different files. An example of this storage is presented in Table I.

B. Performance Measures and Parameters

1) Threshold Selection: There are two types of thresholding, hard (keep or kill) and soft (shrink or kill). Hard thresholding is generally used for compression, whereas soft thresholding is recommended for de-noising of a given signal. Hard thresholding was used in this study to investigate the effect of thresholding on visualization and detail retention when selecting the best wavelet function for the traffic data. For hard thresholding (keep or kill), the threshold is chosen as:

$$
Thr = \begin{cases} \operatorname{median}\big(\lvert \text{detail at level 1} \rvert\big), & \text{if nonzero} \\ 0.05\,\max\big(\lvert \text{detail at level 1} \rvert\big), & \text{otherwise} \end{cases} \tag{4}
$$

In hard thresholding, the coefficients whose magnitudes do not exceed the threshold are set to zero, and the wavelet coefficients whose magnitudes exceed the threshold are left unchanged, as follows:

$$
T_d^{hard} = \begin{cases} d, & \lvert d \rvert > Thr \\ 0, & \lvert d \rvert \le Thr \end{cases}
$$

where $T_d^{hard}$ is the thresholded value of the wavelet coefficient $d$.

2) Percentage Recovery: Percentage recovery is the ratio of the L2 norm squared of the signal reconstructed from the thresholded wavelet coefficients to the L2 norm squared of the original wavelet coefficients (which is equal to the L2 norm squared of the signal), expressed as a percentage. If this number is close to 100, the energy in the de-noised or decompressed version is very close to the energy in the uncompressed version. If $X$ is a one-dimensional signal and $X_C$ is the reconstructed signal, then the percentage recovery reduces to:

$$
\text{Percentage Recovery } (R) = \frac{\lVert X_C \rVert^2}{\lVert X \rVert^2} \times 100\ (\%)
$$
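The threshold rule (4), hard thresholding, and the percentage recovery can be tied together in a small Python sketch. It is a hedged illustration, not the procedure used for the reported results: it uses a hand-written orthonormal Haar transform, assumes the signal length is divisible by 2^levels, applies one global threshold to the detail coefficients only (the approximation is kept, which is an assumption), and runs on made-up sample values. All function names are hypothetical.

```python
import math
from statistics import median

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal, levels):
    """Orthonormal Haar decomposition; details[0] is level 1 (finest)."""
    approx, details = list(signal), []
    for _ in range(levels):
        pairs = range(len(approx) // 2)
        detail = [(approx[2 * i] - approx[2 * i + 1]) / SQRT2 for i in pairs]
        approx = [(approx[2 * i] + approx[2 * i + 1]) / SQRT2 for i in pairs]
        details.append(detail)
    return approx, details


def haar_idwt(approx, details):
    """Inverse of haar_dwt."""
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt += [(a + d) / SQRT2, (a - d) / SQRT2]
        current = rebuilt
    return current


def select_threshold(level1_detail):
    """Rule (4): the median magnitude if nonzero, else 5% of the maximum."""
    med = median(abs(d) for d in level1_detail)
    return med if med != 0 else 0.05 * max(abs(d) for d in level1_detail)


def hard_threshold(coeffs, thr):
    """Keep or kill: zero every coefficient with |d| <= thr."""
    return [d if abs(d) > thr else 0.0 for d in coeffs]


def percentage_recovery(original, reconstructed):
    """R = ||X_C||^2 / ||X||^2 * 100."""
    return 100.0 * sum(x * x for x in reconstructed) / sum(x * x for x in original)


def compress_1d(signal, levels):
    approx, details = haar_dwt(signal, levels)
    thr = select_threshold(details[0])
    kept = [hard_threshold(d, thr) for d in details]  # approximation untouched
    reconstructed = haar_idwt(approx, kept)
    return reconstructed, thr, percentage_recovery(signal, reconstructed)


if __name__ == "__main__":
    volume = [120, 124, 130, 128, 200, 210, 190, 195]  # made-up counts
    reconstructed, thr, recovery = compress_1d(volume, levels=2)
    print(round(thr, 3), round(recovery, 3))
```

Because the Haar transform used here is orthonormal, the energy of the reconstructed signal equals the energy of the retained coefficients, so R can never exceed 100 in this sketch.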

3) Compression Ratio: The compression ratio is the ratio of the number of thresholded wavelet coefficients that are zero to the total number of wavelet coefficients, expressed as a percentage (see [21]). A compression ratio near 100 means that virtually all the thresholded wavelet coefficients are zero; this indicates that the signal or image is being reconstructed from a very sparse set of wavelet coefficients.

C. Results of the Case Study

An exhaustive search was performed to find the best combination of wavelet type, decomposition level, and threshold value. Table II shows the compression ratio and percentage recovery of the data for level-independent (hard) thresholding. Results show that the percentage recovery was


TABLE II RESULTS FOR LEVEL-INDEPENDENT THRESHOLDING

Fig. 7. Wavelet decomposition of the traffic signal using a Haar wavelet at Level 3.

Fig. 8. Decomposition of the traffic signal using the Haar wavelet at Level 5.

Fig. 9. 2-D data rearrangement.

TABLE III 2-D DATA ARRANGEMENT

above 99% for all the levels and wavelets, while the compression ratio ranged from 25.35% to 54.52%. It is also interesting to note the effect of filter length (decomposition level) on the compression ratio. With a longer filter, computation time increases; however, the compression ratio does not improve significantly. Figures 7 and 8 show the original signal, reconstructed signal, approximation, and wavelet coefficients for the best results based on Table II.

VI. TWO-DIMENSIONAL DATA COMPRESSION

A. Proposed Approach

The data were rearranged into a two-dimensional matrix, as shown in Figure 9. It is formed by combining several days of

data into one file. One dimension of the matrix captures the time variation within a day, and the other dimension captures the variation over several days at the same time of day. Table III shows one such arrangement, where the volume data were arranged into a 2-D matrix. As traffic flow follows a certain trend over longer periods of time, across days and weeks, the data in this 2-D matrix were expected to be correlated. This correlation was exploited by the wavelet algorithms for better compression performance.

B. Performance Measures and Parameters

1) Reduction Ratio: The reduction ratio, a measure of achieved compression, is derived from the compression ratio (CR) or the Bit-Per-Pixel (BPP) ratio. CR and BPP represent equivalent information. CR indicates that the compressed image is stored using CR% of the initial storage size, while BPP is the number of bits used to store one pixel of the image. The reduction ratio (RR) indicates the reduction, in percentage, of the initial storage size, as defined below:

$$
CR(\%) = \frac{\text{Size of Compressed Image}}{\text{Size of Original Image}} \times 100\ (\%), \qquad RR(\%) = 100 - CR(\%) \tag{5}
$$

Fig. 10. 2-D compression of occupancy data.

Fig. 11. 2-D compression of speed data.

Fig. 12. 2-D compression of volume data.

Fig. 13. Comparison using the SPIHT algorithm.
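As a worked illustration of (5), the Python sketch below computes CR, RR, and BPP for a matrix of the size used in the case study. The byte counts are hypothetical and chosen only to make the arithmetic easy to follow: a 64 × 16 matrix is assumed to be stored at one byte per sample, and the compressed stream is assumed to occupy 92 bytes. The function name is likewise hypothetical.

```python
def reduction_metrics(original_bytes, compressed_bytes, n_pixels):
    """Return (CR %, RR %, bits per pixel) following the definitions in (5)."""
    cr = compressed_bytes / original_bytes * 100.0
    rr = 100.0 - cr
    bpp = compressed_bytes * 8 / n_pixels
    return cr, rr, bpp


if __name__ == "__main__":
    n_pixels = 64 * 16           # one matrix entry per "pixel"
    original_bytes = n_pixels    # assumes 8 bits per sample
    compressed_bytes = 92        # hypothetical compressed size
    cr, rr, bpp = reduction_metrics(original_bytes, compressed_bytes, n_pixels)
    print(f"CR = {cr:.2f}%  RR = {rr:.2f}%  BPP = {bpp:.3f}")
```

Under these assumed sizes, a compressed stream of about 9% of the original corresponds to a reduction ratio of about 91% and roughly 0.72 bits per pixel, which shows why CR and BPP carry equivalent information once the original bit depth is fixed.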

2) Error Metrics: The error metric most often used to compare image compression techniques is the Mean Square Error (MSE). MSE is the cumulative squared error between the compressed and the original image, averaged over the pixels, and is defined as follows:

$$
MSE = \frac{1}{n} \sum_{i=1}^{n} \left( X_i - \hat{X}_i \right)^2
$$

where $X_i$ is the $i$-th pixel of the original image, $\hat{X}_i$ is the corresponding pixel of the compressed image, and $n$ is the total number of pixels in the image. For ease of interpretation we have used the Normalized Mean Square Error (NMSE) as the error metric.

C. Results of the Case Study

A 2-D data matrix was constructed from 16 days of data, with the 16 days represented by columns and the time of day by rows, as shown in Table III. This is a 64 × 16 data matrix, which can also be treated as a grayscale image for compression.

Figure 10 shows the results obtained by using the EZW and SPIHT algorithms on the occupancy data in the 2-D rearrangement. It is interesting to note that for lower values of the reduction ratio (30–80%), the difference in the NMSE of the two algorithms is larger, while for higher reduction ratios (80–95%) EZW catches up with the SPIHT algorithm in terms of NMSE. The maximum achieved reduction ratio is 91% at the cost of an NMSE of 0.24 in this case.

Figure 11 shows the results obtained by using the algorithms on the speed data in the 2-D rearrangement. In this case, the difference between the performance of the two algorithms is smaller, and for NMSE values greater than 0.24 they exhibit similar performance. The maximum reduction ratio is 91% against an NMSE of 0.39 in this case. It is interesting to observe that a reduction ratio of 87% is achieved at a much lower NMSE (0.24).

Figure 12 shows the results obtained on the volume data in the 2-D rearrangement. In this case, the difference between the performance of the two algorithms is relatively larger, and the difference remains even for higher NMSEs. The maximum reduction ratio is 92% against a relatively higher NMSE (0.47).

Figure 13 compares the performance of the SPIHT algorithm for different types of traffic data. It can be observed that

the reduction ratio lies between 92% and 95% at an NMSE of 0.55 for all three types of traffic data. We also observe that while occupancy and speed had almost overlapping curves of compression versus normalized mean square error, the volume was the least compressible at the lower NMSEs.

VII. THREE-DIMENSIONAL DATA COMPRESSION

A. Proposed Approach

This data arrangement, a three-dimensional matrix, is made by combining multiple 2-D matrices. The 3-D matrix can be constructed in multiple ways. Data in the first two dimensions remain as before (time and days). For the third dimension, one way is to take three 2-D matrices that contain volume, speed, and occupancy data, respectively, for the same sensor and combine them into a 3-D matrix. Another way is to use spatial data (from various adjacent sensors along the freeway) and combine them into a 3-D matrix. This is shown in Figure 14 for data from three adjacent sensors. Please note that this can also be viewed as a colored image with the data


TABLE IV COMPARISON OF VARIOUS PROPOSED APPROACHES

Fig. 14. 3-D data rearrangement.

Fig. 15. 3-D data compression using SPIHT.

Fig. 16. Comparison of 2-D and 3-D compression using SPIHT.

Fig. 17. Compression of 2-D vs 3-D volume data using SPIHT.

from these three sensors acting as the R, G, and B components of the image.

B. Results of the Case Study

After data rearrangement, a 64 × 16 × 3 matrix was generated. The third dimension held the occupancy, speed, and volume information. Figure 15 shows the result using the SPIHT compression technique. It is observed that the maximum reduction ratio is 96%, a little higher than for the 2-D arrangement. Figure 16 compares the results from the 2-D data rearrangement with those from the 3-D data rearrangement. It can be observed that for lower NMSEs the performance of the 3-D rearrangement is almost the same as in the 2-D case, but at higher NMSEs it improves and achieves a higher reduction ratio.

Next, we create another 64 × 16 × 3 matrix using a different methodology, where the third dimension holds the volume from three adjacent sensors (see Figure 14). Figure 17 compares the compression of the 2-D and 3-D volume data arrangements. It is very interesting to observe that the curves are

almost parallel and the 3-D rearrangement outperforms the 2-D one by an almost constant margin for all values of NMSE. The compressed data obtained through all three approaches above were analyzed, and the results of this case study are summarized in Table IV.

VIII. CONCLUSION

This paper focused on the use of the wavelet transform in transportation data compression. Compression of loop detector data was investigated. The analysis was divided into three sections based on the data arrangement. The first data arrangement, 1-D, used a one-dimensional array for storing the loop detector data. The file containing the data had all the traffic parameters for a single detector, varying over time. Various wavelets and levels of decomposition were investigated and their performance analyzed. The second data arrangement, 2-D, was obtained by rearranging and combining several days of data into a single 2-D matrix. The EZW and SPIHT compression techniques were investigated. The 2D approach resulted in a reduction ratio of about 95.2%. The third data arrangement, 3-D, was obtained by combining three 2-D matrices. It was observed that the compression performance for the 3-D arrangement was better than for the 2-D matrix arrangement.
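The three arrangements summarized above can be sketched in Python with nested lists. The sketch assumes 64 samples per day and 16 days, matching the 64 × 16 matrix of the case study, and fills the arrays with synthetic placeholder values instead of FAST data. The function names `to_2d` and `to_3d` are hypothetical.

```python
def to_2d(days):
    """Stack per-day 1-D series into a matrix: rows = time of day, columns = days."""
    n_samples = len(days[0])
    if any(len(day) != n_samples for day in days):
        raise ValueError("every day must have the same number of samples")
    return [[day[t] for day in days] for t in range(n_samples)]


def to_3d(planes):
    """Combine equally sized 2-D matrices along a third axis (time, day, plane)."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[[plane[r][c] for plane in planes] for c in range(cols)]
            for r in range(rows)]


if __name__ == "__main__":
    # Synthetic placeholders: value = 100 * day + time index.
    volume_days = [[100 * d + t for t in range(64)] for d in range(16)]
    volume = to_2d(volume_days)                       # 64 x 16
    speed = [[v + 1 for v in row] for row in volume]  # stand-in second plane
    occupancy = [[v + 2 for v in row] for row in volume]
    cube = to_3d([occupancy, speed, volume])          # 64 x 16 x 3
    print(len(volume), len(volume[0]))
    print(len(cube), len(cube[0]), len(cube[0][0]))
```

The third axis can hold either the three traffic parameters of one sensor or the same parameter from three adjacent sensors; the stacking step is identical in both cases.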


This paper presented a detailed analysis of, and a framework for, traffic data compression using wavelet-transform-based algorithms. The framework and approach were supported by a detailed case study. The technique proved useful for traffic data compression in the case study, giving transportation managers the option to control the compression according to the desired ITS requirements. However, some limitations remain to be investigated. For example, in the 3-D version, the third dimension is currently restricted to data from only three sensors. Further analysis is also required to identify the operating point on the ROC curves for a particular ITS application.

ACKNOWLEDGMENT

The authors are thankful to the Freeway and Arterial System of Transportation (FAST) for providing necessary details and the data.

REFERENCES

[1] J. Zhang, F.-Y. Wang, K. Wang, W.-H. Lin, X. Xu, and C. Chen, “Data-driven intelligent transportation systems: A survey,” IEEE Trans. Intell. Transp. Syst., vol. 12, no. 4, pp. 1624–1639, Dec. 2011.
[2] F. Qiao, L. Yu, and X. Wang, “Double-sided determination of aggregation level for intelligent transportation system data,” Transp. Res. Rec., J. Transp. Res. Board, vol. 1879, no. 1, pp. 80–88, 2004.
[3] S. Agarwal, A. Sancheti, R. Khaddar, and P. Kachroo, “Geospatial framework for integration of transportation data using Voronoi diagrams,” in Proc. 92nd Annu. Meeting Transp. Res. Board, 2013, pp. 1–15, paper 13-5378.
[4] N.-E. El Faouzi, H. Leung, and A. Kurian, “Data fusion in intelligent transportation systems: Progress and challenges—A survey,” Inf. Fusion, vol. 12, no. 1, pp. 4–10, Jan. 2011.
[5] R. Souleyrette, D. Plazak, T. Strauss, and S. Andrle, “Applications of state employment data to transportation planning,” Transp. Res. Rec., J. Transp. Res. Board, vol. 1768, no. 1, pp. 26–35, 2001.
[6] D. A. Lelewer and D. S. Hirschberg, “Data compression,” ACM Comput. Surveys, vol. 19, no. 3, pp. 261–296, Sep. 1987.
[7] S. J. Baek, G. D. Veciana, and X. Su, “Minimizing energy consumption in large-scale sensor networks through distributed data compression and hierarchical aggregation,” IEEE J. Sel. Areas Commun., vol. 22, no. 6, pp. 1130–1140, Aug. 2004.
[8] P. A. Alsberg, “Space and time savings through large data base compression and dynamic restructuring,” Proc. IEEE, vol. 63, no. 8, pp. 1114–1122, Aug. 1975.
[9] J. Muckell, P. W. Olsen, Jr., J.-H. Hwang, C. T. Lawson, and S. S. Ravi, “Compression of trajectory data: A comprehensive evaluation and new approach,” GeoInformatica, vol. 18, no. 3, pp. 435–460, Jul. 2014.
[10] S. Klimenko, G. Mitselmakher, and A. Sazonov, “Losless compression of LIGO data,” California Inst. Technol., Pasadena, and Massachusetts Inst. Technol., Cambridge, MA, USA, Tech. Note LIGO-T000076-00-D, 2000.
[11] M. Misiti, Y. Misiti, G. Oppenheim, and J.-M. Poggi, “Wavelet toolbox user’s guide,” MathWorks, Natick, MA, USA, 1996.
[12] Q. Li, H. Jianming, and Z. Yi, “A flow volumes data compression approach for traffic network based on principal component analysis,” in Proc. IEEE Intell. Transp. Syst. Conf. (ITSC), Sep./Oct. 2007, pp. 125–130.
[13] J. Ding, Z. Zhang, and X. Ma, “A method for urban traffic data compression based on wavelet-PCA,” in Proc. 4th Int. Joint Conf. Comput. Sci. Optim. (CSO), Apr. 2011, pp. 1030–1034.
[14] L. Li, X. Su, Y. Zhang, J. Hu, and Z. Li, “Traffic prediction, data compression, abnormal data detection and missing data imputation: An integrated study based on the decomposition of traffic time series,” in Proc. IEEE 17th Int. Conf. Intell. Transp. Syst. (ITSC), Oct. 2014, pp. 282–289.
[15] A. Guitton, A. Skordylis, and N. Trigoni, “Utilizing correlations to compress time-series in traffic monitoring sensor networks,” in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), Mar. 2007, pp. 2479–2483.
[16] A. Skordylis, A. Guitton, and N. Trigoni, “Correlation-based data dissemination in traffic monitoring sensor networks,” in Proc. ACM CoNEXT Conf., 2006, p. 42.
[17] T. Srisooksai, K. Keamarungsi, P. Lamsrichan, and K. Araki, “Practical data compression in wireless sensor networks: A survey,” J. Netw. Comput. Appl., vol. 35, no. 1, pp. 37–59, Jan. 2012.
[18] M. T. Asif, S. Kannan, J. Dauwels, and P. Jaillet, “Data compression techniques for urban traffic data,” in Proc. IEEE Symp. Comput. Intell. Vehicles Transp. Syst. (CIVTS), Apr. 2013, pp. 44–49.
[19] N. Mitrovic, M. T. Asif, U. Rasheed, J. Dauwels, and P. Jaillet, “CUR decomposition for compression and compressed sensing of large-scale traffic data,” in Proc. 16th Int. IEEE Conf. Intell. Transp. Syst. (ITSC), Oct. 2013, pp. 1475–1480.
[20] Y. Cheng, Y. Zhang, J. Hu, and L. Li, “Mining for similarities in urban traffic flow using wavelets,” in Proc. IEEE Intell. Transp. Syst. Conf. (ITSC), Sep./Oct. 2007, pp. 119–124.
[21] F. Qiao, H. Liu, and L. Yu, “Incorporating wavelet decomposition technique to compress TransGuide intelligent transportation system data,” Transp. Res. Rec., J. Transp. Res. Board, vol. 1968, no. 1, pp. 63–74, 2006.
[22] Y. Xiao, Y.-M. Xie, L. Lu, and S. Gao, “The traffic data compression and decompression for intelligent traffic systems based on two-dimensional wavelet transformation,” in Proc. 7th Int. Conf. Signal Process. (ICSP), vol. 3. Aug./Sep. 2004, pp. 2560–2563.
[23] X.-F. Shi and L.-L. Liang, “A data compression method for traffic loop detectors’ signals based on lifting wavelet transformation and entropy coding,” in Proc. Int. Symp. Inf. Sci. Eng. (ISISE), Dec. 2010, pp. 129–132.
[24] A. S. Lewis and G. Knowles, “Image compression using the 2-D wavelet transform,” IEEE Trans. Image Process., vol. 1, no. 2, pp. 244–250, Apr. 1992.
[25] J. M. Shapiro, “Embedded image coding using zerotrees of wavelet coefficients,” IEEE Trans. Signal Process., vol. 41, no. 12, pp. 3445–3462, Dec. 1993.
[26] A. Said and W. A. Pearlman, “A new, fast, and efficient image codec based on set partitioning in hierarchical trees,” IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 3, pp. 243–250, Jun. 1996.
[27] C. M. Brislawn, J. N. Bradley, R. J. Onyshczak, and T. Hopper, “FBI compression standard for digitized fingerprint images,” Proc. SPIE, vol. 2847, pp. 344–355, Nov. 1996.
[28] S. Santoso, E. J. Powers, and W. M. Grady, “Power quality disturbance data compression using wavelet transform methods,” IEEE Trans. Power Del., vol. 12, no. 3, pp. 1250–1257, Jul. 1997.
[29] R. S. H. Istepanian and A. A. Petrosian, “Optimal zonal wavelet-based ECG data compression for a mobile telecardiology system,” IEEE Trans. Inf. Technol. Biomed., vol. 4, no. 3, pp. 200–211, Sep. 2000.
[30] Z. Zeng and I. G. Cumming, “SAR image data compression using a tree-structured wavelet transform,” IEEE Trans. Geosci. Remote Sens., vol. 39, no. 3, pp. 546–552, Mar. 2001.
[31] A. Karim and H. Adeli, “Incident detection algorithm using wavelet energy representation of traffic patterns,” J. Transp. Eng., vol. 128, no. 3, pp. 232–242, May 2002.
[32] S. Agarwal, P. Kachroo, and E. Regentova, “A hybrid model using logistic regression and wavelet transformation to detect traffic incidents,” IATSS Res., vol. 40, no. 1, pp. 56–63, Jul. 2016.
[33] D. Boto-Giralda et al., “Wavelet-based denoising for traffic volume time series forecasting with self-organizing neural networks,” Comput.-Aided Civil Infrastruct. Eng., vol. 25, no. 7, pp. 530–545, Oct. 2010.
[34] D. Salomon, Data Compression: The Complete Reference. New York, NY, USA: Springer, 2004.
[35] C. K. Chui, An Introduction to Wavelets, vol. 1. San Francisco, CA, USA: Academic, 1992.
[36] H. Adeli and A. Karim, Wavelets in Intelligent Transportation Systems. Hoboken, NJ, USA: Wiley, 2005.
[37] D. Hong, J. Wang, and R. Gardner, Real Analysis With an Introduction to Wavelets and Applications. Amsterdam, The Netherlands: Elsevier, 2004.
[38] I. Daubechies, Ten Lectures on Wavelets, vol. 61. Philadelphia, PA, USA: SIAM, 1992.


Shaurya Agarwal (S’14) received the B.Tech. degree in electronics and communication engineering from IIT Guwahati, Guwahati, India, in 2009; the M.S. degrees in electrical and computer engineering and in mathematics from the University of Nevada at Las Vegas (UNLV) in 2012 and 2015, respectively; and the Ph.D. degree in electrical engineering from UNLV. He was a Post-Doctoral Associate with New York University. He is currently an Assistant Professor with the Electrical and Computer Engineering Department, California State University at Los Angeles. His research areas include theoretical and applied feedback control, nonlinear and hybrid control systems, intelligent transportation systems, cyber-physical systems, network reliability, and security.

Emma E. Regentova (M’01) has been a Professor in the Department of Electrical and Computer Engineering at the University of Nevada Las Vegas (UNLV) since 2001. She is a two-time recipient of ECE Outstanding Professor of the Year awards and awards for her service to IEEE for organizing student activities in Region 6 SWA. Her research interests comprise classical image processing and image analysis for applications that include medical, x-ray, photon/neutron, hyperspectral imaging and computed tomography.


Pushkin Kachroo (SM’12) received the Ph.D. degree in mechanical engineering from the University of California at Berkeley and the Ph.D. degree in mathematics from Virginia Tech in 1993 and 2007, respectively. He was an Associate Professor with Virginia Tech in 2007. He is currently a Lincy Professor with the Department of Electrical and Computer Engineering, University of Nevada at Las Vegas (UNLV). He is also Director of the Mendenhall Innovation Program, UNLV. He has authored over 160 publications, including ten books. He is also an Associate Editor of IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS.

Himanshu Verma received the B.Tech. degree in electronics and communication engineering from IIT Guwahati, Guwahati, India, in 2009 and the M.S. degree in electrical and computer engineering from the University of Nevada at Las Vegas in 2012. His research areas include intelligent transportation systems, database management and design, and big data.
