2011 IEEE International Conference on Privacy, Security, Risk, and Trust, and IEEE International Conference on Social Computing

Anomaly Detection on Collective Moving Patterns: Manifold Learning based Analysis of Traffic Streams

Su Yang
College of Computer Science and Technology, Fudan University, Shanghai 200433, China
[email protected]

Wenbin Zhou
College of Computer Science and Technology, Fudan University, Shanghai 200433, China
[email protected]

Abstract—Some special natural and social events, such as natural disasters, terrorist attacks, and traffic accidents, may have a remarkable effect on people's collective behaviors, that is, the behaviors of a large number of people viewed as a whole. Analyzing the patterns of collective behaviors in the sense of data mining is an important topic. Solving this problem is crucial and practical in that online detection of abnormal patterns of collective behaviors may enable rapid response to emergencies. In technical terms, the task can be regarded as detecting abnormal particle movement patterns from a global point of view. In this study, we propose a solution to this problem. First, the traffic flows observed by more than 4000 sensors are organized as high-dimensional time series. Second, a widely used manifold learning technique, locally linear embedding (LLE), is applied to compute the K coefficients that fit every data point with its K nearest neighbors in the high-dimensional space, which can be regarded as a K-dimensional local feature associated with every data point. Then, the local features of the data points within a time window are summarized using principal component analysis (PCA) to obtain a global feature representing the traffic data in each time window. Finally, outlier detection is performed in this LLE-PCA feature space. The experimental results show that the proposed method can effectively detect abnormal traffic patterns, which correspond to special days such as New Year's Day, the U.S. Independence Day, and days with extreme weather.

Keywords-Collective Behaviors; Outlier Detection; Traffic

I. INTRODUCTION

Some special natural and social events, such as natural disasters, terrorist attacks, and traffic accidents, may have a remarkable effect on people's collective behaviors, that is, the behaviors of a large number of people viewed as a whole. Several studies have been conducted towards understanding collective human behaviors [1,2]. Analyzing the patterns of collective behaviors in the sense of data mining is important, since online detection of abnormal patterns of collective behaviors may enable rapid response to emergencies; for instance, the precondition for a rapid response to a traffic accident is to detect the anomaly in real time. In technical terms, the task can be regarded as detecting abnormal particle movements from a global point of view. This is closely related to outlier detection for multiple data streams in the context of data mining. There are two types of outliers as defined in [3], namely local outliers and global outliers. Our research focuses on global outliers. In [4], anomaly detection is performed over individual moving objects' trajectories. In [5], a distributed computing based solution is developed for detecting anomalies in network traffic data. In [6], outliers are detected in high-dimensional streaming data after dimensionality reduction.

The problem with existing research is that most work addresses machine learning and data mining algorithms, while an important issue is missing: how to represent the raw data in a meaningful way that gives easy access to discriminative patterns. In fact, feature extraction is essential to make a solution practical. The reason to perform feature extraction prior to outlier detection, rather than feeding the raw data directly to the outlier detector, is that the raw data is usually too random to enable a robust representation, whereas feature extraction yields a robust representation by abstracting the data in a statistical or geometric sense.

In this study, we address pattern analysis of collective behaviors and propose a new feature extraction method to solve this problem. We validate the proposed solution via a case study of pattern analysis and outlier detection on traffic streams. First, the traffic flows observed by more than 4000 sensors are organized as high-dimensional time series. Second, a widely used manifold learning technique, locally linear embedding (LLE) [7], is applied to compute the K coefficients that fit every data point with its K nearest neighbors in the high-dimensional space, which can be regarded as a K-dimensional local feature associated with every data point. Then, the local features of the data points within a time window are summarized using principal component analysis (PCA) to obtain a feature at a higher abstraction level. Finally, outlier detection is performed in this LLE-PCA feature space. The experimental results show that the proposed method can effectively detect abnormal traffic patterns, which correspond to special days such as New Year's Day, the U.S. Independence Day, and days with extreme weather conditions. This demonstrates that the proposed manifold learning based feature extraction method works well for detecting global outliers from multiple data streams and can be applied to mining particle movement patterns.

The rest of this paper is organized as follows. In Section II, we describe the solution in detail. In Section III, we present the experimental results. In Section IV, we draw conclusions.


II. CHARACTERIZING OBSERVATIONS FROM MULTIPLE SENSORS VIA MANIFOLD LEARNING

A. Overview

Our goal is to detect abnormal time-varying particle movement patterns, where the particles can be people or vehicles. The task is to decide whether the collective particle movements within a given time window differ from those in most other windows. The overall flowchart is as follows: First, the number of moving particles observed at every sensor is counted to form a time series, and the data from all sensors at every sampling time are organized as an N-dimensional vector. Then, the manifold features of the N-dimensional vectors are computed for each time window. Finally, outlier detection is performed in the feature space.

B. Observations from Multiple Sensors

The monitoring data from N distributed sensors can be represented as O = [O_1(t), O_2(t), ..., O_N(t)]^T. The continuous data sampled at M discrete times (t_1, t_2, ..., t_M) are represented in the form of a matrix:

X = [X_1, X_2, \ldots, X_M] =
\begin{bmatrix}
O_1(t_1) & O_1(t_2) & \cdots & O_1(t_M) \\
O_2(t_1) & O_2(t_2) & \cdots & O_2(t_M) \\
\vdots   & \vdots   & \ddots & \vdots   \\
O_N(t_1) & O_N(t_2) & \cdots & O_N(t_M)
\end{bmatrix}

where X_i = [O_1(t_i), O_2(t_i), ..., O_N(t_i)]^T represents the data sampled from all sensors at the i-th sampling time t_i, and [X_1, X_2, ..., X_M] can be regarded as an N-dimensional time sequence of length M.
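To make the data layout concrete, the following Python sketch (our own illustration; the array layout is an assumption, not part of the paper) stacks the per-sensor counts into the matrix X and cuts a long observation sequence into the fixed-length windows used later for feature extraction.

```python
import numpy as np

# Assumed layout: counts[i, t] is the number of vehicles passing sensor i
# during sampling interval t, so each column is one N-dimensional sample X_t.
def build_data_matrix(counts: np.ndarray) -> np.ndarray:
    """Return X = [X_1, ..., X_L] as an N x L matrix of sensor observations."""
    X = np.asarray(counts, dtype=float)
    if X.ndim != 2:
        raise ValueError("expected a 2-D array: sensors x sampling times")
    return X

def split_into_windows(X: np.ndarray, M: int) -> list:
    """Cut the N x L sequence into consecutive non-overlapping windows of length M."""
    L = X.shape[1]
    return [X[:, s:s + M] for s in range(0, L - M + 1, M)]
```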

C. Manifold Features from Observations

The feature extraction is composed of two phases:

(1) Use the K nearest neighbors around every data point X_i to fit that point in the least-squares sense via the well-known LLE manifold learning equation:

\varepsilon(W) = \sum_{i=1}^{M} \Big\| X_i - \sum_{j=1}^{K} W_{ij} Y_j \Big\|^2

where Y_j \in \{X_l \mid l = 1, 2, \ldots, M; l \neq i\} denotes the j-th nearest neighbor of X_i. The fitting error \varepsilon(W) is subject to the weight matrix W, which is composed of the fitting coefficients W = [W_{ij} \mid i = 1, 2, \ldots, M; j = 1, 2, \ldots, K]. The LLE algorithm is used to find the optimal W that minimizes \varepsilon(W).

(2) We rewrite the weight matrix as follows:

W = \begin{bmatrix}
W_{11} & W_{21} & \cdots & W_{M1} \\
W_{12} & W_{22} & \cdots & W_{M2} \\
\vdots & \vdots & \ddots & \vdots \\
W_{1K} & W_{2K} & \cdots & W_{MK}
\end{bmatrix}

Suppose that \lambda = [\lambda_1, \lambda_2, \ldots, \lambda_K]^T are the K eigenvalues of W W^T organized in ascending order. \lambda is used as the feature to characterize the raw data X; we refer to it as the LLE-PCA feature. It is in effect a summarization of the manifold characteristics of X.

The motivation for developing such a feature is as follows. At a specific sampling time, the data collected from all sensors can be formatted as a vector, each element of which represents the number of vehicles passing a certain sensor during a time unit. Over a number of sampling times, the data can therefore be viewed as a vector sequence, or a point cloud, and the feature extraction is performed on this point cloud. It is intuitive to use manifold learning techniques to characterize the manifold of the point cloud: W is a numerical representation of how every point is related to its neighbors, and computing the eigenvalues of the matrix W W^T amounts to PCA. PCA usually serves as noise filtering, highlighting the major information while suppressing the minor information. The LLE-PCA feature has previously been applied to nonlinear time series analysis in reconstructed state space [8].
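For concreteness, the following Python sketch (a minimal illustration, not the authors' implementation) computes the LLE-PCA feature of one window. It follows the standard LLE recipe for the local weights, i.e., a sum-to-one constrained least-squares fit with a small regularization of the local Gram matrix; these two details are customary in LLE implementations [7] but are not spelled out above, so they should be read as assumptions.

```python
import numpy as np

def lle_pca_feature(X: np.ndarray, K: int, reg: float = 1e-3) -> np.ndarray:
    """
    Sketch of the LLE-PCA feature of one window.

    X   : N x M matrix whose columns X_1, ..., X_M are the samples in the window.
    K   : number of nearest neighbors used in the local LLE fit.
    reg : small regularization of the local Gram matrix (a common practical
          choice in LLE implementations; not specified in the paper).

    Returns the K eigenvalues of W W^T in ascending order, where W is the
    K x M matrix of reconstruction weights.
    """
    N, M = X.shape
    W = np.zeros((K, M))
    for i in range(M):
        xi = X[:, i]
        # K nearest neighbors of X_i among the other columns (Euclidean distance).
        d = np.linalg.norm(X - xi[:, None], axis=0)
        d[i] = np.inf
        nbrs = np.argsort(d)[:K]
        # Standard LLE local fit: minimize ||X_i - sum_j W_ij Y_j||^2
        # subject to sum_j W_ij = 1, solved via the local Gram matrix.
        Z = X[:, nbrs] - xi[:, None]            # centered neighbors
        C = Z.T @ Z
        C += reg * np.trace(C) * np.eye(K)      # regularize for numerical stability
        w = np.linalg.solve(C, np.ones(K))
        W[:, i] = w / w.sum()
    # Eigenvalues of the symmetric PSD matrix W W^T, ascending order.
    return np.linalg.eigvalsh(W @ W.T)
```

Since W W^T is symmetric and positive semidefinite, np.linalg.eigvalsh already returns its eigenvalues in ascending order, matching the ordering used above.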

D. Outlier Detection

Suppose that [X_1, X_2, ..., X_L] is a long sequence that can be divided into a number of subsequences [X_1, ..., X_M; X_{M+1}, ..., X_{2M}; ...; X_{(R-1)M+1}, ..., X_{RM}] with a time window of length M, where L = RM. For each subsequence we obtain an LLE-PCA feature; let F_i denote the LLE-PCA feature of the subsequence in the i-th time window. Our goal is to seek outliers in the feature space spanned by {F_i | i = 1, 2, ..., R}. The algorithm is as follows:

Input: {F_i | i = 1, 2, ..., R}.
Output: Outliers.

1. Compute the Euclidean distance from every feature vector to each of its k nearest neighbors, d(F_i, F_{ij}), where i = 1, 2, ..., R and j = 1, 2, ..., k;

2. For i = 1, 2, ..., R, compute

lad(F_i) = \frac{1}{k} \sum_{j=1}^{k} d(F_i, F_{ij});

3. Compute

m = \frac{1}{R} \sum_{i=1}^{R} lad(F_i);

4. If lad(F_i) > \theta \cdot m, where \theta is a threshold and i = 1, 2, ..., R, then add F_i to the list of outliers for output.

The above outlier detection method is based on the k nearest neighbors of every data point in the feature space. First, we compute the average Euclidean distance from the k nearest neighbors to the data point of interest, referred to as the local average distance. Intuitively, if a data point is an outlier, it must be far away from the other data points, so its local average distance must be remarkably larger than that of normal points. Once we have the local average distances of all data points, we compute their mean m. If the local average distance of a data point p satisfies lad(p) > θ·m, the data point p is regarded as an outlier.
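A minimal Python sketch of this detector is given below; it illustrates steps 1-4 above rather than reproducing the authors' code, and the vectorized distance computation is an implementation choice of ours.

```python
import numpy as np

def detect_outliers(F: np.ndarray, k: int, theta: float) -> list:
    """
    kNN-based detector described above.

    F     : R x K array; row i is the LLE-PCA feature F_i of the i-th window.
    k     : number of nearest neighbors used for the local average distance.
    theta : threshold factor.

    Returns the indices i with lad(F_i) > theta * m.
    """
    R = F.shape[0]
    # Pairwise Euclidean distances between window features.
    D = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)                 # exclude the point itself
    # Steps 1-2: local average distance over the k nearest neighbors.
    lad = np.sort(D, axis=1)[:, :k].mean(axis=1)
    # Step 3: mean of the local average distances.
    m = lad.mean()
    # Step 4: flag windows whose lad exceeds theta * m.
    return [i for i in range(R) if lad[i] > theta * m]
```

Combined with the feature extraction sketched in Section II.C, detect_outliers returns the indices of the time windows flagged as abnormal.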

III. EXPERIMENTS

Real-world moving-object data are used to validate the proposed method. The data are continuously collected by the Traffic Management Center of the Minnesota Department of Transportation at 30-second intervals from over 4,000 loop detectors located around the Twin Cities Metro freeways, seven days a week. The data comprise two kinds of files, volume files and occupancy files; the volume files record the traffic volume, and we use only this kind of file.

This dataset contains thousands of detectors' records, but not all of the detectors work every day. We select 4584 detectors that work every day. The data that we use cover the period from 1st January to 22nd September 2010. Most people work on working days but may go anywhere at weekends, so they behave similarly across working days. Since we only need periodic or quasi-periodic collective behaviors, we remove the data collected at weekends, and we also remove the days with incomplete data. This leaves a dataset of 156 days. We reformat the data with a time interval of 10 minutes and apply the proposed method to this dataset for outlier detection, setting K = 40 for calculating the LLE-PCA feature and k = 10 for calculating the outlier degree. Fig. 1 shows the normalized outlier value of every working day in this dataset, where the outlier values are normalized by dividing by the largest outlier value. In this figure, the black points represent normal days and the others are outliers. The 4 days with the highest outlier values are marked as 4 points with different colors, and the other days with high outlier values are marked as blue points.
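As an illustration of this preprocessing (a sketch under our own assumptions about the raw array layout, not the authors' code), the 30-second volume counts can be aggregated into 10-minute intervals and the outlier values normalized as follows:

```python
import numpy as np

# Assumed layout: volume_30s[i, t] is the 30-second volume count of detector i
# at sample t. Ten minutes correspond to 20 consecutive 30-second counts.
def aggregate_to_10min(volume_30s: np.ndarray) -> np.ndarray:
    """Sum every 20 consecutive 30-second counts per detector."""
    n_det, n_samp = volume_30s.shape
    usable = (n_samp // 20) * 20
    return volume_30s[:, :usable].reshape(n_det, -1, 20).sum(axis=2)

# Normalized outlier value as used in Fig. 1: divide every local average
# distance by the largest one so that the values fall in (0, 1].
def normalize_outlier_values(lad: np.ndarray) -> np.ndarray:
    return lad / lad.max()
```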

We also conduct experiments to evaluate how the parameters affect the outlier detection performance. There are 3 parameters in total: the parameter K used in computing the LLE-PCA feature, the parameter k used in computing the outlier degree, and the threshold θ used to decide whether the object of interest is an outlier. The outlier detection results for different parameter settings are listed in Table I and Table II. Only a few days differ between the outlier lists when the parameters change. Note that 4 days, 1st January, 31st May, 5th July, and 13th August, are always detected as outliers regardless of the parameter values. As shown in the following table, these 4 days are New Year's Day, Memorial Day, Independence Day, and a day with a storm, which explains why people behaved abnormally on these days. Besides, 18th January is detected as an outlier in most cases; we attribute the abnormality of this day to a snow storm.

Date         Event
2010/1/1     New Year
2010/1/18    Snow storm
2010/2/8     Snow of 4.7 inches
2010/5/31    Memorial Day
2010/7/5     Independence Day
2010/8/13    Storm
2010/9/6     Labor Day
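For reference, the parameter sweep behind Tables I and II could be reproduced with a loop like the following sketch, which reuses the lle_pca_feature and detect_outliers helpers sketched in Section II; the function name and the default parameter grids simply mirror the values used in the tables (K in {20, 40}, k in {10, 20, 40}, θ in {1.5, ..., 4}).

```python
from itertools import product
import numpy as np

def parameter_sweep(windows, Ks=(20, 40), ks=(10, 20, 40),
                    thetas=(1.5, 2, 2.5, 3, 3.5, 4)):
    """Map each (K, k, theta) setting to the list of flagged window indices."""
    results = {}
    for K in Ks:
        # LLE-PCA feature of every window for this K (see the Section II.C sketch).
        F = np.vstack([lle_pca_feature(X, K) for X in windows])
        for k, theta in product(ks, thetas):
            results[(K, k, theta)] = detect_outliers(F, k, theta)
    return results
```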

IV. CONCLUSIONS

We investigate the problem of outlier detection on large-scale collective behaviors, focusing on feature extraction, which is a missing topic in the current literature. A case study of detecting abnormal traffic flows is conducted; the problem is in effect outlier detection on co-evolving data streams. Our contribution is a new solution that views the co-evolving data streams as high-dimensional trajectories and applies manifold learning techniques to capture the characteristics of such trajectories. The outlier detection results obtained with the proposed feature show that this scheme is effective in detecting abnormal traffic flows, since known special days such as New Year's Day, Independence Day, and a day with a storm are detected as outliers. As manifold learning is a mature area, many tools developed there may be used in further studies.

ACKNOWLEDGMENT

This work is supported by the 973 Program (grant No. 2010CB731401), NSFC (grants No. 91024011 and No. 61071133), the Ministry of Industry and Information Technology of China (grant No. 2010ZX01042-002-003-004), and the Science and Technology Commission of Shanghai Municipality (grant No. 09JC1401500).

REFERENCES

[1] N. Eagle and A. Pentland, "Reality Mining: Sensing Complex Social Systems", Personal and Ubiquitous Computing, vol. 10, no. 4, pp. 255-268, 2006.
[2] J. Candia, M. C. Gonzalez, P. Wang, T. Schoenharl, G. Madey, and A.-L. Barabasi, "Uncovering individual and collective human dynamics from mobile phone records", Journal of Physics A: Mathematical and Theoretical, vol. 41, 224015, 2008.
[3] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier Detection Techniques for Wireless Sensor Networks: A Survey", IEEE Communications Surveys & Tutorials, vol. 12, no. 2, pp. 1-12, 2010.
[4] Y. Bu, L. Chen, A. W.-C. Fu, and D. Liu, "Efficient Anomaly Monitoring Over Moving Object Trajectory Streams", in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2009, pp. 159-167.
[5] P. Chhabra, C. Scott, E. D. Kolaczyk, and M. Crovella, "Distributed Spatial Anomaly Detection", in Proceedings of the 27th Conference on Computer Communications (INFOCOM 2008), IEEE, 2008, pp. 1705-1713.
[6] J. Zhang, Q. Gao, and H. Wang, "SPOT: A System for Detecting Projected Outliers from High-dimensional Data Streams", in Proceedings of the 16th International Conference on Data Engineering, IEEE, 2008, pp. 1628-1631.
[7] S. T. Roweis and L. K. Saul, "Nonlinear Dimensionality Reduction by Locally Linear Embedding", Science, vol. 290, pp. 2323-2326, 2000.
[8] S. Yang and I. F. Shen, "Manifold analysis in reconstructed state space for nonlinear signal classification", in Lecture Notes in Computer Science (ICIC 2007), vol. 4681, Springer, 2007, pp. 930-937.

[Figure 1. The normalized outlier values of all days. X-axis: day index; y-axis: normalized outlier value (0-1). Black points denote normal days; the remaining points are outliers, with 1st Jan., 31st May, 5th Jul., and 13th Aug. highlighted.]

TABLE I: ABNORMAL DAYS DETECTED USING LLE-PCA FEATURE WITH K=20
[Rows: all combinations of k in {10, 20, 40} and θ in {1.5, 2, 2.5, 3, 3.5, 4}; columns: the candidate dates 01/01, 01/07, 01/18, 02/08, 02/22, 03/08, 03/15, 05/31, 06/23, 07/05, 07/14, 07/15, 08/10, 08/11, 08/13, 09/06, 09/21; a check mark indicates that the date is detected as an outlier under that parameter setting. The dates 01/01, 05/31, 07/05, and 08/13 are detected under every setting.]

TABLE II: ABNORMAL DAYS DETECTED USING LLE-PCA FEATURE WITH K=40
[Same layout as Table I: rows are all combinations of k in {10, 20, 40} and θ in {1.5, 2, 2.5, 3, 3.5, 4}; columns are the candidate dates from 01/01 to 09/21; a check mark indicates that the date is detected as an outlier under that parameter setting. The dates 01/01, 05/31, 07/05, and 08/13 are detected under every setting.]
