IEEE SENSORS JOURNAL, VOL. 15, NO. 4, APRIL 2015


Crowd Escape Behavior Detection and Localization Based on Divergent Centers

Chun-Yu Chen and Yu Shao

Abstract— In this paper, we propose a novel framework for anomalous crowd behavior detection and localization in intelligent video surveillance systems by introducing divergent centers. The scheme models the crowd motion obtained from the optical flow; the extracted magnitude, position, and direction are used to construct the motion model, and a weighted-velocity method is applied to measure the motion velocity. People usually escape instinctively from a place where an abnormal or dangerous event occurs. Based on this inference, a novel algorithm for detecting divergent centers is proposed: divergent centers indicate the possible places where abnormal events occur. The proposed algorithm can identify more than one divergent center by analyzing the intersections of motion vectors, and it consists of a distance segmentation method and a nearest neighbor search. The performance of our method is validated in a number of experiments on public data sets.

Index Terms— Anomaly detection, crowd escape, weighted velocity, divergent center.

I. INTRODUCTION AND RELATED WORK

Recently, there has been growing interest in anomalous crowd event detection. Timely detection of abnormal crowd events leaves more time for taking action to reduce their impact. Although a great deal of work has been done in this field, it remains a very challenging research area. Existing work can be roughly divided into two classes: one focuses on single actions [1] or individual pedestrian behavior [2], while the other focuses on group activities, such as fighting or escape events, in crowd scenarios, and the latter has received increasing attention in recent years. Traditional techniques [3], which work reasonably well for low-density crowd scenes, cannot handle highly dense crowd motion. A main limitation is the lack of a universal definition of anomaly: there are countless possible abnormal behaviors in a crowd scene, so it is infeasible to enumerate the complete set of anomalies. A common solution is to adopt a working definition of anomaly.

Manuscript received October 7, 2014; accepted December 7, 2014. Date of publication December 18, 2014; date of current version February 10, 2015. This work was supported in part by the National Natural Science Foundation of China under Grant 61175126 and in part by the Ph.D. Programs Foundation of the Ministry of Education, China, under Grant 20112304110009. The associate editor coordinating the review of this paper and approving it for publication was Dr. Nitaigour P. Mahalik. The authors are with the College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, China (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JSEN.2014.2381260

Abnormal behavior is defined as behavior that deviates from the expected, i.e., as unusual behavior, as in [4] and [5]. In the real world, pedestrians usually walk along the street largely unaffected by their neighbors, which results in regular movement patterns. However, when unexpected events happen, such as fires, riots, or explosions, people escape, since this is an instinctive response to danger. Crowd events caused by panic often have serious consequences, such as stampedes and deaths, as described in [6]. These facts motivate researchers to pay more attention to studying panic behaviors and predicting the occurrence of such events. This paper therefore focuses on the following problem: given a set of training video sequences, analyze the behaviors in the video sequences and detect the positions of abnormal events automatically. We specifically focus on abnormal events with crowd escape behaviors in this work. Many computer vision methods have been applied to identify crowd behaviors in surveillance videos. A survey [7] summarizes several of them, including crowd tracking [8]–[10], crowd density counting [11], abnormality detection [12], [13], and crowd scene understanding [14], [15]. A review [16] discusses the advantages and disadvantages of each approach from the aspects of the definition of abnormality, the detection algorithm, and the feature extraction, but very few works discuss the localization of abnormal behaviors. A few methods have been proposed for characterizing crowd behavior events in surveillance videos, mainly based on the analysis of optical flow, spatio-temporal features [17], dynamic textures [18], and graph theory [19]. There are also a few methods that directly extract motion patterns from optical flow fields. Although anomalous crowd detection has received some attention in recent years, most research has focused on identifying and tracking the abnormal behaviors of individuals in crowds [20], [21]. Crowd scene problems, such as crowd counting [22], movement tracking [8], and crowd activity understanding [23], have attracted the interest of many researchers. For abnormal crowd event detection, the approaches can be broadly classified according to whether or not examples of normal and abnormal events are needed in the training phase [24]–[26]. Different features are extracted for crowd events; for example, in [27] the trajectories that depict the normal events of a video are extracted and compared with the abnormal events. In [28], the authors proposed a direction model that extracts direction features from the optical flow field to recognize abnormal events. The novelty of this model is the use of multiple directions at each position.


The feature of magnitude is utilized to characterize crowd behaviors in [29]. Although both the algorithm in [29] and this work utilize magnitude, our proposed method combines it with additional information in the form of position and direction. A spatio-temporal feature is extracted in [11] to model local motion pattern behavior in extremely crowded scenes: the local motion patterns are captured with 3D Gaussian distributions via an HMM, and the HMM is applied to judge the state of an event (normal or abnormal). For the locations at which abnormal crowd events occur, [30] uses a Markov Chain Monte Carlo method to address the problem.

In this paper, we focus on the detection of crowd escape behaviors, and we propose a novel method to detect the locations of anomalous crowd events. Our work is based on the idea of considering a crowd as a single entity and studying its behavior. Hence, in what follows, the term "crowd" refers to more than three people in a small spatial region. We construct a novel framework for modeling crowd escape behaviors and demonstrate its effectiveness. As in the approaches above, the optical flow is employed to describe the crowd motion, and movement magnitudes, speeds, and directions are extracted. For the localization of abnormal crowd events, the notion of the divergent center is inspired by previous work [30], which introduces the divergent center only briefly; we propose a novel method, different from [30], to determine the divergent centers. Anomaly detection is based on the stable appearance of divergent centers when the motion features change, so the method handles not only the detection of escape events but also the possible locations of those events. The primary contributions of this paper are as follows: 1) a weighted velocity of the optical flow vectors is proposed; 2) a novel method for computing divergent centers is introduced; 3) the divergent centers are used directly to model normal and abnormal behavior events. This paper is organized into two main parts, anomaly detection and anomaly localization, as follows. In Section II, we introduce the proposed approach for analyzing and modeling anomalous crowd events. In Section III, we introduce and analyze the method for localizing anomalous crowd events. In Section IV, we show the experimental results in multiple scenarios. Finally, our conclusions and potential extensions of this work are given in Section V.

II. MODELING CROWD EVENT

In this section, we describe the construction of the divergent center model. The proposed framework for crowd escape event detection is based on the analysis of the holistic crowd rather than of individuals. We first represent the video sequences by their optical flow fields and divide each frame into a set of patches on a regular grid to analyze the crowd behaviors. Then we extract the foreground and the motion features, which include position, magnitude, and direction. When anomalous crowd behavior detection is performed, the weighted velocity is computed from these features.

The weighted velocity is then used to classify the crowd behaviors into two classes with an empirically defined threshold. Finally, we introduce the divergent center analysis to classify the crowd movement: normal and abnormal behaviors are distinguished by whether the divergent centers appear stably or not.

A. Feature Extraction

In this step, we start by extracting features. People usually stay in a certain region, or move over a small range, during a short period. In contrast, people quickly move away from the locations where they are, or where they intend to go, when unusual events occur, which results in changes in the magnitude, direction, and location of the motion vectors. Consequently, these features are extracted from the optical flow field. In the foreground extraction stage, a clustering method and a thresholding method are applied. We extract three main features: optical flow positions, optical flow magnitudes, and optical flow directions. In crowd scenes, since people are dense, the crowd movements are restricted by other people, so that individuals have similar and usually slow movement speeds. People are usually driven by their goals, and the direction of an individual stays nearly the same within a certain time interval. In the case of escape, in contrast to normal behavior, the movement speeds are very high and the changes in direction are very large due to panic: people quickly move away from the locations where they are or where they intended to go. The crowd may run in any direction within the field of view, so the foreground patches can appear at any position; as a result, the crowd area in the field of view is relatively large when escape behaviors happen. These three features are therefore very significant for anomaly detection. Since the main objective of this paper is to detect the locations of abnormal events, the extracted features are the foundation of the detection, and they are used to represent the behaviors. In our model, we analyze the crowd behaviors in a video through the corresponding optical flow field. Let ν be an optical flow field; we divide the whole frame into small blocks represented as ν = {v1, v2, ..., vn}. For example, if the frame size is 320 × 240 and the block size is 24 × 24 pixels, the whole frame is divided into 10 × 14 small regions. A block belonging to the foreground usually has a greater magnitude than a threshold ε. The purpose of dividing the frame into blocks is to reduce the amount of calculation and thereby overcome the disadvantage of high computational complexity.

B. The Weighted Velocity

We propose a method based on divergent centers to detect the locations of escape events. In order to reduce the computational complexity, we add the restriction that the crowd velocity is checked before determining divergent centers: the detection of divergent centers is performed only if the detected velocity is greater than a threshold obtained from experimental experience.
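As a concrete illustration of the block-based feature extraction of Section II-A, the following Python sketch divides a dense optical flow field into 24 × 24 blocks and marks as foreground the blocks whose mean magnitude exceeds the threshold ε. The array layout, the helper name, and the use of the mean block magnitude are illustrative assumptions, not the authors' MATLAB implementation.

```python
import numpy as np

def foreground_blocks(u, v, block=24, eps=1.0):
    """Mark foreground blocks of an optical flow field (u, v).

    u, v : 2-D arrays (H x W) with the horizontal/vertical flow components.
    A block is labelled foreground when the mean magnitude of its flow
    vectors exceeds the threshold eps (Section II-A).
    """
    H, W = u.shape
    mag = np.sqrt(u ** 2 + v ** 2)              # per-pixel magnitude, cf. Eq. (2)
    nby, nbx = int(np.ceil(H / block)), int(np.ceil(W / block))
    mask = np.zeros((nby, nbx), dtype=bool)
    for by in range(nby):
        for bx in range(nbx):
            patch = mag[by * block:(by + 1) * block, bx * block:(bx + 1) * block]
            mask[by, bx] = patch.mean() > eps   # foreground if mean magnitude > eps
    return mask
```

For a 240 × 320 frame this yields the 10 × 14 block grid used throughout the paper.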


In the case of escape, the movement features, which include magnitudes, speeds, and directions, change remarkably. The optical flow is a reliable approximation to the two-dimensional image motion, and it has good spatio-temporal characteristics, since it contains the vector information of the instantaneous movement of each pixel. The crowd in different scenes shows different behaviors, and thus different optical flow velocity features; therefore, we can analyze the extent of abnormal behaviors by calculating the weighted velocity of the optical flow. Abundant information is implied in the transformation between image frame sequences and pixels. In the case of normal behavior, the movement magnitudes are small and the movement directions are broadly the same, so the motion velocities are low, whereas in the case of abnormal behavior the motion velocities are much greater. This is the reason why the velocity can be used to describe motion behaviors in video surveillance systems. The optical flow includes not only the instantaneous speed of each pixel but also its motion direction. The magnitude of the velocity reflects the intensity of the movement, and the movement direction angle can be used to determine the direction of the moving target. In this paper, the Horn–Schunck optical flow method, which assumes that the optical flow changes smoothly over the whole image [31], is employed. The non-weighted velocity, which uses the speed information alone, does not distinguish abnormal behaviors from normal behaviors very well. Thus the weighted velocity of the optical flow in the n-th frame is extracted according to the following formula:

$$V(n) = \sum_{i=1}^{W} \sum_{j=1}^{H} w_{i,j}(n)\, \mathrm{mag}_{i,j}(n) \qquad (1)$$

where $V(n)$ denotes the weighted velocity of the optical flow in the motion region, $\mathrm{mag}_{i,j}(n)$ is the magnitude at pixel $\mathrm{pix}(i, j)$ of the n-th frame, $w_{i,j}(n)$ is the weight of the velocity, $W$ is the width of the motion area, and $H$ is its height. The magnitude of each pixel in the motion area is calculated as

$$\mathrm{mag}_{i,j}(n) = \sqrt{u^{2}(x, y) + v^{2}(x, y)} \qquad (2)$$

where $u_{i,j}$ and $v_{i,j}$ are the horizontal and vertical velocity components, respectively. The angle of motion is

$$\mathrm{Ang}_{i,j}(n) = \arctan \frac{u_{i,j}(n)}{v_{i,j}(n)} \qquad (3)$$

The weight of the velocity is defined as

$$w_{i,j}(n) = \left( \frac{|\Delta \mathrm{Ang}|}{\pi} \times 10 \right)^{2} + \left( \frac{|\Delta \mathrm{Ang}_{Max}|}{\pi} \times 10 \right)^{2} \qquad (4)$$

where $\Delta \mathrm{Ang} = \mathrm{Ang}_{Cur} - \mathrm{Ang}_{Avg}$ and $\Delta \mathrm{Ang}_{Max} = \mathrm{Ang}_{Cur} - \mathrm{Ang}_{Max}$; $\mathrm{Ang}_{Cur}$ denotes the angle between the current pixel's motion vector and the horizontal direction, and $\mathrm{Ang}_{Max}$ is the angle of the maximum-speed pixel in the motion region. The average angle between the moving targets and the horizontal direction is

$$\mathrm{Ang}_{Avg}(n) = \frac{1}{\mathrm{Sum}(P)} \sum_{(i,j) \in P} \arctan \frac{u_{i,j}(n)}{v_{i,j}(n)} \qquad (5)$$

where $\mathrm{Sum}(P)$ denotes the number of pixels in the motion region $P$.
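A minimal sketch of the weighted-velocity computation of Eqs. (1)–(5) is given below, assuming the flow components of the motion region are available as arrays; arctan2 is used in place of arctan to avoid division by zero, and the function name is an illustrative assumption rather than the authors' released code.

```python
import numpy as np

def weighted_velocity(u, v):
    """Weighted velocity V(n) of one frame's motion region, Eqs. (1)-(5).

    u, v : 2-D arrays (H x W) restricted to the motion region.
    """
    mag = np.sqrt(u ** 2 + v ** 2)                    # Eq. (2): per-pixel magnitude
    ang = np.arctan2(u, v)                            # Eq. (3): arctan(u / v), safe when v == 0
    ang_avg = ang.mean()                              # Eq. (5): average angle over the region
    ang_max = ang.flat[np.argmax(mag)]                # angle of the maximum-speed pixel
    d_ang = np.abs(ang - ang_avg)                     # deviation from the average direction
    d_ang_max = np.abs(ang - ang_max)                 # deviation from the max-speed direction
    w = (d_ang / np.pi * 10) ** 2 + (d_ang_max / np.pi * 10) ** 2   # Eq. (4)
    return float(np.sum(w * mag))                     # Eq. (1)
```

This produces one scalar per frame; a frame is passed to the divergent-center stage only when V(n) exceeds the empirically chosen threshold.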


III. ABNORMALITY LOCALIZATION BASED ON THE DISTANCE SEGMENTATION METHOD

Without loss of generality, we assume that people are driven by their goals in the non-escape case, whereas they move away from the locations where they are, or where they intend to go, when unusual events occur. If groups of people move away from the same area, we call this "divergence." Based on this inference, the notion of divergent centers is introduced in this paper to detect the location of an escape event. Although some studies, such as [32] and [33], have addressed the localization of individual anomalies, few have addressed the localization of anomalous crowd behaviors. However, determining the location of an escape event is very important for taking action in a timely manner. To determine the locations of anomalous crowd behaviors, we propose a novel algorithm that combines a k-nearest-neighbor search with a distance segmentation method. The full algorithm is provided for clarification in Algorithm 1 and works as follows. Given an input video with n frames, we obtain the w × w foreground blocks extracted from the optical flow field. The velocity vectors estimated from these blocks are the input of Algorithm 1, and the output is a set of divergent centers C. The algorithm first initializes a set of straight lines by storing the line parameters k and b, which are determined by the locations and directions of the pixels (only lines with a defined slope are considered). The parameters of a straight line are given by

$$k_l = v_{i,j} / u_{i,j}, \qquad b_l = i - k_l \times j \qquad (6)$$

where $k$ is the slope of the straight line, $b$ is its intercept, and $i$ and $j$ are the corresponding coordinates.

A. Determining the Intersection Set

Each intersection $p_1 = (x_1, y_1)$ in the intersection set $P = \{p_1, p_2, \ldots, p_s\}$ is obtained from two different lines. Solving for an intersection amounts to solving the two-variable linear system

$$x = \frac{b_2 - b_1}{k_1 - k_2}, \qquad y = \frac{b_2 - b_1}{k_1 - k_2} \times k_1 + b_1 \qquad (7)$$
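The line parameterization of Eq. (6) and the pairwise intersection of Eq. (7) reduce to a few lines of code; the sketch below is illustrative and ignores the degenerate cases (vertical or parallel lines) excluded above.

```python
def line_params(i, j, u, v):
    """Slope k and intercept b of the motion line through block (i, j), Eq. (6)."""
    k = v / u
    b = i - k * j
    return k, b

def intersection(k1, b1, k2, b2):
    """Intersection of two motion lines, Eq. (7); assumes k1 != k2."""
    x = (b2 - b1) / (k1 - k2)
    y = k1 * x + b1
    return x, y
```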

In escape crowd scenes, the movement direction of an individual cannot follow the ideal situation in which the object flees from danger along a straight line, due to the restriction of the other people around it. Thus, the directions of the foreground blocks are additionally corrupted by Gaussian noise $N(\mu, \sigma^2)$ in order to study the distribution of the intersections.


Algorithm 1: Determine the Positions of Divergent Centers

1: Input the velocity vectors U(n) = {u_1(n), u_2(n), ..., u_k(n)} and V(n) = {v_1(n), v_2(n), ..., v_k(n)}.
2: Obtain the set of line parameters k, b.
3: Obtain the intersection set P = {p_1, p_2, ..., p_s}.
4: Apply the outlier-removal method to obtain the reduced intersection set Z = {z_1, z_2, ..., z_n}.
5: Initialize the divergent centers:
   5.1: Use the knnsearch method on Z = {z_1, ..., z_n} to obtain the N nearest sets of points Dist = {Dist_1n, Dist_2n, ..., Dist_nn}, arranged from small to large.
   5.2: Initialize three divergent center sets C = {C_1, C_2, C_3}.
   5.3: Examine the distance of the k-th nearest point in the set Dist_1n.
   5.4: If dis_max < d_0, the k nearest points are retained and saved to the set C_1 (the threshold d_0 is determined from experimental experience).
   5.5: for i = k + 1 to n: if the distance of the i-th nearest point is less than d_0, the i nearest points are retained in the set C_1;
   5.6: otherwise, they are saved to the set C_2 or the set C_3; the judgment principle for C_2 and C_3 is the same as for C_1.
   5.7: end for.
   5.8: The distance segmentation method is used to separate the different dense regions.
   5.9: Otherwise (if dis_max ≥ d_0 in step 5.4), no divergent center exists.
   5.10: Return the divergent center sets C(k); the divergent centers c = {c_1, c_2, c_3} are the centroids of the sets C = {C_1, C_2, C_3}.
6: Perform Algorithm 2 to determine the new intersection set Z' = {z'_1, z'_2, ..., z'_n}.
7: Re-perform steps 5.1 to 5.10 on Z' to obtain the new divergent centers.
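A compact sketch of the core of Algorithm 1 follows: it removes outlying intersections with a k-nearest-neighbor distance test and then groups the remaining intersections into at most three dense regions, whose centroids serve as the divergent centers. The greedy grouping and the numeric thresholds are illustrative assumptions; the paper's implementation relies on MATLAB's knnsearch and the distance segmentation rules of Section III-B.

```python
import numpy as np

def divergent_centers(points, K=5, d_far=200.0, d0=30.0, max_centers=3):
    """Estimate divergent centers from an array of intersections (N x 2).

    1. Outlier removal: drop points whose mean distance to their K nearest
       neighbours exceeds d_far.
    2. Distance segmentation: greedily grow dense regions in which every new
       point lies within d0 of the region centroid, up to max_centers regions.
    Returns the centroids of the surviving regions.
    """
    pts = np.asarray(points, dtype=float)
    if len(pts) <= K:
        return []
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    knn = np.sort(dist, axis=1)[:, 1:K + 1]            # distances to the K nearest neighbours
    pts = pts[knn.mean(axis=1) < d_far]                # keep only non-outlying intersections

    regions = []
    for p in pts:                                      # greedy distance segmentation
        for region in regions:
            if np.linalg.norm(np.mean(region, axis=0) - p) < d0:
                region.append(p)
                break
        else:
            if len(regions) < max_centers:
                regions.append([p])
    return [np.mean(r, axis=0) for r in regions]       # centroids = divergent centers
```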

In order to improve the performance of the proposed algorithm, we analyze the distribution of the divergent centers. We assume that there are three moving objects with one divergent center in the scene, and that the three moving objects are perturbed by three different Gaussian noises in order to approximate the real situation. The three moving objects are located at points A(a_1, b_1), B(a_2, b_2), and C(a_3, b_3), their directions are k_1, k_2, and k_3, respectively, and we assume that the ideal intersections are p_1(0, 0), p_2(0, 0), and p_3(0, 0). The three straight-line equations can then be written as

$$\begin{cases} y_1' = (k_1 + \varphi_1)\, x_1 + b_1 - k_1 a_1 \\ y_2' = (k_2 + \varphi_2)\, x_2 + b_2 - k_2 a_2 \\ y_3' = (k_3 + \varphi_3)\, x_3 + b_3 - k_3 a_3 \end{cases} \qquad (8)$$

where $\varphi_1 \sim N(u_1, \sigma_1^2)$, $\varphi_2 \sim N(u_2, \sigma_2^2)$, and $\varphi_3 \sim N(u_3, \sigma_3^2)$ denote three different Gaussian noises.


Algorithm 2: Determine the New Intersections

1: Input the velocity vectors U(n) = {u_1(n), u_2(n), ..., u_k(n)}, V(n) = {v_1(n), v_2(n), ..., v_k(n)} and the initial divergent centers c = {c_1, c_2, c_3}.
2: for i = 1 to L
3:   if c(i) is not empty
4:     Obtain the moving object closest to the divergent center c(i), and obtain a corresponding virtual moving object by computing the symmetric object with respect to the initial divergent center c(i). The virtual moving object is added to the velocity vectors U(n), V(n).
5:   otherwise, continue.
6: end for
7: Return the new velocity vectors U(n) = {u_1(n), u_2(n), ..., u_k(n), ..., u_s(n)}, V(n) = {v_1(n), v_2(n), ..., v_k(n), ..., v_s(n)}.
8: Obtain the sets of line parameters k, b.
9: Compute the intersection set P' = {p'_1, p'_2, ..., p'_m}.
10: Obtain the intersection set Z' = {z'_1, z'_2, ..., z'_r} after removing outliers.
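The mirroring step of Algorithm 2 can be sketched as follows; the variable names, and the assumption that the virtual object moves directly away from the center, are illustrative choices rather than details stated in the paper.

```python
import numpy as np

def add_virtual_object(positions, u, v, center):
    """Algorithm 2 (sketch): mirror the object closest to the initial divergent
    center about that center and append it as a virtual moving object.

    positions : (N, 2) array of object positions (x, y)
    u, v      : length-N arrays of flow components
    center    : (2,) initial divergent center
    """
    positions = np.asarray(positions, float)
    center = np.asarray(center, float)
    i = np.argmin(np.linalg.norm(positions - center, axis=1))   # closest object
    mirrored = 2.0 * center - positions[i]                      # point symmetry about the center
    new_pos = np.vstack([positions, mirrored])
    new_u = np.append(u, -u[i])     # assumed: the virtual object also moves away from the center
    new_v = np.append(v, -v[i])
    return new_pos, new_u, new_v
```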

The new intersections can then be obtained as

$$p_1 = \left( \frac{\varphi_1 a_1 - \varphi_2 a_2 + b_2 - b_1 + k_1 a_1 - k_2 a_2}{k_1 - k_2 + \varphi_1 - \varphi_2},\; (k_2 + \varphi_2)\,\frac{\varphi_1 a_1 - \varphi_2 a_2 + b_2 - b_1 + k_1 a_1 - k_2 a_2}{k_1 - k_2 + \varphi_1 - \varphi_2} + b_2 - (k_2 + \varphi_2)\, a_2 \right)$$

$$p_2 = \left( \frac{\varphi_1 a_1 - \varphi_3 a_3 + b_3 - b_1 + k_1 a_1 - k_3 a_3}{k_1 - k_3 + \varphi_1 - \varphi_3},\; (k_3 + \varphi_3)\,\frac{\varphi_1 a_1 - \varphi_3 a_3 + b_3 - b_1 + k_1 a_1 - k_3 a_3}{k_1 - k_3 + \varphi_1 - \varphi_3} + b_3 - (k_3 + \varphi_3)\, a_3 \right)$$

$$p_3 = \left( \frac{\varphi_2 a_2 - \varphi_3 a_3 + b_3 - b_2 + k_2 a_2 - k_3 a_3}{k_2 - k_3 + \varphi_2 - \varphi_3},\; (k_3 + \varphi_3)\,\frac{\varphi_2 a_2 - \varphi_3 a_3 + b_3 - b_2 + k_2 a_2 - k_3 a_3}{k_2 - k_3 + \varphi_2 - \varphi_3} + b_3 - (k_3 + \varphi_3)\, a_3 \right) \qquad (9)$$

We analyze the distribution of the intersections by simulating intersection positions: we generate 1000 sets of random numbers $\varphi_1$, $\varphi_2$, and $\varphi_3$ and collect the statistics of the resulting intersections. From the expressions of the intersections, we can expect their distribution to be related to the positions of the moving objects. We generate three moving objects located within a certain angular range and asymmetric with respect to the divergent center, and three moving objects located in two angular ranges and symmetric with respect to the divergent center. The distributions of the intersections are shown in Fig. 1. The figure shows that the distribution of the intersections is indeed related to the positions of the moving objects: the estimated divergent center lies close to the dense region of the moving objects. Thus, in the initialization stage of the intersections, we add a step that selects some moving objects and obtains the corresponding virtual moving objects by computing the symmetric objects with respect to the initial divergent center; the new intersection set is then obtained.



TABLE II: Mean Square Error With the Situation of Line and the Situation of Half-Line

Fig. 1. Left column: the moving objects' positions are asymmetric. Right column: the moving objects' positions are symmetric. The red circle represents the ideal divergent center.

Fig. 2. The distributions of intersections: (a) the situation with no virtual moving object; (b) the situation with one virtual moving object; (c) the situation with two virtual moving objects.

TABLE I: Mean Square Error With the Situation of Virtual Moving Objects

We then discuss the influence of the number of virtual moving objects on the estimated divergent center. For instance, Fig. 2 shows the distribution of the divergent center for different numbers of virtual moving objects, and Table I shows the corresponding mean square error. From these simulation results, we can improve the proposed method by adding virtual moving objects: the mean square error with a virtual moving object is clearly smaller than that without one. However, the mean square error does not always decrease as the number of virtual moving objects increases. Thus we select only one virtual moving object, and the selection principle is to choose the moving object that is closest to the divergent center. Since this paper focuses on the detection of crowd escape behaviors, using half-lines (rays) instead of straight lines is more suitable for this work. In order to confirm this idea, we illustrate it through simulation; the results are shown in Table II.
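The Monte Carlo experiment behind Tables I and II can be reproduced with a sketch of the following kind; the object positions, the noise level tan 5°, and the use of the intersection mean as the center estimate are illustrative assumptions, not the exact simulation settings of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
center = np.array([0.0, 0.0])
objects = np.array([[30.0, 10.0], [-20.0, 25.0], [5.0, -35.0]])   # three diverging objects

def estimate_center(noise_sigma=np.tan(np.deg2rad(5.0))):
    """One trial: perturb the three slopes by Gaussian noise (Eq. 8),
    intersect the three lines pairwise (Eqs. 7/9) and average the intersections."""
    ks = (objects[:, 1] - center[1]) / (objects[:, 0] - center[0])   # ideal slopes
    ks = ks + rng.normal(0.0, noise_sigma, size=3)                   # noisy slopes
    bs = objects[:, 1] - ks * objects[:, 0]                          # intercepts, Eq. (6)
    pts = []
    for a in range(3):
        for b in range(a + 1, 3):
            x = (bs[b] - bs[a]) / (ks[a] - ks[b])                    # Eq. (7)
            pts.append([x, ks[a] * x + bs[a]])
    return np.mean(pts, axis=0)

estimates = np.array([estimate_center() for _ in range(1000)])
mse = np.mean(np.sum((estimates - center) ** 2, axis=1))
print("mean square error over 1000 trials:", mse)
```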

From Fig. 1 and Fig. 2 we can see that when the angle between the true trajectory and the ideal trajectory of a moving object is too large, the resulting intersection may be very far from the ideal divergent center; such an intersection is an outlier that is not needed. A key step of the algorithm is therefore removing the outliers from the intersection set, since this operation reduces both the adverse factors and the amount of calculation. In this step, an adaptive threshold method is first applied to remove the points that are too far away from the field of view. Then a k-nearest-neighbor search is applied to remove the outliers, namely the k-th farthest point sets.

B. The Distance Segmentation Method

The k-nearest-neighbor search is used to find the divergent center sets C(k), where C(k) = {C_i(k)}_L denotes the obtained divergent center sets and L is the maximum number of divergent center sets. c_i(k) is the centroid of the i-th divergent center set. The farthest distance between two intersections in a divergent center set must be less than a constant d_th. After the divergent center sets C(k) have been determined, the distance segmentation method is used to identify the dense regions of intersections, i.e., a divergent center is the center of such a dense region. In this paper, the distance segmentation method is used to separate the different dense regions, and the number of dense regions is equal to the number of divergent centers. We stipulate that the distance between any two members of a divergent center set is much less than the distance between two divergent centers, so the segmentation of the divergent centers can be done by comparing the spatial distance between two dense regions with a constant d_0. We define the distance between the i-th centroid $c_i(k) = [x_{ik}\; y_{ik}]$ and the j-th centroid of the divergent center sets in C(k) as

$$d\big(c_i(k), c_j(k)\big) = \sqrt{\big(x_{ik} - x_{jk}\big)^2 + \big(y_{ik} - y_{jk}\big)^2} \qquad (10)$$

If

$$d\big(c_i(k), c_j(k)\big) < d_0 \qquad (11)$$

then $c_i(k)$ and $c_j(k)$ are considered to belong to the same center set. $d_0$ reflects the density of the intersections in a center set, and the choice of $d_0$ depends on the density of the crowd. From Eq. (10) and Eq. (11), we can obtain multiple dense regions, namely multiple divergent center sets.



We assume that there are L divergent center sets, denoted {C_1, C_2, ..., C_L}, with

$$C_i = \bigcup_{j=1}^{\tilde m_i} \tilde c^{\,i}_j(k) \qquad (12)$$

where $\tilde c^{\,i}_j(k)$ is the j-th intersection in the i-th divergent center set and $\tilde m_i$ is the number of intersections in that set. The distance between two different divergent center sets is defined as

$$d\big(C_i, C_j\big) = \min\, d\big(\tilde c^{\,i}_m(k), \tilde c^{\,j}_n(k)\big), \quad 1 \le m \le \tilde m_i,\; 1 \le n \le \tilde m_j \qquad (13)$$

When $d(C_i, C_j) \ge d_0$, $C_i$ and $C_j$ are considered two different divergent center sets; when $d(C_i, C_j) < d_0$, $C_i$ and $C_j$ are combined into one divergent center set. Within a set, the distances between the intersections must satisfy

$$\min\, d\big(\tilde c^{\,i}_m(k), \tilde c^{\,i}_n(k)\big) < d_0, \quad 1 \le m \le \tilde m_i,\; 1 \le n \le \tilde m_i,\; m \ne n \qquad (14)$$
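The merging rule of Eqs. (10)–(14) amounts to a pairwise minimum-distance test between candidate center sets; the sketch below, with an illustrative d_0, complements the grouping step sketched after Algorithm 1.

```python
import numpy as np

def merge_center_sets(center_sets, d0=30.0):
    """Merge divergent-center sets whose closest members are nearer than d0
    (Eqs. 10-14); each set is an array of intersections (m_i x 2)."""
    sets = [np.asarray(s, float) for s in center_sets]
    merged = True
    while merged and len(sets) > 1:
        merged = False
        for a in range(len(sets)):
            for b in range(a + 1, len(sets)):
                d = np.linalg.norm(sets[a][:, None, :] - sets[b][None, :, :], axis=2)
                if d.min() < d0:                       # Eq. (13): minimum pairwise distance
                    sets[a] = np.vstack([sets[a], sets[b]])
                    del sets[b]
                    merged = True
                    break
            if merged:
                break
    return [s.mean(axis=0) for s in sets]              # centroids of the final center sets
```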

IV. EXPERIMENTAL RESULTS

In this section we present the experiments and the results of our approach. We first focus on velocity detection using videos from the UMN dataset. We then evaluate the divergent center detection approach using synthetic and real vector fields; the real vector fields include the UMN dataset, the PETS dataset, and other videos. In this work, we adopt the optical flow method to compute the vector fields. Each video frame is resized to 240 × 320 pixels and the block size is 24 × 24, so the whole frame is partitioned into 10 × 14 small regions; the block size depends on how far away the camera is and how much detail we want to capture. We then randomly select 10% of the field vectors for training and employ the k-means algorithm (k = 2) to obtain a suitable value for the parameter ε, which is taken as the mean of the centroids of the two clusters. For the main parameters, the number of directions J is set to 8 and the maximum number of divergent centers is set to 3 (in general, a scenario does not have many divergent centers, and we only study the simple situations). We implement the proposed method in MATLAB, and all experiments are performed on a 2.8 GHz dual-core computer with 4 GB of RAM. For a given video with a resolution of 320 × 240, it takes on average 0.7895 s to compute an optical flow field and 1.1004 s to detect the divergent centers.

A. Velocity Detection

The approach described in the previous sections has been evaluated on the UMN dataset. This dataset has been widely used for performance evaluation and contains three escape scenes. Normal events depict individuals wandering around or organized in groups, while abnormal events depict a crowd escaping in panic. Fig. 3 shows some sample frames and the detection results: we display the results for the three different scenes and, for each scene, the frames in which the escape events happen.
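The data-driven choice of the foreground threshold ε described in the experimental setup above (k-means with k = 2 on a 10% sample of flow magnitudes, with ε the mean of the two centroids) can be sketched as follows; the small Lloyd iteration is written out by hand purely for illustration, and the function name and iteration count are assumptions.

```python
import numpy as np

def estimate_epsilon(magnitudes, sample_frac=0.10, iters=50, seed=0):
    """Threshold eps = mean of the two centroids of a 2-means clustering
    of a random 10% sample of optical-flow magnitudes."""
    rng = np.random.default_rng(seed)
    mags = np.asarray(magnitudes, float).ravel()
    sample = rng.choice(mags, size=max(2, int(sample_frac * mags.size)), replace=False)
    c = np.array([sample.min(), sample.max()])          # initial centroids
    for _ in range(iters):                              # plain Lloyd iterations (k = 2)
        labels = np.abs(sample[:, None] - c[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                c[k] = sample[labels == k].mean()
    return float(c.mean())                              # eps = mean of the two centroids
```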

Fig. 3. Left column: sample frames from the normal events. Center column: sample frames from the abnormal events. Right column: the results of velocity detection.

TABLE III: Accuracy (%) Comparison of the Proposed Method With SF in Crowd Escape Behavior Detection

In this paper, the purpose of detecting the velocity is to reduce the computation needed for determining the divergent centers. In Fig. 3 we can clearly see a wave crest. In Fig. 3(a), the velocity curve begins to increase at about the 485th frame; compared with the video data, this is when the crowd starts to escape. The velocity curve reaches its peak at about the 526th frame, and after the 580th frame, as the people leave the scene, the curve begins to decrease. Fig. 3(b) and Fig. 3(c) show the same trend as Fig. 3(a). As Fig. 3 shows, the proposed approach successfully detects the trend of the abnormality, which lays the foundation for detecting the positions of the anomaly. A suitable threshold is therefore very important for obtaining an accurate output, which is the input for detecting the divergent centers. We test the proposed method and the SF method on these scenarios: the SF method of [30] and [34] was tested on these scenes, and we use the provided results for comparison. Benefiting from this comparison, our proposed method achieves a more stable performance. More representative comparison results for the scenarios are shown in Table III, where it can be seen that our method is more accurate.


Fig. 4. Results of divergent center positions detected in synthetic vector fields. (a) The situation of one divergent center. (b) The situation of two divergent centers. (c) The situation of three divergent centers.


Fig. 6. (a) The intersections of the straight lines. (b) Results of different outlier-removal methods. (c) The influence of K on the results.

Fig. 7. Divergent centers in a real video.

Fig. 5. (a) The change of the deviations when sigma is set to tan 5°. (b) The histogram of the deviation variation and the fitted Gaussian curve.

B. Determination of Divergent Center Locations

We determine the locations of the divergent centers with the algorithm described in Section III, using a set of synthetic vector fields and a real video recorded by ourselves for evaluation. We first specify the number of divergent centers as 1, 2, and 3 in three sets of synthetic vectors. In crowd scenes, the movement direction of an individual cannot follow the ideal situation in which the individual flees from danger along a straight line, due to the restriction of the other people around it; thus the directions of the foreground blocks are additionally corrupted by Gaussian noise N(0, tan 5°) in order to test the robustness of the proposed method. In the ideal case, the proposed algorithm can detect the location of the anomaly accurately, and in the experiments the specified maximum number of divergent centers is not exceeded. The results of the proposed method in the ideal situation are shown in Fig. 4, where the final set of divergent centers is marked by red triangles and the green dots denote the true divergent centers. For the realistic situation, we generate 2000 sets of random numbers obeying a normal distribution to collect statistics on the influence of the direction variation on the error. The results for one divergent center are shown in Fig. 5. From Fig. 5(a) we can see that the mean of the deviations is 0.0944, which illustrates that our approach is robust. We then construct the histogram of the deviations and fit a Gaussian curve; the mean and variance are determined by non-parametric estimation. From Fig. 5(b) we can see that the deviations obey an asymmetrical, right-skewed distribution, as confirmed by computing the skewness $g_1 = \frac{1}{s^3}\sum_{i=1}^{n}\left(X_i - \bar X\right)^3$ and the kurtosis $g_2 = \frac{1}{s^4}\sum_{i=1}^{n}\left(X_i - \bar X\right)^4$. However, many factors can influence the performance of the algorithm. From the intersections of the straight lines shown in Fig. 6(a), we can see that the main factors influencing the algorithm are the parameter K of the k-nearest-neighbor search and the method used to remove outliers.
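The skewness and kurtosis quoted above can be computed directly from the simulated deviations; the sketch below follows the moment form given in the text (note that the conventional estimators additionally divide by n).

```python
import numpy as np

def skewness_kurtosis(x):
    """Skewness g1 and kurtosis g2 of the deviation samples, using the moment
    form quoted in the text: sums of centred powers scaled by s^3 and s^4."""
    x = np.asarray(x, float)
    s = x.std()
    g1 = np.sum((x - x.mean()) ** 3) / s ** 3
    g2 = np.sum((x - x.mean()) ** 4) / s ** 4
    return g1, g2
```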

Fig. 8. Divergent centers in different scenarios.

The value of K can be obtained from the training sets. From Fig. 6(b), we can see that an optimal K can be found through experiments to achieve the best error performance, and from Fig. 6(c) we can see that the optimal outlier-removal method can be obtained within a certain range of angular change by setting a proper parameter. We also use the proposed approach to detect divergent centers in a real video that records students leaving a building after class. The crowd is dispersed, so divergent centers are always present. Fig. 7 shows three scenes, with the results superimposed on the video frames. Many obstacles exist in the video and the reflection from the glass is very pronounced, which lowers the accuracy of the algorithm; nevertheless, the proposed algorithm is capable of determining a suitable number of divergent centers in this scene, which illustrates that the proposed approach is robust. Finally, the detection of escape behavior events is evaluated on the UMN dataset and the PETS dataset. After the velocity threshold is determined by the method described in Section II-B, the divergent center detection method is applied. Fig. 8 shows nine representative cases, with the results of the detected divergent centers superimposed on the video frames.



C. Time Complexity Analysis

Our proposed approach includes two main steps, computing the velocity and determining the divergent centers, so its cost also comprises two parts. The cost of computing the velocity is dominated by the optical flow computation, and the cost of determining the divergent centers is dominated by computing the intersections, whose complexity is $O_T = O\big([n \times (n-1)]/2\big) = O\big((n^2 - n)/2\big)$, where n is the number of foreground objects used to compute the intersection sets.

V. CONCLUSION AND FUTURE WORK

This paper derives a simple anomaly detector that combines the velocity and the divergent centers; the method is simple but highly accurate. In addition, we propose a divergent center framework that formulates the localization of anomalies. People usually escape instinctively from a place where abnormal or dangerous events occur, so we explored the change between non-escape and escape behaviors and the relationship between escape behavior and the locations of the escape events. The proposed method can estimate the locations of the divergent centers when escape behaviors happen. We demonstrated the performance of the proposed method on multiple datasets; the experiments show that our approach is applicable to a wide range of scenes of medium and high crowd density. We believe the proposed framework can play an important role in the video analysis of crowd scenes. As part of our future work, we plan to further study the algorithm for determining divergent centers in order to improve its robustness. We also plan to remove the velocity detection step and apply an improved algorithm for determining divergent centers, so that the locations of anomalies are detected automatically.

REFERENCES

[1] P. Tu, T. Sebastian, G. Doretto, N. Krahnstoever, J. Rittscher, and T. Yu, "Unified crowd segmentation," in Proc. 10th ECCV, 2008, pp. 691–704.
[2] Y. Cong, J. Yuan, and J. Liu, "Sparse reconstruction cost for abnormal event detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2011, pp. 3449–3456.
[3] H. Zhong, J. Shi, and M. Visontai, "Detecting unusual activity in video," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2, Jun./Jul. 2004, pp. II-819–II-826.
[4] D. Helbing, I. Farkas, and T. Vicsek, "Simulating dynamical features of escape panic," Nature, vol. 407, no. 6803, pp. 487–490, 2000.
[5] J. C. S. Jacques Junior, S. R. Musse, and C. R. Jung, "Crowd analysis using computer vision techniques," IEEE Signal Process. Mag., vol. 27, no. 5, pp. 66–77, Sep. 2010.
[6] B. Zhan, D. N. Monekosso, P. Remagnino, S. A. Velastin, and L.-Q. Xu, "Crowd analysis: A survey," Mach. Vis. Appl., vol. 19, nos. 5–6, pp. 345–357, 2008.
[7] J. Rittscher, P. H. Tu, and N. Krahnstoever, "Simultaneous estimation of segmentation and shape," in Proc. Comput. Vis. Pattern Recognit., vol. 2, Washington, DC, USA, Jun. 2005, pp. 486–493.
[8] S. Ali and M. Shah, "Floor fields for tracking in high density crowd scenes," in Proc. 10th Eur. Conf. Comput. Vis., 2008, pp. II:1–II:14.

[9] D. Kong, D. Gray, and H. Tao, "A viewpoint invariant approach for crowd counting," in Proc. 18th Int. Conf. Pattern Recognit., vol. 3, 2006, pp. 1187–1190.
[10] R. Mehran, A. Oyama, and M. Shah, "Abnormal crowd behavior detection using social force model," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2009, pp. 935–942.
[11] L. Kratz and K. Nishino, "Anomaly detection in extremely crowded scenes using spatio-temporal motion pattern models," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2009, pp. 1446–1453.
[12] S.-F. Lin, J.-Y. Chen, and H.-X. Chao, "Estimation of number of people in crowded scenes using perspective transformation," IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 31, no. 6, pp. 645–654, Nov. 2001.
[13] M. J. Jones and D. Snow, "Pedestrian detection using boosted features over many frames," in Proc. 19th Int. Conf. Pattern Recognit., Dec. 2008, pp. 1–4.
[14] O. P. Popoola and K. Wang, "Video-based abnormal human behavior recognition—A review," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 42, no. 6, pp. 865–878, Nov. 2012.
[15] X. Liu, L. Lin, S. Yan, H. Jin, and W. Tao, "Integrating spatio-temporal context with multiview representation for object recognition in visual surveillance," IEEE Trans. Circuits Syst. Video Technol., vol. 21, no. 4, pp. 393–407, Apr. 2011.
[16] A. B. Chan and N. Vasconcelos, "Modeling, clustering, and segmenting video with mixtures of dynamic textures," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 5, pp. 909–926, May 2008.
[17] D. Ryan, S. Denman, C. Fookes, and S. Sridharan, "Textures of optical flow for real-time anomaly detection in crowds," in Proc. 8th IEEE Int. Conf. Adv. Video Signal-Based Surveill., Aug./Sep. 2011, pp. 230–235.
[18] L. Seidenari, M. Bertini, and A. Del Bimbo, "Dense spatio-temporal features for non-parametric anomaly detection and localization," in Proc. 1st ACM Workshop Anal. Retr. Tracked Events Motion Imag. Streams (ARTEMIS), 2010, pp. 27–32.
[19] D.-Y. Chen and P.-C. Huang, "Visual-based human crowds behavior analysis based on graph modeling and matching," IEEE Sensors J., vol. 13, no. 6, pp. 2129–2138, Jun. 2013.
[20] C. Schuldt, I. Laptev, and B. Caputo, "Recognizing human actions: A local SVM approach," in Proc. 17th ICPR, Aug. 2004, pp. 32–36.
[21] P. Scovanner and M. F. Tappen, "Learning pedestrian dynamics from the real world," in Proc. 12th ICCV, Sep./Oct. 2009, pp. 381–388.
[22] A. B. Chan and N. Vasconcelos, "Counting people with low-level features and Bayesian regression," IEEE Trans. Image Process., vol. 21, no. 4, pp. 2160–2177, Apr. 2012.
[23] X. Wang, X. Ma, and W. E. L. Grimson, "Unsupervised activity perception in crowded and complicated scenes using hierarchical Bayesian models," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 3, pp. 539–555, Mar. 2009.
[24] R. Mehran, B. E. Moore, and M. Shah, "A streakline representation of flow in crowded scenes," in Proc. 11th ECCV, 2010, pp. 439–452.
[25] B. Zhao, L. Fei-Fei, and E. P. Xing, "Online detection of unusual events in videos via dynamic sparse coding," in Proc. IEEE Conf. CVPR, Jun. 2011, pp. 3313–3320.
[26] M. Bertini, A. Del Bimbo, and L. Seidenari, "Multi-scale and real-time non-parametric approach for anomaly detection and localization," Comput. Vis. Image Understand., vol. 116, no. 3, pp. 320–329, 2012.
[27] M. Thida, H.-L. Eng, M. Dorothy, and P. Pemagnino, "Learning video manifold for segmenting crowd events and abnormality detection," in Proc. 10th Asian Conf. Comput. Vis., 2010, pp. 439–449.
[28] Y. Benabbas, N. Ihaddadene, and C. Djeraba, "Motion pattern extraction and event detection for automatic visual surveillance," J. Image Video Process., vol. 2011, pp. 1–15, Jan. 2011, Art. ID 7.
[29] T. Hassner, Y. Itcher, and O. Kliper-Gross, "Violent flows: Real-time detection of violent crowd behavior," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. Workshops, Jun. 2012, pp. 1–6.
[30] S. Wu, H.-S. Wong, and Z. Yu, "A Bayesian model for crowd escape behavior detection," IEEE Trans. Circuits Syst. Video Technol., vol. 24, no. 1, pp. 85–98, Jan. 2014.
[31] C. Zach, T. Pock, and H. Bischof, "A duality based approach for realtime TV-L1 optical flow," in Proc. 29th DAGM Symp. Pattern Recognit., Heidelberg, Germany, 2007, pp. 214–223.
[32] K.-W. Cheng, Y.-T. Chen, and W.-H. Fang, "Abnormal crowd behavior detection and localization using maximum sub-sequence search," in Proc. 4th ACM/IEEE Int. Workshop Anal. Retr. Tracked Events Motion Imag. Stream (ARTEMIS), 2013, pp. 49–58.


[33] W. Li, V. Mahadevan, and N. Vasconcelos, "Anomaly detection and localization in crowded scenes," IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 1, pp. 18–32, Jan. 2014.
[34] J. K. Kearney, W. B. Thompson, and D. L. Boley, "Optical flow estimation: An error analysis of gradient-based methods with local optimization," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-9, no. 2, pp. 229–244, Mar. 1987.

Chun-Yu Chen received the B.S. degree in mechanical and electrical engineering from Shenyang Aerospace University, Shenyang, China, in 1997, and the M.S. and Ph.D. degrees in signal and information processing from the Harbin Institute of Technology, Harbin, China, in 2001 and 2006, respectively. He was a Post-Doctoral Research Fellow in Electronic Science and Engineering at the School of Astronautics, Harbin Institute of Technology, from 2006 to 2010. He is currently an Associate Professor with the College of Information and Communication Engineering, Harbin Engineering University, Harbin, China. His current research interests include computer vision, stereoscopic vision, intelligent signal processing, and embedded systems.


Yu Shao received the B.S. degree from the Inner Mongolia University of Science and Technology, Baotou, China, in 2012, and the master’s degree from the College of Information and Communication Engineering, Harbin Engineering University, Harbin, China, in 2012. Her current research interests include video signal processing, and video surveillance systems.