12th International Conference on Information Fusion Seattle, WA, USA, July 6-9, 2009
An ensemble approach for increased anomaly detection performance in video surveillance data

Christoffer Brax
Informatics Research Centre, University of Skövde, Sweden and Saab AB, Skövde, Sweden
[email protected]

Lars Niklasson
Informatics Research Centre, University of Skövde, Skövde, Sweden
[email protected]

Rikard Laxhammar
Informatics Research Centre, University of Skövde, Sweden and Saab AB, Järfälla, Sweden
[email protected]

Abstract – The increased societal need for surveillance and the decreasing cost of sensors have led to a number of new challenges. The problem is not to collect data but to use it effectively for decision support. Manual interpretation of huge amounts of data in real time is not feasible; the operator of a surveillance system needs support to analyze and understand all incoming data. In this paper an approach to intelligent video surveillance is presented, with emphasis on finding behavioural anomalies. Two different anomaly detection methods are compared and combined. The results show that total detection performance is best increased by combining the two anomaly detectors rather than employing either of them independently.

Keywords: anomaly detection, classifier fusion, CCTV, video content analysis, behaviour classification.

1 Introduction

The increased societal threats from terrorists and organized crime have led to an explosion in new installations of public surveillance systems. The number of sensors in public areas such as airports, train stations and harbours has increased. The amount of available information poses a great challenge for the operators of surveillance systems. We have previously argued [1] that manual analysis of video information from closed-circuit television (CCTV) cameras can give good results under optimal conditions, but that in most cases conditions are far from optimal. Most of today's video surveillance systems are used in a reactive way, i.e. for historical review. In a typical scenario, an operator only inspects the footage in response to an alert, e.g. when an alarm goes off on a door that is monitored by a camera. New technology, such as video content analysis and anomaly detection algorithms, can be used to help operators be more proactive by alerting them to strange or anomalous behaviour among the objects surveyed by the CCTV cameras.

We have previously presented experiments using an intelligent surveillance system [2]. In the reported experiments, a video tracker was used to extract information from two CCTV cameras. The information was used for anomaly detection, and the detection algorithm was able to detect a number of different anomalies. However, the environment used in the experiments was not very crowded: only a few objects were present in the scene at any given time, and their motion patterns were mainly regular. This paper extends the previous work by looking at a much more crowded environment, in combination with advanced tracking and anomaly detection. The environment is an airport's international departure hall. We will compare two different methods for anomaly detection and evaluate them individually as well as collectively, i.e. with their output fused. The hypothesis is that by combining two different anomaly detection methods, we should be able to detect more anomalies than when using just one algorithm.

The paper is organized as follows: in Section 2 we present the basics of visual surveillance and anomaly detection. In Section 3 we describe the methods used in the experiments and provide details about the anomaly detection algorithms. The results are presented in Section 4 and discussed in Section 5. Finally, we conclude the paper in Section 6, which also proposes future work.
2 Background

2.1 Visual surveillance
According to Hu et al. [3], the main tasks of visual surveillance are automatic detection, recognition and tracking of humans and vehicles in CCTV video. They identify a number of applications connected to visual surveillance, such as access control in restricted areas, identification of persons at a distance, crowd-flux and congestion analysis, anomaly detection and alarm raising, and interactive surveillance employing multiple cameras. The experimental system used in this paper targets the last two applications, with the focus on anomaly detection and alarm raising. Dick and Brooks [4] identify a number of research issues connected with automated video surveillance. The first is related to the lower levels of the JDL model [5],
and involves problems such as detection, tracking and classification of objects in CCTV video. The next is related to the higher levels of the JDL model, and involves problems such as human and traffic motion analysis. Dick and Brooks present a number of general considerations for automated surveillance systems, such as robustness, camera handoff, efficiency, and evaluation. Of the issues Dick and Brooks identify, we will in the following focus on the efficiency and evaluation problems.
2.2 Anomaly detection

In recent years, methods based on anomaly detection have gained a lot of attention in the research community. Much of the research originates from the IT security domain [6-8], where anomaly detection is often used for discovering intruders. The information fusion community has also started to show interest [9-12], e.g. using anomaly detection methods in maritime surveillance to find anomalous vessel behaviours. There are many definitions of anomaly detection, not to mention of what constitutes an anomaly. In this paper we make a distinction between two types of anomaly detection: knowledge-based and data-driven. In knowledge-based anomaly detection, the nature of the anomaly is known beforehand by eliciting information from a domain expert. This type of anomaly detection is also referred to as rule-, template-, or case-based detection. A good example is [13], where Guerriero et al. propose a system that searches for anomalies like "small boats on open sea" or "large ships without AIS transponder". These two anomalies are known in advance and can therefore be built into the detection system. The data-driven approach starts from the premises that (1) large amounts of pre-recorded data are available, and (2) one cannot always define beforehand what behaviours should be detected. The pre-recorded data is assumed to reflect the normal behaviour of objects in a certain domain. From this data a normal model is built, which can then be used for classifying new observations as normal or anomalous, depending on how well they fit the model. An example here is [9], where Kraiman et al. use self-organizing maps and Gaussian mixture models for learning the normal behaviour of maritime vessels. The two approaches described above can complement each other: according to Patcha and Park [7], a combination of knowledge-based and data-driven methods can best detect both known anomalies and new, previously unknown anomalies.

3 Methodology

An experimentation platform was built to evaluate whether the classification performance of an anomaly detection system could be increased by running multiple anomaly detection algorithms in parallel and fusing their output. The video data for the experiments was recorded in the international departure hall of a major airport.

3.1 Platform architecture

The experimental platform used for all experiments is based on the one described in [1, 2], with some exceptions. Figure 1 shows the platform's schematic design. The video tracking subsystem consists of three single CCTV cameras, with software for extracting tracking information connected to each camera. These image trackers produce local tracks described in spherical coordinates relative to each camera. The local tracks are sent to a global tracker, which integrates the local tracks to produce global tracks in a Cartesian coordinate system with a fixed origin. The anomaly detection subsystem consists of two anomaly detection modules, which analyze global tracks and add an estimate of the degree of anomaly to each track. The output from the anomaly detection modules is then fused by the Anomaly Detection Fusion Module to get a single estimate of the degree of anomaly for each track. Finally, the anomaly classification is presented to an operator on a Man Machine Interface (MMI), together with the video and tracking information.

Figure 1: System architecture for the experimental system.

All processing is done on-line in real time, with the exception of the video, which is recorded in advance. Instead of using live video from CCTV cameras, pre-recorded video is played back to the image trackers. This provides a controlled environment in which to analyze and compare the performance of the individual and combined anomaly detectors objectively. The functionality of the demonstration platform is exactly the same regardless of whether live or pre-recorded video is used. One of the new features of the video tracking subsystem compared to the one used in [1, 2] is the ability to track objects between two non-overlapping cameras. This is an important feature that will potentially have a large impact on anomaly detection performance when dealing with anomalies that are spread over a wide area that may lack complete camera coverage.

3.2 Data set

The video was recorded between 7 am and 7 pm for three consecutive days. The data from the first and second day was used as training data and implicitly assumed to reflect normal behaviour of people at the airport. During the third day a number of deliberate anomalies were recorded together with the normal activity. These anomalies were manually classified; all other observations were
considered to be normal. Table 1 shows how many normal and anomalous tracks were recorded during each of the three days. A subset of Day Three's tracks was used as test data for the experiments. This subset included 281 anomalous tracks as well as 13995 normal tracks that were present in the same data.

Table 1: Statistics for the data set used in the experiments.

                                           Day 1    Day 2    Day 3
Total number of tracks considered normal   18538    29103    36338 (13995 used in test data set)
Number of anomalous tracks                 0        0        281 (used in test data set)

The refresh rate provided by the trackers is 4 Hz for the global tracker and 15 Hz for the local trackers. The video frame rate is 15 Hz. During the third day a total of twelve different types of anomalies were recorded; see Table 2. For some anomaly types, a number of variants were recorded: in total, 25 different anomalies were recorded and manually labelled. Some of the 25 anomalies were recorded more than once, e.g. in the morning and then again in the afternoon. A total of 40 unique anomalies were recorded, corresponding to 281 anomalous tracks.

Table 2: A list of the types of anomalous situations used in the experiments.

No   Description
1    Someone leaves a bag and walks away (4 variants).
2    A bag is picked up by another individual than its owner (2 variants).
3    A loitering person walks around in the hall for a long time without checking in (2 variants).
4    A person runs through the hall.
5    A person moves in the wrong direction on an escalator or staircase (2 variants).
6    A person runs from the check-in counter towards the security check (2 variants).
7    A person walks around in unusual places (2 variants).
8    A person stands still where people usually do not (2 variants).
9    3-4 persons run in different directions from one starting point (2 variants).
10   3-4 persons form a group at an unusual location (3 variants).
11   Pickpocketing, bag theft: one person walks up, takes something, and hands it on to someone else (2 variants).
12   A person walks from group to group.

Even though all tracks during Day One and Two are assumed to be normal, it is possible that they contain some tracks that could be classified as anomalous. This could potentially pose a problem. However, as long as the number of anomalous tracks is sufficiently lower than the number of normal tracks, the anomalous tracks should not have a significant negative impact on the classification performance of the anomaly detection algorithms. The focus in the experiments was on the performance of the detectors and not on the quality of the training data. High performance would suggest that the impact of any potential anomalies in the training data is low; low performance would suggest either that the detection algorithms perform poorly or that the impact of potential anomalies in the training data is high. Figure 2 shows the layout of the departure hall and the approximate positions and fields of view of the cameras. Two check-in counters and two staircases/escalators to the security check area are also shown.
Some of these anomalies can be detected by using a single camera, e.g. the abandoned bag (1); other anomalies, such as the loitering person (3), require tracking over multiple cameras which may have non-overlapping fields of view.

Figure 2: Camera setup at the demonstration site.

3.3 Algorithms

3.3.1 Anomaly detector 1: State-based anomaly detection
The global tracker outputs tracks with the following features:
• ID.
• Time stamp.
• X, Y, and Z coordinates of the object.
• X, Y, and Z velocity of the object.
• Width and height of the object.

The first detection algorithm used is state-based anomaly detection, which we have used previously for finding anomalies in video surveillance data [1, 2]. The algorithm employs a discrete state-based representation for both contextual information and real-time data from sensors; the representation is similar to the multi-dimensional histograms that have traditionally been used for query optimization in databases [14]. The main difference from multi-dimensional histograms is that this method uses expert knowledge to set the boundaries of the bins instead of generating them from the data.

The state-based representation is built upon a number of state classes. A state class is a discretization of a feature, e.g. course (or heading), and encompasses a number of possible states. For example, the class CourseState might have four different states: north, south, east, and west. In this case the following state classes were used:

• CourseState, with nine different states: the eight compass points (north, north-west, west, etc.) plus undefined.
• PositionState, with a 10x5 grid placed around the area of interest.
• SpeedState, with four states: stopped, slow, medium, and fast.
• RelationState, with five states: inFront, behind, left, right, and undefined.
• EnvironmentState, with six states: onFloor, onEscalator, onSeats, nearMachines, onStaircase, and undefined.
• SizeState, with five states: small, medium, large, xlarge, and undefined.
• TimeState, with four states: morning, lunch, afternoon, and night.

The bounds for each state are set by a domain expert, based on what can be expected in the data set given the domain. For example, the bounds for each state in EnvironmentState are set by comparing the position of the object to a map of the area. The seven state classes above are called atomic state classes; there can also be composite state classes, which are built by combining atomic state classes. In this case we use only one composite state class, built from one instance of each of the seven atomic state classes. An example of a composite state is: CompositeState_A = (CourseState=north, PositionState=row12_col5, SpeedState=stopped, RelationState=inFront, EnvironmentState=onFloor, SizeState=small, TimeState=lunch). CompositeState_A describes an object heading north (±22.5 degrees), with grid coordinates (12,5) and velocity between 0 and 1 km/h; another object lies in front of it; it is situated on the floor; it has a size under 0.3 m2; and it is observed during lunch hours.

After analysing the training data, we concluded that additional state classes were needed beyond those used in [2]. For example, it was very difficult to find a bag that was left unattended, because the representation scheme could not distinguish between a small bag and a large person. Therefore the SizeState state class was added; it divides the training data set into five classes based on height and width information from the video tracker. Another state class, TimeState, was added to handle daily fluctuations in the data: an observed behaviour is noted as occurring in the morning, at lunch, in the afternoon, or at night.

The basic model of normal activity remains the same as in [2]. This model is built by analyzing the relative frequency of different composite states in the training data set. We have shown previously [2] that this kind of model can be used to detect certain anomalies, i.e. those situations where one or several of the atomic states occur rarely. This kind of anomaly is referred to as a state occurrence anomaly. A threshold (called T_OCCURRENCE) is used for deciding whether a particular observation can be said to include a state occurrence anomaly; adjusting this threshold changes the sensitivity of the detection.

There are, however, problems with relying only on state occurrences. Imagine a situation where the observed object suddenly decreases in velocity. This should be detected as a rare state occurrence; but how should it be interpreted? On the one hand, it is possible that the lower velocity is normal for that area. On the other hand, the transition from a composite state with high velocity to a composite state with low velocity might not be. The model of normal activity therefore needs to be extended with information on composite state transitions. We analyze the sequence of observations for each object in the training data and count the number of state transitions for each pair of composite states, e.g. Composite_A to Composite_B. If the number of state transitions for each pair is divided by the total number of state transitions, we get a probability for each transition. This value can be used to find state transition anomalies, defined as those state transitions with a probability lower than the threshold T_TRANSITION. This threshold can be set to zero if we want only new, previously unseen state transitions to be classified as anomalous, or to some value above zero if we want the system to be sensitive to rare but previously seen state transitions.

State transitions between the same state, such as Composite_A to Composite_A, are special cases of state transitions and must be handled separately. An example could be a bag that is left behind: the bag would be tagged with a number of same-state transitions. We want to be able to detect anomalies such as an object staying in the same state too long. We therefore add another set of information to the model that says how long it is normal to stay in the same state. For example, it might be normal to stay in Composite_A for one to five time steps (e.g. a person waiting for an elevator), but if the person stays in Composite_A for six or more time steps, the model should classify the behaviour as anomalous. We call this a same-state anomaly. An observation is classified as a same-state anomaly if it stays in the same state longer than the maximum number of time steps defined by the model, as learned from the training data.

What is needed to classify a situation as anomalous? It is not enough that one or more of the three classifiers above classify a single observation as anomalous: a single anomalous observation can be caused by noise in the system. To get a more robust classification, it is necessary to look at multiple observations instead of just one. If the observations of, e.g., a person are classified as anomalous for more than n time steps, the situation should be classified as anomalous. The analysis of multiple observations can be done in two ways: over the entire track or over a shorter interval. The advantage of the latter is that it is possible to detect objects that go from normal to anomalous and then back to normal again. The interval is implemented as a sliding window of n time steps, where the window size is defined by T_WINDOWSIZE.

The last thing to consider is how to combine the three classifiers. Should they be added or multiplied; should they have the same or different weights? After a number of trials analysing different fusion strategies, we concluded that the occurrence classifier and the transition classifier should be weighted equally. The same-state classifier, on the other hand, was seldom triggered, but when it was, it was often due to a genuine anomaly. Therefore the same-state classifier is weighted to increase the total estimate of anomaly by T_WINDOWSIZE over two, i.e. the same as if an occurrence anomaly was detected on all of the observations and a transition anomaly on half of the observations within the sliding window. See [15] for a detailed review of classifier fusion methods.

For each time step, the last T_WINDOWSIZE observations are classified according to each of the three classifiers. The value for each observation can be zero or one depending on whether an anomaly occurred, so the total anomaly value for each classifier lies in the interval [0, T_WINDOWSIZE]. The fused degree of anomaly is denoted DOA_TOTAL; it is used together with the threshold T_ANOMALY for deciding whether a situation should be classified as normal or anomalous. The total degree of anomaly is also used by the Anomaly Detection Fusion Module.

The thresholds T_WINDOWSIZE, T_ANOMALY, T_TRANSITION, and T_OCCURRENCE are set by empirical investigation of the number of false positives in the training data set.
3.3.2 Anomaly detector 2: Anomaly detection based on momentary track features

The second detection algorithm is based on momentary track features. It has previously been used for detection of anomalous behaviour of maritime vessels [10] and anomalous events in video surveillance data [1]. The algorithm builds a statistical model of momentary feature values in normal activity, then performs anomaly detection based on the likelihood that newly observed feature values could have been generated by that model. The feature model used in this study extends the one used in [1]: besides momentary kinematics (position and velocity vector), the accumulated age of each active track and the accumulated stationary time for stationary tracks were also used. The probability density function (PDF) for normal values of these features is approximated by multivariate Gaussian mixture models (GMMs). To reduce the complexity of the GMMs, the surveillance area is divided into a uniformly sized grid, where each cell models the local distribution of the features. For each cell we have a four-dimensional GMM modelling the local position-velocity vector (capturing correlations between position, speed, and heading of targets), a one-dimensional GMM modelling the local track age, and a three-dimensional GMM correlating the accumulated time that objects have remained stationary with their position in the cell. During anomaly detection, the likelihood from each model is compared to a model-specific threshold; if one or more likelihoods fall below their thresholds, the observation is classified as anomalous. In this approach, individual observations are classified as normal or anomalous. One may argue that it is more relevant to consider anomalous tracks than particular anomalous observations within each track. With that in mind, we classify a track as anomalous whenever we detect one or more anomalous observations associated with it.

3.3.3 Classifier fusion

Several studies have shown that using a combination of classifiers can offer a significant performance increase for many classification problems [15, 16]. We wanted to investigate whether the fused output from the two anomaly detectors described above is better than the output from either of them taken individually. The Anomaly Detection Fusion Module (ADFM) takes its inputs from Anomaly Detector 1 (AD1) and Anomaly Detector 2 (AD2). The output from the anomaly detectors contains the same information as the output from the global tracker: no information has been lost. However, the anomaly detectors add information about what anomalies they have detected: anomaly exists (a boolean value), degree of anomaly (a percentage), and type of anomaly (a classification string). In the ADFM these three pieces of information are fused and the result sent, along with the original tracking information, to the MMI. In the current set-up, the method of fusing the output from the two anomaly detectors is rather simplistic. If either of the detectors reports an anomaly, the ADFM reports an anomaly. If only one of the detectors reports an anomaly, the information about the anomaly is passed through unchanged. If both detectors report an anomaly, the resultant degree of anomaly is the highest of the individual degrees of anomaly, and the resultant type of anomaly is simply the concatenation of the individual type strings, joined by a plus sign.
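The ADFM fusion rule is simple enough to state directly in code. The sketch below is a minimal illustration, not the system's actual interface: the AnomalyReport container and the example type strings are our own names, while the rule itself (boolean OR, maximum degree, plus-joined type strings) follows the description above.

```python
from dataclasses import dataclass

@dataclass
class AnomalyReport:
    exists: bool    # anomaly exists (boolean value)
    degree: float   # degree of anomaly (a percentage)
    type: str       # type of anomaly (a classification string)

def fuse(r1: AnomalyReport, r2: AnomalyReport) -> AnomalyReport:
    """ADFM rule: report an anomaly if either detector does; if both
    fire, keep the highest degree and join the type strings with '+'."""
    if r1.exists and r2.exists:
        return AnomalyReport(True, max(r1.degree, r2.degree),
                             r1.type + "+" + r2.type)
    if r1.exists:
        return r1   # pass AD1's report through unchanged
    if r2.exists:
        return r2   # pass AD2's report through unchanged
    return AnomalyReport(False, 0.0, "")

# illustrative type strings, not the detectors' real labels
fused = fuse(AnomalyReport(True, 40.0, "StateOccurrence"),
             AnomalyReport(True, 65.0, "LowLikelihood"))
# fused.degree == 65.0, fused.type == "StateOccurrence+LowLikelihood"
```

Because the rule is a simple OR over the detectors, the fused performance in Section 4 can be computed offline from the individual detector outputs.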
3.4 Evaluation

According to Dick and Brooks [4], evaluation of surveillance systems is an important issue that is inherently difficult, since most surveillance systems are very domain specific. Dick and Brooks argue that it is often best to make an extensive evaluation of a system in the environment in which it is intended to be used. However, there are many situations where this is not feasible. In these cases quantitative performance measures like Receiver Operating Characteristic (ROC) curves can be used. As stated previously, the aim here was to investigate whether a fusion of the output of two anomaly detectors performs better than either detector on its own. Performance is measured by the number of true positives (TPs) and false positives (FPs) among the classified tracks. A true positive is a track correctly classified as containing an anomaly; a false positive is a track incorrectly classified as containing an anomaly. To repeat, video data from the first two days was used for building the model of normal activity, i.e. all of those tracks were labeled "normal". The data from the third day was used as test data. This data set included both normal and anomalous tracks: the anomalous tracks were manually labeled "anomalous" and the rest "normal". The tracks labeled "normal" were not further analyzed, so there is a possibility that some of them could be judged to contain anomalies. The thresholds used by both anomaly detectors can be adjusted to fine-tune the performance. Three different threshold sets were used for each anomaly detector; the values were adjusted to achieve approximately 1%, 3% and 7% alarm levels (combined TPs and FPs) in the test data set.

4 Results

AD1 and AD2 were run separately; then we calculated the theoretical performance of both methods running together with the ADFM. This poses no difficulties, thanks to the simple fusion scheme implemented in the ADFM. The results are derived from how many of the 40 unique anomalies the detectors were able to find. An anomaly was considered successfully detected if at least one of the tracks that the anomaly was implicated in was detected. This is admittedly a liberal metric; we justified it on the grounds that it should be enough if at least some part of the anomalous situation raised the attention of the system. The threshold settings for AD1 were decided by iteratively running Monte Carlo simulations with different values for T_WINDOWSIZE, T_ANOMALY, T_TRANSITION, and T_OCCURRENCE. After over a hundred simulations, suitable threshold settings were found resulting in 1.1%, 3.3%, and 7.2% alarm levels in the test data set. These values were as close as we were able to obtain to the target levels of 1%, 3%, and 7%. We also decided to include the results from settings resulting in a 10.2% alarm level. The summary of these results is shown in Table 3.

Table 3: Results from experiments with AD1 using four threshold sets corresponding to 1.1%, 3.3%, 7.2% and 10.2% alarm levels in the test data set.

AD1: Threshold   1.1%              3.3%               7.2%             10.2%
TPs              12.5% (5 of 40)   32.5% (13 of 40)   45% (18 of 40)   62.5% (25 of 40)
FPs              1.1%              3.2%               7.0%             9.9%

The parameters for AD2 were set with the goal of obtaining 0.3%, 1%, and 3% alarm levels from each of the three feature models. After combining the three models, the resultant alarm levels were 0.9%, 3.4%, and 7.2% respectively. The results from AD2 are shown in Table 4.

Table 4: Results from experiments with AD2 using three threshold sets corresponding to 0.9%, 3.4% and 7.2% alarm levels in the test data set.

AD2: Threshold   0.9%             3.4%            7.2%
TPs              2.5% (1 of 40)   15% (6 of 40)   27.5% (11 of 40)
FPs              0.9%             3.3%            7.1%

The result of fusing the output from AD1 and AD2 is shown in Table 5. We used four different threshold sets for each anomaly detector. The alarm levels for the first three sets were approximately the same for both detectors; we also included one set for each detector with the threshold values that detected the most anomalies. The table should be read as follows. The first row shows the alarm levels for AD1 and AD2. The second row shows the number of anomalies (TPs) found by combining the output from the two anomaly detectors; e.g., at the 1.1%/0.9% alarm level, six of the forty anomalies were found, or 15%. The third row shows the FP interval, i.e. the range of possible fused FP rates depending on the FP overlap between the two detectors. The lower bound corresponds to complete overlap and is set by the detector with the higher FP rate, e.g. AD1 with 1.1%; the upper bound is determined by adding the FP rates together, e.g. 1.1% + 0.9% = 2%, and assumes there is no overlap between the two detectors, i.e. no FPs found by both. The fourth row shows the TP overlap; e.g., at the 1.1%/0.9% alarm level, five anomalies were detected only by AD1, one was detected only by AD2, and no anomalies were detected by both AD1 and AD2.
Table 5: Results from fusing the output from AD1 and AD2.

AD1 + AD2    1.1% + 0.9%     3.3% + 3.4%        7.2% + 7.2%      10.2% + 7.2%
TPs          15% (6 of 40)   42.5% (17 of 40)   55% (22 of 40)   70% (28 of 40)
FPs          1.1-2%          3.3-6.5%           7.1-14.1%        9.9-17%
TP overlap   AD1: 5          AD1: 11            AD1: 11          AD1: 17
             AD2: 1          AD2: 4             AD2: 4           AD2: 3
             Both: 0         Both: 2            Both: 7          Both: 8

The results in Table 5 do indeed confirm that we can find more anomalies by combining the two anomaly detectors. However, whether this proves the value of doing so is not at all clear: it all depends on the FP overlap. If the third threshold set results in 7.1% FPs (the best case), it is better to use two detectors; but if it results in 14.1% FPs, it is better to use only AD1 with the 10.2% threshold, which finds three more anomalies than the fused scenario we just considered, with 3.9 percentage points fewer FPs. We analyzed the FP overlap for the 3.3% + 3.4% threshold set (the middle column in Table 5) and found an overlap of 1.1%, i.e. only 1.1% of all FPs were found by both AD1 and AD2. This means that the fused number of FPs is closer to 6.5% than to 3.3%. The four threshold sets had approximately the same TP overlap; therefore, the FP overlap should be approximately the same as well. This allows us to conclude that the fused number of FPs is close to the upper bound in each case. This implies that, e.g., for the 7.2% + 7.2% set it is better to use AD1 on its own with the 10.2% threshold, whereas for the 1.1% + 0.9% and 3.3% + 3.4% sets it is better to use the multiple-detector approach.

5 Discussion

It proved hard to detect anomalies at certain positions while, e.g., people are using the automatic check-in machines. It was also hard to find anomalies in the periphery of the cameras' fields of view (e.g., anomalies 7 and 8). Another problem for anomaly detection is that the scene typically included a lot of people, each with many degrees of freedom. Their movement patterns were effectively arbitrary, even if some people did keep to certain paths. This, of course, significantly affects the model of normal activity: the scenarios intended to be understood as anomalous might simply disappear into the background noise. In this study two completely different anomaly detection methods were used, each with its own advantages and limitations. We believe that there is clearly no single approach offering optimal performance for all possible anomaly detection problems; hence a fusion of different approaches is one way to get a system to handle a wider range of detection problems. The approach taken to classifier fusion in this study was to combine different kinds of classifiers. Alternatively, one might use a number of identical classifiers built on different data sets or with different thresholds. This ensemble approach is commonly used in data mining, and there is a wide range of effective methods for fusing the results [15]. The results of our study confirm that combining the output from two different anomaly detectors can increase total detection performance. The downside, of course, is that the number of FPs also increases.
5
6
Conclusion and future work
In this study, previous work by the same authors on detecting behavioural anomalies in video surveillance data was extended to a significantly more complex domain, more complex both in terms of the number of objects present and in terms of the variance in what qualified as normal. To address the increased complexity, we attempted to extend and combine both anomaly detection methods from [1]. The state-based approach was extended with more state classes and with temporal state transitions. This did indeed increase the detection performance; but aspects of the method require further investigation: e.g., how to combine the internal classifiers, how to find appropriate thresholds, how to find the optimal ratio between the total number of states and the amount of training data, and how to generate the states in a state class automatically. The momentary track feature method was extended with a new GMM feature model capturing position and stationary time. This enabled it to detect objects that remain stationary for an anomalous amount of time. However, again aspects of the method require further investigation: the optimal threshold values to use when fusing the three internal classifiers, the optimal size of the grids (used to decrease the complexity of the feature
Discussion
The results are not as impressive as in the earlier study [1]. This is not surprising given the dramatic increase in scene complexity between the two studies. The increased complexity places significant demands on the video tracking subsystem, resulting in many tracks that were broken up into several sub-tracks. This makes finding anomalies based on stationary time (i.e., time in scene or time in the same state) very difficult for either detector. The abandoned bag (Anomaly 1 in Table 2) was intended to be detected as an object standing still for an abnormally long amount of time. This only happened in two of eight instances, for the simple reason that the bag changed identity a number of times. Even though the bag was stationary in the video images, the position and velocity reported by the tracker fluctuated. In an attempt to compensate for this in AD1, the CourseState was always set to undefined when the SpeedState was set to Stopped. The detection rate of abandoned bags is not helped by the tendency of people to put their bags in similar 700
6.
models), and the best way to make use of more than one observation when classifying a track. The study confirms that combining multiple anomaly detection methods can indeed increase the number of successfully detected anomalies. However, the performance is still far removed from that of the previous study [1], where the number of FPs was lower than 1%, and all anomalies were detected. In this study, the best arrangement resulted in 17% FPs while correctly detecting 70% of the anomalies. This level of FPs makes real-world application difficult. The quality of the data set is much lower than in the previous project. The present tracking subsystem is not able consistently to maintain the tracking during the lifetime of objects. Both the video tracker and the track correlator needs to be improved, and the anomaly detection methods need to be made more resistant to broken tracks. Because of the unavoidably low quality of the data set, the anomaly detection methods should first be evaluated on a fully labelled mock data set. This should reveal which properties in the track behaviour most impact the detection performance most for each of the anomaly detection methods. Another area for improvement is the method of data fusion. This study took a naïve approach. Other, more sophisticated methods should be evaluated to see if it is possible to decrease the percentage of FPs.
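The FP-overlap reasoning used when evaluating the fused detectors amounts to simple set arithmetic over alarm identities: OR-fusing two detectors yields a fused alarm rate bounded below by the larger individual rate (full overlap) and above by their sum (no overlap). A minimal sketch, with made-up track sets and rates rather than the study's data:

```python
def fused_rates(alarms_ad1, alarms_ad2, n_tracks):
    """OR-fuse two detectors' alarm sets and report individual rate,
    fused rate, and the fraction of fused alarms raised by both."""
    fused = alarms_ad1 | alarms_ad2
    both = alarms_ad1 & alarms_ad2
    return {
        "ad1": len(alarms_ad1) / n_tracks,
        "ad2": len(alarms_ad2) / n_tracks,
        "fused": len(fused) / n_tracks,
        "overlap_of_fused": len(both) / len(fused) if fused else 0.0,
    }

# Illustrative numbers: 1000 tracks, AD1 flags 3.3%, AD2 flags 3.4%,
# with a single track flagged by both (small overlap).
n = 1000
ad1 = set(range(0, 33))
ad2 = set(range(32, 66))
r = fused_rates(ad1, ad2, n)
print(r)
```

With small overlap the fused rate lands near the upper bound (here 6.6%, close to 3.3% + 3.4%), mirroring the conclusion drawn from the 1.1% measured FP overlap.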
Acknowledgements This work was supported by the Information Fusion Research Program (www.infofusion.se) at the University of Skövde, Sweden, in partnership with the Swedish Knowledge Foundation, under grant 2003/0104, and with participating private companies. This work was also supported by the LFV Group, Sweden.
References

1. Brax, C., R. Laxhammar, and L. Niklasson. Approaches for detecting behavioural anomalies in public areas using video surveillance data. In SPIE Europe Security + Defence 2008. Cardiff, Wales, United Kingdom, 2008.
2. Brax, C., L. Niklasson, and M. Smedberg. Finding behavioural anomalies in public areas using video surveillance data. In The 11th International Conference on Information Fusion. Cologne, Germany, 2008.
3. Hu, W., et al. A survey on visual surveillance of object motion and behaviors. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2004. 34(3): p. 334-352.
4. Dick, A.R. and M.J. Brooks. Issues in automated visual surveillance. CSIRO Publishing, 2003.
5. Llinas, J., et al. Revisions and extensions to the JDL data fusion model II. In Proceedings of the Seventh International Conference on Information Fusion. Stockholm, Sweden: International Society of Information Fusion, 2004.
6. Lazarevic, A., et al. A comparative study of anomaly detection schemes in network intrusion detection. In Proceedings of the Third SIAM International Conference on Data Mining. San Francisco, CA, USA, 2003.
7. Patcha, A. and J.-M. Park. An overview of anomaly detection techniques: Existing solutions and latest technological trends. Computer Networks, 2007. 51(12): p. 3448-3470.
8. Portnoy, L., E. Eskin, and S.J. Stolfo. Intrusion detection with unlabeled data using clustering. In ACM Workshop on Data Mining Applied to Security (DMSA-2001). USA, 2001.
9. Kraiman, J.B., S.L. Arouh, and M.L. Webb. Automated anomaly detection processor. In Enabling Technologies for Simulation Science VI. Orlando, FL, USA: SPIE, 2002.
10. Laxhammar, R. Anomaly detection for sea surveillance. In The 11th International Conference on Information Fusion. Cologne, Germany, 2008.
11. Piciarelli, C. and G.L. Foresti. On-line trajectory clustering for anomalous events detection. Pattern Recognition Letters, 2006. 27(15): p. 1835-1842.
12. Rhodes, B.J., et al. Maritime situation monitoring and awareness using learning mechanisms. In IEEE Military Communications Conference. Atlantic City, New Jersey, 2005.
13. Guerriero, M., et al. Radar/AIS data fusion and SAR tasking for maritime surveillance. In The 11th International Conference on Information Fusion. Cologne, Germany, 2008.
14. Nitin, T., et al. Dynamic multidimensional histograms. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data. Madison, Wisconsin: ACM, 2002.
15. Ruta, D. and B. Gabrys. An overview of classifier fusion methods. Computing and Information Systems, 2000. 7(1). University of Paisley.