Pattern Analysis and Applications manuscript No. (will be inserted by the editor)
Tracking-based Event Detection for CCTV Systems

Luis M. Fuentes (1), Sergio A. Velastin (2)

(1) Departamento de Fisica Aplicada, Escuela Universitaria de Informatica de Segovia, Universidad de Valladolid, Plaza Santa Eulalia 9-11, 40005 Segovia, Spain. Tel: +34921112450, Fax: +34921441210
[email protected]
(2) Digital Imaging Research Centre, Kingston University, Penrhyn Road, Kingston-upon-Thames, Surrey KT1 2EE, UK
Received: date / Revised version: date
Abstract  This paper presents a description of an event detection algorithm based on trajectories, designed for CCTV surveillance systems. Following the foreground segmentation, basic blob and scene characteristics -blob position or speed and people density- are used to create low-level descriptions of predefined events. By comparing sequence parameters with the semantic description of the events associated with the current scenario, the system is able to detect them and raise an alert signal to the operator, the final decision-maker. In the approach presented here, the specific demands on CCTV surveillance systems applied to public transport environments are analysed together with the appropriate image processing techniques in order to build an intelligent surveillance system able to detect “potentially dangerous situations”.

Keywords: Tracking, Human interaction, Event detection, Surveillance, CCTV Systems

1 Introduction

The use of CCTV systems in surveillance has grown exponentially in the last decade. According to Norris [1] this “rise of CCTV surveillance systems has to be seen in the context of underlying social, political and
economic pressures which have led to the intensification of surveillance across a whole range of spheres”. The main sphere of application is related to crime reduction, based upon assumptions including deterrence, the presence of a capable guardian, detection and identification. The effectiveness of CCTV systems in crime prevention has been questioned [2], arguing that their greatest success comes from the identification of criminals, as evidence in trials. Nevertheless, their expansion continues and coverage extends towards ubiquity. Moreover, wireless technology is being introduced in order to cover crime hot spots more easily and quickly.

In some application spheres, however, CCTV surveillance systems are used basically as monitoring tools. The Metro and Railways general inquiry results [3] confirm that the companies make large use of CCTV systems -comprising the CCTV camera network, central and local control rooms and human resources- for surveillance in both public and restricted areas of their stations. The same inquiry shows that the main objectives of CCTV systems are passenger safety and knowledge of passenger flows (for counting or as assistance in the planning of facilities). Almost all spaces inside a station are under CCTV surveillance, from the well-covered platforms through the halls, corridors, escalators and stairs to the less covered control gates and ticket purchase areas. In these contexts, the development of an automated surveillance system is highly desirable as a means of improving the performance of the existing CCTV systems, providing a technological aid to overcome the drawbacks associated with the huge amount of visual information generated by these surveillance systems. The concept of “advanced visual surveillance” involves not only the interpretation of a sequence of images but also the detection of predefined events capable of triggering an alarm. The use of constraints obtained from knowledge of both the task and the environment allows greater computational efficiency [4].

However, intelligent surveillance systems have to deal with an especially difficult task when people and their behaviour are to be analysed. There is an extensive literature dealing with the different approaches used to detect and track persons, to track hands and detect gestures, to identify faces, etc. These approaches are driven by many different applications involving moving persons or moving parts of the human body, such as hand sign language recognition, hand- and head-driven interfaces, face recognition or smart surveillance systems [5,6]. The approach presented here shows the possibility of using existing CCTV systems in public transport environments not only as monitoring tools but also as active surveillance tools capable of detecting “potentially dangerous situations”, as defined by the final user. After introducing a basic processing algorithm, based on the novel idea of using luminance contrast in foreground detection, a simple tracking algorithm, based on overlapping bounding boxes, is used to produce an event detector based on the analysis of blobs' trajectories and
positions. An event detection scheme is then proposed and some results analysed.

2 Related Work

The proposed event detection method is based on the analysis of people's trajectories obtained from a simple tracking method. Tracking algorithms establish a correspondence between the image structure of two consecutive frames. Typically the tracking process involves the matching of image features for non-rigid objects, such as people, or correspondence models, widely used with rigid objects like cars. Many approaches have been proposed for tracking a human body, as can be seen in some reviews [5,6]. Some are applied in relatively controlled [7-9] or in variable outdoor environments [10,11]. The matching process uses the information of overlapping boxes [11], colour histogram back-projection [12] or different blob features such as colour or the distance between blobs. In some approaches all these features are used to create so-called matching matrices [13]. In many cases, Kalman filters are used to predict the position of a blob and match it with the closest blob [9]. The use of the blob trajectory [9] or blob colour [11] helps to solve occlusion problems. In many cases, different approaches are used together to improve performance [14]. The proposed system works indoors with blobs -defined as bounding boxes representing the foreground objects- and tracking is performed by matching boxes from two consecutive frames [15] using overlapping bounding boxes and linear prediction as matching criteria, obtaining people's trajectories without handling occlusion.

The use of trajectories to extract human action information has also been suggested before to determine whether the people being tracked were walking, running, biking or roller blading [16]. Event detection systems have been applied from traffic analysis [17] and human interaction at sports grounds to intelligent surveillance systems [18]. There are also examples of full multi-camera surveillance systems applied to basic event detection [19]. In the last two years, however, the number of research programmes in the area [26-28] and of published works has greatly increased: trajectory-based detection of activities [30], the introduction of an event ontology for the description of spatial and temporal events [31], or an event detector based on the split and merge behaviour of blobs [32] are just a small sample of the results obtained.

The following paragraphs introduce a simple event detection system adapted for indoor transport environments [20]. Once again, the described system, as part of a very complex surveillance system [29], is focused on simplicity and processing speed. Therefore, the event detector does not present a better performance than some of the examples cited above, but it can work in real time with multiple image signals and, therefore, its application to existing surveillance systems is straightforward. The authors use a novel foreground detection technique and an ontological approach to event detection similar to [31], albeit with a different mathematical formulation, together with a blob behaviour analysis similar to [32].
The main difference from these works is its nature: an application-driven approach born from the nature and the amount of the video signal to be used and the events to be detected.

3 Basic processing

In certain environments, the number of CCTV surveillance cameras is so high that the idea of a dedicated PC for each one is simply unacceptable. Therefore, in order to be successfully applied to security monitoring tasks, an Advanced Surveillance System has to be able to process more than one image channel in real time, typically up to four image channels. A simplification of the image processing stage is then necessary in order to achieve real-time processing speed for multiple image channels. In the process leading from an acquired image to sequence-related information, foreground segmentation is particularly important in computational terms. A foreground detection method based on luminance contrast, along with a straightforward tracking algorithm, is used to achieve real-time image processing.

3.1 Segmentation

Foreground pixel detection is achieved using luminance contrast [21]. This method reduces computational time by using black and white segmentation in colour images. It also simplifies the background model: just a background image is needed. The central points of the method are described below.

3.1.1 Definition of luminance contrast

Luminance contrast is an important magnitude in psychophysics and the central point in the definition of the visibility of a particular object. Typically, luminance contrast is defined as the relative difference between the object luminance, $L_O$, and the surrounding background luminance, $L_B$:

$C = \frac{L_O - L_B}{L_B}$

To apply this concept in foreground detection we propose an alternative contrast definition comparing the luminance coordinate 'y' of the YUV colour system [22] of a pixel $P(i,j)$ in both the current and the background images:

$C(i,j) = \frac{y(i,j) - y_B(i,j)}{y_B(i,j)}$

Luminance values are in the range [0,255] for images digitised in YUV format or [16,255] for images transformed from RGB coordinates. Null (zero) values of the background 'y' coordinate are changed to one, because the infinite contrast value they produce has no physical meaning -there is normally no real chance of getting that value as a consequence of thermal noise in CCD detectors-. Contrast values will therefore oscillate in the range [-1,254]. Values around zero are expected for background pixels, negative values for foreground pixels darker than their corresponding background pixels and positive values for brighter pixels, figure 1. The highest contrast values are obtained under the unusual circumstances of very bright objects against very dark backgrounds, and values bigger than 10 are not likely to be obtained.
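As an illustration (not part of the original system), this contrast image can be computed directly from the luminance planes of the current and background frames. The following Python sketch assumes both are available as 8-bit Y channels; the function and variable names are ours.

```python
import numpy as np

def contrast_image(y_current: np.ndarray, y_background: np.ndarray) -> np.ndarray:
    """Per-pixel luminance contrast C(i,j) = (y - y_B) / y_B.

    Both inputs are 8-bit luminance (Y) planes of the same size.
    Zero-valued background pixels are raised to one, as described in
    the text, to avoid meaningless infinite contrast values.
    """
    y = y_current.astype(np.float32)
    y_b = np.maximum(y_background.astype(np.float32), 1.0)
    return (y - y_b) / y_b
```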
Fig. 1 Two examples of luminance contrast distribution. The lower plot has been sub-sampled.

3.1.2 Foreground detection and blob selection

According to the non-symmetrical distribution of contrast around zero, the foreground detection algorithm should use two different thresholds, $C_P$ for positive and $C_N$ for negative values of contrast, depending on the nature of both the background and the objects to be segmented. For the sake of simplicity, a single contrast threshold $C$ will be assumed, that is $C_P = -C_N = C$. Therefore, a pixel $P(i,j)$ is set to foreground when the absolute value of its contrast is bigger than the chosen threshold; otherwise it is set to background. A median filter is applied afterwards to reduce noise, and the remaining foreground pixels are grouped into an initial blob. This blob is divided horizontally and vertically using X-Y projected histograms, box size and height-to-width ratio. The resulting blobs are classified according to their size and aspect, and characterised with the following features: bounding box, width, height and the centroid of the foreground pixels in the box.
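A minimal sketch of this segmentation step, under the single-threshold assumption above, is given below. It uses SciPy's median filter and connected-component labelling in place of the authors' own implementation, and it omits the X-Y histogram splitting and the size/aspect classification; the threshold value is illustrative only.

```python
import numpy as np
from scipy import ndimage

def segment_foreground(contrast: np.ndarray, threshold: float = 0.3):
    """Threshold the contrast image, denoise it and extract blob features."""
    # A pixel is foreground when |C(i,j)| exceeds the chosen threshold.
    mask = np.abs(contrast) > threshold
    # 3x3 median filter to remove isolated noisy pixels.
    mask = ndimage.median_filter(mask.astype(np.uint8), size=3) > 0
    # Group the remaining foreground pixels into connected blobs.
    labels, n_blobs = ndimage.label(mask)
    blobs = []
    for lab, box in enumerate(ndimage.find_objects(labels), start=1):
        ys, xs = np.nonzero(labels[box] == lab)
        centroid = (float(ys.mean()) + box[0].start, float(xs.mean()) + box[1].start)
        blobs.append({
            "box": box,                           # bounding box (row/column slices)
            "height": box[0].stop - box[0].start,
            "width": box[1].stop - box[1].start,
            "centroid": centroid,                 # centroid of foreground pixels
        })
    return blobs
```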
3.2 Tracking

The algorithm used here [15] is based on a two-way matching matrices algorithm (matching blobs from the current frame with those of the previous one and vice versa), with the overlapping of bounding boxes as the matching criterion. Although very simple, this criterion has been found to be effective in other approaches [11] and does not require the prediction of the blob's position, because the visual motions of blobs are always small relative to their spatial extents. Due to its final application, the algorithm works with the relative positioning of blobs and
their interaction, forming or dissolving groups, and does not try to track an individual blob while it forms part of a group. Trajectories should show whether the people merging into a group are near enough to interact or not. The evolution of the group blob should also give information about this. However, the proposed algorithm may be easily enhanced: colour information may be used in the matching process, and the predicted position can be used to track individual blobs while they form a group. Indeed, once the blobs have been detected, the colour information of the pixels inside the bounding box can be used to improve the segmentation or to add colour information to the blob's characteristics. Such colour information can be used in the matching process or in the location of a particular person within a group.

3.2.1 Matching blobs

Let us take two consecutive frames, $F(t-1)$ and $F(t)$. Foreground detection and blob identification algorithms result in $N$ blobs in the first frame and $M$ in the second. To find the correspondence between both sets of blobs, two matching matrices are evaluated: the matrix matching the new blobs, $B_i(t)$, with the old blobs, $B_j(t-1)$, called $M_{t}^{t-1}$, and the matrix matching the old blobs with the new ones, $M_{t-1}^{t}$. To clarify the matching, the concept of "matching string" is introduced. Its meaning is clear: the numbers in column $k$ show the blobs that match blob $k$, see figure 2. It is possible for one blob to get a positive match with two blobs and, sometimes, with three; in this case, the corresponding string element has to store two or three values.

$M_{t-1}^{t}(i,j) = Matching\{B_i(t-1), B_j(t)\}$, $\quad M_{t}^{t-1}(i,j) = Matching\{B_i(t), B_j(t-1)\}$   (1)

$S_{t-1}^{t}(i) = \bigcup_j \{\, j \,/\, M_{t-1}^{t}(i,j) = 1 \,\}$   (2)

Fig. 2 An example of detected blobs in two consecutive frames, the matching matrices and strings.
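For illustration, the two matching strings of equations (1)-(2) can be built directly from bounding-box overlaps. The sketch below is ours, not the authors' code; blobs are assumed to be given as (x0, y0, x1, y1) boxes.

```python
def boxes_overlap(a, b) -> bool:
    """True when two axis-aligned boxes (x0, y0, x1, y1) intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def matching_strings(old_boxes, new_boxes):
    """Two-way matching by bounding-box overlap, following equations (1)-(2).

    Returns (s_old, s_new): s_old[i] lists the new blobs matching old blob i,
    and s_new[j] lists the old blobs matching new blob j.
    """
    s_old = [[j for j, nb in enumerate(new_boxes) if boxes_overlap(ob, nb)]
             for ob in old_boxes]
    s_new = [[i for i, ob in enumerate(old_boxes) if boxes_overlap(ob, nb)]
             for nb in new_boxes]
    return s_old, s_new
```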
3.2.2 Tracking blobs

The algorithm solves the evolution of the blobs from frame $F(t-1)$ to frame $F(t)$ by analysing the values of the matching strings of both frames.
Fig. 3 Solving a blob's temporal evolution. The relations between blob evolution and the matching strings are, for merging, splitting, a new blob, a blob leaving the scene and simple tracking respectively:

$B_i(t-1) \cup B_j(t-1) \equiv B_k(t) \;\Longrightarrow\; S_{t-1}^{t}(i) = S_{t-1}^{t}(j) = k, \quad S_{t}^{t-1}(k) = i \cup j$   (3)

$B_i(t-1) \equiv B_j(t) \cup B_k(t) \;\Longrightarrow\; S_{t-1}^{t}(i) = j \cup k, \quad S_{t}^{t-1}(j) = S_{t}^{t-1}(k) = i$   (4)

$B_i(t) \equiv \mathrm{new} \;\Longrightarrow\; S_{t-1}^{t}(j) \neq i \;\; \forall j, \quad S_{t}^{t-1}(i) = \emptyset$   (5)

$B_i(t-1) \equiv \mathrm{leaves} \;\Longrightarrow\; S_{t-1}^{t}(i) = \emptyset, \quad S_{t}^{t-1}(j) \neq i \;\; \forall j$   (6)

$B_i(t-1) \equiv B_j(t) \;\Longrightarrow\; S_{t-1}^{t}(i) = j, \quad S_{t}^{t-1}(j) = i$   (7)
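Read as code, these relations amount to counting the entries of the two matching strings. A minimal Python sketch (continuing the illustrative representation used above, not the authors' implementation):

```python
def classify_blob_events(s_old, s_new):
    """Interpret the matching strings following relations (3)-(7).

    `s_old` and `s_new` are the outputs of `matching_strings`; the function
    returns a list of (event, indices) tuples referring to old/new blobs.
    """
    events = []
    for i, matches in enumerate(s_old):
        if len(matches) == 0:
            events.append(("leaves", (i,)))                      # relation (6)
        elif len(matches) >= 2:
            events.append(("splits_into", (i, tuple(matches))))  # relation (4)
    for k, matches in enumerate(s_new):
        if len(matches) == 0:
            events.append(("new", (k,)))                         # relation (5)
        elif len(matches) >= 2:
            events.append(("merge_into", (tuple(matches), k)))   # relation (3)
        elif s_old[matches[0]] == [k]:
            events.append(("tracked", (matches[0], k)))          # relation (7)
    return events
```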
Simple events such as people entering or leaving the scenario, people merging into a group or a group splitting into two people are thus easily solved. An example of the correspondence between some events in the temporal evolution of the blobs and the matching strings, merging, is shown in figure 3.

After classifying, the matching algorithm updates each new blob using the information stored in the old ones and keeps the position of the centroid to form a trajectory while the blob is being tracked. If two blobs merge to form a new one, this particular blob is classified as a group. This new group blob is tracked individually, although the information about the two merged blobs is stored for future use. If the group splits again, the system uses speed, direction and blob characteristics -like colour- to identify the two splitting blobs correctly. By tracking blob centroids from frame to frame, the trajectories of single persons or cars are easily obtained in low-density situations, figure 4. Whenever it is necessary, an interpolation of the position of the tracked blob in the frames where it formed part of a group may provide approximately complete trajectories. The interpolated centroid position is obtained using the median speed of the previous frames and the centroid of the group blob.
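The paper does not give the exact interpolation formula; the sketch below shows one plausible reading, propagating the median frame-to-frame speed and keeping the estimate anchored to the group centroid. The blending step and all names are our assumptions.

```python
import numpy as np

def interpolate_centroid(previous_centroids, group_centroid, frames_in_group):
    """Approximate the position of a person hidden inside a group blob.

    `previous_centroids` are the last individual centroids before merging
    (oldest first, at least two of them); `frames_in_group` counts the frames
    spent inside the group so far.
    """
    pts = np.asarray(previous_centroids, dtype=float)
    median_speed = np.median(np.diff(pts, axis=0), axis=0)   # per-frame displacement
    predicted = pts[-1] + median_speed * frames_in_group
    # Blend the dead-reckoned prediction with the observed group centroid
    # (an assumption: the paper only says both pieces of information are used).
    return 0.5 * (predicted + np.asarray(group_centroid, dtype=float))
```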
4 Event detection

Blob detection provides 2D information allowing an approximate positioning of people in the 3D scenario -a more precise positioning requires either a geometric camera calibration or stereo processing, simultaneously analysing images from two cameras-. People's positions can be used to detect some position-based events characteristic of transport environments, such as unattended luggage,
intrusion into forbidden areas, falls onto the tracks, etc. Further analysis using position information from consecutive frames, i.e. tracking, allows a basic analysis of people's interactions and the detection of dynamic events: unusual movements in passageways, vandalism, attacks, etc.

Fig. 4 Example of trajectories in different scenarios.

4.1 Low-level description of events

The following paragraphs show some examples of how event detection can be achieved using the position of the centroids, the characteristics of the blobs and the tracking information [20], i.e. basic characteristics of images and sequences. A basic knowledge of the scenario is also needed: information about the place -stairs, escalator, aisle, hall, platform, etc.-, the usual direction of motion, and the location of forbidden areas and gates -zones of the image through which people get into the scene or out of it-. The system uses a classification of the scene based on the amount of people present in the image, namely the people-density. This classification cannot be clearly defined using the number of people alone, because that number depends on each scene -20 people can be classified as normal density in a ticket hall or as high density on an escalator-. People-density is divided into four levels, namely zero, low, medium and high density. In this paper we address zero -no people- and low density -up to four people: a single person and a group, or two groups-, although some events concerning medium and high people-density have been successfully solved [23].

1. Unattended luggage. A person carrying luggage leaves it and moves away:
   – The initial blob splits in two.
   – One blob (normally smaller and horizontal) presents no motion; the other moves away from the first.
   – Temporal thresholding may be used to trigger the alarm.
2. Falls:
   – Blob wider than it is tall.
   – Slow or no centroid motion.
   – Falls onto the tracks: centroid in a forbidden area.
3. People hiding. People hide (from the camera or from other people):
   – Blob disappearing in many consecutive frames.
   – Last centroid position not close to a “gate” (through which to leave the scene).
   – Last centroid position very close to a previously labelled “hiding zone”.
   – Temporal thresholding may be used to trigger the alarm.
4. Vandalism. People vandalising public property:
   – Isolation: only one person/group present in the scene.
   – Irregular centroid motion.
   – Possible changes in the background afterwards.
5. Fights. People fighting move together and break away many times, with fast movements:
   – Centroids of groups or persons move to coincidence.
   – Persons/groups merging and splitting.
   – Fast changes in the blobs' characteristics.
6. Intrusion into forbidden areas:
   – Blob (centroid) in a forbidden area.
   – Temporal thresholding may be used to trigger the alarm.
7. Attacks:
   – A newcomer gets too close.
   – One blob may initially be static.
   – One blob tries to move apart.
   – Becomes a fight.

Fig. 5 Evaluation of social zones versus distance. A person getting too close to another may indicate an attack.

In the last case, a basic approximation may use people's feeling of distance and the way social patterns of behaviour establish a range of distances for different kinds of social interaction, figure 5. Although this distance mapping depends on the cultural background and on crowding, it is basically applicable in Western Europe and in low-density situations. Under poor lighting conditions, like those encountered on outdoor platforms, this distance mapping varies. People tend to feel safer when there is enough ambient light to recognise the faces of others nearby: at 4 metres an alert person is thought able to take evasive or defensive action if a threat is perceived, although a distance of 10 metres provides a greater margin of comfort [25].
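As an illustration of how such a distance mapping can be used, the sketch below flags a possible precursor of event 7 ("Attacks") when a newcomer's centroid enters another person's close zone. The zone radius and the assumption that positions are expressed in ground-plane metres are ours; the paper only cites the 4 m reaction and 10 m comfort distances [25], and in the tested system only image coordinates are available, so in practice the threshold would have to be expressed in pixels or derived from a scene-dependent scale.

```python
import math

# Illustrative close-zone radius in metres (not a value taken from the paper).
CLOSE_ZONE_M = 1.2

def possible_attack(newcomer_xy, person_xy, person_is_static: bool) -> bool:
    """Flag a newcomer getting too close to a (possibly static) person."""
    return person_is_static and math.dist(newcomer_xy, person_xy) < CLOSE_ZONE_M
```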
4.2 Identification of events

The above description of events in terms of the trajectories and positions of the blobs in a sequence of frames summarises how the event detection algorithm works. From each frame, a set of blob-related events -blob split, no blob motion, centroid in a forbidden area, etc.- is maintained and cross-checked with the sets of previous frames. When the required conditions are fulfilled, the system raises an alarm. In any case, the system only attracts the attention of the operator, who always decides whether an event is actually taking place.
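A minimal sketch of this identification loop is shown below: the blob-related flags of recent frames are kept in a short history and an alert is raised when a rule's conditions persist for long enough. The class, flag names and the simple persistence test are our assumptions, not the paper's exact formulation.

```python
from collections import deque

class EventIdentifier:
    """Cross-check per-frame blob flags against the recent past."""

    def __init__(self, history_length: int = 25):
        self.history = deque(maxlen=history_length)

    def update(self, frame_flags: set) -> bool:
        """`frame_flags` e.g. {"blob_split", "no_motion", "in_forbidden_area"}."""
        self.history.append(frame_flags)
        # Example rule: an unattended-luggage alert needs both conditions
        # to hold over the last 15 frames (temporal thresholding).
        return self._persists({"blob_split", "no_motion"}, frames=15)

    def _persists(self, required: set, frames: int) -> bool:
        recent = list(self.history)[-frames:]
        return len(recent) == frames and all(required <= f for f in recent)
```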
Fig. 6 Examples of event detection: a sequence showing a “person hiding” event (the dot marks the place where a blob disappears from the sequence in an area not defined as a gate) and centroid trajectories from the “graffiti scribbling” and “people sitting on the stairs” events (the arrow shows the point where the blob remained stationary) and from people leaving a bag and coming back to pick it up.

4.3 Results

The presented real-time tracking system was tested on an 850 MHz compatible PC running Windows 2000. It works with colour images in half-PAL format, 384x288. It has been tested with live video and with image sequences in BMP and JPEG formats. The minimum processing speed observed is 10 Hz, obtained from disk images in BMP format requiring an RGB to YUV transform -live video can be digitised directly in YUV format-. Working with a video signal there is no perceptible difference between the processed and unprocessed video streams. The system can successfully resolve blobs forming and dissolving groups and track one of them throughout this process. It can also be easily upgraded with background updating and tracking of multiple objects. With the proposed system, some predefined events -including people hiding, graffiti scribbling and people sitting on the stairs- have been detected, figure 6. A complete testing of the proposed low-level description of the predefined events has not been carried out due to the lack of enough valid video sequences containing such events.
Fig. 7 Euclidean distance between centroids in single tracking and example of trajectories from two people meeting and walking together.
Some simulations have been used to test event detection -someone falling down the stairs, vandalism and people hiding- whilst real footage has been used to test the tracking algorithm and some position-based events -unattended luggage and intrusion into forbidden areas-. However, an evaluation of the tracking algorithm and of static object detection is provided using the image sequences and ground truth files from the CAVIAR project [33]. The performance of the tracking algorithm can be evaluated using the Euclidean distance between the obtained and the ground truth centroids, as shown in figure 7. The average deviation in this distance goes from two to five pixels in all the analysed series.

The algorithm was also tested with the group and left-baggage sequences. In this case, we provide the percentage of frames with the group correctly detected or the stationary flag correctly raised by the application. The ground truth was established visually, and therefore a left baggage is marked as such several frames before the algorithm can detect it as a separate item. Conversely, the algorithm may not be able to separate two people and keeps a single group blob for several frames after they are marked as separate in the ground truth. An example of the results obtained is shown below.

Sequence name         Frames   Percentage
Left Bag1             409      81%
Left Bag              291      57%
Left Bag at Chair     435      67%
Meet Walk and Split   147      117%
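For reference, the per-frame centroid deviation used in this evaluation can be computed with a few lines of Python; the sketch below is ours and assumes that the tracked and ground-truth centroids are available as aligned (N, 2) arrays for the frames where both exist.

```python
import numpy as np

def centroid_deviation(tracked, ground_truth) -> float:
    """Mean Euclidean distance (pixels) between tracked and ground-truth centroids."""
    t = np.asarray(tracked, dtype=float)
    g = np.asarray(ground_truth, dtype=float)
    return float(np.mean(np.linalg.norm(t - g, axis=1)))
```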
The main problem in analysing the CAVIAR project sequences comes from their number of frames per second. The proposed system was developed for multiple signals at up to ten frames per second; in these sequences, therefore, the centroid velocities and frame-to-frame displacements are normally too small.
An improvement on the results above can be obtained by manipulating some parameters and processing just one in every two or three frames -12 or 8 frames per second-. However, even under these conditions, the system failed to detect the “fight” event in the CAVIAR series, raising the alarm in less than 50% of the frames.
5 Conclusions

Due to the final system requirements, a high processing speed is essential. Luminance contrast segmentation and its associated background model have been chosen because they provide excellent performance at a lower computational cost. Some important points concerning the influence of the chosen method on background subtraction and tracking have been discussed previously [21,15]. With the proposed system, some predefined events -including people hiding, graffiti scribbling and people sitting on the stairs- have been detected, showing its suitability for security tasks, including surveillance and the monitoring of public areas, where the number of CCTV cameras mounted makes full human monitoring impossible for physical and economic reasons. Reduced human monitoring is still needed to resolve the raised alarms and to review the video footage selected by the system. Additional information, such as the number of people and the crossing frequency in a certain area, or the evolution of the people density in all monitored areas over time, may also be obtained. Further 3D information can be obtained with an automatic pre-calibration algorithm that obtains the vanishing lines; the resulting description of the scene's geometry in terms of floor, walls and ceiling can also be used to improve the detection of some events, like unattended luggage. We are currently working on the implementation of that algorithm.

References

1. Norris C, Moran J, Armstrong G (1998) Surveillance, Closed Circuit Television and Social Control. Ashgate, UK
2. Armitage R (2002) To CCTV or not to CCTV. URL: http://www.nacro.org.uk/templates/publications/briefingItem.cfm/2002062800-csps.htm
3. Langlais A (2003) User Needs Analysis, CROMATICA TR-1016, CEC Framework IV Telematics Applications Programme. URL: http://dilnxsrv.king.ac.uk/cromatica/
4. Buxton H, Gong S (1995) Visual surveillance in a dynamic and uncertain world. Artificial Intelligence 78:431-459
5. Aggarwal JK, Cai Q (1999) Human Motion Analysis: A Review. Computer Vision and Image Understanding 73 3:428-440
6. Gavrila DM (1997) The visual analysis of human movement: A survey. Computer Vision and Image Understanding 73:82-98
7. De la Torre F, Martinez E, Santamaria ME, Moran JA (1997) Moving Object Detection and Tracking System: a Real-time Implementation. Proceedings of the Symposium on Signal and Image Processing GRETSI 97:375-378
8. Wren CR, Azarbayejani A, Darrel T, Pentland P (1997) Pfinder: Real-Time Tracking of the Human Body. IEEE Trans. Pattern Analysis and Machine Intelligence 17 6:780-785
9. Rosales R, Sclaroff S (1998) Improved Tracking of Multiple Humans with Trajectory Prediction and Occlusion Modelling. Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition
10. Haritaoglu I, Harwood D, Davis LS (2000) W4: Real-Time Surveillance of People and Their Activities. IEEE Trans. Pattern Analysis and Machine Intelligence 22 8:809-822
11. McKenna S, Jabri S, Duric Z, Rosenfeld A, Wechsler H (2000) Tracking Groups of People. Computer Vision and Image Understanding 80:42-56
12. Agbinya JI, Rees D (1999) Multi-Object Tracking in Video. Real-Time Imaging 5:295-304
13. Intille SS, Davis JW, Bobick AF (1997) Real-time Closed-World Tracking. Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR'97):697-703
14. Corvee E, Velastin SA, Jones GA (2003) Occlusion Tolerant Tracking using Hybrid Prediction Schemes. Acta Automatica Sinica 23 3
15. Fuentes LM, Velastin SA (2001) People tracking in surveillance applications. 2nd IEEE Intern. Workshop on Performance Evaluation of Tracking and Surveillance
16. Rosales R, Sclaroff S (1999) 3D trajectory recovery for tracking multiple objects and trajectory guided recognition of actions. Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition 2:23-25
17. Li Xiaokun, Porikli F (2003) Traffic event detection: A Survey. URL: http://vision.poly.edu:8080/fporikli/traffic.html
18. Rota N, Thonnat M (2000) Video Sequence Interpretation for Visual Surveillance. Proceedings of the IEEE Workshop on Visual Surveillance:59-68
19. Morellas V, Pavlidis I, Tsiamyrtzis P (2003) DETER: Detection of events for threat evaluation and recognition. Machine Vision and Applications 15:29-45
20. Fuentes LM (2002) Assessment of image processing techniques as a means of improving Personal Security in Public Transport. PerSec, EPSRC project Final Report
21. Fuentes LM, Velastin SA (2001) Foreground segmentation using luminance contrast. Proceedings of the WSES/IEEE Conference on Speech, Signal and Image Processing:2231-2235
22. Plataniotis KN, Venetsanopoulos AN (2000) Colour Image Processing and Applications. Springer-Verlag, Berlin-Heidelberg-New York
23. Boghossian B (1999) Evaluation of motion-based algorithms for automated crowd management. Workshop on Performance Characterisation and Benchmarking of Vision Systems, Intern. Conf. on Vision Systems, ICVS
24. van Bommel WJM (1990) Public lighting and residential lighting. Elektrotechniek 68 5:451-454
25. Caminada JF, van Bommel WJM (1980) Residential area lighting. Engineering Report 43, Lighting Design and Engineering Centre, NV Philips Gloeilampenfabrieken, Eindhoven, The Netherlands
26. ARDA-VACE Program. URL: http://www.ic-arda.org/InfoExploit/vace/index.html
27. EU PRISMATICA Program. URL: http://www.prismatica.com
28. EU ADVISOR Program. URL: http://www-sop.inria.fr/orion/ADVISOR/index.html
29. Velastin SA, Vicencio-Silva MA, Lo B, Sun J, Khoudour L (2003) A Distributed Surveillance System for Improving Security in Public Transport Networks. URL: http://www.prismatica.com
30. Chowdhury A, Chellappa R (2003) A Factorisation Approach for Activity Recognition. 2nd IEEE Workshop on Event Mining: Detection and Recognition of Events in Video, Proceedings of the 2003 Computer Vision and Pattern Recognition Conference
31. Nevatia R, Zhao T, Hongeng S (2003) Hierarchical Language-based Representation of Events in Video Streams. 2nd IEEE Workshop on Event Mining: Detection and Recognition of Events in Video, Proceedings of the 2003 Computer Vision and Pattern Recognition Conference
32. Guler S, Liang WH, Pushee IA (2003) A Video Event Detection and Mining Framework. 2nd IEEE Workshop on Event Mining: Detection and Recognition of Events in Video, Proceedings of the 2003 Computer Vision and Pattern Recognition Conference
33. EC Funded CAVIAR project/IST 2001 37540. URL: http://homepages.inf.ed.ac.uk/rbf/CAVIAR/

Luis M. Fuentes received an MS degree in Optics from the University of Zaragoza in 1989 and a PhD in Physics from the University of Valladolid in 1999. He was a post-doctoral research assistant at the Electronic Engineering Department at King's College London and afterwards at the Digital Imaging Research Centre, Kingston University, working on the design of intelligent surveillance systems. He is currently lecturing at the University of Valladolid.

Sergio A. Velastin holds a PhD in Computer Science from the University of Manchester. In 1990 he became Senior Lecturer in the Electrical and Electronic Engineering Department at King's College London, where he founded and managed the Vision and Robotics Laboratory. In 2001 he joined the Digital Imaging Research Centre at Kingston University. He is currently a Reader at Kingston University and founder and general director of IPSOTEK Ltd.

Originality and contribution  The paper deals with the extensively covered subject of tracking applied to the relatively little researched problem of human behaviour understanding. The novel concept of applying luminance contrast to foreground segmentation is used in order to obtain a faster and more reliable foreground detection algorithm. A simple low-level description of events in terms of the spatial and temporal evolution of the people is also original. The main contribution is addressing the real demands of CCTV surveillance systems: simplicity (no camera calibration, special placement, new CCTV cameras or maintenance is needed), speed (the high number of CCTV cameras makes a dedicated PC per camera impossible, and real-time processing of up to four CCTV camera signals is needed) and tailoring (designing the event detection algorithm according to the demands of the public transport sector).