F. Robert-Inacio, A. Raybaud, É. Clément, ‘Multispectral Target Detection and Tracking for Seaport Video Surveillance’, Proceedings of Image and Vision Computing New Zealand 2007, pp. 169–174, Hamilton, New Zealand, December 2007.
Multispectral Target Detection and Tracking for Seaport Video Surveillance
F. Robert-Inacio1, A. Raybaud1 and É. Clément2
1 Institut Supérieur de l’Électronique et du Numérique, Laboratoire Matériaux et Microélectronique de Provence, UMR CNRS 6137, Place Pompidou, 83000 Toulon, France.
2 CS Systèmes d’Information, 230 rue Marcellin Berthelot, 83130 La Garde, France.
Email:
[email protected]
Abstract
This paper presents a video surveillance process that detects and tracks ships at the entrance of a seaport in order to improve security and to prevent terrorist attacks. The process supports the automatic analysis of movements inside the seaport. Detection and tracking are performed on IR data, whereas the pattern recognition stage is performed on color data. Detection and tracking results on IR and color data are compared in order to justify the choice of IR images for these two steps. Finally, a draft description of the pattern recognition stage is given as a development prospect.
Keywords: Multispectral data, target detection, target tracking, pattern recognition, shape identification
1 Introduction
Target detection and tracking have been widely studied over the last ten years, driven by the development of video surveillance in different contexts. In particular, automating target detection and tracking has proved necessary because of the increasing volume of data to be processed. Such tasks can nevertheless be very difficult, depending on the complexity of the background or of the target itself [1]. For example, studies have been carried out in biometrics for face recognition [2] or iris identification [3]. In this context the object under study is well located and the background is reduced to a minimal amount of data and complexity. In many fields the detection of an object is required, and hyperspectral imaging sensors can provide image data containing both spatial and spectral information [4]. For example, another application using multispectral data is described in [5] for the systematic inspection of poultry carcasses in order to ensure safe meat. In our application we have to deal with multispectral data provided by an acquisition system, giving information on both spatial and spectral properties. Furthermore, the background can be very complex and cannot be reduced to a small amount of data. In the following, a full description of the experimental context is given, as well as algorithms for target detection and tracking. Preliminary results are then presented on some examples. Finally, further developments in pattern recognition are described in order to identify the objects under study.
2 Experimental context
The main goal of this study is to secure a seaport by providing video surveillance in order to protect it from terrorist attacks. The attacks under consideration are restricted to attacks coming from the sea and carried out by objects moving at the surface. Mobiles to be identified range from swimmers to tankers, including windsurfers, jet skis, fast boats, sailing boats and other kinds of ships. The mobiles to be detected can therefore be of various kinds regarding their shapes, dimensions or speeds. The main features of these mobiles are shape, size, speed and trajectory. The analysis of the video stream will consist of three different stages: detection and tracking, trajectory analysis and mobile identification. The trajectory analysis will make it possible to determine whether the mobile is a threat to the seaport, and the mobile identification will be based on pattern recognition and shape classification from geometrical features.
3 Image acquisition
The image acquisition is performed simultaneously by a high-definition color camera and an infrared (IR) acquisition system (made of three IR cameras) providing images in three different frequency domains. Two such acquisition systems are used in order to determine which location is the best. Fig. 2 shows the two spots (stars) selected for installing the two acquisition systems. The area under surveillance is symbolized by a square with an edge length of 2 km (in yellow).
Figure 1: Sea surface surveillance process.
Figure 2: Configuration of the image acquisition systems.
4 Target detection
Target detection is achieved by considering a background image that is periodically refreshed during the process [6]. The time interval between two background images is set to 3 minutes but can be chosen by the operator, and the video stream is set to 10 frames per second. Mobiles are detected by differencing the image under study and the background image. This method makes it possible to take persistent changes in the background into account. For example, if a new ship arrives and stays in the camera frame for a long time (anchoring), its shape is integrated into the background after a given time. In order to determine whether a part of the image has been modified, the image under study is cut up into several arbitrary elementary areas, in each of which an average value is computed. This average value is then compared to the corresponding one in the background image. If the difference between the two average values is greater than an error tolerance value, the elementary area is considered part of a target. Colored squares in Figure 3a correspond to modified elementary areas. Afterwards, elementary areas are gathered according to their location: if two or more elementary areas are neighbors, they are considered as a single target (Figure 3b). Nevertheless, a main problem to solve is wave motion. Even if the sea is calm, wave motion induces the detection of several irrelevant targets when using a camera acquiring color data. To avoid this drawback, infrared data are used, as the water temperature does not vary significantly between the top and the bottom of a given wave.
Figure 3: Target detection process: a) primary target detection, b) elementary area gathering.
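The comparison of elementary areas with the background image, and the gathering of neighboring areas into targets, can be illustrated by the following minimal Python sketch. It is not the authors' implementation. It assumes square elementary areas of side `block`, single-channel images stored as lists of rows, and 8-connectivity for the neighborhood; all function and parameter names are hypothetical.

```python
from collections import deque


def block_means(image, block):
    """Average value of each block x block elementary area (image: list of rows)."""
    rows, cols = len(image) // block, len(image[0]) // block
    means = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = sum(
                image[y][x]
                for y in range(r * block, (r + 1) * block)
                for x in range(c * block, (c + 1) * block)
            )
            means[r][c] = total / (block * block)
    return means


def modified_areas(frame, background, block, tolerance):
    """Flag areas whose mean differs from the background mean by more than tolerance."""
    f = block_means(frame, block)
    b = block_means(background, block)
    return [
        [abs(fv - bv) > tolerance for fv, bv in zip(f_row, b_row)]
        for f_row, b_row in zip(f, b)
    ]


def gather_targets(mask):
    """Group neighboring flagged areas (8-connectivity, an assumption) into targets."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    targets = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            queue, cells = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            targets.append(cells)
    return targets
```

With the settings quoted above (10 frames per second, background refreshed every 3 minutes), the background image would be renewed every 1800 frames. How the new background is built is not detailed here, so the sketch leaves that step out.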
5 Target tracking
Target tracking is achieved in two steps. First, the targets detected at a given time of the sequence are compared to the targets of the previous image. If their locations are coherent in terms of speed, the targets are kept; otherwise they are rejected as false alarms. Note that the rejection only becomes effective after several images.
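The first tracking step can be sketched as follows, as an illustration rather than the method actually implemented. The sketch assumes that each target is summarized by its centroid, that a detection continues a track when the implied speed does not exceed a hypothetical `max_speed`, and that a track is rejected once it has gone unmatched for more than `max_misses` consecutive images. The greedy nearest-match association is also an assumption.

```python
import math


def update_tracks(tracks, detections, dt, max_speed, max_misses=3):
    """Associate centroid detections (x, y) with tracks from the previous image.

    tracks: list of dicts {"history": [(x, y), ...], "misses": int}
    dt: time between two images, in seconds
    """
    unmatched = list(detections)
    kept = []
    for track in tracks:
        last = track["history"][-1]
        best = None
        for det in unmatched:
            speed = math.dist(last, det) / dt
            # Keep the speed-coherent detection closest to the last position.
            if speed <= max_speed and (best is None or speed < best[0]):
                best = (speed, det)
        if best is not None:
            unmatched.remove(best[1])
            track["history"].append(best[1])
            track["misses"] = 0
        else:
            track["misses"] += 1
        # Rejection only takes effect after several unmatched images.
        if track["misses"] <= max_misses:
            kept.append(track)
    # Every remaining detection starts a new tentative track.
    kept.extend({"history": [det], "misses": 0} for det in unmatched)
    return kept
```

The `history` list accumulated for each track is the position history on which a later trajectory analysis can be based.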
Figure 4: Acquisition sequences for a) IR data (mid time), b) IR data (full time), c) color data (full time).
The second step consists in building the history of the target position along the sequence. This step is a prerequisite for trajectory determination and analysis. Such an analysis should provide a helpful tool for deciding whether a target is threatening or not. For example, if a target is very fast and sailing directly towards the seaport, the trajectory analysis should classify it as a probable attack, and the operator must be warned in order to confirm the diagnosis. Some problems remain to be solved, such as tracking multiple targets that overlap each other, or targets that disappear from the image and reappear a short time later.
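As an illustration of how a position history could feed such a decision, the sketch below estimates the mean speed of a target and the angle between its velocity and the direction of the seaport. The decision rule (speed above `speed_limit` and bearing error below `max_bearing_error_deg`) and all names and thresholds are assumptions made for the example. The text only states that fast targets heading directly to the seaport should be reported to the operator.

```python
import math


def assess_threat(history, dt, port, speed_limit, max_bearing_error_deg=15.0):
    """Return (speed, bearing_error_deg, suspicious) from a list of (x, y) positions.

    history: positions sampled every dt seconds; port: (x, y) of the seaport entrance.
    """
    if len(history) < 2:
        return 0.0, None, False
    (x0, y0), (x1, y1) = history[0], history[-1]
    elapsed = dt * (len(history) - 1)
    vx, vy = (x1 - x0) / elapsed, (y1 - y0) / elapsed
    speed = math.hypot(vx, vy)
    if speed == 0.0:
        return 0.0, None, False
    # Direction from the current position to the port.
    px, py = port[0] - x1, port[1] - y1
    distance = math.hypot(px, py)
    if distance == 0.0:
        return speed, 0.0, speed > speed_limit
    cos_angle = (vx * px + vy * py) / (speed * distance)
    error = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    suspicious = speed > speed_limit and error <= max_bearing_error_deg
    return speed, error, suspicious
```

A positive result would only raise a warning; as stated above, the operator remains responsible for confirming the diagnosis.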
6 Experimental results
Unfortunately, the amount of data at our disposal is not large enough to enable a real performance evaluation of the algorithms [7]. Nevertheless, the sequences to which they were applied are significant enough to suggest high performance in this particular context. Figure 4 shows two different sequences: on the left, a fast boat crosses the area under surveillance, whereas on the right the mobile is a jet ski. Rows a and b show IR data at mid time and at the end of the sequence, and row c shows color data. We can note that the wake of the boat or jet ski disturbs target detection, as it produces many false alarms on color sequences. That is why it is better to analyse IR images, in which wakes are almost homogeneous because of their low temperature variation. On the video sequences the fast boat crosses the camera shot, whereas the jet ski makes a loop beginning at the center, going to the left and then back to the right. Yellow points (Figure 4b) and blue points (Figure 4c) are the centroids of targets along the sequence. Detection is better and simpler with IR data, as temperature information is smooth, unlike color information. It is then easy to determine the mobile trajectory, as there is only a single target. Indeed, the ship wake is progressively integrated into the background on IR images, whereas it produces false targets on color images. The yellow frame on the six images of Figure 4 is the rectangular box containing the detected mobile on the current image. We can remark that the wake is taken into account as part of the mobile for color images, whereas only the mobile itself is framed for IR data.
7 Detection of terrorist attacks
In order to achieve such a task, several features must be taken into account. First of all, direct measures such as the target size and speed must be estimated. Indeed, small mobiles moving at high speed are suspect. A trajectory analysis is also an appropriate approach: if the mobile follows a straight trajectory heading into the seaport, the alert must be given. Finally, the detection of terrorist attacks can be refined by using shape classification in order to identify the mobile.
8 Further developments in pattern recognition and shape classification
The next step consists in studying the fine details of outlined targets in order to classify them. Determining several geometrical features can reduce the set of reference shapes to which the unidentified target has to be compared. For this step, we consider a single image of the target to identify. The process must be able to classify a mobile whatever its orientation. But in order to fully identify targets it is necessary to take color data into account, as they contain finer details than IR images. That is why the acquisition system used in this study provides both IR and color data. Figure 5 describes the full identification process. A database provides reference elements about geometrical features for known ships or mobiles under consideration. These geometrical features are measured on the mobile under study in order to classify and identify it. Geometrical features can include information about elongation, symmetry and so on [8][9]. A similarity measure will be established and estimated between the mobile under study and each element of the database [10]. This similarity measure gives a value between 0 and 1, evaluating the degree of similarity between two shapes. A threshold value close to 1 determines whether the identification has been achieved. In other words, if the similarity degree between a reference ship of the database and the ship under study is lower than this threshold, the kind of the ship under study is probably unknown, whereas a similarity degree greater than this threshold indicates that the ship under study is of the same kind as the reference ship. In every case, the maximal similarity degree between the ship under study and each of those of the database points out the reference ship closest to the one under study. The database of reference mobiles is expected to evolve with further identifications, through a learning process complemented by human knowledge. This means that the amount of reference data will increase considerably while processing video sequences. In order to reduce computation time, we will have to integrate the spectral information provided by IR images about the mobile to identify. In this way, IR features will make it possible to determine which subset of reference mobiles could be suitable and must therefore be considered for identification.
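The thresholded identification described in this section can be sketched as follows. The similarity measure of [10] is not reproduced here: the function `similarity` is a simple stand-in (one minus the mean absolute difference of features normalized to [0, 1]), and the feature names, the default threshold of 0.9 and the treatment of a score equal to the threshold are all assumptions made for illustration.

```python
def similarity(a, b):
    """Stand-in similarity in [0, 1] between two feature dicts with values in [0, 1]."""
    keys = sorted(set(a) & set(b))
    if not keys:
        return 0.0
    return 1.0 - sum(abs(a[k] - b[k]) for k in keys) / len(keys)


def identify(features, database, threshold=0.9):
    """Return (closest reference name, similarity degree, identified flag).

    The closest reference is always reported; the flag tells whether its
    similarity degree reaches the threshold, i.e. whether the kind is known.
    """
    best_name, best_score = None, -1.0
    for name, reference in database.items():
        score = similarity(features, reference)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score, best_score >= threshold
```

For example, with hypothetical reference entries:

```python
database = {
    "jet ski": {"elongation": 0.3, "symmetry": 0.9},
    "tanker": {"elongation": 0.9, "symmetry": 0.8},
}
name, score, known = identify({"elongation": 0.32, "symmetry": 0.88}, database)
```

Restricting `database` beforehand to the subset of references compatible with the IR features would reduce the number of comparisons, in line with the preselection suggested above.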
Figure 5: Mobile identification process.
9 Conclusion
The presented application is original in that it takes into account a complex background on which complex targets evolve. The complexity of the background lies in wave motion, which prevents the background from being constant along the sequence, and in the scene itself, which can be made of a landscape including many different elements. The algorithm uses a simple model for the background that does not require heavy computation. Furthermore, the detection algorithm is able to refresh the background image in order to integrate new elements, such as new ships at anchor. Another point is that this study justifies the choice of IR data for the steps of target detection and tracking. IR images are indeed smoother, especially on water regions and on ship hulls. For example, registration numbers or different paint colors do not appear in IR data. Furthermore, mixing data from different spectral ranges gives a new approach to video surveillance and mobile identification, especially in outdoor contexts with natural lighting and moving parts in the background. Finally, the last step concerning ship identification has still to be developed in order to evaluate the performance of the whole system.
10 Acknowledgements
This research is supported by the SECMAR project, funded by the Pôle de Compétitivité Mer of the Région Provence-Alpes-Côte-d’Azur, France.
References
[1] K. Jung, K. Kim, and A. Jain, “Text information extraction in images and video: a survey,” Pattern Recognition, vol. 37, pp. 977–997, 2004.
[2] Y. Tong, Y. Wang, Z. Zhu, and Q. Ji, “Robust facial feature tracking under varying face pose and facial expression,” Pattern Recognition, vol. 40, pp. 3195–3208, 2007.
[3] S. Sirohey, A. Rosenfeld, and Z. Duric, “A method of detecting and tracking irises and eyelids in video,” Pattern Recognition, vol. 35, pp. 1389–1401, 2002.
[4] D. Manolakis, D. Marden, and G. Shaw, “Hyperspectral image processing for automatic target detection applications,” Lincoln Laboratory Journal, vol. 14, no. 1, pp. 79–116, 2003.
[5] F. Ding, Y. Chen, and K. Chao, “Two-waveband color-mixing binoculars for the detection of wholesome and unwholesome chicken carcasses: a simulation,” Applied Optics, vol. 44, pp. 5454–5462, 2005.
[6] D. Hall, J. Nascimento, P. Ribeiro, E. Andrade, P. Moreno, S. Pesnel, T. List, R. Emonet, R. Fisher, J. S. Victor, and J. Crowley, “Comparison of target detection algorithms using adaptive background models,” in Proc. IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, Beijing, China, October 2005, pp. 113–120.
[7] J. Black, T. Ellis, and P. Rosin, “A novel method for video tracking performance evaluation,” in Proc. IEEE 7th International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, Nice, France, October 2003, pp. 125–132.
[8] F. Robert-Inacio, “Symmetry parameters for 3D pattern classification,” Pattern Recognition Letters, vol. 26, no. 11, pp. 1732–1739.
[9] B. Grünbaum, “Measures of symmetry for convex sets,” in Proc. Symp. Pure Math., vol. 7, 1963, pp. 233–270.
[10] F. Robert, “Shape studies based on the circumscribed disk algorithm,” in Proc. of the IEEE-IMACS, CESA 98, vol. 4, Hammamet, Tunisia, April 1998, pp. 821–826.