FACE IDENTIFICATION WITH CNN-UM

Zoltán Szlávik and Tamás Szirányi
Analogical & Neural Computing Laboratory, Computer and Automation Research Institute,
Lágymányosi u. 11, Budapest, H-1111, Hungary
e-mail: [szlavik,sziranyi]@sztaki.hu, tel.: +36 1 279 7155

Abstract – In this paper an analogic face-registration algorithm is proposed for video tracking in surveillance applications using the CNN-UM. The algorithm follows pose variations of the head by affine transformations such as rotation. Morphological and various types of wave computations of the CNN-UM were used. The success rate of face identification was 92.24% on a standard facial database.

1. INTRODUCTION

Face identification is very important in security, surveillance and telecommunication applications. The proposed algorithm will be used for face tracking in a video surveillance system. In most cases the quality and resolution of the recorded image segments is poor, and it is hard to identify a face. For this reason, surveillance projects often use additional zooming cameras to focus on regions of interest. Disturbing effects such as variations in illumination, variations in pose, or occlusions may distort the recordings. Due to the acquisition conditions, the size of the face images is smaller (sometimes much smaller) than the input image size assumed by most existing face recognition systems. For example, the valid face region can be as small as 15 × 15 pixels, whereas the face images used in still-image-based systems are at least 128 × 128. Small images not only make the recognition task difficult, but also affect the accuracy of face segmentation. Object tracking, moreover, requires feature extraction and matching in real time. The CNN-UM has proved to be an appropriate tool for real-time image processing [1][2][3].

In our approach we assume that the human head has been detected and localized. The task is divided into the following sub-tasks. First, some features are extracted to estimate the pose of the head. Then the head is transformed to create a canonical image, in which the face is filtered, rotated and scaled to a standard size. Finally, the face is identified by searching for the most similar face in a database.
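As a rough illustration of this decomposition, the following sketch mirrors the three sub-tasks with drastically simplified stand-ins; the function names and the edge-energy heuristic are our own assumptions for illustration only, not the CNN-UM operations described later in the paper.

```python
import numpy as np

def extract_axis(face):
    # Stub for sub-task 1: estimate the column of the symmetry axis
    # from the horizontal-difference (vertical-edge) energy per column.
    col_energy = np.abs(np.diff(face, axis=1)).sum(axis=0)
    return int(np.argmax(col_energy))

def normalize_face(face, axis_col):
    # Stub for sub-task 2: shift the face so the symmetry axis is centered.
    return np.roll(face, face.shape[1] // 2 - axis_col, axis=1)

def identify_face(head_image, database):
    """Hypothetical top-level pipeline for the sub-tasks described above."""
    axis_col = extract_axis(head_image)               # 1. pose-related features
    canonical = normalize_face(head_image, axis_col)  # 2. canonical image
    # 3. nearest face in the database under the Euclidean distance
    scores = [np.linalg.norm(canonical - ref) for ref in database]
    return int(np.argmin(scores))
```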

2. OVERVIEW OF EXISTING ALGORITHMS

Current face recognition algorithms can be categorized into two classes: geometric feature-based [4] and image-template-based [5][6][7]. The geometric feature-based methods analyze explicit local facial features and their geometric relationships. Such methods are known as Graph Matching, Elastic Matching and Local Feature Analysis [4]. The template-based methods compute the correlation between a face and one or more model templates to estimate the face identity. To reduce the dimensionality of the image space, several techniques are used: statistical tools such as Support Vector Machines [8][9], Linear Discriminant Analysis [7][10] and Principal Component Analysis [11][12] have been studied to construct a suitable set of face templates. These methods mostly capture global features of the face images.

Most face recognition algorithms focus on frontal facial views. However, pose changes often lead to large nonlinear variations in facial appearance due to self-occlusion and self-shading. Some algorithms have been proposed that can recognize faces at a variety of poses; however, most of them require gallery images at every pose [9].

Figure 1: Example of a client model image (a) and a test image (b) rejected by experts based on LDA

These techniques are often too slow to be used for tracking purposes. The demonstrated algorithm extracts some features and finds the closest match to the unclassified image in an image database. It uses the CNN-UM architecture [2] and a real hardware implementation [3] for real-time image processing.

3. INTRODUCTION TO CNN

We used a CNN-UM, the Cellular Neural Network Universal Machine [2]. The CNN is defined by the state equation

\dot{x}_{ij}(t) = -x_{ij}(t) + \sum_{kl \in S_r(ij)} A_{ij,kl}\, y_{kl}(t) + \sum_{kl \in S_r(ij)} B_{ij,kl}\, u_{kl} + z_{ij} \qquad (1)

where i, j denote the cell position in a grid; x, y, u, z are called the state, output, input and threshold of cell (i, j), respectively; and A and B are called the feedback and input synaptic operators, or templates. The output y is defined by the following equation:

y_{ij} = f(x_{ij}) = \frac{1}{2}\left( \left| x_{ij} + 1 \right| - \left| x_{ij} - 1 \right| \right). \qquad (2)
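For readers who want to experiment, the following is a minimal numerical sketch of equations (1) and (2): a forward-Euler simulation of the CNN dynamics with space-invariant 3×3 templates. The function names and the integration parameters are our own assumptions, not part of the CNN-UM specification.

```python
import numpy as np

def correlate_same(img, kernel):
    # 'Same'-size 2D correlation with zero padding over the r = 1 neighborhood S_r
    r = kernel.shape[0] // 2
    padded = np.pad(img, r)
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + 2 * r + 1, j:j + 2 * r + 1] * kernel)
    return out

def cnn_output(x):
    # Equation (2): y = (|x + 1| - |x - 1|) / 2, i.e. x clipped to [-1, 1]
    return 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))

def run_cnn(u, A, B, z, dt=0.1, steps=200):
    """Forward-Euler integration of the CNN state equation (1).

    u: input image scaled to [-1, 1]; A, B: 3x3 templates; z: scalar bias."""
    x = np.zeros_like(u, dtype=float)
    drive = correlate_same(u, B) + z   # the B-term and the bias are constant in time
    for _ in range(steps):
        dx = -x + correlate_same(cnn_output(x), A) + drive
        x = x + dt * dx
    return cnn_output(x)
```

With an A template that has only a centre entry and a filtering kernel in B, this reduces to plain linear filtering; nonzero off-centre feedback entries in A give the propagating, wave-like behaviour that the template operations below exploit.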

An elementary operation in this architecture is called a template operation [13], which means the execution of some operation on the whole image. Using templates, different image-processing, morphological and wave-metric operations can be implemented [13][14].

4. FACE IDENTIFICATION WITH CNN-UM

The key tasks in face tracking for video surveillance are fast and efficient feature extraction and comparison of images. The computationally most intensive part of image identification is the normalization of the images. To design fast algorithms we use the fully parallel image-processing CNN-UM architecture [3].

The algorithm was tested on the UMIST face database [15]. It consists of 493 images of 20 persons. We chose this database because, for each person, it covers a range of poses from profile to frontal views, like an image sequence from video. Figure 2 shows some sample images.

Figure 2: Image samples from the UMIST face database

The faces in the database vary in pose and scale. The algorithm compensates for the pose variations that may occur between two video frames by rotating the faces to a “desired” view. To rotate a face we estimate symmetry information from the picture: we find the axis of symmetry of the face and the positions of the face sides in the image. Knowing these, we can rotate the face into the “desired” position. The main steps of the normalization algorithm are shown in Figure 3.

Figure 3: The main steps of the face-rotation algorithm

4.1 Feature extraction, estimating facial pose and rotation

In general, feature extraction means the detection of features in an image that has been transformed to a standard/reference position and scale: finding the eyes, mouth, eye sockets, eyebrows and nose, and calculating their relative positions. We assume that the face images are transformed (shifted and rotated in 2D) to a given position and scale. Feature extraction in our algorithm means the detection of those features that support the rotation of the image.

To estimate the facial pose we measure how far the face is rotated. We do this by finding the axis of symmetry of the face and the sides of the face, and taking the distances between them. The ratio of the two obtained values is the measure of the face rotation. The algorithm therefore consists of three steps (see the sketch below):
• detection of edges;
• detection of face symmetry information – the axis of symmetry and the sides of the face (an object);
• rotation.

For edge detection we use Sobel operators; this is a linear template when applied on the ACE4K [3]. As the axis of symmetry of a face we choose the region of the nose. For estimating the axis of symmetry (Figure 4) we make the assumption that the region of the nose is the most vertically detailed region of the face. So if we apply the CCD (connected component detection) operation to the edge map of the input image, the peak of the resulting image will be at the region of the nose. To find this peak we apply some typical template operations for binary images from [14].

Figure 4: Finding the nose – the axis of symmetry of the face

We could also find the region of the nose by calculating the vertical gradient field of the input image. However, we decided to design the algorithm with binary-output operations, because they work more robustly on the ACE4K [3]. For measuring the sides of the face we use the result of the above algorithm (the result of the VERTICAL CCD template) (Figure 5).

Figure 5: Finding the sides of a face
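To make these steps concrete, here is a minimal software analogue, assuming images scaled to [0, 1]. On the ACE4K each step is a single template operation; here the Sobel filtering and the CCD-style accumulation are ordinary array operations, and the threshold value is an assumption.

```python
import numpy as np
from scipy.ndimage import correlate

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def vertical_edge_map(img, thresh=0.25):
    # Binary map of vertical edges; img scaled to [0, 1], thresh is an assumption
    g = correlate(img, SOBEL_X, mode='constant')
    return (np.abs(g) > thresh).astype(float)

def nose_axis_and_sides(edge_map):
    """Estimate the symmetry axis (nose column) and the face sides from a
    binary edge map by column-wise accumulation (a stand-in for VERTICAL CCD)."""
    col_counts = edge_map.sum(axis=0)          # edge pixels per column
    axis = int(np.argmax(col_counts))          # most vertically detailed column
    occupied = np.nonzero(col_counts > 0)[0]   # columns containing any edge pixel
    left, right = int(occupied[0]), int(occupied[-1])
    return axis, left, right

def rotation_measure(axis, left, right):
    # The ratio of the two axis-to-side distances measures the in-depth rotation
    return (axis - left) / max(right - axis, 1)
```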

Having found the axis of symmetry and the sides of the face, we can rotate and shift the face to any position in the picture. As rotation in depth we define an image transform that expands one half of the face and compresses the other. This is not a realistic rotation, but it simulates the non-linear transformation of the face under rotation in depth.
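A minimal sketch of such a transform is given below: a piecewise-linear horizontal warp about the symmetry axis that expands one half of the face and compresses the other. The implementation and the expansion factor are our assumptions; the paper realizes the transform with CNN-UM operations.

```python
import numpy as np

def rotate_in_depth(face, axis, factor=1.2):
    """Expand the left half of the face by `factor` and compress the right half
    by 1/factor (piecewise-linear horizontal warp about the symmetry axis)."""
    h, w = face.shape
    out = np.zeros_like(face)
    for j in range(w):
        # Map each output column back to a source column, scaling each half
        if j < axis:
            src = axis - (axis - j) / factor   # left half: expanded
        else:
            src = axis + (j - axis) * factor   # right half: compressed
        src = int(round(src))
        if 0 <= src < w:                       # out-of-range columns stay black
            out[:, j] = face[:, src]
    return out
```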

Figure 7: Rotation of a face on the CNN-UM (simulation results). The left image is the input; the right is the rotated result.

5. RESULTS AND CHIP EXPERIMENTS

The following images (Figure 6) show some results of the nose-finding algorithm described above. There are cases where the nose is marked by a band rather than by a single line. It is easy to find the center of such a band, and then the nose is marked by a single line – the axis of symmetry.

Figure 6: Results of the nose- and face-side-finding algorithms (simulation results)

In our approach we apply a simple recognition technique based on whole-image gray-level templates. In the simple version of template matching, the image, represented as a two-dimensional array of intensity values, is compared, using the Euclidean distance, to a single template representing the whole face. When attempting recognition, the unclassified image is compared in turn to all of the database images, returning for each a score: the distance to that database image. The unknown person is then classified as the one giving the smallest score. This template-matching algorithm incorporates three elements of feature-based approaches: the sides of the face, the axis of symmetry, and the distance between the sides and the axis of symmetry. The recognition rate of this algorithm on the UMIST database is 92.24%.
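The matching step thus reduces to a nearest-neighbour search under the Euclidean distance. A minimal sketch, assuming an already-normalized query face and a gallery of equally sized grayscale images (the names are illustrative):

```python
import numpy as np

def classify(unknown, gallery, labels):
    """Nearest-neighbour template matching over whole-image gray-level templates.

    unknown: normalized 2D face; gallery: list of same-size 2D faces;
    labels: person identity for each gallery image."""
    scores = [np.linalg.norm(unknown.astype(float) - ref.astype(float))
              for ref in gallery]          # Euclidean distance to each image
    best = int(np.argmin(scores))          # the smallest score wins
    return labels[best], scores[best]
```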

Figure 8: Face (a) has been found as the match for the sample face (b)

The feature extraction algorithm and the rotation were tested on the ACE4K chip. Some results are shown in Figure 9 and Figure 10.

Figure 9: Feature extraction on ACE4K


Figure 10: Rotation of a face on the ACE4K. The left image is the input; the central one is the rotated result.

One can see that the ACE4K chip is not as accurate in grayscale transformations as in binary operations.

CONCLUSIONS

In this paper some analogic algorithms were proposed for facial feature extraction and face indexing using the CNN-UM. Software simulation of our algorithm gives a 92.24% success rate. The steps of the algorithm were tested on the ACE4K chip.

ACKNOWLEDGEMENT

The authors wish to thank Professor Tamás Roska for consultations. The support of the National R&D Program of Hungary, TeleSense Project, is acknowledged.

REFERENCES

[1] L.O. Chua, L. Yang, "Cellular Neural Networks: Theory", IEEE Trans. on Circuits and Systems (CAS), Vol. 35, pp. 1257-1272, 1988
[2] T. Roska, L.O. Chua, "The CNN Universal Machine: An Analogic Array Computer", IEEE Trans. on Circuits and Systems II, Vol. 40, No. 3, pp. 163-173, 1993
[3] S. Espejo, R. Domínguez-Castro, G. Liñán, Á. Rodríguez-Vázquez, "A 64x64 CNN Universal Chip with Analog and Digital I/O", 5th Int. Conf. on Electronics, Circuits and Systems (ICECS-98), Lisbon, pp. 203-206, 1998
[4] H. Wu, Y. Yoshida, T. Shioyama, "Optimal Gabor Filters for High Speed Face Classification", 16th Int. Conf. on Pattern Recognition, 2002
[5] R. Gross, I. Matthews, S. Baker, "Eigen Light-Fields and Face Recognition Across Pose", IEEE Int. Conf. on Automatic Face and Gesture Recognition, 2002
[6] R. Gross, J. Shi, J.F. Cohn, "Quo Vadis Face Recognition?", Third Workshop on Empirical Evaluation Methods in Computer Vision, 2001
[7] Y. Bing, J. Lianfu, C. Ping, "A New LDA-Based Method for Face Recognition", 16th Int. Conf. on Pattern Recognition, 2002

[8] E. Rivlin, M. Rudzsky, R. Goldenberg, U. Bogomolov, S. Lepchev, "A Real-Time System for Classification of Moving Objects", 16th Int. Conf. on Pattern Recognition, 2002
[9] R. Nelson, I. Green, "Tracking Objects Using Recognition", 16th Int. Conf. on Pattern Recognition, 2002
[10] J. Czyz, J. Kittler, L. Vandendorpe, "Combining Face Verification Experts", 16th Int. Conf. on Pattern Recognition, 2002
[11] H.C. Kim, D. Kim, S.Y. Bang, "Face Recognition Using the Mixture-of-Eigenfaces Method", Pattern Recognition Letters, Vol. 23, pp. 1549-1558, 2001
[12] D. Socolinsky, A. Selinger, "A Comparative Analysis of Face Recognition Performance with Visible and Thermal Infrared Imagery", 16th Int. Conf. on Pattern Recognition, 2002
[13] L.O. Chua, T. Roska, "Cellular Neural Networks and Visual Computing", Cambridge University Press, Cambridge, 2002
[14] http://lab.analogic.sztaki.hu/csl/CSL
[15] D.B. Graham, N.M. Allinson, "Characterizing Virtual Eigensignatures for General Purpose Face Recognition", in Face Recognition: From Theory to Applications, NATO ASI Series F, Computer and Systems Sciences, Vol. 163, pp. 446-456, 1998
