Implementation of Face Detection System Using Haar Classifiers

H. Blaiech¹, F.E. Sayadi², R. Tourki³

¹ Department of Industrial Electronics, National Engineering School, Sousse, Tunisia
² Department of Electronic Engineering, Higher Institute of Applied Science and Technology, Sousse, Tunisia
³ Department of Physical Sciences, Faculty of Science, Laboratory of Electronics and Microelectronics, Monastir, Tunisia
Abstract - This paper presents a hardware implementation of a face and eye detection algorithm. To this end, we went through several stages. First, we implemented an algorithm for detecting and tracking the eyes, written in C using OpenCV, an open-source library for image processing and computer vision. Then a profiling of the HW/SW (hardware/software) partition was done; the hardware part handles the part of the algorithm with the highest computational cost. The system has been implemented on a Spartan3A DSP FPGA board. In order to visualize the results on real images, we carried out a co-simulation of this block using the Simulink tool of Matlab.
Keywords: Face; detection; Haar; Adaboost; FPGA.

1 Introduction

A great deal of computer vision research is dedicated to systems designed to detect user movements and facial gestures. In many cases, such systems are created with a specific goal: providing a way for people with disabilities or limited motor skills to use computer systems. These systems are called "human-computer interaction systems". Among the main biometric characteristics that can serve as a means of identification in such systems, we can cite: the shape of the ear, the hand print, the voice, the eyes, the hand shape, and 2D and 3D faces. In this paper we focus on detection using the 2D face and the eyes. The paper is organized as follows: Section 2 gives a software implementation of the eye detecting and tracking algorithm. We improve this algorithm by adding a block for face detection, and then profile the hardware/software partition; the hardware part handles the part of the algorithm with the highest computational cost. Section 3 outlines the face detection method based on Haar-like descriptors. Section 4 presents the results of the hardware implementation of the face detection block on a Spartan 3A DSP board, followed by a simulation and a co-simulation of this block using the Simulink tool of Matlab.

2 Software Implementation

In this section we describe the eye detecting and tracking algorithm employed [1], which we implemented in C using the OpenCV library [2]. The main steps of the algorithm are shown in "Figure 1":
Figure 1. Overview of the main stages in the system
The main difficulty in this system is detecting the eyes in the whole image: the image may contain two components that are close together and very similar to the characteristics of the eyes, yet are not eyes, as shown in "Figure 2". To solve this problem and increase the efficiency of our algorithm, we added a step that restricts the eye search box to the face area of the user.
Figure 2. The problem of detecting the eyes in the whole image
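One way to implement this restriction can be sketched in plain C. The `Rect` type and the upper-half heuristic below are illustrative only; the paper's implementation relies on OpenCV structures instead.

```c
/* Hypothetical rectangle type; OpenCV's CvRect has the same fields. */
typedef struct { int x, y, width, height; } Rect;

/* Restrict the eye search window to the upper half of a detected face
   rectangle, so that face-like background components are ignored. */
Rect eye_search_region(Rect face) {
    Rect r;
    r.x = face.x;
    r.y = face.y;
    r.width = face.width;
    r.height = face.height / 2;   /* eyes lie in the upper half of the face */
    return r;
}
```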
2.1 Eye Detection through Face Detection
This step consists in detecting the face by applying the method based on Haar descriptors; the detected face then seeds the steps described above. This method is available in the OpenCV C library as cvHaarDetectObjects. The new test phases across the various stages of the eye detecting and tracking algorithm are described below "Figure 3":
Figure 3. Detection of the eyes from the frame of the face

2.2 Profiling and Hardware Partitioning

We measured the runtime of each block of the eye detecting and tracking algorithm. The initialization block runs only once, while the remaining blocks run repeatedly until the positions of the eyes and the face are obtained. Leaving aside the initialization block, the block with the highest computational cost in our algorithm is the face detection block. We therefore decided to implement it on a hardware platform in order to reduce its runtime "Figure 4".

Figure 4. Comparison of the runtime of the different blocks

3 Description of the Face Detection Method

We begin by detailing the face detection method, which will subsequently be translated to VHDL. This method is based on Haar-like descriptors [3], [4], [5], [6]. An image is fed into a tree of classification functions based on the AdaBoost algorithm and composed of 22 cascaded stages. A positive result from the first classification function triggers the evaluation of the second one, and so on; a negative result from any classification function leads to the rejection of the sub-window. This tree of functions is applied at every point of the input image and at every scale, so a loop over scales is applied to the input image in order to detect all the faces it contains. Our contribution is to translate the face detection block to VHDL working on a single scale, as shown in "Figure 5".

Figure 5. Diagram of the Haar face detection method

4 Hardware Implementation of the Face Detection Block on FPGA

4.1 Architecture of the Algorithm

The previous algorithm has been translated into VHDL. It has one entity, "Classifier_kernel", with 5 inputs and 7 outputs. The architecture of this algorithm "Figure 6", called "code", includes 7 components (Feat_rect0_rom, Feat_rect1_rom, ...) that contain the values of the AdaBoost classification functions [7], [8]. It also includes 7 processes. We applied our detection algorithm to a 20x20 image.
Figure 6. Architecture of the face detection bloc
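The cascade control flow of Section 3, which the "Classifier_kernel" entity realizes in hardware, can be modeled in plain C as below. This is a minimal sketch that assumes the per-stage weak-classifier responses have already been computed; the types and values are illustrative, not the paper's ROM data.

```c
#include <stddef.h>

/* One stage of the cascade: the sum of its weak-classifier responses
   must reach the stage threshold for the sub-window to survive. */
typedef struct {
    double stage_threshold;   /* the accumulated score must reach this   */
    size_t n_weak;            /* number of weak classifiers in the stage */
    const double *scores;     /* pre-computed weak-classifier responses  */
} Stage;

/* Returns 1 if the sub-window passes every stage (face), 0 otherwise. */
int cascade_classify(const Stage *stages, size_t n_stages) {
    for (size_t s = 0; s < n_stages; ++s) {
        double sum = 0.0;
        for (size_t w = 0; w < stages[s].n_weak; ++w)
            sum += stages[s].scores[w];
        if (sum < stages[s].stage_threshold)
            return 0;         /* early rejection: later stages are skipped */
    }
    return 1;                 /* all stages passed: face detected */
}
```

The early-return on a failing stage is what makes the cascade cheap on average: most non-face sub-windows are discarded in the first few stages.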
4.2 Simulation Results

4.2.1 Test of Positive Examples

We tested a set of positive images "Figure 7".

Figure 7. Test images for positive examples

These images are called positive because they successfully pass the 22 stages of the cascade when all the AdaBoost descriptors are applied "Figure 8". The behavior and the number of cycles (2580.5 cycles) are the same for every positive example, since the image successfully passes all the stages of the AdaBoost algorithm "Figure 9".

Figure 8. Processing for face detection

From the first AdaBoost descriptor, which tests whether the eye region is darker than the cheek region, we were able to extract the coordinates of the eyes; from the positions of the eyes, we then localized the face. The image passed all the stages (22 stages): Face Detected.

Figure 9. Simulation of positive examples

The program that we implemented in VHDL and synthesized with Xilinx ISE gives good results.

4.2.2 Test of Negative Examples

We then tested a set of images that either contain no face or contain a poorly framed face. The simulation results vary depending on the position and/or the existence of a face in the image. Note that our VHDL code is suited to face detection at a single scale; it therefore requires good positioning to correctly detect a true face.

Example 1: This image "Figure 10" contains only part of a face. The simulation gave a negative result: the image failed to pass stage number 6, which interrupted the detection process, and the signal Failed was set to 1 "Figure 11".

Figure 10. Negative example n°1: the image has failed at the 6th stage and the test is interrupted

Figure 11. Simulation of negative example n°1

Example 2: This image "Figure 12" does not contain any face. The detection process failed from the first stage and the signal Failed was set to 1, indicating that there is no face in the picture "Figure 13".

Figure 12. Negative example n°2: the image has failed at the 1st stage and the test is interrupted

Figure 13. Simulation of negative example n°2
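The Haar-like descriptors evaluated above (for instance the first one, which tests whether the eye region is darker than the cheek region) reduce to differences of pixel sums over adjacent rectangles; with an integral image, each rectangle sum takes only four lookups. A minimal C sketch follows, with an illustrative 8x8 image size (the paper's windows are 20x20).

```c
enum { W = 8, H = 8 };   /* illustrative size; the paper uses 20x20 windows */

/* ii[y][x] holds the sum of img over the rectangle [0, x) x [0, y). */
void integral_image(int img[H][W], long ii[H + 1][W + 1]) {
    for (int y = 0; y <= H; ++y)
        for (int x = 0; x <= W; ++x)
            ii[y][x] = (x > 0 && y > 0)
                ? img[y - 1][x - 1] + ii[y - 1][x] + ii[y][x - 1] - ii[y - 1][x - 1]
                : 0;
}

/* Sum of img over the w-by-h rectangle with top-left corner (x, y):
   four lookups, whatever the rectangle size. */
long rect_sum(long ii[H + 1][W + 1], int x, int y, int w, int h) {
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x];
}

/* Two-rectangle feature: positive when the lower band (cheeks) is
   brighter than the upper band (eyes). */
long eye_cheek_feature(long ii[H + 1][W + 1]) {
    return rect_sum(ii, 0, H / 2, W, H / 2) - rect_sum(ii, 0, 0, W, H / 2);
}
```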
4.3 Co-Simulation

Since we worked with signals, it is always difficult for a user unfamiliar with VHDL to interpret them. For this reason we carried out a co-simulation between Matlab and ModelSim using the Simulink tool. "Figure 14" gives an overview of the co-simulation chain: the VHDL block running in ModelSim, a data type converter, a "Locate" block, and the display of the results.

Figure 14. Co-simulation

Similarly, we visualized the position of the face and the eyes in other images taken with different positioning "Figure 15".

Figure 15. Localization of the face and the eyes in the test images

Simulink thus proved to be a good simulation tool, since it allows the MATLAB code and the VHDL code to be simulated simultaneously. Its use in this work allowed a good visualization of the results on images. Integrating this code on the FPGA then ensures automatic and effective detection in real time.

4.4 Implementation on Spartan3A Board

The synthesis of the architecture is performed under cost constraints (time-to-market, area, power consumption, etc.). We carried out the synthesis on a Xilinx Spartan 3A of type XC3S200A, package FG320. To optimize our detection algorithm in terms of resources used and design time, we use ROM IP blocks to store the predefined AdaBoost values. To this end, we built six ROM blocks to store the following values: the characteristics of the descriptors (width, height, weight, ...), the left and right values of the weak classifiers, the thresholds of the weak classifiers, and the stage thresholds. Each ROM block is an IP core implemented on the Spartan FPGA.

Table I. Synthesis Results on Spartan3A

Logic Utilization             Used    Available   %
Number of Slices              7830    1792        436%
Number of Slice Flip Flops    6978    3584        194%
Number of 4 input LUTs        8645    3584        241%
Number of bonded IOBs         55      248         22%
Number of BRAMs               25      16          156%
Number of MULT18X18SIOs       3       16          18%
Number of GCLKs               3       24          12%

Following the synthesis on the Spartan3A board "Table I", we observed an overflow in the number of slices, LUTs and block RAMs used; we therefore decided to use another FPGA board, the Spartan3A DSP.

4.5 Implementation on Spartan3A DSP Board

We repeated the same work on a more recent board with a larger capacity, the Spartan3A DSP of type XC3SD3400A, package FG676.

Table II. Synthesis Results on Spartan3A DSP

Logic Utilization                                Used     Available   %
Number of Slice Flip Flops                       6,981    47,744      14%
Number of 4 input LUTs                           8,560    47,744      17%
Number of occupied Slices                        7,620    23,872      31%
  Number of Slices containing only related logic 7,620    7,620       100%
  Number of Slices containing unrelated logic    0        7,620       0%
Total number of 4 input LUTs                     8,593    47,744      17%
  Number used as logic                           8,560
  Number used as route-thru                      33
Number of bonded IOBs                            55       469         11%
Number of BUFGMUXs                               3        24          12%
Number of DSP48As                                3        126         2%
Number of RAMB16BWERs                            22       126         17%
Average Fanout of Non-Clock Nets                 4.00

The period of the clock CLK is 14.136 ns (frequency: 70.741 MHz). Analyzing the results "Table II", we note that all utilization figures are within capacity, i.e. there is no overflow in the resources used. The use of prefabricated IPs saved us design time and time-to-market.

Table III. Comparison of Results

Estimated runtime in C     Estimated runtime in VHDL on FPGA
8.077 ms                   0.036 ms

Comparing the implementations of the face detection block in C and in VHDL, both using a single scale, we gained in runtime (a speedup of roughly 224x) as well as in design time through the use of IPs "Table III".
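The ROM blocks of Section 4.4 store, among other values, the left and right values and thresholds of the weak classifiers; each weak classifier is therefore a decision stump. The C model below uses illustrative field names and sample numbers, not the actual AdaBoost training data.

```c
/* Decision-stump model of one AdaBoost weak classifier, mirroring the
   kind of values stored in the ROM blocks (threshold, left and right
   values). Field names and sample values are illustrative only. */
typedef struct {
    double threshold;   /* weak-classifier threshold          */
    double left_val;    /* vote when feature <  threshold     */
    double right_val;   /* vote when feature >= threshold     */
} WeakClassifier;

/* Evaluate one stump on a Haar feature response. */
double stump_eval(const WeakClassifier *wc, double feature) {
    return (feature < wc->threshold) ? wc->left_val : wc->right_val;
}
```

In hardware, each field maps naturally to one ROM lookup per weak classifier, which is why the predefined IP ROMs shorten the design time.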
5 Conclusion
In this paper, we reviewed the principal detection and tracking methods recently used in human-computer interaction systems, then detailed the face detection method used. We described the basic functioning of a human-computer interaction system that consists in tracking the eyes and detecting blinks with a USB camera. We optimized this system by adding a face detection step to make the detection and localization of the eyes more efficient. We also studied the runtime of the different stages of the system and decided to implement the face detection phase on an FPGA, since it takes the most time in the system. We therefore wrote the VHDL code of the method and implemented it on the Spartan board with an embedded MicroBlaze processor, using IP cores to store the values of the AdaBoost learning algorithm. This implementation saved runtime, but also design time ("time-to-market") thanks to the predefined IPs. Moreover, we demonstrated the effectiveness of the face and eye detection algorithm that we developed in VHDL by visualizing the results on a set of images through a co-simulation using the Matlab Simulink tool. Among our nearest perspectives, we want to improve our face detection algorithm to operate on multiple scales, in order to scan the entire image and detect all the faces it contains. We will also use our hardware implementation of the face detection method in a co-design flow, to build a HW/SW implementation meeting the real-time constraint. Finally, we intend to build our own AdaBoost training base from positive and negative images, and to extend it to recognize the emotions and attitude of the persons present in the image.
6 References
[1] M. Chau and M. Betke, "Real Time Eye Tracking and Blink Detection with USB Cameras", Boston University Technical Report No. 2005-12, 2005.
[2] OpenCV library, http://sourceforge.net/projects/opencvlibrary
[3] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features", in Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition, 2001.
[4] P. Viola and M. Jones, "Robust real-time object detection", Intl. J. Computer Vision, 57(2):137-154, 2004.
[5] N. Besbes, "Indexation en intervenant d'un document vidéo par identification du visage", master's thesis, University Paul Sabatier Toulouse III, Doctoral School of Computer Science and Telecommunications, 2007.
[6] P. Negri, "Détection et Reconnaissance d'objets structurés : Application aux Transports Intelligents", thesis, University Pierre et Marie Curie - Paris VI, Institute of Intelligent Systems and Robotics, 2008.
[7] S. Zhao, "Apprentissage et Recherche par le Contenu Visuel de Catégories Sémantiques d'Objets Vidéo", master's thesis, University Paris Descartes, Image and Signal Processing teams, CNRS, France, 2007.
[8] J.U. Cho, S. Mirzaei, J. Oberg and R. Kastner, "FPGA-Based Face Detection System Using Haar Classifiers", International Symposium on Field Programmable Gate Arrays (FPGA), 2009.