REAL TIME MULTIPLE FACE RECOGNITION SECURITY SYSTEM (RTM-FS)
College of Engineering Perumon - EC - Project 2013


ACKNOWLEDGEMENT

It is with great pleasure and pride that we present this report before you. At this moment of triumph, it would be unfair to neglect all those who helped us in the successful completion of this project. First of all, we would like to place ourselves at the feet of God Almighty for his everlasting love and for the blessings and courage he gave us, which made it possible to see through the turbulence and set us on the right path. We would like to thank our Principal, Dr. Z. A. Zoya, for providing the proper ambience to go on with the project. We would also like to thank our head of the department, Mrs. Ananda Resmi, for all the help and guidance that she provided to us. We are grateful to our project coordinator, Mr. Sudheer V. R., Assistant Professor in the Department of Electronics and Communication, for his guidance and wholehearted support. We would also take this opportunity to thank our friends, who were always a source of encouragement.


ABSTRACT

In this project we try to understand the working and implementation of a multiple face detection cum recognition system, strictly at an undergraduate level of understanding. This covers familiarization with the topic, details of the creation of a working model, and the testing of the same. The scope of the discussion is to understand the working of the face recognition system and its implementation model. The project is designed to improve automated security systems. The system identifies people in real time and sounds an alarm in case a person is recognized as dangerous by law enforcement agencies or by the user himself. In essence, by implementing this system we can alert the user when a person recorded in the database comes under our surveillance camera.


TABLE OF CONTENTS

ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
CHAPTER 1  INTRODUCTION
CHAPTER 2  HISTORY
CHAPTER 3  OVERVIEW OF THE SYSTEM
CHAPTER 4  DESCRIPTION
CHAPTER 5  REQUIREMENTS
CHAPTER 6  IMPLEMENTATION DETAILS
CHAPTER 7  TEST AND TRIAL
CHAPTER 8  APPLICATION OF THE PROPOSED SYSTEM
CHAPTER 9  CONCLUSION
REFERENCE
APPENDIX

LIST OF FIGURES

FIG 4.1  EXAMPLE OF HAAR FEATURES
FIG 4.2  USE OF HAAR CASCADE
FIG 4.3.1  COMPUTATION OF INTEGRAL IMAGE
FIG 4.3.2  ALGORITHM FLOW CHART
FIG 4.4  FACE RECOGNITION SCHEMATICS
FIG 4.5  M-TRAINING FACES
FIG 4.6  K-EIGEN FACES
FIG 4.7  EIGEN FACE REPRESENTATION
FIG 4.8  WHAT PCA DOES
FIG 4.9  DIMENSIONALITY REDUCTION
FIG 4.10  REPRESENTATION OF MEAN IMAGE
FIG 6.1  EMGU CV ARCHITECTURE

CHAPTER 1 – INTRODUCTION

Human face recognition has drawn considerable attention from researchers in recent years. An automatic face recognition system will find many applications in areas such as human-computer interfaces, model-based video coding and security control systems. In addition, face recognition has the potential of being a non-intrusive form of biometric identification. The difficulties of face recognition lie in the inherent variability arising from face characteristics (age, gender and race), geometry (distance and viewpoint), image quality (resolution, illumination, signal to noise ratio), and image content (background, occlusion and disguise). Because of such complexity, most face recognition systems to date assume a well-controlled environment and recognize only near-frontal faces. However, these constraints need to be relaxed in practice. Also, in applications such as video database search, a person's face can appear in arbitrary backgrounds with unknown size and orientation. Thus there is a need for robust face recognition systems to handle these uncertainties. People have an amazing ability to recognize and remember thousands of faces.

Face is an important part of who you are and how people identify you. While humans have had the innate ability to recognize and distinguish faces for millions of years, computers are just catching up. Face recognition is a fascinating problem with important commercial applications such as mug shot matching, crowd surveillance and witness face reconstruction. In computer vision, most of the popular face recognition algorithms have been biologically motivated. Using these models, researchers can quantify the similarity between faces: images whose projections are close in face space are likely to be from the same individual. The results of these models can be compared with human perception to determine whether distance in face space corresponds to the human notion of facial similarity. Biometrics is used for that purpose.

1.1 What is biometrics?


A biometric is a unique, measurable characteristic of a human being that can be used to automatically recognize an individual or verify an individual's identity. Biometrics can measure both physiological and behavioral characteristics.

Physiological biometrics (based on measurements and data derived from direct measurement of a part of the human body) include:
a. Finger scan
b. Facial recognition
c. Iris scan
d. Retina scan
e. Hand scan

Behavioral biometrics (based on measurements and data derived from an action) include:
a. Voice scan
b. Signature scan
c. Keystroke scan

A biometric system refers to the integrated hardware and software used to conduct biometric identification and verification.

1.2 Why choose face recognition over other biometrics?

• It is non-intrusive and requires no physical interaction on the part of the user.
• It is accurate and allows for high enrollment and verification rates.
• It does not require an expert to interpret the comparisons.
• It can use your existing hardware infrastructure; existing cameras and image capture devices will work with no problem.
• You can use existing images without having to re-enroll every user (e.g. passports, ID cards, drivers licenses).
• It is the only biometric that allows passive identification in one-to-many environments (e.g. identifying a terrorist in a busy airport terminal).

1.3 What is a face recognition system?

In clear terms, a face recognition system is a system which turns your face into computer code so that it can be compared with thousands of faces. In order for a face recognition system to work, it has to know what a basic face looks like. A face recognition system is based on the ability to first recognize faces, which is a technological feat in itself, and then measure the various features of each face. If you look into a mirror you can see that your face has certain distinguishable landmarks, sometimes called nodal points. There are about 80 nodal points on a human face, such as:
a. distance between the eyes
b. width of the nose
c. depth of the eye sockets
d. cheekbones
e. jaw line
f. chin
These nodal points are used to create a numerical code, a string of numbers that represents the face in the database (called a faceprint). Only 14-22 nodal points are needed to complete the recognition process.

The security system detects faces from live video, recognizes them, and sounds an alarm in case of a security breach. The system uses OpenCV as the image processing tool and a server system as the hardware. Since our system is a real-time one, we need to select an accurate and fast algorithm. Among the several algorithms available, the most promising for face detection is Viola-Jones using AdaBoost (~95% accuracy), and for recognition it is PCA Eigen faces (~75% accuracy).


CHAPTER 2: HISTORY OF FACE RECOGNITION

1960s - First semi-automated system: The first semi-automated facial recognition programs were created by Woody Bledsoe, Helen Chan Wolf, and Charles Bisson. Their programs required the administrator to locate features such as the eyes, ears, nose, and mouth on the photograph; the program then calculated distances and ratios to a common reference point, which were compared to reference data.

1970s - Goldstein, Harmon, and Lesk: Used 21 specific subjective markers, such as hair color and lip thickness, to automate recognition. The measurements and locations needed to be computed manually, so the program required a great deal of labor time.

1988 - Kirby and Sirovich: Applied principal component analysis, a standard linear algebra technique, to the face recognition problem. This is considered a milestone because it showed that fewer than one hundred values were required to accurately code a suitably aligned and normalized face.


CHAPTER 3- OVERVIEW OF THE SYSTEM

Overview

• Real time operation
• Viola-Jones face detection using AdaBoost (~95% accuracy)
• PCA Eigen faces recognition (~75% accuracy)
• Server hardware

BLOCK DIAGRAM

The block diagram of the system: a web camera feeds the AMD FX4100 based server through a USB port; the server drives the monitor through an HDMI port and raises alarms through a buzzer and speech output.


CHAPTER 4- DESCRIPTION

4.1 Flow Chart

The flow chart of the system: each video frame is passed to the Haar-cascade (Viola-Jones) face detector, which classifies regions as Face or Not Face; detected faces are passed, together with the training images, to the PCA Eigen face recognizer, which outputs a Yes/No recognition result.

4.2 Face Detection

Face detection is a computer vision technology that determines the locations and sizes of human faces in arbitrary (digital) images. It detects facial features and ignores anything else, such as buildings, trees and bodies. Face detection can be regarded as a specific case of object-class detection. In object-class detection, the task is to find the locations and sizes of all objects in a digital image that belong to a given class. Examples include upper torsos, pedestrians, and cars. There are many ways to detect a face in a scene - easier and harder ones. Here is a list of the most common approaches in face detection:

• Finding faces in images with controlled background
• Finding faces by color
• Finding faces by motion
• Using a mixture of the above
• Finding faces in unconstrained scenes:
  - Neural Net approach
  - Neural Nets using statistical cluster information
  - Model-based Face Tracking
  - Weak classifier cascades

We use the Viola-Jones method for face detection because it gives about 95% accuracy.

4.2.1 How Face Detection Works

OpenCV's face detector uses a method that Paul Viola and Michael Jones published in 2001. Usually called simply the Viola-Jones method, or even just Viola-Jones, this approach to detecting objects in images combines four key concepts:

• Simple rectangular features, called Haar features
• An Integral Image for rapid feature detection
• The AdaBoost machine-learning method
• A cascaded classifier to combine many features efficiently

Fig: 4.1 Examples of the Haar features used in OpenCV


The features that Viola and Jones used are based on Haar wavelets. Haar wavelets are single-wavelength square waves (one high interval and one low interval). In two dimensions, a square wave is a pair of adjacent rectangles - one light and one dark. The actual rectangle combinations used for visual object detection are not true Haar wavelets. Instead, they contain rectangle combinations better suited to visual recognition tasks. Because of that difference, these features are called Haar features, or Haar-like features, rather than Haar wavelets. Figure 4.1 shows the features that OpenCV uses.

Fig: 4.2 Use of Haar cascade in face

The presence of a Haar feature is determined by subtracting the average dark-region pixel value from the average light-region pixel value. If the difference is above a threshold (set during learning), that feature is said to be present. To determine the presence or absence of hundreds of Haar features at every image location and at several scales efficiently, Viola and Jones used a technique called an Integral Image. In general, "integrating" means adding small units together. In this case, the small units are pixel values. The integral value for each pixel is the sum of all the pixels above it and to its left. Starting at the top left and traversing to the right and down, the entire image can be integrated with a few integer operations per pixel. After integration, the value at each pixel location, (x,y), contains the sum of all pixel values within a rectangular region that has one corner at the top left of the image and the other at location (x,y). To find the average pixel value in this rectangle, you'd only need to divide the value at (x,y) by the rectangle's area.

Fig: 4.3.1 Computation of Integral Image

But what if you want to know the summed values for some other rectangle, one that doesn't have one corner at the upper left of the image? Figure 4.3.1 shows the solution to that problem. Suppose you want the summed values in D. You can think of that as being the sum of pixel values in the combined rectangle, A+B+C+D, minus the sums in rectangles A+B and A+C, plus the sum of pixel values in A. In other words, D = A+B+C+D - (A+B) - (A+C) + A.

Conveniently, A+B+C+D is the Integral Image's value at location 4, A+B is the value at location 2, A+C is the value at location 3, and A is the value at location 1. So, with an Integral Image, you can find the sum of pixel values for any rectangle in the original image with just three integer operations: (x4, y4) - (x2, y2) - (x3, y3) + (x1, y1). To select the specific Haar features to use, and to set threshold levels, Viola and Jones use a machine-learning method called AdaBoost. AdaBoost combines many "weak" classifiers to create one "strong" classifier. "Weak" here means the classifier only gets the right answer a little more often than random guessing would. That's not very good. But if you had a whole lot of these weak classifiers, and each one "pushed" the final answer a little bit in the right direction, you'd have a strong, combined force for arriving at the correct solution. AdaBoost selects a set of weak classifiers to combine and assigns a weight to each. This weighted combination is the strong classifier.
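The rectangle-sum identity above is easy to express in code. The following is a minimal sketch (our own illustration with made-up names, not the OpenCV implementation) that builds an integral image for a grayscale frame stored as a 2-D byte array and then sums any rectangle with just four look-ups:

// Build an integral image: ii[y, x] holds the sum of all pixels above and to the left of (x, y).
// An extra row and column of zeros keeps the border cases simple.
static long[,] BuildIntegralImage(byte[,] gray)
{
    int h = gray.GetLength(0), w = gray.GetLength(1);
    var ii = new long[h + 1, w + 1];
    for (int y = 1; y <= h; y++)
        for (int x = 1; x <= w; x++)
            ii[y, x] = gray[y - 1, x - 1] + ii[y - 1, x] + ii[y, x - 1] - ii[y - 1, x - 1];
    return ii;
}

// Sum of the w-by-h rectangle whose top-left corner is (x, y), using
// D = (A+B+C+D) - (A+B) - (A+C) + A, i.e. four table look-ups.
static long RectSum(long[,] ii, int x, int y, int w, int h)
{
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x];
}

A Haar feature value is then simply the difference between the average pixel values of its light and dark rectangles, each average obtained from RectSum divided by the rectangle's area.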


Fig 4.3.2 the classifier cascade is a chain of filters. Image sub regions that make it through the entire cascade are classified as "Face." All others are classified as "Not Face."

Viola and Jones combined a series of AdaBoost classifiers as a filter chain, shown in Figure 4.3.2, that is especially efficient for classifying image regions. Each filter is a separate AdaBoost classifier with a fairly small number of weak classifiers.

The acceptance threshold at each level is set low enough to pass all, or nearly all, face examples in the training set. The filters at each level are trained to classify training images that passed all previous stages. (The training set is a large database of faces, maybe a thousand or so.) During use, if any one of these filters fails to pass an image region, that region is immediately classified as "Not Face." When a filter passes an image region, it goes to the next filter in the chain. Image regions that pass through all filters in the chain are classified as "Face." Viola and Jones dubbed this filtering chain a cascade.
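The cascade itself can be sketched in a few lines. The fragment below is our own illustration (not the OpenCV code): each stage is modelled as a boosted classifier that either passes or rejects a candidate region, and the region is discarded as soon as any stage rejects it.

// A stage is modelled here as a function that accepts or rejects a candidate region.
// Real stages are weighted sums of weak classifiers compared against a stage threshold.
static bool ClassifyRegion(Rectangle region, List<Func<Rectangle, bool>> stages)
{
    foreach (var stage in stages)
    {
        if (!stage(region))
            return false;   // "Not Face": most non-face regions exit after the first few cheap stages
    }
    return true;            // survived every filter in the chain: "Face"
}

Because the early stages use only a handful of features, the vast majority of image regions are rejected almost for free, which is what makes the detector fast enough for real-time use.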


The order of filters in the cascade is based on the importance weighting that AdaBoost assigns. The more heavily weighted filters come first, to eliminate non-face image regions as quickly as possible.

4.3 Face Recognition

Face recognition is the task of identifying an already detected object as a 'known' or 'unknown' face and, in more advanced cases, telling exactly whose face it is.


Fig: 4.4 Face recognition schematic

A. Recognition algorithms can be divided into two main approaches:
1. Geometric: which looks at distinguishing features.
2. Photometric: a statistical approach that distills an image into values and compares the values with templates to eliminate variances.

B. Popular recognition algorithms include:
1. Principal Component Analysis using Eigen faces,
2. Linear Discriminant Analysis,
3. Elastic Bunch Graph Matching using the Fisherface algorithm,
4. The Hidden Markov model, and
5. The neuronal motivated dynamic link matching.

4.3.1 PCA - Eigen Face algorithm for Face Recognition

The PCA-based Eigen face method is the most basic and simplest of the efficient face recognition algorithms and is therefore a great place for beginners to start learning face recognition.

No face recognition algorithm is yet 100% accurate: it may reach 100% accuracy on some inputs, but not always, so no existing face recognition algorithm is completely foolproof. That is why optimizing face recognition so that it gives near-perfect accuracy in real-time, critical environments is still a very hot topic of research today.


Likewise, the PCA-based Eigen faces method is not 100% accurate; on average it reaches about 70% to 75% accuracy. However, it works well enough to be used in a beginner or hobbyist robotics/computer vision project, because even the better existing face recognition algorithms are still not 100% accurate, and those algorithms, though better than PCA-based Eigen faces, carry a bigger "coding" overhead to implement in our project.

4.3.2 Working of PCA-based Eigen faces method

The task of facial recognition is discriminating input signals (image data) into several classes (persons). The input signals are highly noisy (e.g. the noise is caused by differing lighting conditions, pose etc.), yet the input images are not completely random, and in spite of their differences there are patterns which occur in any input signal. Such patterns, which can be observed in all signals, could be - in the domain of facial recognition - the presence of some objects (eyes, nose, mouth) in any face as well as the relative distances between these objects. These characteristic features are called eigenfaces in the facial recognition domain (or principal components generally). They can be extracted out of original image data by means of a mathematical tool called Principal Component Analysis (PCA).

By means of PCA one can transform each original image of the training set into a corresponding eigenface. An important feature of PCA is that one can reconstruct any original image from the training set by combining the eigenfaces. Remember that eigenfaces are nothing less than characteristic features of the faces. Therefore one could say that the original face image can be reconstructed from eigenfaces if one adds up all the eigenfaces (features) in the right proportion. Each eigenface represents only certain features of the face, which may or may not be present in the original image. If the feature is present in the original image to a higher degree, the share of the corresponding eigenface in the "sum" of the eigenfaces should be greater. If, on the contrary, the particular feature is not (or almost not) present in the original image, then the corresponding eigenface should contribute a smaller part (or none at all) to the sum of eigenfaces.

So, in order to reconstruct the original image from the eigenfaces, one has to build a kind of weighted sum of all eigenfaces. That is, the reconstructed original image is equal to a sum of all eigenfaces, with each eigenface having a certain weight. This weight specifies to what degree the specific feature (eigenface) is present in the original image. If one uses all the eigenfaces extracted from the original images, one can reconstruct the original images from the eigenfaces exactly. But one can also use only a part of the eigenfaces; then the reconstructed image is only an approximation of the original image. However, one can ensure that the losses due to omitting some of the eigenfaces are minimized. This happens by choosing only the most important features (eigenfaces). Omission of eigenfaces is necessary due to the scarcity of computational resources.
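The weighted-sum idea can be written down compactly. The sketch below is our own illustration (with made-up array names, not the project code): the first function computes the weights of an image by projecting it onto the eigenfaces, and the second rebuilds an approximation of the image from the mean face, the eigenfaces and those weights.

// Project a face onto K eigenfaces: weight[k] = dot(image - mean, eigenface[k]).
// 'image', 'mean' and each 'eigenfaces[k]' are flattened images of the same length.
static float[] Project(float[] image, float[] mean, float[][] eigenfaces)
{
    var weights = new float[eigenfaces.Length];
    for (int k = 0; k < eigenfaces.Length; k++)
        for (int i = 0; i < image.Length; i++)
            weights[k] += (image[i] - mean[i]) * eigenfaces[k][i];
    return weights;
}

// Reconstruct an approximation of the face: mean + sum_k ( weight[k] * eigenface[k] ).
static float[] Reconstruct(float[] mean, float[][] eigenfaces, float[] weights)
{
    var result = (float[])mean.Clone();
    for (int k = 0; k < eigenfaces.Length; k++)
        for (int i = 0; i < result.Length; i++)
            result[i] += weights[k] * eigenfaces[k][i];
    return result;
}

Using all of the available eigenfaces reproduces the training images exactly; keeping only the K most significant ones gives the approximation described above.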


How does this relate to facial recognition? The clue is that it is possible not only to extract the face from eigenfaces given a set of weights, but also to go the opposite way: to extract the weights from the eigenfaces and the face to be recognized. These weights tell nothing less than the amount by which the face in question differs from the "typical" faces represented by the eigenfaces. Therefore, using these weights one can determine two important things:

1. Determine whether the image in question is a face at all. In case the weights of the image differ too much from the weights of face images (i.e. images from which we know for sure that they are faces), the image probably is not a face.

2. Similar faces (images) possess similar features (eigenfaces) to similar degrees (weights). If one extracts weights from all the images available, the images could be grouped into clusters. That is, all images having similar weights are likely to be similar faces.
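A hedged sketch of the matching step itself (again our own illustration, with an illustrative threshold, not the project's exact code): compare the weight vector of the face in question with the stored weight vectors of the training faces and pick the closest one, rejecting the match if even the closest one is too far away.

// Returns the label of the nearest training face in eigen (weight) space,
// or the empty string if nothing is within the distance threshold.
static string RecognizeByWeights(float[] probe, float[][] trained, string[] labels, double threshold)
{
    int best = 0;
    double bestDist = double.MaxValue;
    for (int i = 0; i < trained.Length; i++)
    {
        double dist = 0;
        for (int k = 0; k < probe.Length; k++)
        {
            double d = probe[k] - trained[i][k];
            dist += d * d;                      // squared Euclidean (eigen) distance
        }
        if (dist < bestDist) { bestDist = dist; best = i; }
    }
    return Math.Sqrt(bestDist) <= threshold ? labels[best] : "";
}

This is essentially what the EigenObjectRecognizer class listed later in this report does in its FindMostSimilarObject and Recognize methods.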

4.3.3 Computation by the PCA-Eigen Face Method

The initial condition when doing PCA is that the training set images and the known face image must be the same size. The PCA Eigen face method converts each of these images into vector matrices and works on these vector forms.

Fig 4.5 M, training faces

Fig 4.6 K, Eigen Faces


The PCA is used to generate K Eigen faces for a training set of M images, where K < M.

CHAPTER 6 - IMPLEMENTATION DETAILS

6.1.4.1 Camera Capture

STEP-1: Create a new Visual C# project in Visual Studio (File > New > Project) as follows:

STEP-2: In the Visual C# project menu, select "Windows Forms Application", name the project "Camera Capture", and click "OK".

STEP-3: Let's first add the Emgu references to our project (though you can add them at any time later, you must add the references before debugging). Select the Browse tab in the window that pops up, go to Emgu CV's bin folder as in the Level-0 tutorial, select the following three .dll files (Emgu.CV.dll, Emgu.CV.UI.dll and Emgu.Util.dll), and click OK to continue.

STEP-5: Rename Form1.cs to CameraCapture.cs and change its Text field to "Camera Output". Add the Emgu CV tools to your Visual Studio toolbox, because we will be using those tools, such as the ImageBox. Add a button to the form and do some more required "housekeeping" as below:

ImageBox properties:
(Name): CamImageBox
BorderStyle: FixedSingle

Button properties:
(Name): btnStart
Text: Start!

Then Debug and Save.

6.1.4.2 Face Detection

STEP 1: DECLARE THE CLASSIFIER - Declare an object of class HaarCascade:

private HaarCascade haar;

STEP 2: LOAD THE HaarCascade XML FILE

A classifier uses data stored in an XML file to decide how to classify each image location. So naturally, Haar will need some XML file to load trained data from. You'll need to tell the classifier (the Haar object in this case) where to find the data file you want it to use. It's better to locate the XML file we want to use and make sure our path to it is correct before we code the rest of our face-detection program.

haar = new HaarCascade("haarcascade_frontalface_alt_tree.xml");

STEP 3: SET THE IMAGE SOURCE FOR FACE DETECTION

STEP 4: INSERT THE FACE DETECTION CODE:

var faces = grayframe.DetectHaarCascade(haar, 1.4, 4, HAAR_DETECTION_TYPE.DO_CANNY_PRUNING, new Size(25, 25))[0];   // faces is an MCvAvgComp[]

STEP 5: DEBUG THE PROGRAM
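Putting steps 1-4 together, a minimal per-frame detection pass with the Emgu CV 2.x API used in this report could look like the sketch below; it assumes the grabber (Capture), haar (HaarCascade) and CamImageBox objects declared in the previous sections, and the parameter values (scale factor 1.4, 4 minimum neighbours, 25x25 minimum window) follow the step above and can be tuned.

// Grab a frame, convert it to grayscale, detect faces and draw a rectangle around each one.
Image<Bgr, byte> frame = grabber.QueryFrame().Resize(320, 240, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);
Image<Gray, byte> grayframe = frame.Convert<Gray, byte>();

var faces = grayframe.DetectHaarCascade(
    haar, 1.4, 4,
    HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
    new Size(25, 25))[0];                           // [0] selects the detections of the first channel

foreach (MCvAvgComp face in faces)
    frame.Draw(face.rect, new Bgr(Color.Red), 2);   // mark each detected face in red

CamImageBox.Image = frame;                          // show the annotated frame in the ImageBox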

6.1.4.3 Face Recognition

STEP 1: ENTER THE CRITERIA FOR FACE RECOGNITION

MCvTermCriteria termCrit = new MCvTermCriteria(ContTrain, 0.001);
int thrs = 1000;

STEP 2: ADD THE EIGEN FACE RECOGNIZER CODE

EigenObjectRecognizer recognizer = new EigenObjectRecognizer(
    trainingImages.ToArray(),
    labels.ToArray(),
    thrs,
    ref termCrit);

STEP 3: DRAW THE LABEL FOR EACH FACE DETECTED AND RECOGNIZED

currentFrame.Draw(name, ref font, new Point(f.rect.X - 2, f.rect.Y - 2), new Bgr(Color.LightGreen));
if (name != "")
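Inside the frame-grabber loop, steps 1-3 run once per detected face. A condensed sketch (assuming the currentFrame, result, trainingImages, labels, thrs, ContTrain, font, name and f variables introduced elsewhere in this chapter):

// Crop the detected face, normalize its size, and ask the Eigen recognizer for a label.
result = currentFrame.Copy(f.rect).Convert<Gray, byte>().Resize(100, 100, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);

MCvTermCriteria termCrit = new MCvTermCriteria(ContTrain, 0.001);
EigenObjectRecognizer recognizer = new EigenObjectRecognizer(
    trainingImages.ToArray(), labels.ToArray(), thrs, ref termCrit);

name = recognizer.Recognize(result);            // returns "" when the eigen distance exceeds thrs
currentFrame.Draw(name, ref font, new Point(f.rect.X - 2, f.rect.Y - 2), new Bgr(Color.LightGreen));
if (name != "")
    Console.Beep(2000, 1000);                   // alarm handling follows in section 6.1.4.4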

6.1.4.4 Alarm Out

STEP 1: ADD THE BUZZER OUT

Console.Beep(2000, 1000);

STEP 2: ADD THE SPEECH OUT

Define the speech synthesizer and speak the alert:

new SpeechSynthesizer().Speak("Security Alert Person Identified ");

6.1.5 Program Code

Main Form.cs

using System;
using System.Collections.Generic;
using System.Drawing;
using System.Windows.Forms;
using Emgu.CV;
using Emgu.CV.Structure;
using Emgu.CV.CvEnum;
using System.IO;
using System.Diagnostics;
using System.Speech.Synthesis;

namespace MultiFaceRec
{
public partial class FrmPrincipal : Form
{
//Declaration of all variables, vectors and haarcascades
Image<Bgr, Byte> currentFrame;
Capture grabber;
HaarCascade face;
HaarCascade eye;
MCvFont font = new MCvFont(FONT.CV_FONT_HERSHEY_TRIPLEX, 0.5d, 0.5d);
Image<Gray, byte> result, TrainedFace = null;
Image<Gray, byte> gray = null;
List<Image<Gray, byte>> trainingImages = new List<Image<Gray, byte>>();
List<string> labels = new List<string>();
List<string> NamePersons = new List<string>();
int ContTrain, NumLabels, t;
string name, names = null;

public FrmPrincipal() { InitializeComponent(); //Load haarcascades for face detection face = new HaarCascade("haarcascade_frontalface_default.xml"); eye = new HaarCascade("haarcascade_eye.xml"); try { //Load of previus trainned faces and labels for each image string Labelsinfo = File.ReadAllText(Application.StartupPath + "/TrainedFaces/TrainedLabels.txt"); string[] Labels = Labelsinfo.Split('%'); NumLabels = Convert.ToInt16(Labels[0]); ContTrain = NumLabels; string LoadFaces;


for (int tf = 1; tf < NumLabels+1; tf++) { LoadFaces = "face" + tf + ".bmp"; trainingImages.Add(new Image<Gray, byte>(Application.StartupPath + "/TrainedFaces/" + LoadFaces)); labels.Add(Labels[tf]); } } catch(Exception e) { //MessageBox.Show(e.ToString()); MessageBox.Show("Nothing in binary database, please add at least a face (simply train the prototype with the Add Face button).", "Trained faces load", MessageBoxButtons.OK, MessageBoxIcon.Exclamation); } }

private void button1_Click(object sender, EventArgs e) { //Initialize the capture device grabber = new Capture(); grabber.QueryFrame(); //Initialize the FrameGraber event Application.Idle += new EventHandler(FrameGrabber); button1.Enabled = false; }

private void button2_Click(object sender, System.EventArgs e) { try { //Trained face counter ContTrain = ContTrain + 1; //Get a gray frame from capture device gray = grabber.QueryGrayFrame().Resize(320, 240, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC); //Face Detector MCvAvgComp[][] facesDetected = gray.DetectHaarCascade( face, 1.2, 10, Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING, new Size(20, 20)); //Action for each element detected foreach (MCvAvgComp f in facesDetected[0]) { TrainedFace = currentFrame.Copy(f.rect).Convert<Gray, byte>(); break; } //resize the detected face image to force comparison at the same size as the //test image, with cubic interpolation


TrainedFace = result.Resize(100, 100, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC); trainingImages.Add(TrainedFace); labels.Add(textBox1.Text); //Show face added in gray scale imageBox1.Image = TrainedFace; //Write the number of triained faces in a file text for further load File.WriteAllText(Application.StartupPath + "/TrainedFaces/TrainedLabels.txt", trainingImages.ToArray().Length.ToString() + "%"); //Write the labels of triained faces in a file text for further load for (int i = 1; i < trainingImages.ToArray().Length + 1; i++) { trainingImages.ToArray()[i - 1].Save(Application.StartupPath + "/TrainedFaces/face" + i + ".bmp"); File.AppendAllText(Application.StartupPath + "/TrainedFaces/TrainedLabels.txt", labels.ToArray()[i - 1] + "%"); } MessageBox.Show(textBox1.Text + "´s face detected and added :)", "Training OK", MessageBoxButtons.OK, MessageBoxIcon.Information); } catch { MessageBox.Show("Enable the face detection first", "Training Fail", MessageBoxButtons.OK, MessageBoxIcon.Exclamation); } }

void FrameGrabber(object sender, EventArgs e) { label3.Text = "0"; //label4.Text = ""; NamePersons.Add("");

//Get the current frame from capture device currentFrame = grabber.QueryFrame().Resize(320, 240, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC); //Convert it to Grayscale gray = currentFrame.Convert<Gray, Byte>(); //Face Detector MCvAvgComp[][] facesDetected = gray.DetectHaarCascade( face, 1.2, 10, Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING, new Size(20, 20)); //Action for each element detected foreach (MCvAvgComp f in facesDetected[0]) { t = t + 1;


result = currentFrame.Copy(f.rect).Convert<Gray, byte>().Resize(100, 100, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC); //draw a red rectangle around the face detected currentFrame.Draw(f.rect, new Bgr(Color.Red), 2);

if (trainingImages.ToArray().Length != 0) { //TermCriteria for face recognition with numbers of trained images like maxIteration MCvTermCriteria termCrit = new MCvTermCriteria(ContTrain, 0.001); int thrs = 1000; try { thrs = int.Parse(txtThreshold.Text); } catch (Exception ex) { // MessageBox.Show("Enter integer as threshold value"); }

//Eigen face recognizer EigenObjectRecognizer recognizer = new EigenObjectRecognizer( trainingImages.ToArray(), labels.ToArray(), thrs, ref termCrit); name = recognizer.Recognize(result); //Draw the label for each face detected and recognized currentFrame.Draw(name, ref font, new Point(f.rect.X - 2, f.rect.Y - 2), new Bgr(Color.LightGreen)); if (name != "") { Console.Beep(2000, 1000); //add sound

new SpeechSynthesizer().Speak("Security Alert Person Identified ");

}

} NamePersons[t - 1] = name; NamePersons.Add("");

//Set the number of faces detected on the scene label3.Text = facesDetected[0].Length.ToString(); /* //Set the region of interest on the faces gray.ROI = f.rect; MCvAvgComp[][] eyesDetected = gray.DetectHaarCascade(


eye, 1.1, 10, Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING, new Size(20, 20)); gray.ROI = Rectangle.Empty; foreach (MCvAvgComp ey in eyesDetected[0]) { Rectangle eyeRect = ey.rect; eyeRect.Offset(f.rect.X, f.rect.Y); currentFrame.Draw(eyeRect, new Bgr(Color.Blue), 2); } */ } t = 0; //Names concatenation of persons recognized for (int nnn = 0; nnn < facesDetected[0].Length; nnn++) { names = names + NamePersons[nnn] + ", "; } //Show the faces processed and recognized imageBoxFrameGrabber.Image = currentFrame; label4.Text = names; names = "";

//Clear the list (vector) of names
NamePersons.Clear();

} private void button3_Click(object sender, EventArgs e) { Process.Start("copyright Anand Raj & Team and Open Source"); } private void textBox1_TextChanged(object sender, EventArgs e) { } private void FrmPrincipal_Load(object sender, EventArgs e) { } private void groupBox1_Enter(object sender, EventArgs e) { }


private void textBox2_TextChanged(object sender, EventArgs e) { } } }

Eigen Object Recogniser.cs using System; using System.Diagnostics; using Emgu.CV.Structure; namespace Emgu.CV { /// /// An object recognizer using PCA (Principle Components Analysis) /// [Serializable] public class EigenObjectRecognizer { private Image[] _eigenImages; private Image _avgImage; private Matrix[] _eigenValues; private string[] _labels; private double _eigenDistanceThreshold; /// /// Get the eigen vectors that form the eigen space /// /// The set method is primary used for deserialization, do not attemps to set it unless you know what you are doing public Image[] EigenImages { get { return _eigenImages; } set { _eigenImages = value; } } /// /// Get or set the labels for the corresponding training image /// public String[] Labels { get { return _labels; } set { _labels = value; } } /// /// Get or set the eigen distance threshold. /// The smaller the number, the more likely an examined image will be treated as unrecognized object. /// Set it to a huge number (e.g. 5000) and the recognizer will always treated the examined image as one of the known object. /// public double EigenDistanceThreshold { get { return _eigenDistanceThreshold; } set { _eigenDistanceThreshold = value; } }


/// /// Get the average Image. /// /// The set method is primary used for deserialization, do not attemps to set it unless you know what you are doing public Image AverageImage { get { return _avgImage; } set { _avgImage = value; } } /// /// Get the eigen values of each of the training image /// /// The set method is primary used for deserialization, do not attemps to set it unless you know what you are doing public Matrix[] EigenValues { get { return _eigenValues; } set { _eigenValues = value; } } private EigenObjectRecognizer() { }

/// /// Create an object recognizer using the specific tranning data and parameters, it will always return the most similar object /// /// The images used for training, each of them should be the same size. It's recommended the images are histogram normalized /// The criteria for recognizer training public EigenObjectRecognizer(Image[] images, ref MCvTermCriteria termCrit) : this(images, GenerateLabels(images.Length), ref termCrit) { } private static String[] GenerateLabels(int size) { String[] labels = new string[size]; for (int i = 0; i < size; i++) labels[i] = i.ToString(); return labels; } /// /// Create an object recognizer using the specific tranning data and parameters, it will always return the most similar object /// /// The images used for training, each of them should be the same size. It's recommended the images are histogram normalized /// The labels corresponding to the images /// The criteria for recognizer training public EigenObjectRecognizer(Image[] images, String[] labels, ref MCvTermCriteria termCrit) : this(images, labels, 0, ref termCrit) { }


/// /// Create an object recognizer using the specific tranning data and parameters /// /// The images used for training, each of them should be the same size. It's recommended the images are histogram normalized /// The labels corresponding to the images /// /// The eigen distance threshold, (0, ~1000]. /// The smaller the number, the more likely an examined image will be treated as unrecognized object. /// If the threshold is < 0, the recognizer will always treated the examined image as one of the known object. /// /// The criteria for recognizer training public EigenObjectRecognizer(Image[] images, String[] labels, double eigenDistanceThreshold, ref MCvTermCriteria termCrit) { Debug.Assert(images.Length == labels.Length, "The number of images should equals the number of labels"); Debug.Assert(eigenDistanceThreshold >= 0.0, "Eigen-distance threshold should always >= 0.0"); CalcEigenObjects(images, ref termCrit, out _eigenImages, out _avgImage); /* _avgImage.SerializationCompressionRatio = 9; foreach (Image img in _eigenImages) //Set the compression ration to best compression. The serialized object can therefore save spaces img.SerializationCompressionRatio = 9; */ _eigenValues = Array.ConvertAll(images, delegate(Image img) { return new Matrix(EigenDecomposite(img, _eigenImages, _avgImage)); }); _labels = labels; _eigenDistanceThreshold = eigenDistanceThreshold; } #region static methods /// /// Caculate the eigen images for the specific traning image /// /// The images used for training /// The criteria for tranning /// The resulting eigen images /// The resulting average image public static void CalcEigenObjects(Image[] trainingImages, ref MCvTermCriteria termCrit, out Image[] eigenImages, out Image avg) { int width = trainingImages[0].Width; int height = trainingImages[0].Height;


IntPtr[] inObjs = Array.ConvertAll(trainingImages, delegate(Image img) { return img.Ptr; }); if (termCrit.max_iter trainingImages.Length) termCrit.max_iter = trainingImages.Length; int maxEigenObjs = termCrit.max_iter; #region initialize eigen images eigenImages = new Image[maxEigenObjs]; for (int i = 0; i < eigenImages.Length; i++) eigenImages[i] = new Image(width, height); IntPtr[] eigObjs = Array.ConvertAll(eigenImages, delegate(Image img) { return img.Ptr; }); #endregion avg = new Image(width, height); CvInvoke.cvCalcEigenObjects( inObjs, ref termCrit, eigObjs, null, avg.Ptr); } /// /// Decompose the image as eigen values, using the specific eigen vectors /// /// The image to be decomposed /// The eigen images /// The average images /// Eigen values of the decomposed image public static float[] EigenDecomposite(Image src, Image[] eigenImages, Image avg) { return CvInvoke.cvEigenDecomposite( src.Ptr, Array.ConvertAll(eigenImages, delegate(Image img) { return img.Ptr; }), avg.Ptr); } #endregion /// /// Given the eigen value, reconstruct the projected image /// /// The eigen values /// The projected image public Image EigenProjection(float[] eigenValue) { Image res = new Image(_avgImage.Width, _avgImage.Height); CvInvoke.cvEigenProjection( Array.ConvertAll(_eigenImages, delegate(Image img) { return img.Ptr; }), eigenValue, _avgImage.Ptr, res.Ptr); return res; }


/// /// Get the Euclidean eigen-distance between and every other image in the database /// /// The image to be compared from the training images /// An array of eigen distance from every image in the training images public float[] GetEigenDistances(Image image) { using (Matrix eigenValue = new Matrix(EigenDecomposite(image, _eigenImages, _avgImage))) return Array.ConvertAll(_eigenValues, delegate(Matrix eigenValueI) { return (float)CvInvoke.cvNorm(eigenValue.Ptr, eigenValueI.Ptr, Emgu.CV.CvEnum.NORM_TYPE.CV_L2, IntPtr.Zero); }); } /// /// Given the to be examined, find in the database the most similar object, return the index and the eigen distance /// /// The image to be searched from the database /// The index of the most similar object /// The eigen distance of the most similar object /// The label of the specific image public void FindMostSimilarObject(Image image, out int index, out float eigenDistance, out String label) { float[] dist = GetEigenDistances(image); index = 0; eigenDistance = dist[0]; for (int i = 1; i < dist.Length; i++) { if (dist[i] < eigenDistance) { index = i; eigenDistance = dist[i]; } } label = Labels[index]; } /// /// Try to recognize the image and return its label /// /// The image to be recognized /// /// String.Empty, if not recognized; /// Label of the corresponding image, otherwise /// public String Recognize(Image image) { int index; float eigenDistance; String label; FindMostSimilarObject(image, out index, out eigenDistance, out label);


return (_eigenDistanceThreshold <= 0 || eigenDistance < _eigenDistanceThreshold) ? label : String.Empty;
}
}
}

if( argc >= 2 && strcmp(argv[1], "train") == 0 )
{
char *szFileTrain;
if (argc == 3)


szFileTrain = argv[2]; // use the given arg else { printf("ERROR: No training file given.\n"); return 1; } learn(szFileTrain); } else if( argc >= 2 && strcmp(argv[1], "test") == 0) { char *szFileTest; if (argc == 3) szFileTest = argv[2]; // use the given arg else { printf("ERROR: No testing file given.\n"); return 1; } recognizeFileList(szFileTest); } else { recognizeFromCam(); } return 0; } #if defined WIN32 || defined _WIN32 // Wrappers of kbhit() and getch() for Windows: #define changeKeyboardMode #define kbhit _kbhit #else // Create an equivalent to kbhit() and getch() for Linux, #define VK_ESCAPE 0x1B

// Escape character

// If 'dir' is 1, get the Linux terminal to return the 1st keypress instead of waiting for an ENTER key. // If 'dir' is 0, will reset the terminal back to the original settings. void changeKeyboardMode(int dir) { static struct termios oldt, newt; if ( dir == 1 ) { tcgetattr( STDIN_FILENO, &oldt); newt = oldt; newt.c_lflag &= ~( ICANON | ECHO ); tcsetattr( STDIN_FILENO, TCSANOW, &newt); } else tcsetattr( STDIN_FILENO, TCSANOW, &oldt); } // Get the next keypress. int kbhit(void) { struct timeval tv; fd_set rdfs; tv.tv_sec = 0; tv.tv_usec = 0;


FD_ZERO(&rdfs); FD_SET (STDIN_FILENO, &rdfs); select(STDIN_FILENO+1, &rdfs, NULL, NULL, &tv); return FD_ISSET(STDIN_FILENO, &rdfs); } // Use getchar() on Linux instead of getch(). #define getch() getchar() #endif // Save all the eigenvectors as images, so that they can be checked. void storeEigenfaceImages() { // Store the average image to a file printf("Saving the image of the average face as 'out_averageImage.bmp'.\n"); cvSaveImage("out_averageImage.bmp", pAvgTrainImg); // Create a large image made of many eigenface images. // Must also convert each eigenface image to a normal 8-bit UCHAR image instead of a 32-bit float image. printf("Saving the %d eigenvector images as 'out_eigenfaces.bmp'\n", nEigens); if (nEigens > 0) { // Put all the eigenfaces next to each other. int COLUMNS = 8; // Put upto 8 images on a row. int nCols = min(nEigens, COLUMNS); int nRows = 1 + (nEigens / COLUMNS); // Put the rest on new rows. int w = eigenVectArr[0]->width; int h = eigenVectArr[0]->height; CvSize size; size = cvSize(nCols * w, nRows * h); IplImage *bigImg = cvCreateImage(size, IPL_DEPTH_8U, 1); // 8-bit Greyscale UCHAR image for (int i=0; istep / sizeof(float); for(i=0; idata.fl + i*nEigens); projectedTrainFaceMat->data.fl + i*offset); } // store the recognition data as an xml file storeTrainingData(); // Save all the eigenvectors as images, so that they can be checked. if (SAVE_EIGENFACE_IMAGES) { storeEigenfaceImages(); } } // Open the training data from the file 'facedata.xml'. int loadTrainingData(CvMat ** pTrainPersonNumMat) { CvFileStorage * fileStorage; int i; // create a file-storage interface fileStorage = cvOpenFileStorage( "facedata.xml", 0, CV_STORAGE_READ ); if( !fileStorage ) {


printf("Can't open training database file 'facedata.xml'.\n"); return 0; } // Load the person names. personNames.clear(); // Make sure it starts as empty. nPersons = cvReadIntByName( fileStorage, 0, "nPersons", 0 ); if (nPersons == 0) { printf("No people found in the training database 'facedata.xml'.\n"); return 0; } // Load each person's name. for (i=0; i nPersons) { // Allocate memory for the extra person (or possibly multiple), using this new person's name. for (i=nPersons; i < personNumber; i++) { personNames.push_back( sPersonName ); } nPersons = personNumber; //printf("Got new person -> nPersons = %d [%d]\n", sPersonName.c_str(), nPersons, personNames.size()); } // Keep the data personNumTruthMat->data.i[iFace] = personNumber; // load the face image faceImgArr[iFace] = cvLoadImage(imgFilename, CV_LOAD_IMAGE_GRAYSCALE); if( !faceImgArr[iFace] ) { fprintf(stderr, "Can\'t load image from %s\n", imgFilename); return 0; } } fclose(imgListFile); printf("Data loaded from '%s': (%d images of %d people).\n", filename, nFaces, nPersons); printf("People: "); if (nPersons > 0) printf("", personNames[0].c_str()); for (i=1; idata.i[iNearest]; if (nearest == truth) { answer = "Correct"; nCorrect++; } else { answer = "WRONG!"; nWrong++; } printf("nearest = %d, Truth = %d (%s). Confidence = %f\n", nearest, truth, answer, confidence); } tallyFaceRecognizeTime = (double)cvGetTickCount() timeFaceRecognizeStart; if (nCorrect+nWrong > 0) { printf("TOTAL ACCURACY: %d%% out of %d tests.\n", nCorrect * 100/(nCorrect+nWrong), (nCorrect+nWrong)); printf("TOTAL TIME: %.1fms average.\n", tallyFaceRecognizeTime/((double)cvGetTickFrequency() * 1000.0 * (nCorrect+nWrong) ) );


} } // Grab the next camera frame. Waits until the next frame is ready, // and provides direct access to it, so do NOT modify the returned image or free it! // Will automatically initialize the camera on the first frame. IplImage* getCameraFrame(void) { IplImage *frame; // If the camera hasn't been initialized, then open it. if (!camera) { printf("Acessing the camera ...\n"); camera = cvCaptureFromCAM( 0 ); if (!camera) { printf("ERROR in getCameraFrame(): Couldn't access the camera.\n"); exit(1); } // Try to set the camera resolution cvSetCaptureProperty( camera, CV_CAP_PROP_FRAME_WIDTH, 320 ); cvSetCaptureProperty( camera, CV_CAP_PROP_FRAME_HEIGHT, 240 ); // Wait a little, so that the camera can auto-adjust itself #if defined WIN32 || defined _WIN32 Sleep(1000); // (in milliseconds) #endif frame = cvQueryFrame( camera ); // get the first frame, to make sure the camera is initialized. if (frame) { printf("Got a camera using a resolution of %dx%d.\n", (int)cvGetCaptureProperty( camera, CV_CAP_PROP_FRAME_WIDTH), (int)cvGetCaptureProperty( camera, CV_CAP_PROP_FRAME_HEIGHT) ); } } frame = cvQueryFrame( camera ); if (!frame) { fprintf(stderr, "ERROR in recognizeFromCam(): Could not access the camera or video file.\n"); exit(1); //return NULL; } return frame; } // Return a new image that is always greyscale, whether the input image was RGB or Greyscale. // Remember to free the returned image using cvReleaseImage() when finished. IplImage* convertImageToGreyscale(const IplImage *imageSrc) {


IplImage *imageGrey; // Either convert the image to greyscale, or make a copy of the existing greyscale image. // This is to make sure that the user can always call cvReleaseImage() on the output, whether it was greyscale or not. if (imageSrc->nChannels == 3) { imageGrey = cvCreateImage( cvGetSize(imageSrc), IPL_DEPTH_8U, 1 ); cvCvtColor( imageSrc, imageGrey, CV_BGR2GRAY ); } else { imageGrey = cvCloneImage(imageSrc); } return imageGrey; } // Creates a new image copy that is of a desired size. // Remember to free the new image later. IplImage* resizeImage(const IplImage *origImg, int newWidth, int newHeight) { IplImage *outImg = 0; int origWidth; int origHeight; if (origImg) { origWidth = origImg->width; origHeight = origImg->height; } if (newWidth width && newHeight > origImg->height) { // Make the image larger cvResetImageROI((IplImage*)origImg); cvResize(origImg, outImg, CV_INTER_LINEAR); // CV_INTER_CUBIC or CV_INTER_LINEAR is good for enlarging } else { // Make the image smaller cvResetImageROI((IplImage*)origImg); cvResize(origImg, outImg, CV_INTER_AREA); // CV_INTER_AREA is good for shrinking / decimation, but bad at enlarging. } return outImg; } // Returns a new image that is a cropped version of the original image.


IplImage* cropImage(const IplImage *img, const CvRect region) { IplImage *imageTmp; IplImage *imageRGB; CvSize size; size.height = img->height; size.width = img->width; if (img->depth != IPL_DEPTH_8U) { printf("ERROR in cropImage: Unknown image depth of %d given in cropImage() instead of 8 bits per pixel.\n", img->depth); exit(1); } // First create a new (color or greyscale) IPL Image and copy contents of img into it. imageTmp = cvCreateImage(size, IPL_DEPTH_8U, img->nChannels); cvCopy(img, imageTmp, NULL); // Create a new image of the detected region // Set region of interest to that surrounding the face cvSetImageROI(imageTmp, region); // Copy region of interest (i.e. face) into a new iplImage (imageRGB) and return it size.width = region.width; size.height = region.height; imageRGB = cvCreateImage(size, IPL_DEPTH_8U, img->nChannels); cvCopy(imageTmp, imageRGB, NULL); // Copy just the region. cvReleaseImage( &imageTmp ); return imageRGB; } // Get an 8-bit equivalent of the 32-bit Float image. // Returns a new image, so remember to call 'cvReleaseImage()' on the result. IplImage* convertFloatImageToUcharImage(const IplImage *srcImg) { IplImage *dstImg = 0; if ((srcImg) && (srcImg->width > 0 && srcImg->height > 0)) { // Spread the 32bit floating point pixels to fit within 8bit pixel range. double minVal, maxVal; cvMinMaxLoc(srcImg, &minVal, &maxVal); //cout
