Implementation Discrete Cosine Transform and Radial Basis Function Neural Network in Facial Image Recognition

Marprin H. Muchri1, Samuel Lukas1, David Habsara Hareva1

1 Computer Science Department, Universitas Pelita Harapan, Karawaci, Indonesia [email protected], [email protected], [email protected]

Abstract. Facial image recognition has been widely implemented in many areas of life, such as investigation and security. However, research in this area is still ongoing. The source images for this paper are taken from the image library described in the Collection of Facial Images, 2008. This paper explains how 35 faces in JPG format with dimensions of 180 x 180 pixels can be represented by only 3 x 3 DCT coefficients and recognized with 100% accuracy by a Radial Basis Function Network.

Keywords : Discrete Cosine Transforms; Radial Basis Function; Image recognition

1 Introduction

Face recognition systems fall into two categories: the feature-based approach and the brightness-based approach. The feature-based approach applies special processing to key points of the face image, such as the edges, eyes, nose, mouth or other special characteristics. The calculation covers only those parts of the image that have previously been extracted in a certain way. The brightness-based approach, on the other hand, uses all parts of the image in the calculation. It is therefore also known as the holistic-based approach or image-based approach. Since all parts of the image have to be considered, the brightness-based approach takes longer to process and is also more complicated. To make the process shorter and simpler, the image has to be transformed to a certain model. Many models have been proposed. One of these models was introduced by Turk and Pentland, 1991, using Principal Component Analysis [1][2][3]. Another proposed model applied the Discrete Wavelet Transform (DWT) [4]. In the present paper, the Discrete Cosine Transform (DCT) is applied. After the features of the image have been extracted, the system passes them to a recognition stage. Many recognition models can be used, such as the back propagation neural network [4] and Hidden Markov Models [5][6]. This paper discusses how face recognition can be done by using a Radial Basis Function Network (RBFN).

2 Discrete Cosine Transform and Radial Basis Function Network

The Discrete Cosine Transform (DCT) is a transform coding technique widely used in signal processing and digital image processing. It is derived from the Discrete Fourier Transform (DFT). Because the objects of this paper are images, the two-dimensional DCT (2D-DCT) is implemented [7][8]. The spatial domain of an N x N image $I(r, c)$ is transformed to the frequency domain $C(u, v)$ as stated in (1) and (2), and the inverse transform is stated in (3).

$$C(u, v) = \alpha(u)\,\alpha(v) \sum_{r=0}^{N-1} \sum_{c=0}^{N-1} I(r, c) \cos\left[\frac{(2r+1)u\pi}{2N}\right] \cos\left[\frac{(2c+1)v\pi}{2N}\right] \tag{1}$$

$$\alpha(u), \alpha(v) = \begin{cases} \sqrt{\dfrac{1}{N}} & \text{for } u, v = 0 \\[2ex] \sqrt{\dfrac{2}{N}} & \text{for } u, v = 1, 2, \dots, N-1 \end{cases} \tag{2}$$

$$I(r, c) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \alpha(u)\,\alpha(v)\, C(u, v) \cos\left[\frac{(2r+1)u\pi}{2N}\right] \cos\left[\frac{(2c+1)v\pi}{2N}\right] \tag{3}$$
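As an illustration of Equations (1) to (3), the following is a minimal sketch in plain Python that evaluates the sums directly for a small square image. The function names `alpha`, `dct2` and `idct2` and the 4 x 4 sample values are assumptions made for this example only and are not taken from the authors' implementation. The direct quadruple loop is far too slow for real images, but it mirrors the formulas term by term.

```python
import math

def alpha(u, n):
    """Normalization factor from Equation (2)."""
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

def basis(index, freq, n):
    """Cosine term cos[(2*index + 1) * freq * pi / (2N)] shared by (1) and (3)."""
    return math.cos((2 * index + 1) * freq * math.pi / (2 * n))

def dct2(image):
    """Forward 2-D DCT of a square N x N image, Equation (1)."""
    n = len(image)
    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for r in range(n):
                for c in range(n):
                    total += image[r][c] * basis(r, u, n) * basis(c, v, n)
            coeffs[u][v] = alpha(u, n) * alpha(v, n) * total
    return coeffs

def idct2(coeffs):
    """Inverse 2-D DCT, Equation (3)."""
    n = len(coeffs)
    image = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            total = 0.0
            for u in range(n):
                for v in range(n):
                    total += (alpha(u, n) * alpha(v, n) * coeffs[u][v]
                              * basis(r, u, n) * basis(c, v, n))
            image[r][c] = total
    return image

# Hypothetical 4 x 4 grayscale patch, used only to exercise the functions.
sample = [[52, 55, 61, 66],
          [70, 61, 64, 73],
          [63, 59, 55, 90],
          [67, 61, 68, 104]]

coeffs = dct2(sample)
restored = idct2(coeffs)
max_error = max(abs(restored[r][c] - sample[r][c])
                for r in range(4) for c in range(4))
print("C(0, 0) =", coeffs[0][0])
print("round trip exact up to rounding:", max_error < 1e-9)
```

With this normalization, $C(0, 0)$ equals the sum of all pixel values divided by $N$, and applying (3) to the output of (1) returns the original image up to floating-point rounding.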

The DCT has several important properties: de-correlation, energy compaction, domain scaling, separability, and symmetry. De-correlation means that there is no correlation among the DCT coefficients, so all DCT coefficients can be calculated independently. The DCT exhibits excellent energy compaction for highly correlated images. The efficacy of a transformation scheme can be gauged directly by its ability to pack the input data into as few coefficients as possible without introducing significant visual distortion in the reconstructed image. The DCT is not scaling invariant. This implies that in an image recognition system, all of the images used for training or identification have to be of uniform size. Separability means that the DCT coefficients can be computed in two steps by successive 1-D operations on the rows and columns of an image, as stated in (4). A closer look at the row and column operations in Equation (4) reveals that these operations are functionally identical. Such a transformation is called a symmetric transformation [9].

$$C(u, v) = \alpha(u)\,\alpha(v) \sum_{r=0}^{N-1} \cos\left[\frac{(2r+1)u\pi}{2N}\right] \sum_{c=0}^{N-1} I(r, c) \cos\left[\frac{(2c+1)v\pi}{2N}\right] \tag{4}$$
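The separability stated in Equation (4) can be checked numerically. The sketch below, in plain Python, applies a 1-D DCT to every row and then to every column of the result, and compares each coefficient with the direct double sum. The names `dct1d`, `dct2_separable` and `dct2_direct` and the 3 x 3 test values are hypothetical and chosen only for this illustration.

```python
import math

def dct1d(signal):
    """Orthonormal 1-D DCT of a list of numbers."""
    n = len(signal)
    result = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        total = sum(signal[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    for x in range(n))
        result.append(scale * total)
    return result

def dct2_separable(image):
    """2-D DCT computed as two passes of the 1-D DCT, as in Equation (4)."""
    n = len(image)
    # Inner sum of (4): transform each row r over the column index c.
    row_pass = [dct1d(row) for row in image]              # row_pass[r][v]
    # Outer sum of (4): transform each column v over the row index r.
    column_pass = [dct1d([row_pass[r][v] for r in range(n)])
                   for v in range(n)]                      # column_pass[v][u]
    return [[column_pass[v][u] for v in range(n)] for u in range(n)]

def dct2_direct(image, u, v):
    """Single coefficient C(u, v) from the direct double sum."""
    n = len(image)
    au = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
    av = math.sqrt(1.0 / n) if v == 0 else math.sqrt(2.0 / n)
    total = sum(image[r][c]
                * math.cos((2 * r + 1) * u * math.pi / (2 * n))
                * math.cos((2 * c + 1) * v * math.pi / (2 * n))
                for r in range(n) for c in range(n))
    return au * av * total

# Hypothetical 3 x 3 test image.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 10]]

separable = dct2_separable(image)
largest_gap = max(abs(separable[u][v] - dct2_direct(image, u, v))
                  for u in range(3) for v in range(3))
print("row-column result matches direct sum:", largest_gap < 1e-9)
```

The row pass and the column pass call the same `dct1d` function, which is the symmetry property mentioned in the text.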

The idea of Radial Basis Function (RBF) Networks derives from Multi-Layer Perceptron (MLP) networks, but RBF Networks take a slightly different approach. They have five main features. They are two-layer feed-forward networks. The hidden nodes implement a set of radial basis functions (e.g. Gaussian functions). The output nodes implement linear summation functions as in an MLP. The network training is divided into two stages: first the weights from the input to the hidden layer are determined, and then the weights from the hidden to the output layer. The training/learning is very fast. The configuration of an RBF Network with P input nodes, Q hidden nodes and R output nodes can be seen in [10]. Figure 1 shows the scheme of the RBFN. The goal of the RBF network is to find a function $f: \mathbb{R}^p \to \mathbb{R}^r$ that interpolates a set of N data points in a p-dimensional input space, $X = (x_1\ x_2\ \dots\ x_p)$, mapped onto the r-dimensional output space, $Y = (y_1\ y_2\ \dots\ y_r)$. The radial basis function of every hidden node has a center vector, $x_c = (x_{c1}\ x_{c2}\ \dots\ x_{cp})$, and a variance, $\sigma^2$. The output of every hidden node is stated in (5). The output of the RBF network is then obtained as a linear combination of the hidden node outputs with the weights, $W_{kj}$, from the hidden nodes to the output nodes, as stated in (6).

$$\varphi(\lVert x - x_c \rVert), \qquad \varphi(a) = \exp\left(-\frac{a^2}{2\sigma^2}\right) \tag{5}$$

$$y_k^* = f(x^*) = \sum_{j=1}^{q} W_{kj}\, \varphi_j\left(\lVert x^* - x_{cj} \rVert\right) \tag{6}$$
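Equations (5) and (6) amount to a short forward pass. The following sketch in plain Python shows one possible way to compute it, assuming Gaussian hidden nodes with one variance per node. The function names and the two-node toy network are assumptions for illustration, not the authors' code.

```python
import math

def gaussian_rbf(x, center, variance):
    """Hidden node output, Equation (5): exp(-||x - center||^2 / (2 * variance))."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * variance))

def rbfn_output(x, centers, variances, weights):
    """Network output, Equation (6).

    weights[k][j] is the weight W_kj from hidden node j to output node k.
    """
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centers, variances)]
    return [sum(w_kj * h_j for w_kj, h_j in zip(row, hidden))
            for row in weights]

# Toy network (assumed values): 2 inputs, 2 hidden nodes, 2 outputs.
centers = [[0.0, 0.0], [1.0, 1.0]]
variances = [0.5, 0.5]
weights = [[1.0, 0.0],
           [0.0, 1.0]]

outputs = rbfn_output([0.0, 0.0], centers, variances, weights)
print(outputs)
```

For the input `[0.0, 0.0]`, the first hidden node sits exactly on its center and returns 1, while the second is at squared distance 2 and returns `exp(-2)`.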

Figure 1: Structure of RBFN

3 System Design

The block diagram of the system is presented in Fig. 2. It consists of two processes, a training process and a recognizing process. The output of the training process is the set of RBFN weights from the hidden nodes to the output nodes, whereas the output of the recognizing process is the name of the person in the input facial image.

Training process: Facial Training Images -> Image Processing -> Features Extraction -> Training RBFN -> Weights of RBFN
Recognizing process: Facial Image -> Image Processing -> Features Extraction -> Recognizing -> Name of the person

Figure 2: Block diagram of the System

The facial training images consist of thirty-five images of seven subjects, with five sample images per subject. They are shown in Figure 3. Each image is normalized to a size of 64 x 64 pixels and converted to a grayscale image. Applying the DCT in the feature extraction process then yields 64 x 64 DCT coefficients. Some of these coefficients are used to train the RBFN in the training process. Suppose $p$ DCT coefficients are chosen as the features of a facial image. Then the input data set is a matrix $X = \{x_{ij} \mid i = 0, 1, \dots, 34;\ j = 0, 1, \dots, p-1\}$, whereas the output training data set is a matrix $T = \{t_{ik} \mid i = 0, 1, \dots, 34;\ k = 1, 2, \dots, 7\}$. If image $i$ in the data set represents person number $k$, where $k = (i \text{ div } 5) + 1$, then $t_{ik}$ is defined in (7).

Figure 3: Thirty five images to be trained
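The construction of the feature vectors and of the target matrix $T$ can be sketched as follows in plain Python. The text does not say which of the 64 x 64 coefficients are selected. This sketch assumes the low-frequency block in the top-left corner of the coefficient matrix (for example 3 x 3, giving $p = 9$), which is a common choice but an assumption here. The function names are hypothetical.

```python
def low_frequency_features(coeffs, block):
    """Flatten the top-left block x block DCT coefficients into p = block^2 features.

    Assumption: the selected coefficients are the lowest-frequency ones.
    """
    return [coeffs[u][v] for u in range(block) for v in range(block)]

def target_matrix(num_images=35, per_subject=5, num_subjects=7):
    """Build T = {t_ik}: t_ik = 1 if k = (i div 5) + 1, otherwise 0."""
    targets = []
    for i in range(num_images):
        k = i // per_subject + 1          # person number of image i
        targets.append([1 if subject == k else 0
                        for subject in range(1, num_subjects + 1)])
    return targets

# Dummy 4 x 4 coefficient matrix whose entries encode their own position.
dummy_coeffs = [[10 * u + v for v in range(4)] for u in range(4)]
features = low_frequency_features(dummy_coeffs, 2)
T = target_matrix()

print(features)      # coefficients at (0,0), (0,1), (1,0), (1,1)
print(T[0], T[34])   # first image -> person 1, last image -> person 7
```

Images 0 to 4 map to person 1, images 5 to 9 to person 2, and so on, so every row of `T` contains exactly one 1.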

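The output of the hidden layer, $H = \{h_{ik}\}$, is formed by (9) after finding the center vectors of the hidden layer, $C_k = \{c_{kj} \mid k = 1, 2, \dots, 7;\ j = 0, 1, \dots, p-1\}$, and the variances $\sigma_k^2$ by (8). The weights, forming the 7 x 7 matrix $W$, are then obtained by (10).

$$t_{ik} = \begin{cases} 1 & \text{if } k = (i \text{ div } 5) + 1 \\ 0 & \text{otherwise} \end{cases} \tag{7}$$

$$c_{kj} = \frac{1}{5} \sum_{i=5(k-1)}^{5k-1} x_{ij}, \qquad \sigma_k^2 = \frac{1}{5p} \sum_{i=5(k-1)}^{5k-1} \sum_{j=0}^{p-1} (x_{ij} - c_{kj})^2, \qquad k = 1, 2, \dots, 7;\ j = 0, 1, \dots, p-1 \tag{8}$$

$$h_{ik} = \varphi_k(x_i) = \exp\left(-\frac{\lVert x_i - C_k \rVert^2}{2\sigma_k^2}\right), \qquad i = 0, \dots, 34;\ k = 1, \dots, 7 \tag{9}$$

$$W = (H^t H)^{-1} H^t T \tag{10}$$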
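The training stage of Equations (8) to (10) can be sketched in plain Python as shown below. This is an illustrative reading of the equations, not the authors' program: the helper names, the small Gauss-Jordan solver used in place of an explicit matrix inverse, and the toy data set (2 subjects with 2 samples each instead of 7 subjects with 5) are all assumptions. The code uses 0-based subject indices, so subject `k` here corresponds to person $k + 1$ in the text.

```python
import math

def hidden_outputs(x, centers, variances):
    """Row of H for one feature vector x, Equation (9)."""
    return [math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2.0 * s))
            for c, s in zip(centers, variances)]

def least_squares(H, T):
    """Solve (H^t H) W = H^t T, i.e. Equation (10), by Gauss-Jordan elimination."""
    q, r = len(H[0]), len(T[0])
    # Augmented matrix [H^t H | H^t T].
    aug = [[sum(h[a] * h[b] for h in H) for b in range(q)]
           + [sum(h[a] * t[c] for h, t in zip(H, T)) for c in range(r)]
           for a in range(q)]
    for col in range(q):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, q), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [value / pivot_value for value in aug[col]]
        for row in range(q):
            if row != col:
                factor = aug[row][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [row[q:] for row in aug]       # W[j][k]: hidden node j -> output k

def train_rbfn(X, T, per_subject=5):
    """Return (centers, variances, W) following Equations (8) to (10)."""
    num_subjects = len(X) // per_subject
    p = len(X[0])
    centers, variances = [], []
    for k in range(num_subjects):
        group = X[k * per_subject:(k + 1) * per_subject]
        center = [sum(row[j] for row in group) / per_subject for j in range(p)]
        variance = sum((row[j] - center[j]) ** 2
                       for row in group for j in range(p)) / (per_subject * p)
        centers.append(center)
        variances.append(variance)   # assumes the samples of a subject are not all identical
    H = [hidden_outputs(x, centers, variances) for x in X]
    return centers, variances, least_squares(H, T)

# Toy data (assumed): 2 subjects, 2 samples each, p = 2 features.
X = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
T = [[1, 0], [1, 0], [0, 1], [0, 1]]
centers, variances, W = train_rbfn(X, T, per_subject=2)

H = [hidden_outputs(x, centers, variances) for x in X]
O = [[sum(h[j] * W[j][k] for j in range(2)) for k in range(2)] for h in H]
predicted = [max(range(2), key=row.__getitem__) for row in O]
print(centers, variances)
print(predicted)   # 0-based subject index for each training sample
```

Solving the normal equations directly keeps the sketch dependency-free. A numerical library's least-squares routine would normally be preferred when $H^t H$ is ill-conditioned.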

In the recognizing process, a single facial image is represented by $x_j$, $j = 0, 1, \dots, p-1$. The output of the recognizing process, $O = \{O_k \mid k = 1, 2, \dots, 7\}$, is given by (11), where $H = (h_1\ h_2\ \dots\ h_7)$.

$$h_k = \varphi_k(x) = \exp\left(-\frac{\lVert x - C_k \rVert^2}{2\sigma_k^2}\right), \quad k = 1, \dots, 7; \qquad O = HW \tag{11}$$

The index of the highest component $O_k$ indicates which person in the system the input facial image belongs to.
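The recognizing step of Equation (11) reduces to computing the hidden outputs for one feature vector, multiplying by $W$, and taking the index of the largest output. A minimal sketch in plain Python, with a hypothetical function name and an assumed two-person toy model, could look like this:

```python
import math

def recognize(x, centers, variances, W):
    """Return (person number, outputs O) for one feature vector x, Equation (11).

    W[j][k] is the weight from hidden node j to output node k, so O = H W.
    """
    h = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2.0 * s))
         for c, s in zip(centers, variances)]
    outputs = [sum(h[j] * W[j][k] for j in range(len(h)))
               for k in range(len(W[0]))]
    best = max(range(len(outputs)), key=outputs.__getitem__)
    return best + 1, outputs            # person numbers start at 1

# Toy model (assumed values): two persons, two features.
centers = [[0.0, 0.0], [1.0, 1.0]]
variances = [0.5, 0.5]
W = [[1.0, 0.0],
     [0.0, 1.0]]

person, outputs = recognize([0.9, 1.1], centers, variances, W)
print(person, outputs)
```

The input `[0.9, 1.1]` lies close to the second center, so the second output is the largest and the function reports person 2.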

4 Experiment Result and Discussion

The experiment determines how many DCT coefficients are needed to achieve the best recognition percentage. It is carried out with 2 x 2 up to 4 x 4 DCT coefficients. In every case, all 35 new input images, shown in Figure 4, are recognized 100% correctly. The results are tabulated in Table 1.

Figure 4: Thirty five images to be recognized

Table 1. The percentage and threshold results of the experiments

Subject   DCT 2 x 2            DCT 3 x 3            DCT 4 x 4
          % rec   Threshold    % rec   Threshold    % rec   Threshold
1         100     0.60         100     0.72         100     0.65
2         100     0.80         100     0.77         100     0.74
3         100     0.51         100     0.62         100     0.59
4         100     0.81         100     0.93         100     0.94
5         100     0.90         100     0.96         100     0.96
6         100     0.84         100     0.93         100     0.92
7         100     0.96         100     1.00         100     0.96

According to the system design, a single input facial image produces 7 output values. Each value indicates the recognition percentage for the person with the associated index. In the table, using 2 x 2 DCT coefficients for subject 1, the threshold is 0.60. This means that if the first output value is at least 0.60, the input facial image belongs to the first person. Likewise, the threshold using 3 x 3 DCT coefficients for subject 6 is 0.93, which means the input image belongs to person number 6 if the sixth output value is at least 0.93. From the table, the best choice is 3 x 3 DCT coefficients.
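One way to read the thresholds is as a per-person acceptance test on the largest output value. The sketch below, in plain Python, uses the DCT 3 x 3 thresholds from Table 1. The function name, the decision to return `None` when the threshold is not met, and the two example output vectors are assumptions for illustration, since the paper does not describe how rejected inputs are handled.

```python
# Thresholds for subjects 1 to 7 with 3 x 3 DCT coefficients, taken from Table 1.
THRESHOLDS_3X3 = [0.72, 0.77, 0.62, 0.93, 0.96, 0.93, 1.00]

def accept(outputs, thresholds=THRESHOLDS_3X3):
    """Return the person number if the largest output meets that person's
    threshold, otherwise None (assumed behaviour for unrecognized inputs)."""
    best = max(range(len(outputs)), key=outputs.__getitem__)
    if outputs[best] >= thresholds[best]:
        return best + 1
    return None

# Hypothetical output vectors O for two input images.
clear_match = [0.05, 0.02, 0.70, 0.01, 0.00, 0.10, 0.03]
weak_match = [0.05, 0.02, 0.10, 0.01, 0.00, 0.90, 0.03]

print(accept(clear_match))  # third output 0.70 >= 0.62, so person 3
print(accept(weak_match))   # sixth output 0.90 < 0.93, so no person is accepted
```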

5 Conclusion

The system is able to recognize 100% of the facial input data using the Discrete Cosine Transform and a Radial Basis Function Network. However, further work remains for facial data that is not taken from a facial image database, such as images captured directly from a camera.

References

1. M. Turk and A. Pentland, "Eigenfaces for recognition", Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
2. K. I. Diamantaras and S. Y. Kung, "Principal Component Neural Networks: Theory and Applications", John Wiley & Sons, Inc., 1996.
3. Alex Pentland, Baback Moghaddam, and Thad Starner, "View-Based and Modular Eigenspaces for Face Recognition", IEEE Conf. on Computer Vision and Pattern Recognition, MIT Media Laboratory Tech. Report No. 245, 1994.
4. M. Alwakeel and Z. Shaaban, "Face Recognition Based on Haar Wavelet Transform and Principal Component Analysis via Levenberg-Marquardt Back propagation Neural Network", European Journal of Scientific Research, ISSN 1450-216X, Vol. 42, No. 1, pp. 25-31, 2010.
5. Kohir, Vinayadatt V., U. B. Desai, Face Recognition Using a DCT-HMM Approach, Indian Institute of Technology, Mumbai, India, 1998.
6. Satio E. Handy, Samuel Lukas, Helena Margaretha, Further Tests for Face Recognition Using Discrete Cosine Transform And Hidden Markov Model, Proceeding, International Conference on Electrical Engineering and Informatics (MICEEI), Makasar, 2012.
7. Khayam, Syed Ali, The Discrete Cosine Transform (DCT): Theory and Application, Michigan State University, 2003.
8. Acharya, T., Ray, A.K., Image Processing: Principles and Applications, John Wiley, 2005.
9. Hayder Radha, "Lecture Notes: ECE 802 - Information Theory and Coding", January 2003.
10. Byung-Joo Oh, "Face Recognition Using Radial Basis Function Network based on LDA", International Journal of Computer, Information Science and Engineering, Vol. 1, pp. 401-405, 2007.
