Proceedings of the 2014 International Conference on Communications, Signal Processing and Computers

Implementation of Sparse Representation Classifier (SRC) to Heartbeat Biometrics

W. C. Tan and D. A. Ramli

This work was supported by Universiti Sains Malaysia under Research University Grant 814161 and Fundamental Research Grant Scheme (FRGS) 6071266. W. C. Tan is a PhD student at the School of Electrical & Electronic Engineering, Universiti Sains Malaysia (e-mail: [email protected]). D. A. Ramli is a senior lecturer at the School of Electrical & Electronic Engineering, Universiti Sains Malaysia (corresponding author; phone: +604-5996028; e-mail: [email protected]).

Abstract— Biometrics is a new trend in human identity verification technology, replacing conventional methods such as PIN numbers, passwords and token-based authentication. Common biometric traits include the face, iris, fingerprint and voice. In this paper, the heart sound is proposed and used as a biometric trait for human identity authentication. The proposed heart sound biometric system consists of four main modules: heart sound acquisition, pre-processing, feature extraction, and training and classification. A new and efficient classification technique, the Kernel Sparse Representation Classifier (KSRC), based on the Sparse Representation Classifier (SRC) and the kernel trick, is implemented in this paper. The reconstructive and discriminative nature of SRC provides high classification performance even with noisy data. By introducing the kernel trick into SRC, classification performance is further improved through an implicit mapping of the feature data into a high-dimensional kernel feature space. The results of the system demonstrate the feasibility of the heart sound as a biometric trait for human authentication. KSRC shows promising results as a classifier for heart sound recognition and, with 85.45% accuracy, outperforms the other classifiers evaluated, i.e. Support Vector Machines (SVM), SRC and K-Nearest Neighbor (KNN).

Keywords— biometrics, heart sound, kernel trick, sparse representation classifier.

I. INTRODUCTION

The use of traditional authentication systems to prove a legitimate user is exposed to several weaknesses: the authenticator can easily be lost or stolen, and the link between the authenticator and the legitimate user is weak. The use of biometric traits to reliably identify authentic users is therefore becoming important as a replacement for these traditional methods, in order to increase the security of authentication systems. Biometric authentication systems can be divided into two types, based on either physical or behavioral traits. Physical traits are characteristics that are fixed or unvarying, such as the face, iris, fingerprint, heart sound and DNA. Behavioral traits are characteristics represented by skills or functions performed by an individual, such as gait, voice and signature.

Recent research [1], [2] has shown that the heartbeat, or heart sound, can be used as a biometric trait for human authentication. Human heart sounds are the noises generated by the beating heart and the resultant flow of blood through it. Two heart sounds, S1 and S2, are normally produced during each cardiac cycle. The first heart sound, S1, is normally longer and of lower pitch, sounding like "lup", whereas the second heart sound, S2, is shorter and of higher pitch, sounding like "dup". These natural signals have long been used by doctors in auscultation for health monitoring and diagnosis. Since heart sounds contain information about an individual's physiology, they can potentially be used as a biometric trait that provides a unique identity for each person. In addition, heart sounds are very difficult for others to counterfeit or imitate, which reduces falsification in authentication systems. In 2006, the possibility of using the heart sound as a biometric trait for human identification was investigated, and preliminary results indicated an identification rate of up to 96% on a database of 7 individuals, with heart sounds collected over a period of 2 months [2]. That system was based on cepstral analysis with a configuration called Linear Frequency Bands Cepstral (LFBC) for feature extraction, combined with Gaussian Mixture Modeling (GMM) and Vector Quantization (VQ) as classifiers. In 2007 [1], a heart sound biometric system was proposed using a feature extraction method called the chirp z-transform (CZT) and a K-Nearest Neighbor (KNN) classifier based on Euclidean distance. That system achieved a 0% false rejection rate (FRR) and a 2.2% false acceptance rate on a database of heart sounds recorded from 20 different people. The weakness of the CZT feature extraction method is that the locations of the S1 and S2 heart sounds have to be well aligned in every sample. In 2010 [3], three different types of features were extracted, namely auto-correlation, cross-correlation and cepstrum, and Mean Square Error (MSE) and KNN were applied as classifiers. The KNN classifier achieved a 93% identification rate on a database of 400 heart sounds recorded from 40 individuals, with 10 recordings per individual. In 2013 [4], a new feature set called the marginal spectrum was extracted from the heart sounds, and a VQ classifier based on the Linde-Buzo-Gray algorithm (LBG-VQ) was used for classification; the identification rate of that system reached 94.40% on a database of 280 heart sounds from 40 participants. In this paper, a heart sound authentication system is proposed in which Mel Frequency Cepstral Coefficients (MFCC) and the Sparse Representation Classifier (SRC) are used as the feature extraction and classification methods, respectively.

The SRC is a new and powerful classifier, especially for noisy or corrupted sample data. This research therefore aims to develop a robust and reliable heart sound authentication system that works well with noisy heart sound samples. The proposed system consists of four main phases: data acquisition, signal pre-processing, feature extraction, and training and classification. The rest of this paper is organized as follows. Section II briefly describes the methods and architecture of the heart sound authentication system. Section III presents classification using SRC and KSRC. Results and discussion are given in Section IV. Finally, Section V summarizes the conclusions.



II. METHODOLOGY

A. Database
An open heart sound database, HSCT-11, collected by the University of Catania, Italy, is used to evaluate the performance of the proposed heart sound authentication system. This database is a collection of heart sounds intended for biometric research and is freely available on the internet [5]. It contains heart sounds collected from 206 people, i.e. 49 female and 157 male. Only the recordings of 10 female and 5 male subjects, randomly selected from the database, are used in this research. The heart sounds are recorded in WAV format at a sampling frequency of 11.025 kHz, near the pulmonary valve, and contain only sequences recorded in the resting condition.

B. Signal Pre-processing
Although it is impossible to remove all noise from the recorded sound signal, it can be reduced to an acceptable level. Recorded heart sound signals that are corrupted by various types of noise reduce the accuracy of identification. Therefore, a fifth-order Chebyshev type I low-pass filter with a cutoff frequency of 880 Hz is applied to the signals, so that background noise and sounds with frequencies above the cutoff frequency are eliminated. The signals are then normalized to their absolute maximum according to equation (1):

    $x_{norm}(n) = \dfrac{x(n)}{\max_{n} |x(n)|}$    (1)
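As an illustration of this pre-processing stage, the sketch below loads one recording, applies the fifth-order Chebyshev type I low-pass filter at 880 Hz and normalizes the output as in equation (1). It assumes NumPy and SciPy are available; the file name heart_sound.wav and the 0.5 dB passband ripple are assumptions for illustration, as neither is specified in the paper.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import cheby1, filtfilt

# Load one heart sound recording (hypothetical file name; the HSCT-11
# recordings are WAV files sampled at 11.025 kHz).
fs, x = wavfile.read("heart_sound.wav")
x = x.astype(np.float64)
x = x if x.ndim == 1 else x[:, 0]  # keep a single channel if the file is stereo

# Fifth-order Chebyshev type I low-pass filter with an 880 Hz cutoff.
# The 0.5 dB passband ripple is an assumed design value.
b, a = cheby1(N=5, rp=0.5, Wn=880.0, btype="low", fs=fs)
x_filt = filtfilt(b, a, x)  # zero-phase filtering to avoid phase distortion

# Normalize to the absolute maximum, as in equation (1).
x_norm = x_filt / np.max(np.abs(x_filt))
```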

C. Segmentation
The heart produces two strong and audible sounds, namely S1 and S2, and these two heart sounds contain important features for human identity verification. Heart sound segmentation is therefore the first step of this automatic heart sound biometric system [6]. The segmentation technique employed in this system is based on the zero-crossing rate (ZCR) and the short-term amplitude (STA). First, the noise-filtered and normalized signal is blocked into frames of 5 ms length with 66.7% overlap. Next, the short-term amplitude and zero-crossing rate of each frame are calculated according to equations (2) and (3), respectively:

    $STA_i = \sum_{n=1}^{N} |x_i(n)|$    (2)

    $ZCR_i = \frac{1}{2} \sum_{n=2}^{N} \left| \mathrm{sgn}\big(x_i(n)\big) - \mathrm{sgn}\big(x_i(n-1)\big) \right|$    (3)

where $x_i(n)$, $n = 1, \dots, N$, are the audio samples of the $i$-th frame. The STA is a simple feature that can be used to detect silent parts of an audio signal, while the ZCR is especially helpful for detecting speech against a noisy background and for start and end point detection.
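The short sketch below, which continues from the pre-processing snippet (so x_norm and fs are assumed to exist), blocks the signal into 5 ms frames with 66.7% overlap and computes the per-frame STA and ZCR of equations (2) and (3).

```python
import numpy as np

def frame_signal(x, fs, frame_ms=5.0, overlap=2.0 / 3.0):
    """Block a signal into short overlapping frames (one frame per row)."""
    frame_len = int(round(frame_ms * 1e-3 * fs))            # 5 ms of samples
    hop = max(1, int(round(frame_len * (1.0 - overlap))))   # 66.7% overlap -> hop of ~1/3 frame
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])

def short_term_amplitude(frames):
    """Equation (2): sum of absolute sample values in each frame."""
    return np.sum(np.abs(frames), axis=1)

def zero_crossing_rate(frames):
    """Equation (3): half the sum of absolute sign differences in each frame."""
    return 0.5 * np.sum(np.abs(np.diff(np.sign(frames), axis=1)), axis=1)

frames = frame_signal(x_norm, fs)
sta = short_term_amplitude(frames)
zcr = zero_crossing_rate(frames)
```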

The upper and lower STA thresholds and the ZCR threshold are then set as shown in Table 1.

Table 1: Threshold values for the heart sound segmentation process.
    Upper short-term amplitude threshold, STA1     3
    Lower short-term amplitude threshold, STA2     0.5
    Zero-crossing rate threshold, ZCRT             5

These threshold values are obtained by trial and error. After the thresholds have been defined, the frames of the signal are evaluated by the following rules (a sketch of this frame-labelling logic is given after the rules).
Rule 1: A frame whose STA is greater than the STA1 threshold is considered part of a heart sound, and the starting point of the heart sound is calculated.
Rule 2: A frame whose STA is greater than the STA2 threshold, or whose ZCR is greater than the ZCR threshold, is considered a possible heart sound, and the following frames are evaluated further. If the next frame also matches Rule 2, the evaluation is repeated until a frame matches Rule 1 or Rule 3. Once a frame matches Rule 1, the starting point of the heart sound is taken as the starting point of the first frame that matched Rule 2, and the ending point is set when a subsequent frame matches Rule 3. If the following frames never match Rule 1 and go directly to Rule 3, the sequence of frames is not considered a heart sound.
Rule 3: A frame whose STA is lower than the STA2 threshold and whose ZCR is lower than the ZCR threshold is considered not to be part of a heart sound.
The flow chart of the segmentation technique is shown in Fig. 1.

Fig. 1 Heart sound segmentation flow chart based on ZCR and STA


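The three rules can be expressed as a small state machine over the per-frame STA and ZCR values. The sketch below, continuing from the previous snippet, is one interpretation of Rules 1-3 with the Table 1 thresholds; it is illustrative only and not the authors' exact implementation.

```python
def segment_heart_sounds(sta, zcr, sta1=3.0, sta2=0.5, zcr_t=5.0):
    """Apply Rules 1-3 to per-frame STA/ZCR values (thresholds from Table 1).

    Returns a list of (start_frame, end_frame) index pairs, one per detected
    heart sound.
    """
    events = []
    candidate_start = None   # first frame that matched Rule 2 (or Rule 1)
    confirmed = False        # becomes True once Rule 1 has been matched

    for i, (a, z) in enumerate(zip(sta, zcr)):
        if a > sta1:                        # Rule 1: definitely a heart sound
            if candidate_start is None:
                candidate_start = i
            confirmed = True
        elif a > sta2 or z > zcr_t:         # Rule 2: possible heart sound, keep evaluating
            if candidate_start is None:
                candidate_start = i
        else:                               # Rule 3: not a heart sound
            if confirmed:                   # close a confirmed event
                events.append((candidate_start, i - 1))
            candidate_start = None          # Rule-2-only runs are discarded
            confirmed = False

    if confirmed:                           # event still open at the end of the signal
        events.append((candidate_start, len(sta) - 1))
    return events

segments = segment_heart_sounds(sta, zcr)
```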

D. Feature Extraction
Extracting the best parametric representation of the acoustic signal is an important task in the design of any sound-based biometric recognition system, since it directly affects identification performance. Mel Frequency Cepstral Coefficients (MFCC) are among the most commonly used features in speech recognition; MFCC takes the frequency sensitivity of human hearing perception into consideration [7], [8]. After the heart sound signal has been segmented, framed and windowed, MFCC is used to extract meaningful parameters from the heart sound signal. The steps used to compute the MFCC in this system are listed below; a brief code sketch of the complete chain is given after the list.


1. The Discrete Fourier Transform (DFT) of each frame is computed, converting each frame of N samples from the time domain into the frequency domain. The DFT of the i-th frame of the pre-processed heart sound signal is

    $X_i(k) = \sum_{n=0}^{N-1} x_i(n) \, e^{-j 2 \pi k n / N}, \quad k = 0, 1, \dots, N-1$    (4)

where N is the number of samples in a frame and k is the frequency-domain index of the DFT.

2. The signal spectrum is then processed by the Mel filter bank: the magnitude of the spectrum is multiplied by the Mel filter bank to obtain the log energy of each triangular band-pass filter. The filter bank used in this project consists of 24 triangular band-pass filters, with emphasis on the part of the spectrum below 1 kHz. The positions of these filters are equally spaced along the Mel frequency scale, which is related to the linear frequency f (in Hz) by

    $\mathrm{Mel}(f) = 2595 \log_{10}\!\left(1 + \dfrac{f}{700}\right)$

3. The log energy is normally obtained by computing the logarithm of the squared magnitude of the filter bank outputs $C(k)$, where $C(k)$ is the k-th filter bank output. In this project, the log energy is obtained by computing the logarithm of the magnitude of the coefficients, in order to reduce the computational complexity.

4. The inverse DFT is computed on the logarithm of the magnitude of the filter bank outputs to obtain the cepstral coefficients.
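The four steps above can be sketched with NumPy alone. In the snippet below the frame length, the Hamming window, the placement of the 24 triangular filters and the choice of keeping 13 coefficients are illustrative assumptions rather than values taken from the paper; the inverse DFT of the log filter bank outputs follows the description in step 4 (a DCT is the more common choice in practice).

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, fs, f_max=1000.0):
    """Triangular filters equally spaced on the Mel scale, emphasising f < 1 kHz."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(f_max), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / fs).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, centre):
            fb[m - 1, k] = (k - left) / max(centre - left, 1)
        for k in range(centre, right):
            fb[m - 1, k] = (right - k) / max(right - centre, 1)
    return fb

def mfcc(frames, fs, n_filters=24, n_ceps=13):
    """Compute MFCC features for a 2-D array of frames (one frame per row)."""
    n_fft = frames.shape[1]
    window = np.hamming(n_fft)
    # Step 1: DFT of each windowed frame, as in equation (4).
    spectrum = np.abs(np.fft.rfft(frames * window, axis=1))
    # Step 2: multiply the magnitude spectrum by the Mel filter bank.
    fb = mel_filter_bank(n_filters, n_fft, fs)
    filter_out = spectrum @ fb.T
    # Step 3: logarithm of the magnitude (not the squared magnitude) of the outputs.
    log_energy = np.log(filter_out + 1e-10)
    # Step 4: inverse DFT of the log filter bank outputs; keep the first coefficients.
    ceps = np.fft.ifft(log_energy, axis=1).real
    return ceps[:, :n_ceps]

# Example use on sufficiently long frames (e.g. 25 ms) of a heart sound segment:
# features = mfcc(heart_sound_frames, fs)
```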