2012 Eighth International Conference on Intelligent Information Hiding and Multimedia Signal Processing
A New Statistical-Based Algorithm for ECG Identification
Fufu Zeng1, Kuo-Kun Tseng1, Huang-Nan Huang2*, Shu-Yi Tu3, and Jeng-Shyang Pan1
1 Department of Computer Science and Technology, Harbin Institute of Technology, Shenzhen Graduate School, China
2 Department of Mathematics, Tunghai University, Taichung 40704, Taiwan (ROC)
3 Department of Mathematics, University of Michigan, Flint MI 48502, USA
* Corresponding author, e-mail: [email protected]
Abstract—In this paper, a new statistical-based ECG algorithm, which applies the idea of matching Reduced Binary Patterns, is proposed for timely and accurate human identity recognition. In comparison with previous research, the proposed design requires neither waveform complex information nor de-noising pre-processing in advance. Our algorithm is tested on the public MIT-BIH arrhythmia and normal sinus rhythm databases. The experimental results indicate that the proposed scheme offers high accuracy, low complexity, and fast processing for ECG identification.
II. RELATED WORKS
Recently, ECG has become a popular tool for analysing heart diseases [1-2]. Besides being a useful diagnostic tool, ECG has also been extensively applied to information watermarking [3-4], data compression [5-6], and human identification [7-18]. For human identification, research usually focuses on signal pre-processing, feature extraction, data classification, data reduction, and intelligent optimization. Low-pass, high-pass, and/or FFT filters are the main algorithms used in signal pre-processing; fuzzy rules, SVM, neural networks, Bayesian, and rule-based schemes are frequently adopted in data classification. For data reduction, ICA, PCA, and LDA have been widely applied to assist ECG signal processing. Intelligent optimization techniques such as GA, PSO, and ant colony optimization are commonly employed to tune the parameters of the aforementioned algorithms.
ECG human identification methods rely heavily on feature extraction from ECG signals. Based on our survey, ECG feature extraction algorithms can be classified into three categories: transform-based, waveform-based, and statistical-based. The transform-based algorithms use transforms in the wavelet [7-8] and frequency domains. Since the wavelet transform contains information in both the time and frequency domains, it is more popular than the frequency-based techniques, which include the Fourier transform [9] and the discrete cosine transform (DCT) [10]. The waveform-based algorithms [11-18] use the distance, height, and/or area of certain feature points of the ECG waveform in the time domain to match or classify the signals. These algorithms provide high accuracy in identifying regular ECG signals, but their performance may degrade for signals with irregular waveforms; therefore, considerably more effort is required in the signal pre-processing stage.
ECG signals are non-stationary time series that exhibit irregularities in the shape of the waveform. Unlike the waveform-based algorithms, the transform-based algorithms analyse the non-stationary information in the ECG signals based on their frequency-domain representation. Such processing is not only slow but also makes it hard to extract good identifying features.
Keywords—Electrocardiogram Identification, Biometric, Access Control System
I. INTRODUCTION
The electrocardiogram (ECG/EKG) is a voltage signal, measured on the skin, that records the electrical changes of the heart caused by the depolarization of the heart muscle during each heartbeat. In addition to the analysis of heart disease using ECG signals, we propose a statistical-based algorithm called the reduced binary pattern (RBP) that uses ECG signals for human identification. This algorithm, which meets the accuracy and cost requirements of an ECG identification system, converts signals into concise binary patterns and performs statistical counting and ranking for identification. Its advantages are as follows:
1) No signal pre-processing, such as de-noising or adjusting the signal mean, is needed. Therefore, the noise inside the ECG signals can be treated as one feature for better classification.
2) There is no need for PQRS detection while running the algorithm, and the result may still be robust to dynamic variation of the ECG signals. This differs from the ECG identification methods in the literature, since they all require PQRS detection in advance.
3) Variations in the length and the sampling rate of the matching signals are allowed.
4) The algorithm runs rapidly with low computational complexity and requires less ECG information than other techniques in the literature.
The remaining parts of the paper are organized as follows: Section II gives an overview of related work in ECG identification; Section III introduces the proposed ECG identification algorithm; experimental results are presented in Section IV; concluding remarks are given in Section V.
978-0-7695-4712-1/12 $26.00 © 2012 IEEE DOI 10.1109/IIH-MSP.2012.79
The statistical-based algorithms [19-20] are usually less time-consuming but need a well-designed statistical feature to maintain high accuracy. An access control system based on access cards has been implemented with ECG signals for human identification [21]. This paper aims to design a fast, secure, and efficient ECG identification algorithm that can be embedded into the ECG card access control system for the purpose of recognition.
III. ALGORITHM
Yang et al. [22] present a linguistic analysis of the human heartbeat using frequency and rank order statistics. The proposed RBP algorithm extends their methodology to the bare ECG signals. The differences between the two algorithms are as follows.
Figure 2 Flow chart of the counting and ranking processes
As depicted in Fig. 2, the frequency of each word $w_k$, whose value ranges between 0 and $2^m - 1$, is calculated in the counting process; a ranking process follows, in which the corresponding probability of each $w_k$ is found and the words are ranked accordingly. The ranking process can be omitted if the trained and tested ECG segments have the same sampling duration.
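The counting and ranking processes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `count_and_rank` is hypothetical, rank 1 is assigned to the most frequent word, and ties are broken by the numeric value of the word, a detail the paper does not specify.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def count_and_rank(words: Sequence[int], m: int = 8) -> Tuple[List[float], List[int]]:
    """Return (probabilities, ranks), each indexed by word value 0 .. 2**m - 1."""
    if not words:
        raise ValueError("at least one word is required")
    n_patterns = 2 ** m
    counts = Counter(words)  # counting process: occurrences of each word
    total = len(words)
    probabilities = [counts[w] / total for w in range(n_patterns)]
    # Ranking process: most frequent word first.
    # Assumption: ties are broken by the word value itself.
    order = sorted(range(n_patterns), key=lambda w: (-counts[w], w))
    ranks = [0] * n_patterns
    for position, w in enumerate(order, start=1):
        ranks[w] = position
    return probabilities, ranks


# Toy example with 2-bit words (values 0..3)
example_probs, example_ranks = count_and_rank([3, 1, 3, 2, 3, 1], m=2)
print(example_probs)
print(example_ranks)  # [4, 2, 3, 1]: word 3 is the most frequent, word 0 never occurs
```

Words that never occur receive the lowest ranks but have probability zero, so under this sketch they carry no weight in any probability-weighted comparison.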
Their method converts, counts, and ranks P waves in ECG signals only, whereas our algorithm uses every data sample to form the reduced binary patterns.
Their method uses the variation of the detected human heartbeat to classify different heart diseases; ours develops a human identification method that recognizes each person from his/her own ECG signals.
Steps of the proposed algorithm are explained as follows:
A) Reduced Binary Pattern Conversion
C) Measurement of Similarity
Consider two segments of the original ECG data, $S_1$ and $S_2$, which may belong to two distinct subjects. To quantify how closely they are related, a measurement of similarity needs to be defined. We therefore adopt a weighted distance formula to define the measurement of similarity between $S_1$ and $S_2$:
$$D_m(S_1, S_2) = \frac{\sum_{k=1}^{2^m} \left| R_1(w_k) - R_2(w_k) \right| \, p_1(w_k) \, p_2(w_k)}{(2^m - 1) \sum_{k=1}^{2^m} p_1(w_k) \, p_2(w_k)} \tag{2}$$
where $p_i(w_k)$ and $R_i(w_k)$ represent the probability and ranking of $w_k$ in the sequence $S_i$, $i = 1$ or $2$. The absolute difference between the two rankings is weighted by the normalized product of the probabilities and summed; the factor $2^m - 1$ in the denominator ensures that all values of $D_m$ lie within [0, 1].
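As a minimal sketch of the weighted distance, assuming the probabilities and rankings of both segments are already available as lists indexed by word value, the measure can be computed as below. The function name `rbp_distance` is hypothetical, and returning 1.0 when the two segments share no word (all weights zero) is an assumption made here to avoid a division by zero; the paper does not discuss that case.

```python
from typing import Sequence


def rbp_distance(p1: Sequence[float], r1: Sequence[int],
                 p2: Sequence[float], r2: Sequence[int], m: int = 8) -> float:
    """Rank differences weighted by normalized probability products, scaled by 2**m - 1."""
    n_patterns = 2 ** m
    if not (len(p1) == len(r1) == len(p2) == len(r2) == n_patterns):
        raise ValueError("all inputs must have 2**m entries")
    weights = [a * b for a, b in zip(p1, p2)]
    norm = sum(weights)
    if norm == 0:
        # Assumption: no word occurs in both segments, treat as maximal distance.
        return 1.0
    weighted = sum(abs(ra - rb) * w for ra, rb, w in zip(r1, r2, weights))
    return weighted / ((n_patterns - 1) * norm)


# Toy example with 2-bit words: the two segments rank the words in opposite order
p_a, r_a = [0.4, 0.3, 0.2, 0.1], [1, 2, 3, 4]
p_b, r_b = [0.1, 0.2, 0.3, 0.4], [4, 3, 2, 1]
print(rbp_distance(p_a, r_a, p_b, r_b, m=2))  # approximately 0.6
print(rbp_distance(p_a, r_a, p_a, r_a, m=2))  # 0.0 for identical segments
```

Since ranks run from 1 to $2^m$, no rank difference exceeds $2^m - 1$, which is why the result stays within [0, 1].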
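Figure 1 Schematic diagram of the Reduced Binary Pattern conversion. Example values recoverable from the figure: ECG signal $x_i$ = 321, 324, 325, 330, 351, 372, 378, 365, 364, 366, 382, 333, 345; first four 8-bit words $w_1$ to $w_4$ = 252, 249, 243, 230.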
IV. EXPERIMENT SETUP & RESULT DISCUSSION
For any ECG signal, we can express it as $\{x_1, x_2, x_3, \ldots, x_n\}$, where $x_i$ represents the $i$-th sample of the input data. In order to extract and reduce the waveform pattern into a concise reduced binary sequence, a reduced function is defined. According to the decrease or increase between two consecutive $x$ values, the two-state function $R$ maps to the values 0 and 1, respectively:
$$y_i = R(x_i, x_{i+1}) = \begin{cases} 0, & x_{i+1} \le x_i \\ 1, & x_{i+1} > x_i \end{cases} \tag{1}$$
The reduced binary sequence therefore consists only of the digits 0 and 1. After it is obtained, we group every $m$ bits to form a reduced binary pattern of length $m$, referred to as an $m$-bit word. We then convert each $m$-bit word, $w_i$, from its binary expansion to its decimal expression, which represents the local pattern of the ECG signal. For instance, the numbers 252, 249, 243 and 230 shown in Fig. 1 are the decimal expressions of the first four 8-bit words, $w_1$ through $w_4$.
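The conversion step can be illustrated with a short Python sketch. The function names are hypothetical, and one detail is an assumption: consecutive words are taken as overlapping windows shifted by one bit. The text does not state this explicitly, but it is the reading that reproduces the values 252, 249, 243 and 230 from the sample values printed in Fig. 1.

```python
from typing import List, Sequence


def reduced_binary_sequence(x: Sequence[float]) -> List[int]:
    """Eq. (1): 1 if the next sample is larger than the current one, else 0."""
    return [1 if nxt > cur else 0 for cur, nxt in zip(x, x[1:])]


def m_bit_words(bits: Sequence[int], m: int = 8) -> List[int]:
    """Decimal value of each m-bit window (assumption: windows slide by one bit)."""
    words = []
    for start in range(len(bits) - m + 1):
        value = 0
        for bit in bits[start:start + m]:
            value = (value << 1) | bit  # most significant bit first
        words.append(value)
    return words


# Sample values as printed in Fig. 1
samples = [321, 324, 325, 330, 351, 372, 378, 365, 364, 366, 382, 333, 345]
bits = reduced_binary_sequence(samples)
words = m_bit_words(bits, m=8)
print(bits)       # [1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1]
print(words[:4])  # [252, 249, 243, 230]
```

With non-overlapping groups of eight bits the second word would not equal 249, which is why the sliding-window reading is used in this sketch.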
B) Counting and Ranking Processes
A. ECG Database
To compute the success rate of the proposed identification algorithm, a comprehensive experiment is conducted on an arrhythmia ECG database (the MIT-BIH arrhythmia database) and a healthy human ECG database (the MIT-BIH normal sinus rhythm database). These two databases are described below.
1) The MIT-BIH Normal Sinus Rhythm Database [23]: This database contains 18 long-term ECG recordings from 5 men, aged 26 to 45, and 13 women, aged 20 to 50, with no significant arrhythmias. The ECG signal has a sampling rate of 128 Hz and a 12-bit binary representation, also known as the '212' format, over a 10 mV range.
2) The MIT-BIH Arrhythmia Database [24]: This database contains 48 two-lead ECG recordings of half an hour each, for a total of 24 hours of ECG data. The data covers 47 subjects (dataset IDs 201 and 202 come from the same person): 25 men aged 32 to 89 and 22 women aged 23 to 89. The ECG data has a sampling rate of 360 Hz and is stored in the '212' format over a 10 mV range.
B. Experimental Result
To verify whether the proposed algorithm is efficient in ECG human identification, a thorough experiment is conducted. For any signal drawn from the database, we take every 10 sampling periods as one unit, called a segment. Steps A) and B) of the RBP algorithm are then applied to each segment. Two types of comparison, self-comparison and subject-comparison, are examined in this study for ECG signals selected from both databases with m = 8.
1) Self-Comparison: We select two 8-segment data sets from the same subject and measure their distances using (2). Table 1 lists all 64 intra-subject distances, obtained from segments 1 through 8, for subject ID number 100 in the MIT-BIH arrhythmia database. The entries in Table 1 are symmetric and all diagonal entries are zero, since they represent the distances between two identical segments.

Table 1 Self-comparison for subject ID number 100
Segment  1      2      3      4      5      6      7      8
1        0      0.033  0.033  0.035  0.040  0.037  0.036  0.033
2        0.033  0      0.028  0.039  0.047  0.045  0.036  0.045
3        0.033  0.028  0      0.033  0.042  0.042  0.031  0.039
4        0.035  0.039  0.033  0      0.035  0.041  0.030  0.034
5        0.040  0.047  0.042  0.035  0      0.032  0.037  0.023
6        0.037  0.045  0.042  0.041  0.032  0      0.043  0.033
7        0.036  0.036  0.031  0.030  0.037  0.043  0      0.034
8        0.033  0.045  0.039  0.034  0.023  0.033  0.034  0

For subjects ID number 100 through 107 (Table 3), for example, 0.032 (1st row, 1st column) and 0.048 (1st row, 2nd column) are the averages of all numbers in Tables 1 and 2, respectively.

Table 3 Basic RBP comparison result for 8 subjects ID 100-107
ID   100    101    102    103    104    105    106    107
100  0.032  0.048  0.045  0.061  0.058  0.059  0.056  0.062
101  0.048  0.028  0.040  0.044  0.041  0.036  0.039  0.031
102  0.045  0.040  0.031  0.048  0.044  0.042  0.048  0.032
103  0.061  0.044  0.048  0.016  0.034  0.030  0.043  0.009
104  0.058  0.041  0.044  0.034  0.029  0.029  0.039  0.013
105  0.059  0.036  0.042  0.030  0.029  0.017  0.031  0.008
106  0.056  0.039  0.048  0.043  0.039  0.031  0.027  0.022
107  0.062  0.031  0.032  0.009  0.013  0.008  0.022  0.001

Intuitively, the average intra-subject distance should be smaller than the intra-group distances; if not, we count it as an identification error. Similarly, the RBP algorithm is also applied to subjects in the MIT-BIH normal sinus rhythm database. The total number of errors and the identification success rates for both databases are listed in Table 4. The success rates for the two groups of people, with and without significant arrhythmias, are 95.791% and 90.196%, respectively.

Table 4 Number of errors and success rate of the RBP algorithm
Algorithm               Database             Total number of errors  Success rate
Reduced Binary Pattern  Normal Sinus Rhythm  30                      90.196%
Reduced Binary Pattern  Arrhythmia           91                      95.791%
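The error-counting rule and the success rates can be illustrated as follows. This sketch rests on two assumptions that the paper does not spell out: an error is counted once for every ordered pair of distinct subjects (i, j) in which the average intra-subject distance of subject i is not smaller than its average distance to subject j, and the success rate is one minus the number of errors divided by N(N - 1) for N subjects. The second assumption is at least consistent with the reported figures, since 30 errors among 18 subjects and 91 errors among 47 subjects give 90.196% and 95.791%. The function names and the 3-subject matrix are hypothetical.

```python
from typing import Sequence


def count_identification_errors(avg: Sequence[Sequence[float]]) -> int:
    """Count ordered pairs (i, j), i != j, where avg[i][i] is not smaller than avg[i][j]."""
    errors = 0
    for i, row in enumerate(avg):
        for j, value in enumerate(row):
            if i != j and not (row[i] < value):
                errors += 1
    return errors


def success_rate(errors: int, n_subjects: int) -> float:
    """Assumed definition: 1 - errors / (N * (N - 1))."""
    comparisons = n_subjects * (n_subjects - 1)
    return 1.0 - errors / comparisons


# Hypothetical 3-subject matrix of average distances (diagonal = intra-subject)
toy = [
    [0.03, 0.05, 0.06],
    [0.05, 0.02, 0.01],
    [0.06, 0.01, 0.04],
]
print(count_identification_errors(toy))        # 2
print(round(100 * success_rate(30, 18), 3))    # 90.196
print(round(100 * success_rate(91, 47), 3))    # 95.791
```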
2) Subject-Comparison: Two 8-segment data sets are taken from two distinct subjects who belong to the same ECG database, and the distance between each pair of subjects is evaluated. Table 2 lists these 64 inter-subject distances for the pair of subjects, ID numbers 100 and 101, from the MIT-BIH arrhythmia database.

Table 2 Subject-comparison between subjects ID number 100 and 101
Segment  1      2      3      4      5      6      7      8
1        0.045  0.040  0.051  0.057  0.047  0.046  0.045  0.044
2        0.051  0.041  0.055  0.055  0.052  0.048  0.049  0.052
3        0.052  0.041  0.056  0.060  0.052  0.046  0.043  0.048
4        0.049  0.045  0.051  0.054  0.050  0.053  0.037  0.047
5        0.043  0.041  0.038  0.060  0.037  0.045  0.037  0.042
6        0.052  0.039  0.052  0.065  0.046  0.048  0.045  0.040
7        0.052  0.046  0.057  0.056  0.054  0.055  0.044  0.052
8        0.043  0.041  0.043  0.060  0.039  0.047  0.037  0.041

Next, we combine results from both comparisons to obtain the average intra-group distance, namely, the average distance between subjects in the same database. Table 3 records all average intra-group distances for subjects ID number 100 through 107.

The execution time for one comparison cycle of the proposed algorithm is recorded in Table 5. Such a short execution time indicates that the proposed scheme is highly efficient.

Table 5 One cycle execution time of the RBP algorithm
Algorithm  Cost time per cycle
RBP        0.013305 s

V. CONCLUSIONS
In this paper, we propose a novel statistical-based algorithm for ECG human identification. Tests on subjects selected from the two public MIT-BIH databases show that the proposed Reduced Binary Pattern algorithm not only runs in a timely manner with low computational complexity but is also effective in ECG human identification. Our future work will focus on improving the RBP algorithm by tuning its parameters, for example, m and the length of the input raw ECG data.
REFERENCES
[1] H. Zhang, L. Q. Zhang, “ECG Analysis based on PCA and Support Vector Machines,” Proceedings on the International Conference on Neural Networks and Brain, Beijing, pp. 743-747, October 2005.
[2] T. Stamkopoulos, K. Diamantaras, N. Maglaveras, M. Strintzis, “ECG analysis using nonlinear PCA neural networks for ischemia detection,” IEEE Transactions on Signal Processing, Vol. 46, pp. 3058-3067, 1998.
[3] M. S. Nambakhsh, A. Ahmadian, M. Ghavami, R. S. Dilmaghani, and S. F. Karimi, “A Novel Blind Watermarking of ECG Signals on Medical Images Using EZW Algorithm,” Proceedings on the 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, New York, pp. 3274-3277, Aug. 2006.
[4] Ibaida, I. Khalil and R. V. Schyndel, “A Low Complexity High Capacity ECG Signal Watermark for Wearable Sensor-net Health Monitoring System,” Computing in Cardiology 2011 (CinC 2011), Hangzhou, China, September 2011.
[5] S. G. Miaou, H. L. Yen, and C. L. Lin, “Wavelet-based ECG compression using dynamic vector quantization with tree codevectors in single codebook,” IEEE Trans. Biomed. Eng., Vol. 49, pp. 71-80.
[6] K. Ranjeet, A. Kumar and R. K. Pandey, “ECG Signal Compression Using Different Techniques,” Communications in Computer and Information Science, Vol. 125, pp. 231-241, 2011.
[7] C. C. Chiu, C. M. Chuang, and C. Y. Hsu, “Discrete Wavelet Transform Applied on Personal Identity Verification with ECG Signal,” International Journal of Wavelets, Multiresolution and Information Processing, Vol. 7, pp. 341-335, 2009.
[8] A. D. C. Chan, M. M. Hamdy, A. Badre, and V. Badee, “Wavelet Distance Measure for Person Identification Using Electrocardiograms,” IEEE Transactions on Instrumentation and Measurement, Vol. 57, pp. 248-253.
[9] A. D. C. Chan, M. M. Hamdy, A. Badre, and V. Badee, “Person Identification using Electrocardiograms,” Canadian Conference on Signals, Electrical and Computer Engineering (CCECE'06), Ottawa, Ontario, pp. 1-4, May 2006.
[10] C. Ye, M. T. Coimbra, B. V. K. V. Kumar, “Investigation of Human Identification using Two-Lead Electrocardiogram (ECG),” Proceedings on the Fourth IEEE International Conference on Biometrics: Theory Applications and Systems (BTAS 2010), Washington, DC, pp. 27-29, September 2010.
[11] A. D. C. Chan, M. M. Hamdy, A. Badre, and V. Badee, “Wavelet Distance Measure for Person Identification Using Electrocardiograms,” IEEE Transactions on Instrumentation and Measurement, Vol. 57(2), pp. 248-253, 2008.
[12] Y. Wang, F. Agrafioti, D. Hatzinakos, and K. N. Plataniotis, “Analysis of human electrocardiogram ECG for biometric recognition,” EURASIP Journal on Advances in Signal Processing, Vol. 2008, Article No. 20, 2008.
[13] P. Sasikala and R. S. D. Wahidauanu, “Identification of Individuals using Electrocardiogram,” IJCSNS International Journal of Computer Science and Network Security, Vol. 10(12), 2010.
[14] S. Saechia, J. Koseeyaporn, P. Wardkein, “Human Identification System Based ECG Signal,” IEEE Region 10 TENCON 2005, Melbourne, Qld, pp. 21-24, Nov. 2005.
[15] L. Biel, O. Pettersson, L. Philipson, and P. Wide, “ECG Analysis: A new approach in human identification,” IEEE Trans. on Instrumentation and Measurement, Vol. 1, pp. 557-561, 2001.
[16] S. A. Israel, J. M. Irvine, C. Andrew, D. W. Mark and K. W. Brenda, “ECG to Identify Individuals,” Pattern Recognition, Vol. 38(1), pp. 133-142, 2005.
[17] Y. N. Singh and P. Gupta, “ECG to Individual Identification,” 2nd IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS 2008), Arlington, Oct. 2008.
[18] H. Silva, H. Gamboa, and A. Fred, “One Lead ECG Based Personal Identification with Feature Subspace Ensembles,” Machine Learning and Data Mining in Pattern Recognition, 5th International Conference, MLDM 2007, Vol. 4571, pp. 770-783, 2007.
[19] Z. Zheng and D. Wei, “A New ECG Identification Method Using Bayes’ Theorem,” IEEE Region 10 TENCON 2006, Hong Kong, China, pp. 1-4, Nov. 2006.
[20] F. Agrafioti and D. Hatzinakos, “ECG Based Recognition Using Second Order Statistics,” 2008 Communication Networks and Services Research Conference (CNSR 2008), pp. 82-87, Nov. 2008.
[21] F. F. Zeng, K. K. Tseng, M. Zhao, J. S. Pan, et al., “Biometric Electrocardiogram Card for Access Control System,” Fifth International Conference on Genetic and Evolutionary Computing (ICGEC-2011), Kinmen, Taiwan/Xiamen, China, pp. 373-376, August 2011.
[22] A. C.-C. Yang, S.-S. Hseu, H.-W. Yien, A. L. Goldberger, and C.-K. Peng, “Linguistic Analysis of the Human Heartbeat Using Frequency and Rank Order Statistics,” Physical Review Letters, Vol. 90(10), id 108103, pp. 1-4, 2003.
[23] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, et al., “PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals,” Circulation, Vol. 101(23), pp. e215-e220, June 2000.
[24] G. B. Moody, R. G. Mark, “The impact of the MIT-BIH Arrhythmia Database,” IEEE Engineering in Medicine and Biology Magazine, Vol. 20(3), pp. 45-50, May 2001.