Pattern Recognition Methods for SAW Sensor Array Based Electronic Nose

S. K. Jha, R. D. S. Yadava
Department of Physics, Faculty of Science, Banaras Hindu University, Varanasi, India.
(e-mail:
[email protected],
[email protected],
[email protected])
Abstract – The paper presents an analysis of several preprocessing, feature extraction and classification methods in combination, aimed at yielding optimum performance for SAW sensor array based electronic nose systems. It is found that the combination of logarithmic data preprocessing, linear discriminant analysis based feature extraction and support vector machine based classification yields optimum results.

Keywords – SAW sensor array, electronic nose, pattern recognition, odor recognition
I. INTRODUCTION

Development of electronic nose technology based on vapor sensor arrays is important for monitoring industrial hazards, environmental pollution, homeland security, food quality and public health. An electronic nose consists of a vapor-selective sensor array integrated with a pattern recognition system. Most sensor array developments have used tin-oxide chemiresistor, composite conducting polymer chemiresistor, quartz crystal microbalance and surface acoustic wave (SAW) oscillator sensors [1-4]. In response to a vapor stimulus, the different sensors in the array generate a set of outputs, which is analyzed by the pattern recognition system to determine the vapor identity, analogous to the way the mammalian smell-sensing organ functions.

The pattern recognition system consists of data preprocessing, feature extraction and classification stages [5-6]. The performance of the system depends on an optimal combination of the algorithms used for signal processing at each of these stages. The purpose of preprocessing is to prepare the raw data in such a way that non-vapor-specific signal contents are either removed, or suitably normalized and scaled, so as to enable accurate and reliable extraction of features or vapor descriptors. A linear feature extraction method such as principal component analysis (PCA) needs input data that are linearly related to the vapors' characteristic parameters. Therefore, if linear PCA is used for feature extraction, it is important to also linearize the data at the preprocessing stage with respect to the (hidden) intrinsic vapor variables. The accuracy of the feature extractor is crucial for the performance of the classifier. In most electronic nose data processing studies, some form of noise removal, baseline correction, vector autoscaling and dimensional autoscaling has been employed at the preprocessing stage, most commonly combined with linear PCA and artificial neural network
(ANN) for pattern recognition [2, 7-9]. In this strategy, the design of the preprocessor needs special attention, as it is here that the information-coded data is presented to the pattern analyzer. In particular, linearization of the data is crucial if the feature extractor is a linear processor. Linearization, however, needs knowledge of the exact dependencies of the sensor outputs on the variables that define vapor identity. This knowledge can come only from accurate theoretical modeling of the sensor responses. In other words, the data linearization procedure must aim to remove any nonlinearity according to the functional dependencies inherent in the sensor model.

Recently, an effort of this kind was made in a study of polymer-coated SAW sensor arrays [10]. The SAW sensor model in that work revealed that the logarithm of the sensor output is a linear combination of the partial free energies associated with the different mechanisms of chemical interaction between a vapor molecule and the polymer matrix; see Eq. (26) of [10]. Thus, by taking the strengths of specific chemical interaction processes to be the descriptors of vapor identity, a logarithmic scaling of the measured data is suggested. In that paper, a detailed analysis of real data based on PCA and ICA (independent component analysis) with logarithmic preprocessing was also presented. A further elaboration followed in [11], which analyzed several SAW sensor array datasets in combination with a neural network classifier. It was found that logarithmic preprocessing enhances interclass separation in the PCA-generated feature space and substantially improves the classification efficiency of the neural network.

In the present study, we analyze which of the feature extraction algorithms PCA, ICA and LDA (linear discriminant analysis), and which of the classification algorithms ANN, KNN (K-nearest neighbor), SVM (support vector machine) and NB (naive Bayes), performs best in combination with logarithmic preprocessing of SAW sensor array data. The aim is to determine an efficient pattern recognition system for SAW electronic nose systems.

II. DATA PROCESSING

A. Data

Two SAW sensor array datasets, collected from [12] and [13], were used in this analysis. These data were also used in our earlier analyses [10, 11]. The first set (Set-1), taken from Table 3 of [12], is 3-sensor array data exposed
to 125 vapor samples divided into two classes (nerve agent and non-nerve agent). The other set (Set-2), taken from Table 3 of [13], is the response of a 9-sensor array exposed to 40 vapor samples of hazardous chemicals belonging to nine different classes. The sensors are polymer-coated SAW delay-line oscillators; further details are summarized in [10, 11]. The sensor outputs are changes in oscillator frequency due to equilibrated mass loading of the polymers under vapor sorption.

B. Preprocessor

The preprocessor in this analysis is the same as implemented earlier [10, 11]. The raw sensor outputs Δf are first normalized with respect to the vapor concentration ρ′ and the frequency shift Δf_p due to the polymer coating, and then logarithmically scaled according to x = ln(Δf / ρ′Δf_p). Finally, mean-centering and variance normalization are done across vapor samples for each sensor. This final step is usually referred to as dimensional autoscaling.

C. Feature Extraction and Classification

Two feature extraction methods (PCA and ICA) were used in combination with the four classifiers ANN, KNN, SVM and NB; in addition, LDA has been used in the present analysis. LDA combines feature extraction and classification in a single algorithm. The analysis is done using functions available in the 'R' packages [14-17]: 'stats' for PCA, 'fastICA' for ICA, 'MASS' for LDA, 'nnet' for ANN, 'class' for KNN, and 'e1071' for SVM and NB. A 2 × 3 × 2 backpropagation ANN architecture with sigmoidal activation in the hidden layer and linear activation in the output layer has been used; the network converges after nearly 1000 iterations. The KNN yields best results with k = 5. The performance of the SVM with a radial Gaussian kernel is found to be best for kernel parameter γ = 0.5. The NB classifier yields best results with the default values of the model parameters. A minimal R sketch of this processing chain is given below.

III. RESULTS

Fig. 1 and Fig. 2 show the feature components obtained by PCA, ICA and LDA for the two datasets transformed through logarithmic scaling. Note that for a two-class problem LDA generates only one linear discriminant; the y-axis in Fig. 1(c) therefore shows the sample number for visualization purposes only. The classification results for this dataset were obtained by using 90 samples (40 nerve + 50 non-nerve) out of 125 for training, and the remaining 35 (12 nerve + 23 non-nerve) for testing. The results for the different feature extractor and classifier combinations are given in Table I, which also shows results for the same combinations without logarithmic scaling of the raw data. The impact of logarithmic scaling is quite apparent for all processing combinations. It is difficult to judge with certainty the superior performance of one classifier over the others from this limited result; however, the SVM appears to yield consistently high performance with all the feature extraction methods. Set-2 contains only a limited number of data points, and it is therefore not feasible to carry out classifier testing with it. Fig. 2, however, shows the various feature space projections; from visual examination of this plot, one can conclude that LDA produces the maximum interclass separation as well as intraclass compaction.
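As a concrete illustration, the processing chain of Section II can be sketched with the R packages cited above [14-17]. This is a minimal sketch rather than the authors' original scripts: the objects dF (125 × 3 matrix of frequency shifts for Set-1), rho (vapor concentrations per sample), dfp (polymer coating frequency shifts per sensor) and cls (class labels) are assumed placeholder names.

```r
## Minimal sketch (assumed inputs): dF = 125 x 3 matrix of sensor frequency shifts,
## rho = vapor concentration per sample, dfp = coating frequency shift per sensor,
## cls = factor of class labels (nerve / non-nerve).
library(fastICA)   # ICA
library(MASS)      # LDA

## Logarithmic preprocessing: x_ij = ln( dF_ij / (rho_i * dfp_j) )
X <- log(sweep(sweep(dF, 1, rho, "/"), 2, dfp, "/"))

## Dimensional autoscaling: mean-centering and unit variance per sensor (column)
X <- scale(X)

## Feature extraction
pca    <- prcomp(X)                 # 'stats': principal component scores in pca$x
ica    <- fastICA(X, n.comp = 3)    # 'fastICA': independent components in ica$S
ldafit <- lda(X, grouping = cls)    # 'MASS': one discriminant for a two-class problem
ldasc  <- predict(ldafit)$x         # linear discriminant scores
```

The score matrices pca$x, ica$S and ldasc correspond to the feature space projections plotted in Fig. 1.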
Fig. 1. Feature space projections of the dataset-1 generated by (a) PCA, (b) ICA, and (c) LDA after logarithmic scaling of the raw data followed by dimensional autoscaling.
TABLE I
CLASSIFICATION RESULTS FOR DATA SET-1

Feature Extraction   Type of       Classification Rate in %
Method               Classifier    Without log scaling   With log scaling
-------------------------------------------------------------------------
PCA                  ANN           66                    94
                     KNN           83                    91
                     SVM           88                    94
                     NB            87                    94
ICA                  ANN           86                    94
                     KNN           94                    97
                     SVM           94                    97
                     NB            89                    94
LDA                  ANN           89                    91
                     KNN           89                    91
                     SVM           89                    97
                     NB            86                    97
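As an illustration of how the classifier comparison of Table I can be carried out with the R packages listed in Section II, the following sketch trains the four classifiers on two features and reports test classification rates. It is a sketch under assumed names (pca and cls from the earlier sketch) and an assumed random 90/35 split, not the exact runs behind Table I; the same code applies to the ICA or LDA features.

```r
## Sketch of classifier training/testing for Dataset-1 (assumed objects: pca, cls).
library(e1071)   # svm, naiveBayes
library(class)   # knn
library(nnet)    # nnet, class.ind

set.seed(1)
feats <- pca$x[, 1:2]                      # two input features (e.g. first two PCs)
itr   <- sample(nrow(feats), 90)           # 90 training samples, 35 for testing
xtr   <- feats[itr, ];  ytr <- cls[itr]
xte   <- feats[-itr, ]; yte <- cls[-itr]

rate <- function(pred) 100 * mean(pred == yte)   # classification rate in %

## SVM with radial (Gaussian) kernel, gamma = 0.5; fit_svm$index lists the few
## training vectors retained as support vectors
fit_svm <- svm(xtr, ytr, kernel = "radial", gamma = 0.5)
print(rate(predict(fit_svm, xte)))

## K-nearest neighbor with k = 5
print(rate(knn(xtr, xte, ytr, k = 5)))

## Naive Bayes with default model parameters
fit_nb <- naiveBayes(data.frame(xtr), ytr)
print(rate(predict(fit_nb, data.frame(xte))))

## 2-3-2 backpropagation ANN: 3 sigmoidal hidden units, linear output units
fit_nn <- nnet(xtr, class.ind(ytr), size = 3, linout = TRUE,
               maxit = 1000, trace = FALSE)
print(rate(levels(ytr)[max.col(predict(fit_nn, xte))]))
```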
IV. DISCUSSION

The present results reveal some underlying compatibilities between the different feature extraction and classification methods, besides being consistent with the earlier findings [10, 11] on the significance of the sensor-model-suggested logarithmic scaling as an important preprocessing step for SAW sensors. As found earlier [10], ICA scores over PCA for these sensors. This indicates that the statistical characteristics of SAW sensor array data are not fully Gaussian: PCA works only on second-order statistics, which are sufficient to describe Gaussian random variables, whereas ICA exploits the higher-order statistics (non-Gaussianity) of the data. LDA as a feature extractor performs the best. This is not surprising, because this method makes use of the training information to maximize between-class scatter and minimize within-class scatter. Among the classifiers, SVM computes the class decision function using only a select few of the training feature vectors, termed support vectors. SVM therefore inherently favors good training vectors over those that might create confusion, whereas the other classifiers (ANN, KNN and NB) make use of the complete training data. Possibly this is the reason for the relatively better performance of SVM. It is interesting to note that NB in combination with LDA performs equally well.

V. CONCLUSION

The present study suggests that the processing of polymer-coated SAW sensor array data for odor recognition must optimize the different data processing steps by carefully identifying the functional dependencies of the sensor outputs on the odor parameters. This necessitates the development of accurate sensor models. In the present case, the sensor-model-suggested logarithmic scaling of the raw data substantially improves the performance of the subsequent processing steps. The study further seeks a high-performance combination of feature extraction and classification methods for SAW electronic noses. Experiments with three widely used feature extractors and four classifiers suggest that the (LDA + SVM) combination has the potential to yield the best results.
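To make the support-vector argument in the discussion above concrete, the number of training vectors actually retained by the SVM fit can be inspected directly (a minimal check, assuming the fit_svm and itr objects from the sketch following Table I):

```r
## The SVM decision function is defined by only a subset of the 90 training
## vectors (its support vectors).
cat(sprintf("support vectors: %d of %d training samples\n",
            length(fit_svm$index), length(itr)))
```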
Fig. 2. Feature space projections of the dataset-2 generated by (a) PCA, (b) ICA, and (c) LDA after logarithmic scaling of the raw data followed by dimensional autoscaling.
ACKNOWLEDGMENT

This work was supported by the Government of India under Defence Research & Development Organization Grant No. ERIP-ER-0703643-01-1025. The author SKJ acknowledges the financial support of the Directorate of Forensic Sciences, New Delhi, in the form of a Senior Research Fellowship, and is thankful to the Director, CFSL, Chandigarh, for extending full support and cooperation. The authors also thank the authors whose published experimental data have been used in the present analysis.

REFERENCES

[1] F. Röck, N. Barsan, and U. Weimar, "Electronic Nose: Current Status and Future Trends", Chem. Rev., vol. 108, no. 2, pp. 705-725, 2008.
[2] T. C. Pearce, S. S. Schiffman, H. T. Nagle, and J. W. Gardner, Handbook of Machine Olfaction: Electronic Nose Technology. Weinheim, Germany: Wiley-VCH Verlag GmbH & Co. KGaA, 2003, pp. 125-127.
[3] J. W. Grate, "Acoustic Wave Microsensor Arrays for Vapor Sensing", Chem. Rev., vol. 100, no. 7, pp. 2627-2648, 2000.
[4] P. C. Jurs, G. A. Bakken, and H. E. McClelland, "Computational Methods for the Analysis of Chemical Sensor Array Data from Volatile Analytes", Chem. Rev., vol. 100, no. 7, pp. 2649-2678, 2000.
[5] A. Jain, R. Duin, and J. Mao, "Statistical Pattern Recognition: A Review", IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4-37, 2000.
[6] S. Theodoridis and K. Koutroumbas, Pattern Recognition. San Diego, CA: Academic, 2003.
[7] R. Gutierrez-Osuna and H. T. Nagle, "A method for evaluating data-preprocessing techniques for odor classification with an array of gas sensors", IEEE Trans. Systems, Man, and Cybernetics B, vol. 29, no. 5, pp. 626-632, 1999.
[8] P. McAlernon, J. M. Slater, and K. T. Lau, "Mapping of chemical functionality using an array of quartz crystal microbalances in conjunction with Kohonen self-organizing maps", Analyst, vol. 124, pp. 851-857, 1999.
[9] J. W. Gardner, M. Craven, C. Dow, and E. L. Hines, "The prediction of bacteria type and culture growth phase by an electronic nose with a multi-layer perceptron network", Meas. Sci. Technol., vol. 9, no. 1, pp. 120-127, 1998.
[10] R. D. S. Yadava and R. Chaudhary, "Solvation, transduction and independent component analysis for pattern recognition in SAW electronic nose", Sens. Actuators B: Chem., vol. 113, no. 1, pp. 1-21, 2006.
[11] S. K. Jha and R. D. S. Yadava, "Preprocessing of SAW Sensor Array Data and Pattern Recognition", IEEE Sensors J., vol. 9, no. 10, pp. 1202-1208, 2009.
[12] S. L. Rose-Pehrson, D. D. Lella, and J. W. Grate, "Smart sensor system and method using surface acoustic wave vapor sensor array and pattern recognition for selective trace organic vapor detection", U.S. Patent 5,469,369, Nov. 21, 1995.
[13] S. L. Rose-Pehrson, J. W. Grate, D. S. Ballantine Jr., and P. C. Jurs, "Detection of hazardous vapors including mixtures using pattern recognition analysis of responses from surface acoustic wave devices", Anal. Chem., vol. 60, no. 24, pp. 2801-2811, 1988.
[14] R Development Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0, URL http://www.R-project.org, 2008.
[15] J. L. Marchini, C. Heaton, and B. D. Ripley, fastICA: FastICA Algorithms to perform ICA and Projection Pursuit. R package version 1.1-9, 2007.
[16] W. N. Venables and B. D. Ripley, Modern Applied Statistics with S, Fourth Edition. New York: Springer, ISBN 0-387-95457-0, 2002.
[17] E. Dimitriadou, K. Hornik, F. Leisch, D. Meyer, and A. Weingessel, e1071: Misc Functions of the Department of Statistics (e1071), TU Wien. R package version 1.5-18, 2008.