IEEE SENSORS JOURNAL, VOL. 11, NO. 1, JANUARY 2011


Denoising by Singular Value Decomposition and Its Application to Electronic Nose Data Processing

Sunil K. Jha and R. D. S. Yadava

Abstract—This paper analyzes the role of singular value decomposition (SVD) in denoising sensor array data of electronic nose systems. It is argued that the SVD decomposition of the raw data matrix distributes additive noise over orthogonal singular directions representing both the sensor and the odor variables. The noise removal is done by truncating the SVD matrices up to a few largest singular value components, and then reconstructing a denoised data matrix by using the retained singular vectors. In electronic nose systems this method seems to be very effective in reducing noise components arising from both the odor sampling and delivery system and the sensor electronics. Feature extraction by principal component analysis based on the SVD denoised data matrix is seen to reduce separation between samples of the same class and increase separation between samples of different classes. This is beneficial for improving the classification efficiency of electronic noses by reducing overlap between classes in feature space. The efficacy of the SVD denoising method in electronic nose data analysis is demonstrated by analyzing five data sets available in the public domain which are based on surface acoustic wave (SAW) sensor, conducting composite polymer sensor, and tin-oxide sensor arrays.

Index Terms—Denoising, electronic nose data processing, odor classification, singular value decomposition (SVD).

I. INTRODUCTION

The electronic nose systems based on sensor arrays composed of a set of microelectronic chemical sensors rely on: 1) the abilities of individual sensors to generate output by combining in some way contributions from latent variables of odors and 2) the efficiency of data processing methods to build a parametric representation of the measured array responses in such a way that individual odor types (or classes) are associated with distinctly different sets of values of these parameters. The parameters are mathematical descriptors of the odor identities, and the sets of their values specific to different odor classes represent their mathematical signature [1]. These are variously referred to as the feature vector, odorprint, or chemical fingerprint.

Manuscript received November 22, 2009; revised March 10, 2010; accepted April 20, 2010. Date of publication June 07, 2010; date of current version October 29, 2010. This work was supported in part by the Defence Research and Development Organization (Government of India) under Grant ERIP-ER-0703643-01-1025 and in part by the Department of Science and Technology (Government of India) under Grant DST-TSG-PT-2007-06. The associate editor coordinating the review of this paper and approving it for publication was Prof. Ralph Etienne-Cummings. The authors are with the Department of Physics, Faculty of Science, Banaras Hindu University, Varanasi 221005, India (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JSEN.2010.2049351

To achieve this, the individual sensors in the array are chosen to provide varying degrees of cross sensitivities for different chemical constituents in odors. The set of sensor array outputs corresponding to an odor sample thus constitutes varied realizations of the intrinsic variables of the odor identification problem. If the intrinsic odor variables are independent and if the sensor responses with regard to them are linear, then the multivariate problem can be modeled as a set of linear parametric equations, and one can expect to build a unique set of mathematical variables through linear combination of the sensor outputs to represent an odor sample. This consideration provides the basis for a number of feature extraction and pattern classification methods based on linear matrix algebra [1]–[4]. Theoretically, to seek a unique odor signature from a measured array response, the number of sensors in the array (that is, the number of linear equations) should be at least equal to the number of anticipated latent variables. The presence of noise and uncertainty in odorants (discussed in detail later), however, brings stochasticity to the problem, which is handled by statistical estimation methods for the odor parameters. To make a reasonably accurate estimate of them, one collects sensor array response data for a large number of samples (much larger than the number of sensors) and analyzes these data using multivariate statistical methods.

The sensors in the array represent independent variables of the measurement space (data space), and their outputs are the measured coordinates of the odor samples. If $m$ sensors measure $n$ odor samples, then the whole dataset can be represented by an $n \times m$ data matrix whose rows represent the samples as $m$-dimensional vectors in the data space. The intrinsic variables of the vapors are latent in the data matrix. The first goal of any data processing procedure is to establish an efficient feature extraction procedure, often based on multivariate statistical methods. This requires collecting data on a large number of samples from various target odor classes. Descriptions of the various data processing methods employed in the domain of electronic nose applications are presented in a number of publications [1], [5]–[10].

In linear matrix algebra, the rank of an $n \times m$ matrix represents the number of linearly independent vectors defined either along rows or columns, and is always less than or equal to the smaller of the two dimensions, $\min(n, m)$ [11]. The data matrix generated by the $m$-sensor array has the $n$ sample vectors in its rows, and $n \gg m$; therefore, the rank of the data matrix would be $r \le m$. The rank of the matrix in this context may be interpreted as the number of independent latent variables of the vapor detection problem, provided the outputs of the individual sensors are caused by linear combinations of these variables. Most electronic nose data processing techniques implicitly assume this to be so.


For example, the singular value decomposition (SVD) and linear principal component analysis (PCA) are the two most commonly used such methods [12]. Both methods seek to decorrelate the data space in terms of a set of new data variables which are mutually orthogonal and optimize certain statistical measures (singular values or variances). The data space redefined by these new variables is called the feature space. The transformation of measurements from data space to feature space hopefully brings greater separation between odors of different classes. This enables discrimination of odors and declaration of their identities based on certain measures of distance and separation in the feature space.

Application of feature extraction methods to sensor array data analysis faces difficulties if the measured data are noisy. This is inevitable in any measurement. The noise processes and their influence on target-discriminating data, however, may pose varied difficulty levels depending upon the data source and the data collection technique. The data generated by electronic nose sensors monitoring steady-state conditions of vapor-sensor interactions contain noise contributions from the odor sampling and delivery system, chemical interferents, inhomogeneities of chemical interfaces, cross-transduction processes, thermal fluctuations, power drift, sensor circuit instability, and electrical interferences in the sensor electronics. The noise processes could be additive or non-additive in the sensor output [13]. A non-additive noise source causes random fluctuations in the transduction efficiencies related to the various latent variables of the stimulus; such sources effectively generate spurious correlations between independent latent variables. The additive noise variables are uncorrelated with the transduction variables, and their influence on the sensor outputs can be represented as a summation with the true signal. Therefore, the additive noise variables can be treated as being similar to the independent variables of the vapor samples. Application of linear decorrelation methods like SVD and PCA should then transform the additive noise contributions along separate orthogonal directions in feature space.

The SVD has been widely used for denoising and compression of data in the field of image processing [14]–[16]. However, its use for denoising the steady-state responses of electronic nose sensors has not come to our notice. Keeping in view the mathematical basis of SVD and the type of dominant noise sources in sensor array based electronic nose systems, we argue in the following section that the SVD based noise removal method must be very effective in eliminating fluctuations related to the vapor variables. This situation is in some way unique to vapor sensor systems. Further, in the next two sections, using PCA for feature extraction and an artificial neural network (ANN) for classification, the efficacy of this approach in noise reduction is demonstrated by analyzing some vapor sensor array data collected from published sources.

II. DENOISING BY SINGULAR VALUE DECOMPOSITION

A. SVD Basis for Denoising

In order to see the role of SVD in denoising sensor array data, it is pertinent to consider its mathematical basis, and also its relation to PCA. Both SVD and PCA are decorrelation techniques which seek an orthogonal basis for representation of the data space (assumed to be linear). Let $X$ denote the real-valued $n \times m$ data matrix and $r$ denote its rank.

The rank of a matrix denotes the number of independent hidden variables of the data. We assume that $r < m$, that is, the number of independent data variables is less than the number of data space variables (sensors). By making use of the SVD theorem, and the orthonormal and eigenvalue properties of the SVD matrices, it can be shown that the data matrix can be given the full rank SVD expansion [17], [18]

$X = \sum_{i=1}^{r} \sigma_i \, u_i v_i^T$   (1)

where $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$ are the singular values of $X$, $u_i$ ($i = 1, \ldots, r$) are the left singular vectors, and $v_i$ ($i = 1, \ldots, r$) are the right singular vectors. Each term of the summation is a rank-1 matrix. One may recall that the left and the right singular vectors $u_i$ and $v_i$ are the eigenvectors of $XX^T$ and $X^TX$ respectively, with $\sigma_i^2$ being the respective eigenvalues [17], [18].

It may be noted that the dimensions of the left singular vectors are determined by the number of rows (or vapor samples) in the data matrix. The left singular vectors therefore provide a basis for the vapors as variables. Similarly, the dimensions of the right singular vectors are determined by the number of columns (or sensors in the array) of the data matrix. The right singular vectors therefore provide a basis for the sensors as variables. The eigenvectors are linearly independent variables which form the basis for representation of the observations. In sensor array data analysis the data matrix can therefore be interpreted mathematically in two ways: columns as variables and rows as observations, or rows as variables and columns as observations. In view of this, the right singular vectors make the basis for the sensors as variables; they may be called virtual sensors. Similarly, the left singular vectors make the basis for the vapors as variables, and may be called virtual vapors. The latter means that there are $r$ linearly independent normalized vapor dimensions which can be combined to reconstruct the sensor outputs. The individual terms of the full rank SVD expansion (1) can be interpreted as projections of the virtual sensor dimensions ($v_i$), weighted by their singular values ($\sigma_i$), onto the corresponding virtual vapor dimensions ($u_i$). This is an important point to note for exploiting SVD for denoising.

The data matrix is thus composed of a sum of $r$ matrices of size $n \times m$, each formed by $u_i v_i^T$ and weighted by decreasing singular values. Since the rank of the data matrix is assumed to be lower than the number of sensor variables, all the singular values appearing after the rank number will be zero. Therefore, in the SVD expansion (1) the matrices beyond $i = r$ can be dropped. The SVD expansion also provides the basis for dimensionality reduction, where the lowest singular value terms beyond a desired rank, say $k < r$, are eliminated.

The full rank decomposition (1) implicitly assumes that the data generation processes are deterministic and noise free. This is not true for any real physical measurement. Almost all experimental observations contain noise of varying amounts. However, if the noise processes that corrupt the signals at the generation or processing stages are additive, then the experimental data matrix can be modeled as

$X = X_0 + N$   (2)
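The following is a minimal numerical sketch of the full rank expansion (1) and the additive noise model (2); it is our own illustration in Python with synthetic data (the array sizes and noise level are assumptions), not part of the original paper.

```python
import numpy as np

# Synthetic n x m sensor-array data: rows = vapor samples, columns = sensors.
rng = np.random.default_rng(0)
n_samples, n_sensors, true_rank = 40, 9, 3

# Low-rank "true" responses X0 plus additive noise, as in the model of (2).
X0 = rng.normal(size=(n_samples, true_rank)) @ rng.normal(size=(true_rank, n_sensors))
X = X0 + 0.05 * rng.normal(size=(n_samples, n_sensors))

# Thin SVD: U holds the left singular vectors ("virtual vapors"),
# Vt the right singular vectors ("virtual sensors"), s the singular values.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Equation (1): X is the sum of rank-1 matrices sigma_i * u_i * v_i^T.
X_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
print(np.allclose(X, X_rebuilt))   # True: the full rank expansion reproduces X
print(s)                           # singular values in decreasing order
```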


where $X_0$ denotes the noise free data matrix and $N$ denotes the full rank noise matrix. The rank of the noise free data matrix may be much lower than the full rank $\min(n, m)$. In this model, the noise contribution to each sensor output is assumed to be a realization of independent random variables. Each measurement (data) vector is thus the result of the addition of an $m$-dimensional true signal vector and an $m$-dimensional noise vector. Therefore, theoretically it can be expected that the SVD of the perturbed data matrix will divide the data space into true signal and noise subspaces pertaining to separate sets of singular values. However, there could be some regions of overlap depending on the nature and strength of the signal and noise variables. This makes an estimation of the true signals in the presence of noise difficult, unless an accurate description of the noise variables is available, which is often not the case.

To overcome this difficulty a number of stochastic estimation procedures have been proposed based on certain assumptions about the statistical behavior of the noise sources. For example, in [19] the noise variables are assumed to be independent and identically distributed (i.i.d.) Gaussian, and the effective rank of the noisy data matrix is estimated by applying a threshold criterion that distinguishes between significant and insignificant singular values. This kind of estimation by SVD truncation up to the effective rank is a tradeoff between loss of true signal and gain due to noise reduction, but it needs some prior knowledge about the strength of the noise. Ideally, if $p$ denotes the rank of the noise free data and $\hat{X}$ denotes the truncated SVD expansion, then the denoised data matrix becomes

$\hat{X} = \sum_{i=1}^{p} \sigma_i \, u_i v_i^T = U \hat{\Sigma} V^T$   (3)

where $\hat{\Sigma}$ denotes the singular value matrix after setting all singular values with $i > p$ to zero. In a practical situation, an estimate of the effective rank may be used to compute the denoised data matrix. An effective rank estimate separates most of the noise space without much loss of the information space. The success of threshold filtering by effective rank depends on the formulation of the threshold criterion and some good estimate of the threshold value. A commonly used method is to estimate the effective rank such that the retained singular values satisfy $\sigma_i > \delta$ for a predefined threshold $\delta$. Defining a threshold, however, needs knowledge about the statistical behavior of the noise. By assuming an i.i.d. Gaussian noise model with zero mean and fixed variance, Konstantinides and Yao [19] defined three new stochastic threshold bounds besides listing five others defined earlier. They evaluated the efficiency of these criteria in data denoising through simulation experiments, and found that a particularly simple threshold bound, expressed in terms of the noise standard deviation and the matrix dimensions, is quite effective in removing most of the noise; its value, however, needs to be adjusted empirically. Natarajan [20] noted that it is hard to compress additive random noise when doing data compression by a linear compression algorithm; most compression occurs due to the deterministic true signals. He exploited this feature to introduce a new quantitative criterion for discriminating the noise subspace from the signal subspace. The method is based on the performance of a data compression algorithm: the compressed data size is plotted against the number of retained singular value components, and the plot shows a kink at the signal rank, with a steep slope below the kink and a low slope beyond it. He argued that the inefficient data compression beyond the kink is due to the difficulty in compressing noise, and thus defined the noise threshold bound at the kink.
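As an illustration of the truncation in (3), the following is a minimal numpy sketch (ours, not the authors' Matlab implementation). Retaining singular values above a chosen fraction of the first singular value is only one plausible effective-rank heuristic, suggested by the threshold-bound discussion above; the paper itself selects the rank heuristically from class separability.

```python
import numpy as np

def truncated_svd_denoise(X, p=None, rel_threshold=0.01):
    """Approximate X by its p largest singular value components, as in (3).

    If p is not given, an effective rank is estimated by retaining singular
    values larger than rel_threshold * sigma_1 (an illustrative heuristic).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    if p is None:
        p = int(np.sum(s > rel_threshold * s[0]))
    s_hat = np.where(np.arange(len(s)) < p, s, 0.0)   # zero the lowest singular values
    return (U * s_hat) @ Vt, p

# Synthetic rank-3 signal plus additive Gaussian noise, as in the model of (2).
rng = np.random.default_rng(1)
X0 = rng.normal(size=(110, 12)) @ np.diag([30.0, 20.0, 10.0] + [0.0] * 9)  # noise-free, rank 3
X = X0 + 0.05 * rng.normal(size=X0.shape)

X_hat, p = truncated_svd_denoise(X)
print(p)                                                     # estimated effective rank
print(np.linalg.norm(X - X0), np.linalg.norm(X_hat - X0))    # error shrinks after truncation
```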

Konstantinides et al. [21] later demonstrated the effectiveness of this criterion for SVD based data compression and noise elimination. It should be noted, however, that this definition of the noise threshold bound is linked to the efficiency of the data compression algorithm. Several authors have explored a number of other filtering methods as well. For example, Hassanpour [22] applied a Savitzky–Golay low-pass filter, Zarowski [23] applied Rissanen's minimum description length (MDL) criterion, Van der Veen [24] described a low rank approximation method based on left singular vectors, and Hansen and Jensen [25] presented a finite-impulse-response (FIR) filtering method to represent the truncated SVD.

In the present context of odor discrimination we applied SVD truncation by setting the threshold bound empirically, linking the effect of thresholding to the performance of PCA based feature extraction and neural network classification. The lowest singular values were set to zero successively, and for each setting the denoised data matrix was calculated by SVD truncation (3). PCA and neural network classification were then done based on the denoised data matrix of reduced rank, and the class separability in feature space and the classification accuracy were monitored. The truncation was continued heuristically until the best class discrimination was obtained for each data set. The effective rank of the data matrix is thus taken to be that which yields maximum class separation. Though at present this approach is not based on a well-defined quantitative measure, the positive impact of SVD based denoising of electronic nose data can be clearly seen in the results section.

B. SVD, PCA, and Denoising

The relation between SVD and PCA is discussed in [17], [18], [26], and [27]; a detailed clarification is given in [26]. The PCA finds orthogonal directions in the linear data space with maximum variance. This is done by estimating the eigenvectors of the covariance matrix formed by the covariances between the data variables (sensors in the present case). To calculate the covariances, the data matrix is first mean-centered with respect to the individual variables over the complete range of observations (vapor samples). The covariance between two sensors is defined as the expectation value of the product of their mean-centered outputs, estimated over the complete range of observations; the expectation is estimated by averaging over the samples. The covariance matrix is composed of all such possible covariances of the data matrix. This is done by mean-centering the columns of the data matrix, and then taking the inner products of the column vectors. Denoting the mean-centered data matrix by $\tilde{X}$, the elements of the covariance matrix are then given as

$C_{jk} = \frac{1}{n-1} \sum_{i=1}^{n} \tilde{x}_{ij} \, \tilde{x}_{ik}$   (4)

with $j, k = 1, \ldots, m$, and the covariance matrix is $C = \tilde{X}^T \tilde{X}/(n-1)$. If the SVD of the data matrix is also done on the basis of the mean-centered columns, then the matrix of right singular vectors and the eigenvector matrix of the covariance matrix are identical. This means that the right singular vectors of the mean-centered data matrix are also the eigenvectors of the covariance matrix, with corresponding eigenvalues $\lambda_i = \sigma_i^2/(n-1)$.
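The identity between the right singular vectors of the mean-centered data matrix and the PCA eigenvectors can be checked numerically; the short sketch below (ours, with synthetic data and the sample-covariance normalization used in (4)) does exactly that.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 9))                 # n = 40 samples, m = 9 sensors (synthetic)

Xc = X - X.mean(axis=0)                      # mean-center each column (sensor)
C = Xc.T @ Xc / (Xc.shape[0] - 1)            # covariance matrix, as in (4)

eigvals, eigvecs = np.linalg.eigh(C)         # PCA: eigen-decomposition of C
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]    # sort in decreasing order

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)     # SVD of the mean-centered data

# Right singular vectors match the PCA eigenvectors (up to sign),
# and the eigenvalues equal sigma_i^2 / (n - 1).
print(np.allclose(np.abs(Vt), np.abs(eigvecs.T)))
print(np.allclose(eigvals, s**2 / (Xc.shape[0] - 1)))
```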


The PCA therefore can be done based on the eigenvalue solutions of the SVD. By eliminating the low eigenvalue principal components one then expects that the PCA should also produce dimensionality reduction and denoising effects similar to the SVD, as discussed in the preceding subsection. Indeed it does, and it has been used extensively in multivariate data analysis [28]–[30].

In the present paper, however, we argue that SVD is much more effective in denoising. The reasons are as follows. A typical sensor array data set involves many times more vapor samples than sensors, $n \gg m$. This means that the full SVD provides $n$ left singular vectors and their associated singular directions, many more than the $m$ eigenvalues and eigenvectors available in the PCA. The PCA eigenvectors define uncorrelated virtual sensor directions, which are the same as those defined by the right singular vectors in the SVD. However, the SVD involves additional decorrelation among the vapor samples as well, represented by the left singular vectors and the associated singular values. The noise variables in the data therefore get represented on a larger number of uncorrelated directions in the SVD than in the PCA. Truncation of the full rank SVD matrices to a lower rank then cuts a larger portion of the noise spread over the lowest singular value components, and the data matrix regenerated from the truncated SVD approximates the original data with reduced noise.

This method of denoising seems to be well suited for vapor sensor array based electronic nose data for the following reasons. The noise variables associated with the odor samples, arising from the sampling methods, fluidics, and chemical interferents, will be represented in the lowest order left singular vectors. The noise variables associated with the sensor electronics and operating conditions will be represented in the lowest order right singular vectors. The SVD truncation discussed above would therefore eliminate noise variables of vapor origin more effectively than those of sensor origin, since $n \gg m$. One can reasonably expect that this method would clean the electronic nose data much more efficiently than dimensionality reduction by PCA. In fact, in any other domain of application where the noise in both the instances (samples) and the variables (sensors) is significant, this method can be expected to produce good results.

C. Data Processing and Software

Validation of the above described method of denoising is done here by analyzing five sets of vapor sensor array data collected from the literature [31]–[35]. The data are the responses of arrays of polymer coated surface acoustic wave (SAW) sensors, conducting composite polymer sensors, and tin-oxide sensors exposed to various vapor samples, as described in Section III. The denoising procedure is to take the SVD of the data matrix (sensors in columns and samples in rows), truncate it to rank $p$ by setting the lowest singular values, from $p+1$ to $r$, to zero, and then reconstruct the data matrix according to (3). Further processing is done by PCA for feature extraction using the reconstructed data matrix as input. The effect of SVD denoising is evaluated by examining the intraclass groupings and interclass separability in the principal component score plots; a sketch of this processing chain is given below.
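A compact end-to-end sketch of the procedure just described (SVD truncation followed by PCA score computation) is given here; it is our own Python illustration rather than the authors' Matlab code, and the synthetic matrix stands in for the real data sets of Section III.

```python
import numpy as np

def svd_denoise(X, rank):
    """Reconstruct the data matrix from its `rank` largest singular components, as in (3)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_trunc = np.where(np.arange(len(s)) < rank, s, 0.0)   # zero the lowest singular values
    return (U * s_trunc) @ Vt

def pca_scores(X, n_components=2):
    """PCA scores of the (possibly denoised) data via SVD of the mean-centered matrix."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Hypothetical usage on an n x m response matrix (rows = samples, columns = sensors):
# the scores obtained for each candidate rank p can then be inspected for
# intraclass compaction and interclass separation, as done in Section IV.
rng = np.random.default_rng(3)
X = rng.normal(size=(125, 3))                # placeholder for a real data matrix
for p in range(1, X.shape[1] + 1):
    scores = pca_scores(svd_denoise(X, p))
    print(p, scores.std(axis=0))             # crude spread measure per PC direction
```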


It is desirable to seek further validation by using the PCA results for vapor classification. However, most of the collected data, except those in [31], do not contain enough samples to be divided into proper training and test sets. The data in [31], however, were also subjected to artificial neural network (ANN) classification based on an error backpropagation algorithm. The program implementations were done in Matlab using the statistical and neural network tools.

III. THE DATA

A. Data Set-I

The first set of data contains responses of a polymer coated SAW sensor array exposed to vapor samples of chemical warfare agents from two categories: nerve and non-nerve. It is collected from [31], wherein Table 3 presents measurements by a three-sensor array under equilibrium exposure to 125 samples ($n = 125$, $m = 3$). The sensors are 158 MHz SAW delay line oscillators coated with fluoropolyol (FPOL), ethyl cellulose (ECEL), and poly(epichlorohydrin) (PECH) films having thicknesses equivalent to nearly 250 kHz shifts from the uncoated condition. The nerve agent samples include 52 vapor samples of DMMP, GD, and VX. The non-nerve agent category comprises 73 samples of other vapors that do not contain a nerve agent, though some of these contain blister agents such as HD and JP4. A sensor response is the shift in oscillator frequency (in Hz) under equilibrium exposure to vapors at various concentrations at 30 °C.

B. Data Set-II

The second data set is collected from [32]. It consists of responses of a nine-element SAW sensor array exposed to 40 vapor samples ($n = 40$, $m = 9$) of nine compounds: dimethylacetamide (DMAC), dimethyl methylphosphonate (DMMP), dichloroethane (DCE), diethyl sulfide (DES), water, isooctane (ISO), toluene (TOL), 1-butanol (1BTL), and 2-butanone (2BTN). The nine sensor coatings are: fluoropolyol (FPOL), polyethylene maleate (PEM), ethyl cellulose (ECEL), polyethylenimine (PEI), polyethylene phthalate (PEPH), poly(epichlorohydrin) (PECH), polyisoprene fluoroalcohol (PFA), polyisobutylene (PIB), and polybutadiene hydroxylated (PBOH). A sensor response is defined as the shift in oscillator frequency normalized with respect to the shift due to the polymer coating (in Hz/kHz) under equilibrium exposure to vapors at various concentrations.

C. Data Set-III

The third data set is collected from [33, Table 3]. It consists of responses of a 16-element sensor array composed of conducting carbon black-polymer composite sensors exposed to 22 vapor samples ($n = 22$, $m = 16$). The array actually contains 17 sensors, but the first sensor is reported to be defective by the authors [33]; therefore, it is not included in the present analysis. The different polymers used in making these composites are: poly(4-vinylphenol), poly(styrene-co-allyl alcohol)–5.7% hydroxyl, poly(α-methylstyrene), poly(vinyl chloride-co-vinyl acetate)–10% vinyl acetate, poly(vinyl acetate), poly(N-vinyl pyrrolidone), poly(carbonate bisphenol A),


TABLE I DATA SET PARAMETERS AND SVD TRUNCATION DETAILS

poly(styrene), poly(styrene-co-maleic anhydride)–50% styrene, poly(sulfone), poly(methyl methacrylate), poly(methyl vinyl ether-co-maleic anhydride), poly(vinyl butyral), poly(vinylidene chloride-co-acrylonitrile)–80% vinylidene chloride, poly(caprolactone), poly(ethylene-co-vinyl acetate)–82% ethylene, and poly(ethylene oxide). The vapor samples consist of methanol, ethanol, and their blends at various concentrations (in parts per thousand). The sensor response is defined as the relative peak change in resistance under cyclic vapor exposure.

D. Data Set-IV

The fourth data set is collected from [34, Table 2]. It represents responses of a four-element tin-oxide gas sensor array exposed to nine vapor samples ($n = 9$, $m = 4$) of ethanol, toluene, O-xylene, and their mixtures in dry air. The sensor response is defined as the relative change in conductance under steady-state conditions at room temperature and 35% relative humidity.

E. Data Set-V

The fifth data set is collected from [35], Table S1 therein. It consists of responses of a 12-element metal oxide sensor array exposed to 110 vapor samples ($n = 110$, $m = 12$) belonging to ten chemical classes: amines, lactones, acids, sulphides, terpenes, aldehydes, ketones, aromatics, alcohols, and esters. The sensor responses are defined by fractional resistance changes at 220 °C.

IV. RESULTS

All the results presented below are obtained by using the original data matrix without any preprocessing except mean centering. That is, the PCA results are obtained from the covariance of mean-centered data matrices. As explained in Section II-A, the number of terms for the SVD truncation was chosen heuristically by examining the PCA results, and also by considering the number of sensors in the array and the neural network classification results. Table I summarizes the data matrix parameters, rank estimates, first singular values, and threshold bounds for the different data sets used in the analysis.

Fig. 1. Principal component scores of the data set I: (a) by using the original data matrix and (b) by using the reconstructed data matrix after SVD truncation to rank 2. Nerve agent and non-nerve agent samples are shown by different symbols.

If we assume i.i.d. Gaussian additive noise as done in [21] and interpret the threshold bound in terms of the Gaussian noise standard deviation and the data matrix dimensions, we get the threshold estimates shown in the table. The last row of this table shows what fraction this threshold makes of the first singular value. It is interesting to note that this fraction is nearly 0.1% for four of the data sets and 0.5% for one (the 4th) data set. That means that any loss of information by truncation up to the present rank estimates, if any, is small. It also indicates that perhaps a truncation criterion can be set in terms of a loss limit to the true information as measured by the most significant singular value.

Fig. 1 shows the PCA results of the Data Set-I. This data set contains only three sensors; therefore, there is not much scope for SVD truncation. However, truncation was done by eliminating the last component, assuming the true rank of the data matrix to be 2. This, of course, may have removed one of the true intrinsic variables; still, the beneficial impact of SVD truncation can be seen from this figure as well as from the results in Tables II and III. The 125 samples of the two vapor classes (nerve and non-nerve agents) in this data set can be seen to occupy a relatively larger region of the PC space in Fig. 1(a) than in Fig. 1(b); the latter is generated after SVD truncation and data reconstruction. Table II shows the eigenvalues and cumulative variances of all the principal components. After SVD approximation of the data matrix, a little reapportioning of variances (eigenvalues) between PC-1 and PC-2 can be seen.


Fig. 2. Principal component score plots of the data set II: (a) and (c) are based on the mean-centered original data, and (b) and (d) are based on the reconstructed data matrix after truncating the SVD expansion to five terms (that is, four lowest singular value components eliminated).

TABLE II EFFECT OF SVD TRUNCATION ON PCA OF DATA SET-I

TABLE III ANN CLASSIFICATION AFTER SVD TRUNCATION OF DATA SET-I

It may be reminded that a better classification efficiency is achieved if the samples of a class are closer to each other and the samples of different classes are well separated in the feature space. The comparison of Fig. 1(a) and (b) clearly shows that in the PC-2 direction the samples of both the nerve and the non-nerve classes have compacted. Even though not much compaction is visible in the PC-1 direction, the numerical values revealed some reduction in the spread. The influence of these small changes in the interclass and intraclass groupings in the PC space on the classification rate is, however, substantial.

This data set contains enough samples to be divided into training and test sets for classification by an artificial neural network (ANN). Of the total 125 samples, 90 samples (40 nerve agent, 50 non-nerve agent) were used for training the network and 35 samples (12 nerve agent, 23 non-nerve agent) were used for testing or validating the network prediction. The ANN was a three-layer network with one hidden layer having 3 nodes, one input layer with two nodes (PC-1, PC-2), and one output layer with two nodes (nerve, non-nerve). A sigmoidal activation function and a fixed learning rate were used, and convergence was reached after 50 000 iterations. A sketch of an equivalent classifier setup is given below.
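The original classification was implemented with Matlab's neural network tools; the following is a rough scikit-learn equivalent (our own sketch). The hyperparameter values, the placeholder score and label arrays, and the convergence settings are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Stand-ins for the real inputs: `scores` would hold the first two PC scores
# (PC-1, PC-2) of the 125 denoised samples, `labels` the class tags
# (0 = non-nerve, 1 = nerve).
rng = np.random.default_rng(4)
scores = rng.normal(size=(125, 2))
labels = rng.integers(0, 2, size=125)

train_idx, test_idx = np.arange(90), np.arange(90, 125)   # 90 training / 35 test samples

# One hidden layer with 3 nodes and logistic (sigmoidal) activation,
# mirroring the network topology described in the text.
clf = MLPClassifier(hidden_layer_sizes=(3,), activation="logistic",
                    solver="sgd", learning_rate_init=0.01, max_iter=5000)
clf.fit(scores[train_idx], labels[train_idx])
print("test accuracy:", clf.score(scores[test_idx], labels[test_idx]))
```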

Table III summarizes the classification results. The processing based on the SVD approximation results in a substantially reduced misclassification (from 12 to 1 of the 35 samples tested); there is an overall improvement of 31% in the classification rate. A similar effect on the class compaction in feature space was seen in all the other four data sets.

Fig. 2 shows the PC score plots of the nine different vapors in the Data Set-II. The SVD truncation was done by eliminating the four lowest singular value components. Both the PC-1/PC-2 and PC-2/PC-3 projections are shown. The class compaction is clearly seen to improve in these plots. For example, note the spread of the DMMP, DCE, and 2BTN samples in Fig. 2(a) and (c) before SVD truncation, and their compaction in Fig. 2(b) and (d) after SVD truncation. The other data points are also influenced similarly, to varying extents.

The Data Set-III, based on the 16 conducting composite polymer sensors, provides a fit case for examining the denoising influence of SVD expansion and truncation.


TABLE IV DENOISING EFFECT OF SVD TRUNCATION ON PCA OF DATA SET-III

Fig. 4. Principal component score plots of the data set IV: (a) and (c) are computed by using the mean-centered raw data matrix, and (b) and (d) by using the reconstructed data matrix after SVD truncation up to rank 3. Different symbols mark the ethanol, toluene, and O-xylene samples.

Fig. 3. Principal component score plots of the data set III: (a) and (c) are computed by using the mean-centered original data, and (b) and (d) by using the reconstructed data after truncating the SVD expansion to six terms (that is, after eliminating the ten lowest singular value components).

The data matrix approximation was done by assuming its true rank to be 6. That is, the ten lowest singular value components were eliminated from the SVD expansion. The PCA results from the original data and from the reconstructed data are presented in Table IV, and the principal component score plots are shown in Fig. 3. Notice that the eigenvalues associated with the principal components PC-7 to PC-16 do not vary as drastically as usually happens with the first few largest principal components, and they contain only 0.254% of the total cumulative variance. This suggests that, without carrying any meaningful information, these components only make the observations noisy. The SVD truncation apparently eliminates the noisy components and makes the variation of the eigenvalues less steep. The compaction of different classes in the feature space is also quite evident in Fig. 3.

Fig. 5. Principal component score plots of the data set V: (a) based on autoscaling of the original data and (b) based on the reconstructed data after SVD truncation up to rank 4.

Similar observations were made with the analysis of Data Set-IV based on the tin-oxide sensor array. This set contained only four sensors. The SVD truncation was done by eliminating only one component, and the PCA results are shown in Fig. 4.


Fig. 6. Principal component score plots of the data set V as shown in Fig. 5, but selected in class pairs for comparison. The illustrations in the left column (a), (c), (e), and (g) are without SVD denoising, and those in the right column (b), (d), (f), and (h) are after SVD denoising. The class pairs are as indicated.

In these PC score plots both the intraclass compaction and the interclass separation are quite evident. It can be noticed that the separation between all three vapor classes in the PC-2 direction has improved after SVD truncation.

The PCA results of the Data Set-V obtained before and after SVD denoising are shown in Fig. 5. The intraclass compaction is quite apparent, but the interclass separation does not seem to improve. However, when the PC space was examined by plotting the data in pairs of classes, the influence of SVD denoising is seen to unambiguously enhance class separability. Fig. 6 shows a few such examples.

V. DISCUSSION

The influence of denoising by SVD truncation on all the experimental data sets analyzed here clearly underlines the efficacy of this method for sensor array based electronic nose data processing.

Even though the use of SVD for data denoising and dimensionality reduction has been extensively reported, particularly in the image processing and speech processing literature [36], [37], we did not find it being used in the gas sensing literature. In the latter applications mostly the noise reduction at the data collection stage, through novel sensor designs, circuit improvisation, or ambient stabilization, is emphasized; at the preprocessing stage mainly filtering methods are used; see, for example, [1, Ch. 5, 6, and 14] and the reviews [6]–[10]. The analysis presented in this paper, using five sets of real data collected from published sources, demonstrates that SVD denoising is quite effective in cleaning up the feature space representation. As a result, both the within-class compaction and the across-class separation improve. This will obviously be of great help in improving the classification efficiency of odors by electronic noses.


The noises in sensor array data collection originate mainly from two sources: the fluidics associated with odor sampling and delivery, and instabilities in the sensor electronics. In addition, fluctuations in the chemical constituents (particularly the interferents) of the odorous sample may be a major source of noise in field deployable systems. The fluidic noise arises from fluctuations in gas flow rate, sensing surface temperature, analyte diffusion, and the time lag between the odor source and the sensor surface. The sensor electronic noise arises from fluctuations in supply voltages, inherent noise in the transduction and signal generation processes, temperature, electromagnetic cross talk and interferences, sensor drift and hysteresis, etc.

As discussed in Section II, in processing the noisy sensor data by the conventional method of PCA, the data space defined by the sensor dimensions is transformed to an orthogonal feature space where the features are created by some linear combination of the sensor outputs. The feature directions (eigenvectors of PCA) are usually referred to as virtual sensors. The noise components are transformed into low eigenvalue feature components. By eliminating the lowest eigenvalue components, not only is dimensionality reduction achieved but some of the noise associated with the sensors is also eliminated. In comparison, the SVD denoising described in Section II eliminates noise components associated with both the sensors and the odor samples (represented by the virtual sensors and the virtual vapors, respectively). It is therefore expected that SVD must be more effective in denoising than PCA, and all the comparative results presented in Section IV actually support this. Going through the electronic nose literature we did not come across any significant use of SVD in vapor sensor array data denoising, despite it being used effectively in many other fields. The present work clarifies the rationale as well as provides proof for the efficacy of SVD denoising in electronic nose data analysis.

VI. CONCLUSION

The electronic nose sensor array data analysis presented in this paper demonstrates the usefulness of the denoising procedure described in Section II, based on the SVD expansion and rank reduction by eliminating the lowest singular value components. The noise reduction by this procedure promotes intraclass compaction and interclass separation in the feature space. Some results based on ANN classification after SVD denoising show the influence of this procedure in enhancing the odor classification rate. This study suggests that rank approximation by SVD truncation is an effective method for removing additive noise from sensor array based electronic nose data.

ACKNOWLEDGMENT

S. K. Jha is thankful to the Directorate of Forensic Science, Ministry of Home Affairs, New Delhi, and the Director, Central Forensic Science Laboratory, Chandigarh, for their support and JRF sponsorship. The authors thank all those authors whose published experimental data were used in this analysis.

REFERENCES

[1] T. C. Pearce, S. S. Schiffman, H. T. Nagle, and J. W. Gardner, Handbook of Machine Olfaction. Weinheim, Germany: Wiley-VCH, 2003, ch. 6, 14.


[2] E. L. Hines, E. Llobet, and J. W. Gardner, “Electronic noses: A review of signal processing techniques,” Proc. Inst. Elect. Eng., vol. 146, no. 6, pp. 297–310, 1999.
[3] B. R. Kowalski and C. F. Bender, “Pattern recognition – A powerful approach to interpreting chemical data,” J. Amer. Chem. Soc., vol. 94, no. 16, pp. 5632–5639, Aug. 1972.
[4] W. P. Carey and B. R. Kowalski, “Chemical piezoelectric sensor and sensor array characterization,” Anal. Chem., vol. 58, pp. 3077–3084, 1986.
[5] A. W. J. Cranny and J. K. Atkinson, “The use of pattern recognition techniques applied to signals generated by a multi-element gas sensor array as a means of compensating for poor individual element response,” in Proc. NATO Advanced Research Workshop on Sensors and Sensory Systems for an Electronic Nose, J. W. Gardner and P. N. Bartlett, Eds., Dordrecht, The Netherlands, 1992, pp. 197–216.
[6] F. Rock, N. Barsan, and U. Weimar, “Electronic nose: Current status and future trends,” Chem. Rev., vol. 108, no. 2, pp. 705–725, 2008.
[7] S. M. Scott, D. James, and Z. Ali, “Data analysis for electronic nose systems,” Microchim. Acta, vol. 156, pp. 183–207, 2007.
[8] K. J. Albert, N. S. Lewis, C. L. Schauer, G. A. Sotzing, S. E. Stitzel, T. P. Vaid, and D. R. Walt, “Cross-reactive chemical sensor arrays,” Chem. Rev., vol. 100, no. 7, pp. 2595–2626, 2000.
[9] P. C. Jurs, G. A. Bakken, and H. E. McClelland, “Computational methods for the analysis of chemical sensor array data from volatile analytes,” Chem. Rev., vol. 100, no. 7, pp. 2649–2678, 2000.
[10] W. Zhao, A. Bhusan, A. D. Santamaria, M. G. Simon, and C. F. Davis, “Machine learning: A crucial tool for sensor design,” Algorithms, vol. 1, pp. 130–152, 2008.
[11] K. F. Riley, M. P. Hobson, and S. J. Bence, Mathematical Methods for Physics and Engineering. New York: Cambridge Univ. Press, 2006, ch. 8.
[12] S. Theodoridis and K. Koutroumbas, Pattern Recognition. San Diego, CA: Academic, 2003, ch. 6.
[13] I. Song, J. Bae, and S. Y. Kim, Advanced Theory of Signal Detection. Berlin, Germany: Springer-Verlag, 2002, sec. 1.3, p. 8.
[14] D. W. Tufts, R. Kumarsen, and I. Kirsteins, “Data adaptive signal estimation by singular value decomposition of a data matrix,” Proc. IEEE, vol. 70, pp. 684–685, 1982.
[15] Y. Wongsawat, K. R. Rao, and S. Oraintara, “Multichannel SVD-based image denoising,” in Proc. IEEE Int. Symp. Circuits and Systems, 2005, vol. 6, pp. 5990–5993.
[16] Z. Hou, “Adaptive singular value decomposition in wavelet domain for image denoising,” Pattern Recognit., vol. 36, pp. 1747–1763, 2003.
[17] L. Elden, Matrix Methods in Data Mining and Pattern Recognition. Philadelphia, PA: SIAM, 2007, ch. 6.
[18] S. J. Orfanidis, 332:525 Optimum Signal Processing, Rutgers University, 2002–2007. [Online]. Available: http://www.ece.rutgers.edu/~orfanidi/ece525/svd.pdf
[19] K. Konstantinides and K. Yao, “Statistical analysis of effective singular values in matrix rank determination,” IEEE Trans. Acoust., Speech, Signal Process., vol. 36, pp. 757–763, 1988.
[20] B. K. Natarajan, “Filtering random noise from deterministic signals via data compression,” IEEE Trans. Signal Process., vol. 43, pp. 2595–2605, 1995.
[21] K. Konstantinides, B. Natarajan, and G. S. Yovanof, “Noise estimation and filtering using block-based singular value decomposition,” IEEE Trans. Image Process., vol. 6, pp. 479–483, 1997.
[22] H. Hassanpour, “A time-frequency approach for noise reduction,” Digital Signal Process., vol. 18, pp. 728–738, 2008.
[23] C. J. Zarowski, “The MDL criterion for rank determination via effective singular values,” IEEE Trans. Signal Process., vol. 46, pp. 1741–1744, Jun. 1998.
[24] A. Van der Veen, E. F. Deprettere, and A. L. Swindlehurst, “Subspace based signal analysis using singular value decomposition,” Proc. IEEE, vol. 81, pp. 1277–1308, 1993.
[25] P. C. Hansen and S. H. Jensen, “FIR filter representations of reduced-rank noise reduction,” IEEE Trans. Signal Process., vol. 46, no. 6, pp. 1737–1741, Jun. 1998.
[26] J. J. Gerbrands, “On the relationships between SVD, KLT and PCA,” Pattern Recognit., vol. 14, no. 1–6, pp. 375–381, 1981.
[27] M. E. Wall, A. Rechtsteiner, and L. M. Rocha, “Singular value decomposition and principal component analysis,” in A Practical Approach to Microarray Data Analysis, D. P. Berrar, W. Dubitzky, and M. Granzow, Eds. Norwell, MA: Kluwer, 2003, vol. LANL LA-UR-024001, pp. 91–109.


[28] G. S. Koutsogiannis and J. J. Soraghan, “Selection of number of principal components for de-noising signals,” Electron. Lett., vol. 38, no. 13, pp. 664–666, Jun. 2002.
[29] M. Aminghafari, N. Cheze, and J. M. Poggi, “Multivariate denoising using wavelets and principal component analysis,” Comput. Statist. Data Anal., vol. 50, pp. 2381–2398, 2006.
[30] C. G. Thomas, R. A. Harshman, and R. Menon, “Noise reduction in BOLD-based fMRI using component analysis,” NeuroImage, vol. 17, pp. 1521–1537, 2002.
[31] S. L. Rose-Pehrson, D. D. Lella, and J. W. Grate, “Smart sensor system and method using surface acoustic wave vapor sensor array and pattern recognition for selective trace organic vapor detection,” U.S. Patent 5,469,369, Nov. 21, 1995.
[32] S. L. Rose-Pehrson, J. W. Grate, D. S. Ballantine Jr., and P. C. Jurs, “Detection of hazardous vapors including mixtures using pattern recognition analysis of responses from surface acoustic wave devices,” Anal. Chem., vol. 60, no. 24, pp. 2801–2811, 1988.
[33] M. C. Lonergan, E. J. Severin, B. J. Doleman, S. A. Beaber, R. H. Grubbs, and N. S. Lewis, “Array based vapor sensing using chemically sensitive carbon black-polymer resistors,” Chem. Mater., vol. 8, pp. 2298–2312, 1996.
[34] E. Llobet, J. Brezmes, X. Vilanova, J. E. Sueiras, and X. Correig, “Qualitative and quantitative analysis of volatile organic compounds using transient and steady state response of thick film tin oxide gas sensor array,” Sens. Actuators B: Chem., vol. 41, pp. 13–21, 1997.
[35] A. Z. Berna, A. R. Anderson, and S. C. Trowell, “Bio-benchmarking of electronic nose sensors,” Chem. Sensing, vol. 4, no. 7, pp. 1–9, Jul. 2009.
[36] H. C. Andrews and C. L. Patterson, “Singular value decompositions and digital image processing,” IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-24, pp. 26–53, 1976.
[37] Z. Hou, “Adaptive singular value decomposition in wavelet domain for image denoising,” Pattern Recognit., vol. 36, pp. 1747–1763, 2003.

Sunil K. Jha received the B.Sc. and M.Sc. degrees in physics from Udai Pratap Autonomous College, Varanasi, India, affiliated to V.B.S. Purvanchal University, Jaunpur, India, in 2003 and 2005, respectively. He is currently working towards the Ph.D. degree. His research interests include sensor array vapor detection, multivariate data analysis, data processing, and pattern recognition.

R. D. S. Yadava received the B.Sc. (Hons), M.Sc., and Ph.D. degrees in physics from Banaras Hindu University, Varanasi, India, in 1974, 1976, and 1981, respectively. He is presently Professor of Physics at the Department of Physics, Faculty of Science, Banaras Hindu University. From October 1980 to December 2005, he was a research and development scientist at the Solid State Physics Laboratory, Defence Research and Development Organization, Ministry of Defence, Government of India, Delhi. He has extensive experience of working in several fields: design, modeling, fabrication and characterization of SAW devices and SAW chemical sensors; sensor array based electronic nose technology for trace detection of hazardous materials; statistical, neural and genetic algorithms for electronic nose data processing; narrow-gap mercury cadmium telluride crystal physics, technology and infrared sensors; noise processes, ac conductance and breakdown in metal-oxide-semiconductor field-effect structures; fractals in hopping transport, 1/f noise and dielectric behavior of metal-insulator percolation systems; radiation blistering in solids, ion implantation, radiation damage and laser annealing. His current research interests include SAW signal processing devices and sensors; polymeric resistive sensors; electronic nose system and data processing research; sensor/data fusion, pattern recognition and sensor intelligence; and nonlinear oscillators as novel sensors.
