IEEE SENSORS JOURNAL, VOL. 5, NO. 6, DECEMBER 2005


Fast and Robust Gas Identification System Using an Integrated Gas Sensor Technology and Gaussian Mixture Models

Sofiane Brahim-Belhouari, Amine Bermak, Senior Member, IEEE, Minghua Shi, and Philip C. H. Chan, Senior Member, IEEE

Abstract—Two of the most serious limitations facing future consumer gas identification systems are sensor drift and the difficulty of real-time detection caused by the slow response of most of today's gas sensors. This paper shows that combining an integrated sensor array with a Gaussian mixture model permits successful gas identification. An integrated sensor array has been designed for the identification of combustion gases. Our identification system is able to quickly recognize gases with more than 96% accuracy. Robust detection is introduced through a drift counteraction approach based on extending the training data set using simulated drift.

Index Terms—Drift counteraction, fast recognition, gas sensors, Gaussian mixture model (GMM).

I. INTRODUCTION

Gas identification on a real-time basis is critical for a wide range of civil and military applications. The past decade has seen a significant increase in the application of multisensor arrays to gas classification and quantification. Most of this work has focused on systems using microelectronic gas sensors, whose small size and low-cost fabrication make them attractive for consumer applications. A number of interesting applications have also emerged in the last decade, whether related to the detection of hazardous, poisonous, and dangerous gases or to quality and environmental applications such as air quality control. Among the various types of microelectronic gas sensors, microhotplate-based SnO2 thin-film sensors offer a number of interesting features and are particularly attractive in practice [1]. Indeed, these devices feature high sensitivity, low power consumption, compactness, and good compatibility with semiconductor technology. Unfortunately, thin-film sensors (like all gas sensors) suffer from a number of shortcomings, such as cross sensitivity to gases (i.e., low selectivity), high sensitivity to humidity, nonlinearity of the sensor's response, drift, and slow response. Poor selectivity toward the monitored gas, or cross sensitivity toward other gases, makes a sensor's output unreliable. Long exposure

Manuscript received April 12, 2004; revised April 27, 2005. This work was supported by the Research Grant Council (RGC) of Hong Kong under the competitive earmarked research Grant HKUST 6162/04E. The associate editor coordinating the review of this paper and approving it for publication was Prof. Michael Schoening. The authors are with the Electrical and Electronic Engineering Department, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, SAR China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]). Digital Object Identifier 10.1109/JSEN.2005.858926

cycles of the sensors, as well as aging and poor stability, cause a sensor's calibration curve to drift with time [2]. Drift can be described as a random temporal variation of the sensor response when exposed to the same gases under identical conditions. It is due to unknown dynamic processes in the sensor system (e.g., poisoning or aging of sensors) or to environmental changes (e.g., temperature and pressure conditions). The slow response of most gas sensors is also a critical issue, since fast detection is one of the most important requirements in applications such as the detection of hazardous, poisonous, and dangerous gases. Indeed, gas sensors react slowly, and the steady-state response is typically reached only after a few minutes. A current research trend is to address the slow response through nanostructural engineering, such as grain growth of metal oxide films. Response times on the order of a few seconds have been reported in the literature [3]. Unfortunately, the technological potential of structural engineering is not yet fully exploited, and the techniques used have not been studied sufficiently [2]. An alternative way to improve the response time is to address the issue at the algorithmic level by processing the transient response rather than the steady-state response of the sensor. Pattern recognition algorithms combined with a gas sensor array have traditionally been used to address the poor selectivity and nonlinearity of sensors [4]. In fact, a gas sensor array improves on the selectivity of a single gas sensor and is able to classify different gases. Significant research has been carried out during the last decade on gas detection, preprocessing techniques, and pattern recognition algorithms [4], [5]. However, robust and accurate gas discrimination still remains a challenge even for the most advanced pattern recognition techniques because of the problems mentioned above.
In this paper, we present a fast and robust gas classification approach for combustible gas applications using micromachined SnO2 gas sensors and a pattern recognition algorithm based on class-conditional density estimation using Gaussian mixture models (GMMs). Fast recognition is achieved by developing novel methods for selecting dynamic features of the sensor signals, which not only improve detection performance but also speed detection up considerably. Robust detection is addressed through a drift counteraction approach based on extending the training data set using simulated drift. The performance of the retrained GMM shows the effectiveness of the new approach in improving classification in the presence of drift.

The paper is organized as follows. Section II presents the integrated micromachined gas sensor technology and the experimental characterization of the integrated gas sensor array. Section III describes the pattern recognition system based on GMM classifiers. Section IV presents the experimental results related to the classification, with special emphasis on fast and robust detection. Section V concludes this paper.

1530-437X/$20.00 © 2005 IEEE

Fig. 1. Cross section of the gas sensor.

II. GAS SENSORS TECHNOLOGY AND EXPERIMENTAL CHARACTERIZATION

A. Gas Sensor Technology

Tin oxide is still the most widely used material for the detection of combustible gases and toxic contaminants. Microelectronic gas sensors based on tin oxide films are, therefore, extensively used in gas detection applications. Such devices are sensitive to specific gases when heated to high temperatures (around 300 °C). To reach such temperatures, a microstructure called the microhotplate (MHP) was developed [6]. This structure can be built using either front-side or back-side bulk-etch micromachining techniques. Here, the thermally isolated hotplate was fabricated using a surface silicon micromachining technique. The front-side surface-machined MHP retains all the desirable thermal characteristics that are essential to integrated gas sensor applications. The cross-sectional view of the device is shown in Fig. 1. The MHP is suspended by four microbridges at its four corners. The bridges are 30 µm wide and 58 µm long. The area outside the MHP remains at the silicon substrate temperature, which reduces thermal crosstalk between individual MHPs in a sensor array system and is beneficial when supporting electronic circuitry is to be integrated with the MHP on the same chip. In order to improve the temperature uniformity of the MHP, a polysilicon heater ring is placed at the outer perimeter of the microstructure. The polysilicon resistor at the center monitors the temperature of the MHP. The insulating air gap was formed by etching away a polysilicon sacrificial layer.
Finite element thermal analysis suggested that a 1.5–2-µm air gap provides effective thermal isolation for the MHP. The device was fabricated using our in-house 4-µm design rule process. The device requires seven masks and occupies an area of 120 µm × 120 µm. The stability of the sensor as well as its sensitivity, defined as the ratio between the change in sensor resistance in the presence and in the absence of the gas, was

Fig. 2. Microphotograph of the integrated gas sensor array.

experimentally characterized [1]. It was found that the sensor exhibits very good stability (up to 500 cycles) and excellent sensitivity to CO at concentrations as low as 1 ppm. Based on this sensor structure, an integrated gas sensor array including four individually controllable units was developed. Fig. 2 shows a microphotograph of the manufactured chip, which integrates four sensors on a single die. The main advantage of the surface-machined process compared with the bulk silicon etching process is its high yield, which is of primary importance when implementing an array. Indeed, the yield of an array-based implementation is typically lower than that of a single sensor structure because of the larger silicon area and the need for different sensing films, which makes the process more difficult. Each sensor has its own heater and temperature sensor. Three different sensing films are used to implement the sensor array. One sensor is based on Au/SnO2 (sensor 1), another is based on Pt/Cu(0.16 wt%)-SnO2 (sensor 2), and the remaining two are based on Pt/SnO2 (sensors 3 and 4). In total, two chips were used and calibrated by tuning their selectivity to a given set of gases using the temperature parameter. Before carrying out electrical measurements, the temperature of the microhotplate is calibrated by first recording the current flow through

BRAHIM-BELHOUARI et al.: FAST AND ROBUST GAS IDENTIFICATION SYSTEM


TABLE I SENSOR DESCRIPTION

the heating layer and the temperature (sensor resistance). The sensitivity to CO was found to peak at about 300 °C for sensors 3 and 4 (Pt/SnO2); the remaining sensors also respond to variations in the CO concentration, but to a lesser extent. Good sensitivity to H2 was obtained at about 260 °C, while good sensitivity to CH4 was obtained at about 300 °C. Table I summarizes all the sensors used together with their parameters. Note that the two chips are identical; however, they are operated at different temperatures, which allows us to tune the selectivity of the two chips to different gases. Using the temperature parameter to set the selectivity of gas sensors is attracting growing interest among researchers in this field, mainly because selectivity is greatly influenced by the operating temperature [7]. The two chips provide eight responses, which can be seen as a fingerprint or signature corresponding to a given gas mixture. This signature can then be exploited by a pattern recognition system in order to build a selective detection system, as described in the next sections.

Fig. 3. Experimental setup used to characterize the sensors. V stands for valve, MFC stands for mass flow controller, and the DAQ is the data acquisition board used to control the setup and to acquire the signals from the sensor array. DUT stands for device under test.

TABLE II SENSOR ARRAY CHARACTERISTICS AND CHARACTERIZATION SETUP PARAMETERS

B. Experimental Characterization

An automated experimental setup was built in order to perform electrical and gas-sensing characterization. The setup can be used to measure gas-sensing characteristics in well-defined temperature cycles and gas concentration levels. Fig. 3 gives an overall view of the system, including the gas chamber, the gas delivery system, and the data acquisition system. A gas chamber with a diameter of 90 mm and a reaction volume of 100 cm3 was used for the experiment. The chip carrier is inserted into a chip socket and placed inside the chamber, with feed-through wires used for resistance measurement and temperature control. The gas delivery system includes valves, three mass flow controllers (MFCs), each with a maximum flow rate of 500 ml/min, for the tested gases, and one MFC for the synthetic air (1 l/min). A data acquisition board (DAQ) from National Instruments is used to control the valves and the MFCs. The DAQ is also used to record the output of the sensors for further processing. The gas concentrations in the sensor chamber are adjusted by selecting the appropriate flow rate for each gas. The input signals generated by the data acquisition board to control the MFCs are pulse signals corresponding to different concentrations. The mode of operation is, therefore, an online measurement without reference gas and with gradually increasing concentrations. The temperature of the sensors is constantly monitored by periodically reading data from the integrated temperature sensor. The microhotplates of each chip are heated to a particular temperature by passing the precalibrated current through the heating element. A current of 2.8 mA/µm through the heating layer is required for an operating temperature of about 300 °C.

This corresponds to a maximum power consumption of about 200 mW per chip. Table II summarizes the chip features as well as the characterization setup parameters. The sensor outputs are raw voltage measurements in the form of exponential-like curves, as typically illustrated in Fig. 4. The gases used in the experiment are methane, carbon monoxide, hydrogen, and two binary mixtures: one of methane and carbon monoxide and another of hydrogen and carbon monoxide. Vapors were injected into the gas chamber at a flow rate determined by the mass flow controllers (MFCs). The steady-state values of the sensor array were recorded while periodically injecting different gases, and the baseline response of each sensor was normalized using the Euclidean norm. Normalization has previously been employed in gas discrimination applications [8]–[10]. On the one hand, normalization sets the range of values of the sensor outputs so that patterns with larger signal magnitudes do not dominate the data space. On the other hand, normalization is applied to remove the concentration dependence within the data space. Because the concentration of an unknown gas is also unknown, the identification must be based on the signature pattern and not on the concentration-dependent amplitudes. Fig. 5 shows an example


Fig. 4. Raw response of an array of eight microelectronic gas sensors.

Fig. 5. Histograms showing the response patterns of the eight gas sensors exposed to CH4, CO, and their mixture.

of a typical steady-state response for the sensor array exposed to different gases. Note that the responses of the two chips are quite different, owing to their different operating temperatures as well as to mismatch in the fabrication process.
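As an illustration of why normalization removes the amplitude dependence, the following Python sketch scales an eight-element pattern to unit Euclidean norm. It assumes that the norm is taken over the eight sensor responses of one pattern; the voltage values and the function name `normalize_pattern` are hypothetical and are not taken from the measurements reported here.

```python
import math


def normalize_pattern(pattern):
    """Scale one sensor-array pattern to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in pattern))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero pattern")
    return [v / norm for v in pattern]


# Hypothetical steady-state voltages of the eight sensors (not measured data).
low = [0.10, 0.20, 0.40, 0.40, 0.05, 0.10, 0.20, 0.20]
# Same signature with doubled amplitude, as a crude stand-in for a higher
# concentration (assumption: amplitude scales uniformly across sensors).
high = [2.0 * v for v in low]

low_n = normalize_pattern(low)
high_n = normalize_pattern(high)

# After normalization the two patterns coincide: only the signature remains.
max_gap = max(abs(a - b) for a, b in zip(low_n, high_n))
```

In this idealized case `max_gap` is zero up to rounding. Real sensor responses are nonlinear in concentration, so the two normalized patterns would only be approximately equal.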

Fig. 6. Additive drift affecting the sensor baseline.

TABLE III REVIEW OF THE TYPICAL RESPONSE TIMES OF SnO2-BASED GAS SENSORS ILLUSTRATING THE MATERIAL USED, THE GASES INVOLVED, AND THEIR CONCENTRATIONS AS WELL AS THE OPERATING TEMPERATURE

A gas data set of 220 patterns (each pattern consisting of eight sensor responses) was created to train the different density classifiers and to evaluate their identification performance.

C. Sensor Nonidealities and Response Time

Our aim is to identify combustion gases with an array of tin oxide sensors. The pattern recognition strategy should be robust against the inherent problems of the sensors used. These include nonlinear response, poor selectivity, drift, and slow response time. In addition to nonlinearity and poor selectivity, one of the most serious limitations of tin oxide gas sensors is drift, which causes significant temporal variations of the sensor response when exposed to the same gases under identical conditions. As a result of drift, the cluster distribution in the feature space becomes unstable over time, rendering useless the decision surface built by the classifier during the training phase. Drift can affect both the baseline (additive drift) and the sensitivity of the sensor (multiplicative drift). Fig. 6 illustrates an example of additive drift, showing the response of the sensor as a function of the concentration of gases periodically injected into the gas chamber in which the sensors are placed. The baseline response of the sensor is shifted, which complicates the classification problem even further.

Finally, detecting combustion and dangerous gases requires fast and reliable detection. This requirement is made very challenging by the typically slow response of tin oxide sensors. The issue is critical for most gas sensors, whose reaction times are on the order of a few minutes. Table III lists response times of SnO2-based gas sensors reported in the literature. Typical response times are reported to be in the range of a few minutes. It should be noted, however, that response times cannot be stated in isolation, as they depend on many parameters such as the type of material, the thickness of the membrane, the operating temperature, and the concentration of the gases in question. Our sensors present a response time of 1–3 min at about 300 °C and 50-ppm CO. Recently, faster gas sensors based on nanostructures produced with a special chemical deposition process have been reported in the literature [3], with response times on the order of 2–6 s at operating temperatures of about 200 °C and H2 concentrations in the range of 1000–8000 ppm. A recent study [2] suggested three main methods for reducing the response time: 1) increasing the working temperature, 2) reducing the film thickness, and 3) introducing noble metals into the film. Our aim is to achieve robust and fast identification of combustion gases with an array of tin oxide sensors, using advanced pattern recognition algorithms to overcome the inherent problems of our sensors. The robustness and the speed of classification are tackled here at the algorithmic level. It should be noted that there is currently a trend to tackle these issues at the physical and fabrication level, as suggested in [2].

III. PATTERN RECOGNITION SYSTEM

The algorithmic part of a gas identification system includes a preprocessing module and a classification procedure. The role of the preprocessing module is to segment the pattern of interest from the background, remove noise, normalize the pattern, and perform any other operation that contributes to defining a compact representation of the pattern. The classification task is to identify an unknown sample as one from a set of recognizable gases.

A. Feature Selection

The data preprocessing stage operates on the gas sensor responses in a way that improves the overall pattern analysis performance.
It can be achieved by extracting parameters that are descriptive of the sensor array responses. Thus, the raw data is transformed into a characteristic feature vector. Numerous preprocessing techniques have been proposed in the literature [4]. The most common procedure uses the steady state of the sensors' response as a feature vector and ignores the transient response [19]. A number of compression algorithms have been proposed to extract additional information from the transient response, resulting in improved selectivity and increased recognition accuracy [20]. However, these techniques normally require complicated analysis of the whole dataset, so that, in any case, it is necessary to wait for the steady-state response. In order to achieve reliable recognition as fast as possible, we propose to select features over a short time interval following gas injection. The problem of feature selection can be defined as that of selecting a subset of features that achieves the best classification performance. Let $X = \{x_1, \ldots, x_N\}$ be the initial dataset containing $N$ voltage features. Each feature corresponds to a different time point of the search interval $[0, T]$. The objective is to find a subset $Y \subseteq X$ containing few features that minimizes a selection criterion $J$

$$Y^{*} = \arg\min_{Y \subseteq X} J(Y) \tag{1}$$
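A greedy forward search is one simple way to approach the minimization in (1). The Python sketch below grows the subset one feature at a time, always adding the feature that gives the lowest criterion value. The function names, the `max_features` stopping rule, and the toy criterion are assumptions made for illustration; in practice the criterion would be a cross-validated classification error.

```python
def sequential_forward_selection(features, criterion, max_features):
    """Greedy forward search over feature subsets.

    At each step the feature whose addition yields the lowest criterion
    value is appended. Stopping at max_features is an assumption of this
    sketch; the search could equally run until all features are included.
    """
    selected = []
    remaining = list(features)
    history = []
    while remaining and len(selected) < max_features:
        best = min(remaining, key=lambda f: criterion(selected + [f]))
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), criterion(selected)))
    return selected, history


def toy_criterion(subset):
    """Hypothetical stand-in for a classification error (not real data).

    Features 2, 5, and 7 are arbitrarily declared informative, and a small
    penalty per selected feature mimics the cost of a larger subset.
    """
    useful = {2: 0.30, 5: 0.25, 7: 0.20}
    error = 0.9 - sum(useful.get(f, 0.0) for f in subset)
    return error + 0.01 * len(subset)


feature_indices = list(range(10))  # e.g., indices of sampled time points
subset, trace = sequential_forward_selection(
    feature_indices, toy_criterion, max_features=3
)
```

With this toy criterion the search picks the three informative features in decreasing order of usefulness. Because the search is greedy, it is not guaranteed to find the globally optimal subset for an arbitrary criterion.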

In our case, classification error was used as the selection criterion since the focus is on classifying different gases. Several search methods have been used to explore the feature space efficiently. The simplest method is sequential forward selection (SFS) [21]. This procedure starts from the empty set and sequentially adds the feature that achieves the lowest value of the selection criterion $J$. This process continues until all features are included in the subset.

B. GMM Classifier

The objective of pattern recognition is to set a decision rule that optimally partitions the data space into $c$ regions, one for each class $C_k$. A pattern classifier generates a class label for a previously unknown feature vector $\mathbf{x}$ from a discrete set of learned classes. The most general classification approach is to use the posterior probability of class membership $P(C_k \mid \mathbf{x})$. To minimize the probability of misclassification, one should consider the maximum a posteriori rule and assign $\mathbf{x}$ to class $C_{\hat{k}}$, with

$$\hat{k} = \arg\max_{k} P(C_k \mid \mathbf{x}) = \arg\max_{k} \, p(\mathbf{x} \mid C_k)\, P(C_k) \tag{2}$$

where $p(\mathbf{x} \mid C_k)$ is the class-conditional density and $P(C_k)$ is the prior probability. In the absence of prior knowledge, $P(C_k)$ can be approximated by the relative frequency of examples in the dataset. One way to build a classifier is to estimate the class-conditional densities by using representation models for how each pattern class populates the feature space. In this approach, classifier systems are built by considering each of the classes in turn and estimating the corresponding class-conditional densities $p(\mathbf{x} \mid C_k)$ from data.

The most widely used method of nonparametric density estimation is K-nearest neighbors (KNN). Despite the simplicity of the algorithm, it often performs very well and is an important benchmark method. However, one drawback of KNN is that all the training data must be stored, and a large amount of processing is needed to evaluate the density for a new input pattern.
An alternative is to combine the advantages of both parametric and nonparametric methods by allowing a very general class of functional forms in which the number of adaptive parameters can be increased to build more flexible models. This leads us to a powerful technique for density estimation called the mixture model [22]. In our work, we focus on semiparametric models based on Gaussian mixture distributions. In a Gaussian mixture model, a probability density function is expressed as a linear combination of basis functions. A model with $M$ components is described as a mixture distribution [22]

$$p(\mathbf{x}) = \sum_{j=1}^{M} P(j)\, p(\mathbf{x} \mid j) \tag{3}$$

where $P(j)$ are the mixing coefficients and the parameters of the component density functions $p(\mathbf{x} \mid j)$ vary with $j$. Each mixture component is defined by a Gaussian parametric distribution in $d$-dimensional space

$$p(\mathbf{x} \mid j) = \frac{1}{(2\pi)^{d/2} \, |\Sigma_j|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_j)^{T} \Sigma_j^{-1} (\mathbf{x} - \boldsymbol{\mu}_j) \right\}.$$


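The parameters to be estimated are the mixing coefficients $P(j)$, the covariance matrix $\Sigma_j$, and the mean vector $\boldsymbol{\mu}_j$. The covariance matrix can be spherical, i.e., $\Sigma_j = \sigma_j^{2} I$, diagonal, where $\Sigma_j = \mathrm{diag}(\sigma_{j1}^{2}, \ldots, \sigma_{jd}^{2})$, or any positive definite full matrix. The method for training a mixture model is based on maximizing the data likelihood. The negative log likelihood of the dataset $\{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$, which is treated as an error, is defined by

$$E = -\sum_{n=1}^{N} \ln p(\mathbf{x}_n) = -\sum_{n=1}^{N} \ln \sum_{j=1}^{M} P(j)\, p(\mathbf{x}_n \mid j) \tag{4}$$

A specialized method, known as the expectation-maximization (EM) algorithm [23], is commonly used to produce optimum parameters. The EM algorithm iteratively modifies the model parameters starting from the initial iteration $t = 0$. EM guarantees a monotonically nondecreasing likelihood, although its ability to find a local maximum depends on parameter initialization. Convergence can be accelerated by modifying the maximization step. For a GMM, the EM optimization can be carried out analytically with a simple set of equations [23], where the mixing coefficients are estimated by

$$P^{\mathrm{new}}(j) = \frac{1}{N} \sum_{n=1}^{N} P^{\mathrm{old}}(j \mid \mathbf{x}_n) \tag{5}$$

with $P^{\mathrm{old}}(j \mid \mathbf{x}_n) = P(j)\, p(\mathbf{x}_n \mid j) / p(\mathbf{x}_n)$ the posterior probability of component $j$ under the current parameters. The estimate of the mean of each component is given by

$$\boldsymbol{\mu}_j^{\mathrm{new}} = \frac{\sum_{n=1}^{N} P^{\mathrm{old}}(j \mid \mathbf{x}_n)\, \mathbf{x}_n}{\sum_{n=1}^{N} P^{\mathrm{old}}(j \mid \mathbf{x}_n)} \tag{6}$$

and, finally, the update equation for the covariance matrix is

$$\Sigma_j^{\mathrm{new}} = \frac{\sum_{n=1}^{N} P^{\mathrm{old}}(j \mid \mathbf{x}_n)\, (\mathbf{x}_n - \boldsymbol{\mu}_j^{\mathrm{new}})(\mathbf{x}_n - \boldsymbol{\mu}_j^{\mathrm{new}})^{T}}{\sum_{n=1}^{N} P^{\mathrm{old}}(j \mid \mathbf{x}_n)} \tag{7}$$

The minimum description length (MDL) criterion is able to select an optimal number of components in the model and so partition the dataset. MDL was derived by Rissanen [24] from an information-theoretic perspective. Although the class-conditional distributions in feature space are generally non-Gaussian, the resulting multimodal approximation is remarkably accurate. A GMM can approximate any continuous density with arbitrary accuracy, provided the model has a sufficiently large number of components and the parameters of the model are chosen correctly. The price we have to pay is that the training process is computationally intensive compared to the simple procedure needed for parametric methods.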
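To make the EM updates for a full-covariance GMM concrete, the following NumPy sketch performs one iteration per call: posteriors are computed under the current parameters, then the mixing coefficients, means, and covariances are re-estimated as posterior-weighted averages. The function names, the synthetic two-cluster data, and the initialization are assumptions for illustration only; no safeguards against singular covariance matrices are included.

```python
import numpy as np


def gaussian_pdf(X, mean, cov):
    """Multivariate Gaussian density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    expo = -0.5 * np.einsum("ni,ij,nj->n", diff, inv, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(expo) / norm


def em_step(X, weights, means, covs):
    """One EM iteration; returns new parameters and the log likelihood
    of the data under the parameters passed in."""
    N = X.shape[0]
    M = len(weights)
    # E-step: posterior P(j | x_n) under the current (old) parameters.
    joint = np.stack(
        [weights[j] * gaussian_pdf(X, means[j], covs[j]) for j in range(M)],
        axis=1,
    )
    px = joint.sum(axis=1)
    log_lik = np.log(px).sum()
    resp = joint / px[:, None]
    # M-step: posterior-weighted re-estimation of all parameters.
    Nj = resp.sum(axis=0)
    new_weights = Nj / N
    new_means = (resp.T @ X) / Nj[:, None]
    new_covs = []
    for j in range(M):
        diff = X - new_means[j]
        new_covs.append((resp[:, j, None] * diff).T @ diff / Nj[j])
    return new_weights, new_means, np.array(new_covs), log_lik


# Synthetic two-cluster data (hypothetical, not sensor measurements).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal([0.0, 0.0], 0.5, size=(100, 2)),
    rng.normal([3.0, 3.0], 0.5, size=(100, 2)),
])
weights = np.array([0.5, 0.5])
means = X[[0, 100]].copy()          # one starting point from each cluster
covs = np.array([np.eye(2), np.eye(2)])

log_liks = []
for _ in range(20):
    weights, means, covs, ll = em_step(X, weights, means, covs)
    log_liks.append(ll)
```

The recorded `log_liks` sequence gives a quick check of the monotonic behavior of EM on this toy data. A practical implementation would typically add a convergence test and some regularization of the covariance estimates.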
When the training algorithm constructs a mixture model with only one component, the GMM classifier has the same behavior as a Bayesian quadratic classifier, where the decision boundaries are hyper-ellipsoids or hyper-paraboloids.

IV. RESULTS AND ANALYSIS

In this section, we present results obtained by using several configurations of our pattern recognition system. Experiments are based on the gas dataset of 220 patterns, with each pattern consisting of eight sensor responses.


TABLE IV GMM STRUCTURE SELECTION

A. Classification Results

In order to evaluate the classification performance of the proposed classifier, we first consider the steady-state response and perform three studies.

1) GMM Structure Selection: Several covariance matrix forms can be used to estimate the class-conditional density. Three competing structures based on the full, diagonal, or spherical form are considered here. The GMM classifier is built by considering each of the classes in turn and estimating the corresponding class-conditional densities $p(\mathbf{x} \mid C_k)$ from the data. The MDL criterion is used to select an optimal number of components for each density model. The parameters of each model were adapted to the training data in the maximum likelihood framework using the EM algorithm. These GMM structures have different numbers of parameters. In order to compare the performance of the different structures, a ten-fold cross validation approach was used to overcome the problem of the limited data set typically available in gas sensor applications. In our ten-fold cross validation approach, the data set is split into two mutually exclusive subsets, one for learning and one for testing. In order to minimize the dependency between the data partition and the classification performance, ten different partitions were created. The classification accuracy was evaluated for each partition, and the final result is expressed as the average performance. The results obtained are reported in Table IV. The full structure appears to be the best model in terms of classification accuracy and of complexity, as measured by the number of structure parameters. A GMM with a full structure provides a better representation of the true density of the data set. The full Gaussian model makes it possible to build a very general structure in which the number of adaptive parameters can be increased in a systematic way, yielding a more flexible and accurate model.

2) Dimensionality Reduction: Previous results were obtained using the whole dataset.
This second analysis was made on reduced datasets obtained with a projection technique. Prior to applying the GMM classifier, a dimensionality reduction technique, namely principal component analysis (PCA), was used to remove redundancy and reduce the number of features. Fig. 7 presents the two-dimensional PCA scores for the steady-state voltages of all the gas sensors studied. Note that the decision boundaries are not well defined because of a strong overlap between the classes. GMMs are built on projected datasets with different numbers of principal components. Fig. 8 shows the class-conditional density for the third class (CO and CH4 mixture) using a GMM with three component density functions and a full covariance matrix. The best performance is achieved when projecting onto five principal components. Adding further components actually degrades the performance of the classifier. Results of this analysis are shown in Table V. PCA projection improves the classification accuracy for all density model classifiers.
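The projection step can be sketched with a singular value decomposition, as in the NumPy example below. It projects 220 eight-dimensional patterns onto five principal components. The data are synthetic (generated from five hypothetical latent factors), and the function names `pca_fit` and `pca_transform` are assumptions of this sketch rather than part of the system described here.

```python
import numpy as np


def pca_fit(X, n_components):
    """Estimate principal directions from the centered data via SVD."""
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]            # rows are principal axes
    explained = (S ** 2) / (S ** 2).sum()     # variance ratio per axis
    return mean, components, explained[:n_components]


def pca_transform(X, mean, components):
    """Project patterns onto the retained principal axes."""
    return (X - mean) @ components.T


# Synthetic stand-in for the 220 x 8 pattern matrix (not measured data):
# five latent factors mixed into eight correlated "sensor" responses.
rng = np.random.default_rng(1)
latent = rng.normal(size=(220, 5))
mixing = rng.normal(size=(5, 8))
X = latent @ mixing + 0.01 * rng.normal(size=(220, 8))

mean, components, ratio = pca_fit(X, n_components=5)
scores = pca_transform(X, mean, components)   # shape (220, 5)
```

The `scores` matrix would then replace the raw patterns as the classifier input. The projection should be fitted on the training fold only, so that the test patterns do not influence the principal axes.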


Fig. 7. PCA results for the microelectronic sensor array steady-state voltage. Measurement type: CO (circles), CH4 (plus signs), mixture CO-CH4 (diamonds), H2 (triangles), and mixture CO-H2 (squares).

Fig. 8. Class-conditional density for the third class (CO and CH4 mixture) using GMM with three component density functions and a full covariance matrix.


TABLE V CLASSIFICATION ACCURACY (%) WITH AND WITHOUT PCA PROJECTION

It can be concluded that the usefulness of dimensionality reduction depends on the relationship between the training dataset size and the number of features. If the number of training examples is very large, then the classification error does not increase as the number of features increases. However, if the number of training examples is small relative to the number of features, a dimensionality reduction technique is needed to guarantee acceptable classification accuracy. This is typically the case regardless of the GMM structure used, as is evident from Table V. Since only a limited number of examples are typically available, there is an optimal number of feature dimensions beyond which the performance of the Gaussian mixture models starts to degrade. The GMM performance was also evaluated using a single chip operated at a single temperature of either 300 °C or 260 °C and compared to the case of two chips operated simultaneously at two different temperatures. For the single-chip solution, the performance was found to be at best 86% (see footnote 1). This is clearly lower than the performance of the two chips operating at two temperatures, which was found to be 94.2%.

3) GMM Versus Other Classifiers: The GMM classification performance was compared with that of classification techniques widely used in pattern recognition problems, such as K-nearest neighbor (KNN), multilayer perceptron (MLP), support vector machines (SVMs), and probabilistic principal component analysis (PPCA). PPCA is a density-model-based classifier similar in concept to the GMM algorithm [25]. Indeed, in PPCA, each component density function is given by a probabilistic PCA, and the training of such a model can be done in the maximum likelihood framework using an EM algorithm. SVMs are now used in gas sensor applications [26] because they are well grounded in statistical learning theory and overcome many of the drawbacks seen in previously described pattern recognition techniques.
In this comparison, all classifiers were individually optimized and their performance was compared for different numbers of principal components. The GMM with a full covariance matrix is considered here, as it has proven to be the most effective structure. For the MLP, an optimized structure was found with eight hidden units (117 weights). The best generalization performance for the KNN algorithm is obtained for K equal to 3. For SVMs, the multiclass implementation of [27] was used. Generalization performance was again estimated using a ten-fold cross validation approach. Fig. 9 shows that the most accurate discriminant function is the SVM, while the most accurate density model is the GMM. For eight principal components, the best performance is obtained for SVMs. SVMs are shown to work quite well when the dataset is small with respect to the input dimensionality. However, better performance is obtained for the GMM when projecting the data onto the first five principal components. The best performance overall is achieved using the GMM, with a success rate of 94.2% obtained for five principal components. This points to an important result: higher generalization performance can be obtained by using feature reduction and selection as preprocessing techniques to increase the ratio of the number of training samples to the number of features.

1 The best-case performance is obtained by taking the maximum performance when operating the sensor at either 300 °C or 260 °C, and for all possible PCA projections.

B. Fast Recognition

The sensor array system reacts slowly and takes, on average, 10 min to reach the stationary state. This time is a combination of the time needed to fill the chamber and the response time of the sensors. One possible goal in pattern recognition is to achieve reliable results as fast as possible. In order to study the possibility of building a fast recognition system based on the dynamic response, we select patterns from different time points of the transient sensor responses, over a short period following gas injection. For different search intervals following gas injection, we set the initial data set to the features corresponding to time points within each interval. We use the sequential forward algorithm as a feature selector (1), with GMM and PCA dimensionality reduction. The optimization results are shown in Table VI. A maximum accuracy of 96.3% is obtained in the time window [0, 30 s] following gas injection with a subset of three selected features per sensor, corresponding to 6, 18, and 27 s. Notably, for time windows shorter than 30 s, a classification performance of more than 94% is still achievable using the optimized GMM structure.

C. Robustness

The drift problem can cause a serious robustness issue, as it can be interpreted as a temporal variation of the pattern distribution in the feature space. The decision surface obtained during the training phase is, therefore, made obsolete, and, hence, retraining the entire system is necessary.
To compensate for this movement and dispersion of the patterns, we propose to extract robust features by generating simulated drift. The training data set is extended by extracting more features from the drifted measurements. This new learning space increases the classifier robustness to temporal variation of the sensor response. The efficiency of this procedure has been tested against simulated linear drift. The drift has been modeled, following [28], in terms of the sensor output before the drift and a drift parameter chosen randomly for each sensor experiment (the model equation itself is missing from this text). Drift varying between 0 and 30% has been artificially generated. The performance of the best classifier was evaluated over the drifted measurements. Fig. 10 shows that drift affects the recognition ability of GMM, as the classification success declines significantly (dashed line of Fig. 10). The drift counteraction strategy is to retrain GMM using drifted sensor responses (solid line of Fig. 10). The performance of the retrained GMM was evaluated using the ten-fold cross validation method. It is shown that the counteraction procedure improves the performance of GMM in the presence of 30% drift by more than 30%. A final assessment of this procedure requires testing it on real sensor drift data.
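The idea of extending the training set with simulated drift can be illustrated with a short sketch. The drift equation itself is not reproduced in the text, so the code below assumes a simple multiplicative form, x_drift = x (1 + δ) with δ drawn uniformly from [0, max_drift] for each sensor reading; this form, the function names, and the synthetic data are assumptions for illustration only and may differ from the model of [28].

```python
import numpy as np


def simulate_drift(X, max_drift, rng):
    """Scale each sensor reading by (1 + delta), delta ~ U(0, max_drift).

    ASSUMPTION: multiplicative drift with one random coefficient per
    sensor and per measurement; the paper's exact model is not shown.
    """
    delta = rng.uniform(0.0, max_drift, size=X.shape)
    return X * (1.0 + delta)


def extend_training_set(X, y, drift_levels, rng):
    """Stack the clean data with one drifted copy per drift level."""
    X_parts = [X] + [simulate_drift(X, d, rng) for d in drift_levels]
    y_parts = [y] * (len(drift_levels) + 1)
    return np.vstack(X_parts), np.concatenate(y_parts)


rng = np.random.default_rng(0)
X = rng.uniform(1.0, 2.0, size=(50, 8))   # hypothetical sensor responses
y = rng.integers(0, 3, size=50)           # hypothetical gas labels

# Drift levels of 10%, 20%, and 30% span the 0-30% range studied in the text.
X_ext, y_ext = extend_training_set(X, y, drift_levels=[0.1, 0.2, 0.3], rng=rng)
print(X_ext.shape, y_ext.shape)
```

A classifier retrained on `X_ext` and `y_ext` sees both clean and drifted versions of every pattern, which is the sense in which the learning space is enlarged.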


Fig. 9. Accuracy as a function of the number of principal components.
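The kind of comparison summarized in Fig. 9 can be outlined as ten-fold cross validation of several classifiers for different numbers of principal components. The sketch below assumes scikit-learn and synthetic data. `SVC` uses scikit-learn's built-in one-vs-one multiclass scheme rather than the multiclass SVM of [27], and the GMM and PPCA density models are omitted for brevity, so this shows only the evaluation protocol, not the paper's experiment.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in for the gas data set: 3 classes, 12 features.
rng = np.random.default_rng(1)
means = rng.normal(scale=1.5, size=(3, 12))
X = np.vstack([m + rng.normal(size=(50, 12)) for m in means])
y = np.repeat(np.arange(3), 50)

# Factories so that every evaluation starts from a fresh estimator.
classifiers = {
    "KNN (K=3)": lambda: KNeighborsClassifier(n_neighbors=3),
    "MLP (8 hidden units)": lambda: MLPClassifier(
        hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
    "SVM (RBF kernel)": lambda: SVC(kernel="rbf", gamma="scale"),
}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

results = {}
for n_pcs in (2, 5, 8):
    for name, make_clf in classifiers.items():
        # PCA sits inside the pipeline so it is refit on each training fold.
        pipe = make_pipeline(PCA(n_components=n_pcs), make_clf())
        results[(name, n_pcs)] = cross_val_score(pipe, X, y, cv=cv).mean()

for (name, n_pcs), acc in results.items():
    print(f"{name:22s} PCs={n_pcs}: {acc:.3f}")
```

Keeping the projection inside the cross-validation loop avoids leaking information from the held-out fold into the principal components.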

TABLE VI FAST RECOGNITION RESULTS. CLASSIFICATION PERFORMANCE AS FUNCTION OF SEVERAL TIME WINDOWS AND THE CORRESPONDING TIME POINTS SELECTED
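The selection of transient time points reported in Table VI relies on sequential forward selection, which can be sketched as a greedy loop that repeatedly adds the feature giving the best score. In this sketch the score is the cross-validated accuracy of a nearest-class-mean classifier, used as a lightweight stand-in for the GMM criterion of the paper; the data, the feature indices, and the function names are hypothetical.

```python
import numpy as np


def cv_accuracy(X, y, n_folds=5):
    """Cross-validated accuracy of a nearest-class-mean classifier
    (a simple stand-in for the GMM used as the selection criterion)."""
    folds = np.arange(len(y)) % n_folds
    correct = 0
    for k in range(n_folds):
        tr, te = folds != k, folds == k
        classes = np.unique(y[tr])
        centroids = np.stack([X[tr][y[tr] == c].mean(axis=0) for c in classes])
        dist = np.linalg.norm(X[te][:, None, :] - centroids[None], axis=2)
        correct += np.sum(classes[dist.argmin(axis=1)] == y[te])
    return correct / len(y)


def sequential_forward_selection(X, y, n_select, score=cv_accuracy):
    """Greedily add the feature that gives the highest score."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_select:
        best = max(remaining, key=lambda j: score(X[:, selected + [j]], y))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical transient data: 10 time points, only two are informative.
rng = np.random.default_rng(0)
y = np.arange(150) % 3
X = rng.normal(size=(150, 10))
X[:, 2] += 4.0 * (y == 0)   # time point 2 separates class 0
X[:, 6] += 4.0 * (y == 1)   # time point 6 separates class 1

selected = sequential_forward_selection(X, y, n_select=2)
print("selected time-point indices:", sorted(selected))
```

Restricting the candidate columns to time points inside a given window since gas injection would correspond to one of the time windows considered in the table.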

D. Results Analysis and Practical Considerations

Preprocessing the sensor signals and using advanced pattern recognition techniques are fundamental parts of gas identification systems. However, training a classifier for fast and robust detection and recognition of different odors remains challenging, partly because of the slow response of the sensors, the temporal variability of the gas sensors, the large intra-class variance compared to the small inter-class separation, and the small amount of training data available. Dimensionality reduction is an effective way to improve the performance of density-model classifiers, particularly when the number of training examples is small relative to the number of features. We have shown that the optimized GMM achieves the highest accuracy when the data are projected onto the first five principal components. We have also demonstrated that using advanced feature selection algorithms to select transient points yields fast detection with improved accuracy.

It is, however, important to note that for GMM to produce good generalization results, the test data should have a probability distribution relatively similar to that of the training set. A general approach to this problem is to monitor the likelihood of feature vectors during operation and compare it with the range of likelihoods in the training set. Any values that fall outside this range are probably due to novel inputs, and the corresponding model output should not be relied upon. This novelty detection approach is a useful technique that should be considered in future work to improve our gas identification performance.

We have also shown that the drift counteraction approach based on extracting robust features using a simulated drift is quite efficient in improving the robustness of the classifier. Further offset cancellation techniques are currently being explored using a hardware implementation in which the sensor information is dynamically obtained by subtracting the most recently updated offset value from the steady-state value. This approach is a real-time solution which appears to be effective in reducing additive drift.

Fig. 10. Classification performance as a function of drift (expressed in %) before (dashed) and after (solid) retraining.

V. CONCLUSION

In this paper, we presented a gas identification approach based on a microelectronic gas sensor technology and class-conditional density estimation using GMMs. In-house microelectronic gas sensors based on tin oxide films and a microstructure called the microhotplate (MHP) were used. Extensive measurements and sensor characterization were performed using an automated experimental setup for the identification of combustion gases (methane, carbon monoxide, hydrogen, and their mixtures). The GMM classifier structure was optimized by considering several covariance matrix forms used to estimate the class-conditional density. It was found that the structure with a full covariance matrix, optimized with the MDL criterion, gives the best classification accuracy for the data set collected from the integrated gas sensor array. The proposed classifier is shown to perform very well compared to traditional as well as advanced pattern recognition algorithms such as KNN, MLP, SVM, and PPCA. Indeed, the best performance is achieved using GMM, with a success rate of 94.2% obtained for five principal components. This points to an important result: higher generalization performance can be obtained by using feature reduction and selection as preprocessing techniques, which increase the ratio of the number of training samples to the number of features. Using the operating temperature as a parameter to tune the selectivity of the sensor chip to different target gases was also shown to be an effective way to improve the performance of the overall system. In addition, fast recognition with an accuracy of more than 96% was obtained in the time window [0, 30 s] since gas injection. It was found that classification accuracy can be traded for even faster recognition (94.5% for 15 s). This classification success rate is achieved by the combination of a sensitive gas sensor array, dynamic features selected using the SFS algorithm, and a GMM with optimized structure. It was, however, found that drift seriously degrades the classification performance of GMM. A drift counteraction approach based on extracting robust features using a simulated drift was proposed. The performance of the retrained GMM was evaluated using a cross validation method, which shows a gain of over 30% for up to 30% drift.

ACKNOWLEDGMENT

The authors would like to thank Prof. G. Yan and Dr. D. Martinez for their technical support and help.

REFERENCES

[1] P. C. H. Chan, G. Yan, L. Sheng, R. K. Sharma, Z. Tang, J. K. O. Sin, I.-M. Hsing, and Y. Wang, “An integrated gas sensor technology using surface micro-machining,” Sens. Actuators B, vol. 82, pp. 277–283, 2002.
[2] G. Korotcenkov, “Gas response control through structural and chemical modification of metal oxide films: State of the art and approaches,” Sens. Actuators B, to be published.
[3] J. W. Gong, Q. F. Chen, W. F. Fei, and S. Seal, “Micromachined nanocrystalline SnO2 chemical gas sensors for electronic nose,” Sens. Actuators B, vol. 102, no. 1, pp. 117–125, 2004.
[4] R. Gutierrez-Osuna, “Pattern analysis for machine olfaction: A review,” IEEE Sensors J., vol. 2, no. 3, pp. 189–202, Jun. 2002.
[5] M. Pardo and G. Sberveglieri, “Learning from data: A tutorial with emphasis on modern pattern recognition methods,” IEEE Sensors J., vol. 2, no. 3, pp. 189–202, Jun. 2002.
[6] G. Yan, L. Sheng, Z. Tang, J. Wu, P. C. H. Chan, and J. K. O. Sin, “A low power CMOS compatible integrated gas sensor using maskless tin oxide sputtering,” Sens. Actuators B, vol. 49, pp. 81–87, 1998.
[7] ---, “Multi-frequency temperature modulation for metal-oxide gas sensors,” presented at the 8th Int. Symp. Olfaction and the Electronic Nose, Washington, DC, Mar. 2001.
[8] T. C. Pearce, “Computational parallels between the biological olfactory pathway and its analogue ‘The Electronic Nose’: Part II. Sensor-based machine olfaction,” Biosystems, vol. 41, pp. 69–90, 1997.


[9] R. Polikar, R. Shinar, V. Honavar, L. Udpa, and M. D. Porter, “Detection and identification of odorants using an electronic nose,” in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, vol. 5, 2001, pp. 3137–3140.
[10] E. L. Hines, E. Llobet, and J. W. Gardner, “Electronic noses: A review of signal processing techniques,” Proc. Inst. Elect. Eng., Circuit Devices Syst., vol. 146, no. 6, pp. 297–310, 1999.
[11] M. A. Aronova, K. S. Chang, I. Takeuchi, H. Jabs, D. Westerheim, A. Gonzalez-Martin, J. Kim, and B. Lewis, “Combinatorial libraries of semiconductor gas sensors as inorganic electronic noses,” Appl. Phys. Lett., vol. 83, no. 6, pp. 1255–1257, 2003.
[12] K. D. Mitzner, J. Sternhagen, and D. W. Galipeau, “Development of a micromachined hazardous gas sensor array,” Sens. Actuators B, vol. 93, no. 1–3, pp. 92–99, 2003.
[13] L. Chambon, J. P. Germain, A. Pauly, V. Demarne, and A. Grisel, “A metallic oxide gas sensor array for a selective detection of the CO and NH3 gases,” Sens. Actuators B, vol. 60, no. 2–3, pp. 138–147, 1999.
[14] P. Althainz, A. Dahlke, M. Frietsch-Klarhof, J. Goschnick, and H. J. Ache, “Reception tuning of gas-sensor microsystems by selective coatings,” Sens. Actuators B, vol. 24–25, pp. 366–369, 1995.
[15] G. G. Mandayo, E. Castano, and F. J. Gracia, “Carbon monoxide detector fabricated on the basis of a tin oxide novel doping method,” IEEE Sensors J., vol. 2, no. 4, pp. 322–328, Aug. 2002.
[16] R. P. Lyle and D. Walters, “Commercialization of silicon-based gas sensors,” in Proc. Transducers, Solid State Sensors and Actuators, Chicago, IL, 1997, pp. 975–978.
[17] J. W. Allen, B. T. Marquis, and D. J. Smith, “Investigating the long-term heating and analyte exposure effects on tin oxide thick-film sensors,” in Proc. IEEE Sensors Conf., vol. 2, 2002, pp. 1260–1265.
[18] I. Sayago, M. D. C. Horrillo, S. Baluk, M. Aleixandre, M. J. Fernandez, L. Ares, M. Garcia, J. Santos, and J. Gutierrez, “Detection of toxic gases by a tin oxide multisensor,” IEEE Sensors J., vol. 2, no. 5, pp. 387–393, Oct. 2002.
[19] J. W. Gardner, M. Craven, C. Dow, and E. L. Hines, “The prediction of bacteria type and culture growth phase by an electronic nose with a multi-layer perceptron network,” Meas. Sci. Technol., vol. 9, pp. 120–127, 1998.
[20] T. Eklov, P. Martensson, and I. Lundstrom, “Enhanced selectivity of MOSFET gas sensors by systematical analysis of transient parameters,” Anal. Chim. Acta, vol. 353, pp. 291–300, 1997.
[21] ---, “Selection of variables for interpreting multivariate gas sensor data,” Anal. Chim. Acta, vol. 381, pp. 221–232, 1999.
[22] D. M. Titterington, A. F. M. Smith, and U. E. Makov, Statistical Analysis of Finite Mixture Distributions. New York: Wiley, 1985.
[23] C. M. Bishop, Neural Networks for Pattern Recognition. Oxford, U.K.: Clarendon, 1995.
[24] J. Rissanen, “Modeling by shortest data description,” Automatica, vol. 14, pp. 465–471, 1978.
[25] S. Brahim-Belhouari, A. Bermak, G. Wei, and P. C. H. Chan, “A comparative study of density models for gas identification using microelectronic gas sensor,” presented at the IEEE Conf. Signal Processing and Information Technology (ISSPIT), Darmstadt, Germany, 2003.
[26] C. Distante, N. Ancona, and P. Siciliano, “Support vector machines for olfactory signals recognition,” Sens. Actuators B, vol. 88, no. 1, pp. 30–39, 2003.
[27] Y. Guermeur, “Combining discriminant models with new multi-class SVMs,” Pattern Anal. Appl., vol. 5, no. 2, pp. 168–179, 2002.
[28] S. Marco, A. Ortega, A. Pardo, and J. Samitier, “Gas identification with tin oxide sensor array and self-organizing maps: Adaptive correction of sensor drifts,” IEEE Trans. Instrum. Meas., vol. 47, no. 1, pp. 316–321, Feb. 1998.

Sofiane Brahim-Belhouari received the engineer diploma in electrical engineering from the Polytechnic Institute of Algiers, Algeria, in 1993, and the Ph.D. degree in automatic control and signal processing from the University of Paris XI, Paris, France, in 2000. After receiving the Ph.D. degree, he held postdoctorate positions at the Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland, and the Hong Kong University of Science and Technology, Kowloon. His main research interests are in data analysis, statistical signal processing, and pattern recognition.


Amine Bermak (M’99–SM’04) received the M.Eng. and Ph.D. degrees in electronic engineering from Paul Sabatier University, France, in 1994 and 1998, respectively. During his Ph.D. studies, he was part of the Microsystems and Microstructures Research Group at the French National Research Center LAAS-CNRS where he developed a 3-D VLSI chip for artificial neural network classification and detection applications. He joined the Advanced Computer Architecture research group at York University, York, U.K., where he was working as a postdoctorate on the VLSI implementation of CMM neural networks for vision applications in a project funded by the British Aerospace. In November 1998, he joined Edith Cowan University, Perth, Australia, first as a Research Fellow working on smart vision sensors, then as a Lecturer and a Senior Lecturer with the School of Engineering and Mathematics. He is currently an Assistant Professor with the Electrical and Electronic Engineering Department, Hong Kong University of Science and Technology (HKUST), Kowloon, where he also serves as the Associate Director of the Computer Engineering Program. He has published more than 70 papers in journals, book chapters, and refereed international conferences. Dr. Bermak was awarded the “Bechtel Foundation Engineering Teaching Excellence Award” at HKUST in 2004.

Minghua Shi received the B.S. degree in electronic engineering from the East China University of Science and Technology, Shanghai, where he graduated with highest honors. He is currently pursuing the Ph.D. degree at the Hong Kong University of Science and Technology (HKUST), Kowloon, under the supervision of Prof. Amine Bermak. In September 2002, he joined the Electrical and Electronic Engineering Department, HKUST. His research interests are related to hardware implementation of pattern recognition algorithms for gas identification and electronic nose applications. Mr. Shi received the first prize scholarship from the East China University of Science and Technology. He was also selected for the honor of Excellent Student of Shanghai City.

Philip C. H. Chan (SM’95) was born in Shanghai, China, and raised in Hong Kong. He received the B.S. degree in electrical engineering from the University of California, Davis, where he graduated with highest honors and departmental citation, and received the M.S. and Ph.D. degrees in electrical engineering from the University of Illinois, Urbana-Champaign, under Prof. C. T. Sah. He was with the University of Illinois as an IBM Postdoctoral Fellow and later as Visiting Assistant Professor in electrical engineering. In 1981, he joined Intel Corporation, Santa Clara, CA, as a Senior Engineer in the Technology Development Computer-Aided Design Department, where he later became a Principal Engineer and Senior Project Manager and had corporate responsibility for circuit simulation tools, VLSI device modeling, and process characterization. In 1990, he joined the Design Technology Department of Microproducts Group, where he led a team of engineers that defined and developed a CAD system to design multichip module products. This effort led to the first functional 486-based multichip module at Intel. He joined the Hong Kong University of Science and Technology (HKUST), Kowloon, in April 1991 as a Reader. He became a Professor in 1997. He served as the Director of Undergraduate Studies, the founding Director of the Computer Engineering Program, the Associate Dean of Engineering, and both the Acting Head and Head of the Department of Electrical and Electronic Engineering until 2002. He was also the Director of the Microelectronic Fabrication Facility, the facility that supports all the microelectronics-related research at HKUST. He is also the Director of the Advanced Electronic Packaging Laboratory, where he initiated his research in flip-chip technology. He became the Dean of Engineering in September 2003. His research interests include microelectronic devices, circuits, integrated sensors, and electronic packaging.