1st International Conference on Advanced Technologies for Signal and Image Processing - ATSIP'2014 March 17-19, 2014, Sousse, Tunisia
MSI-18
Gases Identification with Support Vector Machines Technique (SVMs)
Souhir BEDOUI, Hekmet SAMET and Mounir SAMET
Abdennaceur KACHOURI
Department of Electrical Engineering, LETI Laboratory, National School of Engineer Sfax, University of Sfax, Tunisia
[email protected];
[email protected]
Higher Institute Of Industrial Systems Gabes (ISSIG) University of Gabes, Tunisia
[email protected]
Abstract—Air pollution is often also an olfactory pollution, because many polluting gases have a strong odor even at low concentrations. These pollutants come from natural or anthropogenic emission sources. Air pollution has many harmful effects on human health and on the environment, so it must be detected in order to reduce its effects.
For the regression estimation case, SVMs have been compared on benchmark time series prediction tests (Müller et al., 1997; Mukherjee, Osuna and Girosi, 1997), the Boston housing problem (Drucker et al., 1997), and (on artificial data) on the PET operator inversion problem (Vapnik, Golowich and Smola, 1996) [3].
An electronic nose can detect the presence of a gas after a learning phase. It consists of an array of chemical sensors and an electronic system capable of recognizing simple and complex odor patterns.
In this article we test the SVM for the detection of atmospheric pollutant gases.
The performance of a sensor array is assessed using pattern recognition methods, which can be supervised or unsupervised. The Support Vector Machine (SVM) is a supervised learning algorithm. We tested SVMs based on kernel functions to evaluate the ability of our sensor array to distinguish between different groups of gases.
Keywords—gas identification; electronic nose; sensor array; SVM; kernel function
I. INTRODUCTION
In machine learning, Support Vector Machines (SVMs) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis [1]. In particular, an SVM classifier finds the optimal hyperplane that separates two classes. This optimal hyperplane is a linear decision boundary that separates the two classes and leaves the largest margin between their samples [2]. We summarize some recent applications and extensions of support vector machines. For the pattern recognition case, SVMs have been used for isolated handwritten digit recognition (Cortes and Vapnik, 1995; Schölkopf, Burges and Vapnik, 1995; Schölkopf, Burges and Vapnik, 1996; Burges and Schölkopf, 1997), object recognition (Blanz et al., 1996), speaker identification (Schmidt, 1996), charmed quark detection, face detection in images (Osuna, Freund and Girosi, 1997a), and text categorization (Joachims, 1997).
Pollution is the degradation of a living environment by the introduction, generally by humans, of chemical or organic substances. Air pollution is a mixture of gases that alters the natural composition of the atmosphere. It can be dangerous for human health and harmful to soil, water and the environment in general. Air pollution is caused largely by vehicles and factories. To minimize its serious consequences, we need to monitor our environment. Conventional monitoring techniques, such as chromatography or colorimetric tubes, are expensive and complex, hence the idea of designing an intelligent "artificial nose".
An electronic nose is a device used for the detection of gaseous mixtures. It has applications in medicine, pharmaceuticals, environmental monitoring and other fields. It consists of an array of chemical sensors, such as fiber optic sensors, piezoelectric sensors or MOSFET-type sensors, and an electronic system capable of recognizing simple and complex odor patterns.
This paper is organized as follows. Section II presents the sensor array, the type of sensors used in our application and the characteristic parameters. Section III briefly describes the support vector machine (SVM) method. The application of this method and the results are described in Section IV, and Section V gives our conclusion.
II. GAS SENSORS
Metal Oxide Semiconductor (MOX) gas transducers are one of the preferred technologies for building electronic noses because of their high sensitivity and low price [4].
For our application, the sensor array is composed of three Taguchi Gas Sensors (TGS) of types 8xx and 2xxx, sensitive to different atmospheric gases, as summarized in Table I.

TABLE I. RECAPITULATIVE OF SENSORS
Sensor | Resistance | Target gas
TGS826 | 46 kΩ | NH3
TGS2106 | 3.9 kΩ | NO2
TGS2610 | 33 kΩ | Combustion gases

The electronic nose system used for gas detection relies on the resistance variation of the gas sensors, which makes it possible to increase selectivity and sensitivity [5]. The performance of the sensor array is assessed using pattern recognition methods [6]. We address the pattern recognition problem in two steps:
• Feature extraction
From the time responses of the sensors we extract their characteristic parameters: the initial conductance G0, the final conductance Gs, and the sensitivity S.
$$ G_0 = \frac{1}{20}\sum_{k=1}^{20} G_k \qquad (1) $$
$$ G_s = \frac{1}{20}\sum_{k=580}^{600} G_k \qquad (2) $$
$$ S = \frac{G_s - G_0}{G_0} \qquad (3) $$
• Machine learning algorithms such as Principal Component Analysis (PCA), Artificial Neural Networks (ANN), Support Vector Machines (SVM), etc.

III. SUPPORT VECTOR MACHINES (SVM)
The SVM is a machine learning algorithm which
• solves classification problems
• uses a flexible representation of the class boundaries
• implements automatic complexity control to reduce overfitting
• has a single global minimum which can be found in polynomial time [7]
A. Separable case
For the linearly separable case, the support vector algorithm searches for the separating hyperplane with the largest margin. With a linear discriminant function, there is an infinite number of solutions.
Fig. 1. Linear separating hyperplanes
As we can see in Fig. 1, different hyperplanes can separate two classes. The linear discriminant function (classifier) with the maximum margin is the best.
We must seek the hyperplane with the largest minimum distance to the training examples. This distance is called the 'margin', and the resulting hyperplane is the optimal separating hyperplane. Determining the optimal classifier is straightforward in this case, since the maximum margin classifier solves problems with linearly separable classes. For two linearly separable classes, a hyperplane has the equation:
$$ f(x) = w x + b = 0 \qquad (4) $$
The distance from a point x to the hyperplane is:
$$ d(x) = \frac{|w x + b|}{\|w\|} \qquad (5) $$
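The point-to-hyperplane distance of Eq. (5), and the margin defined as the smallest such distance over the training examples, can be illustrated with a minimal sketch. It uses only the Python standard library rather than the MATLAB environment of the paper, and the weight vector, bias and sample points are hypothetical values chosen for illustration, not data from this study.

```python
import math

def decision_value(w, b, x):
    """Evaluate f(x) = w.x + b (Eq. (4) without the '= 0')."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def distance_to_hyperplane(w, b, x):
    """Eq. (5): |w.x + b| / ||w||."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, b, x)) / norm_w

def margin(w, b, samples):
    """Smallest distance between the hyperplane and any sample."""
    return min(distance_to_hyperplane(w, b, x) for x in samples)

# Hypothetical 2-D hyperplane and points (assumed values, not from the paper).
w, b = [3.0, 4.0], -5.0
samples = [[3.0, 4.0], [-1.0, -3.0], [1.0, 3.0]]

distances = [distance_to_hyperplane(w, b, x) for x in samples]
print(distances)               # distance of each sample to the hyperplane
print(margin(w, b, samples))   # the closest sample defines the margin
```

Among all hyperplanes that separate the classes, the SVM would retain the one for which this margin value is largest.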
Maximizing this distance amounts to minimizing ||w||, subject to the classification constraints on the training samples. Solving the primal problem gives the following solution:
$$ w = \sum_i \alpha_i y_i x_i \qquad (6) $$
where the α_i are the Lagrange coefficients. The dual problem gives as a solution:
$$ w^T x + b = \sum_i \alpha_i y_i x_i^T x + b \qquad (7) $$
Fig. 2. Optimal separating hyperplane in a two-dimensional space
B. Non separable case
For the nonlinear case, kernel functions are introduced; the kernel may be polynomial, radial basis or sigmoid.
Fig. 3. Non linear case
Fig. 4. Kernel functions for non linear case
In the case of linear classification, the decision function is:
$$ f(x) = w x + b \qquad (8) $$
By introducing the kernel functions:
$$ w = \sum_i \alpha_i \varphi(x_i) \qquad (9) $$
the decision function becomes:
$$ f(x) = \sum_i \alpha_i \varphi(x_i)\varphi(x) + b \qquad (10) $$
The kernel function is given by the following formula:
$$ \varphi(x_i)\varphi(x) = k(x_i, x) \qquad (11) $$
The decision function is then:
$$ f(x) = \sum_i \alpha_i k(x_i, x) + b \qquad (12) $$
Polynomial kernel (in MATLAB, the polynomial order d is 3 by default):
$$ k(x_i, x) = (x_i x + 1)^d \qquad (13) $$
RBF kernel (δ = 1 by default in MATLAB):
$$ k(x_i, x) = \exp(-\delta \|x - x_i\|^2) \qquad (14) $$
C. Advantages and disadvantages of SVM
Advantages
• Effective in high dimensional spaces.
• Still effective in cases where the number of dimensions is greater than the number of samples.
• Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
• Versatile: different kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels. [8]
Disadvantages
• Slow training
• Does not work very well on multiple classes [9]
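To make the kernel decision function of Eq. (12) concrete, the sketch below evaluates it with the polynomial kernel of Eq. (13) and the RBF kernel of Eq. (14). It is written in plain Python rather than MATLAB; the support vectors, coefficients and bias are hypothetical values, and the coefficients are taken as signed (class labels absorbed), which is an assumption made to match the form of Eq. (12).

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def polynomial_kernel(xi, x, d=3):
    """Eq. (13): (xi.x + 1)^d, with d = 3 as the stated default."""
    return (dot(xi, x) + 1.0) ** d

def rbf_kernel(xi, x, delta=1.0):
    """Eq. (14): exp(-delta * ||x - xi||^2), with delta = 1 as the stated default."""
    sq_dist = sum((a - c) ** 2 for a, c in zip(x, xi))
    return math.exp(-delta * sq_dist)

def decision_function(alphas, support_vectors, b, x, kernel):
    """Eq. (12): f(x) = sum_i alpha_i * k(x_i, x) + b."""
    return sum(a * kernel(sv, x) for a, sv in zip(alphas, support_vectors)) + b

# Hypothetical trained model (assumed values, not results from the paper).
support_vectors = [[1.0, 0.0], [0.0, 1.0]]
alphas = [1.0, -1.0]   # signed coefficients, one per support vector
b = 0.0
x = [1.0, 0.0]         # sample to classify

score_poly = decision_function(alphas, support_vectors, b, x, polynomial_kernel)
score_rbf = decision_function(alphas, support_vectors, b, x, rbf_kernel)
print(score_poly, score_rbf)   # the sign of each score gives the predicted class
```

Only the kernel argument changes between the two calls, which reflects the versatility listed among the advantages above.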
IV. IDENTIFYING GROUPS OF GASEOUS MIXTURE BY SVM
The sensitivities of the three sensors for each target gas were used as input parameters for the SVM.
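As an illustration of how such an input vector could be formed, the following sketch applies the feature extraction of Eqs. (1) to (3) in Section II to one conductance recording per sensor. It is a plain Python sketch, not the authors' MATLAB code. Two assumptions are made: each recording holds at least 600 samples, and the final conductance is averaged over the last 20 samples (G_581 to G_600), since Eq. (2) as printed sums 21 terms while dividing by 20. The recording used below is synthetic.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def extract_features(conductance):
    """Return (G0, Gs, S) for one sensor response.

    conductance[0] is taken as G_1, so conductance[599] is G_600.
    """
    if len(conductance) < 600:
        raise ValueError("expected at least 600 conductance samples")
    g0 = mean(conductance[0:20])       # Eq. (1): first 20 samples
    gs = mean(conductance[580:600])    # Eq. (2): last 20 samples (assumption)
    s = (gs - g0) / g0                 # Eq. (3): relative conductance change
    return g0, gs, s

def feature_vector(responses):
    """Build the SVM input: one sensitivity S per sensor."""
    return [extract_features(r)[2] for r in responses]

# Synthetic response (assumed): baseline of 2.0, then a steady state of 5.0.
synthetic = [2.0] * 300 + [5.0] * 300
g0, gs, s = extract_features(synthetic)
print(g0, gs, s)

# Three sensors give a three-element input vector.
print(feature_vector([synthetic, synthetic, synthetic]))
```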
A. Group gas "alone"
First, we study the discrimination between two different gases. Three combinations are possible: H2S/NO2, H2S/SO2 or NO2/SO2.
Using MATLAB, we apply the SVM to our database. The function "svmtrain" is used to train an SVM classifier.
The following instruction is used in the simulation:
Svmstruct = svmtrain(data(train,:), groups(train), 'kernel_function', Kfun, 'showplot', true)
where Kfun can be either 'rbf' or 'polynomial'. To evaluate the ability of our electronic nose, we studied the separation within the "alone" group in the presence of two gases, H2S and NO2. The idea is to split the data into two parts: one for generating the classifier (i.e., training the SVM) and another for testing (i.e., checking whether the classifier is good).
Fig. 5. SVM linear kernel for H2S and NO2 (legend: NO2 (training), NO2 (classified), H2S (training), H2S (classified), support vectors)
The linear kernel separates the two classes, so there is no need to use other types of kernel. Fig. 5 shows the result of the linear SVM kernel for H2S and NO2. Classification performance is good: 90.37%.
B. Identification of two groups of gas mixture
We study the separation into two groups, "alone" and "binary mixture".
In this case, the linear kernel does not give good results; hence we have tried other types of kernel.
• Linear kernel: Fig. 6 shows the result of the linear SVM kernel for gas "alone" or "binary mixture".
Fig. 6. SVM linear kernel for gas "alone" or "binary mixture" (legend: mixture (training), mixture (classified), alone (training), alone (classified), support vectors)
• RBF kernel: Fig. 7 shows the result of the SVM RBF kernel (with b = 6.8131) for gas "alone" or "binary mixture".
Fig. 7. SVM 'RBF kernel' for gas "alone" or "binary mixture"
Fig. 8. Zoom for SVM 'RBF kernel' for gas "alone" or "binary mixture"
• Polynomial kernel: The result of the polynomial kernel SVM (with b = 3.3248) for gas "alone" or "binary mixture" is shown in Fig. 9.
Fig. 9. SVM 'polynomial kernel' for gas "alone" or "binary mixture"
Fig. 10. Zoom for SVM 'polynomial kernel' for gas "alone" or "binary mixture"
Table II evaluates the performance of the classifier, where:
CR = Correct rate = correctly classified samples / classified samples
ER = Error rate = incorrectly classified samples / classified samples
S = Sensitivity = correctly classified positive samples / true positive samples
SP = Specificity = correctly classified negative samples / true negative samples

TABLE II. PERFORMANCE EVALUATION
Kernel function | Number of observations | CR | ER | S | SP
Linear | 200 | 0.91 | 0.09 | 0.94 | 0.88
Polynomial | 200 | 1 | 0 | 1 | 1
RBF | 200 | 1 | 0 | 1 | 1

From the table above, we note that the problem of nonlinearity was solved using kernel functions: the two gas groups are well separated, mainly with the RBF kernel. This confirms that our sensor array is able to identify different gases and different gas groups.
V. CONCLUSION
Air pollution is a serious problem for which many studies are seeking a solution, and the identification of polluting gases is the objective of various studies. The electronic nose is among the solutions that can monitor air quality. It can identify different gaseous substances using pattern recognition methods, of which the SVM is one.
Currently, SVM is widely used in object detection and recognition, content-based image retrieval, text recognition, biometrics, speech recognition, etc.
In this paper, we presented a pattern recognition method, the SVM. This method, based on finding a hyperplane between two classes, gives acceptable results for the distinction between two different target gases and between two groups of gases, "alone" and "binary mixture". However, this method is limited to two-class classification problems.
REFERENCES
[1]
[2] Gang Zhao, Jianhao Song, Junyi Song, "Analysis about Performance of Multiclass SVM Applying in IDS," School of Information Management, Beijing Information Science & Technology University, Beijing, China, International Conference on Information, Business and Education Technology (ICIBIT 2013).
[3] Marcelo N. Kapp, Robert Sabourin, Patrick Maupin, "A dynamic model selection strategy for support vector machine classifiers," École de technologie supérieure, Université du Québec, Canada; Defense Research and Development Canada - Valcartier (DRDC Valcartier), Canada, Applied Soft Computing 12, pp. 2550–2565, 2012.
[4] Christopher J.C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Bell Laboratories, Lucent Technologies, Data Mining and Knowledge Discovery 2, pp. 121-167, 1998.
[5] Javier G. Monroy, Javier González-Jiménez and Jose Luis Blanco, "Overcoming the Slow Recovery of MOX Gas Sensors through a System Modeling Approach," Department of System Engineering and Automation, University of Malaga, Campus de Teatinos, 29071 Malaga, Spain, pp. 13664-13680, 2012.
[6] Iman Morsi, "Electronic Nose System and Artificial Intelligent Techniques for Gases Identification," Arab Academy for Science and Technology, Electronics and Communications Department, Alexandria, Egypt, 2010.
[7] Souhir BEDOUI, Rabeb FALEH and Hekmet SAMET, "Electronic Nose System and Principal Component Analysis Technique for Gases Identification," 10th International Multi-Conference on Systems, Signals & Devices (SSD), 2013.
[8] Geoff Gordon, "Support Vector Machines and Kernel Methods," [email protected] 15, 2004.
[9] Parin M. Shah, "Face Detection from Images Using Support Vector Machine," Master's Projects, San Jose State University, SJSU ScholarWorks, 2012.
[10] Andreas Vlachos, "Active Learning with Support Vector Machines," Master of Science, School of Informatics, University of Edinburgh, 2004.
This work was supported by CMPTM 13/TM32
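As a supplementary check on the four performance measures defined for Table II (CR, ER, S and SP), the sketch below computes them from confusion counts. The paper reports only the resulting rates, so the counts used here are hypothetical: they assume 100 positive and 100 negative observations among the 200, and were chosen because they are consistent with the Linear row of Table II. Which group is treated as "positive" is also an assumption.

```python
def evaluate(tp, fn, tn, fp):
    """Compute CR, ER, S and SP from confusion counts.

    tp/fn: positive samples classified correctly/incorrectly.
    tn/fp: negative samples classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "CR": (tp + tn) / total,   # correct rate
        "ER": (fp + fn) / total,   # error rate
        "S": tp / (tp + fn),       # sensitivity
        "SP": tn / (tn + fp),      # specificity
    }

# Hypothetical counts assuming a 100/100 class split (not stated in the paper).
linear = evaluate(tp=94, fn=6, tn=88, fp=12)
print({name: round(value, 2) for name, value in linear.items()})

# A perfect classifier, as reported for the polynomial and RBF kernels.
perfect = evaluate(tp=100, fn=0, tn=100, fp=0)
print(perfect)
```

Under this assumed split, CR is the average of S and SP, which is why 0.94 and 0.88 are compatible with a correct rate of 0.91.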