Parametric Representation of Objects in Color Space Using One-Class Classifiers

Aleksandr Larin1, Oleg Seredin2, Andrey Kopylov2, Sy-Yen Kuo3, Shih-Chia Huang4, and Bo-Hao Chen4

1 Moscow Institute of Physics and Technology, Moscow, Russia [email protected]
2 Tula State University, Tula, Russia [email protected], [email protected]
3 National Taiwan University, Taipei, Taiwan [email protected]
4 National Taipei University of Technology, Taipei, Taiwan [email protected], [email protected]

Abstract. Two new approaches to the parametrization of a specific (flame-representative) part of a color space, labeled by an expert, are presented. The first concept is to apply D. Tax's one-class classifier as a steerable descriptor of such a complex volumetric structure. The second concept is based on approximation of the training data by a set of elliptic cylinders arranged along the principal components. The parameters of these elliptic cylinders describe the training set. The efficiency of the approaches has been proven by an experimental study which allowed us to compare the standard Gaussian Mixture Model based approach with the two proposed in the paper.

Keywords: One-class classification, Pixel classifiers, Fire detection, Flame detection, Support Vector Data Description, PCA.

1 Introduction

One of the working stages of a computer vision system solving the flame detection problem is the description of a set of pixels that parametrically represents the objects of interest in the image. In the course of color analysis any picture pixel can be regarded as a three-dimensional vector whose coordinates are the values of the corresponding RGB components of the pixel. The construction of a mathematical model, based on a set of pixels in color space specified by an expert, allows a parametric description of the areas of interest in a feature space with given accuracy. This construction is called the parameterization problem, or Color Data Modeling [1]. Parameterization of flame pixels is an example of such a problem and is covered in this paper.

This research was supported by grants 14-07-00527 and 12-07-92000 of the Russian Foundation for Basic Research and by the Taipei-Moscow Coordination Commission on Economic and Cultural Cooperation.

P. Perner (Ed.): MLDM 2014, LNAI 8556, pp. 300–314, 2014. © Springer International Publishing Switzerland 2014


The main hypothesis of flame pixel detection in an image is based on an expert indication of fragments (rectangles, polygons, explicit labeling of pixels) that cover the flame region with some degree of accuracy. It is necessary to provide a mathematical model that enables us to classify pixels into one of two classes: "flame" and "not-flame" pixels. An example of an offhand indication of a region of pixels which acts as the training set is presented in Fig. 1. The training set region marked in Fig. 1 forms a collection of three-dimensional vectors that can be mapped into RGB space as shown in Fig. 2 (a). The Gaussian Mixture Model is widely used for color data modeling [2], and various versions of this method have been applied to flame detection. For example, the method of experimental data parametrization by spheres was explained in detail in [3]. The spatial arrangement of the spheres is based on GMM adjustment in such a way that the center of each sphere is the mean value of the corresponding distribution and the radius of the sphere is the doubled standard deviation of the same distribution (see Fig. 2 (b)). For such a parametric model, a pixel is considered to be part of the flame if it falls into at least one approximating sphere. The literature describes various methods of selecting the parameters of a mixture of normal distributions that best describes the original data set, but the most commonly used one is based on the EM (Expectation-Maximization) algorithm [5]. The adjustable parameters of the algorithm are the number of normal distributions in the mixture and the dispersion threshold for the decision.
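The sphere-membership rule described above can be sketched as follows. The sphere centers and radii here are hypothetical stand-ins for values obtained from a fitted GMM (center = distribution mean, radius = doubled standard deviation), not values from the paper's experiments:

```python
import math

def in_flame(pixel, spheres):
    """Classify an RGB pixel as 'flame' if it falls inside at least one
    approximating sphere (center, radius) derived from a fitted GMM."""
    for center, radius in spheres:
        if math.dist(pixel, center) <= radius:
            return True
    return False

# Hypothetical spheres: (center in RGB space, radius = 2 * sigma).
spheres = [((220.0, 120.0, 40.0), 60.0), ((250.0, 200.0, 90.0), 40.0)]
print(in_flame((230, 130, 50), spheres))   # a reddish-orange pixel -> True
print(in_flame((10, 10, 200), spheres))    # a blue pixel -> False
```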

Fig. 1. The image of flame. The rectangle marks training set region.

A significant restriction of this method is that the number of spheres should be set a priori. The cited paper reports that sufficient quality of flame pixel parameterization can be achieved using ten approximating spheres. The task formulation, the appearance of the training set in RGB space, and our previous experience in pattern recognition lead us to consider the initial task of color data modeling as a kind of parameterization problem which can be solved by the methods of machine learning. The specific character of the


(a)

(b)

Fig. 2. (a) The training set mapped into RGB space, (b) data approximation by spheres (the center of a sphere is the distribution mean value, the radius is the doubled standard deviation)

task lies in the fact that the training set contains objects of one class only. That is why we suggest using a specific formulation of the object identification learning task, namely, construction of a one-class classifier. The first suggested concept is to apply D. Tax's Support Vector Data Description [4] as a steerable descriptor of such a complex volumetric structure. Nevertheless, the computational time needed to form a description based on a one-class classifier is hardly predictable and depends on the number of support objects. As an alternative, we introduce another approximation of expert data in RGB space, namely by a set of elliptic cylinders in the direction of the principal component. First, the initial color space is subjected to an orthogonal transform by means of Principal Component Analysis (PCA) to reduce the correlation of color components. Next, the transformed space is divided into a set of two-dimensional layers orthogonal to the direction of the main component. Then, the ellipse of scattering is constructed for each layer. Thus, the set of parameters of such ellipses, arranged along the main component, describes the training set. This paper undertakes the study and comparison of three one-class classifiers applied to the pixel parameterization task in 3D space: the classifiers constructed according to Tax's model, the Gaussian mixture model [5], and the model of approximation by PCA elliptic cylinders. The algorithms have been compared against three criteria, namely: classification quality, parameterization time of the given data set, and classification time of new objects.

2 Support Vector Data Description

The method of one-class classification, which will be applied to solve the given problem, was introduced by D. Tax in [4] and is called Support Vector Data Description.

Fig. 3. The spherical model of data description

The peculiarity of this method lies in its strong analogy with V. Vapnik's support vector method. Let the training set be x_i ∈ R^n, i = 1, …, N, where n is the dimension of the feature space and N is the number of objects in the training set. The description model is a hypersphere with center a ∈ R^n and radius R ∈ R representing the tightest outer border around the data. The restricting hypersphere is unambiguously determined by its center and radius, which are selected in the following way: the radius should be minimal while most of the training objects stay within its border (see Fig. 3). At the same time, objects that fall outside the border of the hypersphere are penalized, and the sum of the penalties over all objects should be minimal as well. Thus, the structural error of the model should be minimized:

\[
\begin{cases}
R^2 + C \sum_{i=1}^{N} \delta_i \to \min\limits_{R,\,a,\,\delta}, \\
\|x_i - a\|^2 \le R^2 + \delta_i, \quad \delta_i \ge 0, \quad i = 1, \dots, N.
\end{cases}
\tag{1}
\]

The given problem is a quadratic programming problem and can be solved by the method of Lagrange multipliers. The dual problem, equivalent to the original, can be presented as follows:

\[
\begin{cases}
\sum_{i=1}^{N} \lambda_i (x_i \cdot x_i) - \sum_{i=1}^{N} \sum_{j=1}^{N} \lambda_i \lambda_j (x_i \cdot x_j) \to \max\limits_{\lambda}, \\
\sum_{i=1}^{N} \lambda_i = 1, \quad 0 \le \lambda_i \le C, \quad i = 1, \dots, N,
\end{cases}
\tag{2}
\]

where λ_i are the Lagrange multipliers. D. Tax suggests describing the training set in the decision procedure using only the support objects, i.e. those lying on the border of the hypersphere. A new object is considered to belong to the area of interest if the distance between the object and the centre of the hypersphere is smaller than its radius.


It follows that the one-class decision procedure for object recognition takes the form of an indicator function:

\[
d(z; \lambda, R) = I\left(\|z - a\|^2 \le R^2\right),
\tag{3}
\]

where

\[
\|z - a\|^2 = (z \cdot z) - 2 \sum_{i=1}^{N_{SV}} \lambda_i (z \cdot x_i) + \sum_{i=1}^{N_{SV}} \sum_{j=1}^{N_{SV}} \lambda_i \lambda_j (x_i \cdot x_j),
\tag{4}
\]

\[
R^2 = (x_k \cdot x_k) - 2 \sum_{i=1}^{N_{SV}} \lambda_i (x_k \cdot x_i) + \sum_{i=1}^{N_{SV}} \sum_{j=1}^{N_{SV}} \lambda_i \lambda_j (x_i \cdot x_j),
\tag{5}
\]

where N_SV is the number of support objects and x_k is any support object. D. Tax used the "kernel trick" [6,7] to move to a rectifying feature space of larger dimension, which makes the data description more "flexible" than a sphere. Polynomial and Gaussian kernels are frequently used for this purpose.
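The decision procedure (3)–(5) can be sketched directly with plain dot products (i.e., a linear-kernel SVDD). The support objects and multipliers below are hypothetical illustrative values, not the result of actually solving the dual problem (2):

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def svdd_inside(z, support, lambdas):
    """Indicator (3) evaluated via (4) and (5) with plain dot products.
    'support' (the x_i) and 'lambdas' are hypothetical values here."""
    double_sum = sum(li * lj * dot(xi, xj)
                     for li, xi in zip(lambdas, support)
                     for lj, xj in zip(lambdas, support))
    # ||z - a||^2, equation (4)
    dist2 = (dot(z, z)
             - 2 * sum(l * dot(z, xi) for l, xi in zip(lambdas, support))
             + double_sum)
    xk = support[0]  # any support object (lies on the hypersphere border)
    # R^2, equation (5)
    r2 = (dot(xk, xk)
          - 2 * sum(l * dot(xk, xi) for l, xi in zip(lambdas, support))
          + double_sum)
    return dist2 <= r2

support = [(1.0, 0.0), (0.0, 1.0)]
lambdas = [0.5, 0.5]
print(svdd_inside((0.5, 0.5), support, lambdas))  # center of the data
print(svdd_inside((3.0, 3.0), support, lambdas))  # far outside
```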

\[
K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{s^2}\right).
\tag{6}
\]

Thus, in (2), (4) and (5) the scalar product of two vectors is replaced by the kernel value of the two arguments to obtain a better data description model according to D. Tax's method. The Gaussian kernel proved more efficient for the parametrization problem of flame pixel color representation, so this function will be used further. Fig. 1 shows the rectangular area from which the pixel set was obtained to train Tax's classifier. Fig. 10 presents the border constructed around the data by the trained classifier, and Fig. 4 illustrates the application of the classifier for flame detection in the image. It should be noted that the described area becomes "denser" and restricts a smaller pixel range around the training set in RGB space when the Gaussian kernel parameter is reduced. Thus, it becomes necessary to select the optimal kernel parameter.

2.1 Selection of Optimal Parameter s for Gaussian Kernel

In his research D. Tax suggested an algorithm for selecting an optimal value s of the Gaussian kernel parameter. The essence of the algorithm is to search for the parameter s that corresponds to a given recognition error, within limits determined by the following formulas:

\[
s_{\min} = \min_{i \ne j} \|x_i - x_j\|, \qquad s_{\max} = \max_{i,j} \|x_i - x_j\|.
\]
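These search bounds can be sketched as a pairwise-distance scan over the training set (a straightforward O(N²) computation, shown here on a tiny hypothetical sample):

```python
import math

def s_range(xs):
    """Bounds s_min, s_max for the Gaussian kernel parameter search:
    the smallest and largest pairwise distances in the training set."""
    dists = [math.dist(xs[i], xs[j])
             for i in range(len(xs)) for j in range(i + 1, len(xs))]
    return min(dists), max(dists)

xs = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
print(s_range(xs))  # (5.0, 13.0)
```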


Fig. 4. The results of applying the one-class classifier trained on the set of points marked in Fig. 1

(a)

(b)

Fig. 5. The relations between the kernel parameter and (a) the leave-one-out classification error, (b) the number of support objects in the classifier's decision procedure

The search for the parameter should proceed iteratively with a given step until the recognition error corresponds to the desired one. It was proposed to estimate the recognition error with the leave-one-out cross-validation procedure. The relation between the classification error and the Gaussian kernel parameter is shown in Fig. 5 (a). Turning to the diagram of the relation between the Gaussian kernel parameter and the number of support objects taking part in the recognition decision procedure (see Fig. 5 (b)), it becomes clear that the number of support objects decreases as the kernel parameter grows. Thus, the number of support objects in the decision procedure is an indirect indicator that gives a rough estimate of the future classification error.

2.2 Optimization of Classifier's Operation Speed

Application of the implemented classifier to real-time video stream analysis has shown the need to speed up its operation at the recognition stage, since the recognition procedure must be run for each pixel of the processed image several times per second. Looking at the structure of the Gaussian kernel (6), it is evident that it inherits two characteristic features from the exponential function: it is positive at any point of its domain, and it equals 1 when its arguments are equal. The decision rule (3) can therefore be simplified to

\[
d(z) = I\left(\sum_{i=1}^{N_{SV}} \lambda_i K(z, x_i) - \sum_{i=1}^{N_{SV}} \lambda_i K(x_k, x_i) \ge 0\right),
\tag{7}
\]

where x_k is any support vector. Since the Gaussian kernel is positive, it is enough to partially calculate the first sum in expression (7); the second sum can be computed in advance, before classification takes place. Fewer operations are thus needed to evaluate the expression, so classification takes less time.

It is well known that evaluating the exponential is an expensive operation, and we expected that reducing the number of its calls would cut pixel recognition time. In our case a feature vector has the form

\[
x_i = (r_i, g_i, b_i).
\tag{8}
\]

Substituting (8) into the Gaussian kernel (6) and using a property of the exponential, we obtain:

\[
K(x_i, x_j) = \exp\left(-\frac{|r_i - r_j|^2}{s^2}\right) \exp\left(-\frac{|g_i - g_j|^2}{s^2}\right) \exp\left(-\frac{|b_i - b_j|^2}{s^2}\right).
\tag{9}
\]

Introducing the function

\[
f(v) = \exp\left(-\frac{v^2}{s^2}\right),
\tag{10}
\]

expression (9) becomes

\[
K(x_i, x_j) = f(|r_i - r_j|)\, f(|g_i - g_j|)\, f(|b_i - b_j|).
\]

Note that the value of each RGB color component is restricted to the set {0, 1, …, 255}. Therefore the absolute difference of any two color components belongs to this set as well, which means the domain of the function (10) is discrete and restricted to {0, 1, …, 255}. All values of this function can thus be computed before the main classification procedure takes place, allowing us to abandon the time-consuming evaluation of the exponential.
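The lookup-table idea can be sketched as follows; the kernel parameter value is a hypothetical choice for illustration:

```python
import math

S = 100.0  # Gaussian kernel parameter s (hypothetical value)

# Precompute f(v) = exp(-v^2 / s^2) for all 256 possible component differences.
F = [math.exp(-(v * v) / (S * S)) for v in range(256)]

def kernel_rgb(p, q):
    """Gaussian kernel (9) over RGB pixels via three table lookups, no exp calls."""
    return F[abs(p[0] - q[0])] * F[abs(p[1] - q[1])] * F[abs(p[2] - q[2])]

p, q = (200, 120, 40), (210, 130, 60)
direct = math.exp(-(10 ** 2 + 10 ** 2 + 20 ** 2) / S ** 2)
print(abs(kernel_rgb(p, q) - direct) < 1e-12)  # table matches the direct formula
```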


Fig. 6. The results of decision rule optimization

Therefore, the number of operations needed for classification was substantially reduced and the evaluation of the exponential function was avoided entirely. Comparative results are shown in Fig. 6.

3 PCA-Based Parametric Representation of an Object in Color Space

We introduce here an approximation of expert data in RGB space by a set of elliptic cylinders strung along the principal component. First, the initial color space is subjected to an orthogonal transform by means of Principal Component Analysis (PCA) to reduce the correlation of color components. Next, the transformed space is divided into a set of two-dimensional layers orthogonal to the direction of the main component. Then, the ellipse of scattering is constructed for each layer. Thus, the set of parameters of such ellipses, arranged along the main component, describes the training set.

3.1 Orthogonal Transformation of Color Space

The original RGB color space, often used for representation of expert data, is not optimal for computer processing [8], since the color components of real-world objects are strongly correlated and the space itself is not uniform. Switching to another color space, for example HSV or L*a*b*, cannot essentially improve the situation, as it does not take into account the spatial location of the data. We use here a transformation of the training set color space such that the sample variances of the data along each axis are sequentially maximized while preserving the orthogonality conditions [9], [10]. This transformation can be found by means of PCA [11]. In accordance with that method, the three eigenvectors U1, U2, U3 of the empirical sample covariance matrix of the training set, ordered by decreasing eigenvalue, form the basis of a new coordinate system. In this paper we use the Jacobi eigenvalue algorithm for the calculation of the eigenvectors of a covariance matrix. A visual interpretation of such a coordinate system is shown in Fig. 7 (a).
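The first step of this construction, the empirical sample covariance matrix whose eigenvectors form the new basis, can be sketched as follows (the eigenvector computation itself, e.g. by the Jacobi algorithm, is omitted; the sample points are hypothetical):

```python
def covariance_matrix(points):
    """Mean and empirical sample covariance of a set of 3-D color vectors;
    the covariance matrix's eigenvectors (largest eigenvalues first) would
    form the new PCA basis U1, U2, U3."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
            for j in range(3)] for i in range(3)]
    return mean, cov

pts = [(10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (-10.0, 0.0, 0.0), (0.0, -10.0, 0.0)]
mean, cov = covariance_matrix(pts)
print(mean)                              # [0.0, 0.0, 0.0]
print(cov[0][0], cov[1][1], cov[2][2])   # 50.0 50.0 0.0
```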


(a)

(b)

(c)

Fig. 7. Algorithm steps of the PCA-based method: (a) transformation of the training set in space of principal components, (b) partitioning the space of principal components by a set of planes, (c) scattering ellipse construction by PCA

The transition to the new coordinate system can be done by a superposition of a shift and a rotation of the coordinate axes.

3.2 Approximation of the Training Set by the Ellipses of Scattering

Cutting the training set in the new coordinate system orthogonally to the first principal component by planes, we get a set of layers (Fig. 7 (b)). Each layer is defined by neighboring planes and contains n_i points, where i is the number of the layer. Using PCA once again, but now in 2D space, we approximate the set of points at each layer by a scattering ellipse. The two eigenvectors of the covariance matrix of the points belonging to layer i give the directions of the principal components. Now we can define the scattering ellipse in the 2D coordinate system of principal components (Fig. 7 (c)). Choose, for example, the triple value of the standard deviation (μ = 3) as the lengths of the axes a_i and b_i of the scattering ellipse at layer i:

\[
a_i = \mu\,\sigma_{u_i^1} = \mu \sqrt{\frac{1}{n_i} \sum_{j=1}^{n_i} \left((u^1)_{ij} - (\bar u^1)_i\right)^2},
\]

\[
b_i = \mu\,\sigma_{u_i^2} = \mu \sqrt{\frac{1}{n_i} \sum_{j=1}^{n_i} \left((u^2)_{ij} - (\bar u^2)_i\right)^2}.
\]

The standard deviations of the components are the square roots of the diagonal elements of the covariance matrix. Therefore, the decision rule for recognition of objects belonging to the area defined by an expert can be obtained on the basis of the following ellipse equation:

\[
\left(\frac{(u^1)_i}{a_i}\right)^2 + \left(\frac{(u^2)_i}{b_i}\right)^2 \le 1.
\]

Parametric Representation of Objects in Color Space

309

It is clear that the more objects are found at a particular layer, the more accurate the scattering ellipse is. If the layer has only one point, the scattering ellipse cannot be built, since the standard deviations σ_{u_i^1} and σ_{u_i^2} (and hence the lengths of the ellipse axes) will be zero. This is a fairly common situation in the case of sparse data. However, some regularization can be applied if we suppose that the area of interest, given by an expert in the color space, is smooth enough, so that the scattering ellipses in neighboring layers have similar parameters. This implies two contradictory requirements: on the one hand, the scattering ellipse must fit the data in the corresponding layer as well as possible; on the other hand, the parameters of ellipses in neighboring layers must have similar values. We apply a simple dynamic programming algorithm [12] to solve the corresponding optimization task and adjust the parameters of neighboring ellipses.
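The trade-off between fitting each layer and keeping neighboring ellipses similar can be illustrated with a minimal Viterbi-style chain DP over discretized axis values. This is a hypothetical stand-in for the authors' algorithm from [12], not a reproduction of it:

```python
def smooth_axes(raw, candidates, w=1.0):
    """Pick one candidate value per layer minimizing
    sum_i (v_i - raw_i)^2 + w * sum_i (v_i - v_{i-1})^2
    by dynamic programming over the chain of layers."""
    m = len(candidates)
    cost = [(c - raw[0]) ** 2 for c in candidates]  # data term, first layer
    back = []
    for x in raw[1:]:
        new_cost, choices = [], []
        for c in candidates:
            # Best predecessor: previous cost plus smoothness penalty.
            best = min(range(m),
                       key=lambda k: cost[k] + w * (c - candidates[k]) ** 2)
            new_cost.append((c - x) ** 2 + cost[best]
                            + w * (c - candidates[best]) ** 2)
            choices.append(best)
        cost, back = new_cost, back + [choices]
    k = min(range(m), key=lambda k: cost[k])
    path = [k]
    for choices in reversed(back):  # backtrack the optimal chain
        k = choices[k]
        path.append(k)
    return [candidates[k] for k in reversed(path)]

# Three layers; the middle one has a degenerate (zero) axis estimate.
print(smooth_axes([2.0, 0.0, 2.0], [0.0, 1.0, 2.0], w=1.0))
```

With this strong smoothness weight the degenerate middle layer is pulled away from zero at the price of also shrinking its neighbors, which is exactly the compromise described above.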

4 Methodology of Experimental Study

A special model of the general population of objects was used for the experimental studies. The model is formed by points located inside spheres with a sinusoidally varying radius whose centers lie along a spiral curve in 3D space (Fig. 8 (a)). The data generator parameters were adjusted so that the general population includes about 100 thousand objects (the whole RGB cube contains 256³ ≈ 16.8 million objects). A randomly selected part of the general population was used as the training set in our experiments (Fig. 8), and the rest of the RGB cube served as objects for controlling the recognition quality.

(a)

(b)

(c)

Fig. 8. Imaging of data set that forms experimental training set in RGB space: (a) entire general population (100000 objects), (b) 1 % of the general population (1000 objects), (c) 0.1 % of the general population (100 objects)

In order to avoid inaccuracies when measuring performance, only C++ implementations of all three algorithms, assembled in one test project, were used. The source code of the GMM model implementation was taken from [13].


Experiments with the SVDD method were made using a library specifically optimized to work with the RGB object space. The method of data approximation with elliptic cylinders (PCA) was implemented independently for this study. The algorithms were compared against three criteria: classification quality, model training time, and new-object classification time. To assess the quality of the algorithms with different parameters, estimates of type I and type II errors presented in the form of ROC curves were used [14]. Both the objects of the general population and the remaining objects of the RGB space were used as control objects. The speed of classifier parameterization was estimated by repeated training on the same dataset in order to reduce measurement error; the average parameterization time was then evaluated. In the course of training the algorithm parameters were varied, as the operating time of the algorithms may depend on them. The classification time of a new RGB object was evaluated as the average time of classifying all objects of the RGB cube, both "class" and "non-class" for the target classifier.
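Each point of such a ROC curve is a (false positive rate, true positive rate) pair computed from the control-set labels and the classifier's decisions. A minimal sketch with hypothetical labels and predictions:

```python
def roc_point(labels, predictions):
    """One (FPR, TPR) point from binary ground truth and classifier output."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    pos = sum(labels)
    neg = len(labels) - pos
    return fp / neg, tp / pos

labels      = [1, 1, 1, 0, 0, 0, 0, 1]
predictions = [1, 1, 0, 0, 1, 0, 0, 1]
print(roc_point(labels, predictions))  # (0.25, 0.75)
```

Sweeping the classifier's tuning parameter (kernel parameter s, dispersion threshold, etc.) and plotting one such point per setting traces the curves of Figs. 14 and 15.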

5 Experimental Study

Figs. 9, 10, and 11 present the results of the methods described above for approximation of the source data set shown in Fig. 8 (a).

(a)

(b)

Fig. 9. Results of the separating boundary construction using the GMM model for different numbers of distributions: (a) 3, (b) 5

The numerical estimates of the algorithms' speed are shown below in the form of charts demonstrating the dependency between the size of the training set and the run time of the corresponding operations. The method parameters were selected so as to minimize the classification error (Fig. 14). The estimate of the algorithms' quality for the training set containing 1 % of the general population (Fig. 8 (b)) is presented in the form of a ROC curve (Fig. 14). The ROC curve for the training set containing 0.1 % of the general population (Fig. 8 (c)) is shown in Fig. 15.


(a)


(b)

Fig. 10. Results of the separating boundary construction that uses the SVDD classifier for different Gaussian kernel parameters s: (a) 50, (b) 100

(a)

(b)

Fig. 11. Results of the separating boundary construction that uses the PCA-based classifier for different values of dispersion threshold: (a) 0.5, (b) 1.5

Fig. 12. The diagram showing the dependency between model training time (ms) and the size of the training set (fraction of the general population)


Fig. 13. The diagram showing the dependency between classification time of a new object (ms) and the size of the training set (fraction of the general population)

Fig. 14. ROC curve characterizing the classification quality of the algorithms for the training set containing 1 % of the general population

It can be noticed that the classification error of the GMM method with a small number of clusters slightly exceeds the corresponding error of the SVDD method. The classification quality of the GMM model can be improved by increasing the number of normal distributions used. However, when the training set is small, as in the second case (about 100 objects), the model becomes subject to overfitting. A similar effect can be observed for the PCA method: its classification error increases dramatically, by an order of magnitude, as the training sample shrinks. Therefore, we can conclude that the SVDD method is the most robust in the case of insufficient training material.


Fig. 15. ROC curve characterizing the classification quality of the algorithms for the training set containing 0.1 % of the general population

6 Conclusion

Our modeling experiments show that for the considered tasks both the SVDD and PCA methods achieve better performance than classical GMMs. The PCA method is the fastest of all methods considered here, so it can be used when training and recognition time is critical. However, the PCA method, like the GMM method, is very sensitive to the size and especially the sparseness of the training set, in contrast to the SVDD method with a Gaussian kernel. Consequently, the SVDD method is the most balanced in terms of speed and quality for solving the task of pixel classification. In addition, the presence of only one tuning parameter in the SVDD method is a significant advantage, as it greatly simplifies tuning the classifier for the best quality. Further research will be aimed at applying the described classifiers to practical problems of image segmentation.

References

1. Rother, C., Kolmogorov, V., Blake, A.: GrabCut: Interactive foreground extraction using iterated graph cuts. ACM Transactions on Graphics (TOG), 309–314 (2004)
2. Ruzon, M.A., Tomasi, C.: Alpha estimation in natural images. In: Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR 2000), vol. 1, pp. 18–25. IEEE Computer Society, Los Alamitos (2000)
3. Töreyin, B.: Fire detection algorithms using multimodal signal and image analysis. PhD thesis, Institute of Engineering and Science of Bilkent University (2009)
4. Tax, D.M.J.: One-class classification: Concept-learning in the absence of counter-examples. PhD thesis, Delft University of Technology, ASCI Dissertation Series, 146 p. (2001)
5. Bilmes, J.: A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models. Technical Report TR-97-021, International Computer Science Institute and Computer Science Division, University of California at Berkeley (April 1998)
6. Vapnik, V.N.: Statistical Learning Theory. Wiley-Interscience, NY (1998)
7. Aizerman, M., Braverman, E., Rozonoer, L.: Method of Potential Functions in Machine Learning Theory. Nauka, Moscow (1970) (in Russian)
8. Paschos, G.: Perceptually uniform color spaces for color texture analysis: an empirical evaluation. IEEE Transactions on Image Processing 10(6), 932–937 (2001)
9. Tsaig, Y.: Automatic segmentation of moving objects in video sequences: a region labeling approach. Circuits and Systems for Video 12(7), 597–612 (2002)
10. Abdel-Mottaleb, M., Jain, A.: Face detection in color images. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(5), 696–706 (2002)
11. Jolliffe, I.T.: Principal component analysis. Applied Optics 44(30), 64–86 (2005)
12. Mottl, V.V., Blinov, A.B., Kopylov, A.V., Kostin, A.A., Muchnik, I.B.: Variational methods in signal and image analysis. In: Proceedings of the 14th International Conference on Pattern Recognition, Brisbane, Australia, August 16–20, vol. I, pp. 525–527 (1998)
13. Gaussian Mixture Model and Regression (2008), http://sourceforge.net/projects/gmm-gmr/
14. Fawcett, T.: ROC Graphs: Notes and Practical Considerations for Researchers. In: Proc. of the 19th International Joint Conference on Artificial Intelligence, pp. 702–707 (2005)
