Survey on LBP based texture descriptors for image classification

Expert Systems with Applications 39 (2012) 3634–3641

Contents lists available at SciVerse ScienceDirect

Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa

Loris Nanni a,⇑, Alessandra Lumini b, Sheryl Brahnam c

a Department of Information Engineering, University of Padua, Via Gradenigo 6/B, 35131 Padova, Italy
b DEIS, University of Bologna, Via Venezia 52, 47521 Cesena, Italy
c Computer Information Systems, Missouri State University, 901 S. National, Springfield, MO 65804, USA

Article info

Keywords: Texture descriptors; Local binary patterns; Local quinary patterns; Support vector machines; Random subspace

Abstract

The aim of this work is to find the best way of describing a given texture using a local binary pattern (LBP) based approach. First, several different approaches are compared; then the best fusion approach is tested on different datasets and compared with several approaches proposed in the literature (for fair comparisons, when possible we have used code shared by the original authors). Our experiments show that a fusion approach based on a uniform local quinary pattern (LQP) and a rotation invariant local quinary pattern, where a bin selection based on variance is performed and a Neighborhood Preserving Embedding (NPE) feature transform is applied, performs well on all tested datasets. As the classifier, we have tested a stand-alone support vector machine (SVM) and a random subspace ensemble of SVMs. We compare several texture descriptors and show that our proposed approach coupled with a random subspace ensemble outperforms other recent state-of-the-art approaches. This conclusion is based on extensive experiments conducted in several domains using six benchmark databases. © 2011 Elsevier Ltd. All rights reserved.

⇑ Corresponding author: L. Nanni.
0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2011.09.054

1. Introduction

Science is becoming increasingly computational and data based, which has resulted in an explosion of scientific data. Since the turn of the century, vast volumes of scientific data have been warehoused in data centers around the world. There are ongoing efforts to curate this data and make it publicly accessible for continued analysis by researchers in various fields (Hey, Tansley, & Tolle, 2009). Large international data registries are being formed, and researchers worldwide will soon have access to data stemming from multiple experimental sources. A problem of increasing relevance is developing classifier systems that can handle all the scales and shapes of data stemming from large international stores and experimental centers. This need is being felt in medicine, as medical units and hospital centers stockpile valuable medical data from multiple laboratory reports. Since much of this data, especially in the medical field, is based on images, general purpose methods for mining image databases across a wide array of domains need to be developed.

Many state-of-the-art machine learning methods utilize image texture descriptors (Liua, Zhang, Hou, Li, & Yang, 2010). Local binary patterns (LBP), first proposed in Ojala, Pietikainen, and Maeenpaa (2002), is one of the most widely used descriptors because of its resistance to lighting changes, low computational

complexity, and ability to code fine details. LBP has been extensively studied in a wide array of fields and has demonstrated superior performance in several comparative studies (Ahonen, Hadid, & Pietikainen, 2006; Heikkilä & Matti Pietikäinen, 2006; Ojala et al., 2002; Zhao & Pietikäinen, 2007). In medicine, LBP has been used to identify malignant breast cells (Oliver, Lladó, Freixenet, & Martí, 2007), to find relevant slices in brain MR (magnetic resonance) volumes (Unay & Ekin, 2008), and as a texture feature extracted from thyroid slices (Keramidas, Iakovidis, Maroulis, & Dimitropoulos, 2008). Recent work has also investigated LBP in automated cell phenotype image classification (see, for example, Nanni & Lumini, 2008b). A collection of papers that explore LBP in the medical field is available at http://www.ee.oulu.fi/mvg/page/lbp_ bibliography#biomedical. Other fields where LBP has been recently investigated include face recognition (Ahonen et al., 2006; Nanni & Lumini, 2007; Zhao & Pietikäinen, 2007) and biometrics (Nanni & Lumini, 2008a). Another interesting variant has been proposed by Tan and Triggs (2007) to solve the problem of the sensitivity to noise in near-uniform image regions. This method, called local ternary patterns (LTP), proposed a 3-valued coding that includes a threshold around zero for the evaluation of the local gray-scale difference. Numerous variants of LBP descriptors have been proposed within the last five years. Several works only utilized the uniform patterns. Combining uniform patterns with a few non-uniform patterns was shown to improve performance in Zhou, Wang, and Wang (2008). Rotation invariant patterns have been explored in Liao, Law, and Chung (2009), where patterns that represent 80%


of all the patterns in training data are used. Several other variants have also been proposed (Nanni et al., in press).

In this paper, we expand the tests reported in Nanni, Brahnam, and Lumini (2010). Our objective is to find an optimal combination of LBP based descriptors that performs well in several domains. We suggest combining a random subspace (RS) of SVMs trained using the uniform bins extracted by LQP with a random subspace of SVMs trained using the rotation invariant bins, where a bin selection is performed based on variance and a Neighborhood Preserving Embedding (NPE) feature transform is applied.

In Section 2 we review related work on LBP. In Section 3 we present our approach. In Section 4 we describe the datasets, and in Section 5 we report results using our approach on six benchmark databases that span a number of domains, comparing our results to the state of the art reported in the literature using the same databases. We conclude in Section 6 with a few remarks and suggestions for future research.

2. Related work on LBP

The basic idea behind LBP is that an image is composed of micropatterns. LBP is the first-order circular derivative of patterns that is generated by concatenating the binary gradient directions. A histogram of these micropatterns contains information about the distribution of edges and other local features in an image. The conventional LBP operator (Ojala et al., 2002) extracts information that is invariant to local grayscale variations in the image. It is computed at each pixel location, considering the values of a small circular neighborhood (with radius R pixels) around the value of a central pixel q_c. Formally, the LBP operator is defined as follows:

$$\mathrm{LBP}(P, R) = \sum_{p=0}^{P-1} s(q_p - q_c)\, 2^p \qquad (1)$$

where P is the number of pixels in the neighborhood, R is the radius, and s(x) = 1 if x ≥ 0, otherwise 0. The histogram of these binary numbers is then used to describe the texture of the image. Two types of patterns are distinguished: uniform patterns, which have at most two bitwise transitions between 0 and 1, and nonuniform patterns. A simple method for selecting a set of uniform patterns is to choose the rotation invariant bins (see Ojala et al. (2002) for details).

2.1. Centralized binary pattern (CBP)

CBP, proposed in Fu and Wei (2008), compares pairs of pixels in the neighborhood, as seen in Eq. (2), which shortens the otherwise rather long histogram, and it considers the local effect by assigning the largest weight to the central pixel. Formally, CBP is defined as follows:

$$\mathrm{CBP}(P, R) = \sum_{p=0}^{(P/2)-1} s\left(q_p - q_{p+(P/2)}\right) 2^p + s\left(q_c - \frac{1}{P+1}\left(\sum_{p=0}^{P-1} q_p + q_c\right)\right) 2^{P/2} \qquad (2)$$

where s(x) = 1 if |x| ≥ τ, otherwise 0. The value τ is some threshold constant.

2.2. Completed LBP (CLBP)

CLBP, proposed in Guo, Zhang, and Zhang (in press), utilizes both the sign and magnitude information in the difference d between the central pixel, q_c, and some pixel in its neighborhood, q_p. In the conventional LBP operator only the sign component of d is utilized. If d_p = q_p − q_c, its sign s_p is as in Eq. (1): s_p(d_p) = 1 if d_p ≥ 0, otherwise 0. CLBP utilizes the magnitude m_p of d_p, where m_p = |d_p|, for additional discriminant power. CLBP also considers the intensity of the central pixel, q_c. Thus, three operators are defined in CLBP: CLBP_S, which considers the sign component of the difference, CLBP_M, which considers the magnitude component of the difference, and CLBP_C, which considers the intensity of the central pixel. CLBP_S is the conventional LBP sign operator s(x). CLBP_M is defined as follows:

$$\mathrm{CLBP\_M}_{P,R} = \sum_{p=0}^{P-1} t(m_p, c)\, 2^p \qquad (3)$$

where t(x, c) = 1 if x ≥ c, otherwise 0, and c is the mean value of the absolute differences between a pixel and a neighbor. CLBP_C is defined as follows:

$$\mathrm{CLBP\_C}_{P,R} = t(q_c, \tau_1) \qquad (4)$$

where t is defined as in Eq. (3) and τ_1 is the average gray level of the entire image. These three codes are then combined to form the CLBP feature map of the original image.

2.3. Three-patch (TPLBP) and four-patch LBP (FPLBP) codes

The TPLBP and FPLBP codes, first proposed by Wolf, Hassner, and Taigman (2008), compare the values of either three or four patches to produce a bit code for each pixel. In TPLBP, for each pixel q in an image, a patch C of size w × w pixels is centered around q, along with S patches distributed uniformly in a ring of radius r around C. Pairs of patches that lie α patches apart along the ring are compared with the central patch, and the resulting code has S bits per pixel. Formally, the TPLBP operator is defined as follows:

$$\mathrm{TPLBP}_{r,S,w,\alpha}(p) = \sum_{i}^{S} f\left(d(C_i, C_p) - d(C_{i+\alpha \bmod S}, C_p)\right) 2^i \qquad (5)$$

where C_p is the central patch, C_i and C_{i+α mod S} are two patches along the ring, d(·, ·) is any distance measure, and f(x) = 1 if x ≥ τ, otherwise 0. The value τ is set slightly greater than zero.

In FPLBP two rings of radii r_1 and r_2 are considered for each pixel in the image, and S patches of size w × w are spread out evenly around each ring. Two center symmetric patches in the first ring are compared with two center symmetric patches in the second ring that lie α patches away along the circle. The code is determined according to which of the two pairs of patches is the closest. The length of the FPLBP binary code is S/2. Formally, the FPLBP operator is defined as follows:

$$\mathrm{FPLBP}_{r_1,r_2,w,\alpha}(p) = \sum_{i}^{S/2} f\left(d(C_{1,i}, C_{2,i+\alpha \bmod S}) - d(C_{1,i+S/2}, C_{2,i+S/2+\alpha \bmod S})\right) 2^i \qquad (6)$$

In both TPLBP and FPLBP, the image is then divided into nonoverlapping regions and a histogram is computed for each region. Histograms are normalized to unit length and concatenated into a single vector.

2.4. Fuzzy LBP (FLBP)

FLBP, proposed in Iakovidis, Keramidas, and Maroulis (2008), incorporates fuzzy logic into LBP using a set of fuzzy rules. Given some neighborhood of P pixels around a central point q_c along a circle of radius R, FLBP uses the two following rules.

Given the difference in conventional LBP of d_p = q_p − q_c, then

Rule 0: The more negative d_p, the greater the certainty that sign(d_p) = 0.
Rule 1: The more positive d_p, the greater the certainty that sign(d_p) = 1.

Two membership functions, μ_0 and μ_1, can be defined based on these rules. Let μ_0 define the degree to which d_p is negative and sign(d_p) = 0; then the decreasing function μ_0 can be defined as follows:

$$\mu_0(p) = \begin{cases} 0 & \text{if } d_p \ge F \\ \dfrac{F - d_p}{2F} & \text{if } -F < d_p < F \\ 1 & \text{if } d_p \le -F \end{cases} \qquad (7)$$

Let μ_1 define the degree to which d_p is positive and sign(d_p) = 1; then μ_1 can be defined as follows:

$$\mu_1(p) = \begin{cases} 1 & \text{if } d_p \ge F \\ \dfrac{F + d_p}{2F} & \text{if } -F < d_p < F \\ 0 & \text{if } d_p \le -F \end{cases} \qquad (8)$$

where F in both μ_0 and μ_1 is a parameter that controls the degree of fuzziness. A given texton can therefore receive more than one LBP code under FLBP. The degree to which an LBP code represents a texton depends on both μ_0 and μ_1. FLBP histograms are more informative, as seen by the fact that they do not have bins with zero values and there are more spikes.

2.5. Local derivative pattern (LDP)

LBP can be considered a nondirectional, first-order local pattern operator. LDP, proposed in Zhang, Gao, Zhao, and Liu (2010), is a local pattern operator that considers higher order derivatives. Second-order derivatives encode the turning point in the change of the derivative direction among local neighborhoods. The nth order derivative captures detailed directional information. Given an image I and a neighborhood of size 3 × 3, LBP can be defined through the thresholding function f(·, ·), which can formally be represented as follows:

$$f(I(Q_0), I(Q_i)) = \begin{cases} 1 & \text{if } I(Q_0) - I(Q_i) \ge \tau \\ 0 & \text{otherwise} \end{cases} \qquad (9)$$

where i = 1, 2, ..., 8 and τ = 0.

3. Proposed approach

In order to reduce the sensitivity to noise in near-uniform image regions, in (Nanni, Brahnam, et al., 2010) we suggested using a five-value encoding, named quinary, in order to obtain a more robust descriptor. In this variant the difference between the gray value of the center pixel x and the gray value of one of its neighbors u is encoded with five values according to two thresholds τ_1 and τ_2:

$$D = \begin{cases} 2, & u \ge x + \tau_2 \\ 1, & x + \tau_1 \le u < x + \tau_2 \\ 0, & x - \tau_1 \le u < x + \tau_1 \\ -1, & x - \tau_2 \le u < x - \tau_1 \\ -2, & \text{otherwise} \end{cases} \qquad (10)$$

The quinary pattern (Nanni, Brahnam, et al., 2010) is split into four binary patterns (Fig. 1) according to the following binary function b_c(x), c ∈ {−2, −1, 1, 2}:

$$b_c(x) = \begin{cases} 1, & x = c \\ 0, & \text{otherwise} \end{cases} \qquad (11)$$

Fig. 1. An example of splitting a quinary code into four LBP codes.
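The quinary coding of Eq. (10) and the split of Eq. (11) can be illustrated on a single neighborhood. The following is only a minimal sketch, not the authors' implementation: the function names, the sample gray values, and the thresholds τ_1 = 3 and τ_2 = 8 are assumptions chosen for illustration, and the neighbors are assumed to be already sampled on the circle. The conventional LBP code of Eq. (1) is included for comparison.

```python
from typing import Dict, Sequence


def lbp_code(center: float, neighbors: Sequence[float]) -> int:
    """Conventional LBP code (Eq. 1): bit p is set when q_p - q_c >= 0."""
    return sum(1 << p for p, q in enumerate(neighbors) if q - center >= 0)


def quinary_value(u: float, x: float, tau1: float, tau2: float) -> int:
    """Five-level code D of neighbor u relative to center x (Eq. 10)."""
    if u >= x + tau2:
        return 2
    if u >= x + tau1:
        return 1
    if u >= x - tau1:
        return 0
    if u >= x - tau2:
        return -1
    return -2


def lqp_codes(center: float, neighbors: Sequence[float],
              tau1: float, tau2: float) -> Dict[int, int]:
    """Split the quinary pattern into four binary codes, one per level c (Eq. 11)."""
    pattern = [quinary_value(u, center, tau1, tau2) for u in neighbors]
    return {
        c: sum(1 << p for p, d in enumerate(pattern) if d == c)
        for c in (-2, -1, 1, 2)
    }


if __name__ == "__main__":
    # Hypothetical gray values: one center pixel and P = 8 neighbors.
    neighbors = [60, 52, 49, 30, 50, 45, 58, 70]
    print(lbp_code(50, neighbors))            # single binary code
    print(lqp_codes(50, neighbors, 3, 8))     # four binary codes, keyed by c
```

Each of the four binary codes can then be accumulated into its own histogram over the image, in the same way as a conventional LBP code.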


Step 8: A support vector machine is trained and tested using the features selected in Step 7.
Step 9: Steps 7–8 are performed 50 times.
Step 10: The 50 classifier results are then combined using the sum rule, obtaining the set of class similarities named SCORE-B.
Step 11: The final score is given by the sum of SCORE-A and SCORE-B.

As can be seen in step 1, we use a variance selection process. We select the histogram bins with the highest variance in the training data instead of the dominant pattern selection method proposed in Liao et al. (2009). After selecting a random subset of features in step 2, Neighborhood Preserving Embedding (NPE), a feature transform technique, is used as a bin selector in step 3. To reduce computation time, however, we use PCA as a feature reduction method. Since we retain 99.999% of the variance in the data (see footnote 1), little information is lost in the application of PCA in this step. The classifier used in our experiments is the support vector machine (SVM) (Cristianini & Shawe-Taylor 2000). In step 4, we combine 50 classifiers by sum rule.

3.1. Random subspace

Random subspace (RS) reduces dimensionality by randomly sampling subsets of features (50% of all the features in our experiments). It modifies the training data set by generating K (K = 50 in our experiments) new training sets. Classifiers are built on these modified training sets, that is, one classifier is trained on each of the new training sets. The results are combined using the sum rule. The random subspace ensemble method is a three step process:

1. Given a d-dimensional data set $D = \{(x_j, t_j) \mid 1 \le j \le m\}$, $x_j \in \mathbb{R}^d$, $t_j \in C = \{1, \ldots, c\}$, n new projected k-dimensional data sets $D_i = \{(P_i(x_j), t_j) \mid 1 \le j \le m\}$ are generated ($1 \le i \le n$), where P_i is a random projection. P_i is obtained by randomly selecting, through the uniform probability distribution, a k-subset $A = \{a_1, \ldots, a_k\}$ from $\{1, 2, \ldots, d\}$ and setting $P_i(x_1, \ldots, x_d) = (x_{a_1}, \ldots, x_{a_k})$.
2. Each new data set D_i is given as input to a fixed learning algorithm L, which outputs the classifier h_i for all i, 1 ≤ i ≤ n.
3. The final classifier h is obtained by aggregating the base classifiers h_1, ..., h_n through a given decision rule.

3.2. Neighborhood preserving embedding

Unlike PCA, which aims at preserving the global Euclidean structure, Neighborhood Preserving Embedding (NPE), first proposed in (He, Cai, Yan, & Zhang, 2005), preserves the local neighborhood structure on the data manifold. As a result, NPE is less sensitive to outliers than PCA. The first step in the NPE algorithm is to construct an adjacency graph, where the ith node of the graph corresponds to the ith training pattern x_i. An edge between the nodes i and j is built if x_j is among the K nearest neighbors of x_i. The weight of the edge from node i to node j is computed by minimizing a given objective function. A generalized eigenvector problem is then solved to compute the linear projections. The MATLAB code used in our experiments is freely available at http://www.cs.uiuc.edu/homes/dengcai2/Data/data.html.

Footnote 1:
    w = pca(dataset(TR), 0.99999);   % pca is a function of PRTools 3.1.7
    TR = +(w * dataset(TR));
    TE = +(w * dataset(TE));
    clear w;
    options = [];
    options.k = 5;
    options.NeighborMode = 'Supervised';
    options.gnd = yTR;
    [eigvector, eigvalue] = NPE(options, TR);
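The variance-based bin selection and the random subspace ensemble with sum-rule fusion described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the experiments: all function names are hypothetical, the PCA and NPE transforms are omitted, and a nearest-centroid scorer stands in for the SVM base classifier so that the sketch needs only the Python standard library.

```python
import random
from statistics import pvariance
from typing import Dict, Hashable, List, Sequence

Vector = Sequence[float]


def top_variance_bins(train: Sequence[Vector], n_bins: int) -> List[int]:
    """Indices of the n_bins histogram bins with the highest training variance."""
    n_features = len(train[0])
    variances = [pvariance([row[j] for row in train]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:n_bins])


def fit_centroids(X: Sequence[Vector], y: Sequence[Hashable]) -> Dict[Hashable, List[float]]:
    """Stand-in base learner (assumption): one mean vector per class, not an SVM."""
    sums: Dict[Hashable, List[float]] = {}
    counts: Dict[Hashable, int] = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def class_scores(centroids: Dict[Hashable, List[float]], row: Vector) -> Dict[Hashable, float]:
    """Class similarity: negative squared distance to each class centroid."""
    return {
        label: -sum((a - b) ** 2 for a, b in zip(row, mu))
        for label, mu in centroids.items()
    }


def random_subspace_predict(X_train, y_train, X_test,
                            n_models: int = 50, fraction: float = 0.5, seed: int = 0):
    """Train n_models base learners on random feature subsets; fuse by sum rule."""
    rng = random.Random(seed)
    d = len(X_train[0])
    k = max(1, int(fraction * d))
    totals: List[Dict[Hashable, float]] = [dict() for _ in X_test]
    for _ in range(n_models):
        subset = rng.sample(range(d), k)          # random projection P_i
        proj_train = [[row[j] for j in subset] for row in X_train]
        model = fit_centroids(proj_train, y_train)
        for total, row in zip(totals, X_test):
            for label, s in class_scores(model, [row[j] for j in subset]).items():
                total[label] = total.get(label, 0.0) + s   # sum rule
    return [max(total, key=total.get) for total in totals]
```

In use, the bins returned by `top_variance_bins` would be applied to both the training and the test histograms before calling `random_subspace_predict`; replacing `fit_centroids` and `class_scores` with an SVM that outputs class scores would bring the sketch closer to the setup described in the text.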


4. Datasets

In this section we describe the datasets and evaluation protocols used in the experimental section. Unless otherwise mentioned, the protocol used is a fivefold cross validation technique where each dataset is randomly divided into 4/5ths for training and 1/5th for testing. If a database contains two classes, we use a base SVM. If the database contains more than two classes, we use a one versus one SVM. In the two-class problems we adopt the area under the ROC curve (Fawcett, 2004) as the performance indicator; in the multi-class problems we use accuracy. The ROC curve is a two-dimensional plot of the probability of correctly classifying the genuine examples against the rate of incorrectly classifying impostor examples, and the area under it summarizes classification performance. All images, except those in the DNA dataset, are preprocessed by transforming the values using contrast-limited adaptive histogram equalization (MATLAB function adapthisteq.m).

4.1. 2D HeLa dataset

The 2D HeLa dataset contains 862 single-cell images (Chebira et al., 2007); it is available at http://murphylab.web.cmu.edu/. Each image is a 16 bit greyscale image of size 512 by 382 pixels. The dataset has ten classes: ActinFilaments, Endosome, ER, Golgi Giantin, Golgi GPP130, Lysosome, Microtubules, Mitochondria, Nucleolus, and Nucleus. When using dominant LBP/LTP, the bin with the highest occurrence is discarded because it represents the black background.

4.2. Pap smear dataset

The pap smear database contains 917 samples collected at the Herlev University Hospital using a digital camera and microscope (Jantzen, Norup, Dounias, & Bjerregaard, 2005). The dataset has two classes: normal versus abnormal. The classification of cells was determined by two cyto-technicians. In the case where the two disagreed, a medical doctor classified the cells.

4.3. LOCATE mouse protein sub-cellular localization endogenous database

The LOCATE mouse protein sub-cellular localization endogenous database contains approximately 50 images. Each image contains somewhere between 1 and 13 cells per class (Fink et al., 2006); the database is available at http://locate.imb.uq.edu.au/. The dataset is divided into 11 classes: Actin-Cytoskeleton, Cytoplasm, Endosomes, ER, Golgi, Lysosomes, Microtubule, Mitochondria, Nucleus, Peroxisomes, PM. When using dominant LBP/LTP, the bin with the highest occurrence is discarded because it represents the black background.

4.4. DNA-binding proteins (DNA)

In this problem the textures are extracted from the 2-D distance matrix obtained from the 3-D tertiary structure of a given protein (Nanni, Shi, Brahnam, & Lumini 2010). Instead of considering all atoms in the protein, the distance matrix is calculated by considering only those atoms that belong to the protein backbone. Given a protein P_i, as a first step we need to extract its backbone: it is described as a vector $B_i = \{Coor^i_{\alpha,1}, Coor^i_{\alpha,2}, \ldots, Coor^i_{\alpha,N}\}$, where $Coor^i_{\alpha,n}$ is the coordinate vector of the nth Cα atom. The distance matrix is defined as the matrix $DM = \{dm_i(p, q) = dist(Coor^i_{\alpha,p}, Coor^i_{\alpha,q})\}$, where dist() is simply the Euclidean distance between the two sets of coordinates (considered as vectors) and 1 ≤ p, q ≤ N. The MATLAB code for extracting the distance matrix is available at http://bias.csr.unibo.it/nanni/DM.zip.

We have used the dataset reported in (Fang, Guo, Feng, & Li 2008), which contains 118 DNA-binding proteins and 231 non-DNA-binding proteins. These proteins have less than 35% sequence identity between each pair. DNA-binding proteins are proteins that are composed of DNA-binding domains and thus have a specific or general affinity for either single or double stranded DNA. Sequence-specific DNA-binding proteins generally interact with the major groove of B-DNA. A sample distance matrix extracted from the proteins of this dataset is reported in Fig. 2. For this database we used the 10-fold cross validation protocol.

Fig. 2. Texture extracted from a protein.

4.5. Plant leaf identification

This is a dataset of several species from Brazilian flora (Casanova, Joaci de Mesquita, & Bruno 2009). A resolution of 1200 dpi (dots per inch) was used to obtain highly detailed leaf texture images. A total of 400 samples, divided into 20 classes (20 samples per class), were collected. The process of foliar acquisition was not perfect. The presence of dirt on the leaf or on the scanner glass produces noise in the final image. It is important to note also that the digitization of a 3D surface to a 2D surface produced shadows and/or superimpositions. Three windows (128 × 128 pixels) were extracted manually from each sample, making a total of 1200 textures. The protocol used in our experiments is a fivefold cross validation technique (with the constraint that all the windows extracted from a given leaf belong either to the training set or to the test set).

4.6. BREAST cancer

Breast cancer is the major cause of cancer-related deaths among adult women. It is known that the best prevention method is early diagnosis. For our experiments in this domain, we used the Digital Database for Screening Mammography (DDSM), a publicly available database of digitized screen-film mammograms. This database has two classes: benign and malignant tissues. We selected the same 273 malignant and 311 benign images used in (Junior, Cardoso de Paiva, Silva, & Muniz de Oliveira, 2009). It is very important to note that in this dataset the images have very different dimensions, so we had to normalize the histograms extracted from each image to obtain a good performance.

5. Experimental results

Table 1
Performance obtained by LBP with P = 16 and R = 2.

LBP             2D-Hela   PAP     LOCATE
LBP-riu         0.827     0.749   0.833
Dom K = 80%     0.837     0.831   0.825
Dom K = 90%     0.797     0.829   0.858
VR(100)         0.826     0.822   0.843
VR(250)         0.808     0.851   0.849
R-Dom K         0.900     0.828   0.880
R-VR(250)       0.839     0.843   0.870
R-VR            0.896     0.848   0.871
R-RIU           0.776     0.730   0.774

Table 2
Performance obtained by LTP with P = 16 and R = 2.

LTP             2D-Hela   PAP     LOCATE
LTP-riu         0.920     0.829   0.913
Dom K = 80%     0.893     0.834   0.884
Dom K = 90%     0.855     0.822   0.871
VR(100)         0.892     0.849   0.872
VR(250)         0.853     0.853   0.886
R-Dom K         0.922     0.853   0.913
R-VR(250)       0.901     0.853   0.886
R-VR            0.932     0.868   0.929
R-RIU           0.909     0.814   0.907
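For the two-class datasets, the performance indicator is the area under the ROC curve. As a minimal sketch (the function name and the sample scores are hypothetical, and this is not the evaluation code used in the experiments), the area can be computed from classifier scores through its rank interpretation: the probability that a randomly chosen genuine example receives a higher score than a randomly chosen impostor, with ties counted as one half.

```python
from typing import Sequence


def area_under_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC via pairwise comparison of positive (1) and negative (0) scores."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute the AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # genuine example ranked above impostor
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical classifier scores for three genuine and two impostor samples.
    print(area_under_roc([0.9, 0.8, 0.35, 0.6, 0.2], [1, 1, 1, 0, 0]))
```

The pairwise form is quadratic in the number of samples, which is acceptable for datasets of the size used here; a sort-based rank computation would give the same value more efficiently.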

In Table 1, we report our experimental results using the system architecture and datasets with performance indicators described in previous sections. We selected the best approach by running tests on only three datasets (due to computational issues). We then compared our selected approach with the literature using all six datasets. For all the texture descriptors, we employed a linear support vector machine with default parameters (i.e., the default values of LIBSVM). Better results could be obtained if parameter tuning of the classifiers had been performed separately for each dataset. We once again draw attention to the aim of this work, which is to compare several texture descriptors across databases and not to optimize the performance of the classifier system for each dataset.

The first set of experiments uses LBP with P = 16 and R = 2. LBP-riu reports experiments using the standard rotation invariant uniform LBP descriptor. Dom K = X% reports experiments using DLBP with K = X. VR(X) reports experiments using the X bins with highest variance in the training data. R-Dom K reports experiments using a random subspace method coupled with Dom K = 90%. R-VR(X) reports experiments using a random subspace of SVMs obtained starting from the features selected by VR(X). R-VR reports experiments using the ensemble method reported above that is based on the combination of VR(250), random subspace, and a feature transform using NPE. R-RIU reports the performance obtained by a random subspace of SVMs obtained starting from the LBP-riu features.

A similar set of experimental results is reported in Table 2. In these experiments LTP has P = 16 and R = 2. The threshold τ in LTP is 3. In Table 3 LTP with uniform bins is considered. LTP-u reports experiments using all the uniform bins. R-U reports the performance obtained by a random subspace of SVMs obtained starting from the uniform bins extracted by LTP.

Table 3
Performance obtained by LTP (only the uniform bins are considered) with P = 16 and R = 2.

LTP             2D-Hela   PAP     LOCATE
LTP-u           0.870     0.889   0.935
Dom K = 80%     0.361     0.873   0.502
Dom K = 90%     0.761     0.884   0.878
VR(100)         0.796     0.865   0.845
VR(250)         0.830     0.873   0.906
R-Dom K         0.845     0.886   0.871
R-VR(250)       0.845     0.880   0.916
R-VR            0.840     0.874   0.911
R-U             0.890     0.895   0.945

Examining the three tables above, we can make the following conclusions:

• LTP outperforms LBP.
• Both dominant LBP and variance selection for LBP outperform standard LBP.
• None of the selection methods works well with LTP. Neither Dom K = 80% nor Dom K = 90%, when coupled with LTP, improves LTP. Also, VR(250) does not improve LTP.
• A random subspace of SVMs does not perform remarkably well when coupled with LBP-riu and LTP-riu (see the row R-RIU in the tables above); instead it works well when coupled with LTP-u. In our opinion this is due to the high dimension of the uniform bin histogram (the curse of dimensionality problem).
• In some datasets the best results are obtained considering the uniform bins; in other datasets the best results are obtained considering the rotation invariant bins.

We also ran tests that vary the R-VR method; these are reported in Table 4:

• We tested different feature transforms (NPE, Orthogonal Locality Preserving Projections (OLPP), Intrinsic Discriminant Analysis (IDA)), with different feature dimension reductions (dim).
• We tested Mutual Information (named MI in Table 4) as the bin selector (for selecting the 250 bins) coupled with NPE (dim = 45).
• We tested a weighted random subspace (Yaslan & Cataltepe, 2010) (named WR in Table 4) instead of the standard random subspace coupled with NPE (dim = 45).
• We examined the performance of our idea of coupling random subspace and NPE using as starting features the bins extracted using Dominant LTP. This test shows that the idea of using NPE starting from a random subspace also improves the performance of Dominant LTP.

• Finally, we report the performance of the whole system described in Section 3, named FUS in Table 4 (notice that in FUS, LQP is used instead of LTP to extract the histogram).

Table 4
Performance obtained by LTP with P = 16 and R = 2.

                              2D-Hela   PAP     LOCATE
NPE (dim = 15)                0.890     0.868   0.923
NPE (dim = 30)                0.932     0.868   0.929
NPE (dim = 45)                0.932     0.868   0.933
OLPP (dim = 15)               0.728     0.869   0.938
OLPP (dim = 30)               0.825     0.871   0.937
OLPP (dim = 45)               0.893     0.872   0.937
IDA (dim = 45)                0.903     0.866   0.939
R-Dom K + NPE (dim = 30)      0.923     0.858   0.940
MI                            0.912     0.868   0.938
WR                            0.880     0.880   0.928
FUS                           0.932     0.885   0.959

The tests reported in Table 4 show that the best approach is FUS. On average the best feature transform is NPE, and neither MI nor WR improves the performance.

In Table 5 we report our attempts to improve the performance of FUS by also considering the magnitude features. In (Guo et al., in press) it was shown that the magnitude features can be used for improving the performance of LBP. First the absolute values of the differences between a pixel and its neighbors are calculated. The neighborhoods are found as in LBP. Then a binary coding is extracted for each pixel. The same procedure used in LBP is then used to extract a histogram:

$$\sum_{p=0}^{P-1} t(m_p, c)\, 2^p, \qquad t(x, c) = \begin{cases} 1, & x \ge c \\ 0, & x < c \end{cases}$$

[…] > 0, so the flat regions are not considered, are used to calculate the threshold c. We also test a ternary coding (named TMAG in Table 5):

$$\sum_{p=0}^{P-1} t(m_p, c)\, 2^p, \qquad t(x, c) = \begin{cases} 1, & x \ge c \\ 0, & -c \le x < c \\ -1, & x < -c \end{cases}$$

In the ternary coding, x is the value of the difference between a pixel and one neighbor (not the absolute value). When calculating c, we do not consider the differences of value 0. The ternary coding is then used as in LTP. All the tests reported in Table 5 are performed using the uniform bins extracted by LBP and a random subspace of SVMs as the classifier.

Table 5
Performance obtained by different combinations.

                2D-Hela   PAP     LOCATE
MAG1            0.694     0.908   0.761
MAG2            0.738     0.873   0.753
TMAG            0.807     0.883   0.929
FUS             0.932     0.885   0.959
FUS + TMAG      0.900     0.885   0.945

As evident in our tests, the magnitude features do not improve performance. We combine FUS and TMAG by sum rule. Perhaps a weighted sum rule should be investigated in some future study, since TMAG has a lower performance than FUS.

Finally, in Table 6, we compare our FUS with the state of the art (for each method we have tested both a stand-alone SVM and a random subspace of SVMs, and we report the best performance):

• FuzzyLBP, see Section 2.
• LPQ, local phase quantization with filter size = 3 (Ojansivu & Heikkila, 2008).
• MultiLPQ, the multiresolution LPQ approach, which concatenates the different descriptors extracted by varying the filter size (1, 3 and 5) (Chan, Kittler, Poh, Ahonen, & Pietikäinen, 2009).
• LBP/Var, LBP Variance with Global Matching (Guo, Zhang, & Zhang, 2010).

3640

L. Nanni et al. / Expert Systems with Applications 39 (2012) 3634–3641

• TPLBP, see Section 3.
• FPLBP, see Section 3.
• LocalDerivativePattern, see Section 3.
• Discriminative LBP, discriminative local binary patterns (Mu, Yan, Liu, Huang, & Zhou, 2008).
• Completed LBP, see Section 3.
• LBP-HF, local binary pattern histogram Fourier features (Ahonen, Matas, He, & Pietikäinen, 2009).
• NP, the method proposed in Nanni, Brahnam, et al. (2010) for improving LTP-rotation invariant.

Table 6
Performance obtained by LTP with P = 16 and R = 2.

Method                          2D-Hela  PAP    LOC    DNA    LEAVE  BREAST  Rank
FUS                             0.932    0.885  0.959  0.846  0.813  0.965   3.6
FUS + M                         0.938    0.896  0.965  0.860  0.863  0.971   1.5
FuzzyLBP                        0.847    0.871  0.905  0.853  0.754  0.851   8.8
LPQ                             0.795    0.839  0.872  0.795  0.754  0.924   11.6
MultiLPQ                        0.901    0.871  0.882  0.851  0.866  0.965   6.2
Dominant LTP                    0.893    0.834  0.884  0.750  0.701  0.779   12.7
Discriminative LBP              0.845    0.850  0.896  0.767  0.700  0.833   12.2
Discriminative LTP              0.866    0.863  0.896  0.808  0.779  0.891   9.3
TPLBP                           0.688    0.768  0.511  0.871  0.159  0.754   16.2
FPLBP                           0.526    0.693  0.476  0.854  0.133  0.695   18.3
Local derivative pattern        0.853    0.898  0.949  0.859  0.654  0.941   6.5
LBP-uniform                     0.891    0.856  0.929  0.704  0.725  0.910   10.3
LBP-rotation invariant          0.750    0.755  0.756  0.705  0.605  0.753   18.7
LBP-rotation invariant uniform  0.827    0.749  0.833  0.711  0.562  0.805   17.3
LTP-uniform                     0.890    0.895  0.945  0.786  0.779  0.970   5.8
LTP-rotation invariant          0.790    0.795  0.782  0.730  0.691  0.779   16.7
LTP-rotation invariant uniform  0.920    0.829  0.913  0.746  0.646  0.911   11.3
LBP/VAR                         0.660    0.803  0.741  0.812  0.362  0.810   16.3
Completed LBP                   0.888    0.881  0.909  0.767  0.792  0.960   7.5
LBP-HF                          0.834    0.850  0.843  0.754  0.675  0.891   13.7
NP                              0.932    0.868  0.929  0.859  0.708  0.934   6.3

Moreover, under the name FUS + M in Table 6, we report the performance obtained by fusing the proposed method and MultiLPQ by sum rule. For almost all the descriptors, the random subspace outperforms the stand-alone SVM; only when LBP/LTP rotation invariant uniform are used as descriptors does the stand-alone SVM outperform the random subspace. The column named "Rank" reports the average rank of the given descriptor over the tested datasets (e.g., if a classifier always obtains the best performance in each dataset, its rank is 1).

Examining Table 6, we can draw the following conclusions:
• Using all six datasets, we reach conclusions similar to those obtained using only three datasets.
• The best approach is FUS + M, which obtains a very good average rank of 1.5. This is an interesting result for practitioners, since the proposed texture descriptors work very well across a wide selection of different problems.

6. Conclusion

In this paper, we perform a set of empirical experiments on several benchmark databases to determine the best method for extracting features using LBP based techniques. We compare several methods and reach the following conclusions:
• Both dominant and variance selection work well with LBP; none of the methods work very well with LTP.
• LTP outperforms LBP.

• Our ideas obtain the best performance in almost all the datasets. It is clear that the random subspace permits the selection of a wider set of bins and handles the correlation problems of the selected set.
• A random subspace of SVMs does not perform remarkably well when coupled with LBP-riu and LTP-riu (see the row R-RIU in the tables above); instead, it works well when coupled with LTP-u. In our opinion, this is due to the high dimension of the uniform bin histogram (curse of dimensionality problem).
• In some datasets the best results are obtained considering the uniform bins; in other datasets the best results are obtained considering the rotation invariant bins.
• The best approach is to combine a method based on uniform bins and a method based on rotation invariant bins.

Our experiments made use of a broad spectrum of datasets, and several approaches proposed in the literature are compared. For future studies, we plan to investigate the performance of the proposed texture descriptors when features are extracted from images that have been preprocessed using different methods (e.g., Gabor filters). We also intend to investigate whether performance can be improved using different loci of points in various combinations, as in Nanni, Lumini, and Brahnam (2010).

Acknowledgements

The Matlab code for LBP used in this paper is available at http://www.ee.oulu.fi/mvg/page/lbp_matlab. We want to thank the authors who have shared the code of LBP, LPQ, LBP/VAR, FuzzyLBP, LBP-HF, and Completed LBP.

References

Ahonen, T., Matas, J., He, C., & Pietikäinen, M. (2009). Rotation invariant image description with local binary pattern histogram Fourier features. In Proceedings of the 16th Scandinavian conference on image analysis (SCIA 2009), Oslo, Norway. Lecture notes in computer science (p. 5575).
Ahonen, T., Hadid, A., & Pietikainen, M. (2006). Face description with local binary patterns: Application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2037–2041.
Casanova, D., Joaci de Mesquita, J., & Bruno, O. M. (2009). Plant leaf identification using Gabor wavelets. International Journal of Imaging Systems and Technology, 19(3), 236–243.

Chan, C. H., Kittler, J., Poh, N., Ahonen, T., & Pietikäinen, M. (2009). (Multiscale) local phase quantization histogram discriminant analysis with score normalisation for robust face recognition. In IEEE workshop on video-oriented object and event classification, Kyoto, Japan (pp. 633–640).
Chebira, A., Barbotin, Y., Jackson, C., Merryman, T., Srinivasa, G., Murphy, R. F., et al. (2007). A multiresolution approach to automated classification of protein subcellular location images. BMC Bioinformatics, 8, 210.
Cristianini, N., & Shawe-Taylor, J. (2000). An introduction to support vector machines and other kernel-based learning methods. Cambridge, UK: Cambridge University Press.
Fang, Y., Guo, Y., Feng, Y., & Li, M. (2008). Predicting DNA-binding proteins: Approached from Chou's pseudo amino acid composition and other specific sequence features. Amino Acids, 34(1), 103–109.
Fawcett, T. (2004). ROC graphs: Notes and practical considerations for researchers. Technical Report, Palo Alto, USA: HP Laboratories.
Fink, J. L., Aturaliya, R. N., Davis, M. J., Zhang, F., Hanson, K., Teasdale, M. S., et al. (2006). Locate: A protein subcellular localization database. Nucleic Acids Research, 34.
Fu, X., & Wei, W. (2008). Centralized binary patterns embedded with image Euclidean distance for facial expression recognition. In Paper presented at the fourth international conference on natural computation.
Guo, Z., Zhang, L., & Zhang, D. (in press). A completed modeling of local binary pattern operator for texture classification. IEEE Transactions on Image Processing 89.
Guo, Z., Zhang, L., & Zhang, D. (2010). Rotation invariant texture classification using LBP variance (LBPV) with global matching. Pattern Recognition, 43(3), 706–719.
He, X., Cai, D., Yan, S., & Zhang, H.-J. (2005). Neighborhood preserving embedding. In Paper presented at the tenth IEEE international conference on computer vision (ICCV'2005).
Heikkilä, M., & Pietikäinen, M. (2006). A texture-based method for modeling the background and detecting moving objects. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(4), 657–662.
Hey, T., Tansley, S., & Tolle, K. (Eds.) (2009). The fourth paradigm: Data intensive scientific discovery. Redmond, WA: Microsoft Research.
Iakovidis, D. K., Keramidas, E., & Maroulis, D. (2008). Fuzzy local binary patterns for ultrasound texture characterization. In Image analysis and recognition, 5th international conference (ICIAR 2008), lecture notes in computer science (Vol. 5112, pp. 750–759). Springer.
Jantzen, J., Norup, J., Dounias, G., & Bjerregaard, B. (2005). Pap-smear benchmark data for pattern classification. In Paper presented at the nature inspired smart information systems (NiSIS), Albufeira, Portugal.
Junior, G. B., Cardoso de Paiva, A., Silva, A. C., & Muniz de Oliveira, A. C. (2009). Classification of breast tissues using Moran's index and Geary's coefficient as texture signatures and SVM. Computers in Biology and Medicine, 39(12), 1063–1072.
Keramidas, E. G., Iakovidis, D. K., Maroulis, D., & Dimitropoulos, N. (2008). Thyroid texture representation via noise resistant image features. In Paper presented at the twenty-first IEEE international symposium on computer-based medical systems (CBMS 2008).


Liao, S., Law, M. W. K., & Chung, A. C. S. (2009). Dominant local binary patterns for texture classification. IEEE Transactions on Image Processing, 18(5), 1107–1118.
Liu, G.-H., Zhang, L., Hou, Y.-K., Li, Z.-Y., & Yang, J.-Y. (2010). Image retrieval based on multi-texton histogram. Pattern Recognition, 43(7), 2380–2389.
Mu, Y., Yan, S., Liu, Y., Huang, T. S., & Zhou, B. (2008). Discriminative local binary patterns for human detection in personal album. In CVPR 2008.
Nanni, L., Brahnam, S., & Lumini, A. (2010). A study for selecting the best performing rotation invariant patterns in local binary/ternary patterns. In Paper presented at the image processing, computer vision, & pattern recognition (IPCV'10), Las Vegas.
Nanni, L., & Lumini, A. (2007). Regionboost learning for 2D+3D based face recognition. Pattern Recognition Letters, 28(15), 2063–2070.
Nanni, L., & Lumini, A. (2008a). Local binary patterns for a hybrid fingerprint matcher. Pattern Recognition, 11, 3461–3466.
Nanni, L., & Lumini, A. (2008b). A reliable method for cell phenotype image classification. Artificial Intelligence in Medicine, 43(2), 87–97.
Nanni, L., Lumini, A., & Brahnam, S. (2010). Local binary patterns variants as texture descriptors for medical image analysis. Artificial Intelligence in Medicine, 49(2), 117–125.
Nanni, L., Shi, J.-Y., Brahnam, S., & Lumini, A. (2010). Protein classification using texture descriptors extracted from the protein backbone image. Journal of Theoretical Biology, 3(7), 1024–1032.
Ojala, T., Pietikainen, M., & Maeenpaa, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7), 971–987.
Ojansivu, V., & Heikkila, J. (2008). Blur insensitive texture classification using local phase quantization. In ICISP, 2008.
Oliver, A., Lladó, X., Freixenet, J., & Martí, J. (2007). False positive reduction in mammographic mass detection using local binary patterns. In Medical image computing and computer-assisted intervention (MICCAI), lecture notes in computer science 4791 (Vol. 1, pp. 286–293). Brisbane, Australia: Springer.
Tan, X., & Triggs, B. (2007). Enhanced local texture feature sets for face recognition under difficult lighting conditions. In Analysis and modelling of faces and gestures (pp. 168–182). Rio de Janeiro, Brazil.
Unay, D., & Ekin, A. (2008). Intensity versus texture for medical image search and retrieval. In Paper presented at the 5th IEEE international symposium on biomedical imaging: From nano to macro.
Wolf, L., Hassner, T., & Taigman, Y. (2008). Descriptor based methods in the wild. In Paper presented at the post ECCV workshop on faces in real-life images: Detection, alignment, and recognition.
Yaslan, Y., & Cataltepe, Z. (2010). Co-training with relevant random subspaces. Neurocomputing, 73(10–12), 1652–1661.
Zhang, B., Gao, Y., Zhao, S., & Liu, J. (2010). Local derivative pattern versus local binary pattern: Face recognition with high-order local pattern descriptor. IEEE Transactions on Image Processing, 19(2), 533–544.
Zhao, G., & Pietikäinen, M. (2007). Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 915–928.
Zhou, H., Wang, R., & Wang, C. (2008). A novel extended local binary pattern operator for texture analysis. Information Sciences, 178(22), 4314–4325.
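
Supplementary sketch. The text accompanying Table 6 describes the Rank column as the average rank of each descriptor over the tested datasets, with a method that is always best receiving rank 1. The following is a minimal Python sketch of that computation, not the authors' code. It assumes that higher accuracy is better and that tied methods share the best available rank; the paper does not state how ties were handled. The function name average_ranks and the toy accuracies are hypothetical and are not taken from Table 6.

```python
def average_ranks(accuracy):
    """Return the mean rank of each method over all datasets (1 = best).

    accuracy maps a method name to a list of accuracies, one per dataset.
    """
    methods = list(accuracy)
    n_datasets = len(accuracy[methods[0]])
    mean_rank = {}
    for m in methods:
        total = 0
        for d in range(n_datasets):
            # Assumed tie handling: rank = 1 + number of strictly better methods.
            total += 1 + sum(accuracy[o][d] > accuracy[m][d] for o in methods)
        mean_rank[m] = total / n_datasets
    return mean_rank


# Hypothetical accuracies for three methods on three datasets (illustration only).
toy = {
    "method_a": [0.95, 0.90, 0.80],
    "method_b": [0.90, 0.85, 0.70],
    "method_c": [0.85, 0.88, 0.75],
}
ranks = average_ranks(toy)
print(ranks)
```

Here method_a is the best on every toy dataset, so its average rank is 1, matching the example given in the text. A different tie convention (for instance, averaging the ranks of tied methods) would change the result whenever two methods have equal accuracy on a dataset.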
