T. Toyoda and O. Hasegawa: Texture classification using extended higher order local autocorrelation features. In Texture 2005: Proceedings of the 4th International Workshop on Texture Analysis and Synthesis, pp. 131–136, 2005.
Texture Classification Using Extended Higher Order Local Autocorrelation Features

Takahiro Toyoda
Tokyo Institute of Technology, 4259 Nagatsuta, Midori-ku, Yokohama, JAPAN

Osamu Hasegawa
Tokyo Institute of Technology, PRESTO, Japan Science and Technology Agency (JST)

Abstract
This study investigates effective image features for characterization of local regions. We propose an extension of higher order local autocorrelation (HLAC) features. The original HLAC features are restricted to orders of up to two and are represented by 25 mask patterns. We increase their orders up to eight and extract the extended HLAC features using 223 mask patterns. Large mask patterns are also created to support large displacement regions. They are used to construct multi-resolution HLAC features. The proposed method outperforms Gaussian Markov random fields, Gabor features, and the local binary pattern operator in texture classification.

1 Introduction
Texture analysis plays an important role in many applications. Various feature extraction methods have been proposed. Some approaches are based on local properties of the image (e.g. Gaussian Markov random fields (GMRF) [4] and the local binary pattern operator (LBP) [13]). Others use frequency representations (e.g. the wavelet transform [10] and Gabor features [11]).

In this paper, we characterize the local properties of texture images using higher order local autocorrelation (HLAC) features [14]. HLAC features, an extension of autocorrelation features (second-order statistics), are based on higher-order statistics (HOS). Tsatsanis et al. demonstrated the noise insensitivity of HOS using third-order cumulants [17]. We use even higher-order statistics, up to ninth order (eighth-order autocorrelations), and show their effectiveness in texture classification.

The original HLAC features are restricted to orders of up to two and are represented by 25 mask patterns. We increase their orders up to eight and represent the extended HLAC features using 223 mask patterns. Each feature value corresponds to the power spectrum of the mask pattern. The use of many mask patterns allows detailed characterization of an image. Large mask patterns are also used to support large displacement regions. Multi-resolution HLAC features are constructed by using different mask sizes together.

The performance of the proposed method is evaluated through several texture classifications using images of different sizes, scaled images, and rotated images. Other popular methods such as GMRF, Gabor features and LBP are employed for comparison.
2 Higher Order Local Autocorrelation (HLAC) features
2.1 Conventional second-order HLAC Features
The Nth-order autocorrelation functions, extensions of autocorrelation functions, are defined as

$$x(a_1, a_2, \ldots, a_N) = \int f(r)\, f(r + a_1) \cdots f(r + a_N)\, dr, \tag{1}$$

where $f(r)$ denotes the intensity at the reference pixel $r$, and $a_1, a_2, \ldots, a_N$ are $N$ displacements. HLAC features [14] are primitive image features based on Eq. (1). Their orders and displacements are arbitrary. However, high-order features with a large displacement region become extremely numerous. Hence, the original HLAC features are restricted to orders of up to two (three-point relations) and to a 3 × 3 displacement region. They are represented by 25 mask patterns with 0, 1 and 2 displacements (the first 25 mask patterns in Fig. 1). The feature values are calculated by scanning the image with the mask patterns and computing the sums of products of the intensities of corresponding pixels.

Figure 1. 223 mask patterns of the 0th-order to 8th-order HLAC features (3 × 3 pixels)

Each feature value represents the power spectrum of the mask pattern, which corresponds to a basis function of frequency analysis. In a rough comparison with a Fourier transform, the mask size corresponds to the frequency component, and the distribution of the displacements corresponds to the direction component. Since the HLAC features use the information of two-dimensional distributions as well as the directions, they analyze an image more closely.

The features of a 3 × 3 displacement region mainly extract local information. More information can be extracted by setting the displacements in larger regions. However, large displacement regions produce a huge number of features, e.g. 205 features in a 5 × 5 region [7], and the approach becomes even less practical for still larger regions. Instead, other techniques such as the use of a pyramidal image structure [9] and scaling of the mask patterns [5] have been proposed.

Another modification, intended for two-class classification, is presented in [15]. It allows the use of high-order features in large displacement regions by calculating only the inner products of autocorrelations, without explicitly computing the autocorrelations themselves. In their experiment, the features of large displacement regions unexpectedly showed lower accuracies. This is considered to result from the excessive number of features produced in large displacement regions.
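As a rough illustration of the scanning computation described in this section, the following Python sketch evaluates a single feature of Eq. (1) on a discrete image. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `hlac_feature`, the representation of a mask pattern as a list of (row, column) offsets that includes the reference pixel (0, 0), and the decision to skip border pixels where the 3 × 3 window does not fit are all choices made for this example.

```python
from math import prod


def hlac_feature(image, offsets, radius=1):
    """Sum, over all reference pixels, of the product of intensities
    at the reference pixel plus each offset in the mask pattern.

    image   : list of rows of numeric intensities (hypothetical layout)
    offsets : list of (dy, dx) pairs; (0, 0) is the reference pixel
    radius  : half-width of the mask window (1 for a 3 x 3 mask)
    """
    height, width = len(image), len(image[0])
    total = 0
    # Only visit reference pixels whose whole window lies inside the image.
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            total += prod(image[y + dy][x + dx] for dy, dx in offsets)
    return total


# Three illustrative mask patterns of orders 0, 1 and 2.
ORDER0 = [(0, 0)]
ORDER1_RIGHT = [(0, 0), (0, 1)]
ORDER2_HORIZONTAL = [(0, -1), (0, 0), (0, 1)]

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
features = [hlac_feature(image, m) for m in (ORDER0, ORDER1_RIGHT, ORDER2_HORIZONTAL)]
```

For the 4 × 4 example image, only the four interior pixels serve as reference pixels, so the zeroth-order feature is simply the sum of their intensities.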
Figure 2. Larger mask patterns obtained through dilation
2.2 Extension of HLAC Features
The original HLAC features are restricted to orders of up to two and are extracted by 25 mask patterns. The feature value represents the power spectrum of the mask pattern. Images can therefore be characterized more closely by using a greater variety of mask patterns. We increase the orders of HLAC features up to eight and extract the extended HLAC features using the 223 mask patterns shown in Fig. 1. Large mask patterns are also created to support large displacement regions (Fig. 2). They extract the features of low resolutions or low frequencies. Multi-resolution HLAC features are constructed by concatenating the single-resolution feature vectors that are extracted using masks of different sizes. Multi-resolution HLAC features thus contain the high-order HLAC features of both high and low frequencies.
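The mask counts quoted above can be reproduced by a short enumeration. The Python sketch below is offered only as a plausibility check under one assumption that the text does not spell out: two sets of distinct displacements within the 3 × 3 region are treated as the same mask pattern when one is a translation of the other. The names `canonical` and `count_masks` are hypothetical.

```python
from itertools import combinations

# The eight non-zero displacements inside a 3 x 3 region.
NEIGHBOURS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]


def canonical(points):
    """Shift a point set so its top-left corner is (0, 0); translated
    copies of the same pattern then map to the same frozenset."""
    min_y = min(y for y, _ in points)
    min_x = min(x for _, x in points)
    return frozenset((y - min_y, x - min_x) for y, x in points)


def count_masks(max_order):
    """Number of translation-distinct mask patterns for each order
    from 0 to max_order (order = number of distinct displacements)."""
    counts = []
    for order in range(max_order + 1):
        shapes = set()
        for displaced in combinations(NEIGHBOURS, order):
            # Every pattern contains the reference pixel (0, 0).
            shapes.add(canonical(((0, 0),) + displaced))
        counts.append(len(shapes))
    return counts


second_order_total = sum(count_masks(2))
eighth_order_total = sum(count_masks(8))
```

Under this equivalence, orders zero to two give 1, 4 and 20 patterns (25 in total), and orders zero to eight give 223 patterns, in agreement with the counts stated in the text.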
HLAC features can have the same displacements within $a_1, a_2, \ldots, a_N$ (e.g., $a_1 = 0$, $a_1 = a_2$) in Eq. (1). Those features are calculated by multiplying the same pixels of the mask patterns two or three times (Fig. 3). We call these features “repetition reference features” and use them together with standard HLAC features. Although they are calculated using the same mask patterns as the standard ones, they contribute to the close characterization of the image. For example, when the image has a gradient of intensity to the right, the “repetition reference feature” that repeatedly refers to the right pixel shows a large value. It enhances the difference between the images and simplifies discrimination.

The number of “repetition reference features” is large for high orders. In the previous study [8], their orders were restricted to two, so that only 10 “repetition reference features” were calculated (two features from each of the five mask patterns of the zeroth and first order). We increase the orders up to three and calculate 83 “repetition reference features”: 3 features from the zeroth-order mask pattern, 20 features from the first-order ones (5 × 4 patterns = 20), and 60 features from the second-order ones (3 × 20 patterns = 60).

Figure 3. Standard feature and “repetition reference features” calculated using multiplication of the same pixels

3 Texture classification
Five classification tests were carried out using gray-scale texture images of the “Outex database” [12]. Outex is an empirical evaluation framework for texture analysis. It contains a wide variety of texture images and precisely specified classification problems. Table 1 presents a description of the experimental data. The performance was evaluated by classifying test samples that have no overlap with the training samples.

Table 1. Experimental data (Outex database)
Test  Outex ID          Classes  Image sizes  Training samples
00    Outex_TC_00000    24       128 × 128    10
01    Outex_TC_00001    24       64 × 64      44
02    Outex_TC_00002    24       32 × 32      184
03    Outex_TC_00016    319      128 × 128    10, 15, 19
04    Contrib_TC_00004  32       64 × 64      32 (8 × 4)

The proposed method was compared with the conventional second-order HLAC features and the other popular methods mentioned below. The second-order HLAC features adopted the multi-resolution approach and used the 10 “repetition reference features” that were restricted to the second order. Multi-resolution HLAC features were constructed by adding the different resolutions, i.e. different mask sizes, one by one. They were added in ascending size order: 3 × 3, 5 × 5, 7 × 7, and so on. The “repetition reference features” were calculated at each size.

For comparison, other popular methods, namely Gaussian Markov random fields (GMRF) [4], Gabor features [11], and the local binary pattern (LBP) operator [13], were applied to the same classification problems. They were implemented based on publicly available source code [1, 11, 2]. The parameters were adjusted to give the peak performance in each problem.

The GMRF features were based on the source code of the “MeasTex” site [1]. They were computed using standard symmetric masks of the first to twentieth orders. We combined the features of different orders to create new sets of GMRF features. These sets were constructed by adding the features in ascending order of the GMRF mask order, and the best combination was selected. They showed superior performance to that of the original single-order features.

For the Gabor features, we used the filter banks designed in [11]. We set the lower and upper center frequencies of interest to 0.05 and 0.4, as used in [11]. The numbers of scales (3–9) and orientations (3–15) of the filter bank were adjusted in each test.

Three operators of radii 1, 2, and 3 were used for the LBP. They respectively produce histograms of 59, 243, and 555 bins with the “uniform” pattern approach [13]. Aside from the single-resolution LBP histograms, we created four multi-resolution LBP histograms by concatenating the single-resolution ones. They were combinations of radii: (1, 2), (1, 3), (2, 3), and (1, 2, 3).

A widely used simple classifier, linear discriminant analysis (LDA), was employed for classification. It is rather robust against noise and is unlikely to overfit. In the experiment, each feature value was normalized by its standard deviation over the training samples before applying the LDA.
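The normalization step that precedes the LDA can be sketched as follows. This is a minimal illustration only: the function names `fit_scales` and `apply_scales` are hypothetical, the use of the population standard deviation is an assumption (the text does not say which estimator was used), and leaving constant features unscaled is a safeguard added for this example. The LDA classifier itself is not shown.

```python
from statistics import pstdev


def fit_scales(train_features):
    """Per-feature standard deviation computed on the training samples only.
    A zero deviation is replaced by 1.0 so constant features stay unchanged."""
    columns = zip(*train_features)
    return [pstdev(column) or 1.0 for column in columns]


def apply_scales(features, scales):
    """Divide every feature value by the training-set deviation of that feature."""
    return [[value / scale for value, scale in zip(row, scales)] for row in features]


# Toy feature vectors: two training samples and one test sample.
train = [[1.0, 10.0], [3.0, 30.0]]
test = [[2.0, 20.0]]

scales = fit_scales(train)
train_scaled = apply_scales(train, scales)
test_scaled = apply_scales(test, scales)
```

The scales are estimated from the training set and then reused unchanged for the test set, which keeps the test samples from influencing the normalization.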
Figure 4. Examples of texture images (Test 00)
3.1 Test 00, 01 and 02: Classifications of 24 textures
Tests 00, 01 and 02 in Table 1 involved 24 texture images that were normalized to have an average intensity of 128 and a standard deviation of 20. They were separated into disjoint subimages, which were used as training samples and test samples (Fig. 4). The respective image sizes and the numbers of training samples are shown in Table 1. The performance was evaluated by classifying the same number of test samples as training samples. Each test was repeated 100 times with different sets of training and test samples.

We tested the second-order to eighth-order HLAC features with different numbers of resolutions (i.e. mask sizes). Classification results of Test 02 are shown in Fig. 5. The four points on the extreme left represent results of the conventional second-order HLAC features, which contain 10 “repetition reference features”. The figure indicates that increasing the orders raised the recognition rate. Further improvement was achieved using the multi-resolution approach.

Figure 5. Recognition rates using the 2nd-order to 8th-order HLAC features (Test 02)

Table 2 compares the recognition rates obtained by the second-order and sixth-order HLAC features with one to four resolutions. At the single resolution, i.e. using only the 3 × 3 mask patterns, the rate of the second-order features was 85.4%. It improved to 95.8%, a gain of more than 10 percentage points, through the increase of the orders and the use of the 83 “repetition reference features”. The best rate, achieved by the four-resolution sixth-order HLAC features, was 97.5%, up from the 91.9% obtained by the second-order ones.

Table 2. Improvement by the increase of the HLAC orders and the multi-resolution (1–4) approach (Test 02), recognition rates in %
Number of resolutions  1     2     3     4
2nd-order HLAC         85.4  89.5  91.2  91.9
6th-order HLAC         95.8  97.0  97.4  97.5

Table 3 shows a comparison of the results obtained using the proposed method and the others. In Test 02, the best rate obtained by the other methods was 93.4%, achieved by the Gabor features with a filter bank of 7 scales and 12 orientations. Since the images in Test 02 are small (32 × 32 pixels), the test requires close analysis in local regions, and many filters were used for the Gabor features. However, the extended HLAC features, which use a large variety of mask patterns, outperformed the Gabor features. The LBP was affected by the small image size. It generated statistically unstable and unreliable histograms that contain many empty bins. As a result, the best LBP rate was 93.1%, obtained when the operators of radii one and two were used.

As shown in Table 3, the extended HLAC features achieved the best rates in all three tests. For classification of large images (Test 00), the GMRF and the LBP performed well. However, for classification of small images such as those of Test 02, the proposed method showed outstanding analytical power in local regions.
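The sparsity of the LBP histograms on small images can be made concrete with a little arithmetic. The Python sketch below assumes that the operators of radii 1, 2 and 3 sample 8, 16 and 24 neighbours, as is common in [13]; the neighbour counts are not stated in this paper, although they are consistent with the reported 59, 243 and 555 bins. It also assumes that pixels closer to the border than the radius are skipped. The function names are hypothetical.

```python
def uniform_lbp_bins(neighbours):
    """Histogram size for "uniform" LBP codes: P*(P-1)+2 uniform patterns
    plus one shared bin for all non-uniform patterns."""
    return neighbours * (neighbours - 1) + 3


def interior_pixels(image_size, radius):
    """Pixels of a square image at which an operator of the given radius
    fits entirely inside the image (border pixels skipped, an assumption)."""
    side = image_size - 2 * radius
    return side * side


# Assumed (radius, neighbours) pairs for the three operators.
OPERATORS = [(1, 8), (2, 16), (3, 24)]

bins = [uniform_lbp_bins(p) for _, p in OPERATORS]
pixels_32 = [interior_pixels(32, r) for r, _ in OPERATORS]
samples_per_bin = [n / b for n, b in zip(pixels_32, bins)]
```

Under these assumptions a 32 × 32 image supplies 900, 784 and 676 codes for histograms of 59, 243 and 555 bins, so the radius-3 histogram receives barely more than one sample per bin on average, which is in line with the many empty bins noted above.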
Table 3. Test 00, 01 and 02 (24 Outex texture classification)
Method (parameters)                   Test 00          Test 01        Test 02
Extended-HLAC (Resolutions, Orders)   99.8% (3, 5)     99.3% (3, 3)   97.5% (4, 6)
2nd-order HLAC (Resolutions, Orders)  99.6% (3, 2)     97.6% (3, 2)   91.9% (4, 2)
GMRF (Orders)                         99.7% (1–7)      98.6% (1–14)   89.8% (1–18)
Gabor (Scales, Orientations)          99.6% (3, 7)     98.0% (3, 10)  93.4% (7, 12)
LBP (Radii)                           99.8% (1, 2, 3)  98.4% (1)      93.1% (1, 2)

3.2 Test 03: Classification of 319 textures
Test 03 in Table 1 involved 319 diverse textures ranging from canvases to granites (Fig. 6). Each texture had 20 samples that were normalized to give an average intensity of 128 and a standard deviation of 20. We divided them into two sets, one for training and the other for testing. Three problems were set up with different numbers of training samples per texture: 10, 15 and 19. The remaining samples were used for testing. Each problem was performed 100 times with different sets of training and test samples. Table 4 shows the mean results over the 100 trials.

Figure 6. Examples of texture images (Test 03)

The extended HLAC features achieved the best rates in all problems. This experiment did not require many features because only a few training samples were available. In the case of the HLAC features, the second order was sufficient for the classifications. The extended HLAC features were improved through the use of the third-order “repetition reference features.” They raised recognition rates by about 5% over the conventional second-order features, which used only 10 “repetition reference features.” This result indicates the effectiveness of the third-order “repetition reference features.”

Table 4. Test 03 (319 Outex texture classification) and Test 04 (32 Brodatz texture classification including scaled and rotated images)
Method (parameters)                   Test 03, 10 samples  Test 03, 15 samples  Test 03, 19 samples  Test 04
Extended-HLAC (Resolutions, Orders)   79.1% (3, 2)         82.0% (3, 2)         83.0% (3, 2)         95.7% (2, 3)
2nd-order HLAC (Resolutions, Orders)  75.0% (3, 2)         76.6% (3, 2)         76.8% (3, 2)         89.7% (3, 2)
GMRF (Orders)                         77.4% (1–11)         77.0% (1–11)         77.1% (1–14)         85.3% (1–11)
Gabor (Scales, Orientations)          73.9% (6, 3)         75.9% (6, 4)         76.4% (6, 6)         87.9% (5, 8)
LBP (Radii)                           77.9% (1, 2)         80.8% (1, 2)         82.2% (1, 2)         91.1% (1, 2)

3.3 Test 04: Classification including scaled and 90-degree rotated images
Test 04 in Table 1 involved the 32 textures that are contained in the Brodatz album [3] (Fig. 7 (a)). Each image (256 × 256 pixels) was divided into 16 disjoint 64 × 64 subimages. They were independently histogram-equalized and used as training and test samples. Test 04 included scaled and rotated images. Three additional samples were generated from each of the training and test samples: a sample rotated by 90 degrees, a 64 × 64 scaled sample obtained from the 45 × 45 pixels in the middle of the original sample, and a sample that was both rotated and scaled (Fig. 7 (b)). The number of test samples was the same as the number of training samples.

Table 4 shows the mean accuracies over 10 trials with different sets of training and test samples. The extended HLAC features achieved the best rate of 95.7%, which was much better than the second-best rate of 91.1% achieved by the LBP. This result shows the robustness of the proposed method against transformations of the image, which is a desirable property for real-world applications.
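The construction of the transformed samples in Test 04 can be sketched in a few lines of Python. Several details here are assumptions, since the text does not specify them: the rotation direction (clockwise), the interpolation used to enlarge the 45 × 45 crop to 64 × 64 (nearest neighbour), the placement of the crop when the margin is odd, and the order of operations for the sample that is both rotated and scaled. All function names are hypothetical.

```python
def rotate90(image):
    """Rotate a row-major image by 90 degrees clockwise (direction assumed)."""
    return [list(row) for row in zip(*image[::-1])]


def center_crop(image, size):
    """Cut a size x size block from the middle of the image."""
    top = (len(image) - size) // 2
    left = (len(image[0]) - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def resize_nearest(image, out_size):
    """Enlarge a square image with nearest-neighbour sampling (assumed)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_size][x * in_w // out_size] for x in range(out_size)]
        for y in range(out_size)
    ]


def make_variants(sample, crop=45):
    """Return the original sample plus its rotated, scaled, and
    rotated-and-scaled versions, all of the original size."""
    size = len(sample)
    scaled = resize_nearest(center_crop(sample, crop), size)
    return {
        "original": sample,
        "rotated": rotate90(sample),
        "scaled": scaled,
        "rotated_scaled": rotate90(scaled),
    }


# Synthetic 64 x 64 sample whose pixel value encodes its own position.
sample = [[y * 64 + x for x in range(64)] for y in range(64)]
variants = make_variants(sample)
```

Each original sample thus yields four images of identical size, so the classifier sees every texture at two scales and two orientations.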
Figure 7. (a) 32 Brodatz textures (256 × 256 pixels) and (b) transformations of sample images (64 × 64 pixels): scaling, 90-degree rotation, and both scaling and rotation (Test 04)

4 Conclusions
We proposed an extension of the higher order local autocorrelation (HLAC) features, which excel in the characterization of local image properties. They have conventionally been restricted to the second order. We increased their orders up to eight and represented the extended HLAC features with 223 mask patterns. The extended HLAC features characterize the image more closely than the conventional second-order ones, which use only 25 mask patterns. We also constructed multi-resolution HLAC features that contain both high-frequency and low-frequency information.

In texture classification, the proposed method demonstrated better performance than other methods such as Gaussian Markov random fields, Gabor features, and the local binary pattern operator. For example, in the classification of small 32 × 32 texture images, the proposed method achieved a 97.5% recognition rate compared to 93.4% using Gabor features. Good performance was also shown for classification of scaled images and 90-degree rotated images, and for classification of a large number of classes (over 300) using only a few training samples.

We consider the proposed method suitable for practical applications because its features are computed rapidly. Its processing speed can be enhanced further by implementation on a vision chip [18].

Previous studies have presented several other extensions of HLAC features. One is the extension to scale-invariant and rotation-invariant features, which are obtained using a log-polar transformation of the image [6]. Another is the extraction of HLAC features from three-dimensional (3D) data such as 3D polygons and volume data [16]. These extensions expand the fields of applicability of HLAC features.
References
[1] Meastex image texture database and test suite. http://www.cssip.uq.edu.au/meastex/meastex.html.
[2] University of Oulu, Department of Electrical and Information Engineering, Information Processing Laboratory, Machine Vision Group. http://www.ee.oulu.fi/mvg/mvg.php.
[3] P. Brodatz. Textures: A photographic album for artists and designers. Dover, New York, 1966.
[4] R. Chellappa and S. Chatterjee. Classification of textures using Gaussian Markov random fields. IEEE Trans. Acoustics Speech and Signal Processing, 33:959–963, 1985.
[5] F. Goudail, E. Lange, T. Iwamoto, K. Kyuma, and N. Otsu. Face recognition system using local autocorrelations and multiscale integration. IEEE Trans. Pattern Anal. Mach. Intell., 18(10):1024–1028, 1996.
[6] K. Hotta, T. Kurita, and T. Mishima. Scale invariant face detection method using higher-order local autocorrelation features extracted from log-polar image. In Proceedings of the third IEEE International Conference on Face and Gesture Recognition, pages 70–75, 1998.
[7] M. Kreutz, B. Völpel, and H. Janssen. Scale-invariant image recognition based on higher order autocorrelation features. Pattern Recognition, 29(1):19–26, 1996.
[8] T. Kurita and S. Hayamizu. Gesture recognition using HLAC features of PARCOR images and HMM based recognizer. In Proceedings of the International Conference on Automatic Face and Gesture Recognition, pages 422–427, 1998.
[9] T. Kurita, N. Otsu, and T. Sato. A face recognition method using higher order local autocorrelation and multivariate analysis. In Proceedings of the International Conference on Pattern Recognition, volume 2, pages 213–216, 1992.
[10] S. Mallat. A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans. Pattern Anal. Mach. Intell., 11(7):674–693, 1989.
[11] B. S. Manjunath and W. Y. Ma. Texture features for browsing and retrieval of image data. IEEE Trans. Pattern Anal. Mach. Intell., 18(8):837–842, 1996. http://vision.ece.ucsb.edu/texture/software/.
[12] T. Ojala, T. Mäenpää, M. Pietikäinen, J. Viertola, J. Kyllönen, and S. Huovinen. Outex - New framework for empirical evaluation of texture analysis algorithms. In Proceedings of the International Conference on Pattern Recognition, volume 1, pages 701–706, 2002. http://www.outex.oulu.fi/outex.php.
[13] T. Ojala, M. Pietikäinen, and T. Mäenpää. Multiresolution gray-scale and rotation-invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell., 24(7):971–987, 2002.
[14] N. Otsu and T. Kurita. A new scheme for practical flexible and intelligent vision systems. In Proceedings of the IAPR Workshop on Computer Vision, pages 431–435, 1988.
[15] V. Popovici and J. P. Thiran. Higher order autocorrelations for pattern classification. In Proceedings of the International Conference on Image Processing, pages 724–727, 2001.
[16] M. Suzuki, Y. Yaginuma, N. Osawa, and Y. Sugimoto. Classification of 3D solid textures using 3D mask patterns. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pages 6342–6347, 2004.
[17] M. K. Tsatsanis and G. B. Giannakis. Object and texture classification using higher order statistics. IEEE Trans. Pattern Anal. Mach. Intell., 14(7):733–750, 1992.
[18] K. Yamamoto and I. Ishii. A design of higher order autocorrelation vision chip. IEICE Trans. on Information and Systems (D-II), J86-D-II(8):1205–1211, 2003. (in Japanese).