IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 39, NO. 4, JULY 2009

Cascaded and Hierarchical Neural Networks for Classifying Surface Images of Marble Slabs

M. Alper Selver, Olcay Akay, Member, IEEE, Emre Ardalı, A. Bahadır Yavuz, Okan Önal, and Gürkan Özden

Abstract—Marble quality classification is an important procedure generally performed by human experts. However, using human experts for classification is error prone and subjective. Therefore, automatic and computerized methods are needed in order to obtain reproducible and objective results. Although several methods have been proposed for this purpose, we demonstrate that their performance is limited when dealing with diverse datasets containing a large number of quality groups. In this work, we test several feature sets and neural network topologies to obtain better classification performance. During these tests, it is observed that different feature sets represent different subgroup(s) in a quality group rather than representing the whole group. Therefore, our approach is to use these features in a cascaded manner in which a quality group is classified by classifying all of its subgroups. We first realize this approach by using a two-stage cascaded network. Then, we design a hierarchical radial basis function network (HRBFN) in which correctly classified marble samples are taken out of the dataset and a different feature extraction method is applied to the remaining samples at each network level. The HRBFN system produces results suitable for industrial applications and can be implemented in a quasi real-time manner.

Index Terms—Artificial neural networks, classification of surface images of marble slabs, feature extraction, hierarchical radial basis function networks.

I. INTRODUCTION

Fig. 1. Typical sample images from each quality group. (a) Homogenous limestone. (b) Limestone with veins. (c) Samples containing grains (limestone) that are separated by unified cohesive matrix regions. (d) Homogenous cohesive matrix.

Manuscript received April 13, 2008; revised August 21, 2008 and January 5, 2009. First published April 7, 2009; current version published June 17, 2009. This work was supported by TUBITAK-MAG under Grant 104M358. This paper was recommended by Associate Editor M. Last. M. A. Selver and O. Akay are with the Dept. of Electrical and Electronics Engineering, Dokuz Eylül University, Izmir 35160, Turkey (e-mail: [email protected]; [email protected]). E. Ardalı was with the Dept. of Electrical and Electronics Engineering, Dokuz Eylül University, Izmir 35160, Turkey (e-mail: [email protected]). A. B. Yavuz is with the Dept. of Geological Engineering, Dokuz Eylül University, Izmir 35160, Turkey (e-mail: [email protected]). O. Önal and G. Özden are with the Dept. of Civil Engineering, Dokuz Eylül University, Izmir 35160, Turkey (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TSMCC.2009.2013816

AT PRESENT, the term building stone refers to all natural stones that can be cut or shaped to specific sizes for use in construction or decoration. Marble, travertine, and limestone blocks and slabs are among the most commonly encountered building stones. These stones, with their different origins, colors, textures, and structures, are extensively used for interior and exterior decoration of buildings. Limestone has been used for quite some time in various artistic structures due to its specific qualities such as long endurance and beautiful colors. To be used as a natural building stone, limestone should possess certain physical, mechanical, and technological properties required by the standards and, at the same time, should present attractive color and pattern choices [1]. Important constraints for an aesthetic appearance are homogeneity, texture, color and distribution of limestone (beige colored), cohesive material (red-brown colored), and thin joints filled by cohesive material (red-brown colored veins) (Fig. 1). Physical and mechanical properties and durability of limestone can change due to the amount, distribution, and shape of limestone and cohesive material (unified or vein-like). For instance, joints filled with cohesive material reduce the material strength of the limestone. Thus, two limestone slabs, one exhibiting unified cohesive material regions and the other containing vein-like cohesive material structures, should be treated as belonging to different quality groups even if they have the same amount of cohesive material. Since false classification of marbles can result in major economic drawbacks, it is necessary to classify marble slabs correctly according to their quality and appearance. The classification process is mostly carried out at the end of the production line, where human experts evaluate and classify the product visually according to the parameters mentioned earlier. However, using human experts for classification can be error prone owing to subjective criteria of the operator (even different operators


due to shift work) and the visual fatigue after a period of time, which degrades the classification performance. These problems make the results highly dependent on working conditions. Thus, it is necessary to use an automated system capable of performing the same classification tasks that are currently carried out by human experts. Technological advances in digital image acquisition and processing allow building automated visual inspection (AVI) systems [2], [3], which improve the classification performance and reduce the manufacturing costs. In most of these systems, neural networks [4], [5] are used, thanks to their superior learning and generalization capabilities, particularly in classification applications [6]. Histogram-based techniques are used in different studies for extracting several features that include mean, variance, correlation between color channels, and other statistical properties. These features are classified using stochastic classifiers (i.e., Bayesian) [7] or neural network methods [8]. Segmentation of the grains in marble textures is also used in different steps of marble classification. In [9], gray level segmentation is used for classification. In [10], color segmentation is employed for feature extraction. Moreover, a bottom-up segmentation algorithm is used in [11]. A method based on multiresolution decomposition with wavelets is described in [12], while support vector machines are used for classification in [13]. Different applications of AVI systems are also presented in the literature for classification of several materials [14], [15]. In all of the works mentioned earlier, the results were promising, but improvements were required for industrial applications. In [16], a detailed texture analysis was introduced as a new approach to improve upon previous results, using sum and difference histograms (SDH) [17] for texture analysis and multilayer perceptron (MLP) neural networks for classification. Although the results were successful, the set of images used in the study was not large enough (75 samples). Hence, the leave-one-out method was used to test the proposed approach. Besides, the dataset in [16] consisted of only three quality groups, named extra, commercial, and low. In this study, we first obtain a very large and diverse dataset containing 1158 marble surface images (193 marble cube specimens times 6 surfaces), which are classified into four quality groups (shown in Fig. 1) by human experts. In the rest of the paper, the word “sample” is used to refer to a surface image of a cubic marble specimen. Our simulations showed that applying the existing methods mentioned earlier to our diverse and large dataset could not provide performance comparable to that reported in the literature. Among those methods, we obtained the best results using [16]. However, performance was still limited, since only color and texture information was used for classification. Other important factors, such as the orientation and shapes of the grains, must also be taken into account for further improvement. During our study, we have tested several feature sets and neural network topologies to obtain better classification performance. In these tests, it is observed that different feature sets represent different subgroup(s) in a quality group rather than representing the whole quality group. Therefore, our approach


is to use these features in a cascaded manner so that a quality group is classified by classifying all of its subgroups. We extract different features in a successive manner so that each feature set is used only for the subgroup(s) that can be correctly classified by that specific feature set. To prevent the computational burden caused by using several feature sets, efficient classification schemes are needed. Thus, our approach is realized first by using a two-stage network. This cascaded network consists of a preclassifier followed by an MLP. As an alternative realization, a hierarchical radial basis function network (HRBFN) topology is also designed, in which correctly classified marble samples are taken out of the dataset at each level and a different (generally more complex) feature extraction method is used for the rest of the samples at the next level. The HRBFN system provides acceptable results for industrial applications. In addition, it facilitates implementation in quasi real time. In this paper, we present our proposed automatic systems for quality-based inspection and classification of marble slabs. The rest of the paper is organized as follows. In Section II, the origin and characteristics of the marble samples used in this study and the classification constraints are explained along with the image acquisition system and the preprocessing of the acquired images. In Section III, formulations of some previously proposed classification methods are presented together with the results obtained by their application to our dataset. Our two-stage network and its simulation results are given in Section IV. The computational algorithms for texture analysis, feature extraction, and classification are detailed, and the HRBFN system is presented in Section V. Lastly, the results obtained are discussed in Section VI.

II. MARBLE SURFACE IMAGES A. Origin and Characteristics of Marble Samples The cubic marble specimens used in this study are obtained from the mines in Manisa region of Turkey. Although there are not unique criteria for classifying surface images of these marble specimens, color, homogeneity, size, orientation, thickness, and distribution of the filled joints (red-brown colored veins) are often used by human experts to visually perform the classification. Similarly, classification of marble surface images in our study is done by human experts considering smooth gradients of color, the presence of joints on the surface, the continuity, thickness and orientation of joints (represented with the term “veins” in this paper), and the ratio of limestone grains and the cohesive matrix regions. Under these criteria, four quality groups have been considered. 1) Homogenous limestone (beige color) [Fig. 1(a)]. 2) Limestone with thin joints (veins) [Fig. 1(b)]. 3) Brecciated limestone (composed of limestone grains of different shape and size cemented with cohesive matrix) [Fig. 1(c)]. Here, cohesive matrix is defined as the collection of joints (veins) that unify and construct a larger area of red-brown material. 4) Homogenous cohesive matrix [Fig. 1(d)].


TABLE I NUMBERS OF SAMPLES IN TRAINING AND TEST SETS

Fig. 2. Image acquisition system used in this study. Camera has been mounted on top of a stand with fixed lighting conditions.

TABLE II PERFORMANCES OF DIFFERENT CLASSIFIERS USING SDH FEATURES WITH DIFFERENT COLOR SPACES (CC: CORRECT CLASSIFICATION, SE: SENSITIVITY, SP: SPECIFICITY)

Fig. 3. Surface image of a marble sample belonging to Group 3. (a) Before sanding and polishing. (b) After sanding and polishing.

B. Image Acquisition System Success of the digital image processing operations greatly depends on the quality of the captured images. Illumination conditions and the resolution of the digital images may influence the performance of the operation directly. Therefore, an image acquisition system has been produced in order to standardize the capturing of the marble surface images. The system consists of an eight-megapixel digital camera (Canon EOS 350D digital camera with 18–55 mm EF-S zoom lens), connection cables, light sources, a desktop computer, and a closed cabinet that was used to ensure a fully isolated and uniformly illuminated area. The camera was set to have a perpendicular position to the bottom surface of the cabinet, and the USB connection to the desktop personal computer was established via cables. The fluorescent light sources were positioned to prevent blazing that may occur on the surfaces of the marble samples (Fig. 2). The TWAIN interface of the camera’s software enabled both remote shooting over the USB connection and the manual adjustment of the aperture, shutter speed, and ISO settings of the camera. An aperture of F5.6 and a shutter speed of 1/100 s combined with an ISO value of 100 ensured images of sufficiently high quality for this study. All sample surfaces are sanded and polished, which are standard procedures in the marble industry, generally performed at the end of the production line before the acquisition of the images (Fig. 3). C. Image Preprocessing After the acquisition of surface images of marble samples, the region of interest in each image is determined since the acquired

images contain both marble surface and the black background. Then, the resulting images are reduced in size from 1575 × 1550 to 315 × 310 for the purpose of diminishing the computational burden. Since there is a tradeoff between image resolution and the amount of computation, the final image size was determined after considerable experimentation. All images were saved in “jpeg” format. Typical sample images belonging to the four quality groups after preprocessing are shown in Fig. 1. The numbers of training and test samples in our dataset for each group are given in Table I. III. APPLICATION OF THE EXISTING METHODS FROM THE LITERATURE As explained briefly in Section I, several methods have been proposed for classification of marbles. Recently in [12] and [16], promising results were obtained for industrial applications. Therefore, we first attempted to apply the methods in [12] and [16] to our dataset using different color spaces and different classifiers. Results of the application of these existing methods from literature to our marble image dataset are presented in Tables II, III, and IV. Different classifiers have been used in simulations, and the effect of employing principal component analysis (PCA), which is used for reduction of features, has also been investigated.


TABLE III PERFORMANCE OF MLP CLASSIFIER IN RGB COLOR SPACE USING FEATURES OBTAINED BY WAVELETS

TABLE IV PERFORMANCE OF MLP CLASSIFIER IN RGB COLOR SPACE USING THE COMBINATION OF SDH AND WAVELETS

The results obtained after applying existing methods from the literature show that, for our large and diverse dataset, the correct classification rates reported in the literature could not be reached. Hence, we conclude that classification performance should be improved by developing novel methodologies. For that purpose, two newly developed techniques have been introduced in Sections IV and V for classification of marble slab images in our dataset. A. Application of Texture-Based Features In [16], textural features are extracted by using SDH [17] and the distance metric is chosen to be the 8-neighborhood. Using the obtained SDH vectors, seven statistical features (mean, variance, energy, correlation, entropy, contrast, homogeneity) [16] are computed. For each color channel, these calculations produce a total of 21 (7 features times 3 color channels) features. Since the number of features is high, PCA [18] is used to prevent the “curse of dimensionality” [19] and to reduce the computational burden. Only the principal components with a contribution equal to or greater than 0.1% are taken into consideration. Thus, the accumulated variance is above 99.5%. This operation reduces the dimension of the feature space from 21 to 8, 9, or 10, depending on the color space. Then, an MLP network [20], which is trained using backpropagation with adaptive learning rate, is employed. The MLP has one hidden layer with six neurons with tangent sigmoid activation functions and an output layer with four neurons with linear activation functions. The network goal is chosen to be 0.001, and the maximum number of iterations is determined as 15 000 epochs. The adaptive learning rate is initialized to 0.01. If performance decreases toward the goal, the learning rate increases with a ratio of 1.05, otherwise it decreases with a ratio of 0.7. These parameter values are set as identical to the ones in [16] for a fair comparison.
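To make the SDH-based feature extraction of [16], [17] concrete, the sketch below computes the seven statistics for one color channel and stacks them over the three channels. It is only an illustrative reconstruction under our own assumptions (a single (1, 1) displacement instead of the full 8-neighborhood, and textbook forms of Unser's statistics), not the code used in the paper.

```python
import numpy as np

def sdh_features(channel, dx=1, dy=1, levels=256):
    """Seven SDH statistics (mean, variance, energy, correlation, entropy,
    contrast, homogeneity) for one channel and one displacement (dx, dy)."""
    a = channel[:channel.shape[0] - dy, :channel.shape[1] - dx].astype(int)
    b = channel[dy:, dx:].astype(int)
    # Normalized sum and difference histograms.
    ps = np.bincount((a + b).ravel(), minlength=2 * levels - 1) / a.size
    pd = np.bincount((a - b + levels - 1).ravel(), minlength=2 * levels - 1) / a.size
    i = np.arange(2 * levels - 1)                  # sum values 0 .. 2(levels-1)
    j = np.arange(2 * levels - 1) - (levels - 1)   # difference values -(levels-1) .. (levels-1)
    mean = 0.5 * np.sum(i * ps)
    contrast = np.sum(j ** 2 * pd)
    variance = 0.5 * (np.sum((i - 2 * mean) ** 2 * ps) + contrast)
    correlation = 0.5 * (np.sum((i - 2 * mean) ** 2 * ps) - contrast)
    energy = np.sum(ps ** 2) * np.sum(pd ** 2)
    entropy = -np.sum(ps[ps > 0] * np.log(ps[ps > 0])) \
              - np.sum(pd[pd > 0] * np.log(pd[pd > 0]))
    homogeneity = np.sum(pd / (1.0 + j ** 2))
    return [mean, variance, energy, correlation, entropy, contrast, homogeneity]

def sdh_feature_vector(rgb_image):
    """21 features: 7 SDH statistics for each of the 3 color channels."""
    return np.concatenate([sdh_features(rgb_image[..., c]) for c in range(3)])
```

The resulting 21-dimensional vectors can then be passed through PCA, keeping the components whose contribution is at least 0.1% as described above, before training the MLP.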


In our simulations, a radial basis function network (RBFN) [20] and a probabilistic neural network (PNN) [20] are also used to test the effect of the classifier on the performance. The RBFN used in our simulations is a two-layer network. The first layer has neurons with Gaussian activation functions. The second layer has neurons with linear activation functions. The same topology is also used for the PNN. The spread of the Gaussians is an important parameter for both networks. The important condition to meet is to make sure that the spread is large enough so that the active input regions of the neurons in the first layer overlap, which makes the network function smoother and results in better generalization for new input vectors. However, the spread should not be so large that each neuron effectively responds in the same large area of the input space. The spread is chosen to be 0.6952 for the RBFN and 0.1 for the PNN. The color representation of the samples is also an important factor, which makes it necessary to analyze different color spaces in order to compare their effect on classification. Since classification performance is strongly dependent on the application, the performance of the four color spaces must be observed and the best one must be chosen [22]. In [16] and in our simulations, RGB, KL, XYZ, and YIQ [22] were the four different color spaces to be experimented with. All images were converted from RGB format to the other three color spaces by using the well-known color space conversion matrices [22]. As a result of our simulations, performances of different classifiers obtained by using SDH features with different color spaces are given in Table II. Although the KL color space is found to be the best in [16], our results in Table II show that different color spaces do not affect the classification performance significantly for our dataset. Since RGB is the original color space of our images, it allows lower computational complexity. Therefore, RGB is used in the rest of this paper. Our results in Table II also show that the MLP network provides a better performance compared to the PNN and RBFN. In general, our classification performance is lower than the one reported in [16]. For example, none of the four quality groups is completely (100%) correctly classified. This is because our dataset is much larger (1158 images as compared to 75) and it contains more diverse surface images of marble samples (four quality groups versus three in [16]). Therefore, to handle our more challenging dataset, more complicated classification schemes must be devised. We also wanted to observe the effect of PCA on the classification performance by repeating the simulations without the application of PCA to the feature space. As can be observed from Table V, although the classification performance increases without the application of PCA, the correct classification rates of Groups 2 and 3 are around 95% and should be increased. Moreover, the sensitivity values should be higher, especially for Group 1, whose marbles are of the best quality. Sensitivity and specificity [20] are statistical measures of the performance of a binary classification test. Sensitivity measures the proportion of actual positives that are correctly identified as such, and specificity measures the proportion of negatives that are correctly identified. There is usually a tradeoff between these two measures, and both are desired to be high, just as the correct classification rate.
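As a quick reference for how the sensitivity and specificity columns in the tables can be obtained, the following sketch treats one quality group as the positive class of a binary test computed from a confusion matrix; the function and variable names are ours, introduced only for illustration.

```python
import numpy as np

def sensitivity_specificity(conf, g):
    """Sensitivity and specificity of quality group g (one-vs-rest).
    conf[i, j] counts samples of true group i assigned to group j."""
    conf = np.asarray(conf, dtype=float)
    tp = conf[g, g]                      # group-g samples labeled as group g
    fn = conf[g, :].sum() - tp           # group-g samples labeled as another group
    fp = conf[:, g].sum() - tp           # other samples labeled as group g
    tn = conf.sum() - tp - fn - fp       # other samples labeled as another group
    return tp / (tp + fn), tn / (tn + fp)
```

For example, a group whose samples are never assigned elsewhere has sensitivity 1, while a group that never attracts samples from other groups has specificity 1.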


TABLE V PERFORMANCE OF MLP CLASSIFIER IN RGB AND KL COLOR SPACES USING FEATURES OBTAINED BY SDH WITHOUT USING PCA

B. Application of Wavelet-Based Features Next, we have applied the methodology proposed in [12] to our dataset by using wavelet analysis instead of SDH for extracting features. This method relies on the assumption that each marble quality group differs in terms of frequency (spectral) content. Therefore, the aim is to obtain the energy level of a sample and check if it is inside a band of frequency or not. The algorithm used for multiresolution decomposition is the discrete wavelet transform (DWT) [21]. The DWT is equivalent to a series of low- and high-pass filtering of the original signal followed by down sampling. In case of a 2-D signal (i.e., surface image of a marble sample), high-pass filtering includes filtering in the horizontal, vertical, and diagonal directions. Thus, four images are generated at each level of decomposition: one low-pass and three high-pass (horizontal, vertical, and diagonal) images. In our study, three levels of DWT decomposition are performed for each sample image using biorthogonal wavelets [21]. Mean, median, and variance of each level of decomposition are computed to obtain a 1 × 36 feature vector for each sample. Finally, PCA is used to reduce the number of features. Applying PCA showed that 91% of all the variation in the feature matrix was contributed by the variance of the third-level vertical and diagonal details. The classification results (Table III) using MLP network show that performance of the wavelet features is slightly less than the classification performance obtained with the textural features extracted by using SDH. C. Combination of Texture- and Wavelet-Based Features Finally, features obtained by SDH and wavelets are used together as input to the PCA. The output of PCA is given to an MLP network whose results are presented in Table IV. Although there is an increase in classification performance using both wavelet and SDH features, the improvement is not significant. The results also show that the samples from Groups 1 and 4 are classified with acceptable performance rates while misclassification ratio is high between the samples of Groups 2 and 3. D. Comparison of the Simulation Results With Literature When we analyze the graphical representations of the feature space and properties of misclassified samples, we observe that the pattern distributions of Groups 2 and 3 are very similar. Hence, these two groups cannot be separated fully. Based on our simulations, we notice that different feature sets are suc-

Fig. 4. Challenging marble sample images from (a) Group 2 and (b) Group 3.

cessful in classifying different subgroups from Groups 2 and 3. For instance, some samples that belong to Groups 2 and 3 are challenging, because they cannot be classified using features employed earlier. These challenging samples have very similar (grain area)/(total area) ratio as illustrated in Fig. 4(a) and (b). Moreover, their textural properties and spectral energy contents are not different enough to correctly classify these samples. Classification criteria of the human experts for these samples are based on the distribution of the veins. Thus, for these types of sample images, it is important to determine the structure of the veins. As seen from Tables II through V, the results obtained by applying the existing methods to our dataset are not as successful as reported in the literature [12], [16]. This necessitates proposal of new methodologies to increase the separability between Groups 2 and 3 by using different features in a cascaded manner. In the following two sections, we propose new techniques in which correctly classified marble samples (subgroups) are taken out of the dataset at each level and a different (generally more complex) feature extraction method is used for the remaining samples at the next level. Since using more complex feature extraction methods or classifiers can jeopardize the quasi real-time implementation of the system for industrial applications, completely novel approaches are what we need. Two such systems are realized by using: 1) A two-stage network and 2) HRBFN topology. Sections IV and V are devoted to these two novel methods proposed in this study. IV. TWO-STAGE NETWORK In the previous section, the samples that caused a reduction in classification performance are determined mostly to be the visually similar images from Groups 2 and 3. Classifying samples from Groups 1 and 4 is relatively easy except for some challenging surface images. Therefore, a two-stage network is designed to handle the quality groups separately. A similar approach was used for detection of spikes in electroencephalograms in [23]. Our aim in the first stage is, by using two discrete perceptrons, to map all samples into three distinct classes. Class 1: Definite samples from Group 1; Class 2: Definite samples from Group 4;


Fig. 5. Process chart of the two-stage network.

Class 3: Samples from Groups 2 and 3, and nondefinite samples from Groups 1 and 4. In other words, the definite samples from Groups 1 and 4 are separated from the nondefinite samples of these groups in the first stage. The second stage receives all samples from Groups 2 and 3 and only the nondefinite samples from Groups 1 and 4. One of the discrete perceptrons is trained in such a way that its output would be 1 for definite samples from Group 1 and 0 otherwise. The second discrete perceptron, on the other hand, is trained to produce 1 for definite samples from Group 4 and 0 otherwise. A sample that produces 0 at the output of both discrete perceptrons is assigned to Class 3 (i.e., possibly Group 1, Group 2, Group 3, or Group 4). Since samples belonging to Class 1 and Class 2 are precisely classified in the first stage as described earlier, they are not involved in the second stage. However, samples belonging to Class 3 are not classified in the first stage. Therefore, they are further processed in the second stage by a nonlinear classifier that is designed to discriminate only the samples in Class 3. The process chart of this two-stage network is given in Fig. 5. One major challenge in this system is to obtain, among several possible solutions at the first stage, the solution that separates the definite samples from Groups 1 and 4 without containing any nondefinite samples. Usual perceptron learning is not appropriate for this linearly nonseparable problem, and therefore we might not reach the expected solution at the end of the iterations. The pocket algorithm [24] has the advantage of finding the optimum separation plane among the possible solutions, by means of the ratio of correctly classified samples, for linearly nonseparable problems. Therefore, the procedure used to simulate the first stage is based on the pocket algorithm in Fig. 6. The first-stage preclassifier perceptrons use different sets of features. Those features are the (grain area)/(total area) ratio on the marble surface and the mean value of each color layer histogram as a color index. These features are decided upon considering the visual appearance of samples from Groups 1 and 4, which are seen to be separable using these basic features. Preclassifiers are trained using 756 samples and tested on 402 samples.
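The first stage can be sketched as two pocket-trained perceptrons followed by a simple routing rule. This is an illustrative reconstruction: the "ratchet" variant below keeps the weights with the best training accuracy seen so far, and the learning rate, epoch count, and binary labels in {0, 1} are our assumptions rather than settings from the paper.

```python
import numpy as np

def pocket_perceptron(X, y, epochs=5000, lr=1.0, seed=0):
    """Pocket algorithm (ratchet variant) for a linearly nonseparable binary
    problem: run perceptron updates, but keep ("pocket") the best weights."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])        # append bias input
    w = np.zeros(Xb.shape[1])
    pocket_w, pocket_acc = w.copy(), 0.0
    for _ in range(epochs):
        k = rng.integers(len(y))                          # random training sample
        pred = float(Xb[k] @ w >= 0)
        if pred != y[k]:
            w = w + lr * (y[k] - pred) * Xb[k]            # perceptron update on error
        acc = np.mean((Xb @ w >= 0).astype(float) == y)
        if acc > pocket_acc:                              # new best separation plane
            pocket_w, pocket_acc = w.copy(), acc
    return pocket_w

def first_stage_route(x, w_group1, w_group4):
    """Class 1: definite Group 1; Class 2: definite Group 4; Class 3: the rest,
    which is forwarded to the second-stage MLP."""
    xb = np.append(x, 1.0)
    if xb @ w_group1 >= 0:
        return 1
    if xb @ w_group4 >= 0:
        return 2
    return 3
```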

Fig. 6. The pseudo-code for the pocket algorithm.

TABLE VI CLASSIFICATION PERFORMANCE OF PRECLASSIFIERS 1 AND 2

Performance of the preclassifiers is given in Table VI. Preclassifier 1 is the perceptron that classifies the samples as Group 1 or Non-group 1, and similarly, preclassifier 2 is the perceptron that classifies the samples as Group 4 or Non-group 4. Training samples of the postclassifier are then selected as the intersection of the outputs of the preclassifiers (the intersection of the Non-group 1 and Non-group 4 samples, which are classified by


TABLE VII CLASSIFICATION PERFORMANCE OF MLP CLASSIFIER IN RGB COLOR SPACE USING THE TWO-STAGE NETWORK

the preclassifiers). The postclassifier is an MLP network that has four outputs and is able to classify the samples (belonging to Groups 1 and 4) previously misclassified by the perceptrons. The postclassifier unit has the same parameters and topology as the MLP network explained in Section III. The results are shown in Table VII. Although there is an increase in classification performance for Groups 3 and 4 compared to the wavelet- and SDH-based features, the improvement is still not quite satisfactory. Interestingly, the slightly negative effect of using PCA on performance is not as evident as in the single-stage network applications in Section III. Applying PCA prior to classification does not create significant differences in the performance results. On the other hand, the training duration is increased by around 30% as compared to using only an MLP classifier, due to the training of the preclassifiers. However, the duration of classification of a sample at testing is increased by only around 3%. As the performance of the two-stage network is not as high as desired, it is necessary to use additional features to improve the classification results. However, this should be done together with a suitable network topology, because increasing the number of features and extracting more complex features can increase the computational burden drastically. Our approach to overcome this tradeoff is to extract and use more complex features for as few samples as possible. In other words, samples that can be classified with simpler features should be classified at the beginning of the classification process, and the remaining samples should be classified by using more complex features. Due to its hierarchical structure, HRBFN topology is determined to be suitable for classification using several feature sets that are employed independently during classification. V. CLASSIFICATION WITH HRBFN To realize the hierarchical classification scheme and to determine the classification performance robustly, features and classifiers should be combined in an efficient and computationally tractable manner. Using the divide-and-conquer strategy [25], [26], hierarchical classification attempts to combine classes with similar characteristics into one class, which can be separated later at the succeeding layers. A method for such an approach is proposed for combining multiple probabilistic classifiers on different feature sets in [27]. An automatic feature rank

mechanism is proposed to use different feature sets in an optimal way, and a linear combination scheme has been implemented yielding good results for speaker identification [28], [29]. Another effective approach to improve classification performance in hierarchical classifiers is the pruning of training tokens with good interclass discrimination and successively optimizing features and classifier topologies for the remaining tokens [30]. The backbone of this approach is a combination of iterative optimization of features and classifiers as a function of a reduced training feature subset. In this study, a similar approach is realized by introducing the novel modular classification system in Fig. 7 consisting of both simple classification systems (i.e., a single perceptron etc.) and complex ones (i.e., MLP, RBFN, etc.) which are combined with a data-dependent and automated switching mechanism that decides to apply one of them at each step. In Fig. 7, assume that there are N diverse feature sets, N expert networks, and K groups. Process starts with the complete dataset, and at each level of classification properly classified samples are collected in their corresponding groups (Ck ; k = 1, . . . , K) via set-theoretic union operation, ∪. Afterward, the unclassified samples (rejected data), which are collected in Group CK +1 , are fed into a different network by using the same or a different feature extraction method. This process continues until all of the samples are classified or all features and/or networks are utilized. The switching is based on the rejected samples, which are the samples that cannot be classified into any of the output groups by using the feature space and classifier of that level. At each level, these rejected samples are used as the input of another classifier (i.e., usually a more complex one) after another feature is extracted from them. The switching mechanism of the introduced classification system can be: 1) automated or semiautomated based on data-dependent decision task(s); 2) fixed to perform a predefined order of feature extraction and classification methods as in this study. The introduced modular classification system (either with data-dependent and automated or predefined and fixed switching mechanism) constitutes a new kind of system of classifiers, some of which are simple and therefore efficient in time and memory requirements and the others are complex providing a high classification performance, such that depending on the dataset, herein marble samples, one of the classifiers becomes active. The switching mechanism does indeed perform a classification task that assigns different subgroups of the dataset into one of the available classifiers (and feature extraction methods). Thus, the proposed method combines classifiers in order to achieve good performance and higher efficiency in terms of time and memory requirements. For the application of marble classification, a particular version of this system is used as explained later. Since this study presents an application that is desired to work online, both time efficiency and the overall classification performance are deemed important. Our decision for using the features at a specific order is solely based on computational


Fig. 7. Process chart of the proposed N-level hierarchical structure.

efficiency of them. Thus, samples that can be classified with simpler features are taken out at the beginning of the classification process. More complex features are extracted only if there are challenging samples that need to be classified. As discussed, HRBFN topology is determined to be suitable for classification in which different feature sets are used in a cascaded manner. The HRBFN was first proposed in [31] where some of the input data were rejected based on an error criterion at the end of each level. These rejected data become input to the next level where the number of neurons (Gaussian units) is determined as a logarithmic function of the number of rejected data. Recently, a new approach has been proposed in [32], in which approximation is achieved through a neural network model. It is a particular multiscale version of RBFN that self-organizes to allocate more units when the data contain higher frequencies. The quasi real-time implementation of the proposed HRBFN is presented in [33]. HRBFN is also used for classification in various applications [34], such as for recognition of facial expressions [35]. Unlike the previous models that are used for function approximation and classification, our HRBFN system consists of a fixed number (four) of RBFNs (levels). At each level, HRBFN consists of four Gaussian units, each of which represents a quality group. In total, the network contains 16 Gaussian units. These 16 Gaussian units can be defined over different range of values, and they construct the quality groups when combined in an appropriate way. If a single level is used, scale of the Gaussians should be small enough to reconstruct each challenging sample. However, this may cause waste of resources (due to the need for a large number of units) and overfitting.
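Before the formal definition that follows, the level-by-level flow of Figs. 7 and 8 can be summarized in a short sketch; the `extract` / `classify_or_reject` interface and all names are ours, introduced only to illustrate the rejection mechanism, not an API from the paper.

```python
def hierarchical_classify(samples, levels, num_groups=4):
    """levels: ordered list of (extract, classify_or_reject) pairs, from the
    cheapest feature set to the most complex one. classify_or_reject returns a
    group index in 0..num_groups-1, or None to reject the sample."""
    groups = {k: [] for k in range(num_groups)}   # C_1 ... C_K, grown by union
    rejected = list(samples)                      # level 1 starts from the full set
    for extract, classify_or_reject in levels:
        next_rejected = []
        for sample in rejected:
            label = classify_or_reject(extract(sample))
            if label is None:
                next_rejected.append(sample)      # goes to the next (more complex) level
            else:
                groups[label].append(sample)      # collected in its quality group
        rejected = next_rejected
        if not rejected:                          # everything classified; stop early
            break
    return groups, rejected
```

In the marble application the list has four levels, ordered by feature cost: the (grain area)/(total area) ratio, the SDH statistics, the wavelet features, and finally the morphological features.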

HRBFN performs the mapping s : \mathbb{R}^D \to \mathbb{R} as the union of four classifications, \{l_i(\cdot)\}_{i=1,2,3,4},

y(x) = \bigcup_{i=1}^{4} l_i(x) = l_1(x) \cup l_2(x) \cup l_3(x) \cup l_4(x). \quad (1)

Here, i represents the index of each RBFN network, called a level, and \cup represents the union operation. When all of the four levels, \{l_i(\cdot)\}_{i=1,2,3,4}, are combined, they construct y(x), which represents the overall classification result. Each l_i(\cdot) is a union of four Gaussian units, each of which represents a quality group and contains the correctly classified samples for that level,

l_i(x) = \bigcup_{j=1}^{4} w_{i,j}\, g_{i,j}(x - c_{i,j}; \sigma_{i,j}). \quad (2)

In (2), g_{i,j}(\cdot) represents the jth Gaussian unit at the ith level and is defined by g_{i,j}(x) = \exp\!\big(-(x - c_{i,j})^T (1/\sigma_{i,j}^2)(x - c_{i,j})\big). Here, \{c_{i,j} \mid c_{i,j} \in \mathbb{R}^D\} represents the centers and \{\sigma_{i,j} \mid \sigma_{i,j} \in \mathbb{R}^D\} denotes the widths of the Gaussian units at each level, with \{w_{i,j} \mid w_{i,j} \in \mathbb{R}\} denoting the synaptic weights. The complete output of HRBFN can be regarded as the combination of all levels. In other words, samples of a quality group that are classified at each level are collected together at the end of the process. Therefore, classification performance strongly depends on the proper determination of the parameters mentioned earlier. The process chart of the HRBFN structure is given in Fig. 8. The main idea behind our HRBFN is to separate the correctly classified and the misclassified marble samples at each level


Fig. 9. Example of overlapping and nonoverlapping regions between two feature sets after the application of PCA and choosing the first and second principal components (♦ denotes samples from group 3, and  denotes samples from group 4).

Fig. 8. Process chart of the proposed HRBFN structure.

and to reclassify the misclassified samples (rejected data) at the next level using a different (generally more complex) feature extraction method. For the four levels, determination of the centers and widths of the four Gaussian units is summarized as follows. A. Determination of the Centers and Widths for a Level In our study, one of the most difficult problems is to distinguish the overlapping samples from the nonoverlapping ones, as shown in Fig. 9. Since the samples in the overlapping regions will be rejected at that level, representing an overlapping region (between two Gaussian units) in a clear way is very important. Representing overlapping regions depends on the correct determination of the centers and widths of the Gaussian units located at the nonoverlapping regions. A well-known method to determine the centers of RBFNs utilizes a clustering algorithm (i.e., K-means) [36]. However, in our case, using such a clustering method by itself may produce incorrect results, since the correctly classified and rejected data must be carefully determined

and separated. Hence, it should be supported with a proper method for determining the widths of the Gaussian units. Although the use of the same width for all units has proved to be sufficient to obtain a universal approximator [37], it can produce inefficient results for our problem, since the rejected data are not considered in a fixed-width solution. Therefore, determination of a different width for each Gaussian unit is necessary. In our study, the width values of all the Gaussian units have been calculated using a modified version of the closest RBFN heuristic [38]. To find the center of a Gaussian unit, first, the K-means method is used to determine the initial clusters and their centers. Then, at each cluster, the misclassified samples that are closest to the cluster center are determined. The width of a Gaussian unit is then calculated as the distance from the Gaussian center to the sample at the border. The sample at the border is the sample that is closer to the center of the quality group represented by the corresponding Gaussian unit than any other sample that belongs to any other quality group. Since the Gaussians used in HRBFN are not necessarily circular, the width search can find different values for dimensions x_1, x_2, \ldots, x_{k-1} in a k-dimensional feature space,

\sigma_{i,j} = \min\,(c_{i,j} - x_b), \quad \forall i, j. \quad (3)

Here, σ i,j is the width of jth Gaussian unit at the ith level and xb is a sample at the border. B. Determination of the Weights Since the center and width determination is carried out in a k − 1 dimensional space for a k dimensional feature space, a minimization algorithm is needed to adjust the weights after the production of the initial model by locating the Gaussian units and calculating their widths. Such an algorithm reduces the approximation error of the model iteratively until the nearest local minimum to the initial configuration is reached. The minimization algorithm chosen in our study is the gradient descent


algorithm [38] because of its fast convergence and relatively low computational complexity. C. HRBFN Procedure for Marble Classification Starting from the training set, ms_i(x), first, the (grain area)/(total area) ratio of the samples is calculated as the feature of the first level. To do so, all samples are thresholded using the Otsu thresholding method [39]. Thresholding is applied to the blue color channel, since for most of our sample images, the intensity of the blue component is comparatively low, as witnessed by the color histograms. Then, the (grain area)/(total area) ratio is calculated as the first feature. After classification using the RBFN network of that level, the rejected data are determined as the samples that stay inside at least one of the overlapping regions. The next level always receives the rejected data of the previous level as input, and the structural parameters (i.e., centers, widths, and synaptic weights) are calculated for the rejected data again. The general formulas for calculating the rejected data for level i, as shown in Fig. 8, are given as

r_i(x_n) = ms_i(x_n) - \bigcup_{j=1}^{i} l_j(x_n) = r_{i-1}(x_n) - l_{i-1}(x_n). \quad (4)
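A minimal sketch of the first-level feature follows, assuming scikit-image is available for Otsu thresholding; treating the brighter side of the threshold as limestone grains is our assumption about the polarity of the blue channel.

```python
import numpy as np
from skimage.filters import threshold_otsu

def grain_area_ratio(rgb_image):
    """(grain area) / (total area) from the blue channel, the level-1 feature."""
    blue = rgb_image[..., 2].astype(float)
    grains = blue > threshold_otsu(blue)   # assumption: beige grains are the brighter class
    return grains.mean()                   # fraction of the surface covered by grains
```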

Thus, the second level receives the rejected data from the first level. Then, for these data, 21 textural features (7 statistical features times 3 color channels) are calculated. After PCA and classification with the second level (RBFN network), the rejected data are sent to the third level. At the third level, wavelet features are calculated for the rejected data, PCA is applied, and the third RBFN performs the classification. The fourth level receives the rejected data from the third level, and three morphological features (i.e., area, compactness, and elongatedness) are extracted from the rejected data. After the third level, only the most challenging samples, which belong mostly to Groups 2 and 3, are left. These samples are challenging, because they cannot be classified at the previous levels of the HRBFN network. These challenging samples have very similar (grain area)/(total area) ratios. Moreover, their textural and spectral properties are not considerably different. Therefore, for these samples it is important to determine the structure of the veins. For samples from Group 2, the veins should be separate and in the form of thin and long lines. However, for Group 3, they should be unified and form cohesive material (matrix) regions on the marble surface. Our approach to solve this problem is to use the morphological features at the fourth level of the HRBFN to determine whether the veins are separated or unified. To be more accurate, let us consider two challenging samples from Groups 2 and 3 [see Fig. 4(a) and (b)]. For extracting the morphological features, first the contrast of the gray-scale image obtained using blue channel information is enhanced by using histogram equalization [Fig. 10(a) and (b)]. Then, all remaining samples are thresholded using the Otsu thresholding method [39], as in the first level of the HRBFN. After that, the resulting image is first eroded [22],

A \ominus B = \{x \mid (\hat{B})_x \subseteq A\} \quad (5)

Fig. 10. Creation of labeled regions for the challenging samples from Group 2 (a, c, e, g) and Group 3 (b, d, f, h). (a), (b) Blue channel images of marble samples; (c), (d) thresholded blue channel images; (e), (f) labeled regions; (g), (h) labeled regions on the original image.

and then dilated [22],

A \oplus B = \{x \mid (\hat{B})_x \cap A \neq \emptyset\} \quad (6)

using the same structure element in order to remove very small objects and weakly connected grains [Fig. 10(c) and (d)]. In (5) and (6), A represents the image to be eroded while B is the structure element. Finally, cohesive material regions are determined and labeled using connected component analysis [22] [Fig. 10(e) and (f)]. These labeled regions are shown with the circle marks on the challenging samples in Fig. 10(g) and (h).
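The fourth-level preprocessing described above can be sketched as follows with SciPy and scikit-image; the 3×3 structuring element is an assumption, since its size is not stated in the text.

```python
import numpy as np
from scipy import ndimage
from skimage import exposure
from skimage.filters import threshold_otsu

def label_cohesive_regions(rgb_image):
    """Equalize the blue channel, threshold it (Otsu), clean the mask with an
    erosion followed by a dilation, and label the cohesive-material regions."""
    blue = exposure.equalize_hist(rgb_image[..., 2])     # contrast enhancement
    cohesive = blue <= threshold_otsu(blue)              # assumption: red-brown regions are darker
    se = np.ones((3, 3), dtype=bool)                     # assumed structuring element
    cleaned = ndimage.binary_dilation(ndimage.binary_erosion(cohesive, structure=se),
                                      structure=se)
    labels, num_regions = ndimage.label(cleaned)         # connected component analysis
    return labels, num_regions
```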


After labeling, for each cohesive material region the morphological features of area, compactness, and elongatedness are calculated. Then, the average of each feature over all labeled grains in a sample is calculated to determine the value of the corresponding feature for that sample image. Area: This feature is defined as the area surrounded by the outer contour of a labeled region, normalized by the image size. Let (x_k, y_k) be the coordinate of the kth outer contour point, with (x_0, y_0) = (x_n, y_n), where n is the number of contour points. Area is defined in (7). Here, the image size (image width times image height) is equal to 97 650 (315 × 310),

\mathrm{Area} = \frac{\big|\frac{1}{2}\sum_{k=0}^{n-1} (x_k y_{k+1} - x_{k+1} y_k)\big|}{\text{image size}}. \quad (7)

Compactness: Compactness, referring to [40], is a very useful shape descriptor to evaluate the complexity of contours. Compactness is independent of translation, rotation, and scaling. The most compact region, which possesses the minimal value of compactness, is a circle, whose compactness is equal to 4π (approximately 12.566). Compactness is defined as

\mathrm{Compactness} = \frac{(\text{labeled region border length})^2}{\text{number of pixels in labeled region}}. \quad (8)

In our dataset, marble sample images that belong to Group 2 have less compact labeled regions than the samples from Group 3. This is because the cohesive material regions of Group 2 are in the form of veins, which are thin structures. However, the samples from Group 3 have larger veins that are unified and form cohesive matrix regions that have a more compact form. For instance, the average value of compactness for the sample in Fig. 4(a) is 102.721, while it is 36.145 for the sample in Fig. 4(b). Elongatedness: Elongatedness, referring to [41], is computed using the width, l_1, and height, l_2, of the bounding rectangle of a region and is defined as

\mathrm{Elongatedness} = 1 - \frac{l_1}{l_2}, \quad \text{where } l_1 < l_2. \quad (9)
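The three shape descriptors of (7)-(9) can be computed per labeled region, for example with scikit-image region properties as sketched below. Using the pixel count and perimeter of each region, and the bounding-box sides for l_1 and l_2, is our simplified reading of the definitions (equation (7) uses the outer contour rather than the pixel count); this is not code from the paper.

```python
import numpy as np
from skimage.measure import regionprops

def average_shape_features(labels, image_size=315 * 310):
    """Mean area, compactness, and elongatedness over all labeled regions."""
    feats = []
    for region in regionprops(labels):
        area = region.area / image_size                     # cf. (7), normalized by image size
        compactness = region.perimeter ** 2 / region.area   # (8); about 4*pi for a circle
        minr, minc, maxr, maxc = region.bbox
        l1, l2 = sorted((maxr - minr, maxc - minc))          # bounding rectangle sides, l1 <= l2
        elongatedness = 1.0 - l1 / l2 if l2 > 0 else 0.0     # (9)
        feats.append((area, compactness, elongatedness))
    return np.mean(feats, axis=0) if feats else np.zeros(3)
```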

Marble sample images that belong to Group 2 are more elongated as compared to the samples in Group 3, because the differences between l1 and l2 distances are higher in vein-like structures as in the labeled regions of the samples from Group 2. However, l1 and l2 distances are generally similar in the labeled regions of the samples from Group 3. For example, the average value of elongatedness for the sample in Fig. 4(a) is 0.76 while it is 0.58 for the sample in Fig. 4(b). After classification at the fourth level, our HRBFN stops since no data are rejected any more. Classification results obtained by the HRBFN are given in Table VIII. Thanks to its multilevel structure, the HRBFN significantly improves the classification performance as compared to the other techniques presented in this study. Only two of the samples are wrongly classified as belonging to Group 1 as, in fact, they belonged to Group 2 and only one sample is incorrectly classified in the other way around. Considering Group 4, only two samples are assigned to Group 4 while in fact they belonged to Group 3. None of the samples that belonged to Group 4 is assigned to any other groups. From an

TABLE VIII CLASSIFICATION PERFORMANCE OF THE HRBFN CLASSIFIER

economical point of view, this is a very important result, since marble samples of very high and very low qualities are almost never confused. Thus, it is ensured that the supplied marble samples are always in agreement with their classification. VI. SUMMARY AND DISCUSSION Fully automated novel systems for classifying surface images of marble slabs have been presented in this paper. Before designing these new systems, classification performances of different existing neural network models (i.e., MLP, RBFN, and PNN) have been compared by employing different feature sets and color spaces. Although different combinations of these network models and feature sets are presented to be successful in classification of marble slabs, their performance should be further increased for industrial applications as shown by simulations using our large and diverse dataset consisting of 1158 images divided into four quality groups. First of all, textural features together with MLP, RBFN, and PNN networks are used to classify the images. This process is repeated for four different color spaces. The results show that there is no significant difference among different color spaces, since no more than 1% of correct classification rate difference has been achieved. Therefore, the RGB color space has been selected and used in succeeding simulations because of its computational simplicity. Among these neural network models, MLP shows the best performance with correct classification rates of 96.52%, 91.79%, 94.53%, and 99.25% for Groups 1–4, respectively. However, these performance values are lower than the ones in the system presented in [16], because some of the samples in our dataset belong to different quality groups although their textures are highly similar. Moreover, sensitivity and specificity values should also be higher in addition to the correct classification rate because of the serious economic drawbacks of incorrect classification. These results have necessitated developing an improved and more complex method for classification. In this context, different feature sets have been used including color properties, spectral features (multiresolution wavelets) and their combinations. The experimental results showed that there are no significant differences in the correct classification rates of using these feature sets or their combinations. When the characteristics of misclassified samples are observed, it is determined that some challenging samples, belonging mostly to Groups 2 and 3, are wrongly placed. These samples are challenging, because they cannot be separated fully by extracting


TABLE IX COMPUTATIONAL COMPLEXITIES OF ALL THE CLASSIFIERS USED IN THE STUDY

the existing features from the complete dataset. These samples mostly have very similar (grain area)/(total area) ratios and textural and spectral information, which preclude their correct classification. Moreover, it is also observed that different feature sets represent different subgroup(s) in a quality group rather than representing the complete quality group. Therefore, we have used these features in a cascaded manner such that a quality group is classified by classifying all of its subgroups separately. We extract different features in a successive manner so that each feature set is used only for the subgroup(s) that can be correctly classified by that feature set. This approach was realized first by using a two-stage network. Although the correct classification rate was improved significantly, an even better approach was needed for industrial applications. Therefore, HRBFN topology, in which correctly classified marble samples are taken out of the dataset at each level and a different (generally more complex) feature extraction method is used for the remaining samples at the next level, has been used. Since the number of veins on a marble surface and the number of branches of a vein are not known a priori, these features are extracted by using a morphological feature extraction technique. The proposed system is proven to have better performance than the previously proposed systems for the diverse and large dataset used in this study. The HRBFN approach is shown to be very useful for marble classification applications, because using different feature sets to classify different sample subgroups seems to be very efficient compared to using a single feature set for the whole dataset. Another advantage of using the HRBFN network is the possibility of quasi real-time implementation [30], which is an important criterion for industrial applications. Table IX includes computational complexities of all the classification methodologies presented in this paper. All of the simulations are performed on a Pentium 4 PC with 4-GB memory and 3-GHz processor speed. Average feature extraction times (in seconds) for one sample using MATLAB 2007b software are measured to be 0.1548 for the (grain area)/(total area) ratio, 0.8062 for SDH, 1.3268 for


wavelet features, and 2.6671 for morphological features. Training duration is increased around 30% in a two-stage network topology as compared to using only one MLP classifier due to the training of preclassifiers. However, classification duration of a sample in testing is increased only around 3%. In the proposed HRBFN topology, the time required to train the system is around three times more than the two-stage network. Since training is done offline and only once prior to the implementation of the system, it does not cause a serious concern. For testing, the time required to classify a sample is decreased compared to the usage of only the MLP if the sample can be classified at the first level. As the levels increase, the time required for classifying the samples also increases. Together with feature extraction, average classification times (for levels 1 through 4) for a sample using HRBFN are 0.27, 1.2, 2.65, and 5.43 s. respectively (together with the application of PCA). This time is around 1 s. for the MLP and the two-stage network. Addition of PCA increases the computation time only around 0.1 s. for one classification. Although it depends on the particular factory, in general a marble AVI system requires a processing time of around 10 s. on the conveyor belt for classifying a sample. Hence, considered together with highly successful performance values, our computational complexity results provide enough justification for integrating the proposed system in an industrial environment. As clearly stated in [25] and [26], using a composite feature formed by lumping diverse features together in some way has disadvantages including curse of dimensionality, formation difficulty, and redundancy due to dependent components. As reported in [27]–[29], combining multiple classifiers with diverse features yields improved performance in classification problems like ours. In that vein, our method can also be viewed as a kind of suboptimal solution of the problem. Although an extensive experimentation procedure is carried out during the simulations by changing the order of features and classifiers, this does not yield any significant difference in classification performance, causing only a dramatic increase in overall classification time. This suboptimality may be considered as a weakness of the proposed methodology, which necessitates making careful choices in terms of order of classifiers and utilized features dependent on the priorities (i.e., success rate, processing time, etc.) of the application at hand. Therefore, performing a detailed theoretical analysis for extending the methodology (as in [26]) to realize the joint use of different feature sets and classifiers in an optimal way could be noted as a challenging future study direction. REFERENCES [1] A. B. Yavuz, N. T¨urk, and M. Y. Koca, “The use of micritic limestone as building stone. A case study of Akhisar beige marble in western Turkey,” in Proc. Ind. Miner. Build. Stones, Proc., ˙Istanbul, Turkey, 2003, pp. 277– 281. [2] T. S. Newman and A. K. Jain, “A survey of automated visual inspection,” Comput. Vis. Image Understand., vol. 61, no. 2, pp. 231–262, 1995. [3] E. N. Malamas, E. G. M. Petrakis, M. Zervakis, L. Petit, and J.-D. Legat, “A survey on industrial vision systems, applications, and tools,” Image Vis. Comput., vol. 21, pp. 171–188, 2003. [4] S. Wang and H. Wang, “Password authentication using Hopfield neural networks,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 38, no. 2, pp. 265–268, Mar. 2008.


REFERENCES

[1] A. B. Yavuz, N. Türk, and M. Y. Koca, “The use of micritic limestone as building stone. A case study of Akhisar beige marble in western Turkey,” in Proc. Ind. Miner. Build. Stones, İstanbul, Turkey, 2003, pp. 277–281.
[2] T. S. Newman and A. K. Jain, “A survey of automated visual inspection,” Comput. Vis. Image Understand., vol. 61, no. 2, pp. 231–262, 1995.
[3] E. N. Malamas, E. G. M. Petrakis, M. Zervakis, L. Petit, and J.-D. Legat, “A survey on industrial vision systems, applications, and tools,” Image Vis. Comput., vol. 21, pp. 171–188, 2003.
[4] S. Wang and H. Wang, “Password authentication using Hopfield neural networks,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 38, no. 2, pp. 265–268, Mar. 2008.
[5] L. Wang, W. Liu, and H. Shi, “Noisy chaotic neural networks with variable thresholds for the frequency assignment problem in satellite communications,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 38, no. 2, pp. 209–217, Mar. 2008.
[6] G. P. Zhang, “Neural networks for classification: A survey,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 30, no. 4, pp. 451–462, Nov. 2000.
[7] V. Garcerán-Hernández, L. G. García-Pérez, P. Clemente-Pérez, L. M. Tomás-Balibrea, and H. D. Puyosa-Piña, “Traditional and neural networks algorithms: Applications to the inspection of marble slabs,” in Proc. IEEE Int. Conf. Syst., Man, Cybern., Vancouver, BC, Canada, 1995, pp. 3960–3965.
[8] P. Clemente-Pérez, V. Garcerán-Hernández, H. D. Puyosa-Piña, and L. M. Tomás-Balibrea, “Automatic system to quality control: Using artificial vision and neural nets for classification of marble slabs in production line,” in Proc. Int. Symp. Artif. Neural Netw., Taiwan, 1995, pp. E3.26–E3.31.
[9] F. Lumbreras and J. Serrat, “Segmentation of petrographical images of marbles,” Comput. Geosci., vol. 22, no. 5, pp. 547–558, Jun. 1996.
[10] M. Deviren, M. K. Balci, U. M. Leloğlu, and M. Severcan, “A feature extraction method for marble tile classification,” in Proc. 3rd Int. Conf. Comput. Vis., Pattern Recognit., Image Process., Atlantic City, NJ, 2000, pp. 25–28.
[11] L. Shafarenko, M. Petrou, and J. Kittler, “Automatic watershed segmentation of randomly textured color images,” IEEE Trans. Image Process., vol. 6, no. 11, pp. 1530–1543, Nov. 1997.
[12] J. D. Luis-Delgado, J. Martinez-Alajarin, and L. M. Tomas-Balibrea, “Classification of marble surfaces using wavelets,” Electron. Lett., vol. 39, no. 9, pp. 714–715, 2003.
[13] J. Martinez-Alajarin, “Supervised classification of marble textures using support vector machines,” Electron. Lett., vol. 40, pp. 664–666, 2004.
[14] C. Boukouvalas, F. D. Natale, G. D. Toni, J. Kittler, R. Marik, M. Mirmehdi, M. Petrou, P. L. Roy, R. Salgari, and G. Vernazza, “ASSIST: Automatic system for surface inspection and sorting of tiles,” J. Mater. Process. Technol., vol. 82, no. 1–3, pp. 179–188, Oct. 1998.
[15] D. M. Tsai and T. Y. Huang, “Automated surface inspection for statistical textures,” Image Vis. Comput., vol. 21, pp. 307–323, 2003.
[16] J. Martinez-Alajarin, J. D. Luis-Delgado, and L. M. Tomas-Balibrea, “Automatic system for quality-based classification of marble textures,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 35, no. 4, pp. 488–497, Nov. 2005.
[17] M. Unser, “Sum and difference histograms for texture classification,” IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-8, no. 1, pp. 118–125, Jan. 1986.
[18] S. Wold, K. Esbensen, and P. Geladi, “Principal component analysis,” Chemometr. Intell. Lab. Syst., vol. 2, pp. 37–52, 1987.
[19] R. E. Bellman, Adaptive Control Processes: A Guided Tour. Princeton, NJ: Princeton Univ. Press, 1961.
[20] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. Upper Saddle River, NJ: Prentice-Hall, 1998.
[21] S. G. Mallat, A Wavelet Tour of Signal Processing, 2nd ed. San Diego, CA: Academic, 1999.
[22] R. C. González and R. E. Woods, Digital Image Processing, 2nd ed. Boston, MA: Addison-Wesley, 1992.
[23] N. Acir, I. Oztura, M. Kuntalp, B. Baklan, and C. Guzelis, “Automatic detection of epileptiform events in EEG by a three-stage procedure based on artificial neural networks,” IEEE Trans. Biomed. Eng., vol. 52, no. 1, pp. 30–40, Jan. 2005.
[24] S. I. Gallant, “Perceptron-based learning algorithms,” IEEE Trans. Neural Netw., vol. 1, no. 2, pp. 179–191, Jun. 1990.
[25] M. I. Jordan and R. A. Jacobs, “Hierarchical mixtures of experts and the EM algorithm,” in Proc. Int. Joint Conf. Neural Netw., Oct. 1993, vol. 2, pp. 1339–1344.
[26] K. Chen, “A connectionist method for pattern classification with diverse features,” Pattern Recognit. Lett., vol. 19, no. 7, pp. 545–558, 1998.
[27] K. Chen and H. Chi, “A method of combining multiple probabilistic classifiers through soft competition on different feature sets,” Neurocomputing, vol. 20, pp. 227–252, 1998.
[28] K. Chen, L. Wang, and H. Chi, “Methods of combining multiple classifiers with different features and their applications to text-independent speaker identification,” Int. J. Pattern Recognit. Artif. Intell., vol. 11, no. 3, pp. 417–445, 1997.
[29] K. Chen, “On the use of different speech representations for speaker modeling,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 35, no. 3, pp. 301–314, Aug. 2005.

[30] D. Kil and F. Shin, Pattern Recognition and Prediction With Applications to Signal Characterization. New York: AIP, 1996.
[31] K. V. Ha, “Hierarchical radial basis function networks,” in Proc. Neural Netw., 1998, pp. 1893–1898.
[32] S. Ferrari, M. Maggioni, and N. Borghese, “Multiscale approximation with hierarchical radial basis function networks,” IEEE Trans. Neural Netw., vol. 15, no. 1, pp. 178–188, Jan. 2004.
[33] N. D’Apuzzo, “Modeling human faces with multi-image photogrammetry,” in Proc. SPIE Three-Dim. Image Capture Appl., San Jose, CA, 2002, vol. 4661, pp. 191–197.
[34] Y. Chen, L. Peng, and A. Abraham, “Hierarchical radial basis function neural networks for classification problems,” in Adv. Neural Netw.—ISNN 2006 (Lecture Notes in Computer Science), vol. 3971, pp. 873–879, 2006.
[35] D.-T. Lin and J. Chen, “Facial expressions classification with hierarchical radial basis function networks,” in Proc. Int. Conf. Neural Inf. Process. (ICONIP 1999), vol. 3, pp. 1202–1207.
[36] J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proc. 5th Berkeley Symp. Math. Stat. Prob., 1967, vol. 1, pp. 281–297.
[37] J. Park and I. W. Sandberg, “Universal approximation using radial-basis function networks,” Neural Comput., vol. 3, pp. 246–257, 1991.
[38] M. T. Musavi, W. Ahmed, K. H. Chan, K. B. Faris, and D. M. Hummels, “On the training of radial basis function classifiers,” Neural Netw., vol. 5, pp. 595–603, 1992.
[39] N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Trans. Syst., Man Cybern., vol. SMC-9, no. 1, pp. 62–66, Jan. 1979.
[40] L. Shen, R. M. Rangayyan, and J. E. Leo Desautels, “Application of shape analysis to mammographic calcifications,” IEEE Trans. Med. Imag., vol. 13, no. 2, pp. 263–274, Jun. 1994.
[41] M. Sonka, V. Hlavac, and R. Boyle, Image Processing, Analysis and Machine Vision. New York: Chapman & Hall, 1993.

M. Alper Selver was born in 1980. He received the B.Sc. degree in electrical and electronics engineering from Gazi University, Ankara, Turkey, in 2002, and the M.Sc. degree from Dokuz Eylül University, Izmir, Turkey, in 2005, where he is currently working toward the Ph.D. degree in the Department of Electrical and Electronics Engineering. Since 2002, he has been a Research Assistant in the Department of Electrical and Electronics Engineering, Dokuz Eylül University. His current research interests include radiological and industrial image processing, software development, artificial neural networks, and statistical learning methods.

Olcay Akay (S’95–M’03) received the B.S. (Hons.) degree from Dokuz Eylül University, Izmir, Turkey, in 1990, the M.S. degree from the University of Michigan, Ann Arbor, in 1993, and the Ph.D. degree from the University of Rhode Island, Kingston, in 2000, all in electrical engineering. He is with the Department of Electrical and Electronics Engineering, Dokuz Eylül University, where he is currently an Assistant Professor. His current research interests include statistical signal processing, time-frequency analysis methods, and artificial neural networks.


Emre Ardalı was born in 1983. He received the B.Sc. degree in electrical and electronics engineering and the M.Sc. degree from Dokuz Eylül University, Izmir, Turkey, in 2005 and 2008, respectively. Since 2005, he has been working at a consumer electronics company in Turkey as a TV Hardware Design Engineer. His current research interests include artificial neural networks, machine learning, and digital signal processing.

Okan Önal was born in 1975. He received the B.Sc. degree in civil engineering from Dokuz Eylül University, Izmir, Turkey, in 1998, and the Ph.D. degree from the Department of Civil Engineering, Dokuz Eylül University, in 2008. He has been working as a Research Assistant at the Department of Civil Engineering since 2002. His current research interests include digital image processing, rock mechanics, and geotechnics.

A. Bahadır Yavuz was born in 1968. He received the B.Sc. degree in geological engineering and the M.Sc. and Ph.D. degrees from Dokuz Eylül University, Izmir, Turkey, in 1992, 1996, and 2001, respectively. Between 1993 and 2008, he was with Torbalı Vocational School, Dokuz Eylül University. He is currently with the Department of Geological Engineering, Dokuz Eylül University. His current research interests include engineering geology, mass and material properties of natural building stones, weathering, stone deterioration, and construction materials.

Gürkan Özden was born in 1965. He received the B.Sc. degree in civil engineering from Istanbul Technical University, Istanbul, Turkey, in 1989, and the Ph.D. degree from Wayne State University, Detroit, MI, in 1999. Since 1991, he has been a Faculty Member in the Department of Civil Engineering, Dokuz Eylül University, Izmir, Turkey. His current research interests include geotechnical earthquake engineering, pile foundations, digital image processing, and foundation engineering.
