XMUBET with CONSENT: Pixel Hostility Induced Multiscale Object Extractor
Siddhartha Bhattacharyya 1, Paramartha Dutta 2
1 Department of Information Technology, Kalyani Government Engineering College, Kalyani-741235, India, [email protected]
2 Department of Computer Science and Technology, Kalyani Government Engineering College, Kalyani-741235, India, [email protected]
Abstract
A MUBET activation function induces in a single multilayer self organizing neural network (MLSONN) the capability of mapping image information into multiple scales of gray. This article proposes an eXtended MUltilevel BETa (XMUBET) activation function with CONtext SENsitive Thresholding (CONSENT), based on image neighborhood context information, for multiscale object extraction using a single MLSONN. A pixel hostility index is proposed to account for the neighborhood context information in an image. The threshold values for the XMUBET are derived from the multiscale pixel neighborhood contexts by computing the hostility indices of the image pixels. A single MLSONN, with error compensation based on the linear indices of fuzziness of the output layer outputs, is then employed to extract multiscale objects from a multiscale image scene. An application of the proposed XMUBET with CONSENT is demonstrated with two real life multiscale images. The results are substantiated with the standard correlation coefficients between the original and the extracted images.
1. Introduction
Binary object extraction can be handled with neural network approaches [1][2][3]. Different neural network topologies exist for these tasks. These network architectures essentially map image information into binary outputs guided by a characteristic transfer/activation function. However, neither the existing topologies nor the existing transfer functions can be used effectively for the extraction of multiscale objects. To incorporate multiscale detection capability in the networks, either the existing network topologies or the transfer functions need to be modified. Of the various attempts in this direction, some have used a collection of several MLSONNs in a layered architecture for this purpose [4]. Here, the number of networks in the collection depends on the number of classes in the multiscale image scene. This, however, implies a proportional increase in the network complexity with the number of classes. Other attempts [5] centered on using a multilevel sigmoidal activation function as the transfer function of a single MLSONN, so that multiscale object extraction is achieved without increasing the network complexity. An attempt was also made using the
0-7803-8894-1/04/$20.00 2004 IEEE
MUBET activation function [6] to induce multiscaling capability in a single MLSONN for multiscale object extraction. The characteristics of the MUBET activation function are guided by a threshold value, which in turn is determined by two parameters. In [6], specific values of these parameters were chosen, resulting in fixed threshold values for the transfer function. This choice presumes a homogeneous gray scale distribution in the image scene, for which uniform and fixed thresholding is adequate. Real life images, however, lack homogeneity in gray scale distribution. On the contrary, they exhibit a fair amount of heterogeneity in gray scales, which can be appreciated from the histogram of an image. This implies that variable and nonuniform thresholding over the entire image gray scale gamut is required for efficient object extraction. To be precise, the performance of an object extractor depends greatly on this image heterogeneity: the more of it an object extractor can incorporate in its transfer function through the threshold values, the better it performs.
In this article, we propose an extended MUBET transfer function (XMUBET) which takes the image context information into account to decide the appropriate threshold values. The context sensitive threshold (CONSENT) values are computed using the pixel-neighbor contribution in a second order neighborhood. This contribution is determined by a proposed hostility index for a pixel. The resulting XMUBET with CONSENT acts as the transfer function for a single MLSONN based object extractor. An application of the approach is demonstrated with two real life multiscale images.
2. Mathematical Preliminaries
This section briefly introduces fuzzy sets, the linear index of fuzziness and the hostility index.
A. Fuzzy sets and subsets
A fuzzy set A [7] [8] contains elements $\{x_1, x_2, x_3, \ldots, x_n\}$, each with a certain degree of membership $\mu_A(x_i)$, $i = 1, 2, 3, \ldots, n$, which lies in [0,1]. A higher value of $\mu_A(x_i)$ implies strong containment, whereas a lower $\mu_A(x_i)$ implies weak containment of the element $x_i$ in the set A. The height (hgt) of a fuzzy set is the maximum membership value of its elements. If, for a fuzzy set, hgt = 1, then it is a normal fuzzy set,
ISSNIP 2004
otherwise, it is referred to as a subnormal fuzzy set. A normal fuzzy set is a superset of several nonempty subnormal fuzzy subsets. A subnormal fuzzy subset ($A_s$) can be converted to its normalized equivalent by means of the normalization operator given by,
$$\mathrm{Norm}_{A_s}(x) = \frac{A_s(x)}{\mathrm{hgt}(A_s)} \tag{1}$$
For a subnormal fuzzy subset with support [L, U], the normalization operator is expressed as,
$$\mathrm{Norm}_{A_s}(x) = \frac{A_s(x) - L}{U - L} \tag{2}$$
The corresponding denormalization can be achieved by
$$\mathrm{Denorm}_{A_s}(x) = L + (U - L)\,\mathrm{Norm}_{A_s}(x) \tag{3}$$
The index of fuzziness [2] of a fuzzy set A having n supporting points is defined as
$$\nu(A) = \frac{2}{n^k}\, d(A, \underline{A}) \tag{4}$$
where $d(A, \underline{A})$ denotes the distance between the fuzzy set A and its nearest ordinary set $\underline{A}$. An ordinary set $\underline{A}$ nearest to the fuzzy set A is defined as
$$\mu_{\underline{A}}(x) = \begin{cases} 0 & \text{if } \mu_A(x) \le 0.5 \\ 1 & \text{if } \mu_A(x) > 0.5 \end{cases} \tag{5}$$
The value of k depends on the type of distance used. k = 1 is used for the generalized Hamming distance. The corresponding index of fuzziness is called the linear index of fuzziness $\nu_l(A)$. It is given by,
$$\nu_l(A) = \frac{2}{n} \sum_{i=1}^{n} \left|\mu_A(x_i) - \mu_{\underline{A}}(x_i)\right| \tag{6}$$
i.e.,
$$\nu_l(A) = \frac{2}{n} \sum_{i=1}^{n} \min\{\mu_A(x_i), (1 - \mu_A(x_i))\} \tag{7}$$
B. Hostility Index
A pixel (i, j) in a lattice/image of dimension (M × N) may be related to its neighboring pixels by different orders of neighborhood. For a first order neighborhood, there are four neighbors to a pixel, whereas for a second order neighborhood, there are eight. The distribution of gray levels of the pixels in a neighborhood reflects the degree of homogeneity/heterogeneity. The closer the gray values of a pixel and its neighbors, the higher the homogeneity in the neighborhood and the less hostile the pixel is to its neighbors. On the contrary, a heterogeneous neighborhood implies sharp deviation in the gray values of the neighborhood pixels, and the more hostile a pixel is to its neighbors. This homogeneity/heterogeneity in a second order neighborhood can be accounted for by a hostility index defined as,
$$\zeta = \frac{1}{8} \sum_{i=1}^{8} \frac{|p - q_i|}{|p + 1| + |q_i + 1|} \tag{8}$$
where p is the gray value of the candidate pixel and $q_i$, $i = 1, 2, 3, \ldots, 8$ are the gray values of its neighbors. ζ lies in [0,1]. A higher value of ζ implies lower pixel level homogeneity and a lower value of ζ implies higher pixel level homogeneity. This pixel hostility index is used to determine the pixel neighborhood context sensitive thresholding (CONSENT) for the proposed extended MUBET (XMUBET) activation function.
3. XMUBET with CONSENT
The MUBET activation function [6] is derived from the generalized beta activation function, which is given by,
$$f(t) = \int_0^t K x^{\alpha}(1 - x)^{\beta}\,dx, \qquad \alpha, \beta \ge 0,\ t \in [0, 1] \tag{9}$$
where t represents the class widths and K is a normalizing constant such that,
$$K = \frac{1}{\int_0^1 x^{\alpha}(1 - x)^{\beta}\,dx} \tag{10}$$
The function is symmetrical around the α and β parameters, which control the shape and slope of the function. The different levels of the MUBET activation function (shown in Fig. 1(a)) are derived from this generalized form as,
$$f(t) \leftarrow f(t) + (\gamma - 1) f(c), \qquad (\gamma - 1)c \le t < \gamma c \tag{11}$$
where γ represents the gray scale index and 1 ≤ γ ≤ L, the number of gray scale objects. Here, c represents the gray scale contribution (assumed equal for all scales). Different gray scale responses are obtained by varying the α and β parameters.
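Fig. 1: (a) MUBET with fixed point thresholding at $\frac{\alpha_1}{\alpha_1+\beta_1}$ and $\frac{\alpha_2}{\alpha_2+\beta_2}$ (b) XMUBET with CONSENT based thresholding at $\frac{\alpha_{c_1}}{\alpha_{c_1}+\beta_{c_1}}$ and $\frac{\alpha_{c_2}}{\alpha_{c_2}+\beta_{c_2}}$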
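The generalized beta activation function of eqns. 9 and 10 can be evaluated numerically. The following is a minimal sketch and not the authors' implementation: the function names `simpson` and `beta_activation` are hypothetical, and composite Simpson quadrature is an assumed choice of integration rule. The level shifting of eqn. 11 is deliberately left out.

```python
def simpson(g, a, b, n=1000):
    """Composite Simpson's rule for the integral of g over [a, b] (n must be even)."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        # Odd interior nodes carry weight 4, even interior nodes weight 2.
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def beta_activation(t, alpha, beta):
    """Generalized beta activation f(t) of eqn. 9, normalized by K of eqn. 10."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be non-negative")

    def kernel(x):
        return x ** alpha * (1.0 - x) ** beta

    K = 1.0 / simpson(kernel, 0.0, 1.0)   # eqn. 10
    return K * simpson(kernel, 0.0, t)    # eqn. 9


# Sample the response for equal parameters (alpha = beta = 1).
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, beta_activation(t, 1, 1))
```

With equal parameters the sampled response satisfies f(t) + f(1 - t) = 1, which is the symmetry visible in Fig. 1(a).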
For a fixed value of the α and β parameters, the different levels of the function operates in a fixed thresholding mode. As regards to an image context, this is valid if and only if the pixel neighborhood regions are homogeneous. However, in real life images, there is always some heterogeneity in a neighborhood region. This local level heterogeneity of the image context, which is reflected by the pixel hostility indices, influences the threshold values for the different levels of the MUBET activation function. To be specific, the context sensitive threshold (CONSENT) value for a particular level is a linear function of the pixel hostility and is given by, αi αi−1 αi+1 ) − f( ) = (1 − ζ) f ( αi + βi αi+1 + βi+1 αi−1 + βi−1 (12) where (αi−1 , αi , αi+1 ) and (βi−1 , βi , βi+1 ) are the parameters of the function at the (i−1)th , ith and the (i+1)th levels. With this CONSENT value, the generalized beta activation function takes the form, f (t) =
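$$\int_0^t K x^{\alpha}(1 - x)^{\alpha\tau}\,dx, \qquad \alpha, \beta \ge 0,\ t \in [0, 1] \tag{13}$$
where $\tau = \frac{\beta}{\alpha}$ is the CONSENT value. The corresponding normalizing constant (K) is obtained as,
$$K = \frac{1}{\int_0^1 x^{\alpha}(1 - x)^{\alpha\tau}\,dx} \tag{14}$$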
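A short worked note may help to see how the CONSENT value moves the threshold. It is a sketch under the assumptions α > 0 and τ > 0; the symbols g and x* are introduced here for illustration only and do not appear in the original. The integrand of eqn. 13 peaks at x*, which is where f rises most steeply, and the closed form of K follows from the standard Beta integral.

```latex
% Integrand of eqn. 13 with \beta = \alpha\tau
g(x) = x^{\alpha}(1 - x)^{\alpha\tau}

% Stationary point of g, i.e. the steepest point of f
g'(x) = x^{\alpha-1}(1 - x)^{\alpha\tau-1}\,\bigl[\alpha(1 - x) - \alpha\tau x\bigr] = 0
\;\Longrightarrow\;
x^{*} = \frac{1}{1 + \tau} = \frac{\alpha}{\alpha + \beta}

% Examples: \tau = 1 gives x^{*} = 1/2 (symmetric case), \tau = 3 gives x^{*} = 1/4

% Normalizing constant of eqn. 14 in closed form
K = \frac{\Gamma(\alpha + \alpha\tau + 2)}{\Gamma(\alpha + 1)\,\Gamma(\alpha\tau + 1)}
```

For α = 1 and τ = 1 this gives K = 6, so any τ other than 1 shifts the steepest point away from 1/2.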
The XMUBET activation function with CONSENT (shown in Fig. 1(b)) is derived from the generalized form of eqn. 13 using eqn. 11. From Fig. 1(a) and 1(b), it is evident that, in contrast to the MUBET activation function, the XMUBET activation function lacks symmetry. This is solely because the local level image neighborhood heterogeneity has been incorporated through the pixel hostility index. The XMUBET activation function thus replicates the image context information more aptly than the MUBET activation function.
4. MLSONN based Multiscale Object Extractor using XMUBET with CONSENT
A single MLSONN [6] using a MUBET activation function and comprising an input layer, any number of hidden layers and an output layer can be used to extract multiscale objects from a multiscale image scene by means of self organization. The same network can be employed to extract multiscale objects from a multiscale image scene with heterogeneity in image context if it uses the XMUBET activation function with CONSENT based thresholding. The neurons, which correspond to the image pixels, are not connected to each other within the same layer. However, each neuron in a layer is connected to the corresponding neuron in the previous layer and to its neighbors following a neighborhood based topology. There are connections between the output layer and the input layer neurons for feedback of outputs. The inputs to the input layer of the network lie in [0,1] and are proportional to the pixel gray values. All the interconnection weights are initially set to 1. The schematic representation of a three-layer network architecture for a fully connected second order neighborhood is shown in Fig. 2.
Fig. 2: Fully connected three-layer SONN (with second order connectivity)
The total input ($I_i$) to any neuron in any layer is given by
$$I_i = \sum_j w_{ij}\, o_j \tag{15}$$
where $o_j$ is the output of the $j$th neuron in the previous layer and $w_{ij}$ is the connection weight between the $i$th node of one layer and the $j$th node of the previous layer. The output of a node i is obtained as,
$$o_i = f(I_i) \tag{16}$$
where f is the XMUBET activation function with CONSENT. These outputs act as inputs for the next layer of the network. In this way, the network inputs are propagated from one layer to the next. The output layer outputs of the network can be regarded as a fuzzy superset of several subnormal fuzzy subsets. Since the network operates in a self supervised mode, the system error is determined by computing the linear index of fuzziness in brightness of each of these fuzzy subsets. The subnormal fuzzy subsets are normalized to their normal equivalents and the corresponding linear indices of fuzziness are evaluated with respect to their nearest ordinary sets. The linear indices of fuzziness are then denormalized so as to obtain the system error. These errors are backpropagated to adjust the interconnection weights between the neurons at different layers. The outputs are then fed back to the input layer. This is carried on until the system error is reduced to zero or to some tolerable limit. On stabilization, the input image is segregated into different multiscale regions.
5. Proposed Object Extraction Methodology
The entire technique for multiscale object extraction using a single three layer SONN is accomplished in three phases, which are discussed below. The flow diagram is shown in Fig. 3.
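The fuzziness based system error described in Section 4 rests on the normalization of eqn. 2, the denormalization of eqn. 3 and the linear index of fuzziness of eqn. 7. The sketch below covers only these three building blocks. The function names and the sample band of output values are hypothetical, and how the denormalized indices are combined into a single error term is not spelled out here, so that step is not attempted.

```python
def normalize(values, low, high):
    """Eqn. 2: map a subnormal subset with support [low, high] onto [0, 1]."""
    if high <= low:
        raise ValueError("support must satisfy low < high")
    return [(v - low) / (high - low) for v in values]


def denormalize(values, low, high):
    """Eqn. 3: map normalized values back onto the support [low, high]."""
    return [low + (high - low) * v for v in values]


def linear_index_of_fuzziness(memberships):
    """Eqn. 7: (2/n) times the sum of min(mu, 1 - mu) over n supporting points."""
    n = len(memberships)
    if n == 0:
        raise ValueError("at least one membership value is required")
    return 2.0 / n * sum(min(mu, 1.0 - mu) for mu in memberships)


# Hypothetical output-layer values falling in one gray level band [0.25, 0.5].
band = [0.25, 0.30, 0.375, 0.45, 0.5]
normalized = normalize(band, 0.25, 0.5)
print(linear_index_of_fuzziness(normalized))
```

A crisp band (all memberships 0 or 1 after normalization) yields an index of 0, while memberships of 0.5 yield the maximum value of 1, which is why the index can serve as an error that vanishes on stabilization.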
Fig. 3: Flow diagram of the object extraction process
A. Determination of Pixel Hostility Index
The pixel level hostility indices, which are indicative of the level of homogeneity/heterogeneity in the neighborhood regions of an image scene, are computed using eqn. 8. These pixel hostility indices induce non-uniformity in the thresholding values of the XMUBET activation function by determining the CONSENT values of the activation function.
B. Designing XMUBET with CONSENT
The context sensitive thresholding values pertaining to the pixel neighborhood contributions in an image scene, determined through the pixel hostility indices, are applied to the generalized beta activation function for designing the XMUBET activation function. The different levels of the XMUBET activation function are obtained from this generalized beta activation function. The resultant XMUBET, characterized by non-uniform, variable and context sensitive thresholding (CONSENT), is used by a single three layer SONN for multiscale object extraction from a multiscale image scene.
C. Object extraction by a single three layer SONN
A single three layer SONN comprising an input layer, a hidden layer and an output layer is used to extract multiscale objects from a multiscale image scene by using a XMUBET activation function (with CONSENT values) as the transfer function. The linear indices of fuzziness of the different subnormal fuzzy subsets of gray levels at the output layer are used as the system error. These are evaluated by computing the corresponding indices of fuzziness of the normalized versions of these subnormal fuzzy subsets and denormalizing them back to the respective subnormal domains. This system error is back propagated to adjust the interconnection weights. The outputs are then fed back to the input layer for further processing until the system error gets reduced to some tolerable limit. Finally, the input image scene is segregated into several multiscale regions corresponding to the number of levels of the XMUBET activation function.
6. Results
The proposed technique has been applied on several real life multiscale images. However, an application of the proposed CONSENT based XMUBET activation function for multiscale object extraction is presented using a Lena image (fig. 4(a)) and a biomedical image (fig. 5(a)). The extracted Lena and biomedical images using 4, 6 and 8 class extraction are shown in fig. 4(b), 4(c), 4(d) and fig. 5(b), 5(c), 5(d) respectively.
Fig. 4: Lena images (a) original image (b) 5 class extraction (c) 7 class extraction (d) 9 class extraction
Fig. 5: Biomedical images (a) original image (b) 5 class extraction (c) 7 class extraction (d) 9 class extraction
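The pixel hostility index of eqn. 8, on which the CONSENT values depend, can be computed for a pixel with a short routine. This is a minimal sketch under stated assumptions: the function name `hostility_index` is hypothetical, the image is taken to be a 2-D list of non-negative gray values, and border pixels are rejected because eqn. 8 presumes all eight neighbors and no border policy is given.

```python
def hostility_index(image, row, col):
    """Eqn. 8 for one interior pixel of a 2-D list of non-negative gray values."""
    rows, cols = len(image), len(image[0])
    if not (0 < row < rows - 1 and 0 < col < cols - 1):
        # Assumption: only pixels with a full second order neighborhood are handled.
        raise ValueError("pixel must have all eight neighbors")
    p = image[row][col]
    total = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the candidate pixel itself
            q = image[row + dr][col + dc]
            total += abs(p - q) / (abs(p + 1) + abs(q + 1))
    return total / 8.0


# A flat patch and an isolated bright pixel as two extreme neighborhoods.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
spike = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
print(hostility_index(flat, 1, 1), hostility_index(spike, 1, 1))
```

The flat patch gives an index of zero (fully homogeneous), whereas the isolated bright pixel gives a value close to, but below, one, in line with ζ lying in [0,1].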
A. Choice of XMUBET parameters
The XMUBET activation function is characterized by the α and the β parameters. While generating the XMUBET activation function, the α parameter is chosen to be unity. The pixel level hostility indices determine the CONSENT values specific to a pixel neighborhood. These CONSENT values along with the α parameter are used to determine the β parameter. The nature of the XMUBET activation function is thus guided by the β parameter through the CONSENT values.
B. Evaluation of image quality
For the evaluation of the quality of the extracted images, we have evaluated the standard measure of the correlation coefficient (ρ) [5] between the original and the extracted images, which is given by,
$$\rho = \frac{\frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}(X_{ij} - \bar{X})(Y_{ij} - \bar{Y})}{\sqrt{\frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}(X_{ij} - \bar{X})^2}\ \sqrt{\frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}(Y_{ij} - \bar{Y})^2}} \tag{17}$$
where $X_{ij}$, $1 \le i, j \le n$ and $Y_{ij}$, $1 \le i, j \le n$ are the original and the processed images respectively of dimension n × n, and $\bar{X}$ and $\bar{Y}$ are their respective average intensity values. The values of ρ for the two test images for different number of classes (L=4, 6 and 8) are shown in Table 1.
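The correlation coefficient of eqn. 17 is the usual Pearson correlation taken over all pixel pairs. A minimal sketch follows, assuming both images are given as n × n lists of gray values; the function name `correlation_coefficient` is hypothetical and the small images in the usage lines are made up for illustration.

```python
import math


def correlation_coefficient(original, extracted):
    """Eqn. 17 for two n x n images given as 2-D lists of gray values."""
    n = len(original)
    if (len(extracted) != n
            or any(len(r) != n for r in original)
            or any(len(r) != n for r in extracted)):
        raise ValueError("both images must be n x n")
    xs = [v for r in original for v in r]
    ys = [v for r in extracted for v in r]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    # Numerator and the two variance terms, each scaled by 1/n^2 as in eqn. 17.
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n ** 2
    var_x = sum((x - x_mean) ** 2 for x in xs) / n ** 2
    var_y = sum((y - y_mean) ** 2 for y in ys) / n ** 2
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / math.sqrt(var_x * var_y)


image = [[0, 1], [2, 3]]
inverted = [[3, 2], [1, 0]]
print(correlation_coefficient(image, image))
print(correlation_coefficient(image, inverted))
```

An image compared with itself correlates at (numerically) one and with its gray level inverse at minus one, so values near one in Table 1 indicate close agreement between the original and the extracted image.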
Table 1: Correlation coefficients for the extracted images with CONSENT based thresholding
L    Lena Image    Biomedical Image
4    0.932443      0.986238
6    0.937130      0.986814
8    0.938957      0.987096
From the table it is clear that the extraction efficiencies, as indicated by the standard correlation coefficients, are fairly high for both images. It is empirically observed that, for any choice of the number of classes (L), there is significant improvement in the correlation coefficients on resorting to variable CONSENT based thresholding vis-a-vis its fixed counterpart. For example, with 8 class extraction and fixed thresholding, the correlation coefficients were found to be 0.909, 0.906 and 0.903 for the Lena image and 0.973, 0.973 and 0.972 for the Biomedical image with the (α, β) parameters chosen as (1,1), (2,2) and (3,3) respectively, as against the figures appearing in the final row of Table 1.
7. Discussion and Conclusions
A CONSENT based XMUBET activation function for multiscale object extraction using a single MLSONN (with a three layer SONN as a specific example) is introduced in this article. The CONSENT threshold values of the activation function are derived from the neighborhood context information in the form of pixel level hostilities. This activation function acts as the characteristic transfer function of a three layer SONN for segmenting the multiscale image information into a number of different homogeneous multiscale object regions. The CONSENT values are used to determine the β parameters of the XMUBET activation function. However, the performance of the activation function still remains dependent on the choice of the α parameter. Methods need to be investigated to determine the α parameters from the image context information. The authors are currently engaged in this direction.
References
[1] J. Hertz, A. Krogh, R. G. Palmer, Introduction to the theory of neural computation, Addison-Wesley, 1991.
[2] A. Ghosh, N. R. Pal, S. K. Pal, “Self-organization for object extraction using multilayer neural network and fuzziness measures”, IEEE Transactions on Fuzzy Systems, Vol. 1, No. 1, 1993, pp. 54-68.
[3] M. N. Nasrabadi, W. Li, “Object recognition by a Hopfield neural network”, IEEE Transactions on Systems, Man and Cybernetics, Vol. 21, No. 6, 1991, pp. 1523-1535.
[4] A. Ghosh, A. Sen, Soft Computing Approach To Pattern Recognition And Image Processing, World Scientific, 2002.
[5] S. Bhattacharyya, P. Dutta, U. Maulik, “Multi-scale object extraction using self organizing neural network with a multi-level sigmoidal activation function”, In: Proc. Fifth Intl. Conference on Advances in Pattern Recognition, India, 2003, pp. 435-438.
[6] P. Dutta, S. Bhattacharyya, K. Dasgupta, “Multi-scale object extraction using a self organizing neural network with a multi-level beta activation function”, In: Proc. Intl. Conference on Intelligent Sensing and Information Processing, India, 2004, pp. 139-142.
[7] L. A. Zadeh, “Fuzzy Sets”, Inform. and Control, Vol. 8, 1965, pp. 338-353.
[8] T. J. Ross, T. Ross, Fuzzy Logic With Engineering Applications, McGraw Hill College Div., 1995.