Learning and Extracting Edges from Images by a Modified Hopfield Neural Network

Sylvain Chartier
Département de psychologie, Université du Québec à Montréal
[email protected]

Richard Lepage
Département de génie de la production automatisée, École de technologie supérieure
[email protected]

Abstract

This paper introduces a modified unsupervised Hopfield network that can learn the underlying process of an edge detection task from grey level images. After the learning phase, the network performance is tested to ensure that it detects only the significant edges and that it generalises to other images.

1. Introduction

Natural grey level images contain much more information than is needed in automatic recognition tasks. One way to reduce this amount of information is through edge detection. Images resulting from edge detection are much more compact and contain important features that are better suited for various applications [1]. Two main avenues have been developed for important feature extraction. The first one comes from the computer vision field and is based upon the first or second order derivative of a function over the spatial domain. Either a maximum in the first order derivative of the image or a zero crossing of the second order derivative of the image is detected in order to find the important features present in the image. The second one is neurally inspired and uses artificial neural networks in order to discover important features by learning. However, a supervised algorithm is most often used to obtain the desired network behaviour [2], or a competitive model is tuned to respond to appropriate edges [3]. On the other hand, it would be interesting if an unsupervised correlational neural network could develop the appropriate function to detect the significant edges from grey level images without any previous knowledge of the task. Therefore, in this paper we present a modified Hopfield network that can learn to perform unsupervised multi-scale edge detection.

2. Presentation of the network

This model, as any neural network model, is entirely described by its architecture, transmission rule and learning rule. The architecture of the model is illustrated in figure 1.

Figure 1: Illustration of the architecture of the N-unit network.

The transmission rule used in this network is a bipolar signum function defined by the following expressions:

x_{t+1} = sgn[W x_t],  t = 1, …, T    (1)

sgn[a_j] = { 1, if a_j > 0
           { −1, if a_j < 0
           { 0, otherwise    (2)

W represents the weight matrix, x an input vector and a the activation (a = W x_t). We cannot use the original Hopfield learning rule for the simple reason that it is unbounded. Consequently, after learning thousands of different stimuli the network would develop only one dominant eigenvalue and would not be able to accomplish the edge detection task. To overcome this crucial limitation of the model, Chartier and Proulx [4] proposed the following learning rule.
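As a concrete sketch of the dynamics so far (Python with NumPy is assumed here; the paper itself gives no code, and the function names are ours), equations 1 and 2 can be written as:

```python
import numpy as np

def sgn(a):
    """Bipolar signum of equation 2: +1, -1, or 0 elementwise."""
    return np.sign(a)

def transmit(W, x, T=1):
    """Transmission rule of equation 1: iterate x_{t+1} = sgn(W x_t)."""
    for _ in range(T):
        x = sgn(W @ x)
    return x
```

Note that `np.sign` already returns +1, -1 or 0, which matches equation 2 exactly.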

W_{k+1} = W_k + η [ x_0 x_0^T − (W x_t)(W x_t)^T ]    (3)

Where η (η > 0) represents the learning coefficient and t (t > 0) the number of iterations the state vector performs before the weight matrix update. Contrary to the Hopfield network, this rule is iterative and always converges to the pseudo-inverse of the connection matrix W [4]. Thus, the learning algorithm permits an increase in the storage capacity and better-defined basins of attraction, and there is no eigenvalue domination in the resulting weight matrix when the network is placed in an overlearning situation.
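A minimal sketch of one update step of equation 3, assuming x_t is obtained by iterating the transmission rule t times before the update (NumPy assumed; `weight_update` is a hypothetical name):

```python
import numpy as np

def weight_update(W, x0, eta=0.01, t=1):
    """One step of the learning rule of equation 3:
    W <- W + eta * (x0 x0' - (W xt)(W xt)')."""
    xt = x0.copy()
    for _ in range(t):                 # equation 1, iterated t times
        xt = np.sign(W @ xt)
    a = W @ xt                         # activation on the settled state
    return W + eta * (np.outer(x0, x0) - np.outer(a, a))
```

Because both outer products are symmetric and W starts at zero, the weight matrix stays symmetric throughout learning.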

Before testing the network with a recall task, it must adjust its weight matrix through learning. To accomplish this, we first select a grey level image that will serve to train the network. Figure 2 shows some images that can serve for the learning. Those images present a good variety of simple, complex, curved, horizontal and vertical edges.

Figure 2: Examples of images (a-d) that can serve as a learning base.

2.1 Learning Methodology

For the learning, we used image 2a. This image has a dimension of 128x128 pixels. Each pixel is encoded on an 8-bit grey scale (0 to 255). We first rescale the values of the image to the range [-1, 1]. In addition, we scan a window of 3x3 dimensions over the image, so that each window is a 9-dimensional vector. From the image, we extract all the 3x3 subimages to form 15625 input vectors. The simulation was done according to the following procedure:

1. Initialization of the weight matrix (W_ij = 0) and of the learning coefficient (η = 0.01).
2. Random selection of an input vector.
3. Computation of x_t according to equation 1.
4. Computation of the weight matrix update according to equation 3 (t = 1).
5. Repetition of steps 2 to 4 for 2000 learning trials.

After the learning has been accomplished, we can perform a singular value decomposition of the resulting weight matrix.

2.2 Learning Results

Figure 3 shows the resulting eigenvalue spectrum. We see that there are 3 important eigenvalues.

Figure 3: Eigenvalue spectrum of the weight matrix.

The first eigenvalue represents the mean of the distribution and does not give any distinctive cues. On the other hand, eigenvalues 2 and 3 act like discriminant functions. We can visualize their corresponding eigenvectors to see which underlying categorization process is at work. Figure 4 shows that the second eigenvector acts like a horizontal edge detector and the third eigenvector like a vertical edge detector.

Figure 4: Density plots of eigenvectors 2 and 3.

If we look at the real values of each eigenvector, we see that they act like classic Prewitt differentiation masks. What differentiates this model from the Prewitt mask is the nonlinear process in the transmission rule. To study the role of this nonlinearity, we performed several edge detection recall tasks.
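The five-step methodology of section 2.1 can be sketched end to end as follows (NumPy assumed; `extract_windows` and `train` are our hypothetical names, not the paper's):

```python
import numpy as np

def extract_windows(image):
    """Slide a 3x3 window over the image; each window is a 9-vector."""
    h, w = image.shape
    return np.array([image[i:i+3, j:j+3].ravel()
                     for i in range(h - 2) for j in range(w - 2)])

def train(image, n_trials=2000, eta=0.01, rng=np.random.default_rng(0)):
    """Unsupervised learning on the 3x3 subimages of one grey image."""
    x_all = extract_windows(image * 2.0 / 255.0 - 1.0)  # rescale to [-1, 1]
    W = np.zeros((9, 9))                                # step 1
    for _ in range(n_trials):
        x0 = x_all[rng.integers(len(x_all))]            # step 2
        xt = np.sign(W @ x0)                            # step 3 (eq. 1, t = 1)
        a = W @ xt
        W += eta * (np.outer(x0, x0) - np.outer(a, a))  # step 4 (eq. 3)
    return W
```

After training, `np.linalg.svd(W)` (or an eigendecomposition, since W is symmetric) exposes the spectrum discussed in section 2.2.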

3. Edge Detection

Before testing the network, we must take a closer look at how the network behaves in the presence and in the absence of an edge. First, the network outputs are 3x3 windows. The final output should be 1 in the presence of an edge and 0 in its absence. Thus, we connect the outputs of the network to a single unit that sums the window output values. Second, the network will respond not only to the edges of an image but also to the absence of an edge. In the absence of an edge, the outputs of the network will all have the same value (all +1, or all −1). As a result, the final output of the network should be 0 if the sum of the output values is 9 (or −9) and 1 otherwise. These two requirements are accomplished by the function expressed in equation 4:

c(x) = { 0, if Σ_j x_j = 9 or Σ_j x_j = −9
       { 1, otherwise    (4)

Thus, the final response will be 1 if and only if the network has detected an edge. We tested the network with the image illustrated in figure 2a.
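Equation 4 reduces to a one-line decision on the summed window output; a sketch in NumPy (the function name is ours):

```python
import numpy as np

def edge_indicator(window_out):
    """Equation 4: 0 when all nine outputs agree (no edge), 1 otherwise."""
    s = window_out.sum()
    return 0 if s == 9 or s == -9 else 1
```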

3.1 Results using the summation function

Figure 5 shows the resulting output using the summation function. As can be seen, the network did not detect all the significant edges.

Figure 5: Image obtained after edge detection.

In fact, the network only detected edges of dark objects over light ones, as is the case, for example, with the shutters over the wall. The network is not able to detect grey objects over slightly different grey ones, as is the case with the pants over the grass or the window cross separator over the glass. If we take a closer look at the values, we see that the pants have a pixel value of 0.8 and the grass a value of 0.6. These two values are part of the same attraction basin and are therefore indistinguishable from each other.

3.2 Rescaled input vectors

To overcome this limitation, we must rescale the input vectors before the recall process is done. The rescaling is done iteratively so that the network's attention is focused on only a small range of values at a time. This is accomplished by the following function:

x_r = x_0 − 1 + s × t    (5)

Where x_r represents the rescaled input, s (0 < s ≤ 1) the rescaling step, and t the discrete time variable (t from 0 to 2/s). For example, if we want a step of s = 0.2 between each time step, the parameter t will take the values 0, 1, …, 10. Thus the same input vector will be presented 11 times, each time from a different scaling point. To obtain the final image, we sum all the images obtained at the different scales and divide by 2/s + 1. We tested the network with this preprocessing on the edge detection task, using the same image illustrated in figure 2a. For the simulations, we wanted the network to be able to distinguish small contrast variations, so we set s to 0.1.

3.3 Results using rescaled inputs

Figure 6 shows examples of images obtained by rescaling the inputs at different time steps.

Figure 6: Images obtained after edge detection as a function of the time step t (t = 2, 5, 8, 10, 13, 16).

We can see from those images that with the rescaling procedure the network's attention is focused on a small range of values. Also, the image at time step t = 10 is the same as the one illustrated in figure 5, for the obvious reason that at this time step there is no rescaling. Figure 7 shows the final image obtained from the sum of the 21 rescaled images.
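The multi-scale recall of equations 4 and 5 can be sketched as follows (a simplified reading under our assumptions: the image is already rescaled to [-1, 1], `W` is a trained 9x9 weight matrix, and the function name is hypothetical):

```python
import numpy as np

def rescaled_edge_map(image, W, s=0.1):
    """Average the edge decisions (equation 4) over the 2/s + 1
    rescaled presentations of each 3x3 window (equation 5)."""
    h, w = image.shape
    n_steps = round(2 / s) + 1                 # t = 0, 1, ..., 2/s
    out = np.zeros((h - 2, w - 2))
    for t in range(n_steps):
        for i in range(h - 2):
            for j in range(w - 2):
                x0 = image[i:i+3, j:j+3].ravel() - 1.0 + s * t   # equation 5
                y = np.sign(W @ x0)
                out[i, j] += 0.0 if abs(y.sum()) == 9 else 1.0   # equation 4
    return out / n_steps
```

With s = 0.1 this presents each window 21 times, matching the 21 images summed in the text.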

Figure 7: Illustration showing the edges of image 2a.

We can see in this case that too many edges are detected in the image. By selecting a narrow range of attention, the network found edges that are not significant for further processing. To overcome this difficulty, the network should output only the edges that are present in most of the rescaled images. This is simply done by adding a threshold.

3.4 Threshold

The desired threshold must be flexible from one image to another. Also, it must be related to the number of time steps. In regard to those requirements, the final output y is obtained by the following expression:

y = { 1, if Σ_{t=0}^{2/s} x_t > 2/s − θ
    { 0, otherwise    (6)

Where θ (θ ≥ 0) is the threshold parameter. If we repeat the simulations, but this time setting the threshold at θ = 2, the network should retain only the significant contours.

3.5 Results using a threshold

Figure 8 illustrates the edges detected with the reference image 2a. We can see that in this case the performance of the network is adequate. By appropriately setting the threshold, only the significant edges remain.

Also, if we look at the generalization performance of the network over the other images, leaving θ = 2, we see from the results illustrated in figure 9 that the performance is still good. Finally, it should be noted that the performance is similar to the one obtained with classic convolution masks.

Figure 9: Illustration of detected edges from images 2b, 2c and 2d.

4. Discussion

In light of the previous results, the simple modified Hopfield network is able to accurately detect edges from grey level images. This is done by adding some preprocessing and postprocessing. However, those treatments are there only to increase the flexibility of the network over various edge detection situations. Also, by implementing rescaled inputs we increase the computation time by a factor of 2/s. However, the task can easily be implemented in parallel, so the whole process remains fast.
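Equation 6 amounts to a per-pixel comparison against the number of time steps; a sketch (assuming `edge_sum` holds, for each pixel, the sum of the binary decisions x_t over all time steps, before any averaging):

```python
import numpy as np

def threshold_edges(edge_sum, s=0.1, theta=2):
    """Equation 6: output 1 only where the pixel was classified as an
    edge in enough of the 2/s + 1 rescaled images."""
    n = round(2 / s)                    # t runs from 0 to 2/s
    return (edge_sum > n - theta).astype(int)
```

With the default s = 0.1 and θ = 2, a pixel must be flagged as an edge in at least 19 of the 21 rescaled images to survive.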

5. Conclusion

The present study has demonstrated that a modified Hopfield network can learn and detect edges from grey level images. We also demonstrated that the weight matrix acts like a Prewitt mask obtained from an unsupervised process. Finally, we showed that by adding a rescaling function and a threshold function the network achieves good flexibility in detecting significant edges.

Figure 8: Illustration of the detected edges of image 2a.

6. References

[1] D. Marr and E. Hildreth, "Theory of edge detection," Proceedings of the Royal Society of London, Series B, vol. 207, 1980, pp. 187-217.

[2] A. Joshi and C.H. Lee, "Backpropagation learns Marr's operator," Biological Cybernetics, vol. 70, 1993, pp. 65-73.

[3] M.A. Cohen and S. Grossberg, "Neural dynamics of brightness perception: Features, boundaries, diffusion, and resonance," Perception and Psychophysics, vol. 36, 1984, pp. 428-456.

[4] S. Chartier and R. Proulx, "A new online unsupervised learning rule for the BSB model," in Proc. IJCNN, 2001, pp. 448-453.
