Representing Non-Rigid Objects with Neural Networks

José García-Rodríguez, [email protected]
Francisco Flórez-Revuelta, [email protected]
Juan Manuel García-Chamizo, [email protected]
Dept. of Computer Technology, University of Alicante, Spain
INTRODUCTION

Self-organising neural networks try to preserve the topology of an input space by means of their competitive learning. This capacity has been used, among other applications, for the representation of objects and their motion. In this work we use a kind of self-organising network, the Growing Neural Gas, to represent deformations in objects along a sequence of images. As a result of an adaptive process, the objects are represented by a topology representing graph that constitutes an induced Delaunay triangulation of their shapes. These maps adapt to changes in the topology of the objects without resetting the learning process.
BACKGROUND

Self-organising maps, by means of competitive learning, adapt the reference vectors of the neurons as well as the interconnection network among them, obtaining a mapping that tries to preserve the topology of an input space. Besides, they are capable of a continuous re-adaptation process, even if new patterns are entered, with no need to reset the learning. These capacities have been used for the representation of objects (Flórez, García, García & Hernández, 2001) (Figure 1) and their motion (Flórez, García, García & Hernández, 2002) by means of the Growing Neural Gas (GNG) (Fritzke, 1995), which has a more flexible learning process than other self-organising models, such as Kohonen maps (Kohonen, 2001).
Fig. 1. Representation of two-dimensional objects with a self-organising network
These two applications, the representation of objects and of their motion, are in many cases subject to temporal constraints, which makes it interesting to accelerate the learning process. In computer vision applications the finalisation condition of the GNG algorithm is commonly defined as the insertion of a predefined number of neurons. The choice of this number can affect the quality of the adaptation, measured as the topology preservation of the input space (Martinetz & Schulten, 1994).

In this work GNG has been used to represent the shape deformations of two-dimensional objects in sequences of images, obtaining a topology representing graph that can be used for multiple tasks such as representation, classification or tracking. When the deformations in the topology of the objects are small and gradual between consecutive frames of a sequence, we can use the information of the previous maps to place the neurons without resetting the learning process. Using this feature of GNG we achieve a high acceleration of the representation process.

One way of selecting points of interest in 2D shapes is to use a topographic mapping in which a low-dimensional map is fitted to the high-dimensional manifold of the shape while preserving the topographic structure of the data. A common way to achieve this is to use self-organising neural networks, where input patterns are projected onto a network of neural units such that similar patterns are projected onto adjacent units in the network, and vice versa. As a result of this mapping, a representation of the input patterns is obtained that allows the similarity relations of the input patterns to be exploited in post-processing stages. Such models have been successfully used in applications such as speech processing (Kohonen, 2001), robotics (Ritter & Schulten, 1986), (Martinetz, Ritter & Schulten, 1990) and image processing (Nasrabadi & Feng, 1988). However, the most common approaches are not able to provide good neighbourhood and topology preservation if the logical structure of the input pattern is not known a priori. In fact, the most common approaches specify in advance the number of neurons in the network and a graph that represents the topological relationships between them, for example a two-dimensional grid, and then seek the best match to the given input pattern manifold. When the input manifold does not match this predefined structure, the networks fail to provide good topology preservation, as happens for example with Kohonen's algorithm.
REPRESENTATION AND TRACKING OF NON-RIGID OBJECTS WITH TOPOLOGY PRESERVING NEURAL NETWORKS

This section is organised as follows: first we provide a detailed description of the topology learning algorithm GNG. Next, an explanation of how GNG can be applied to represent objects that change their shapes in a sequence of images is given. Finally, a set of experimental results using GNG to represent different input spaces is presented. The approach presented in this paper is based on self-organising networks trained with the Growing Neural Gas learning method (Fritzke, 1995), an incremental training algorithm. The links between the units in the network are established through competitive Hebbian learning (Martinetz, 1994). As a result, the algorithm can be used in cases where the topological structure of the input pattern is not known a priori, and it yields topology preserving maps of the feature manifold (Martinetz & Schulten, 1994). Recent studies have presented modifications of the original GNG algorithm to improve the robustness of cluster analysis (Cselényi, 2005), (Cheng & Zell, 2000), (Qin & Suganthan, 2004), (Toshihiko, Iwasaki & Sato, 2003), but none of them uses the structure of the map as a starting point to represent deformations in a sequence of object shapes.
Growing Neural Gas

With Growing Neural Gas (GNG) (Fritzke, 1995) a growth process takes place from a minimal network size, and new units are inserted successively using a particular type of vector quantisation (Kohonen, 2001). To determine where to insert new units, local error measures are gathered during the adaptation process, and each new unit is inserted near the unit with the highest accumulated error. At each adaptation step a connection between the winner and the second-nearest unit is created, as dictated by the competitive Hebbian learning algorithm. This is continued until an ending condition is fulfilled, for example the evaluation of the optimal network topology based on some measure. The ending condition could also be the insertion of a predefined number of neurons or a temporal constraint. In addition, in GNG networks the learning parameters are constant in time, in contrast to other methods whose learning is based on decaying parameters. In the remainder of this section we describe the Growing Neural Gas algorithm and the ending condition as used in this work.

The network is specified as:

- A set $N$ of nodes (neurons). Each neuron $c \in N$ has an associated reference vector $w_c \in \mathbb{R}^d$. The reference vectors can be regarded as positions in the input space of the corresponding neurons.
- A set of edges (connections) between pairs of neurons. These connections are not weighted, and their purpose is to define the topological structure. An edge aging scheme is used to remove connections that become invalid due to the motion of the neurons during the adaptation process.

The GNG learning algorithm that adapts the network to the input manifold is as follows:

1. Start with two neurons $a$ and $b$ at random positions $w_a$ and $w_b$ in $\mathbb{R}^d$.
2. Generate a random input pattern $\xi$ according to the data distribution $P(\xi)$. In our case, since the input space is 2D, the input pattern is the $(x, y)$ coordinate of a point belonging to the object. Typically, for the training of the network we generate 1000 to 10000 input patterns, depending on the complexity of the input space.
3. Find the nearest neuron (winner neuron) $s_1$ and the second-nearest neuron $s_2$ using the squared Euclidean distance.
4. Increase the age of all the edges emanating from $s_1$.
5. Add the squared distance between the input signal and the winner neuron to the error counter of $s_1$:

$$\Delta error(s_1) = \left\| w_{s_1} - \xi \right\|^2 \qquad (1)$$

6. Move the winner neuron $s_1$ and its topological neighbours (neurons connected to $s_1$) towards $\xi$ by fractions $\epsilon_w$ and $\epsilon_n$, respectively, of the total distance:

$$\Delta w_{s_1} = \epsilon_w \, (\xi - w_{s_1}) \qquad (2)$$

$$\Delta w_{s_n} = \epsilon_n \, (\xi - w_{s_n}) \quad \text{for all neighbours } s_n \text{ of } s_1 \qquad (3)$$

7. If $s_1$ and $s_2$ are connected by an edge, set the age of this edge to 0. If the edge does not exist, create it.
8. Remove the edges older than $a_{max}$. If this results in isolated neurons (without emanating edges), remove them as well.
9. Every $\lambda$ input signals generated, insert a new neuron as follows:
   - Determine the neuron $q$ with the maximum accumulated error.
   - Insert a new neuron $r$ halfway between $q$ and its neighbour $f$ with the largest error variable:

$$w_r = 0.5 \, (w_q + w_f) \qquad (4)$$

   - Insert new edges connecting the neuron $r$ with neurons $q$ and $f$, and remove the old edge between $q$ and $f$.
   - Decrease the error variables of neurons $q$ and $f$ by multiplying them with a constant $\alpha$. Initialise the error variable of $r$ with the new value of the error variable of $q$.
10. Decrease all error variables by multiplying them with a constant $\beta$.
11. If the stopping criterion is not yet fulfilled, go to step 2. (In our case the criterion is the number of neurons inserted.)
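To make the loop concrete, the following is a minimal Python sketch of the algorithm above. It is not the authors' implementation: the class name, the data structures and the omission of isolated-neuron removal in step 8 (noted in a comment) are our own simplifying assumptions.

```python
import numpy as np

class GNG:
    """Minimal Growing Neural Gas over 2D input patterns (a sketch)."""

    def __init__(self, eps_w=0.1, eps_n=0.001, a_max=250, lam=1000,
                 alpha=0.5, beta=0.95, max_neurons=100, rng=None):
        self.eps_w, self.eps_n, self.a_max = eps_w, eps_n, a_max
        self.lam, self.alpha, self.beta = lam, alpha, beta
        self.max_neurons = max_neurons
        self.rng = rng if rng is not None else np.random.default_rng()
        # Step 1: start with two neurons at random positions.
        self.w = [self.rng.random(2), self.rng.random(2)]  # reference vectors
        self.error = [0.0, 0.0]                            # accumulated errors
        self.edges = {}                                    # (i, j) -> age, i < j

    @staticmethod
    def _key(i, j):
        return (min(i, j), max(i, j))

    def _neighbours(self, i):
        return [b if a == i else a for (a, b) in self.edges if i in (a, b)]

    def adapt(self, xi):
        # Steps 3-8: one adaptation step for a single input pattern xi.
        d = [float(np.sum((wc - xi) ** 2)) for wc in self.w]
        s1, s2 = (int(k) for k in np.argsort(d)[:2])
        for e in self.edges:                          # step 4: age winner's edges
            if s1 in e:
                self.edges[e] += 1
        self.error[s1] += d[s1]                       # step 5, eq. (1)
        self.w[s1] += self.eps_w * (xi - self.w[s1])  # step 6, eq. (2)
        for n in self._neighbours(s1):                # step 6, eq. (3)
            self.w[n] += self.eps_n * (xi - self.w[n])
        self.edges[self._key(s1, s2)] = 0             # step 7: refresh or create
        # Step 8: drop edges older than a_max (removing the neurons this leaves
        # isolated is omitted to keep the sketch short).
        self.edges = {e: a for e, a in self.edges.items() if a <= self.a_max}

    def insert_neuron(self):
        # Step 9: new neuron between the worst neuron q and its worst neighbour f.
        q = int(np.argmax(self.error))
        neigh = self._neighbours(q)
        if not neigh:
            return
        f = max(neigh, key=lambda n: self.error[n])
        r = len(self.w)
        self.w.append(0.5 * (self.w[q] + self.w[f]))  # eq. (4)
        del self.edges[self._key(q, f)]
        self.edges[self._key(q, r)] = 0
        self.edges[self._key(f, r)] = 0
        self.error[q] *= self.alpha                   # decrease q's and f's errors
        self.error[f] *= self.alpha
        self.error.append(self.error[q])              # r starts with q's new error

    def train(self, sample):
        """`sample()` must return one (x, y) input pattern (step 2)."""
        signals = 0
        while len(self.w) < self.max_neurons:         # stopping criterion
            self.adapt(sample())
            signals += 1
            if signals % self.lam == 0:
                self.insert_neuron()
            self.error = [e * self.beta for e in self.error]  # step 10
```

After `train` finishes, the pair `(gng.w, gng.edges)` is the topology preserving graph discussed in the next subsection.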
Representation of 2D Objects with GNG

Given an image $I(x, y) \in \mathbb{R}$, we perform the transformation $\psi_T(x, y) = T(I(x, y))$ that associates to each pixel its probability of belonging to the object, according to a property $T$. For instance, in figure 2 this transformation is a threshold function. If we consider $\xi = (x, y)$ and $P(\xi) = \psi_T(x, y)$, we can apply the GNG learning algorithm to the image $I$, so that the network adapts its topology to the object. This adaptive process is iterative, so the GNG represents the object throughout the whole learning.
Fig. 2. Silhouette extraction.
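The transformation $\psi_T$ and the sampling of input patterns can be sketched in a few lines of Python, with a plain threshold standing in for the property $T$ as in Figure 2; the function names and the default threshold are illustrative assumptions, not the authors' code.

```python
import numpy as np

def object_points(image, threshold=128):
    """Coordinates (x, y) of the pixels where psi_T(x, y) = T(I(x, y)) marks
    the object; a simple threshold plays the role of the property T."""
    ys, xs = np.nonzero(image >= threshold)
    return np.stack([xs, ys], axis=1).astype(float)

def make_sampler(points, rng=None):
    """Step 2 of the GNG algorithm: draw input patterns xi according to P(xi),
    here uniform over the object's pixels."""
    rng = rng if rng is not None else np.random.default_rng()
    return lambda: points[rng.integers(len(points))]
```

With these helpers, `GNG().train(make_sampler(object_points(image)))` adapts the network sketched above to the object contained in a grayscale `image`.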
As a result of the GNG learning we obtain a graph, the Topology Preserving Graph $TPG = \langle N, C \rangle$, with a set $N$ of vertices (neurons) and a set $C$ of edges connecting them (figure 1). This TPG establishes a Delaunay triangulation induced by the object (O'Rourke, 2001).
Representing Topology Deformations in Objects

The model is also able to characterise different parts of an object, or several objects present in the scene that have the same values for the visual property $T$, without resetting the different data structures for each of the objects. This is due to the capacity of GNG to divide itself into different parts when neurons are removed, and it can be very useful to represent objects that change their topological structure, breaking into small pieces or changing their shapes along a sequence of images. In this case a modification of the original GNG algorithm must be made: a higher number of input signals is generated in step 2 to readapt the previous map to the new image, and steps 8 and 9, where neurons are deleted or added, are skipped when no change in the number of neurons is necessary. When the deformations in the topology of the objects are small and gradual between consecutive frames, the information of the previous map places the neurons without resetting the learning process, which greatly accelerates the representation. For example, figure 3 shows some objects with colour as a common feature in both images; the images represent the same objects, as a foreground in white on the left and as a background in black on the right. A sketch of this per-frame readaptation is given after Figure 3.
Fig. 3. Representation of objects with similar visual properties as foreground and background
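Under the modification just described, tracking a sequence reduces to feeding the already-adapted map with signals from each new frame. The sketch below reuses the earlier `GNG` class and helpers: it runs only the adaptation steps 2-7 plus the edge pruning of step 8 (so the map can still split apart), and switches off the neuron insertion of step 9; this is one possible reading of the modification above, and the function name and list-of-masks input are assumptions.

```python
def track_sequence(gng, frames, signals_per_frame=10000):
    """Readapt a trained GNG to each frame of a sequence and return the
    Topology Preserving Graph obtained for every frame (a sketch)."""
    graphs = []
    for frame in frames:                        # one binary mask per frame
        sample = make_sampler(object_points(frame))
        for _ in range(signals_per_frame):      # more signals than for frame 1
            gng.adapt(sample())                 # adaptation only: no insertions
        graphs.append(([w.copy() for w in gng.w], dict(gng.edges)))
    return graphs
```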
Experiments

To illustrate the capacity of GNG to represent topological deformations in objects, we have adapted the maps to an object shape that changes its topology from a compact square into four small squares in four steps (frames), obtaining graphs that represent the topology of the object shape along the image sequence without resetting the learning process for any image.
Figure 4 shows the original sequence of images used as input space for the self-organising map, in which a homogeneous square in the first image (on the left) turns into four small squares in the last image (on the right). The bottom of the figure shows the results of the GNG adaptation, with white colour established as the visual property of the objects to be represented. From the first map (on the left), new maps are obtained based on the previous one without resetting the learning process. This feature of GNG allows an acceleration of the representation of the image sequence.
Fig. 4. Results of GNG adaptation to changes in the input space
As can be seen in the sequence of images, the map is able to separate the neurons into four groups, representing the different squares in the original images, when the distance between the groups becomes higher than the average length of the edges that connect the neurons. Figure 5 represents a sequence of deformations from a small circle to an ellipse and finally to a square, used as input space for the GNG. The results of the adaptation of the map, without resetting the learning algorithm between frames, are shown.
Fig. 5. Object deformation with GNG adaptation
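The separated fragments can be read back from the resulting TPG: once the edges between distant groups have aged away, each connected component of the graph $\langle N, C \rangle$ corresponds to one piece of the object. A plain breadth-first search suffices; this helper is our own addition for illustration, not part of the original algorithm.

```python
from collections import deque

def connected_components(n_neurons, edges):
    """Group neuron indices into the connected components of the TPG."""
    adjacency = {i: set() for i in range(n_neurons)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    unvisited, components = set(range(n_neurons)), []
    while unvisited:
        queue = deque([unvisited.pop()])
        component = set(queue)
        while queue:
            for j in adjacency[queue.popleft()] & unvisited:
                unvisited.discard(j)
                component.add(j)
                queue.append(j)
        components.append(component)
    return components
```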
The parameters used for the simulation are: $N = 100$, $\lambda = 1000$ for the first map and 10000-20000 input signals for the subsequent maps, $\epsilon_w = 0.1$, $\epsilon_n = 0.001$, $\alpha = 0.5$, $\beta = 0.95$, $a_{max} = 250$. The computational cost of representing a sequence of deformations is very low compared with methods based on the adaptation of a new map for every frame of the sequence, since our method does not reset the algorithm for new frames. This feature provides the method with real-time capabilities.
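For reference, this is how the quoted parameter values plug into the earlier sketches; `first_frame` and `later_frames` stand for the binary masks of the sequence and are, like the class and helper names, illustrative assumptions.

```python
gng = GNG(eps_w=0.1, eps_n=0.001, a_max=250, lam=1000,
          alpha=0.5, beta=0.95, max_neurons=100)
gng.train(make_sampler(object_points(first_frame)))    # first map, lambda = 1000
graphs = track_sequence(gng, later_frames, 10000)      # 10000-20000 signals/frame
pieces = connected_components(len(gng.w), gng.edges)   # fragments in the last frame
```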
FUTURE TRENDS

The iterative and parallelisable operation of the presented representation model is the starting point for the development of high-performance architectures that provide characterisation and tracking of non-rigid objects within the time available.
CONCLUSION

In this paper we have demonstrated the capacity of GNG to represent two-dimensional objects. By establishing a suitable transformation function, the model is able to adapt its topology to the shape of an object, so that a simple but very rich representation of the object is obtained. The model, through its own adaptation process, is able to divide itself so that it can characterise different fragments of an object, or different objects in the same image. In addition, GNG can represent deformations in the topology of objects along a sequence of images without resetting the learning process. This feature accelerates the process of representation and tracking of objects.
REFERENCES

Flórez, F., García, J.M., García, J. & Hernández, A. (2001). Representation of 2D Objects with a Topology Preserving Network. In Proceedings of the 2nd International Workshop on Pattern Recognition in Information Systems (PRIS'02), Alicante. ICEIS Press, 267-276.

Flórez, F., García, J.M., García, J. & Hernández, A. (2002). Hand Gesture Recognition Following the Dynamics of a Topology-Preserving Network. In Proceedings of the 5th IEEE International Conference on Automatic Face and Gesture Recognition, Washington, D.C. IEEE, Inc., 318-323.

Fritzke, B. (1995). A Growing Neural Gas Network Learns Topologies. In G. Tesauro, D.S. Touretzky & T.K. Leen (eds.), Advances in Neural Information Processing Systems 7, MIT Press, 625-632.

Kohonen, T. (2001). Self-Organising Maps. Springer-Verlag, Berlin Heidelberg.

Martinetz, T. & Schulten, K. (1994). Topology Representing Networks. Neural Networks, 7(3), 507-522.

O'Rourke, J. (2001). Computational Geometry in C. Cambridge University Press.

Ritter, H. & Schulten, K. (1986). Topology conserving mappings for learning motor tasks. In Neural Networks for Computing, AIP Conference Proceedings.

Martinetz, T., Ritter, H. & Schulten, K. (1990). Three-dimensional neural net for learning visuomotor coordination of a robot arm. IEEE Transactions on Neural Networks, 1, 131-136.

Nasrabadi, N.M. & Feng, Y. (1988). Vector quantisation of images based upon Kohonen self-organising feature maps. In Proceedings of the IEEE International Conference on Neural Networks, 1101-1108.

Martinetz, T. (1994). Competitive Hebbian learning rule forms perfectly topology preserving maps. In ICANN.

Cselényi, Z. (2005). Mapping the dimensionality, density and topology of data: The growing adaptive gas. Computer Methods and Programs in Biomedicine, 78, 141-156.

Cheng, G. & Zell, A. (2000). Double growing neural gas for disease diagnosis. In Proceedings of the ANNIMAB-1 Conference, 309-314.

Qin, A.K. & Suganthan, P.N. (2004). Robust growing neural gas algorithm with application in cluster analysis. Neural Networks, 17, 1135-1148.

Toshihiko, O., Iwasaki, K. & Sato, C. (2003). Topology representing network enables highly accurate classification of protein images taken by cryo electron-microscope without masking. Journal of Structural Biology, 143, 185-200.
TERMS AND DEFINITIONS

Growing Neural Gas: A self-organising neural model in which the number of units is increased during the self-organisation process, using competitive Hebbian learning for the topology generation.

Hebbian Learning: A time-dependent, local, highly interactive mechanism that increases synaptic efficacy as a function of pre- and post-synaptic activity.

Non-Rigid Objects: A class of objects that suffer deformations, changing their appearance over time.

Object Tracking: A task within the field of computer vision that consists of extracting the motion of an object from a sequence of images, estimating its trajectory.

Self-Organising Neural Networks: A class of artificial neural networks that are able to organise themselves to recognise patterns automatically, without previous training, while preserving neighbourhood relations.

Object Representation: The construction of a formal description of an object using features based on its shape, contour or specific regions.

Topology Preserving Graph: A graph that represents and preserves the neighbourhood relations of an input space.