A Note on Shape Matching using a Constructive Relaxation Method

Keisuke Kameyama, Kazuo Toraichi
Institute of Information Science and Electronics, Tsukuba Advanced Research Alliance, University of Tsukuba, Tsukuba 305-8577, Japan

Yukio Kosugi
Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Yokohama 226-8503, Japan
Abstract
This paper discusses a novel approach to matching image shapes against a set of premodelled templates, named Constructive Relaxation Matching (CRM). In CRM, the modeling stage for a novel input image, conventionally carried out by the same procedure used for template modeling, is included in the iterative relaxation procedure. As the model is dynamically constructed during relaxation, objects having similar template label assignment probabilities are unified into one object. The unification of objects derives from Model Switching maneuvers, which have proved to be an efficient method of model selection and parameter determination in training layered neural networks.
1. Introduction
In image shape analysis and understanding, an unknown input image commonly has to be classified into one of several predefined classes, possibly represented by a set of template images. Relaxation Matching (RM) algorithms [1], used for image labeling and matching, are one variant of the deterministic relaxation algorithms used for solving optimization problems, to which Hopfield neural networks [2] also belong. Owing to their common energy-functional-minimizing dynamics, RM is known to be very robust against noise, deformation and partial lack of information, all of which are common in image matching applications. RM has been used in various applications such as scene labeling [1] and handwritten character recognition [3] [4].

When RM is used, template images are prepared with their characteristic portions recognized as objects, indexed by labels (image modeling). The modeling process, which identifies the characteristic portions of the image, produces a description of the image in terms of the relative relations between the objects. However, different images will certainly produce different models (e.g. a different number of edge objects). This can result in very poor matching between the objects in the template images and those in the novel unknown image that ought to be matched. One workaround would be to consider not only 1-to-1 object matchings but 1-to-n matchings as well. However, this process is not part of the RM algorithm, and incorporating it into the matching can be quite awkward.

In this paper, a relaxation scheme named Constructive Relaxation Matching (CRM) is introduced. In CRM, the matching process for a newly input image, whose class and portion labelling are still undetermined, starts in a nonparametric style, without the image being object-modelled in any particular way. The model of the input image is dynamically constructed during iterative relaxation, where objects having similar responses to the labels of the existing template are unified. With this approach, the final matching score of the input image is obtained against each existing template, template by template, even when the templates are modelled in different ways, which enables a flexible consistency evaluation.

In Sec. 2, the conventional Relaxation Matching algorithm is reviewed, and its issues when used in image matching are pointed out. In Sec. 3, the Model Switching maneuvers used in layered neural network learning are reviewed, and the possibilities of applying them to the RM algorithm are discussed. Sec. 4 is devoted to the CRM algorithm, and Sec. 5 concludes the paper.
2. Relaxation Matching
In the conventional RM, a set of $N$ objects $B = \{b_1, \ldots, b_N\}$ and a set of $M$ labels $\Lambda = \{1, \ldots, M\}$ are initially assumed. The assignment of the labels to object $i$ is expressed as a vector of assignment probabilities, $p_i = [p_i(1) \cdots p_i(M)]^T \in R^M$, where "T" denotes the transpose. The members of $p_i$ meet the usual condition $\sum_{\lambda=1}^{M} p_i(\lambda) = 1$. The set of $p_i$ for all the $N$ objects makes the vector $p = [p_1^T \cdots p_N^T]^T \in R^{MN}$, which is the state vector of the dynamical system. For updating the state $p$, a four-dimensional matrix $R = [r_{ij}(\lambda, \mu)] \in R^{MN \times MN}$ is defined, and the state vector is updated according to,
$$ p(t+1) = f(p(t), q(t)) \tag{1} $$

$$ f(p, q) = [f_1(1) \; \cdots \; f_N(M)]^T \tag{2} $$

$$ f_i(\lambda) = \frac{q_i(\lambda)\, p_i(\lambda)}{\sum_{\lambda'=1}^{M} q_i(\lambda')\, p_i(\lambda')} \tag{3} $$

and

$$ q(t) = R\, p(t) \tag{4} $$
where $t$ denotes the time. It has been shown in [5] that an RM system with symmetric $R$ has a Liapunov (energy) function defined as,

$$ -A(p(t)) = -p(t)^T R\, p(t) \tag{5} $$

which is guaranteed to decrease under the iterative transition of $p$ until one of the local minima of $-A(p)$ is reached.
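The update of Eqs. (1)-(4) and the consistency measure $A(p)$ of Eq. (5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the flattened ordering of $p$, the handling of objects that receive zero support, and the toy compatibility values are all hypothetical choices made for the example.

```python
from typing import List

Matrix = List[List[float]]


def support(R: Matrix, p: List[float]) -> List[float]:
    """Eq. (4): q = R p, with p flattened as [p_1(1..M), ..., p_N(1..M)]."""
    return [sum(r * x for r, x in zip(row, p)) for row in R]


def relax_step(R: Matrix, p: List[float], n_objects: int, n_labels: int) -> List[float]:
    """Eqs. (1)-(3): weight each probability by its support, then renormalize per object."""
    q = support(R, p)
    new_p: List[float] = []
    for i in range(n_objects):
        block = range(i * n_labels, (i + 1) * n_labels)
        weighted = [q[k] * p[k] for k in block]
        total = sum(weighted)
        if total == 0.0:
            # Assumption: keep the old probabilities if no label receives any support.
            new_p.extend(p[k] for k in block)
        else:
            new_p.extend(w / total for w in weighted)
    return new_p


def consistency(R: Matrix, p: List[float]) -> float:
    """A(p) = p^T R p, the negative of the energy in Eq. (5)."""
    return sum(pk * qk for pk, qk in zip(p, support(R, p)))


# Toy problem (hypothetical values): 2 objects, 2 labels, and a symmetric,
# nonnegative R that rewards assigning different labels to the two objects.
R = [
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
]
p = [0.6, 0.4, 0.5, 0.5]
history = [consistency(R, p)]
for _ in range(20):
    p = relax_step(R, p, n_objects=2, n_labels=2)
    history.append(consistency(R, p))
```

In this toy setting, `history` records $A(p)$ at every iteration, so it can be inspected to check that the consistency does not decrease, in line with the Liapunov property cited from [5].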
When the above RM algorithm is applied to image shape matching, the template image is modelled as a set of objects in particular spatial positions. Each object in the template is assigned a unique label to make the label set $\Lambda$ above. The novel input image, whose match to the template is to be evaluated, is modelled by the same process as the template to make the set of objects $B$. Matrix $R$ is called the compatibility coefficient matrix, whose element $r_{ij}(\lambda, \mu)$ reflects the compatibility of the situation in which labels $\lambda$ and $\mu$ are assigned to objects $i$ and $j$, respectively. Elements of $R$ are commonly nonnegative in image matching. By applying RM to the sets $\Lambda$ and $B$ with the compatibility matrix $R$, the state vector $p$ is expected to converge to one of the label assignments (states) at a local minimum of $-A(p)$. Function $A(p)$ can serve as a measure of the total consistency of the label assignments [6].

The RM algorithm can further be applied to pattern matching problems as in [3] and [4], where the input image is matched against numerous templates, each denoting an image class. There, the value of function $A(p)$ at the stable (converged) state of $p$ is used as the matching score, and the input is classified into the class of the template image that resulted in the maximum $A(p)$ after convergence. In the matching stage, an input image of an unknown class is modelled by the same procedure, and its objects are labelled according to their similarities in local connectivity and relative positioning to those in the template images. Additionally, a global matching functional is defined and evaluated as the goodness of fit in classification [3] [4]. The pattern matching scheme using conventional RM is shown in Fig. 1(a).

[Figure 1: block diagrams showing, for templates 1 to M and an unknown input k, the path from raw data through sampled data and modelling operations to matching.]
Figure 1. Matching strategies of (a) conventional relaxation matching (matching based on labeling and relaxation operations), (b) matching based on mapping and code comparison (e.g. layered neural networks), and (c) the proposed constructive relaxation matching.

As opposed to strategies based on distance evaluation within a common code space that is the range of a common map, such as a trained layered neural network (Fig. 1(b)), RM inherently relies on the robustness of the algorithm, which allows incomplete matching of images described in different models, image by image. Despite this robustness, there are cases in which a difference in modeling becomes an obstacle to matching a pair of apparently similar images. For example, a line segment in an image can happen to be modelled as two separate segment objects because of noise or deformation. Matching a noiseless template containing a single line segment with a new input image modelled as multiple segments requires external measures that can be quite awkward. In contrast, we take the approach of constructing the model of the input image according to the template object set, during relaxation.
3. Object fusion by similarity measure (Model Switching)
In [7] and [8], a neural network training scheme involving dynamical model alteration was introduced for efficient feature extraction and network model selection. This method, named Model Switching (MS), is based on backpropagation training [9], but additionally extends the search for an input-output map to a domain spanning multiple network models. The function (network) model is switched at an instant when the function acquired by the training so far can be preserved. One of the model-switching operations in MS is unit fusion. During training, an external monitor evaluates the response patterns of the hidden layer units to the training set, and fuses pairs of hidden layer units that have significant response correlation, thereby reducing the network model by one hidden layer unit. The correlation of the unit responses is evaluated by,

$$ s_{ij} = \frac{\sum_{m=1}^{M} (h_{im} - e_i)(h_{jm} - e_j)}{\left[ \sum_{m=1}^{M} (h_{im} - e_i)^2 \sum_{m=1}^{M} (h_{jm} - e_j)^2 \right]^{1/2}}, \quad (i, j = 1, 2, \ldots, N), \quad (-1 \le s_{ij} \le 1). \tag{6} $$

In Eq. (6), $M$, $h$ and $e$ denote the number of training samples, the unit response and the average unit response, respectively. Unit pairs with $|s_{ij}| \simeq 1$ are candidates to be reduced to a single unit.

In RM, a similar strategy can be taken to fuse the objects in the input image, by evaluating their assignment probability vectors over the existing labels. If there are object pairs that have similar label probability assignments and that further meet conditions such as being spatially adjacent in the input image, fusion of the objects can take place. The object set size $|B|$ will therefore be gradually reduced during relaxation.
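The response correlation of Eq. (6) and the selection of fusion candidates can be illustrated with a short Python sketch. It is only a hedged example: the function names, the threshold value of 0.95 standing in for "$|s_{ij}| \simeq 1$", the treatment of constant-response units, and the response values are assumptions and do not come from [7] or [8].

```python
import math
from typing import List, Tuple


def response_correlation(h_i: List[float], h_j: List[float]) -> float:
    """Eq. (6): normalized correlation of two units' responses over the training samples."""
    m = len(h_i)
    e_i = sum(h_i) / m  # average response of unit i
    e_j = sum(h_j) / m  # average response of unit j
    num = sum((a - e_i) * (b - e_j) for a, b in zip(h_i, h_j))
    den = math.sqrt(
        sum((a - e_i) ** 2 for a in h_i) * sum((b - e_j) ** 2 for b in h_j)
    )
    # Assumption: a unit whose response is constant is treated as uncorrelated.
    return num / den if den > 0.0 else 0.0


def fusion_candidates(
    responses: List[List[float]], threshold: float = 0.95
) -> List[Tuple[int, int]]:
    """Unit pairs (i, j), i < j, whose |s_ij| reaches a hypothetical threshold."""
    n = len(responses)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if abs(response_correlation(responses[i], responses[j])) >= threshold
    ]


# Hypothetical responses of three hidden units to four training samples.
responses = [
    [0.1, 0.4, 0.7, 0.9],
    [0.2, 0.8, 1.4, 1.8],  # a scaled copy of unit 0, hence strongly correlated
    [0.9, 0.1, 0.8, 0.2],
]
candidates = fusion_candidates(responses)
```

Because the absolute value of $s_{ij}$ is used, strongly anti-correlated units are also reported as candidates, which matches the condition $|s_{ij}| \simeq 1$ in the text.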
4. Constructive Relaxation Matching (CRM)
Here, Constructive Relaxation Matching (CRM) is described. In addition to the iterative RM procedure of Eq. (1), CRM requires the following similarity evaluation, and objects are then fused according to the similarities. As a measure of similarity between input image objects, the label assignment probability vectors can be compared directly instead of using Eq. (6). After each evaluation of $p$ in Eq. (1), the similarities of the label assignment probability vectors $p_i$ $(i = 1, \ldots, N)$ are evaluated as the object similarity matrix defined as,

$$ S = [s_{ij}] = [p_i^T p_j] \in R^{N \times N}. \tag{7} $$

Spatially adjacent objects $i$ and $j$ having a similarity of $s_{ij} > S_{thres}$ are fused into one. By modeling the input image according to each existing template object set, the input image is evaluated without being tied to a single model, thereby reducing the matching difficulties caused by model differences. This scheme of dynamic modeling based on local interactions of objects can be regarded as applying the Model Switching maneuvers described in the previous section to a recurrent dynamical system (neural network model). This idea of a constructive process leading from nonparametric to parametric image matching also owes much to a biologically inspired nonparametric method for image matching, which evaluates the local pixel correlations of two bitmap images to derive a nonlinear mapping that matches two similar images with deformations [10].
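One fusion step based on Eq. (7) might look like the Python sketch below. The paper does not specify how the probability vector of a fused object is formed, which pair is fused first when several exceed the threshold, or how adjacency and $R$ are updated afterwards. The averaging rule, the choice of the most similar pair, the function names, and all numerical values here are therefore assumptions made purely for illustration.

```python
from typing import List, Optional, Tuple

Pair = Tuple[int, int]


def similarity_matrix(P: List[List[float]]) -> List[List[float]]:
    """Eq. (7): s_ij = p_i^T p_j for the per-object label probability vectors."""
    return [[sum(a * b for a, b in zip(p_i, p_j)) for p_j in P] for p_i in P]


def best_fusion_pair(
    P: List[List[float]], adjacent: List[Pair], s_thres: float
) -> Optional[Pair]:
    """Most similar spatially adjacent pair with s_ij > s_thres, or None."""
    S = similarity_matrix(P)
    best: Optional[Pair] = None
    for i, j in adjacent:
        if S[i][j] > s_thres and (best is None or S[i][j] > S[best[0]][best[1]]):
            best = (min(i, j), max(i, j))
    return best


def fuse(P: List[List[float]], pair: Pair) -> List[List[float]]:
    """Replace objects i and j by one object (assumed rule: average their vectors)."""
    i, j = pair
    merged = [(a + b) / 2.0 for a, b in zip(P[i], P[j])]
    # The merged object takes the place of i; object j is dropped.
    return [merged if k == i else row for k, row in enumerate(P) if k != j]


# Hypothetical state: three input objects, two template labels.
P = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
adjacent = [(0, 1), (1, 2)]  # assumed spatial adjacency in the input image
pair = best_fusion_pair(P, adjacent, s_thres=0.7)
fused = fuse(P, pair) if pair is not None else P
```

Averaging keeps the fused vector normalized, since each input vector sums to one. A full CRM loop would also need to update the adjacency list and the compatibility matrix $R$ for the reduced object set before the next relaxation step, which this sketch leaves out.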
5. Conclusion
In this paper, a method for matching image shapes against a set of premodelled templates, called Constructive Relaxation Matching (CRM), was discussed. In CRM, the modeling stage for a novel input image, conventionally carried out by the same procedure used for template modeling, is included in the iterative relaxation procedure. As the model is dynamically constructed during relaxation, objects having similar matching scores with the template object sets are unified to make a new object. For the unification of objects, the similarities of the label assignment probabilities are used.
Acknowledgment
The authors would like to thank Mr. Paul Wing Hing Kwan of the University of Tsukuba for inspiring discussions regarding Relaxation Matching methods.
References
[1] A. Rosenfeld, R. A. Hummel, and S. W. Zucker. Scene labeling by relaxation operations. IEEE Trans. Sys., Man and Cybern., SMC-6(6):420–433, 1976.
[2] J. J. Hopfield and D. W. Tank. "Neural" computation of decisions in optimization problems. Biological Cybernetics, 52:141–152, 1985.
[3] K. Yamamoto. Recognition of handprinted kanji characters by relaxation matching. IECE Trans., J65-D(9):1167–1174, 1982.
[4] K. Toraichi, S. Ishiuchi, T. Horiuchi, K. Yamamoto, and H. Yamada. Recognition of handwritten kanji and hiragana characters by relaxation matching. IEICE Trans., J73-DII(9):1448–1457, 1990.
[5] M. Pelillo. On the dynamics of relaxation labeling processes. Proc. IEEE Int'l. Conf. on Neural Networks, 2:1006–1011, 1994.
[6] R. A. Hummel and S. W. Zucker. On the foundations of relaxation labeling processes. IEEE Trans. of Pattern Analysis and Machine Intelligence, PAMI-5(3):267–287, 1983.
[7] K. Kameyama and Y. Kosugi. Model switching by channel fusion for network pruning and efficient feature extraction. Proc. Intl Joint Conf. on Neural Networks 1998, pages 1861–1866, 1998.
[8] K. Kameyama and Y. Kosugi. Neural network model switching for efficient feature extraction. IEICE Trans. on Inf. and Sys., E82-D(10):1372–1383, 1999.
[9] D. E. Rumelhart, J. L. McClelland, and the PDP Research Group. Parallel distributed processing. MIT Press, 1986.
[10] Y. Kosugi, M. Sase, H. Kuwatani, N. Kinoshita, T. Momose, J. Nishikawa, and T. Watanabe. Neural network mapping for nonlinear stereotactic normalization of brain MR images. Journal of Computer Assisted Tomography, 17(3):455–460, 1998.