Shape Matching using Shape Context in the Presence of Noise
Prasanth Kalakota
Department of Computer Science, College of Engineering, University of South Carolina, Columbia, USA
Abstract. A new method is proposed for shape matching when the shapes are distorted by noise. Present shape matching techniques do not work well when the shape is distorted by high noise. The proposed method assumes that the noise displaces the pixels equally in both directions. It takes the mean of three or four pixels and represents them as one pixel. With this approach, any noise present can be reduced to a certain extent. The method is tested on binary images of the hand, and the results are presented.
1 Introduction
Shape matching is an important area in medical image processing and has a wide variety of applications in character recognition and computer vision. Most methods [6] used for shape matching are based on the transformation of the geometric coordinates of the shape. Existing shape matching techniques have concentrated on the problems of scaling, rotation and shifting rather than on noise. Shape Context [1] is a recent method for shape matching that achieves better results than previous methods. However, these methods assume that the shapes are not much distorted by noise, whereas in reality shapes can be occluded or distorted by noise.
This paper presents a method based on the mean of points selected from the shape contour. Since the shape is represented by a set of points selected from its contour, a larger number of points is selected on the contour to deal with the noise problem. A small window is then used to calculate the mean of the points that fall inside it. A window is a matrix of fixed size; in this work I use a 3x1 window, that is, the mean is calculated over every three points. The window is then slid by three points along the shape. The resulting points are used to represent the shape, and the Shape Context [1] method is used to do the shape matching. The method is tested on a binary data set, and in some cases it achieves better results than the shape context method alone.
This paper is organized as follows. Section 2 discusses the Shape Context method. Section 3 discusses the various types of noise and the problem they cause for the shape context method. Section 4 introduces the proposed method, Section 5 gives the results of the experiments, and Section 6 concludes.
2 Shape Context
Shape Context [1] is a method for shape matching proposed by Serge Belongie, Jitendra Malik and Jan Puzicha. Shape matching can be divided into three steps: (1) find the best one-to-one correspondence between the two shapes; (2) use these correspondences to find an aligning transform; (3) compute the matching cost between the two shapes as an error.
To find the one-to-one correspondence between the shapes, a number of points are taken on each shape and the shapes are represented as sets of vectors. The problem is then to find, for each point p_i on the first shape, the best matching point q_j on the second shape. The Shape Context [1] was introduced for this purpose. In this approach, for a given point, all the vectors leading from that point to the other points are taken. This is called a rich local descriptor. As the total number of vectors increases, the description becomes close to the original image. The problem with this approach is that it is sensitive to the positions of nearby points. To address this, a log-polar histogram is used. The log-polar histogram consists of 5 bins for the log-distance and 12 bins for the angle. The histogram represents the shape context of the point p_i. As similar points have similar shape descriptors, the problem is to find the matching between the two shapes subject to the constraint that the matching is one to one. Given all the histogram-matching costs between the two shapes, the cost matrix C_ij between the shapes is given by
$$C_{ij} = C(p_i, q_j) = \sum_{k=1}^{K} \frac{[h_i(k) - h_j(k)]^2}{h_i(k) + h_j(k)}$$
Here h_i(k) and h_j(k) denote the K-bin histograms at p_i and q_j. Minimizing the total cost subject to one-to-one matching is equivalent to weighted bipartite matching. Using the Hungarian method [7], this can be solved in O(N^3) time.
After finding the best matching between the two shapes, we have to find a transformation that maps the input points to the output points. The Thin Plate Spline (TPS) model is used for this. The idea behind the TPS model is that the points are distributed on an infinitely thin plate, and the plate has to be bent in such a way that the resulting shape matches the output shape. The TPS model minimizes the bending energy required.
There are, however, other problems in shape matching, which can be classified as translation, rotation and scaling. To avoid the effects of translation, the shapes are shifted to a common origin. To avoid the effect of scaling, all distances are normalized by the mean distance. To avoid the effect of rotation, all angles are measured relative to a fixed reference.
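The descriptor and cost computation described in this section can be sketched as follows. This is a minimal illustration in Python, not the C implementation used in this paper. The function names, the radial limits `R_INNER` and `R_OUTER`, and the clamping of out-of-range radii are assumptions made for the example. The cost is written exactly as in the formula above; the formulation in [1] includes an additional constant factor of 1/2, which does not change which assignment is optimal.

```python
import math

N_R_BINS = 5        # log-distance bins (from the text)
N_THETA_BINS = 12   # angle bins (from the text)
R_INNER, R_OUTER = 0.125, 2.0  # assumed radial limits, in units of the mean distance


def shape_contexts(points):
    """Return one normalized 5x12 log-polar histogram (flattened) per point."""
    n = len(points)
    dist = [[math.hypot(qx - px, qy - py) for qx, qy in points] for px, py in points]
    # Scale normalization: divide all distances by the mean pairwise distance.
    scale = sum(map(sum, dist)) / (n * (n - 1))
    log_lo, log_span = math.log(R_INNER), math.log(R_OUTER / R_INNER)
    histograms = []
    for i, (px, py) in enumerate(points):
        hist = [0.0] * (N_R_BINS * N_THETA_BINS)
        for j, (qx, qy) in enumerate(points):
            if i == j:
                continue
            # Assumption: clamp the radius so every other point falls into some bin.
            r = min(max(dist[i][j] / scale, R_INNER), R_OUTER)
            r_bin = min(int((math.log(r) - log_lo) / log_span * N_R_BINS), N_R_BINS - 1)
            # Angles are measured from the fixed positive x-axis, in [0, 2*pi).
            theta = math.atan2(qy - py, qx - px) % (2 * math.pi)
            t_bin = min(int(theta / (2 * math.pi) * N_THETA_BINS), N_THETA_BINS - 1)
            hist[r_bin * N_THETA_BINS + t_bin] += 1.0
        histograms.append([h / (n - 1) for h in hist])
    return histograms


def chi2_cost(h_i, h_j):
    """Sum over bins of (h_i - h_j)^2 / (h_i + h_j), skipping bins empty in both."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h_i, h_j) if a + b > 0)


def cost_matrix(shape_p, shape_q):
    """C[i][j] is the histogram cost between point i of shape_p and point j of shape_q."""
    hp, hq = shape_contexts(shape_p), shape_contexts(shape_q)
    return [[chi2_cost(a, b) for b in hq] for a in hp]
```

Because the vectors are relative to each point and the distances are divided by the mean distance, a translated and uniformly scaled copy of a shape yields the same histograms, so the diagonal of its cost matrix against the original is zero. The resulting matrix could then be passed to any assignment solver, such as an implementation of the Hungarian method.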
3 Effects of Noise on Shape Matching
Noise is a common problem in all image-processing applications and can be caused by many sources. As most shapes are obtained by taking the edges of images, if the images are distorted by noise then the shapes are also affected by noise. Noise can be introduced into the images by, for example, lens aberrations or problems with the camera. The noise introduced into the shape can be represented mathematically as X = X_1 + X_n, Y = Y_1 + Y_n, where (X_1, Y_1) are the original points on the shape and (X_n, Y_n) is the displacement caused by the noise. Because of the noise, the shape coordinates are moved to (X_1 + X_n, Y_1 + Y_n). This effect is illustrated in Fig. 1. The first image is the original shape without noise and the second is the shape with noise. Because of the noise, the shape is distorted.
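The additive noise model above can be illustrated with a short sketch. It assumes independent zero-mean Gaussian noise on each coordinate, which is one possible choice; the function name `add_noise` and the `seed` parameter are hypothetical and are not taken from the implementation used in this paper.

```python
import random


def add_noise(points, sigma, seed=None):
    """Displace each point by noise: X = X_1 + X_n, Y = Y_1 + Y_n.

    Assumption: X_n and Y_n are independent Gaussian samples with mean 0
    and standard deviation sigma.
    """
    rng = random.Random(seed)
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)) for x, y in points]
```

Passing a fixed `seed` makes the noisy shape reproducible, which is convenient when comparing two matching methods on the same distorted input.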
Fig. 1. (a) Original shape without noise. (b) Shape with noise. Because of the noise, the shape points are moved to different locations.
The Shape Context [1] tries to reduce the effects of noise by using log-polar histograms. In the Shape Context [1], the descriptor is built from the radial distances and angles from one point to all other points on the shape. Pixels that are far from the reference point are not affected much by the presence of noise. However, the relative positions of nearby pixels change drastically due to noise. These pixels may move to adjacent bins, causing the histogram, and hence the shape context, to change.
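The different sensitivity of near and far points can be made concrete with a small sketch. It looks only at the 12 angular bins and ignores the radial bins; the helper names `angle_bin` and `max_angle_shift` are hypothetical and serve only this illustration.

```python
import math

N_THETA_BINS = 12  # angular bins, each 30 degrees wide


def angle_bin(ref, pt):
    """Index of the angular bin in which pt falls when viewed from ref."""
    theta = math.atan2(pt[1] - ref[1], pt[0] - ref[0]) % (2 * math.pi)
    return min(int(theta / (2 * math.pi) * N_THETA_BINS), N_THETA_BINS - 1)


def max_angle_shift(distance, displacement):
    """Largest change in viewing angle (radians) that a displacement can cause."""
    if displacement >= distance:
        return math.pi  # the point can end up in any direction
    return math.asin(displacement / distance)


ref = (0, 0)
near, near_noisy = (3, 1), (3, 3)      # same noise offset (0, +2) applied to both
far, far_noisy = (30, 10), (30, 12)
print(angle_bin(ref, near), angle_bin(ref, near_noisy))  # near point changes bin
print(angle_bin(ref, far), angle_bin(ref, far_noisy))    # far point stays in its bin
```

For the same displacement of 2 pixels, the near point can shift by more than the 30 degree bin width, while the far point can shift by only a few degrees. This is why noise mostly disturbs the histogram entries contributed by nearby points.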
4 Proposed Method
The main problem with the shape context method is that, for noisy shapes, the points used to represent the shape move to other bins. To counter this problem, I propose taking more points on the shape, placing a window over them and finding the mean of the points that fall in the window. The resulting value is taken as a single representative point, which is used for comparison. The shape context method is then run to do the shape matching. For the pixels (x_i, y_i) in a window, the mean is given by
$$(x', y') = \left( \frac{\sum_{i=1}^{n} x_i}{n},\ \frac{\sum_{i=1}^{n} y_i}{n} \right)$$
where n is the window length. The window is a matrix of fixed size; here I use a 3x1 matrix, so n = 3. The value of n varies with the window size. It is assumed that the mean of the noise lies on the shape, that its standard deviation is fixed, and that the noise is distributed with equal probability on either side of the mean; that is, a pixel coordinate is equally likely to increase or decrease. By taking more points on the shape, say three times the original number, and replacing every three or more pixels with their average, the shape can be represented closely and accurately. Applying the shape context method afterwards can then achieve better results.
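The windowed averaging described above can be sketched as follows. The function name `window_means` is hypothetical, and dropping trailing points that do not fill a complete window is an assumption of this sketch, since the text does not say how such points are handled.

```python
def window_means(points, n=3):
    """Replace each consecutive, non-overlapping group of n contour points by its mean.

    points: contour points as (x, y) pairs, ordered along the contour.
    n: window length (3 for the 3x1 window used in the text).
    Assumption: leftover points that do not fill a whole window are dropped.
    """
    if n < 1:
        raise ValueError("window length must be at least 1")
    means = []
    # The window is slid by n points, so consecutive windows do not overlap.
    for start in range(0, len(points) - n + 1, n):
        window = points[start:start + n]
        mean_x = sum(x for x, _ in window) / n
        mean_y = sum(y for _, y in window) / n
        means.append((mean_x, mean_y))
    return means
```

For example, three points sampled from the line y = 0 and displaced vertically by +1, -1 and 0 are replaced by a single point that lies back on the line, which is the effect the zero-mean assumption relies on. Sampling three times as many contour points beforehand keeps the number of resulting points equal to the original number.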
5 Experiments
The proposed method is implemented in C on a UNIX machine. To implement the Shape Context method, I downloaded an implementation of the Hungarian algorithm [7] from the website of Brian Gerkey [5]. However, I ran into a problem with this code on large data sets (more than 25 points), so I used a small data set for each shape. The shapes in the data set are shown in Fig. 3 and Fig. 4. In this work I concentrated on minimizing the bending energy rather than on the speed of the algorithm. The program takes the image data as arguments and outputs the bending energy required for the optimal matching.
Fig. 2. Test image 1. (a) Original Image (b) Image distorted by small noise (σ = 10) (c) Image distorted by large noise (σ = 100)
Noise is introduced into these shapes using the C random function. Noisy images are generated with standard deviations of 10, 50 and 100. For testing I used binary hand images, all of equal size, shown in Fig. 2 and Fig. 3. As this shape has both low-curvature and high-curvature points, I considered it ideal for testing. The actual shapes used for testing are shown in Fig. 4 and Fig. 5.
Fig. 3. Test image 2. (a) Original Image (b) Image distorted by small noise (σ = 10) (c) Image distorted by large noise (σ = 100)
When the original shape context algorithm and the modified algorithm were run on the images, I found that the modified algorithm worked well for smooth images and for noisy images with low curvature, showing a small improvement over the original method. However, when tested on the image above with higher curvature, the performance of my algorithm degraded compared to the original one. I think the reason could be that, because it averages the pixel coordinates, the curvatures are not represented effectively.
Fig. 4. Part of image 1. (a) Original image (b) Image distorted by large noise (σ = 100)
Fig. 5. Part of image 1. (a) Original image (b) Image distorted by large noise (σ = 100)
6 Conclusion
The proposed method, based on the mean of the pixel coordinates, worked better in the presence of noise than the original method. However, it showed degraded performance when the shapes have high curvature. The reason could be that the mean of the surrounding pixels may not be able to represent the curve properly. The algorithm used here takes only the mean of the surrounding points; it could be improved by selecting the points based on a higher-order polynomial. Other methods for removing the effects of noise can also be investigated. At present the program outputs only the bending energy; C graphics could be used to visualize how the pixels are actually matched.
7 References
1. S. Belongie, J. Malik, and J. Puzicha, “Shape context: A new descriptor for shape matching and object recognition,” Advances in Neural Information Processing Systems 13: Proc. 2000 Conference, pages 831-837, November 2000.
2. S. Belongie, J. Malik, and J. Puzicha, “Matching Shapes,” Proceedings Eighth International Conference on Computer Vision, pages 454-461, July 2001.
3. S. Belongie, J. Malik, and J. Puzicha, “Shape Matching and Object Recognition Using Shape Contexts,” Pattern Analysis and Machine Intelligence, volume 24, pages 509-522, April 2002.
4. H. Chui and A. Rangarajan, “A new algorithm for non-rigid point matching,” Computer Vision and Pattern Recognition, volume 2, pages 44-51, June 2000.
5. Brian Gerkey, http://www-robotics.usc.edu/~gerkey/tools/hungarian.html
6. F.L. Bookstein, “Principal Warps: Thin-Plate Splines and Decompositions of Deformations,” IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 11, no. 6, pages 567-585, June 1989.
7. C. Papadimitriou and K. Steiglitz, “Combinatorial Optimization: Algorithms and Complexity,” Prentice Hall, 1982.