Fast Feature Point Detector

2008 IEEE International Conference on Signal Image Technology and Internet Based Systems (SITIS 2008). DOI 10.1109/SITIS.2008.97

Neeta Nain, Vijay Laxmi, Bhavitavya Bhadviya, Deepak B M, Mushtaq Ahmed
Department of Computer Engineering, Malaviya National Institute of Technology, JLN Marg, Jaipur, India - 302017
Email: [email protected]

Abstract

This paper presents a new feature point detector that is accurate, efficient and fast. A detailed qualitative evaluation of the proposed feature point detector for gray-scale images is carried out in support of the proposed technique. Experiments show that the detector is robust to affine transformations, noise and perspective deformations. Moreover, the proposed detector requires only 28 additions per pixel to evaluate an interest point and its strength, making it one of the fastest feature detectors. The accuracy, speed and parallelizability of this algorithm make it a strong contender for hardware implementations and for applications requiring real-time feature point extraction.

1. Introduction

Most methods for 3D reconstruction, object detection and recognition, image alignment and matching, and camera calibration assume that feature points have been extracted and put into reliable correspondence. Feature points are locations in the image where the signal changes significantly in two dimensions; examples include corners and junctions, as well as locations where the texture varies significantly. Among the most intuitive types of features, corners are critical because they are invariant to rotation and change little under different lighting. Corners also minimize the amount of data to be processed without losing the most important information in the gray-level image.

A large number of corner detectors exist in the literature. They can be divided mainly into two categories: template based and geometry based. Template based corner detection builds a set of corner templates and determines the similarity between the templates and all sub-windows of the gray-level image. Geometry based corner detectors rely on measuring differential-geometric features of corners and are of two kinds: boundary based and gray-level based. Boundary based detectors first extract the boundary as a chain code and then search for significant turnings along the boundary; these detectors suffer from high algorithmic complexity, as multiple steps are needed. Gray-level based corner detectors operate directly on the gray-level image, and the feature detector proposed in this work belongs to this last category. The terms feature points and interest points refer to the same concept and are used interchangeably here.

The paper is organized as follows: Section 2 reviews existing techniques, Section 3 describes the proposed algorithm, Section 4 supports the claims with a qualitative analysis, Section 5 illustrates the test cases and results, and Section 6 closes with related discussion and conclusions.

2. Literature Survey of Corner Detectors

The simplest corner detector is the Moravec detector (1), which computes the sum-of-squared-differences (SSD) between a patch around a candidate corner and patches shifted a small distance in a number of directions; the corner response is the smallest such SSD. Harris (2) builds on this by computing an approximation to the second derivative of the SSD with respect to the shift. Lowe (3) obtained scale invariance by convolving the image with a Difference of Gaussians (DoG) kernel at multiple scales, retaining locations which are optimal in scale as well as in space. Smith and Brady's method, called SUSAN (4), is based on simple masking operations instead of gradient convolution; it has very good localization and noise robustness but involves a high computational cost. The method of Rosten and Drummond (5) operates by considering that an edge is a boundary between two regions and that a corner occurs where the edge changes direction suddenly; although this feature point detector is extremely fast, it is not robust to high levels of noise and is also susceptible to 1-pixel-thick straight lines. Mokhtarian and Suomela (6) described a method based on the curvature scale space (CSS) representation; this algorithm is highly accurate and robust to noise but has a very high time complexity. Many other variants of the above techniques are also reported in the literature.






3. Feature Point Detection Model

A one-dimensional change in intensity can be classified as an edge and can be approximated using the gradient of the image. Feature points, by contrast, are sudden changes in two dimensions, i.e., second-order properties of the intensity surface, which can be approximated with second-order derivatives. Since second-order derivatives are noise sensitive, the proposed feature point detector uses only first-order differences to approximate them. The aim of the proposed technique is to detect two-dimensional changes in intensity (corners) and all other points of interest in an image. To satisfy the universal criteria of localization, consistency, accuracy and noise immunity in a real-time environment, a short but effective five-step algorithm is proposed in the following subsections.

3.1. Apply the Difference Mask with Threshold Parameter P1

A true feature point can be defined as a sudden and significant two-dimensional change in intensity that remains invariant to affine transformations; such points are generally classified as corners (2; 6). A feature point is identified where multiple regions of different intensity meet and convey a significant information change in two dimensions. As the number of regions meeting at a point increases, the entropy at that point also increases; this measure is defined as the Information Content (IC) of the feature point, and such points are commonly referred to as interest points (4; 8). The proposed technique detects both corners and interest points as feature points. Unlike the usual convolution masks of sizes varying from 3x3 to 7x7, we propose a very simple 2x2 difference mask: the complex convolution is replaced with simple differences between pixel intensity values. The difference mask shown in Figure 1 gives rise to four orthogonal difference values, H1, H2, V1 and V2, defined as:

    H1 = |I(i,j) − I(i,j+1)|        (1)
    H2 = |I(i+1,j) − I(i+1,j+1)|    (2)
    V1 = |I(i,j) − I(i+1,j)|        (3)
    V2 = |I(i,j+1) − I(i+1,j+1)|    (4)

where I(i, j) is the pixel under consideration. The response of this operator on a set of ideal image templates is shown in Figure 2.

Figure 1. The 2x2 difference operator masks; (a) and (b) are at 90° to each other.

Figure 2. Response of the difference mask on typical templates: (a) horizontal edge, H1 = H2 = 0 and V1 = V2 = 90 − 40 = 50; (b) vertical edge, H1 = H2 = 90 − 40 = 50 and V1 = V2 = 0; (c) diagonal edge when P1 = 20; (d) corner, H1 = 0, H2 = 90 − 40 = 50, V1 = 0 and V2 = 90 − 40 = 50.


The response of the 2x2 difference operator can be classified as follows. If H1 = H2 = V1 = V2 = 0, the region is constant. If both H1 and H2 exceed P1, or both V1 and V2 exceed P1, the region contains only a one-dimensional change. If at least one of H1, H2 and at least one of V1, V2 exceed P1, the region contains feature points (detected as a two-dimensional change). Here P1 is the threshold value for feature point detection; it can be varied according to the features of interest. For example, P1 = 1 detects all intensity variations as feature points, while P1 = 255 detects only full step changes. It has been found experimentally that a threshold in the range 10-25 detects good feature points in most real images; for very low contrast images the threshold can be reduced accordingly. Since only the third category of responses is of interest for feature point detection, a pixel satisfying the third criterion is processed further to determine whether it is a true feature point.
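As an illustration, a minimal Python sketch of this three-way classification follows. The function name, the label strings and the default P1 = 15 (picked from the suggested 10-25 range) are our choices, not from the paper; the candidate test is checked first since it takes precedence over the one-dimensional case:

    def classify(I, i, j, P1=15):
        """Classify the 2x2 block whose top-left pixel is (i, j).
        I is a 2D numpy array of gray levels."""
        # Four orthogonal absolute differences, Eqs. (1)-(4);
        # int() guards against uint8 wrap-around.
        H1 = abs(int(I[i, j]) - int(I[i, j + 1]))
        H2 = abs(int(I[i + 1, j]) - int(I[i + 1, j + 1]))
        V1 = abs(int(I[i, j]) - int(I[i + 1, j]))
        V2 = abs(int(I[i, j + 1]) - int(I[i + 1, j + 1]))
        if (H1 > P1 or H2 > P1) and (V1 > P1 or V2 > P1):
            return 'candidate'   # 2D change: processed further in Sec. 3.2
        if (H1 > P1 and H2 > P1) or (V1 > P1 and V2 > P1):
            return 'edge'        # 1D change only
        return 'constant'        # no significant change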

3.2. Apply the Pseudo-Gaussian Mask to the Points Satisfying the First Difference Operator

In real-world scenarios we never come across images with perfectly defined edges or corners: not only will the feature points be blurred, but the image itself will contain several types of noise. To increase the noise immunity of our algorithm we apply a pseudo-Gaussian kernel, in which Gaussian smoothing is applied only around those pixels that are part of a region containing feature points. We propose a partial averaging kernel derived from a 5x5 Gaussian kernel with σ = 1.3, further modified and normalized so that all weights are powers of 2, which lets multiplications and divisions be carried out by simple bit shifts. Further, we take advantage of the nature of our difference kernel and apply only part of the full Gaussian kernel, as shown in Figure 3. This reduces the Gaussian averaging overhead by 75% while producing the desired effect on the set of 2x2 pixels under consideration. Each of the four central pixels (weight = 64) is averaged with the neighboring similarly colored pixels with their respective weights (shown in the top right corners).

Figure 3. Pseudo-Gaussian mask.

The new Gaussian-averaged values of the four pixels under consideration are calculated as:

    I'(i,j) = Gk ∗ I(i,j)            (5)
    I'(i,j+1) = Gl ∗ I(i,j+1)        (6)
    I'(i+1,j) = Gm ∗ I(i+1,j)        (7)
    I'(i+1,j+1) = Gn ∗ I(i+1,j+1)    (8)

where Gk, Gl, Gm and Gn are the partial masks shown in Figure 3. To compute the second-order derivative, the difference operator is re-applied to the new Gaussian-averaged values of the four pixels under consideration, but this time a different threshold P2 is used to compare the result. The second threshold parameter avoids missing weak feature points in the presence of noise: a low P1 detects most of the interest points, while a higher or variable P2 gives the user control over noise interfering with the desired feature points. The pseudo-Gaussian kernel is significant in many ways and is one of the principal parts of the proposed technique. Since the base of the kernel is Gaussian, it performs the intended task of averaging the current pixel using a weighted sum of its neighboring pixels; each of the four pixels under consideration is averaged by the pixels in its own direction, away from the center of the block. Consider a set of pixels where four different gray-scale regions meet: as the value of each pixel is averaged with pixels of the same region, the difference mask in the next step responds very strongly to this corner. This is unlike applying a standard kernel, wherein the pixel is averaged with all neighboring pixels, which may lead to a very low response or none at all if the nearby regions are close in gray-scale value. The effect of our Gaussian kernel is thus to amplify the change in intensity at the meeting point of regions, in contrast to smoothing it. The central tendency is found by using the difference of the Gaussians (DoG) on each row and column to obtain zero crossings in both orthogonal directions, which approximates the position of the feature point. Figure 4 illustrates the spatial plots of the Gaussian functions for two nearby pixels.

Figure 4. Masks approximating (a) Gk with its plot and (b) Gl with its plot; (c), (d) plots of the sums of all the weights of the partial masks Gk and Gl; (e) zero crossing: difference of the two Gaussians Gk and Gl.
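To make the directional averaging concrete, the sketch below applies a hypothetical partial mask in the spirit of Figure 3. The actual weights of Figure 3 are not reproduced in this text, so the offsets and weights here are illustrative stand-ins that only preserve the stated properties: every weight is a power of two, the central weight is 64, and each of the four masks averages its pixel only in the direction away from the block center. The mask and function names are ours.

    # Hypothetical partial mask G_k for the top-left pixel I(i, j):
    # (row_offset, col_offset) -> power-of-two weight.  All offsets
    # point up and to the left, away from the centre of the 2x2 block.
    # The weights sum to 128, so normalisation is a 7-bit right shift.
    GK = {(0, 0): 64, (0, -1): 16, (-1, 0): 16, (-1, -1): 8,
          (0, -2): 8, (-2, 0): 8, (-1, -2): 4, (-2, -1): 4}

    def mirror(mask, flip_rows=False, flip_cols=False):
        """Derive the other three partial masks by mirroring G_k."""
        return {(-r if flip_rows else r, -c if flip_cols else c): w
                for (r, c), w in mask.items()}

    GL = mirror(GK, flip_cols=True)                    # top-right pixel
    GM = mirror(GK, flip_rows=True)                    # bottom-left pixel
    GN = mirror(GK, flip_rows=True, flip_cols=True)    # bottom-right pixel

    def pseudo_gaussian(I, i, j, mask):
        """Shift-friendly weighted average for an interior pixel (i, j)
        of a 2D numpy image; no multiplies or divides are needed in a
        fixed-point implementation since all weights are powers of two."""
        acc = sum(w * int(I[i + r, j + c]) for (r, c), w in mask.items())
        return acc >> 7    # divide by the total weight, 128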

3.3. Compute Information Content of the Feature Point

The Information Content (IC) is a measure of the distinctiveness of a feature point: the more distinct or random the feature point, the higher its entropy. For every pixel that satisfies the second difference operator, we add 12 to its IC, 7 to the IC of each of its 4-connected neighbors, and 5 to the IC of each of its diagonally connected neighbors. This yields an averaged and normalized IC whose maximum value is 60. Any multiple of 12, 7 and 5 can be used.
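A direct transcription of this accumulation is sketched below; the function name and the boolean-mask input format are our choices:

    import numpy as np

    def information_content(candidates):
        """IC map from a boolean mask of pixels that passed the second
        difference operator.  Each candidate scatters 12 to itself, 7 to
        each 4-connected neighbour and 5 to each diagonal neighbour, so a
        candidate whose eight neighbours are all candidates reaches the
        maximum IC of 12 + 4*7 + 4*5 = 60."""
        h, w = candidates.shape
        ic = np.zeros((h, w), dtype=np.int32)
        weights = {(0, 0): 12,
                   (-1, 0): 7, (1, 0): 7, (0, -1): 7, (0, 1): 7,
                   (-1, -1): 5, (-1, 1): 5, (1, -1): 5, (1, 1): 5}
        for i, j in zip(*np.nonzero(candidates)):
            for (di, dj), wgt in weights.items():
                ii, jj = i + di, j + dj
                if 0 <= ii < h and 0 <= jj < w:
                    ic[ii, jj] += wgt
        return ic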

As discussed in Section 3.1, the orthogonal difference masks respond strongly to all diagonal edges, producing false positive feature points. To eliminate these, we remove all candidate pixels that are part of a diagonal edge, i.e., pixels that are only 2-pixel connected in the direction of that diagonal. The effect of this false positive removal on some junctions is shown in Figure 5; the pixels shown in red are retained.

Figure 5. False positive candidates (gray) to be eliminated.

To accurately localize the detected feature point, we compare the IC of the current pixel with the IC of the neighboring nxn pixels; experiments have shown that n = 4 gives very good localization. Since the maximum IC of any pixel is 60, all points with IC > 52 in the 4x4 region are the most prominent feature points (shown in red in the test results), where at least 6 pixels from different regions contribute to the feature point. Similarly, for points with 44 < IC < 52 (shown in blue) at least 5 pixels from different regions contribute; for 32 < IC < 44 (shown in green), at least 4 pixels; and for 22 < IC < 32 (shown in white), at least 3 pixels. Figure 6 shows the output of the algorithm on one of the test images at intermediate stages.
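The thresholds above translate directly into code. Reading the nxn comparison as non-maximum suppression is our interpretation, as are the function names:

    def prominence(ic_value):
        """Map an IC value to the colour-coded prominence classes used
        in the result figures."""
        if ic_value > 52:
            return 'red'      # at least 6 contributing pixels
        if ic_value > 44:
            return 'blue'     # at least 5 contributing pixels
        if ic_value > 32:
            return 'green'    # at least 4 contributing pixels
        if ic_value > 22:
            return 'white'    # at least 3 contributing pixels
        return None           # not reported as a feature point

    def is_localized(ic, i, j, n=4):
        """Keep (i, j) only if its IC is maximal in the surrounding
        n x n window (n = 4 gave the best localization)."""
        r0, c0 = max(i - n // 2, 0), max(j - n // 2, 0)
        return ic[i, j] >= ic[r0:i + n // 2 + 1, c0:j + n // 2 + 1].max()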

Figure 6. (a) Original image; (b) initial difference operator response; (c) after removal of false positives and localization of feature points.


4. Qualitative Analysis of the Proposed Algorithm

Mohanna and Mokhtarian (7) evaluated the performance of feature detectors by warping test images in an affine manner by a known amount. They define two metrics, consistency (CCN) and accuracy (ACU). Consistency is defined as

    CCN = 100 ∗ 1.1^(−|nw − no|)

where nw is the number of features in the warped image and no is the number of features in the original image. Accuracy is defined as

    ACU = 100 ∗ ((nm/no) + (nm/ng)) / 2

where ng is the number of 'ground truth' corners marked by humans and nm is the number of detected corners matched against the ground truth. This unfortunately relies on subjectively made decisions. Schmid (8) proposes repeatability and information content as the two criteria for qualitative evaluation: repeatability is the ratio of repeated features to detected features, and information content measures the distinctiveness of the local gray-level pattern at an interest point. According to Mohanna (7), repeatability and information content are nothing but consistency and accuracy combined. We quantitatively evaluated our algorithm on these two metrics in the tests below.
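In code the two metrics are one-liners; the example counts in the comment are invented for illustration:

    def ccn(n_w, n_o):
        """Consistency: CCN = 100 * 1.1^(-|n_w - n_o|)."""
        return 100 * 1.1 ** (-abs(n_w - n_o))

    def acu(n_m, n_o, n_g):
        """Accuracy: ACU = 100 * (n_m/n_o + n_m/n_g) / 2."""
        return 100 * (n_m / n_o + n_m / n_g) / 2

    # 47 features on the warped image vs. 50 on the original gives
    # CCN = 100 * 1.1^-3, about 75.1; matching 48 of 50 detections
    # against 52 hand-marked corners gives ACU of about 94.2.
    print(ccn(47, 50), acu(48, 50, 52))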

5. Test Results

The algorithm was tested on a variety of images for critical evaluation and compared with some of the existing techniques, namely (4), (5) and (6). Due to page limitations only some instances of the test results are shown and compared with (5), but a complete quantitative analysis is given in Table 1. The test images cover the various gradual changes in intensity expected in real images, and the following categories were chosen for an unbiased evaluation.

Test 1: A very low contrast image with subtle gradient changes, consisting of straight-edge junctions, curved boundaries and diffused edges, with junctions where multiple regions meet. Such a test image conclusively demonstrates the accuracy and robustness of our algorithm: a threshold value of P1 = 6 detects all the significant feature points in this low contrast image, as shown in Figure 7.

Test 2: A test image with highly diffused curved region boundaries (the diffusion at some places extends up to 6-8 pixels). The algorithm was tested for consistency and accuracy under various affine transformations: the image was rotated to 90° in increments of 5°; both uniform and non-uniform scaling were tested, with images skewed (non-uniform scaling), scaled horizontally, and uniformly scaled to 150% of the original size in 5% increments; and, as a 3-D projection, the image was projected onto a sphere and tested.

Figure 7. Results: (a) our algorithm; (b) Rosten's algorithm.

Figure 9. Comparison of average CCN; images rotated up to 90° in increments of 10°.

Figure 10. Comparison of average CCN; images scaled to 150% in increments of 5%.

Figure 8. (a) Original image; (b) skewed by 15° (non-uniform scaling); (c) projected onto a sphere (perspective deformation); (d) uniformly scaled to 125%; (e) rotated by 30° and scaled horizontally by 50%.

Figure 8 illustrates one instance of such affine transformations on a real-world image. Figures 9 and 10 depict the comparison of the various algorithms when average consistency (CCN) numbers were computed under rotation and under scaling/skew deformations of the various test images.

Test 3: Tests were also performed to analyze the localization capability of the proposed algorithm on diffused image boundaries and to test its resistance to deformations such as Gaussian blur, disk blur and motion blur.

The results on a motion-blurred image, with shift parameters ranging from 5 to 15 pixels in the direction of 45° as depicted in Figure 11, show that the algorithm is robust to motion blur of up to 15 pixels, unlike Rosten's algorithm, which fails to detect features even at a 10-pixel motion shift.

Figure 11. (a) Our algorithm; (b) Rosten's algorithm.

Test 4: Tests were also performed to analyze the repeatability and stability of the proposed algorithm on noisy images with Gaussian, Poisson, salt-and-pepper and speckle noise. Figure 12 illustrates the results of our algorithm on one of the test images with artificially induced noise.

Figure 12. (a) Gaussian (variance = 0.004); (b) Poisson; (c) salt-and-pepper (density = 0.004); (d) speckle (variance = 0.004).

6. Performance Evaluation and Conclusions

Figures 7 to 12 each depict one instance of the test results. Table 1 summarizes the mean CCN and ACU numbers for the various transformations discussed in Section 5. In most cases the CCN and ACU numbers are high, with very good results in the cases of noise and perspective deformation. We also compare the complexity of our algorithm with some of the most popular existing feature point detectors: our algorithm incurs at most 28 subtractions (additions) at any pixel, and in practice closer to 22, since in general only about 80% of the pixels in an image are candidate points that pass the first step of the algorithm. No multiplication or division operations are involved, making it one of the most suitable contenders for real-time implementations, as shown in Table 2.

The proposed algorithm is an excellent contender for hardware implementations and embedded systems alike. Comparing code sizes with Rosten's (5), our implementation is about 5 KB (100 lines of code) against Rosten's 145 KB (3500 lines of code), so the proposed algorithm will find direct use in embedded systems with a very low memory footprint. Moreover, from the point of view of hardware implementation, the algorithm is easily parallelizable: every pixel operation depends only on the surrounding 5x5 pixels and is independent of the rest of the image, so coarse-grained parallelization is straightforward, and the five steps of the proposed technique can be pipelined to obtain higher throughput.

    Algorithm        Rotation   Uniform Scaling   Non-Uniform Deformation   Noise
    CSS                 32            35                    28                14
    Susan               24            28                    31                 9
    Rosten              72            89                    75                37
    Our Algorithm       71            86                    84                58

Table 1. Performance comparison based on consistency (CCN, in %).


    Algorithm        Accuracy %   Additions   Multiplications   Operations/Pixel
    CSS                 49.6          95             22                117
    Susan               53.4          32.25           0.75              33
    Rosten              70            42              0                 42
    Our Algorithm       70            28              0                 28

Table 2. Performance comparison based on accuracy measure and per-pixel complexity.

References

[1] H. P. Moravec. Towards automatic visual obstacle avoidance. In Proceedings of the 5th International Joint Conference on Artificial Intelligence, page 584, 1977.
[2] C. Harris and M. Stephens. A combined corner and edge detector. In Proceedings of the 4th Alvey Vision Conference, Manchester, pages 147-151, 1988.
[3] D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91-110, 2004.
[4] S. M. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. International Journal of Computer Vision, 23(1):45-78, 1997.
[5] E. Rosten and T. Drummond. Machine learning for high-speed corner detection. In ECCV, pages 211-224, 2006.
[6] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(12):1376-1381, 1998.
[7] F. Mohanna and F. Mokhtarian. Performance evaluation of corner detectors using consistency and accuracy measures. Computer Vision and Image Understanding, 102(1):81-94, 2006.
[8] C. Schmid, R. Mohr and C. Bauckhage. Comparing and evaluating interest points. In Proceedings of the International Conference on Computer Vision, 1998.


