Bias Characteristics of Bilinear Interpolation Based Registration



Donald G Bailey, Andrew Gilman, Roger Browne
Institute of Information Sciences and Technology, Massey University, Palmerston North, New Zealand
[email protected], [email protected], [email protected]

Abstract—Image registration has application in many image processing tasks. A new gradient-based registration method that uses the bilinear interpolation equation to predict the pixel values of one image from those in a reference image is developed and its bias characteristics analysed. For sinusoidal signals and step edges, the estimated offset is biased towards the nearest reference pixel; however, for noise signals the bias is away from the nearest pixel. Therefore the bias (and registration accuracy) can actually improve with the addition of noise. This phenomenon is demonstrated with two sample images with quite different image statistics.

Index Terms—sub-pixel, super-resolution, imaging model

I. INTRODUCTION

Image registration is an important step in many image processing applications. While pixel accurate registration is adequate for many applications, many techniques can benefit from registration to sub-pixel accuracy. These include:

• Image fusion, particularly super-resolution techniques, where multiple offset images are combined to give a single high resolution image [1]. The first step in combining the data from the multiple images is to estimate the offset between the input images to sub-pixel accuracy.

• Video compression, where motion compensation is used to predict the values within one frame from those within a previous frame. In general, a more accurate estimate of the inter-frame motion will give a smaller residue, and hence a better compression ratio [2].

• Image fusion, where images taken from different sensors or locations are combined together after registration [3]. This includes stereo imaging, where the depth of objects is inferred from their disparity between images captured from different viewpoints. The depth resolution is dependent on the accuracy of the disparity estimation, so sub-pixel registration will allow improved depth resolution.

• Image stitching, where multiple images are mosaiced (for example in creating a panorama image [4]). Improved registration within the overlap region will reduce artefacts caused by merging the data within the overlap.

• Motion detection using optical flow [5], where object motion parameters are inferred from differences in the position of the object in successive frames.




• Image stabilisation, where the camera or sensor is on a vibrating platform. Registration of successive frames allows compensation of the platform motion.

A. Registration

Given two images (or subimages in some applications), f and g, which differ only by a translation (rotation and scaling are not considered in this paper), registration involves estimating the offset between the two images. The general approach is to designate one of the images as the reference, and measure the offset of the other image relative to this reference. In the discussion here, it is assumed that all of the images have been pre-registered to the nearest low-resolution pixel (using any of the methods described in [6] or [7]), and this paper investigates the accuracy with which the sub-pixel offset may be estimated. Let f(x,y) represent the pixel values for the reference image. The image g(x,y) is offset from f by (u,v) such that

$$f(x+u, y+v) = f_{uv} = g(x, y) \tag{1}$$

The registration problem then is, given f and g, to estimate u and v, where 0 < u,v < 1.

B. Aliasing issues

Most analyses of image processing operations, including registration, assume that the Nyquist sampling criterion has been met, and that the images are not subject to aliasing. In practice, many images of real world objects contain some degree of aliasing by virtue of the fact that objects have sharp edges or boundaries. Natural objects in particular have detail at a wide range of scales, so will contain energy over a broad bandwidth. The only bandwidth limiting elements within image capture systems are the optical transfer function of the lens, and the area sampling performed by the sensor. Therefore some degree of aliasing is inevitable. In fact some applications, such as super-resolution, require that the input images be aliased in order to obtain more information from the image ensemble than is available within any single image. Consequently, any sub-pixel image registration technique should be relatively insensitive to aliasing.



C. Gradient based registration

A wide range of registration methods have been proposed. The key requirements are that the registration method must be capable of sub-pixel accuracy, and maintain that accuracy in the presence of aliasing and a reasonable level of noise. A popular class of registration methods estimates the offset based on derivatives of image intensity [5]. Expanding (1) about f(x,y) using a 2-dimensional Taylor series gives:

$$f_{uv} = f_{00} + u\frac{\partial f}{\partial x} + v\frac{\partial f}{\partial y} + \frac{1}{2!}\left(u^2\frac{\partial^2 f}{\partial x^2} + 2uv\frac{\partial^2 f}{\partial x\,\partial y} + v^2\frac{\partial^2 f}{\partial y^2}\right) + \ldots \tag{2}$$

For small offsets, (2) can be truncated to keep only the first order terms:

$$f_{uv} \approx f_{00} + u\frac{\partial f}{\partial x} + v\frac{\partial f}{\partial y} \tag{3}$$

It is assumed that the high-order terms are negligible, and contribute to noise in the fit. This corresponds to fitting a planar patch to each pixel in the reference image, and using that to estimate the offset. A least squares fit may be used to obtain the values of u and v that minimise the error. Since the images are sampled, and the derivatives are not known, the gradients must be estimated from the available pixel values through the use of digital filters. Different approaches for estimating the gradients will give different approximations. The simplest approximation may be obtained from the first difference. Unfortunately, with a small window the derivative of the underlying continuous signal can only be estimated with limited accuracy. This may be improved by using matched low pass and derivative filters, such that the derivative filter closely approximates the derivative of the low pass filter [9].
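As an illustration of this approach, the following is a minimal sketch (not from the paper) of a first-order gradient-based estimator, assuming the convention that u is the offset along the x (column) axis and v along the y (row) axis, and using simple central differences as the derivative filter; the function name is illustrative only.

```python
import numpy as np

def gradient_register(f, g):
    """Least-squares estimate of (u, v) from the first-order model of eq. (3):
    g - f ~= u * df/dx + v * df/dy."""
    fy, fx = np.gradient(f.astype(float))   # derivatives along rows (y) and columns (x)
    d = g.astype(float) - f.astype(float)

    # Trim the border, where the central-difference estimates are one-sided.
    fx, fy, d = fx[1:-1, 1:-1], fy[1:-1, 1:-1], d[1:-1, 1:-1]

    # Normal equations of the 2-parameter least-squares fit.
    A = np.array([[np.sum(fx * fx), np.sum(fx * fy)],
                  [np.sum(fx * fy), np.sum(fy * fy)]])
    b = np.array([np.sum(fx * d), np.sum(fy * d)])
    u, v = np.linalg.solve(A, b)
    return u, v
```

In practice the choice of derivative filter matters; matched low pass and derivative filters such as those of [9] would replace the central differences used here.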

II. PREDICTIVE INTERPOLATION

Predictive interpolation methods may be considered as a variation of the gradient-based methods described above. Rather than work with the Taylor series model of (2), predictive interpolation uses the neighbouring pixel values in the reference image to estimate the pixel values in the image being registered by using a linear predictor. The prediction parameters are then used to infer an offset based on an interpolation model. The effect of these additional pixels in performing the prediction is to add some of the higher order terms of (2). For example, consider bilinear interpolation ([1], [10]) of an image f within an integer spaced grid as shown in Fig. 1. The value of the interpolated point, $f_{uv}$, is given by

$$f_{uv} = (1-u)(1-v)f_{00} + (1-u)v f_{01} + u(1-v) f_{10} + uv f_{11} \tag{4}$$

Comparing (4) with (2) and (3) we see that (4) has added the uv twist term to the expansion. As the bilinear interpolation model is based on the four nearest pixels, the corresponding linear predictor will have four terms:

$$g_{00} = A_{00} f_{00} + A_{01} f_{01} + A_{10} f_{10} + A_{11} f_{11} \tag{5}$$

Figure 1. Interpolation on an integer grid. $f_{uv}$ is given by a linear combination of the nearest four pixel values $f_{00}$, $f_{01}$, $f_{10}$ and $f_{11}$.

where the $A_{ij}$ are the linear prediction coefficients. Further, from comparing (4) and (5), we want to apply the following constraint:

$$A_{00} + A_{01} + A_{10} + A_{11} = 1 \tag{6}$$

Substituting (6) into (5) gives:

$$g_{00} - f_{00} = A_{01}(f_{01} - f_{00}) + A_{10}(f_{10} - f_{00}) + A_{11}(f_{11} - f_{00}) \tag{7}$$

Least squares minimisation can then be used over the whole image to give the values of the coefficients that minimise the prediction error from f to g. If we define

$$\hat{g} = g_{00} - f_{00}, \qquad \hat{f}_{ij} = f_{ij} - f_{00} \tag{8}$$

then the total square error becomes:

$$\mathrm{Error} = \sum_{\text{pixels}} \left( A_{01}\hat{f}_{01} + A_{10}\hat{f}_{10} + A_{11}\hat{f}_{11} - \hat{g} \right)^2 \tag{9}$$

The coefficients $A_{ij}$ that minimise this error may be found by taking the partial derivative with respect to each of the coefficients and solving for when these derivatives are equal to 0. This gives a system of linear equations:

$$\begin{bmatrix} \sum \hat{f}_{01}^2 & \sum \hat{f}_{01}\hat{f}_{10} & \sum \hat{f}_{01}\hat{f}_{11} \\ \sum \hat{f}_{01}\hat{f}_{10} & \sum \hat{f}_{10}^2 & \sum \hat{f}_{10}\hat{f}_{11} \\ \sum \hat{f}_{01}\hat{f}_{11} & \sum \hat{f}_{10}\hat{f}_{11} & \sum \hat{f}_{11}^2 \end{bmatrix} \begin{bmatrix} A_{01} \\ A_{10} \\ A_{11} \end{bmatrix} = \begin{bmatrix} \sum \hat{f}_{01}\hat{g} \\ \sum \hat{f}_{10}\hat{g} \\ \sum \hat{f}_{11}\hat{g} \end{bmatrix} \tag{10}$$

which may then be solved for the prediction coefficients. By equating (4) and (5) we have

$$A_{01} = v - uv, \qquad A_{10} = u - uv, \qquad A_{11} = uv \tag{11}$$

which can be rearranged to give an estimate of the sub-pixel offset between the two images as

$$\hat{u} = A_{10} + A_{11}, \qquad \hat{v} = A_{01} + A_{11} \tag{12}$$

Therefore, by using the bilinear interpolation equation as a predictor, it is possible to obtain a direct estimate of the sub-pixel offset between two images. In a similar manner, this process may be readily extended to other interpolation models, such as biquadratic, bicubic, and cubic spline interpolation.
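The following is a minimal sketch (not from the paper) of this bilinear predictive registration, assuming the images are stored as 2-D arrays with rows indexing y and columns indexing x, so that u is the column offset and v the row offset; the function name is illustrative only.

```python
import numpy as np

def bilinear_register(f, g):
    """Estimate the sub-pixel offset (u, v) of g relative to f using the
    bilinear prediction of eqs. (7)-(12); f and g are assumed to be
    pre-registered to the nearest pixel."""
    f = f.astype(float)
    g = g.astype(float)

    # Differences of eq. (8): f01 is the neighbour one row down (v direction),
    # f10 one column right (u direction), f11 the diagonal neighbour.
    f00 = f[:-1, :-1]
    f01 = f[1:, :-1] - f00
    f10 = f[:-1, 1:] - f00
    f11 = f[1:, 1:] - f00
    gh = g[:-1, :-1] - f00          # ghat = g00 - f00

    # Normal equations of eq. (10), solved for the prediction coefficients.
    F = np.stack([f01.ravel(), f10.ravel(), f11.ravel()], axis=1)
    A01, A10, A11 = np.linalg.solve(F.T @ F, F.T @ gh.ravel())

    # Offset estimates of eq. (12).
    return A10 + A11, A01 + A11
```

A fit over the whole image, as here, corresponds to the global translation case considered in the paper; the same normal equations could equally be accumulated over a smaller subimage.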

III. BIAS ANALYSIS

In the analysis here, only linear interpolation will be considered. Bilinear interpolation may be considered as linear interpolation in the x direction, followed by linear interpolation in the y direction (or vice versa). For simplicity of analysis, only one-dimensional signals will be considered here, although the bias mechanisms identified will be confirmed in Section IV with two-dimensional images.

A. Imaging Model

In one dimension, (7) becomes

$$g_0 - f_0 = A_1 (f_1 - f_0) \tag{13}$$

with the least squares estimate of A1 being

$$\hat{u} = A_1 = \frac{\sum (f_1 - f_0)(g_0 - f_0)}{\sum (f_1 - f_0)^2} \tag{14}$$

The least squares fit effectively weights the proportion of the distance that $g_0$ lies between $f_0$ and $f_1$ by the local gradient.
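A minimal sketch (not from the paper) of this one-dimensional estimator, with an illustrative function name:

```python
import numpy as np

def estimate_offset_1d(f, g):
    """One-dimensional predictive estimate of the offset, eq. (14):
    u_hat = sum((f1 - f0)(g0 - f0)) / sum((f1 - f0)^2)."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    df = f[1:] - f[:-1]      # f1 - f0 at each sample position
    dg = g[:-1] - f[:-1]     # g0 - f0 at each sample position
    return np.sum(df * dg) / np.sum(df * df)
```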

B. Bias for sinusoids

Consider a pair of one-dimensional sinusoidal signals

$$f(x) = \sin\omega x, \qquad g(x) = \sin\omega(x+u) \tag{15}$$

Substituting (15) into (14) gives

$$\hat{u} = \frac{\sum_x \big(\sin\omega(x+1) - \sin\omega x\big)\big(\sin\omega(x+u) - \sin\omega x\big)}{\sum_x \big(\sin\omega(x+1) - \sin\omega x\big)^2} \tag{16}$$

Unless the sinusoid frequency, ω, is harmonically related to the sample frequency, the summation will range over all possible angles. Therefore the expected value of û may be obtained by averaging over all possible phase angles. This may be accomplished by replacing the summation in (16) with integration over a single period:

$$E(\hat{u}) = \frac{\int_{-\pi/\omega}^{\pi/\omega} \big(\sin\omega(x+1) - \sin\omega x\big)\big(\sin\omega(x+u) - \sin\omega x\big)\,dx}{\int_{-\pi/\omega}^{\pi/\omega} \big(\sin\omega(x+1) - \sin\omega x\big)^2\,dx} = \frac{\cos\omega(1-u) - \cos\omega u + 1 - \cos\omega}{2(1 - \cos\omega)} \tag{17}$$

Any deviation between E(û) and u represents a bias in the estimate. Clearly, from (17) the bias is frequency dependent, and there is a bias present even when the signals obey the Nyquist criterion. The bias is also a function of the offset between the images. Fig. 2 shows the bias as a function of offset for a sinusoid at half the Nyquist frequency (i.e. ω = π/2). The negative bias for offsets less than 0.5 pixel and positive bias for offsets greater than 0.5 pixel indicate that the bias is towards the nearest pixel, and away from the centre. The bias pattern remains much the same shape for different frequency signals, although the amplitude of the error is larger at higher frequencies. As the offset is not known in advance (otherwise registration is unnecessary), the RMS error over all possible offsets in the range 0 to 1 provides an estimate of the average error. Fig. 3 shows the error as a function of the signal frequency. The bias steadily increases to 0.075 pixel at the Nyquist frequency. For higher frequencies, the error increases more rapidly, giving meaningless estimates of the offset as the signal frequency approaches the sample frequency.
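The closed form of (17) is easy to check numerically; the sketch below (an illustration, not from the paper) accumulates the numerator and denominator of (16) over uniformly distributed phase angles, which is equivalent to the integrals of (17), and compares the resulting bias with the closed form.

```python
import numpy as np

def sinusoid_bias(omega, u, n_phase=2000, n_samples=64, seed=0):
    """Bias of the 1-D estimator for a sinusoid: the estimator of eq. (16)
    averaged over phase, against the closed form of eq. (17)."""
    rng = np.random.default_rng(seed)
    x = np.arange(n_samples)
    num = den = 0.0
    for phi in rng.uniform(0.0, 2.0 * np.pi, n_phase):
        f0 = np.sin(omega * x + phi)           # f(x)
        f1 = np.sin(omega * (x + 1) + phi)     # f(x+1)
        g0 = np.sin(omega * (x + u) + phi)     # g(x) = f(x+u)
        num += np.sum((f1 - f0) * (g0 - f0))
        den += np.sum((f1 - f0) ** 2)
    closed = (np.cos(omega * (1 - u)) - np.cos(omega * u)
              + 1.0 - np.cos(omega)) / (2.0 * (1.0 - np.cos(omega)))
    return num / den - u, closed - u           # (simulated bias, predicted bias)

# Half the Nyquist frequency (omega = pi/2) at an offset of 0.2 pixels.
print(sinusoid_bias(np.pi / 2, 0.2))
```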

Figure 2. Bias in estimating the offset between two sinusoids (ω = π/2), as a function of the actual offset.

Figure 3. Frequency dependence of bias. Aliasing results in significant bias.

Although this bias pattern would indicate that this method is likely to perform poorly with aliased signals, in practice the situation is a lot more complex. Images seldom consist of a single sinusoid, and even when there are high frequency components (above the Nyquist frequency) the low frequencies are still dominant. The bias mechanism is not linear, so it is not simply a case of adding the bias from each of the frequency components to get the overall bias. In an image with dominant low frequency components, the overall bias will be dominated by the low frequency component, although the higher frequencies would be expected to affect the bias. It is important to understand the bias mechanism, to appreciate what is going to happen in a more complex case. Fig. 4 shows a sinusoid sampled at half the Nyquist frequency, and the corresponding linear interpolation. While overall the interpolation errors cancel out, for a particular offset they do not. Since the sinusoid is curved, the linear interpolation provides a poor approximation. The deviation from linear is greater near the peaks than closer to zero. This asymmetry results in small positive errors and larger negative errors for offsets close to 0, and small negative errors and larger positive errors for offsets close to 1. Therefore the bias is towards 0 for small offsets and towards 1 for larger offsets. Note that Fig. 4 shows only one phase relationship between the samples and the underlying sinusoid. The true bias will be obtained from averaging over all possible phase relationships.

Since the bias is dependent on the errors between the linear approximation and the actual signal, the amplitude of the errors will increase as the linear approximation deviates more from the actual curve. Therefore the errors, and the bias, will increase as the frequency of the sinusoid increases.


Figure 4. Bias mechanism in a sinusoid at half the Nyquist frequency. Positive and negative bias components do not cancel out. Samples offset by 0.2 and 0.7 pixels are shown.

Figure 5. A step edge blurred by area sampling. The points labelled f are reference image samples, and the points labelled g are from the other image, offset by u.

C. Bias for step edges

In the previous section the point was made that real images do not consist of pure sinusoids. A more realistic model to consider therefore is that of a step edge, since many images can be approximated by piecewise constant regions (with step edges in between). In the image, the step edge is blurred to a single pixel wide ramp by the area sampling within the pixels. This is equivalent to the edge pixels taking on an intermediate value depending on the exact position of the edge relative to the pixels. Without loss of generality, consider a single step edge of height 1, as illustrated in Fig. 5. There will always be one sample pixel within the blurred region, and one pixel on either side. Again the image being registered is offset from the reference by u. The intermediate pixel values are given by

$$f_{-1} = 0, \qquad f_0 = \tfrac{1}{2} - e, \qquad f_1 = 1$$

$$g_{-1} = \begin{cases} 0 & u < e + \tfrac{1}{2} \\ u - e - \tfrac{1}{2} & u > e + \tfrac{1}{2} \end{cases} \qquad g_0 = \begin{cases} u - e + \tfrac{1}{2} & u < e + \tfrac{1}{2} \\ 1 & u > e + \tfrac{1}{2} \end{cases} \tag{18}$$

From (14) the prediction of the offset becomes

$$\hat{u} = \frac{(g_{-1} - f_{-1})(f_0 - f_{-1}) + (g_0 - f_0)(f_1 - f_0)}{(f_0 - f_{-1})^2 + (f_1 - f_0)^2} \tag{19}$$

In general, the location of the edge will be random relative to the pixel grid. Therefore to obtain the expected value for the offset, we average over all possible edge positions relative to the sample positions:

$$E(\hat{u}) = \frac{\int_{-0.5}^{0.5} \left[ (g_{-1} - f_{-1})(f_0 - f_{-1}) + (g_0 - f_0)(f_1 - f_0) \right] de}{\int_{-0.5}^{0.5} \left[ (f_0 - f_{-1})^2 + (f_1 - f_0)^2 \right] de} \tag{20}$$

Substituting (18) into (20) gives the expected offset:

$$E(\hat{u}) = \frac{\int_{-0.5}^{u-0.5} \left[ (u - e - \tfrac{1}{2})(\tfrac{1}{2} - e) + (\tfrac{1}{2} + e)^2 \right] de + \int_{u-0.5}^{0.5} u(\tfrac{1}{2} + e)\, de}{\int_{-0.5}^{0.5} \left[ (\tfrac{1}{2} - e)^2 + (\tfrac{1}{2} + e)^2 \right] de} = \tfrac{3}{4}u + \tfrac{3}{4}u^2 - \tfrac{1}{2}u^3 \tag{21}$$

The bias is again the difference between the expected offset and the actual offset, and is plotted in Fig. 6. The bias pattern is very similar to that for sinusoids, with the bias towards the position of the reference samples, and results from asymmetry in the linear interpolation slopes on each side of the edge.
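The closed form of (21) can likewise be checked by pooling the estimator of (19) over a large number of random edge positions, as in the following sketch (an illustration, not from the paper):

```python
import numpy as np

def step_edge_bias(u, n_edges=200000, seed=0):
    """Bias of the 1-D estimator for a blurred unit step edge: the ratio of the
    pooled numerator and denominator of eq. (19) over random edge positions
    approximates eq. (20), and is compared with the closed form of eq. (21)."""
    rng = np.random.default_rng(seed)
    e = rng.uniform(-0.5, 0.5, n_edges)
    f_m1, f0, f1 = 0.0, 0.5 - e, 1.0                        # samples of eq. (18)
    g_m1 = np.where(u < e + 0.5, 0.0, u - e - 0.5)
    g0 = np.where(u < e + 0.5, u - e + 0.5, 1.0)
    num = (g_m1 - f_m1) * (f0 - f_m1) + (g0 - f0) * (f1 - f0)
    den = (f0 - f_m1) ** 2 + (f1 - f0) ** 2
    closed = 0.75 * u + 0.75 * u**2 - 0.5 * u**3            # eq. (21)
    return np.sum(num) / np.sum(den) - u, closed - u

print(step_edge_bias(0.2))
```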

Figure 6. Bias characteristics for a blurred step edge, when the edge position is random relative to the sample grid.

In deriving the bias, the implicit assumption is that there are many edges within the image and that the position of the edges relative to the sample pixels is uniformly distributed. For images with a large number of edges at random orientations, this is likely to be a valid assumption. However, if the edge position distribution is not uniform, there is likely to be additional bias introduced.


D. Effects of noise

The previous analysis assumed that there is no noise in the image. In practice, all imaging systems introduce noise from a range of sources: imaging (shot) noise, amplifier noise, quantisation noise, and so on. In the analysis here, we will consider additive Gaussian noise with zero mean and standard deviation σ. Let $r_{fi}$ and $r_{gi}$ be independent random variables of this kind added to $f_i$ and $g_i$ respectively; then (14) becomes

$$\hat{u} = \frac{\sum (f_1 - f_0 + r_{f1} - r_{f0})(g_0 - f_0 + r_{g0} - r_{f0})}{\sum (f_1 - f_0 + r_{f1} - r_{f0})^2} \tag{22}$$

In considering the expected offset, many of the product terms have an expected value of 0 because they involve the summation of a zero-mean random variable scaled by an independent term. Therefore

$$E(\hat{u}) = \frac{\sum \left[ (f_1 - f_0)(g_0 - f_0) + \sigma^2 \right]}{\sum \left[ (f_1 - f_0)^2 + 2\sigma^2 \right]} \tag{23}$$

The effect of noise therefore is to add an additional term to both the numerator and denominator that depends on the noise level. For low or no noise, the bias is dominated by the signal, but when σ is large, the noise dominates and the expected estimate of the offset is 0.5. This means that for images dominated by noise, the bias is away from the nearest pixel and towards the centre between the pixels. Performing the analysis for the step edge, (21) becomes

$$E(\hat{u}) = \frac{\tfrac{3}{4}u + \tfrac{3}{4}u^2 - \tfrac{1}{2}u^3 + 3\sigma^2}{1 + 6\sigma^2} \tag{24}$$

This is plotted in Fig. 7 for a range of noise levels. As the bias introduced by the noise is in the opposite direction to that for the deterministic signal, at intermediate levels of noise the bias is partially cancelled. This is clearly illustrated in Fig. 8.
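A short calculation (a sketch, not from the paper) makes the trade-off concrete: evaluating (24) over all offsets for a range of noise levels locates the noise level at which the RMS bias of the step-edge model is smallest.

```python
import numpy as np

def rms_bias_step_edge(sigma, n_offsets=101):
    """RMS bias over offsets u in [0, 1] for the noisy step-edge model, eq. (24)."""
    u = np.linspace(0.0, 1.0, n_offsets)
    expected = (0.75 * u + 0.75 * u**2 - 0.5 * u**3 + 3.0 * sigma**2) / (1.0 + 6.0 * sigma**2)
    return np.sqrt(np.mean((expected - u) ** 2))

sigmas = np.linspace(0.0, 0.2, 81)
rms = np.array([rms_bias_step_edge(s) for s in sigmas])
print("minimum RMS bias %.4f at sigma = %.3f" % (rms.min(), sigmas[rms.argmin()]))
```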

Figure 7. Bias as a function of offset and noise for a blurred step edge. Noise levels plotted are σ = 0 to 0.2 in steps of 0.025. At low noise the bias is towards the nearest pixel, and at higher noise levels the bias is away.

Figure 8. RMS bias as a function of noise level. For σ < 0.1, the bias is dominated by the signal, and for σ > 0.1 the bias is dominated by the noise.

In practice, the noise level for minimum bias will depend on the number and amplitude of edges within the image. The signal contributions will be 0 away from edges (where $f_1 - f_0 = 0$), while there will still be a contribution of $\sigma^2$ from such pixels. If we consider sinusoidal signals, for a given amplitude the difference between adjacent pixel values will be smaller at lower frequencies, so the lower frequencies are more likely to be affected by noise.

IV. EXPERIMENTAL CONFIRMATION

This section tests the validity of the 1-D analysis in section III on real 2-D images.

Figure 9. Sample images used. Low resolution, 128x128 images are shown. “Beach” is a typical scene dominated by low frequencies with some sharp edges resulting in a limited degree of aliasing. “Text” contains a lot of high frequency information with significant aliasing.

The downsampling procedure resulted in 100 low resolution images with different offsets. By considering each possible pair of images, a total of 10,000 registrations were performed for each noise level. For each of the 100 relative offsets, the mean and standard deviations of the 100 2-D estimates of the offset enabled the systematic and random components of the errors to be estimated.
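The capture simulation described above is straightforward to reproduce; the sketch below (an illustration, not the authors' code) assumes the box filter is the same size as the downsampling factor, so that a shift of k high-resolution pixels corresponds to k/factor low-resolution pixels.

```python
import numpy as np

def simulate_capture(hi_res, factor=10, shift=(0, 0), sigma=0.0, seed=0):
    """Simulate low-resolution capture: shift the high-resolution source by an
    integer number of high-resolution pixels, area-average over factor x factor
    blocks (box filter), then add zero-mean Gaussian noise."""
    rng = np.random.default_rng(seed)
    dy, dx = shift
    img = np.roll(hi_res.astype(float), (dy, dx), axis=(0, 1))  # wraps at the borders

    h, w = (s // factor * factor for s in img.shape)
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    low = blocks.mean(axis=(1, 3))

    return low + rng.normal(0.0, sigma, low.shape)

# Example: a pair of images differing by 0.3 low-resolution pixels in x.
# hi = ...  # high-resolution source image as a 2-D array
# f = simulate_capture(hi)
# g = simulate_capture(hi, shift=(0, 3), sigma=0.02)
```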


B. Experimental results

Fig. 10 shows the bias patterns for the "Beach" image, with no noise. There is no bias for pixels on the original sampling grid, or with an offset of (0.5,0.5). Elsewhere the bias is away from the centre, and towards the nearest pixel, as indicated by (17) and (21). When noise is added, the bias is towards the centre between the pixels, as shown in Fig. 11.

Figure 10. Bias map of "Beach" with no noise. The cross is the estimated offset (averaged over 100 images) and the line links this with the true offset. The bias is away from the centre, towards the nearest pixel.

Figure 11. Bias map of "Beach" with noise added (σ = 0.02). Noise biases the estimated offset away from the nearest pixel and towards the centre.

Fig. 12 shows both the mean error (bias) and the standard deviation (random component). The bias components for both images show the characteristic dip where the bias resulting from the noise partially cancels the bias inherent in the signal. Note that the errors for both images are dominated by the bias rather than the random components. This concurs with the finding of Bailey [10] that when applying the fit to a region greater than about 100x100 the bias dominates. The random component is expected to be more dominant when fitting to a smaller subimage.

Figure 12. Effect of noise on the RMS bias for both "Beach" and "Text". The solid line represents the bias (mean error), and the dashed line represents the random component (standard deviation) of the error.

The "Beach" image performs better at low noise, but deteriorates rapidly as noise is added. This is because this image is dominated by low frequencies, and has large, relatively constant regions, which will be more affected by noise. The "Text" image has more high contrast edges, and is therefore less affected by noise. This also explains why the minimum bias occurs at a higher noise level than with "Beach". At low noise, the significant aliasing results in a greater bias. As the bias becomes more dominated by the noise, the random component actually reduces slightly. This is because the noise-induced bias is more consistent than the signal bias when fitting over a 128x128 image.

V. SUMMARY

It is demonstrated that the bilinear interpolation equation may be used as a predictor from which a 2-dimensional translation may be estimated. Theoretical analysis shows that there is a bias towards the nearest pixel for sinusoids and step edges in the absence of noise. When noise is dominant, there is a bias away from the nearest pixel. This leads to an interesting effect where adding noise can actually reduce the bias and result in a more accurate estimate of the offset between two images.

REFERENCES

[1] Bailey, D. G. and Lill, T. H., "Image registration methods for resolution improvement," Proc. Image and Vision Computing New Zealand, pp. 91-96, Aug. 1999.
[2] Dane, G. and Nguyen, T. Q., "The effect of global motion parameter accuracies on the efficiency of video coding," Proc. IEEE International Conference on Image Processing, vol. 5, pp. 3359-3362, 2004.
[3] Sharma, R. and Pavel, M., "Multisensor image registration," Society for Information Display, vol. XXVIII, pp. 951-954, 1997.
[4] Chen, C. Y. and Klette, R., "Image stitching - comparisons and new techniques," Proc. 8th International Conference on Computer Analysis of Images and Patterns, Lecture Notes in Computer Science, vol. 1689, pp. 615-622, 1999.
[5] Barron, J. L., Fleet, D. J., Beauchemin, S. S., and Burkitt, T. A., "Performance of optical flow techniques," Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '92), pp. 236-242, 1992.
[6] Brown, L. G., "A survey of image registration techniques," ACM Computing Surveys, vol. 24, no. 4, pp. 325-376, Dec. 1992.
[7] Zitova, B. and Flusser, J., "Image registration methods: a survey," Image and Vision Computing, vol. 21, pp. 977-1000, Oct. 2003.
[8] Bailey, D. G., "Subpixel estimation of local extrema," Proc. Image and Vision Computing New Zealand, pp. 414-419, Nov. 2003.
[9] Farid, H. and Simoncelli, E. P., "Differentiation of discrete multidimensional signals," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 496-508, 2004.
[10] Bailey, D. G., "Predictive interpolation for registration," Proc. Image and Vision Computing New Zealand, pp. 240-245, Nov. 2000.