Maximum Entropy Matching: An Approach to Fast Template Matching

Frans Lundberg
October 25, 2000

Contents

1 Introduction
2 Maximum Entropy Matching
  2.1 The cornerstones of Maximum Entropy Matching
  2.2 Bitset creation
  2.3 Bitset comparison
3 PAIRS and the details of the bitset comparison algorithm
  3.1 PAIRS
  3.2 Motivation for PAIRS
  3.3 The bitset comparison algorithm
  3.4 Implementation issues
  3.5 Speed
4 A comparison between PAIRS and normalized cross-correlation
  4.1 Test setup
  4.2 Generation of image distortions
    4.2.1 Gaussian noise, NOISE
    4.2.2 Rotation of the image, ROT
    4.2.3 Scaling of the image, ZOOM
    4.2.4 Perspective change, PERSP
    4.2.5 Salt and pepper noise, SALT
    4.2.6 A gamma correction of the intensity values, GAMMA
    4.2.7 NODIST and STD
  4.3 Relevance of the distortions
  4.4 Results
  4.5 Other template sizes
  4.6 Performance using other images
5 Statistics of PAIRS bitsets
  5.1 Statistics of acquired bitsets
6 Comments

1 Introduction

One important problem in image analysis is the localization of a template in a larger image. Applications where the solution of this problem can be used include tracking, optical flow, and stereo vision. The matching method studied here solves this problem by defining a new similarity measurement between a template and an image neighborhood. This similarity is computed for all possible integer positions of the template within the image. The position for which we get the highest similarity is considered to be the match. The similarity is not necessarily computed using the original pixel values directly, but can of course be derived from higher-level image features.

The similarity measurement can be computed in different ways, and the simplest approach is correlation-type algorithms. Aschwanden and Guggenbühl [2] have done a comparison between such algorithms. One of the best and simplest algorithms they tested is normalized cross-correlation (NCC). Therefore this algorithm has been used for comparison with the PAIRS algorithm that is developed by the author and described in this text. PAIRS uses a completely different similarity measurement based on sets of bits extracted from the template and the image.

This work is done within WITAS, a project dealing with UAVs (unmanned aerial vehicles). Two specific applications of the developed template matching algorithm have been studied.

1. One application is tracking of cars in video sequences from a helicopter.
2. The other is computing optical flow in such video sequences in order to detect moving objects, especially vehicles on roads.

The video from the helicopter is in color (RGB), and this fact is used in the presented tracking algorithm. The PAIRS algorithm has been applied to these two applications and the results are reported. A part of this text concerns a general approach to template matching called Maximum Entropy Matching (MEM) that is developed here.
The main idea of MEM is that the more data we compare on a computer, the longer the comparison takes; therefore the data that we compare should have maximum average information, that is, maximum entropy. We will see that this approach can be used to create template matching algorithms that are on the order of 10 times faster than correlation (NCC) without decreasing the performance.

2 Maximum Entropy Matching

2.1 The cornerstones of Maximum Entropy Matching

The purpose of template matching in image processing is to find the displacement r such that the image function I(x − r) is as similar as possible to the template function T(x). This can be expressed as

    r_match = argmax_r similarity(T(x), I(x − r))                        (1)

Maximum Entropy Matching (MEM) and the PAIRS method described later are valid for all types of discretely sampled signals of arbitrary dimension, but here we will discuss the specific case of template matching of RGB images.


The difficult part of template matching is to find a similarity measurement that gives a displacement of the template corresponding to the real displacement of the signal in the world around us. For many applications it is difficult to even define this ideal displacement, since the difference between the image neighborhood and the template does not consist of a pure translation. This fact makes it difficult to compare different template matching algorithms. Furthermore, the similarity measurement that should be used is application dependent. For example, rotation invariance might be wanted for one application, but not for another.

Maximum Entropy Matching does not necessarily lead to a similarity measurement that is better than others, but it aims to increase the speed of the template matching while keeping the performance. It works by comparing derived image features of the image and the template for each possible displacement of the template. The approach is based on the following statements.

1. The less data we compare for each possible template position, the faster this comparison will be.
2. The data we compare should have high entropy.
3. On average, less data needs to be compared to conclude that two objects are dissimilar than to conclude that they are similar. This statement will be called the fast dissimilarity principle.
4. The data that we use for the comparison should be chosen so that the similarity measurement will be distortion persistent.

Statement 1 is true in the sense that the time to compute a similarity measurement is usually proportional to the amount of data that is compared. For correlation-type template matching all of the pixel data in the template is used in the matching algorithm. We will see that the amount of data that is used for comparison can be decreased substantially using the MEM approach. The compare time also depends on the way the data is compared.
Not counting normalizations, the similarity measurement for these algorithms is acquired by one multiplication and one addition for each byte of pixel data (assuming each intensity value is stored as one byte). The comparison of data for the MEM approach is done by an XOR operation and a look-up table, which is faster per byte than the correlation-type approaches [1] and simple to implement in hardware.

Statement 2 is intuitively appealing. To increase the speed of the matching algorithm we want to use as little data as possible in the comparisons, but we wish to use as much information as possible. Therefore the data used in the comparison should have high average information, that is, high entropy. Experiments show that it is possible to reduce the original amount of data used in the comparisons on the order of 10 to 100 times while keeping good performance.

It is not very difficult to prove that the maximum entropy of digital data is achieved only when the following two criteria are fulfilled. One, the probability of each bit in the data being 1 is 0.50. Two, all bits should be statistically independent. When we talk about entropy we view the data to be compared as one random variable, and when we talk about independence of bits, the single bits are considered random variables. Since statistical independence of the bits is necessary to achieve maximum entropy it is natural and necessary to view the compare data as a set of bits. This view is used in MEM, where a bitset is extracted from the template and from each neighborhood in the image. The similarity measurement used to compare the image neighborhood bitset and the template bitset is simply the number of equal bits.

Lossy data compression of images is a large research area that I believe can be very useful in order to find high entropy image features that are good to use for template matching. However, the problem of finding these compare features and that of compressing image data are fundamentally different, since there is no demand for image reconstruction from the compare features.

Statement 3 (the fast dissimilarity principle) is an important and very general statement that is valid for all types of objects that are built up of smaller parts. The statement comes from the fact that two objects are considered similar only if all their parts are similar. If a part from object A is dissimilar to the corresponding part of object B, we can conclude that A and B are dissimilar. If the part from A and the corresponding part of B are similar we cannot conclude anything about the similarity between the whole objects. Therefore it usually takes less data to conclude that two objects are dissimilar than to conclude that they are similar. We will see how this statement can be used to speed up the matching algorithm.

Statement 4. In template matching no image neighborhood is identical to the template. There is always some distortion present (for example: noise, rotation, or a shadow on the template) and we must try to choose the data we compare so that the similarity measurement is affected as little as possible by these distortions.

The following two sections will describe how to extract compare data, and then how to compare this data for fast, high performance template matching.

[1] This result was obtained using my C implementations; see later sections for implementation issues.
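The XOR-plus-look-up-table comparison mentioned above can be sketched in C as follows. This is a minimal illustration, not the report's implementation; the table and function names are my own.

```c
#include <stdint.h>

/* Look-up table: equal_bits_lut[x] = number of zero bits in x,
   i.e. 8 minus the popcount. Must be filled once before use. */
static uint8_t equal_bits_lut[256];

static void init_lut(void) {
    for (int x = 0; x < 256; x++) {
        int pop = 0;
        for (int b = 0; b < 8; b++)
            pop += (x >> b) & 1;
        equal_bits_lut[x] = (uint8_t)(8 - pop);
    }
}

/* Similarity = number of equal bits between two n-byte bitsets.
   XOR marks the differing bits; the table counts the equal ones. */
static int simil(const uint8_t *bs1, const uint8_t *bs2, int n) {
    int s = 0;
    for (int i = 0; i < n; i++)
        s += equal_bits_lut[bs1[i] ^ bs2[i]];
    return s;
}
```

Per byte of compare data this costs one XOR and one table look-up, which is cheaper than the multiply-and-add of a correlation inner loop.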

2.2 Bitset creation

Maximum Entropy Matching consists of two separate parts: bitset creation and bitset comparison. In the bitset creation part a set of bits is produced for the template and for each neighborhood in the image, directly or indirectly from the pixel data. How these bitsets are created is not determined by MEM. The optimal bitsets to extract depend on which image distortions are expected for the intended application. One example of a bitset creation algorithm is PAIRS. We demand three things from the bitset creation algorithm.

1. The created bitsets should have high entropy.
2. The created bitsets should be resistant to the image distortions that appear for the intended application.
3. The bitset creation time should be short.

In order to compare two different bitset creation algorithms we must have measurements of "high entropy", "distortion resistance" and "bitset creation time". We will suggest possible ways of measuring these quantities.

It is difficult to estimate the entropy of a bitset consisting of more than a few bits. If the extracted bitset only has, say, 8 bits, we can estimate the full discrete probability distribution using a database of image neighborhoods. The entropy is then computed by its definition from the probability distribution. This is possible for a 1-byte bitset, which has only 256 possible states. But for a 4-byte bitset we have 2^32 ≈ 4·10^9 possible states and an explicit estimation of the full probability distribution is not possible.

There are other ways to estimate entropy. In [3] a method for estimating the entropy of one-dimensional information sequences is applied to gray-scale images. The method

uses pattern matching to estimate the entropy. More about pattern matching in information theory can be found in [4].

Since the entropy is difficult to estimate we can instead use a measurement of how close to maximum entropy the data is. Assuming we have a database of image neighborhoods we can find the probabilities for each bit being set to 1. These probabilities should be close to 0.50 to achieve high entropy. Also, we can measure how independent the bits are by estimating the correlation ρ:

    ρ_ij = E((b_i − E(b_i)) (b_j − E(b_j))) / sqrt(V(b_i) V(b_j))        (2)

E denotes expectation value, V denotes variance, and b_k denotes the k'th bit in the bitset. Since we are dealing with binary distributions we can fortunately conclude that if ρ_ij = 0 the i'th and the j'th bits are independent. We can construct a measurement of how much the bitsets deviate from having maximum entropy by studying how much they deviate from the assumption of 50 per cent probability of a bit being set to one and from the desired independence of the bits.

If the bits in a bitset have a probability of being 1 equal to 0.50 and they are independent, the distribution of the number of ones in the bitset will follow a binomial distribution. Therefore we can define another measurement of how close to maximum entropy the bitsets are as the deviation from a binomial distribution of the number of ones in the bitsets. These two ways of measuring how close to maximum entropy the bitsets are will be exemplified.

Distortion persistence of the bits can be measured by performing experiments on a number of templates subject to controlled distortions. A bitset is created from the template before and after the distortion. The number of bits that are equal in the two bitsets is a measurement of how persistent the bitset creation algorithm is to the applied distortion.

The bitset creation time can be measured for a specific computer. However, if bitset creation method A is faster than method B on computer X, A is not necessarily faster on computer Y. Also, the implementations are often not trivial to optimize. So it is not always possible to determine which bitset creation method is generally the fastest.

2.3 Bitset comparison

The bitset comparison part of MEM is not application dependent. For each image neighborhood and for the template a set of bits is generated somehow. The similarity measure in the template matching algorithm is simply the number of equal bits in the template bitset and the image neighborhood bitset. The bitsets consist of a whole number of bytes [2] for practical reasons.

It is possible to use the fast dissimilarity principle (MEM Statement 3 in Section 2.1) to decrease the bitset compare time. This is done by first comparing only the leading bytes of the neighborhood and template bitsets. If the number of equal bits in these parts of the bitsets is below a certain threshold the bitsets are considered dissimilar, and the similarity value is set to zero. If the number of equal bits is not below the threshold, the whole bitsets are compared and the similarity measure is the number of equal bits of the whole bitsets. This algorithm and its implementation are described in detail in the next section.

[2] A byte is here assumed to be 8 bits.

3 PAIRS and the details of the bitset comparison algorithm

3.1 PAIRS

PAIRS is an algorithm to create bitsets of an arbitrary number of bytes from neighborhoods in RGB images. PAIRS can easily be modified to deal with other kinds of signals of arbitrary inner and outer dimension.

The PAIRS method is based on random pairs of pixels within a neighborhood of an image. Each bit in a bitset is created from a certain pair of pixels. A bit is set to 1 if the first pixel value in the pair is larger than the other; otherwise the bit is set to 0. The pixel values in a pair are chosen from the same color band. The random pairs are chosen according to Algorithm 1, which is presented in C-like pseudo-code.

---------- Algorithm 1 ----------
// Computes a list of pairs to be used
// for bitset creation.

INPUT VARIABLES
  n       Number of bytes in each bitset
  colors  Number of colors (3 for RGB images)

OUTPUT VARIABLES
  list    List of pixel pair coordinates used to form
          the bitsets, size: 8 x n, where each element
          contains the coordinates of the pixel pair

FUNCTIONS CALLED
  rand    rand(low,high) returns a random integer
          between low and high.

ALGORITHM
For i1=0 to n*8-1
{
    index1x = rand(0,N-1);
    index1y = rand(0,N-1);
    index2x = rand(0,N-1);
    index2y = rand(0,N-1);
    index3  = rand(0,colors-1);
    Store all five index variables in list[i1].
}
---------------------------------

When the list of pixel pairs has been created according to Algorithm 1, or a precomputed list is loaded from a file, the actual bitsets are created according to Algorithm 2.

---------- Algorithm 2 ----------
// Creates bitsets from image neighborhoods.

INPUT VARIABLES
  im    An RGB image, size: imSize1 x imSize2 x 3
  list  A list of pixel pairs created by Algorithm 1

OUTPUT VARIABLES
  bs    Image bitset, size: (imSize1-N+1) x (imSize2-N+1) x 8*n

ALGORITHM
For i1=0 to imSize1-N, i2=0 to imSize2-N  /* For all image neighborhoods */
{
    For i3=0 to 8*n-1  /* For all bits in the bitset */
    {
        Get index1x, index1y, index2x, index2y and index3 from list[i3].
        If im[i1+index1x, i2+index1y, index3] >
           im[i1+index2x, i2+index2y, index3]
        {
            bs[i1,i2,i3] = 1;
        }
        Else
        {
            bs[i1,i2,i3] = 0;
        }
    }
}
---------------------------------

Note that Algorithm 2 is used to create both the image bitset and the template bitset. The size of the resulting template bitset will be 1 × 1 × 8n, or simply 8n bits, if we neglect the singleton dimensions.
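For concreteness, the inner loop of Algorithm 2 for a single neighborhood might look as follows in C, packing one bit per pair comparison. This is a sketch under my own assumptions about the pair-list struct and an interleaved RGB image layout; it is not the report's implementation.

```c
/* One entry of the pair list from Algorithm 1 (hypothetical layout). */
typedef struct { int x1, y1, x2, y2, c; } Pair;

/* Fill the bitset bs (nbits bits, packed 8 per byte) for the
   neighborhood whose top-left corner is (x0, y0) in an
   interleaved RGB image of the given width. */
static void make_bitset(const unsigned char *im, int width,
                        int x0, int y0,
                        const Pair *pairs, int nbits,
                        unsigned char *bs) {
    for (int i = 0; i < nbits; i++) {
        const Pair *p = &pairs[i];
        unsigned char a = im[((y0 + p->y1) * width + (x0 + p->x1)) * 3 + p->c];
        unsigned char b = im[((y0 + p->y2) * width + (x0 + p->x2)) * 3 + p->c];
        /* Bit is 1 iff the first pixel of the pair is larger. */
        if (a > b)
            bs[i / 8] |= (unsigned char)(1u << (i % 8));
        else
            bs[i / 8] &= (unsigned char)~(1u << (i % 8));
    }
}
```

The same routine serves the template (one call) and the image (one call per neighborhood, reusing the same pair list so that corresponding bits compare the same pixel offsets).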

3.2 Motivation for PAIRS

The PAIRS method for bitset creation described in the previous section has been developed since it is a good compromise between the desired properties of MEM bitsets as described in Section 2.2. The entropy of these bitsets is high, the similarity measurement is resistant to certain kinds of distortion, and the bitset creation time is low. I believe other ways to create bitsets may prove better than PAIRS for some applications, but PAIRS is fast and rather simple to implement, and it works on the original input intensity data. The method has proved useful in applications and is used to demonstrate the maximum entropy approach to template matching. Matching using bitsets created with PAIRS compares very well with correlation approaches according to experiments with controlled distortions, see Section 4. When high invariance against certain types of distortions, such as rotation, is needed I believe higher-level image features should be used when forming the bitsets.

3.3 The bitset comparison algorithm

The previous section described the PAIRS way to create bitsets. The bitset comparison algorithm is used to match the template bitset with the image bitsets. The algorithm does not depend on which bitset creation method is used.

There are two versions of the bitset comparison algorithm (Algorithm 3): with or without sort out. If sort out is not used, the similarity measurement between two bitsets is simply the number of equal bits. Sort out can be used to increase the speed of the algorithm by setting the similarity to zero if the number of equal bits in the first n1 bytes is less than a certain threshold. The principle behind this is the fast dissimilarity principle discussed in Section 2.1. The average number of bytes that have to be compared can be reduced substantially by using sort out. Notice that a drawback with using sort out is that the execution time will depend on the input data.

---------- Algorithm 3 ----------
// Computes a similarity measurement between
// the template bitset and the image bitsets.

INPUT VARIABLES
  im_bs    Image bitset, size: s1 x s2 x 8*n
  temp_bs  Template bitset, size: 8*n

ADDITIONAL INPUT VARIABLES FOR SORT OUT VERSION
  n1     Number of bytes to use for initial sort out.
  thres  Threshold

OUTPUT VARIABLES
  s      The similarity measurement, size: s1 x s2

FUNCTIONS CALLED
  simil  simil(bs1, bs2) computes the number of
         equal bits in bitsets bs1 and bs2.

ALGORITHM (without sort out)
For i1=0 to s1-1, i2=0 to s2-1
{
    s[i1,i2] = simil(im_bs[i1,i2,0:n-1], temp_bs[0:n-1]);
}

ALGORITHM (with sort out)
For i1=0 to s1-1, i2=0 to s2-1
{
    sortout_sim = simil(im_bs[i1,i2,0:n1-1], temp_bs[0:n1-1]);
    If sortout_sim < thres
    {
        s[i1,i2] = 0;
    }
    Else
    {
        s[i1,i2] = simil(im_bs[i1,i2,0:n-1], temp_bs[0:n-1]);
    }
}
---------------------------------
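The sort-out comparison for a single neighborhood/template pair could be written in C roughly as below. This is a sketch, not the report's code; a plain popcount loop stands in for the look-up table, and all names are mine.

```c
#include <stdint.h>

/* Number of set bits in one byte. */
static int popcount8(uint8_t x) {
    int c = 0;
    for (; x; x >>= 1)
        c += x & 1;
    return c;
}

/* Similarity with sort out: count the equal bits in the first n1
   bytes; if fewer than thres agree, report 0 without looking at
   the remaining n - n1 bytes (the fast dissimilarity principle). */
static int simil_sortout(const uint8_t *im_bs, const uint8_t *temp_bs,
                         int n, int n1, int thres) {
    int s = 0;
    for (int i = 0; i < n1; i++)
        s += 8 - popcount8(im_bs[i] ^ temp_bs[i]);
    if (s < thres)
        return 0; /* sorted out: considered dissimilar */
    for (int i = n1; i < n; i++)
        s += 8 - popcount8(im_bs[i] ^ temp_bs[i]);
    return s;
}
```

Since most image positions are dissimilar to the template, most calls return after only n1 of the n bytes, which is where the average speed-up comes from.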
