Computer Perception of Repetitive Textures
Leonard G. C. Hamey
February 1988
CMU-CS-88-149
Computer Perception of Repetitive Textures
Leonard G. C. Hamey
February 1988
Submitted to Carnegie Mellon University in partial fulfillment of the requirements for the degree of Doctor of Philosophy.
Copyright © 1988 by Leonard G. C. Hamey
This research was sponsored by DARPA, monitored by the Air Force Avionics Laboratory under contract F33615-87-C-1499. Data for this research was partially provided by the Calibrated Imaging Lab at CMU, which is sponsored by NSF, DARPA, and NASA.
Abstract

This dissertation presents research in computer analysis of two dimensional regular repetitive textures in real-world images. Previous efforts in this field have assumed simple grid-like repetitive structure. In contrast, we assume only locally simple repetitive structure. This local model of repetition leads to an algorithm that is able to analyse severely distorted repetitive textures, which occur in real-world scenes. We demonstrate the success of this algorithm on a variety of images.

An essential part of describing repetitive textures is extracting the frequency of repetition. However, regular repetitions admit many alternate frequency descriptions. We define the fundamental frequencies of a repetition as the two shortest independent vectors between elements of the repetition. We show that the fundamental frequency vectors are the most perpendicular basis vectors for the repetition and that they correspond to the relative neighbourhood graph of the repetitive pattern. Our algorithm exploits these properties to extract the fundamental frequencies of repetitive textures.
It is difficult to extract repetition frequency when the element of repetition is also unknown. We propose the dominant feature assumption as a solution to this problem. Rather than trying to find the unknown repetitive structure of an unknown texture element, we extract features from the image and rank them according to their importance (prominence). The repetitive structure of the most prominent (dominant) features is the desired structure of the entire pattern.

The key concepts of the local model, the fundamental frequency vectors and the dominant feature assumption are used to develop a four-step algorithm for understanding repetitive texture. First, features are extracted. Second, basic structural relationships are established between features. Third, local repetitive structures are extracted around each feature. Finally, relaxation ensures consistency of the repetitive interpretation. We present experimental results demonstrating that this algorithm successfully extracts the structure of severely distorted repetitions in real-world images.
For Vicky
Acknowledgements

This section is my opportunity to express my gratitude to those people who gave me support, encouragement and guidance during this research.

First, I would like to thank my advisor, Prof. Takeo Kanade. Many of the concepts presented in this dissertation were first brought to light in his office. Takeo provided me with freedom to experiment and at the same time offered important evaluation of my progress. I will not forget his special encouragement when there seemed to be no solution in sight. Takeo also deserves special credit for the well-equipped Image Understanding Systems Laboratory in which this research was primarily conducted.

I would also like to express my appreciation to the other members of my thesis committee: Prof. Steve Shafer, Prof. Jon Webb and Prof. Robert Kass. These committee members invested time in discussion as I experimented with different approaches to the problem. They also studied drafts of this dissertation and provided valuable insights. Steve Shafer's Calibrated Imaging Laboratory provided data for this research. Jon Webb helped me on numerous occasions as I tried to employ Warp in my research. Jon also suggested the detailed study of a single texture which comprises the larger part of the experimental results of this dissertation. Robert Kass provided statistical insight and helped improve the rigour of the theoretical presentations in this dissertation.

This dissertation presents the culmination of research that has explored many different approaches to the problem. Many of the early approaches that were investigated had horrendous computational complexity and would not have been feasible without the Warp systolic supercomputer. I would like to express my appreciation to the Warp group, and especially those who answered my frantic pleas for help: Jon Webb, Thomas Gross, David Yam, Monica Lam and Jeff Deutch.
The IUS lab is a splendid research environment, not only because of the facilities provided but also because it is an informal meeting place for students and staff, a place where one becomes familiar with other research and ideas are bounced around. I would like to thank the IUS group for many stimulating discussions, especially Gudrun Klinker, Keith Gremban, Chuck Thorpe, Jill Crisman, Larry Matthies and Karl Kluge. I was fortunate enough to share an office with Gudrun and Keith for the last year and a half of my stay at Carnegie Mellon. I learned a lot about colour from Gudrun's research, and Keith's work on camera calibration provided some test images for this dissertation. The IUS lab facilities are maintained by Ralph Hyre and Jim Moody; I appreciated their dedication, especially when there was the occasional facilities failure.

Lastly, I thank my wife Vicky for her support and encouragement, especially during the last two months of intensive research and writing when she almost became a "thesis widow". Although our son Daniel is too young to understand what a PhD is, he has brought much joy into my life over the past three years. Thanks, Daniel.
Contents

1 Introduction
  1.1 Background
  1.2 Repetition in the Real World
  1.3 The Repetition Analysis Problem
  1.4 Desired Results
  1.5 Impact
  1.6 Related Work
    1.6.1 Fourier Transform
    1.6.2 Co-occurrence Methods
    1.6.3 Histogramming of Displacement Vectors
    1.6.4 Limitations of Global Methods
  1.7 Thesis Outline

2 Modeling in Texture Analysis
  2.1 Texture Models
  2.2 Reverse Engineering of Texture Models
  2.3 Forward Engineering of Texture Models
  2.4 Three Models of Repetition
    2.4.1 The Grid Model
    2.4.2 Consistent Relative Placement Model
    2.4.3 Local Repetition Model
    2.4.4 Summary

3 Theorems on Regular Repetition
  3.1 The Cyclic Problem
  3.2 The Ambiguity of Repetitive Frequency
    3.2.1 Prior Approaches
    3.2.2 Fundamental Frequency Vectors
  3.3 The Dominant Feature Assumption
    3.3.1 An Ideal Algorithm
  3.4 Implications of the Dominant Feature Assumption
  3.5 Discussion of the Dominant Feature Assumption
    3.5.1 Related Work
  3.6 Summary

4 Overview of the Algorithm
  4.1 Feature Extraction
  4.2 Basic Structural Relationships
  4.3 Local Repetitions
  4.4 Relaxation
  4.5 Handling Real Data
    4.5.1 Smooth Thresholding

5 Basic Structural Relationships
  5.1 Feature Extraction
    5.1.1 Feature Detector Algorithm
    5.1.2 Discussion
  5.2 Establishing a Local Neighbourhood
    5.2.1 The Six-connected Graph
    5.2.2 Breadth-First Search
  5.3 Evaluation of Structural Relationships
    5.3.1 Link Evaluation Algorithm
    5.3.2 Application of the Prejudice Principle
    5.3.3 Application of the Interference Principle
    5.3.4 Locational Interference
    5.3.5 Prominence Interference
  5.4 Discussion

6 Extracting Local Repetitions
  6.1 Types of Repetitive Structures
  6.2 Algorithm
  6.3 Evaluation of a Local Repetition
    6.3.1 Evaluation of Deviation
    6.3.2 Evaluation of Perpendicularity
  6.4 Discussion

7 Relaxation
  7.1 Algorithm
    7.1.1 Comparison of Repetitions
  7.2 Discussion

8 Results
  8.1 Results on a Variety of Images
  8.2 A Detailed Study
  8.3 Discussion of Results
  8.4 Limitations of the Dominant Feature Assumption

9 Conclusions
  9.1 Contributions
  9.2 The Local Model of Repetition: Discussion
  9.3 Fundamental Frequency Vectors: Discussion
  9.4 The Algorithm: Discussion
  9.5 Future Work
    9.5.1 Shape from Regular Repetitions
    9.5.2 One Dimensional Repetitions
    9.5.3 Unsolved Problems
  9.6 Conclusion
Chapter 1

Introduction

Visual texture is a common feature in images of real-world scenes. In man-made environments we encounter visual textures like carpet, brick walls, ceiling tiles, textiles and wood grain. In natural environments there are visual textures of grass, bark, a clump of trees, fallen leaves, cirrus clouds and ripples on a lake. Whether in natural or man-made environments, visual texture accounts for a considerable portion of a typical image. Interpreting and understanding real-world scenes is the ultimate goal of computer vision research; analysing visual textures is an important part of this research. This
dissertation presents research in computer algorithms for understanding a particular class of visual textures: two dimensional regular repetitive textures. A two dimensional regular repetitive texture is a texture in which some basic texture element (called the texel) is repeated in a regular two-dimensional pattern. Common examples of regular repetitive textures are brick walls, tiled floors and snake skin.

Regular repetitive textures in images usually arise as a result of regular repetitive structures in real-world scenes. In man-made environments we find regular repetitive structures such as brick walls, tiled floors, slate roofs and windows on the face of a building. Many textiles also exhibit regular repetitive patterns. In natural environments, regular repetitive structure is found in such objects as pine cones, the centres of flowers and the scales of fish and reptiles.

Regular repetitive texture on an object produces regular repetitive texture in an image of the object. However, the repetitive pattern in the image is not necessarily as simple as
the repetitive pattern which exists on the object. (a) If the object is curved or bent, this will affect the repetition in the image. (b) If the image plane is not parallel to the surface of the object, then perspective projection will make the pattern appear smaller where the surface of the object is further from the camera. Each of these effects, which are called distortions, makes the regular repetition in the image more complex and more difficult to analyse than a simple two dimensional regular repetition.

Previous work in computer analysis of repetitive textures has assumed that the imaged texture is a simple geometric repetitive pattern. Because of this assumption, the algorithms developed in previous work cannot analyse regular repetitive textures which have been distorted in the ways that often occur in images of real-world scenes.

This dissertation develops algorithms which can deal with the types of distortion of regular repetitions that are found in real images. We make no assumptions about the distortions which may be present in the image, so we do not make global predictions about the structure of the repetitive texture. Instead, we assume that the repetitive texture is locally a simple geometric grid. We develop local algorithms which identify the presence of locally regular repetition in images, segment the regions of locally regular repetition and also describe the repetitive structure in detail. Because the algorithms are local, they do not constrain the global properties of the repetitive texture. As a result, our techniques can analyse a variety of images of regular textures which could not previously be analysed.

The algorithm which is presented in this dissertation consists of four main phases: feature extraction, extraction of basic structural relationships, extraction of local repetitive structures and relaxation.

In the first phase of the algorithm, features are extracted from the image. The features
have position information associated with them as well as a measure of their importance or prominence. In the second phase, basic structural relationships are established by linking nearby features together. The links between features have an associated confidence value. Higher confidence values indicate links that are more likely to be neighbouring features in a repetitive structure. The third phase extracts local repetitive structures by finding patterns of collinear,
equally-spaced features. The repetitive structures are extracted from the basic structural relationships previously established. Finally, a relaxation algorithm is used to remove inconsistencies in the local repetitive structures and to decide upon regions of repetitive texture. The final result of this algorithm is a detailed structural description of the regular repetition.

Our algorithm successfully extracts the repetitive structure of regular textures in real-world images. We present results of applying our algorithm to a variety of images of both natural and man-made regular textures. A detailed study of one particular regular texture, a textile, is also presented. In this study, the texture is imaged under a variety of different types of distortion and our algorithm successfully analyses the repetitive structure despite severe distortion.
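As a rough illustration of the third phase described above, the following sketch (not the dissertation's implementation) finds triples of features that are collinear and equally spaced, the basic pattern from which local repetitive structures are built. The feature positions and the tolerance are invented for illustration.

```python
import numpy as np

def equally_spaced_triples(points, tol=0.5):
    """Return index triples (i, j, k), i < k, where feature j lies midway
    between features i and k, i.e. the displacement i->j matches j->k."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    triples = []
    for i in range(n):
        for k in range(i + 1, n):
            for j in range(n):
                if j in (i, k):
                    continue
                # Collinear and equally spaced: the two displacements agree.
                if np.linalg.norm((pts[j] - pts[i]) - (pts[k] - pts[j])) < tol:
                    triples.append((i, j, k))
    return triples

# Three collinear, equally spaced features plus one outlier (invented data).
features = [(0, 0), (5, 0), (10, 0), (3, 7)]
print(equally_spaced_triples(features))  # [(0, 1, 2)]
```

In the actual algorithm such candidate patterns are built from the linked neighbouring features and their confidence values rather than from an exhaustive search over all triples.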
1.1 Background

Visual texture is a major component of typical images of real-world scenes. Understanding of real-world images is dependent upon an understanding of visual textures. We need to be able to identify entire regions containing a single texture and not be distracted by the details of the texture. We also need to describe the textures in a way which facilitates recognition of previously encountered textures and extraction of information about the underlying shape of objects from their surface texture. These important aspects of understanding visual texture are known as texture segmentation, shape-from-texture and texture recognition.

Texture segmentation is the process of understanding entire regions of texture: we want to see a grassy field and not just the individual blades of grass; we want to see the entire brick wall rather than just the individual bricks. Humans are very adept at handling visual texture when segmenting images. We see the grassy field and the brick wall and we are not distracted by the individual blades of grass or the individual bricks. We would like to develop this capability in computer vision systems.

Shape-from-texture is the paradigm of extracting surface orientation and shape information from visual texture. Humans are very adept at recovering shape information from surface textures. For example, in figure 1-1, a spherical object can be perceived solely on the basis of its surface markings. In the center of the figure, where the surface orientation is approximately perpendicular to the viewing direction, the surface markings are imaged without distortion. However, as the slant of the surface increases, the markings are increasingly foreshortened. Understanding the effect of surface orientation on imaged texture enables us to perceive the sphere. We would like to have this capability in computer vision systems also.

[Figure 1-1: Artificial sphere image]
Texture recognition is the process of extracting a description of an observed texture and comparing that description with known textures. The description needs to contain all the important parameters that describe how the texture was generated so that distinct textures can be distinguished. Similarly, the description should not contain any parameters that would distinguish the particular texture sample from other samples of the same texture.

Visual texture is usually classified as either statistical or structural [14]. Statistical textures exhibit fine structure which is best described by statistical properties of the image pixels (e.g. sand). Structural textures are textures which are composed of identifiable texture elements (texels) such as bricks or windows or scales. The texels are organized according to structural rules.
This dissertation presents research in computer algorithms for understanding a particular class of structural texture: two dimensional regular repetitive textures. Two dimensional regular repetitive textures are structural textures in which the texels are similar and are placed in a regular two dimensional pattern. For example, the windows in figure 1-6 and the reptilian scales in figure 1-11 form regular repetitive textures. The artificial texture in figure 1-2 (a) is also a regular repetition.

There are other structural textures which will not be covered in this dissertation. Random repetitive textures are formed by randomly placing similar texels on a background (e.g. part (b) of figure 1-2). This is called "bombing" in [34]. Regular structure textures are formed by placing dissimilar elements in a regular two dimensional pattern (e.g. part (c) of figure 1-2). Random mosaic textures are generated by tessellating the plane [28,34]. Random repetitive textures and regular structure textures are related to regular repetitive textures. Random repetitive textures are similar to regular repetitive textures in that the texture elements are similar to each other. However, in random repetitive textures the texels are placed randomly whereas in regular repetitive textures they are placed in a regular grid pattern. Regular structure textures are similar to regular repetitive textures, but in regular structure textures the texture elements may differ from each other.

In this dissertation we will restrict our attention to regular repetitive textures. Regular repetitive textures are a very important class of structural textures and occur especially frequently in man-made scenes. We will develop computer algorithms which are capable of segmenting and describing regular repetitive textures in real-world images. Since we are dealing constantly with regular repetitive textures, we will often abbreviate the term and simply refer to repetitive textures or repetition. Unless otherwise stated, both of these terms should be understood as referring to two dimensional regular repetitive texture.
1.2 Repetition in the Real World

Repetition occurs most frequently in man-made scenes such as buildings (figures 1-5, 1-6 and 1-7), but it also occurs in natural textures such as reptile skin (figure 1-11). Since we are developing an algorithm to handle repetition in real-world scenes, we need an understanding of how repetition occurs in real-world scenes and what problems we may
[Figure 1-2: Three types of repetitive textures. (a) Regular repetition; (b) Random repetition; (c) Regular structure]
encounter in trying to analyse it. Let's look at a number of real-world examples and try to identify the different characteristics that we are going to have to deal with in order to extract the repetition from the images.

The scene of Pittsburgh in figure 1-5 is a typical city scene. Consider the building in the center of the image. This building exhibits a very obvious pattern of texture elements that are repeated with some frequency. Even this very obvious repetitive pattern has hidden complexities. The building is a parking garage and two distinct faces of the building are visible in the image. The repetitive pattern continues smoothly from the front of the garage to the side, exhibiting only a sudden change in the frequency of the repetition. We call such a sudden change in the frequency of a repetition a frequency jump. Since the frequency jump is what distinguishes the two faces of the building, it is necessary to identify the frequency jump in order to segment the two faces of this building. At the very least, we need to be aware that frequency jumps can occur in repetitive patterns.

Looking now at the building to the left of the parking garage in figure 1-5, we see a more complex form of repetition. This building's regular pattern consists of pairs of windows. This is an example of a hierarchical repetitive texture in which the element of the repetition is itself composed of more basic structural units. This same building is an example of a repetition containing varying texels. The lower part of the building has dark windows in a light wall while in the upper portion the windows appear as light features in a dark wall. In order to understand this repetitive pattern, then, we have to deal with two complexities: a hierarchical pattern and varying texels.

The building on the right of the picture also contains examples of texture elements which vary considerably within the one repetition. Some of the windows are darker than others and some are even brighter than the wall. Many of the windows on this building cannot be seen because the blinds are drawn, so they appear the same colour as the wall. They are invisible texels. In addition, this building has the windows grouped into vertical bands of two, three, or four individual windows with additional spacing between the bands. This additional spacing represents a sudden change in the phase of the repetitive pattern on the face of the building, which we call a phase jump. Thus, we need to
be aware of the possibilities that some of the texture elements may not be visible in a repetition, and that there may be phase jumps within the repetition.

Figure 1-6 is an unusual view of a skyscraper. Taken from the air, it shows only one face of the building. However, the extreme perspective effects which result from the chosen viewing angle cause the frequency of the repetition to change smoothly and rapidly from the top of the image toward the bottom. The term we use for such smooth changes of frequency is frequency drift. In this image the frequency drift results from the imaging process. It is not indicative of a change in the repetition in the scene. In attempting to understand this repetition we must realize that there is only one repetitive pattern involved, despite the frequency drift within that pattern.

An office building in Washington D.C. (figure 1-7) presents an extreme example of variation between the elements of a repetitive texture. In this image, the windows are almost all different from each other because of the positioning and colour of the curtains visible inside the windows. Only the wall between the windows is consistent.

Moving from architecture to textiles, figure 1-8 is a shadowgraph of loose burlap taken from Brodatz's photographic album of textures [7]. The repetitive structure in this image results from the weave of the burlap. The weave is uneven, however, so the image texture exhibits considerable departure from a single repetitive pattern. This departure is known as distortion. In order to understand repetitive patterns such as figure 1-8 it is necessary that we be able to deal with distorted repetitive textures.

Figure 1-9 contains a number of repetitive structures. The most interesting one is the repetitive structure of the checked shirt which the author is wearing in this image. The shirt is wrapped around a body and as a result the repetitive pattern is distorted in the image. This is again an example of distortion, but in this case the distortion is caused by bending of the textured surface in three-dimensional space whereas in figure 1-8 the distortion occurred in the texture itself. In order to analyse the repetition in figure 1-9 it is important to extract the repetitive structure despite the presence of distortion. Since the distortion results from the shape of the underlying object, however, it may be possible to complete the analysis by recovering the shape of the underlying object from the textural distortion.
Finally, let us consider figure 1-11, which is a naturally occurring repetitive texture: the skin of a reptile. At first glance, the pattern appears to be very regular. There is some distortion, but it is not as severe as in the loose burlap (figure 1-8) or the checked shirt (figure 1-9). However, a more careful examination reveals that this texture has singularities where a column of the texture elements suddenly terminates. Two of these singularities are outlined in figure 1-12. Singularities may be interesting features of the repetition which we would like to identify. At the very least, when understanding repetitive textures such as this we should be able to understand the repetition despite the presence of singularities.

To summarize, the following are some difficult properties of real-world repetitive textures.

• Variations between texture elements.
• Missing texture elements.
• Hierarchical texture elements.
• Frequency and phase jumps.
• Frequency drifts.
• Distortion.
• Singularities.

If a computer algorithm is to understand real-world repetitive textures, it must be capable of dealing with these properties of real-world repetitive textures. In this dissertation we develop an algorithm to extract the structure of real-world repetitive textures. We will demonstrate the ability of our algorithm to handle some of these properties of real-world repetition by applying it to a variety of real-world images.

One final observation: real-world images such as figure 1-5 also contain regions of the image that do not exhibit a repetitive texture. We expect a repetitive texture analysis algorithm not to extract repetitive structure from portions of the image where no repetition is present. In past work, each test image consists entirely of a small sample of a single
repetitive texture. The results of testing an algorithm with such simple data provide no understanding of the performance of the algorithm on real-world images. In order to demonstrate the ability of our algorithm to distinguish repetitive and non-repetitive regions of the image, we include non-repetitive textures in our test data.
1.3 The Repetition Analysis Problem

Having examined the characteristics of repetition in the real world, we now define more carefully the problem to be solved. We are concerned with analysing real-world scenes, so we seek computer algorithms that are able to handle properties of real-world repetition such as texel variation and distortion. The problem of repetition understanding is, for real-world scenes, to achieve the following.
• Identify presence of repetition; i.e. determine whether or not repetition is present in the image. If repetition is present, then identify its location in the image.
• Segment regions of repetition; i.e. determine the extent of each repetition in the image.

• Describe repetition in detail; i.e. produce information which captures the characteristics of each repetition in the image.

This problem is to be solved while, at the same time, handling the difficult properties of real-world images. The problem is more difficult because we do not assume any prior knowledge about the repetition: neither its frequency nor the texels which comprise it. This lack of prior knowledge makes the problem more difficult because repetition is defined in a cyclic manner: the texels are whatever unit pattern is repeated at the repetition frequency, and the repetition frequency is the spacing between the texels. This cyclic definition yields a cyclic problem. If we know the texel then we can find the frequency, and if we know the frequency then we can find a texel to describe the repetition. Since we assume knowledge of neither the element nor the repetition, we need some way of breaking the inherent cycle in the problem. The usual method is to make some
weak assumptions about the repetition. Choosing the assumptions that provide a robust method of analyzing repetition is a central part of the repetition analysis problem. In this dissertation, we make an assumption that the cycle can be broken by finding a prominent feature in each texture element. This is the dominant feature assumption which will be developed in chapter 3.
1.4 Desired Results

The results of analysis of repetitive texture consist of two parts: a description of the repetitive structure and a description of the texture element. In this dissertation we concentrate on extracting the repetitive structure of repetitive textures. The element can be extracted and described once the repetitive structure is known. This has already been done elsewhere [23] and we will not be concerned with describing the texture elements in this dissertation.

The description of the structure of a repetitive texture can take the form of a concise summary description or it can provide a more detailed description of the texture. A concise description of a repetitive texture usually consists of two vectors describing the frequency of the repetition. For example, two pairs of frequency vectors can be used to describe the major repetitions on the parking garage in the Pittsburgh scene, as shown in figure 1-13. In contrast, a detailed description consists of the identified locations of each of the elements in the repetition and links between the elements which describe the detailed structure of the repetition. Such a description is shown in figure 1-14.

A concise description is desirable for texture recognition, but a detailed description is more useful for shape-from-texture. In addition, repetitive textures which have been subjected to distortion cannot readily be described concisely. For these reasons, it is preferable to first generate a detailed description of the repetition. A concise description can then be extracted from the detailed description if it is required. In this dissertation, we aim to extract detailed descriptions of the structure of repetitive
textures. We will develop an algorithm that is capable of extracting detailed descriptions of repetitive textures which have been subjected to distortion, or which exhibit properties such as phase jumps, frequency jumps and singularities.
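As a toy illustration of a concise description, the following sketch picks the two shortest linearly independent displacement vectors from a set of candidate texel displacements, in the spirit of the fundamental frequency vectors developed in chapter 3. This is not the dissertation's algorithm; the candidate displacements are invented for illustration.

```python
import numpy as np

def fundamental_frequencies(displacements):
    """Pick the two shortest linearly independent vectors from a list of
    2-D texel displacement vectors."""
    vs = sorted((np.asarray(v, dtype=float) for v in displacements),
                key=np.linalg.norm)
    u = vs[0]
    for v in vs[1:]:
        # 2-D cross product; non-zero means u and v are not parallel.
        if abs(u[0] * v[1] - u[1] * v[0]) > 1e-9:
            return u, v
    raise ValueError("all candidate displacements are parallel")

# Candidate displacements measured on an invented, slightly sheared grid;
# (20, 2) and (30, 3) are multiples of the true horizontal spacing.
candidates = [(10, 1), (20, 2), (1, 12), (11, 13), (30, 3)]
u, v = fundamental_frequencies(candidates)
print(u, v)  # the vectors (10, 1) and (1, 12)
```

Choosing the two shortest independent vectors discards redundant multiples and sums of the basis vectors, which is why the fundamental frequency vectors give an unambiguous concise description.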
1.5 Impact
The problem of understanding repetition is a difficult and interesting problem in computer vision. A solution to the problem will impact several areas of research endeavour. First, the techniques developed in this dissertation provide a method of segmenting regular repetitive textures in real-world images.
This will assist in segmentation of
images, particularly images of man-made environments, in which regular repetition is common. Historically, texture has proved to be a difficult problem in image segmentation [30]. Second, regular repetitive patterns can provide information for object recognition. For example, the regular pattern of buttons on a telephone can be extracted using the algorithm developed in this dissertation. The pattern of the buttons can be identified for a wide range of view points. It could thus be used as a clue for recognition of telephones. The approach would be similar to Lowe's use of simple perceptual groupings of lines as clues for recognizing staplers [24]. Third, research in shape-from-texture can be advanced by applying known techniques such as Strat and Fischler's one-eyed stereo paradigm [36] to the detailed descriptions of repetition extracted by the algorithm presented in this dissertation. The detailed descriptions which our algorithm generates provide a one-to-one mapping between image locations of texture elements and grid co-ordinates on the surface of a regularly textured object. This provides a basis for solving the shape-from-texture equations.
1.6
Related Work
Past methods for analyzing repetitive textures have relied upon a global analysis of the texture to determine the repetitive frequency. An implicit assumption is made that the frequency is approximately constant throughout the texture. Three major approaches have been used: the Fourier transform, co-occurrence matrices and histogramming of displacement vectors. In the following sections we briefly discuss each of these approaches and the results obtained.
1.6.1 Fourier Transform
The Fourier transform [33, pages 13-27] computes a frequency-based description of an entire image. Intuitively, such a frequency description should capture the repetitive frequency of a regular texture. The Fourier transform was first employed in texture analysis by Bajcsy [5,4]. In her work, textures are described by examining rings and wedges of the Fourier power spectrum. This provides the basis for a crude classification of textures as non-directional, monodirectional or bidirectional and as homogeneous or blob-like. In this classification, regular repetitive textures would be classified as bidirectional.
Specific application of the Fourier transform to the analysis of regular repetitive textures is conducted by Matsuyama et al. [26]. They compute the Fourier power spectrum of a sample of repetitive texture and extract peaks in this spectrum as the spatial frequencies of the texture. To accommodate textures in which the basic structural frequencies have low energy, the higher frequency harmonics of the basic frequencies are also used in choosing the appropriate pair of basic frequencies. Matsuyama et al. successfully apply their technique to six samples of regular repetitive textures taken from Brodatz [7]. The technique provides good results for small samples of simple regular repetitions.
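The peak-extraction idea can be sketched in a few lines. The following is a minimal illustration of the approach, not Matsuyama et al.'s algorithm (which also exploits higher harmonics); the function name and the synthetic texture are our own.

```python
import numpy as np

def power_spectrum_peak(image):
    """Return the (fy, fx) frequency, in cycles per image, of the
    strongest non-DC peak in the Fourier power spectrum."""
    spectrum = np.abs(np.fft.fft2(image)) ** 2
    spectrum[0, 0] = 0.0                     # suppress the DC term
    fy, fx = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    h, w = image.shape                       # map bins to signed frequencies
    if fy > h // 2:
        fy -= h
    if fx > w // 2:
        fx -= w
    return fy, fx

# A synthetic repetition: 8 cycles horizontally across a 64x64 image.
y, x = np.mgrid[0:64, 0:64]
texture = np.cos(2 * np.pi * 8 * x / 64)
print(power_spectrum_peak(texture))   # strongest peak at 8 cycles horizontally (up to sign)
```

For a clean repetition the peak stands well clear of the rest of the spectrum; the limitations discussed in section 1.6.4 arise when this is no longer true.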
1.6.2 Co-occurrence Methods
Co-occurrence matrices capture second-order statistics of the pixels of an image. In their standard form, co-occurrence matrices have received a lot of attention, at least in part because of Julesz's conjecture [8,9,19] that the human visual system discriminates textures on the basis of second-order statistics. This conjecture has been abandoned recently [18] and work on co-occurrence methods has also declined. However, no review of texture analysis would be complete without a discussion of work based on co-occurrence matrices.
The grey-level co-occurrence matrices of an image form a four-dimensional space of descriptive statistics indexed by the two-dimensional displacement vector δ and the intensity scalars I1 and I2. C_δ(I1, I2) represents the probability that the image contains a pair of pixels with intensities I1 and I2 separated from each other by the displacement vector δ. δ is a
vector in two dimensions, hence the full set of C_δ(I1, I2) values is a four-dimensional space. Conventionally, however, the co-occurrence statistics have only been computed for selected displacement vectors δ. For each δ, the co-occurrence statistics form a two-dimensional space indexed by I1 and I2, hence the term "co-occurrence matrix". Co-occurrence matrices have been studied extensively in relation to visual texture. Most researchers have perceived co-occurrence matrices as a basis for characterizing fine-grained textures (micro-textures) [14]. Some, however, have observed that co-occurrence matrices computed at a variety of displacements are capable of capturing the repetitive structure of regular textures.
Conners and Harlow [11] observe that co-occurrence matrices should be highly diagonal if δ is a frequency of the repetitive texture. This follows from the definition of the co-occurrence matrices and the observation that if δ is a frequency of the repetition then pixels separated by δ should have similar intensities. They compute a measure of non-diagonality (called inertia) on the co-occurrence matrices for various δ and locate local minima of this function as repetitive frequencies of the texture. They successfully apply this technique to several repetitive textures from Brodatz [7].
Zucker and Terzopoulos [45] similarly observe that the co-occurrence matrices of a repetitive texture should exhibit diagonal structure if δ corresponds to a frequency of the repetition. They use the standard statistical chi-square test [6, pages 425-428] to detect high levels of structure in co-occurrence matrices. They demonstrate this approach on five textures taken from Brodatz [7].
Chetverikov [10] employs the first moment function to measure the diagonality of the co-occurrence matrices. He derives the expected response of this function to a binary texture which is a mixture of regular sized and irregular sized elements. The parameters of this texture accommodate both regular repetitive textures and random repetitive textures. Chetverikov fits the theoretically derived curves to observed curves for three textures taken from Brodatz [7]. This technique successfully determines the frequency of regular repetitive textures.
Davis et al. [13] introduced the concept of generalized co-occurrence matrices. They suggest computing co-occurrence relationships between image features instead of raw pixel intensities. Nevatia et al. [29] and Vilnrotter et al. [42] extract the frequencies of
repetitive textures by computing the co-occurrences of edge features. They successfully demonstrate this approach on several samples of regular repetitive textures.
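The diagonality idea discussed above is simple to demonstrate. The sketch below is our own minimal illustration of an inertia-style measure, under the simplifying assumptions of a quantized image and non-negative displacements; it is not Conners and Harlow's implementation.

```python
import numpy as np

def cooccurrence(image, dy, dx):
    """Normalized grey-level co-occurrence matrix C_delta(i1, i2) for
    displacement delta = (dy, dx), assuming dy, dx >= 0 and an image of
    small integer grey levels."""
    a = image[:image.shape[0] - dy, :image.shape[1] - dx]
    b = image[dy:, dx:]
    levels = int(image.max()) + 1
    C = np.zeros((levels, levels))
    np.add.at(C, (a.ravel(), b.ravel()), 1)   # count intensity pairs
    return C / C.sum()

def inertia(C):
    """Non-diagonality measure: mass far from the diagonal costs more.
    Near zero when delta is a frequency of the repetition."""
    i, j = np.indices(C.shape)
    return float(((i - j) ** 2 * C).sum())

# Binary vertical stripes with period 4: inertia vanishes at dx = 4.
y, x = np.mgrid[0:16, 0:16]
stripes = ((x % 4) < 2).astype(int)
for dx in (1, 2, 3, 4):
    print(dx, inertia(cooccurrence(stripes, 0, dx)))
# the dx = 4 matrix is perfectly diagonal, so its inertia is 0
```

Scanning such a measure over a range of displacements and picking local minima is, in essence, the global frequency search criticized in section 1.6.4.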
1.6.3 Histogramming of Displacement Vectors
Tomita et al. [40,38,39] introduce histogramming of displacement vectors between texture elements as a method of recovering the frequency of a regular repetition. Texture elements are extracted by a region analysis. The displacement vectors between the texture elements are then histogrammed. Under the assumption of a constant repetitive frequency, the histogram will exhibit clusters of displacement vectors corresponding to the repetitive frequencies of the texture. The clusters are identified and the frequency vectors are estimated as the cluster centres. Tomita et al. successfully apply this approach to several small samples of texture from Brodatz [7]. Further work in the same vein may be found in Matsuyama et al. [27] and Leu and Wee [23]. Again, successful results are demonstrated for small samples of texture from Brodatz's texture album.
1.6.4 Limitations of Global Methods
Each of the above approaches - the Fourier transform, co-occurrence matrices and histogramming of displacement vectors - relies upon a global analysis of the repetitive texture. Such global analyses are not generally applicable to real-world images because they are based upon two implicit assumptions.
First, an assumption is made that the image being analyzed consists entirely of a single repetitive texture. Real-world images consist of a mixture of repetitive and non-repetitive textures. The data from these repetitive and non-repetitive regions of the image are combined by a global transformation, making it difficult to analyze the transform and extract descriptions of the individual regions of repetitive texture.
Second, the above global analyses all assume that the frequency of the regular repetition remains constant throughout the image. When the repetitive frequency changes across the image, the global transformation combines the different frequencies into a single transform. This blurs the transform, making it difficult to identify the repetitive frequency. Thus, the methods based on global transformations are likely to be unable
to understand repetition in real-world images. Even if the repetitive frequency is successfully extracted, the results are only a description of the "average" pattern of the texture.
To illustrate these problems, consider the Fourier transform. Figure 1-3 is the Fourier power spectrum of figure 1-10, which is a simple regular repetitive texture. The Fourier power spectrum of simple repetitive textures exhibits clear peaks which can be used to identify the frequency of the texture. These peaks are clearly visible as dark patches in figure 1-3. In contrast, however, the Fourier transform of a distorted regular repetition in a real-world scene does not contain identifiable peaks. Figure 1-4 is the Fourier power spectrum of the skyscraper in figure 1-6. Because of the frequency drift present in this image, the Fourier power spectrum does not contain clear peaks, and extraction of the repetitive frequency from the spectrum would be extremely difficult. In addition, the image contains a mixture of repetitive and non-repetitive textures, which are combined in the Fourier transform. In general, global methods do not provide a suitable basis for analysis of real-world repetitive textures because real-world repetitive textures can exhibit distortions and other properties that make analysis of the global transform very difficult.
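The blurring effect of frequency drift is easy to demonstrate in one dimension. The following synthetic example (our own, not the thesis images) compares the spectral peak of a constant-frequency signal with that of a signal whose frequency drifts across the window.

```python
import numpy as np

def peak_concentration(signal):
    """Fraction of spectral energy in the single strongest non-DC bin:
    near 1 for a pure repetition, much lower when the frequency drifts."""
    p = np.abs(np.fft.rfft(signal)) ** 2
    p[0] = 0.0
    return float(p.max() / p.sum())

n = np.arange(1024)
steady = np.cos(2 * np.pi * 64 * n / 1024)                       # constant frequency
drifting = np.cos(2 * np.pi * (64 + 32 * n / 1024) * n / 1024)   # frequency drifts upward
print(peak_concentration(steady), peak_concentration(drifting))
# the drifting signal's energy is smeared across many bins
```

The drifting signal is just as "repetitive" to a human observer, yet its spectrum no longer contains a single identifiable peak, mirroring the contrast between figures 1-3 and 1-4.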
1.7 Thesis Outline
This dissertation commences with a discussion of modeling in texture analysis. In some past work, explicit models of texture have been used to considerable advantage. In other work, no clear model of texture has been stated. Chapter 2 develops the concept that, whether or not the model is made explicit, a texture model always underlies texture analysis. It follows from this concept that understanding the texture model that underlies a texture analysis method can lead to greater understanding of the method itself. This principle is applied to past work in the area of repetitive texture analysis. We find that all past work has relied on one of two models of repetition. We show that both of these models are very restrictive and that the methods cannot be applied to distorted repetitive textures because they implicitly assume distortion-free repetition.
Figure 1-3: Fourier power spectrum of a simple regular repetition
Figure 1-4: Fourier power spectrum of a distorted regular repetition
We then propose a new model of repetition-the local model. This model is explicitly designed to accommodate distorted repetitive textures and it forms the basis of our approach to understanding real-world repetitive textures. In chapter 3 we develop theoretical approaches to two problems in analysis of regular
repetition - the ambiguity of repetitive frequency and the cyclic nature of the definition of repetition. The frequency of a two dimensional repetition is inherently ambiguous: there are many alternate interpretations which correspond to the same repetitive pattern. This ambiguity is solved by defining the fundamental frequency vectors of a two dimensional repetition as the two shortest linearly independent frequency vectors of the repetition. We prove a number of results from lattice theory which show that the fundamental frequency vectors are aptly named.
The definition of regular repetition is cyclic: the frequency of the repetition is the spacing between the elements of the repetition, and the elements are the units which are repeated. This cyclic definition leads to a natural cycle in the analysis: if we know the texels then it is easy to find the frequency, and if we know the frequency then it is easy to find the texels. However, if we know neither the texels nor the frequency then the problem is very difficult. To break this cycle, we introduce the dominant feature assumption. The dominant feature assumption states that we can find a dominant feature in each texel and extract the structure of the dominant features. It is not necessary to find the texels themselves, only the dominant features. We discuss some implications of the dominant feature assumption in chapter 3.
In chapter 4 we present an overview of the algorithm which we have developed
from the theoretical basis of chapters 2 and 3. Our algorithm consists of four phases: feature extraction, extraction of basic structural relationships, extraction of local repetitive structures and relaxation. The feature extractor which we use in our experiments is discussed in detail at the start of chapter 5. It is based on a similar feature detector developed by Ahuja and Haken [1]. This feature detector is reasonably well suited to our purpose. The second phase of our algorithm is the extraction of basic structural relationships. In this processing, links are established between features which have a reasonable probability
of being neighbouring dominant features in a regular repetition.
The third phase of our algorithm is the extraction of local repetitive structures. Based on the structural relationships previously extracted, this processing builds local cross-shaped, T-shaped and L-shaped repetitive structures. Each structure represents a hypothesized small piece of a repetitive pattern. This phase of the algorithm is described in detail in chapter 6.
The fourth and final phase of our algorithm is relaxation. This phase ensures that adjacent repetitive structures are consistent with each other. It also decides which features are part of a repetitive pattern and which are not repetitive. This phase of the algorithm is described in detail in chapter 7. The final result of our algorithm is the repetitive neighbour relationships for a set of features which have been identified as repetitive features.
In chapter 8 we present results of our algorithm applied to a variety of real-world scenes. These results demonstrate that our algorithm can successfully extract the structure of real-world repetitive textures. Chapter 8 also includes the results of a detailed study of a single repetitive texture. This study demonstrates the capabilities of the algorithm when confronted with a wide variety of types of distortion. The results are consistently good.
In chapter 9 we discuss the major contributions of this dissertation and make some suggestions for future research.
Figure 1-5: Buildings in Pittsburgh
Figure 1-6: Skyscraper
Figure 1-7: Office building in Washington D.C.
Figure 1-8: Loose burlap (from Brodatz, plate D103)
Figure 1-9: The author wearing a checked shirt
Figure 1-10: A simple regular repetition
Figure 1-11: Reptile skin (from Brodatz, plate D22)
Figure 1-12: Reptile skin with singularities identified
Figure 1-13: Coarse description of repetition
Figure 1-14: Detailed description of repetition
Chapter 2
Modeling in Texture Analysis
An important part of research in computer texture analysis is developing an understanding
of the capabilities and limitations of texture analysis algorithms. Such an understanding is necessary in order to know when and how to apply the algorithm to new data.
In past work, the capabilities of texture analysis algorithms have been understood on the basis of experimental data. In most cases the algorithm is used in a texture classification or segmentation experiment. In other instances, the results of analysis of a test data set are used to synthesize a texture sample as visible evidence of the degree to which the algorithm captures the important features of the texture. Such experiments are important, but do not provide a clear understanding of the capabilities and limitations of the algorithm. The reader is left uncertain of whether the algorithm which works so well for the researcher will be equally successful on his own data. The problem is that classification, segmentation and synthesis experiments do no more than demonstrate the algorithm on a strictly limited set of textures. In order to make predictions about the performance of an algorithm on new data, a deeper understanding is required.
In this chapter we develop a new approach to understanding texture analysis algorithms by understanding the texture models upon which they are based. We apply this paradigm to past work in repetitive texture analysis and identify the common failure of past work-its inability to deal with distorted repetitive textures. This inability to analyze distorted textures stems from the use of texture models that do not allow for distortion.
We present a new model of repetition-the local model. This model allows for distortion such as occurs in real-world scenes. It forms the basis of our algorithm and helps explain why our algorithm can handle distorted textures.
2.1 Texture Models
Visual texture in real-world images is the result of the combined effects of imaging processes and scene-space processes. Together, these processes generate the texture in the image and are called texture generation processes. The most significant imaging process is usually perspective projection. For example, the texture of the face of the building in figure 1-6 exhibits perspective distortion which occurred as part of the picture-taking process. Scene-space processes, however, are many and varied. In the case of figure 1-6, the regular pattern of the windows on the surface of the building is actually the product of the efforts of architects, construction workers, etc. As another example, figure 1-8, the shadowgraph of loose burlap, was obtained without any significant imaging distortion. The distortion which is present is the result of scene-space processes. The primary scene-space process involved was the weaving of the burlap which generated the repetitive pattern. Some of the distortion in the pattern may have occurred during the weaving and some was probably the result of subsequent handling. The combined effect of these processes is an image which contains a distorted repetitive pattern.
A texture model is a mathematical representation of the essential properties of the texture generation processes. In the case of the skyscraper in figure 1-6, the texture model would state that a regular pattern was constructed on a flat surface in the scene and perspectively projected into the image. For the loose burlap (figure 1-8), the texture model would state that a regular pattern was first constructed and then distorted. The distortion could be described statistically so that the model would be equally applicable to other samples of loose burlap.
Our approach to understanding the capabilities and limitations of texture analysis algorithms is to develop a model of the textures which they can analyse.
It is then possible to compare a set of textures with the algorithm's model and determine whether
the algorithm will be capable of successfully analyzing the textures.
2.2 Reverse Engineering of Texture Models
Every texture analysis algorithm is based on some model of texture. Usually the models are implicitly hidden in the algorithm and the researcher may not even have been aware that he was using a model of texture, but nonetheless, the model is there. These implicit models result from assumptions about the image that are built into the algorithm. We can gain a deeper understanding of the capabilities and limitations of the algorithm by discovering these assumptions. We can then develop a model of the textures which are appropriate for analysis by the algorithm. The following are two principles that can be employed to develop texture models from an algorithm.
1. The Description Principle: In order for an algorithm to be useful for analyzing a given texture, it must capture a description of all the important parameters of the texture generation processes. If important information is not captured then the texture descriptions which are extracted will be the same for distinct texture samples. By examining the descriptions generated by an algorithm we can gain an understanding of the textures for which it is appropriate. 2. The Applicability Principle: The algorithm must extract the texture parameters by computational methods that are applicable to the image data. By examining the algorithm we can sometimes discover implicit assumptions that are made about the image. These assumptions are then incorporated into our model of the textures for which the algorithm is suitable. This process of deriving a texture model from an existing algorithm is called reverse
engineering of texture models. Later in this chapter we will develop texture models for past work in repetitive texture analysis.
2.3 Forward Engineering of Texture Models
Texture models are useful for reverse engineering, as above. They are even more useful as a basis for designing texture analysis algorithms with predictable properties. This is called forward engineering. A texture model is a mathematical description of the texture. The model is transformed into constraints on the image, and the texture analysis algorithm is designed to exploit the constraints. Identification of the presence of the modeled texture can be performed by testing the image to see whether it satisfies the constraints.
In the following sections of this chapter we will develop models of repetitive textures
and relate them to previous work. We will also develop a new model which forms the basis for the algorithm proposed in this dissertation. As this algorithm is developed we will show the application of the model constraints to the algorithm, an example of forward engineering.
2.4
Three Models of Repetition
Let us now return to the problem of analyzing repetition. We will discuss three specific repetition models in detail: the grid model, the consistent relative placement model and the local model. For each model we will examine the restrictions that are placed upon algorithms which assume that model of repetition. We will also relate existing work in repetitive texture analysis to these models.
2.4.1 The Grid Model
The grid model of repetition is the simplest and most obvious repetitive model. As shown in figure 2-1, it is defined by two vectors u and v and a placement error distribution E. The vectors u and v define a two-dimensional grid which is shown in the figure. The texture is constructed by placing texture elements, with error sampled from E, at the vertices of the grid. The resulting texture has no phase jumps or frequency drift. The only distortion that can occur results from the error in placing the texture elements. Apart from this placement error, the texture is perfectly regular.
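The grid model is simple enough to state as a generator. The sketch below (our own, with Gaussian placement error as an assumed form of E) makes the model's key property explicit: every texel is tied to its own grid vertex, so errors cannot accumulate into phase or frequency drift.

```python
import random

def grid_model_texels(u, v, rows, cols, sigma, seed=0):
    """Generate texel locations under the grid model: the vertices of
    the lattice i*u + j*v, each perturbed by independent placement
    error drawn from a Gaussian with standard deviation sigma."""
    rng = random.Random(seed)
    texels = []
    for i in range(rows):
        for j in range(cols):
            x = i * u[0] + j * v[0] + rng.gauss(0, sigma)
            y = i * u[1] + j * v[1] + rng.gauss(0, sigma)
            texels.append((x, y))
    return texels

pts = grid_model_texels(u=(10, 0), v=(0, 8), rows=4, cols=4, sigma=0.5)
print(len(pts))   # 16 texels, each staying near its own grid vertex
```

Contrast this with the consistent relative placement model of the next section, where each texel is placed relative to its neighbour rather than relative to a fixed grid.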
Figure 2-1: Grid model of repetition. The two vectors u and v define the repetitious grid. The texels are placed with error at the vertices of the grid.
Now, since textures generated by the grid model of repetition do not have any significant distortion, it follows that any algorithm based on the grid model of repetition cannot analyse distorted repetitive textures. Such algorithms cannot handle phase jumps (as in figure 1-5), the frequency drift of perspective distortion (as in figure 1-6), arbitrary distortion (as in figure 1-8) or singularities (as in figure 1-11). Therefore algorithms based on the grid model are severely restricted in their applicability to real-world scenes.
One such algorithm is that developed by Matsuyama et al. [26]. This algorithm commences by taking the Fourier power spectrum of the image to be analyzed and looking for peaks in the transform as the frequency vectors of the repetition. The final results of the algorithm are a pair of vectors that describe the repetitive frequency. Since the resulting description is so simple, the Description Principle implies that the algorithm is truly appropriate only if the frequency of the repetition is constant throughout the image.
At a more basic level, Matsuyama's algorithm relies upon the fundamental properties of the Fourier power spectrum. The Fourier power spectrum of an image measures the energy of spatial frequencies in the image. If the image contains samples of repetition
with the same frequency but different phases, these samples will tend to cancel each other out in the Fourier transform. Thus, this approach is not computationally appropriate to repetitive textures which contain phase changes and, by the Applicability Principle, we infer that the underlying texture model is the grid model of repetition.
In summary, we have developed the grid model of repetition in which texture elements
are placed, with error, at the nodes of a regular grid. We have shown that the algorithm developed by Matsuyama et al. is only truly appropriate for textures generated by this grid model of repetition.
2.4.2
Consistent Relative Placement Model
Like the grid model, the consistent relative placement model is defined by two vectors u and v and a placement error distribution E. Whereas the grid model uses the vectors to define a grid, the consistent relative placement model uses the vectors to define the relationships between neighbouring texture elements: adjacent texels are displaced from each other by either the vector u or the vector v plus error sampled from E. Figure 2-2 illustrates this process. The vectors u and v are drawn at each texel to illustrate the error with which the neighbouring texel locations are predicted. The texture produced by this model has no frequency changes or drifts. Small phase jumps occur throughout the texture as a result of the errors added to the displacement vectors. The phase of the repetition can drift if the errors are correlated between adjacent features. This effect is shown in figure 2-2: as the texture moves from top to bottom in the figure, the phase drifts to the left. It is the tolerance of phase drift that distinguishes the consistent relative placement model from the grid model.
The consistent relative placement model is not necessarily a good representation of processes which actually generate repetition in real-world scenes. We are examining it primarily because a large number of the existing algorithms for analysing repetitive textures are based upon the consistent relative placement model, as we will now show.
In section 1.6 we discussed algorithms for analysis of repetitive texture based on
co-occurrence matrices and histogramming of displacement vectors. We concluded that these global transformations are based upon an implicit assumption that the frequency of
Figure 2-2: Consistent Relative Placement model of repetition. The two vectors u and v define the positional relationship between neighbouring texture elements. The relationship is not obeyed exactly due to error in placement of the texels. The defining vectors are shown at every texel, demonstrating the presence of placement error.
repetition is constant. In fact, these particular global transformations are appropriate for analysis of repetitive textures generated by the consistent relative placement model. Under the consistent relative placement model, the texture elements are always displaced from their neighbours by approximately one of the frequency vectors u or v. Therefore, the co-occurrence matrices of consistent relative placement textures will be highly diagonal for the frequency vectors u and v. Similarly, the histogram of displacement vectors will exhibit identifiable clusters around u and v. Of course, these statements are only true if the placement error is small compared to the frequency vectors.
Texture analysis algorithms that are based on the consistent relative placement model can handle phase jumps and phase drift, but they cannot analyse textures which have frequency drift. The perspective distortion in figure 1-6 and the surface warping in figure 1-9 are too extreme to be treated as error in a consistent relative placement model, so algorithms based on the consistent relative placement model cannot understand the repetition in these images.
Since the consistent relative placement model allows phase drift which is not allowed
by the grid model, algorithms based on the consistent relative placement model are more generally applicable than algorithms based on the grid model. Of course, if the repetition was generated by the grid model, its interpretation under the consistent relative placement model has larger errors than under the grid model, and the errors under the consistent relative placement model are correlated. However, in most cases this will not be a problem as the grid error will be small enough that the consistent relative placement interpretation will still be successful.
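The distinction between the two models can be made concrete with a one-dimensional sketch (our own construction). Under consistent relative placement, each texel is placed relative to its predecessor, so placement errors accumulate into phase drift while the local spacing stays close to the frequency vector.

```python
import random

def crp_row(v, n, sigma, seed=1):
    """A row of texels under the consistent relative placement model:
    each texel is displaced from its predecessor by v plus Gaussian
    error, so the errors accumulate into phase drift."""
    rng = random.Random(seed)
    pts = [(0.0, 0.0)]
    for _ in range(n - 1):
        x, y = pts[-1]
        pts.append((x + v[0] + rng.gauss(0, sigma),
                    y + v[1] + rng.gauss(0, sigma)))
    return pts

row = crp_row(v=(10, 0), n=50, sigma=0.5)
gaps = [row[i + 1][0] - row[i][0] for i in range(49)]
drift = row[-1][0] - 49 * 10    # deviation from the ideal grid position
print(min(gaps), max(gaps), drift)
# local gaps stay near 10, but the accumulated drift means a single
# fixed grid would mis-predict the positions of distant texels
```

This is why a grid-model interpretation of a consistent relative placement texture accrues ever larger (and correlated) errors with distance, as noted above.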
2.4.3
Local Repetition Model
Both the grid model and the consistent relative placement model are global models of repetitive textures - they describe an entire region of the texture by two vector parameters, u and v, and error. In the local repetition model, the repetitious frequency is defined at each feature by two vectors u and v. However, the values of the frequency vectors u and v may vary across the texture to accommodate frequency drifts and distortions.
Figure 2-3 illustrates the simplest form of the local repetition model. For each texel, the opposite pairs of neighbours are assumed to be equally distant from the central texel, and the three texels are assumed to be collinear. These repetitious relationships are shown by arrows in the figure, for selected texels. Under the local repetition model, the relationships between neighbouring texels are not necessarily the same throughout the repetition. This makes the local repetition model more flexible than the consistent relative placement model, which requires the same relationship to be present throughout the entire texture.
The local repetition model could include more details of the local repetitive structure. For example, two vectors Δu and Δv could be added to the model to represent the local
first-order frequency changes. Such additional parameters would enhance the understanding of severely distorted repetitive textures such as the skyscraper in figure 1-6. In such images, the frequency change is often locally consistent. In the current work, however, we have restricted our attention to the simple form of the local repetition model. In this dissertation we develop an algorithm for analysis of repetitive textures that is
based upon the simple form of the local model of repetition. What characteristics can we
Figure 2-3: Local model of repetition. The repetitious frequency is defined at each feature by two vectors u and v. However, the values of u and v may vary between features to accommodate frequency drifts and distortions.
expect this algorithm to have? Phase drift and small phase jumps can be accommodated by the algorithm because these effects are treated as noise by the model. Similarly, frequency drift and small frequency jumps can be accommodated by the algorithm since the model does not require the repetition frequency to be constant, only locally consistent. In fact, all of the example images of repetitive texture in chapter 1 can be analysed under the local repetition model. The perspective distortion of the skyscraper in figure 1-6 is treated as noise, but does not prevent analysis. The random distortion of burlap in figure 1-8 is also treated as noise by the model and does not prevent analysis. Chapter 8 presents results which demonstrate the successful application of the local model of repetition to these and other images.
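The simple local constraint (opposite neighbours equally spaced and collinear with the central texel) reduces to a small residual vector. The sketch below is our own illustration of the constraint, not the dissertation's implementation: the residual left + right − 2·center is zero for a perfectly regular triple and grows only with the local frequency change.

```python
import math

def local_repetition_residual(left, center, right):
    """Residual of the simple local model constraint: the opposite
    neighbours of a texel should be equally spaced from it and
    collinear with it, i.e. left + right - 2*center should be near
    the zero vector."""
    rx = left[0] + right[0] - 2 * center[0]
    ry = left[1] + right[1] - 2 * center[1]
    return math.hypot(rx, ry)

# A row whose spacing drifts from 10 to 11: globally non-uniform, but
# each local triple violates the constraint by only a small residual.
texels = [(0, 0), (10, 0), (20.5, 0), (31.5, 0)]
print(local_repetition_residual(texels[0], texels[1], texels[2]))  # 0.5
print(local_repetition_residual(texels[1], texels[2], texels[3]))  # 0.5
```

Thresholding such a residual is one plausible way to treat drift as noise while rejecting gross violations, which is the spirit of the local model.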
2.4.4
Summary
We have examined three models of repetition: the grid model, the consistent relative placement model and the local model. Of these, the grid model is the most restricted in
its applicability because it assumes that the repetitive elements lie on the vertices of a
two-dimensional grid. The consistent relative placement model is less restricted because it only assumes that the elements of the repetition are displaced from each other by approximately equal vectors. The most flexible model, however, is the local model, which says only that the elements of the repetition obey a local repetitive constraint.
We did not discuss many other possible models such as a perspective grid model. Such models are all less flexible than the local repetition model because they make more specific assumptions. The textures which they would generate could all be analysed by the local repetition model unless the placement error involved was excessive. A good rule of thumb here is that the placement error should not exceed 25% of the magnitude of the frequency vectors.
Chapter 3 Theorems on Regular Repetition

A fundamental problem in analyzing regular repetitive textures is that the definition of regular repetition is cyclic: the frequency of the repetition and the texture element are defined in terms of each other. This cyclic definition makes regular repetition difficult to analyse, since analysis of the repetition frequency and analysis of the texture element are interdependent. Breaking this cycle is an important part of an algorithm for analysis of repetitive textures.

A second problem in analyzing regular repetitive textures is that the repetition frequency is ambiguous: there are many possible frequency descriptions of the repetition. All of the descriptions are equally valid in a mathematical sense (they all represent the same repetitive pattern) but they are not equally useful in practice. We must choose, from the set of possible descriptions, a "canonical" or fundamental description of the repetitive frequency.

In this chapter we present theory to address both of these problems. We break the cycle
in the definition of repetition by assuming that, without knowing the texture elements themselves, we can identify a single most important feature in each texture element. This is the dominant feature assumption. We deal with the ambiguity of repetitive frequency by defining the fundamental frequency vectors of a regular repetition as the shortest pair of independent frequency vectors in the repetitive pattern. We prove properties of the fundamental frequency vectors that are useful for extracting them from repetitive patterns in real-world images.
3.1 The Cyclic Problem
Understanding regular repetition is a cyclic problem: the two important aspects of regular repetition are defined in terms of each other. The texture element of the repetition is defined only by the fact that it is repeated: it may be any arbitrary pattern of intensities, and its size is determined by the frequency of the repetition.¹ Similarly, the frequency of the repetition is defined only by the fact that it is the frequency with which the texture elements are repeated. This cycle makes it difficult to extract a description of regular repetitive textures: in order to extract the repetition frequency it is necessary to identify the texture elements, but to identify the texture elements it is necessary to know the repetition frequency. A central problem in understanding visual repetition is breaking the cycle.

In chapter 2 we examined previous work in computer analysis of repetitive textures. Past work has been based on either the grid model or the consistent relative placement model of repetition. Both of these models define a pair of vectors u and v representing the frequency of the repetition, which is assumed to be constant. The existing methods are able to break the problem cycle by determining the frequency vectors u and v. This is achieved by Fourier analysis or co-occurrence methods. Such methods can determine the frequency vectors directly from the intensity image or some simple computed features. These methods do not require any prior knowledge of the characteristics of the texture element, so the cycle is broken.

We reject, however, the assumption that regular repetition can be sufficiently characterized by a pair of constant frequency vectors. In previous chapters we showed examples of images which clearly exhibit variation in the repetition frequency, and we suggested that such variation will cause existing techniques to fail.
We proposed a new model of repetition, the local model, which accommodates variation in the repetition by assuming only locally consistent repetitive structure. Under this model there are no constant

¹There are two degenerate exceptions: the texture element must not be uniform intensity and it must not consist entirely of a repetition of some more fundamental texture element. If the texture element were uniform intensity then the texture field would be uniform intensity and not really a regular repetition. If the texture element consisted entirely of regular repetitions of some more fundamental texture element then the texture field itself would really be a regular repetition of the more fundamental unit.
vectors u and v to be found, so the problem cycle cannot be broken by finding the repetition frequency directly from the image. What is needed is a method which simultaneously determines the texture elements and the repetition structure; i.e. a method which determines both what is being repeated and how it is being repeated. Under the grid model and the consistent relative placement model it was possible to determine the repetition frequency (the how) without knowing or finding the elements of the repetition (the what); under the local model, both must be obtained simultaneously.

Our method of breaking the cycle in the repetition problem is to introduce the dominant feature assumption: we assume that there is one feature in the texture element that is more important (prominent) than any other feature in the element. Rather than trying to find the unknown structure of unknown texture elements, we find the most prominent (dominant) feature in each texture element and extract the repetitive structure of these dominant features. In section 3.3 we will develop the dominant feature assumption. First, however, we will address the problem of the ambiguity of repetitive frequency.
3.2 The Ambiguity of Repetitive Frequency
Even if the elements of a regular repetition are known, the repetitive frequency is not then defined unambiguously. Consider, for instance, figure 3-1 (a). The points in this pattern represent the known locations of the texture elements. The repetition is a simple grid model without error, yet the frequency vectors u and v of the repetition cannot be determined unambiguously. Figures 3-1 (b), (c) and (d) demonstrate the ambiguity. Each of these figures illustrates a repetitive grid which explains the pattern in (a). These three repetitive grids have very different frequency vectors u and v. They also have different neighbour relationships between the texels. However, they are equivalent in their ability to generate the pattern in (a).

To address this issue, we define the fundamental frequency vectors of a repetitive pattern as the shortest pair of linearly independent frequency vectors for the repetitive pattern. We will show that, under this definition, the fundamental frequency vectors form a basis for the repetitive pattern and that they satisfy useful properties of shortness and approximate perpendicularity. These properties suggest that the fundamental frequency
vectors are, in some sense, truly fundamental and that it is appropriate to use them in descriptions of repetitive textures.
3.2.1 Prior Approaches

In some prior work [27], regular repetitive patterns are viewed as instantiations of a grammar, and an interpretation of a regular texture is obtained by parsing the texture under the grammar. The ambiguity of repetitive frequency is resolved by choosing the interpretation which provides the simplest parse under the grammar. This technique produces interpretations which depend strongly on the shape of the boundary of the region of repetition.

For example, figures 3-2, 3-3 and 3-4 present three differently shaped regions of a single repetitive pattern. The repetitive pattern is the same in the three figures but the simplest parsed interpretations are very different from each other. Figure 3-2 (b) represents the parsed interpretation of 3-2 (a). The solid lines represent information explicitly present in the parse and the dashed lines represent information implied by the grammar. The parsed interpretation of the pattern is straightforward and agrees with the intuition of a square grid. The simplest parsed interpretation of figure 3-3 (a) is a skewed grid as shown in figure 3-3 (b). Whether this interpretation agrees with intuition or not, it is certainly different from the interpretation of figure 3-2 (a) even though both figures are actually samples of the exact same repetition. Part (c) of figure 3-3 shows the texture parse which corresponds to the interpretation of figure 3-2 (b). This parse is more complex because there are differences between the rows of the pattern that must be explicitly represented.

From these examples, it is apparent that texture parses produce frequency descriptions which depend strongly on the shape of the texture region. A related problem is that the simplest texture parse is ambiguous for certain region shapes. For example, the repetitive pattern in figure 3-4 (a) has two equally simple parses as shown in 3-4 (b) and (c).
A second approach which has been used in past research is to extract the shortest frequency vectors of the repetition [23,40,39]. This is the approach that we adopt: we define the fundamental frequency vectors of a repetition as the shortest pair of frequency
Figure 3-1: Ambiguity of repetitive frequency. (a) A repetitive pattern; (b) a grid for (a); (c) and (d) alternative grids for (a).
vectors in the repetition. This definition of the fundamental frequency vectors is independent of the shape of the texture region, so the frequency descriptions it produces are more useful for texture recognition.

Figure 3-2: Grammar interpretation of repetition (see text). (a) A repetitive pattern; (b) grammar interpretation of (a).
3.2.2 Fundamental Frequency Vectors

In this section, we present the formal definition of the fundamental frequency vectors and prove a number of properties which will be useful for identifying the fundamental frequency vectors of imaged textures. Specifically, we show that the fundamental frequency vectors form a basis for the repetitive pattern, that there is no basis consisting of vectors shorter than the longest fundamental frequency vector, that the fundamental frequency vectors describe the minimum-perimeter structural unit parallelogram of the repetitive pattern, that there is no more perpendicular basis than the fundamental frequency vector basis, and that the angle between the two fundamental frequency vectors is between 60 and 120 degrees (or −60 and −120 degrees). Further, we will show that the fundamental frequency vectors correspond to the relative neighbourhood graph (which we will also define) of the repetitive pattern.²

²The relative neighbourhood graph has been shown by Rangarajan [32] to approximate human perceptual grouping of dot patterns.
Figure 3-3: Grammar interpretation of repetition (see text). (a) A repetitive pattern; (b) grammar interpretation of (a); (c) alternative grammar interpretation of (a).
Figure 3-4: Grammar interpretation of repetition (see text). (a) A repetitive pattern; (b) grammar interpretation of (a); (c) alternative grammar interpretation of (a).
The results presented in this section are, with the exception of theorem 3.4, known results in lattice theory (see [15, chapters 3, 24] and [21]). We present proofs here in the interests of completeness and because this body of theory has not previously been related to computer texture analysis. We commence with formal definitions of a two-dimensional regular repetition R and the fundamental frequency vectors u' and v'.

Definition 3.1 A regular repetition R is a set of points {x0 + iu + jv : i, j integers} in the plane, where x0 is an arbitrary point in the plane and u and v are linearly independent vectors. R is a translate of a lattice. The vectors u and v form a basis for R. □

It is clear that x0 may be replaced by any other element of R in the definition, and thus R is invariant under translations by ku + lv for any integers k and l.

Definition 3.2 Let R be a regular repetition with basis vectors u and v and a known point x0. Let V = {x1 − x2 : x1, x2 ∈ R, x1 ≠ x2} be the set of all possible frequency vectors of R. Define the vector u' to be any one of the shortest vectors (measured by Euclidean length) in the set V. Let U be the set of frequency vectors which are multiples of the vector u'. Define a second vector v' to be any one of the shortest vectors in the set V − U. The pair of vectors u' and v' are called the fundamental frequency vectors of the repetition.³ □

The above definition involves choosing any one of the vectors in a set of shortest vectors. This operation allows some ambiguity in the definition of the fundamental frequency vectors. This ambiguity has three possible sources.

The first source of ambiguity is the fact that, for every frequency vector a of the repetitive pattern, −a is also a frequency vector. This ambiguity is not really a problem and should not concern us further.

The second source of ambiguity occurs if |u'| = |v'|, i.e. if u' and v' are the same length. In that case, the labelling of the fundamental frequency vectors is not unique. Again, this is not a serious problem and should not concern us further.

The final source of ambiguity occurs if there are two linearly independent candidate vectors for v'. In this case, there is more than one set of equally valid fundamental

³The vectors u' and v' are known as "successive minima" in lattice theory.
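Given any basis (u, v) of the repetition, the fundamental frequency vectors (the successive minima) can be computed by classical two-dimensional lattice reduction. The sketch below is a hypothetical helper, not part of the dissertation's algorithm; it implements Lagrange-Gauss reduction under the assumption that u and v are linearly independent:

```python
import math

def fundamental_frequencies(u, v):
    """Reduce a 2-D lattice basis (u, v) to the fundamental frequency
    vectors: the shortest pair of independent lattice vectors
    (Lagrange-Gauss reduction).  Vectors are (x, y) tuples."""
    u, v = list(u), list(v)
    # Keep u as the shorter of the two vectors.
    if math.hypot(*u) > math.hypot(*v):
        u, v = v, u
    while True:
        # Subtract the nearest integer multiple of u from v.
        m = round((u[0] * v[0] + u[1] * v[1]) / (u[0] ** 2 + u[1] ** 2))
        v = [v[0] - m * u[0], v[1] - m * u[1]]
        # Stop when v can no longer be shortened below u.
        if math.hypot(*v) >= math.hypot(*u):
            return tuple(u), tuple(v)
        u, v = v, u
```

For example, the basis (1, 0), (3, 1) reduces to (1, 0), (0, 1); the returned pair always satisfies the shortness and near-perpendicularity properties proved in this section.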
frequency vectors which provide different alternative interpretations of the repetitive pattern. Figure 3-5 is an example of such a repetitive texture. In this case, the fundamental frequency vectors are not uniquely defined and the texture can be equally interpreted as skewed to the right or skewed to the left. This form of ambiguity is important and we will refer to regular repetitions which exhibit this form of ambiguity as ambiguous regular repetitions.

Figure 3-5: An ambiguous repetitive texture.
Theorem 3.1 Let R be a regular repetition with basis vectors u and v and known point x0. Let u' and v' be the fundamental frequency vectors of R. Then u' and v' form a basis for the repetitive pattern R; i.e. the set {x0 + ku' + lv' : k, l integers} is identical to R.

Proof: We first show that points of the form x0 + ku' + lv' for integers k and l are nodes of R. We then show that there are no other nodes in R. Thus, the expression x0 + ku' + lv' generates exactly the nodes of R.

The vector u' is the difference of two elements x1 = x0 + i1u + j1v and x2 = x0 + i2u + j2v of R. Therefore, u' is an integer combination of u and v. Similarly, v' is an integer combination of u and v. Therefore, {x0 + ku' + lv' : k, l integers} ⊆ R.

We have thus shown that points of the form x0 + ku' + lv' are nodes of R for all integers k and l. It remains only to show that there are no other nodes in R. This part of the proof proceeds by contradiction.
Suppose that there exists some node y = x0 + γu' + δv' in R for non-integer γ or δ. We already know that nodes exist in R at all points x0 + iu' + jv' for integers i and j, so there exists a node of R at x0 + ⌊γ⌋u' + ⌊δ⌋v' and another at x0 + ⌈γ⌉u' + ⌈δ⌉v' (where ⌊γ⌋ denotes the largest integer no greater than γ and ⌈γ⌉ denotes the smallest integer no smaller than γ). The vectors between these nodes and the hypothesized node y are also frequencies of R and must therefore be in the set V. To simplify the algebra, let ζ = γ − ⌊γ⌋ and η = δ − ⌊δ⌋. Then

y − (x0 + ⌊γ⌋u' + ⌊δ⌋v') = ζu' + ηv'
x0 + ⌈γ⌉u' + ⌈δ⌉v' − y = (1 − ζ)u' + (1 − η)v'

Thus ζu' + ηv' and (1 − ζ)u' + (1 − η)v' are in the set V. Now, ⌊γ⌋ is the largest integer which is no larger than γ, so 0 ≤ ζ < 1. Similarly, 0 ≤ η < 1. Now consider the two separate cases when η = 0 and when 0 < η < 1.

1. If η = 0 then the vector ζu' + ηv' reduces to ζu'. In this case γ must be non-integer, so 0 < ζ < 1 and |ζu'| < |u'|. This contradicts the fact that u' is the shortest vector in V. Thus, there does not exist any node y for η = 0.

2. If 0 < η < 1 then the vector ζu' + ηv' is not a multiple of u' and hence is not in the set U. Thus it must be in the set V − U. Similarly, (1 − ζ)u' + (1 − η)v' is also in the set V − U. Now, the squared lengths of these vectors are given as follows.

|ζu' + ηv'|² = ζ²|u'|² + η²|v'|² + 2ζη(u' · v')
|(1 − ζ)u' + (1 − η)v'|² = (1 − ζ)²|u'|² + (1 − η)²|v'|² + 2(1 − ζ)(1 − η)(u' · v')

These equations can be reduced to the following inequalities by using the facts that |u'| ≤ |v'| and u' · v' < |v'|². (The latter is a strict inequality because u' and v' are not collinear.)
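The squared-length expansion used here can be checked numerically for arbitrary sample values of u', v', ζ and η; the specific numbers below are illustrative assumptions only:

```python
# Numeric sanity check of the squared-length expansion
#   |z*u + e*v|^2 = z^2|u|^2 + e^2|v|^2 + 2*z*e*(u . v)
# for sample vectors playing the roles of u' and v'.

def sqlen(x):
    """Squared Euclidean length of a 2-D vector."""
    return x[0] ** 2 + x[1] ** 2

def dot(a, b):
    """Dot product of two 2-D vectors."""
    return a[0] * b[0] + a[1] * b[1]

u, v = (3.0, 1.0), (1.0, 4.0)   # illustrative stand-ins for u' and v'
z, e = 0.3, 0.7                 # zeta and eta, each strictly between 0 and 1
w = (z * u[0] + e * v[0], z * u[1] + e * v[1])
assert abs(sqlen(w) - (z * z * sqlen(u) + e * e * sqlen(v)
                       + 2 * z * e * dot(u, v))) < 1e-9
```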
|ζu' + ηv'|²
P(Fj). Then Fj is not a dominant feature.

Proof: Under the assumption of a perfect regular repetition, all the texels are identical, so there exists a feature Fi' which is an image of Fi but is in the same texel as Fj. It may be the case that Fi' = Fi. Certainly, Fi' is identical to Fi in every respect except location. Thus

P(Fi') = P(Fi) > P(Fj)

and hence Fj is not the most prominent feature in its texel and is not a dominant feature. □
Theorem 3.6 Let Fi and Fj be two features in a regular repetition such that P(Fi) ≤ P(Fj). Further assume that Fi is a dominant feature. Then Fj must not be in the same texel as Fi.

Proof: The proof proceeds by contradiction. Suppose that Fj is in the same texel as Fi. Then, since Fi is a dominant feature and the dominant feature has the greatest numerical prominence, and since P(Fj) ≥ P(Fi), it must be the case that Fj is also a dominant feature. But the dominant feature assumption states that only one dominant feature exists in each instance of the texel. Hence, Fj cannot be in the same texel as Fi. □

Note that theorem 3.6 does not assume a perfect regular repetitive texture; it is applicable even when the texels differ within a regular repetition. Theorems 3.5 and 3.6 provide a basis for identifying the dominant features by comparing features. The following theorem provides a basis for identifying the fundamental frequency vectors.
Theorem 3.7 Let Ti and Tj be two texture elements of a regular repetitive texture such that Ti is connected to Tj in the structural grid of the repetition on its fundamental frequency vectors. Let Fi and Fj be the dominant features of Ti and Tj. Then there are no other dominant features within the lune of the link between Fi and Fj.
Proof: Theorem 3.4 shows that the structural grid of a repetition on its fundamental frequency vectors corresponds exactly to the relative neighbourhood graph of the repetitive pattern (except when the repetition is ambiguous). The relative neighbourhood graph is defined to contain exactly those links between points for which there are no other points within the lune of the link. The application of the theorem to the set of dominant features is direct and proves theorem 3.7. □

The implication of theorem 3.7 is that we can examine a link between a pair of features Fi and Fj and determine that the link is not a fundamental frequency vector in the case that there exists a dominant feature Fk in the lune of the link. In the case of a perfect regular repetition, the link between Fi and Fj is not a fundamental frequency vector if Fk lies in the lune and is at least as prominent as Fi or Fj. We use the term dominated region to describe the region around a link which cannot contain any equally prominent or more prominent features. Theorem 3.7 shows that the dominated region of a fundamental frequency link is the lune. Clearly, any subset of the lune is also a dominated region for a fundamental frequency link.

For example, figure 3-8 (a) is a sample of the woven cane texture taken from plate D101 of Brodatz [7]. The dotted line in figure 3-8 (b) shows a fundamental frequency vector relationship between two large (and hence dominant) features. There are no other large features within the lune of these two large features. In contrast, the pair of features identified in figure 3-8 (c) are not related by a fundamental frequency vector because there are other dominant features within the lune.
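The lune test is easy to state in code. The sketch below (hypothetical helper names; a brute-force O(n³) construction, not the dissertation's implementation) builds the relative neighbourhood graph of a point set by checking each candidate link's lune:

```python
import math

def in_lune(p, q, r):
    """True if point r lies strictly inside the lune of the link (p, q):
    r is closer to both endpoints than they are to each other."""
    d = math.dist(p, q)
    return math.dist(r, p) < d and math.dist(r, q) < d

def relative_neighbourhood_graph(points):
    """Return the edges (i, j) of the relative neighbourhood graph:
    links whose lune contains no other point of the set."""
    edges = []
    for i, p in enumerate(points):
        for j in range(i + 1, len(points)):
            q = points[j]
            if not any(in_lune(p, q, r)
                       for k, r in enumerate(points) if k not in (i, j)):
                edges.append((i, j))
    return edges
```

For three collinear, equally spaced points the graph keeps the two short links but rejects the long one, since the middle point lies in its lune; this is exactly the dominated-region test applied to the set of dominant features.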
3.5 Discussion of the Dominant Feature Assumption
The dominant feature assumption states that there is one dominant feature in each texel of the repetitive pattern. Since our algorithm is based on this assumption, the algorithm
Figure 3-8: Application of Theorem 3.7. (a) Woven cane texture from Brodatz, plate D101; (b) a fundamental frequency relationship; (c) not a fundamental frequency.
should fail when the assumption is violated in the image. In this section we look at some artificial images and discuss the dominant feature assumption in relation to each image.

First let us consider a regular repetitive texture overlaid with a pattern of random features. The random features fall within the texels of the regular repetition, modifying them. If the random features are less prominent than the most prominent repetitive features then the dominant features of the texels will be repetitive features, and the dominant feature assumption implies that the random features can be ignored in extracting the repetitive structure. Figure 3-9 is an example of such a pattern, and it is not surprising that the repetitive pattern is immediately obvious to a human viewer.

In contrast, however, consider figure 3-10. This is a similar figure but in this case
the random features are equally prominent with the repetitive features. In this figure the repetition cannot be extracted under the dominant feature assumption because there are dominant features in the pattern which do not belong to the repetition. In fact, there are non-repetitive dominant features in the dominated region of almost every repetitive link in the pattern. With so many cluttering random features, humans also find it difficult to locate the repetitive pattern in figure 3-10.

A third example (figure 3-11) occurs when the random features are more prominent than the repetitive features. Again, under the dominant feature assumption the repetition cannot be extracted because the dominant features in each texel of the regular repetition are now random features. It appears to be more difficult for humans to observe the repetition in figure 3-11 than in figure 3-9. This is surprising because the repetitive and random features differ from each other just as much in the two figures. Evidently, the prominence of the features is important for perceiving repetitive structures.

So far we have examined figures that combine random features with a repetitive grid. In figure 3-12 we see an example of a repetitive texture that contains more than one
equally prominent feature in each texel. For human viewers, this is an easy texture to understand but it violates the dominant feature assumption so our algorithm will fail. Such failure would not be too serious if violations of the dominant feature assumption were rare. However, man-made repetitive textures often have more than one dominant feature in the texel (for example, the pairs of windows on the building in figure 1-5). As a solution to this dilemma, we propose extending the dominant feature assumption to
Figure 3-9: Grid features dominant.
Figure 3-10: Equally prominent grid and random features.
Figure 3-11: Random features dominant.
include dominant groups of features. Actually implementing group features is a subject for future research. Our algorithm is restricted to the use of ordinary features and will fail on figure 3-12. The proposed extension for group features would be based on a simple proximity grouping process. For example, in figure 3-12, a grouping process would be used to identify the pairs of features. Each pair would be identified as a single new feature, a group feature. The dominant feature assumption applied to the group features would then facilitate extraction of the repetitive structure.

Not all repetitive patterns with more than one dominant feature in each texel can be easily grouped, however. It is possible to construct patterns in which there are several equally prominent feature groups in each texel. In that case, simple grouping would not yield a dominant feature and the repetition could not be extracted under the dominant feature assumption. To demonstrate this effect, we constructed the pattern in figure 3-13. It was carefully constructed so that the human visual system finds competing groupings for the features and no clearly prominent group features are observed. Preattentively, it is not at all obvious to the human viewer that he is looking at a repetitive texture. Even after careful scrutiny, it is still not possible to perceive the entire repetitive pattern of the texture. This is surprising because we are so used to being able to see repetition when it is present in an image.

Of course, if we change the pattern of figure 3-13 so that one of the features in each texel is dominant then the repetitive structure of the whole pattern becomes immediately obvious, as in figure 3-14. Compare these two figures. Can you convince yourself that the structure of the dots is, in fact, the same?

One final example demonstrates that the dominance of a feature is not necessarily the result of an area difference.
In figure 3-15, the features have equal areas⁵ but one of them is a triangle. This difference appears to be sufficient for the human visual system to perceive the repetitive structure. The algorithms presented in this dissertation do not compute any shape parameters for the features, so this pattern cannot be analysed by our algorithm.

⁵The areas are equal within the limits imposed by printing resolution.
Figure 3-12: Repetition of paired features. This pattern is not analyzable under the dominant feature assumption, but can be handled by extending the assumption to allow group features.
Figure 3-13: A repetitive pattern with no single dominant feature in each texel and no dominant group features. Human observers find it extremely difficult to understand this pattern as a regular repetitive texture.
Figure 3-14: Repetitive pattern of figure 3-13 with the features altered so that one is dominant in each texel. The extraction of the repetition is now trivially easy.
Figure 3-15: Repetitive pattern of figure 3-13 with a triangle substituted for one feature in each texel. The triangle acts as a dominant feature and facilitates extraction of the repetition.
3.5.1 Summary

It appears from the above demonstrations that the dominant feature assumption is reasonable. We have explained how simple grouping of features can be used to resolve many textures where there is more than one dominant feature in each texel. We have also demonstrated an example where there is no dominant simple grouping. In that case, the texture is not easily (preattentively) analysed by the human visual system. There appears to be a similarity between the abilities of the dominant feature assumption and the abilities of the human visual system.
3.6 An Ideal Algorithm

From the theoretical basis that we have established we can now propose an ideal algorithm for understanding repetitive textures. The ideal algorithm is as follows.

1. Compute features with prominence values.
2. Compute the dominant feature relative neighbourhood graph: the graph in which two features are connected exactly when they are equally prominent and there are no more prominent features in their lune.
3. Extract local repetitions (collinear equally-spaced triples) from the above.

This algorithm is not practically applicable to real images because the repetitive textures in real images do not conform exactly to the theory. First, our theory assumes a perfectly regular grid, but the repetition in real textures is often distorted and the features are imprecisely located. The results of the theory must therefore be applied in an approximate way. Second, the prominence values of the dominant feature may vary from one texel to another, so the test for equal prominence in step 2 must be replaced by an approximate test. Finally, the local repetitions cannot be expected to be exact; an approximate method is needed to evaluate the repetitions.

In the following chapters we will develop and explain an algorithm which adapts the
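Step 3 of the ideal algorithm can be sketched in code. The helper below is hypothetical (its name and the tolerance parameter are illustrative assumptions, with the 25% figure borrowed from the rule of thumb of chapter 2): it scans the links around each feature for collinear, approximately equally spaced triples.

```python
from collections import defaultdict

def local_repetitions(points, edges, tol=0.25):
    """Extract candidate local repetitions: triples of features (a, b, c)
    formed by two graph links that continue each other through a shared
    feature b, with placement error at most tol times the link length."""
    nbrs = defaultdict(set)
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    triples = []
    for b in nbrs:
        for a in nbrs[b]:
            for c in nbrs[b]:
                if a < c:
                    # Residual between link vectors (b - a) and (c - b):
                    # zero for a perfectly regular, collinear triple.
                    va = (points[b][0] - points[a][0],
                          points[b][1] - points[a][1])
                    vc = (points[c][0] - points[b][0],
                          points[c][1] - points[b][1])
                    err = ((va[0] - vc[0]) ** 2 + (va[1] - vc[1]) ** 2) ** 0.5
                    mag = (va[0] ** 2 + va[1] ** 2) ** 0.5
                    if mag > 0 and err <= tol * mag:
                        triples.append((a, b, c))
    return triples
```

Given the relative neighbourhood graph from step 2, three collinear, equally spaced features yield exactly one triple; distorted repetitions survive as long as their placement error stays within the tolerance.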
above ideal method to handle the distortions and fluctuations which occur in real images.
We will demonstrate this algorithm on real data and show that it is indeed capable of extracting the repetitive structure.
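For perfect data, the ideal algorithm of section 3.6 can be sketched directly. The following Python sketch (all names are ours, not from the dissertation) represents features as (x, y, prominence) tuples, builds the dominant feature relative neighbourhood graph with the lune-emptiness test, and then collects collinear, equally-spaced triples:

```python
import math

def lune_is_empty(p, q, feats):
    """True if no feature at least as prominent as p and q lies strictly
    inside the lune of p and q: the intersection of the two open disks
    of radius d(p, q) centred at p and at q."""
    d = math.dist(p[:2], q[:2])
    prom = min(p[2], q[2])
    return not any(
        r != p and r != q
        and r[2] >= prom
        and math.dist(r[:2], p[:2]) < d
        and math.dist(r[:2], q[:2]) < d
        for r in feats)

def dominant_rng(feats):
    """Edges of the dominant feature relative neighbourhood graph: two
    features are joined exactly if they are equally prominent and no
    more prominent feature lies in their lune."""
    return [(p[:2], q[:2])
            for i, p in enumerate(feats)
            for q in feats[i + 1:]
            if p[2] == q[2] and lune_is_empty(p, q, feats)]

def local_repetitions(edges, tol=1e-6):
    """Collinear, equally spaced triples u -> v -> y formed by two graph
    edges that meet at v and continue with the same step vector."""
    triples = []
    for a, b in edges:
        for c, d in edges:
            for (u, v), (x, y) in (((a, b), (c, d)), ((a, b), (d, c)),
                                   ((b, a), (c, d)), ((b, a), (d, c))):
                if v == x and u != y and all(
                        abs((y[k] - v[k]) - (v[k] - u[k])) <= tol
                        for k in (0, 1)):
                    triples.append((u, v, y))
    return triples
```

On a perfect 3 by 3 unit grid of equally prominent features, the graph contains exactly the twelve horizontal and vertical neighbour edges (diagonal pairs fail the lune test), and the triples recovered are the rows and columns of the grid.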
Chapter 4

Overview of the Algorithm

This chapter presents an overview of the algorithm which we have developed for extracting detailed structural descriptions of repetitive textures. The algorithm consists of four phases which are combined in a purely bottom-up manner. The four phases are as follows.
1. Extract features.

2. Extract basic structural relationships between features.

3. Extract local repetitions.

4. Relaxation.

The first step is to extract image features. The features may, in principle, be any desired features of the image. In our work, feature detection was not the thrust of the research, so we used simple blob features. The extracted features include parameters such as area and intensity contrast and, most importantly, prominence values.

The second step is to extract basic structural relationships between the features. This processing is entirely symbolic and is based solely on the feature information extracted in the first step. The prominence values are used together with the dominant feature assumption to compute weighted structural links between features. These links reflect coarse structural information, primarily based on proximity and the dominant feature
assumption. There is a rough correspondence between this processing and the task of connecting the dots in a random dot pattern.

The third step is to extract local repetitive structures. We assume that the local repetitive structures should normally be prominent in the basic structural links between features. Therefore, we examine the structural links around each feature and identify candidate local repetitive patterns for further analysis. The local repetitive patterns may be +-shaped (representing repetition internal to a repetitive grid), T-shaped (representing a side of the repetitive grid) or L-shaped (representing a corner of the grid). The local repetitions are evaluated on the basis of the placement errors in the repetition they represent and on the basis of the goodness of the fundamental frequency vectors which they propose locally for the repetition.

The fourth step is relaxation, which is performed to resolve conflicts between the candidate repetitions extracted in the third step. Each candidate repetition examines its neighbouring repetitions and becomes stronger or weaker depending upon the support it receives. This enforces consistency between repetitively linked features and reflects the constraint that the repetitive grid is locally consistent.

Each phase of this algorithm involves highly parallel computation. It would therefore be well suited to implementation on a highly parallel machine.

The remainder of this chapter consists of a more detailed description of each of these phases of the algorithm. The most important phases (the extraction of basic structural relationships, the extraction of local repetitions and the relaxation phase) are described in detail in the following chapters. In this chapter, our intent is to give an overview of the algorithm and to show how it works. We will show each step in the processing of two samples of repetitive textures: a portion of the skyscraper and a portion of the woven cane texture.
The skyscraper (figure 4-1) exhibits perspective distortion which makes it difficult to analyse. Our results demonstrate that the local repetition model is very capable of handling such perspective distortion. The woven cane texture (figure 4-2) is an example of a texture which has more than one feature in each texture element. Extracting the repetition from this texture demonstrates the application of the dominant feature assumption.
Figure 4-1: A portion of the face of a skyscraper. This texture exhibits severe perspective distortion which can be accommodated by the local repetition model but not by the other models discussed in chapter 2.
Figure 4-2: A sample of woven cane texture from Brodatz, plate D101. This texture contains several features in each texel and demonstrates the dominant feature assumption in action.
4.1 Feature Extraction

In chapter 3 we discussed the important characteristics of the low-level features. The
major characteristic, which forms the basis of the dominant feature assumption, is that every extracted feature must have an associated prominence value and a location in the image. It is also important not to miss any real features when extracting the features from images, because the processing is strictly bottom-up and a feature which is omitted cannot be recovered. However, it is less of a problem if features are extracted which are not really present in the image, especially if their prominence values are low.

Among the theoretical approaches to low-level feature detection in images, the texton theory of Julesz [18] and Marr's tokens [25] are the closest conceptual allies to the features that are desired by our theory. In practice, however, good methods to extract textons and tokens have yet to be developed. In particular, rather than ensuring that no features are missed, existing techniques are designed to extract only the most obvious features and therefore tend to omit important data.

The features that we used in the experiments for this dissertation were simple blob features extracted from monochromatic images. The prominence of each feature was taken as its area multiplied by its intensity contrast against the background. We used a feature detector originally proposed by Ahuja and Haken [1]. This detector identifies blob features based on Laplacian of Gaussian filtering and directly measures both the size of the feature and its intensity contrast. We modified the detector to remove responses to intensity ridges in the image. Details of the algorithm are presented in section 5.1.

The feature detector produces two separate sets of features: a set of bright features and a set of dark features. For some textures it is more appropriate to analyse one set of features than the other. For example, the skyscraper (figure 4-1) is best analysed with dark features while the woven cane (figure 4-2) is best analysed with bright features.
In principle, the results of completely processing both dark and bright features could be
combined in a final phase of the algorithm. This was not done in the present research; results are presented for the type of feature most suited to the images, as judged by the author. Figure 4-3 shows the dark features which were extracted from the portion of the
original image in figure 4-1.

Figure 4-3: Features extracted from figure 4-1.

Each disk represents a separate feature. The area of the disk is the measured area of the blob and the darkness of the disk is the measured intensity contrast. The features vary in size and spacing as a result of the perspective distortion present in the image. A considerable number of small features have been extracted which are not meaningful. Our algorithm will not be affected by these features.

The bright features extracted from the woven cane texture are shown in figure 4-4. The small features in this pattern are meaningful but they have a low prominence value. In subsequent processing the algorithms will find the repetition of the dominant features,
by using the dominant feature assumption.
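As a minimal illustration of the feature representation assumed in this phase (the class name and field layout are ours, not the dissertation's), each feature carries a point location, its blob measurements, and a prominence defined as area times the magnitude of the intensity contrast:

```python
from dataclasses import dataclass

@dataclass
class Feature:
    x: float          # point location in the image
    y: float
    area: float       # measured blob area
    contrast: float   # signed intensity contrast (positive = bright blob)

    @property
    def prominence(self) -> float:
        # area multiplied by the magnitude of the intensity contrast;
        # invariant under translation of the feature in the image
        return self.area * abs(self.contrast)

    @property
    def is_bright(self) -> bool:
        # bright and dark features are processed as separate sets
        return self.contrast > 0
```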
4.2 Basic Structural Relationships

Once features have been extracted it is necessary to start organizing them in some manner. Witkin and Tenenbaum [44] have suggested that low-level organization of visual information is an important first step of hypothesizing meaningful relationships. The proposed relationships are not always useful in the final analysis of the scene, but they are usually useful and, in the absence of other information, they probably reflect causal relationships in the scene.
Figure 4-4: Features extracted from figure 4-2.

Since our aim is to analyze repetitive textures, we desire to start organizing the features in a manner which will provide a basis for understanding repetitive structures in the scene. At the same time, we wish to maintain highly local computation. Therefore, we do not start by looking for repetitive structures; we start by grouping things together which are probably related. The basis for this initial grouping is the dominant feature assumption and the dominated region assumption. Together, these assumptions lead to the following two basic concepts.

1. Prejudice. Features prefer to link themselves to other features which are equally or more prominent.

2. Interference. The link between two features is weakened if there is another equally or more prominent feature which is in the dominated region of the first two features.

The Prejudice Principle is derived from theorems 3.5 and 3.6. If the second feature is equally prominent or more prominent than the first feature, then it is possible that the two features are dominant features of different texels, and then we are interested in their relationship. If the second feature is considerably weaker than the first feature, then
it is unlikely that it is a dominant feature, so the relationship is not interesting. While theorems 3.5 and 3.6 are not directional, the links under the Prejudice Principle are directional: a feature Fi may have a link to another feature Fj while Fj has no link to Fi. This directionality is incorporated so that a non-dominant feature will be linked to its neighbouring dominant features.

The Interference Principle is a restatement of theorem 3.7. That theorem shows that a link between two dominant features is a fundamental frequency link only if there are no other dominant features in the lune (or, more generally, the dominated region) of the first two features.

As presented here, the Prejudice Principle and the Interference Principle are too rigid to be applied to real data. Real data exhibits distortions and noise effects which cause these two principles to be only approximately true. Chapter 5 presents details of the application of these principles in the phase of our algorithm that extracts basic structural relationships.

The algorithm extracts basic structural relationships for all features. Figures 4-5 and 4-6 display the basic structural relationships for all features in the two sample images (figures 4-3 and 4-4). The links are directional: the link from a less prominent feature to a more prominent feature is often strong while the reverse link may be very weak. In the figures, this is illustrated by a thick line emanating from the weak feature and reaching half-way to the strong feature.
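A simplified sketch of how the two principles could yield directional link weights follows; the ratio-based prejudice weight, the `softness` parameter, and the hard interference cut are our illustrative choices (the dissertation instead applies smooth thresholds, described in section 4.5.1):

```python
import math

def link_weight(a, b, features, softness=2.0):
    """Directional weight of the link from feature a to feature b.
    Features are (x, y, prominence) tuples.
    Prejudice: the link is strong only if b is at least as prominent as a.
    Interference: the link is killed by any equally-or-more prominent
    feature lying in the lune of a and b."""
    w = min(1.0, (b[2] / a[2]) ** softness)        # prejudice
    d = math.dist(a[:2], b[:2])
    prom = min(a[2], b[2])
    for c in features:
        if c == a or c == b:
            continue
        if (math.dist(c[:2], a[:2]) < d and math.dist(c[:2], b[:2]) < d
                and c[2] >= prom):
            return 0.0                             # interference
    return w
```

Note the asymmetry: a link from a weak feature towards a strong one saturates at full weight, while the reverse link is attenuated by the prominence ratio.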
4.3 Local Repetitions
When the basic structural relationships have been extracted, we have the information needed to extract local repetitive relationships for each feature. The purpose of this processing is to move from basic structural grouping, which reflects the fundamental frequency constraints, to a first understanding of two-dimensional repetitive structure. The basic structural relationships reflect the likelihood of each link being a fundamental frequency link, so it only remains to discover whether the strong links participate in repetitive relationships.
Figure 4-5: Basic structural relationships for the skyscraper sample image, thresholded to reduce clutter in the figure.
Figure 4-6: Basic structural relationships extracted from the woven cane sample image, thresholded to reduce clutter in the figure.
We consider three types of repetitive relationships. The most complete repetitive relationship is a cross (+-shaped) relationship, such as in figure 4-7 (a). Cross relationships occur in the interior of repetitive patterns. In a cross relationship, each of the repetitive links has an opposite: another repetitive link which emanates from the same central feature but in the opposite direction. Each repetitive link and opposite pair should satisfy the basic repetitive relationship of collinearity and equal spacing which is the local repetition model.

Along the edges of repetitive regions, T-shaped relationships (such as in figure 4-7 (b)) occur. In T-shaped relationships, one of the repetitive links has no opposite link. At the corners of repetitive regions, L-shaped relationships occur (see figure 4-7 (c)). In L-shaped relationships, neither repetitive link has an opposite. This means that it is not possible to directly determine whether an L-shaped relationship is part of an actual repetition; the decisions are made during the final relaxation phase of the algorithm.

In the repetition extraction phase of the algorithm, we extract potential local repetitive relationships of each variety: cross, T-shaped and L-shaped. As a computational consideration, we limit the number of each type of relationship retained for further processing. We evaluate each local repetition on the basis of a number of considerations and retain the strongest repetitions. Two important considerations in this evaluation are: (a) how well the predicted repetitive structure in cross and T-shaped relationships is actually obeyed in the data, and (b) how well the fundamental frequency vectors of the local repetition obey the perpendicularity constraint in theorem 3.3.

Figures 4-8 and 4-9 show the local repetitions extracted from the two sample images. Only the strongest repetition at each feature is displayed, as the figure would otherwise be impossibly complex. However, the algorithm actually retains many other local repetitions which are given further consideration in the final relaxation phase. Some of the features in these figures have no repetitions displayed because all the extracted repetitions for those features had unacceptably low evaluations.
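Considerations (a) and (b) each reduce to a small numeric score. A sketch under our own conventions (the function names and the particular normalizations are ours): the placement error of a u, centre, v triple relative to the step length, and the |sin| of the angle between the two proposed fundamental frequency vectors, which is 1.0 at the perpendicular ideal:

```python
import math

def placement_error(u, centre, v):
    """Deviation of the triple u -> centre -> v from a perfect collinear,
    equally spaced repetition, measured relative to the step length."""
    ex = (v[0] - centre[0]) - (centre[0] - u[0])
    ey = (v[1] - centre[1]) - (centre[1] - u[1])
    step = math.hypot(centre[0] - u[0], centre[1] - u[1])
    return math.hypot(ex, ey) / step

def perpendicularity(f1, f2):
    """|sin| of the angle between two proposed fundamental frequency
    vectors: 1.0 when perpendicular (ideal), 0.0 when parallel."""
    cross = f1[0] * f2[1] - f1[1] * f2[0]
    return abs(cross) / (math.hypot(*f1) * math.hypot(*f2))
```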
(a) Cross relationship
(b) T-shaped relationship
(c) L-shaped relationship
Figure 4-7: Types of repetitive relationships
Figure 4-8: The best local repetitions found in the sample skyscraper image (not thresholded).
Figure 4-9: The best local repetitions found in the sample woven cane image (not thresholded).
4.4 Relaxation
In the extraction of basic structural relationships we exploited the dominant feature assumption and the theorems on the fundamental frequency vectors. In the extraction of local repetitions we exploited the local repetition model. One remaining assumption is unexploited: the assumption of consistency in the repetitive structure.

In textures where the second fundamental frequency vector is close to being ambiguous, the repetitions extracted at adjacent features may not be compatible with each other even though both features are part of the same repetitive grid. This effect may be observed in figures 8-15 (c) and 8-17 (c). Incompatibility can also occur as a result of texture variations and noise effects that cause an alternative interpretation to be preferred at a particular feature. In addition, we have yet to distinguish between L-shaped relationships that are not part of any repetition and those which occur at the corner of a repetitive grid. These issues are addressed by a relaxation algorithm.

The relaxation phase of the algorithm is designed to resolve incompatibilities and to identify T-shaped and L-shaped relationships which are part of larger regions of repetition. The algorithm is simple, fast and effective. Each repetitive relationship at each feature which is still under consideration is compared with the best repetitive relationship at neighbouring features; if there is a match, the repetitive relationship being considered is made stronger. Repetitive relationships which are not matched are made weaker and are eventually removed.

Figures 4-10 and 4-11 show the final results obtained by iterated relaxation. These final results demonstrate the success of our algorithm on these sample data. In the following chapters we will present results obtained on larger images which demonstrate the segmentation ability of this algorithm and its ability to handle diverse textures.
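The support-driven strengthening and weakening described above might be sketched as follows; the update rule, parameter values and data layout are illustrative assumptions of ours, not values taken from the dissertation:

```python
def relax(candidates, neighbours, match, rounds=10, gain=0.2, decay=0.1, floor=0.05):
    """candidates: {feature: [(repetition, strength), ...]}
    neighbours: {feature: [feature, ...]}
    match(r1, r2): True if two repetitions are compatible."""
    for _ in range(rounds):
        # best surviving repetition at each feature, judged by strength
        best = {f: max(reps, key=lambda r: r[1])[0]
                for f, reps in candidates.items() if reps}
        new = {}
        for f, reps in candidates.items():
            updated = []
            for rep, s in reps:
                supported = any(n in best and match(rep, best[n])
                                for n in neighbours.get(f, []))
                s = s * (1 + gain) if supported else s * (1 - decay)
                if s >= floor:        # unmatched repetitions eventually vanish
                    updated.append((rep, min(s, 1.0)))
            new[f] = updated
        candidates = new
    return candidates
```

With equality as the match test, a weaker hypothesis that agrees with a neighbour's best repetition overtakes an initially stronger but unsupported rival within a few iterations.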
4.5 Handling Real Data
Our algorithm is intended to handle real data, but real data is rarely well-behaved. Real repetitive textures exhibit distortions which cause the repetitive pattern to change across the image. Adjacent texels in real repetitive textures are not identical. The prominence values for the dominant features in real repetitive patterns vary as a result of distortion and noise. In short, real data exhibits a lot of variation. Processing real data requires allowing for the variation while simultaneously extracting the order which underlies the data.

Figure 4-10: Final repetitive structure extracted from figure 4-1.

Figure 4-11: Final repetitive structure extracted from figure 4-2.

In this chapter we introduced a number of concepts which are easily applied to perfect data but are difficult to apply to real data:

• The Prejudice Principle and the Interference Principle are used to extract basic structural relationships from the data. On a perfectly regular repetition, these principles are easily applied by a simple decision process.

• The evaluation of local repetitive structures is based on how well they obey the local repetition model and how well they obey the perpendicularity constraint of theorem 3.3. For perfect data, both of these constraints can be applied by simply discarding local repetitive structures which do not satisfy the constraints.

• The relaxation phase of the algorithm depends upon matching local repetitive structures to determine whether or not they are compatible. For perfect data, this is a simple decision based on exact equality of the repetitive vectors.

Each of the above constraints can be applied directly to perfect data by a simple decision process: a test or a threshold. However, when dealing with real data we must allow for deviation from the exact constraints. One way to make such allowance is to loosen the tests; for example, when matching two repetitive structures, to allow some error in the match. In this approach, the match error is measured and then thresholded, yielding a binary decision: either the two structures match (well enough) or they don't match at all.

The approach which we have used throughout our algorithm is to replace binary thresholding with a smoothed evaluation function. Rather than making a binary decision which results in either a yes or no answer, we evaluate the adherence to a constraint as a number in the range [0,1].
An evaluation of one represents absolute adherence to the constraint and a value of zero represents absolute violation of the constraint. Values in between represent varying degrees of departure from the ideal. These values are similar
to probabilities and we sometimes combine them as though they were probabilities, but they are not really probability values. The smooth threshold approach has the following advantages over the more traditional binary threshold approach:

• Data is not unduly discarded. In the binary threshold approach, a sample of data which marginally fails to satisfy the set threshold is considered totally unacceptable and is usually discarded from further processing. Since variation is a characteristic of acceptable real data, binary thresholds must be set very wide to allow all possibly useful data to be retained, or the algorithm must go back later and recover data which has been discarded. Setting wide thresholds admits a large amount of useless data, while backtracking requires time-consuming re-evaluation.

• Useful ranking information is retained. In the binary threshold approach, a sample of data which is marginally acceptable is not distinguished from a sample of data which is very close to ideal. The algorithm may have to re-evaluate the data later to determine which of the acceptable alternatives is the best. In the smooth threshold approach, the threshold evaluation is higher for a sample of data which is closer to the ideal.

In addition to using smooth thresholds, we consistently retain multiple alternative hypotheses throughout the algorithm. For example, 50 basic structural links are maintained for each feature even though only a few of them are actually useful, and 50 competing local repetitive structures are maintained even though only one is actually applicable. This prevents backtracking to recover lost data. It is similar to smooth thresholding in that it postpones making a binary decision (which piece of data to keep) until more information can be gathered. We have found that, when dealing with real data, it is important not to make decisions too early. We avoid early decision making by using smooth thresholds and by retaining multiple competing hypotheses.
In the following chapters we present details of the
implementation of each phase of our algorithm and discuss the use of smooth thresholds and multiple competing hypotheses.
4.5.1 Smooth Thresholding

Since smooth thresholding is used in many places throughout our algorithm, we will now present details of the smooth thresholding function that we use. Although many possible smooth thresholding functions are available, throughout this dissertation we will concentrate on a smooth threshold function T defined as follows.

T(x, τ, σ) = (1/2) tanh((x − τ) / (σ / 0.55)) + 1/2

This function was chosen because of three desirable properties which can be seen in figure 4-12: it is monotonic, it converts an infinite domain to the range (0, 1), which is the desired range of evaluations, and it never actually reaches the asymptotes 0 (false) and 1 (truth). This means that ranking information is available at all times, subject to round-off error. These properties are shared by many probability density functions; our function T is a particular parameterization of the sigmoid distribution [37, page 14].

T has the following additional properties.

• T(τ, τ, σ) is always 0.5.

• The factor of 0.55 was chosen so that T(τ + σ, τ, σ) is approximately 0.75 and T(τ − σ, τ, σ) is approximately 0.25.

These properties provide a simple semantic understanding of the parameters: τ is the threshold and σ defines the 25- and 75-percentile points relative to τ.

In addition to T, we define a second smooth threshold function which converts the domain (0, ∞) to the range (0, 1). This is more appropriate when the input value is restricted to be strictly positive. Figure 4-13 shows an instance of the smooth threshold function TL, which is defined as follows.

TL(x, τ, σ) = T(log(x/τ), 0, log σ)

This function is especially appropriate for input values x which are ratios of two positive numbers. For example, it is used to compare the prominence values of features.
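The two functions translate directly into code. A sketch (note that TL is monotonically increasing only when σ > 1, so that log σ > 0):

```python
import math

def smooth_threshold(x, tau, sigma):
    """T(x, tau, sigma) = (1/2) tanh((x - tau) / (sigma / 0.55)) + 1/2."""
    return 0.5 * math.tanh((x - tau) / (sigma / 0.55)) + 0.5

def smooth_log_threshold(x, tau, sigma):
    """TL(x, tau, sigma) = T(log(x / tau), 0, log(sigma)).
    Requires x, tau > 0 and sigma > 1; appropriate when x is a ratio
    of two positive quantities, such as two prominence values."""
    return smooth_threshold(math.log(x / tau), 0.0, math.log(sigma))
```

Since tanh(0.55) ≈ 0.5005, the evaluation is almost exactly 0.75 at one σ above the threshold and 0.25 at one σ below it, matching the stated properties.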
Figure 4-12: Smooth threshold function T
Figure 4-13: Smooth log threshold function TL
Chapter 5

Basic Structural Relationships

The four major phases of our algorithm are: extraction of features, extraction of basic structural relationships, extraction of local repetitive relationships, and relaxation. In chapter 4 we presented an overview of this algorithm. In this chapter we will describe the feature extractor which we used and then discuss the extraction of basic structural relationships in detail.

Feature extraction is the first phase of our algorithm. The feature extractor processes the raw image data and produces symbolic features which represent significant features of the image. In our algorithm, the features are blobs: small regions which are darker or lighter than their surround. Each feature is characterized by its size and its intensity contrast. The prominence value of each feature is then computed as the area multiplied by the magnitude of the intensity contrast.

Once features have been extracted it is necessary to start organizing them in some manner. Since our aim is to analyze repetitive textures, we start organizing the features in a way which forms a basis for understanding repetitive structures in the scene. The initial structural relationships which we extract are based on the dominant feature assumption and the dominated region assumption, both of which were introduced in chapter 3. Together, these assumptions lead to the Prejudice Principle and the Interference Principle, which are stated as follows.

1. Prejudice. Features prefer to link themselves to other features which are equally or more prominent.
2. Interference. The link between two features is weakened if there is another equally or more prominent feature which is in the dominated region of the first two features.

The extraction of basic structural relationships is a practical application of these two principles to real data. Because real data contains distortions and noise effects which are not included in the theories of chapter 3, the smooth thresholding technique is used to evaluate potential structural links to determine how well they adhere to the principles. The structural links, together with their evaluations, are the result of this phase of our algorithm.

The algorithm for extracting basic structural relationships consists of two sub-phases. First, a local neighbourhood of features is established for each feature. Then, the links between the central feature and its local neighbours are evaluated to obtain the basic structural relationship links. Section 5.2 describes the technique which we use to establish a local neighbourhood and section 5.3 describes the evaluation of the structural links.
5.1 Feature Extraction

This section describes in detail the feature detector which was used throughout the experiments reported in this dissertation. Feature detection is not the major thrust of our work, but the nature of the work imposes certain requirements on the feature detector, so we developed a feature detector to meet those requirements.

The major requirement for our feature detector is that every extracted feature must have an associated prominence value and a location in the image. The location in the image forms the basis for identifying repetitive structure and the prominence value is required by the dominant feature assumption. The prominence value of a feature is required to be invariant under translations of the feature in the image. Because we are looking for two-dimensional repetitive textures, we need features which have point locations. Edges, which have unknown extent in one direction, are unacceptable. Corners, bounded regions of constant intensity, and blobs are typical features that could be used. In this dissertation, we chose simple blob features.
A second requirement for our feature detector is that it should not miss any of the dominant features of the repetitive textures. This is because our processing is strictly
bottom-up and a feature which has not been extracted cannot be recovered later. However, our algorithm is designed to ignore features which have a low prominence value compared to the dominant features, so it does not matter if the feature extractor finds features which are not really present in the image.
5.1.1 Feature Detector Algorithm

The feature detector that we used for the experiments in this dissertation is based on the work of Ahuja and Haken [1]. It is not intended as an all-round feature detector but simply as a method of supplying adequate data for our algorithm. It has some deficiencies which we will elaborate later.

In their paper on three-dimensional texture, Ahuja and Haken propose performing feature detection in scale space by convolving the image with two filters at each scale σ: the Laplacian of Gaussian filter ∇²G_σ and a second filter derived from it, defined in terms of r = √(x² + y²). Ahuja and Haken solve for the response of each of these filters to an image consisting of a disk of one intensity on a background of a different intensity. At the centre of the disk, the solution has a closed form which allows the diameter of the disk D and its intensity contrast A to be recovered from the two filter responses at that point; the closed-form expressions are given in [1]. Here ∇²G_σ * I denotes the result of convolving ∇²G_σ with the image I.
The optimal situation for estimating the disk parameters occurs when the disk diameter D = 2√2 σ. For this reason, Ahuja and Haken use the filter width w = 2√2 σ as an alternative description of the filter size.
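The scale-selection bookkeeping of the detector, using the filter widths from our experiments (4 to 32 in a geometric progression) and the acceptance range [w/√2, 2w] discussed below, can be sketched as follows (the function names are ours):

```python
import math

FILTER_WIDTHS = [4, 5.6, 8, 11.2, 16, 22.4, 32]   # w = 2*sqrt(2)*sigma

def sigma_for_width(w):
    # the disk diameter is estimated best when D = 2*sqrt(2)*sigma = w
    return w / (2 * math.sqrt(2))

def acceptance_range(w):
    # disks whose estimated diameter falls outside [w/sqrt(2), 2w] are
    # discarded; they are extracted more reliably at a neighbouring scale
    return w / math.sqrt(2), 2 * w

def covered(diameter):
    """True if at least one filter size accepts a disk of this diameter."""
    return any(lo <= diameter <= hi
               for lo, hi in map(acceptance_range, FILTER_WIDTHS))
```

Because each acceptance range spans a factor of 2√2 while consecutive widths differ only by a factor of about √2, adjacent ranges overlap and no diameter between roughly 2.8 and 64 pixels falls through the gaps.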
Our feature detector employs these closed-form solutions directly to estimate the diameter and intensity contrast of blob features in the image. The feature extraction algorithm proceeds as follows.

First, the input image is convolved with the two filters for the following values of w: 4, 5.6, 8, 11.2, 16, 22.4 and 32. These filter sizes provide a geometric progression which covers a wide range of feature sizes. The convolution is performed in the Fourier domain.

Second, each pair of filtered images is processed to extract disks. Disks are extracted at local extrema of the image ∇²G_σ * I. Disks are discarded if their intensity contrast is below an absolute threshold. Also, disks are discarded if their diameter varies too much from the filter size w; such disks can be extracted more optimally at another scale. The disks are extracted as two separate sets: a set of bright disks corresponding to bright features and a set of dark disks corresponding to dark features. In all subsequent processing, these two sets of features remain separate.

In our experiments, disks which had an intensity contrast of less than 20 were discarded. For 8-bit images, this corresponds to less than one tenth of the data range. Disks with a diameter smaller than w/√2 or larger than 2w were also discarded. We made the acceptable range of disk sizes wider than the factor of √2 which separates the filter sizes so that each disk will be found at at least one filter size. If there were no overlap between the acceptance ranges of the different filters then some features would be missed: the diameter estimated at one filter size can be too big while the diameter estimated at the next filter size is too small. We avoid this potential problem by widening the thresholds so that they overlap.

The first two steps, above, are a very effective way of extracting disks representing compact blobs in the image. However, disks are also extracted in response to ridges in the image and in response to strong edges. In its third step, our feature extractor removes the more extreme cases of response to ridges. This is done by scanning the ∇²G_σ * I image around each extracted disk. If the disk is actually a response to a ridge, there will
be a ridge in the ∇²G_σ * I image around the extracted disk. The feature extractor tries to detect these ridges as follows.

Let (∇²G_σ * I)_c denote the value of the ∇²G_σ * I image at the disk centre (the local extremum). The ∇²G_σ * I image is thresholded at a(∇²G_σ * I)_c. An eight-connected region is extracted around the disk centre in this thresholded image. If any point of the connected region is further than three times the radius of the disk from its centre, then the disk is considered to be a response to a ridge. In the actual implementation, the thresholding is performed on-the-fly and the algorithm walks the boundary of the connected region to determine whether any point is too far from the centre; the connected region is never actually extracted.

The value of a is set depending on the filter size because smaller filters are more sensitive to noise than larger filters. In our experiments, a₀, the value for the smallest filter, was set to 0.8; for an arbitrary filter width w the value of a was given by the following equation.
α = 1 − 4(1 − α₀)/w
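The ridge test above can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes the reconstructed threshold rule (α equal to α₀ at an assumed smallest filter width of 4, rising toward 1 for larger filters), and it extracts the eight-connected region explicitly rather than walking its boundary on-the-fly.

```python
def is_ridge_response(log_image, cx, cy, radius, w, alpha0=0.8, w0=4.0):
    """Flag a disk as a ridge response if the thresholded eight-connected
    region around its centre reaches beyond three times the disk radius.
    log_image is the Laplacian-of-Gaussian filtered image (list of rows)."""
    alpha = 1.0 - (w0 / w) * (1.0 - alpha0)  # reconstructed threshold rule
    peak = log_image[cy][cx]                 # local extremum at disk centre
    limit2 = (3.0 * radius) ** 2
    height, width = len(log_image), len(log_image[0])

    def above(x, y):
        v = log_image[y][x]
        return v >= alpha * peak if peak > 0 else v <= alpha * peak

    seen = {(cx, cy)}
    stack = [(cx, cy)]
    while stack:  # flood-fill the thresholded eight-connected region
        x, y = stack.pop()
        if (x - cx) ** 2 + (y - cy) ** 2 > limit2:
            return True  # region extends too far from the centre: a ridge
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (0 <= nx < width and 0 <= ny < height
                        and (nx, ny) not in seen and above(nx, ny)):
                    seen.add((nx, ny))
                    stack.append((nx, ny))
    return False  # region stayed compact: a blob response
```

A compact 3x3 blob passes the test, while a long horizontal ridge of equal intensity is rejected.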
The fourth and final step performed by our feature detector is to remove redundant and overlapping features. The feature detector discards a disk d1 if there exists another disk d2 which satisfies all of the following conditions.

1. The centre of d1 is inside d2 or the centre of d2 is inside d1. This determines whether or not there is substantial overlap between the features.
2. The prominence of d2 is greater than the prominence of d1. The prominence of a feature is defined as its area multiplied by the magnitude of its intensity contrast.
3. The magnitude of the intensity contrast of d1 is less than four times the magnitude of the intensity contrast of d2. This condition prevents large dim features from deleting small bright features.

This algorithm is performed by exhaustive search.
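The three conditions translate directly into an exhaustive-search pruning pass. A sketch, with disks represented as dictionaries (a hypothetical layout, not the thesis data structure) and prominence computed as area times contrast magnitude:

```python
import math

def prominence(disk):
    # prominence = area multiplied by the magnitude of intensity contrast
    return math.pi * disk["r"] ** 2 * abs(disk["contrast"])

def centre_inside(a, b):
    # is the centre of disk a strictly inside disk b?
    return (a["x"] - b["x"]) ** 2 + (a["y"] - b["y"]) ** 2 < b["r"] ** 2

def remove_redundant(disks):
    """Exhaustive search: discard d1 if some d2 overlaps it, is more
    prominent, and has at least a quarter of d1's contrast magnitude."""
    keep = []
    for d1 in disks:
        doomed = False
        for d2 in disks:
            if d2 is d1:
                continue
            overlap = centre_inside(d1, d2) or centre_inside(d2, d1)
            stronger = prominence(d2) > prominence(d1)
            contrast_ok = abs(d1["contrast"]) < 4 * abs(d2["contrast"])
            if overlap and stronger and contrast_ok:
                doomed = True
                break
        if not doomed:
            keep.append(d1)
    return keep
```

The double loop makes the quadratic cost of the exhaustive search explicit.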
5.1.2 Discussion

The feature detector which we have described above performs satisfactorily on most of our images and provides raw data suitable for use by our algorithm. The strengths of this particular feature detector are that it is reasonably simple to implement and that it is very good at not missing the features present in the images. The latter is especially important in our work as we do not return to the original image to recover missed features.

One of the weaknesses of our feature detector is that it responds to large-scale changes in the average image intensity. This produces large features that have a high prominence value. These prominent features suppress any small features which occupy the same portion of the image. This sometimes results in the loss of useful features corresponding to the texture present in the image. For example, in extracting features from the texture composite image in figure 8-7 our feature extractor responds to variations in the average intensity of the burlap texture in the lower left corner of the image. Figures 8-8 and 8-9 show the extracted features; the large-scale features are easily seen scattered amongst the texture features. In the final results (figures 8-10 and 8-11) there are gaps in the repetitive pattern caused by the large-scale features.

Another weakness is that our feature detector cannot extract highly elongated features such as occur in figure 8-25. In this figure, extreme perspective distortion of the repetitive texture on the top of the cylinder results in highly elongated features which are not extracted. As a result, our algorithm cannot find the repetitive structure in this portion of the image.

In general, however, the data obtained from the feature detector which we have described has been sufficiently complete and accurate so that our algorithm is able to analyse repetitive textures in a variety of images.
Chapter 8 presents detailed results of the entire algorithm on a variety of images, including displays of the features which were extracted from a number of the images. The feature detection algorithm was implemented on a Sun 3/60 and the Warp systolic array computer [3]. The filter convolutions were performed in the Fourier domain on the Warp, requiring 3.5 seconds for each 512 x 512 Fourier transform and 620 ms for the complex multiplication. Each filter convolution requires three Fourier transforms and one complex multiplication. Eight filter sizes were used with two filters at each size for
a total of 179 seconds of Warp processing per image. Analysis of the filtered images was performed by a C program running on a Sun 3/60 and took 272 seconds for the data in
figure 8-34.
5.2 Establishing a Local Neighbourhood

The first task in computing the basic structural groupings is to establish a local neighbourhood of features. This is pure computational expediency: the link evaluation function is defined between all pairs of features in the image. However, it is generally true that links between widely separated features will be weak, so it is computationally reasonable to establish a local neighbourhood. It is required of this processing that the local neighbourhood of each dominant feature of a repetitive texture should include the neighbouring dominant features in the repetitive structure. If this requirement is not met then the relationships between repetitive neighbouring dominant features will be lost and they will not be subsequently recovered. For this reason, the local neighbourhood is made reasonably large: fifty features are included in each local neighbourhood.

It is advantageous for the locality to be balanced around the central texel. In particular, it is not appropriate to use the nearest N neighbours based on Euclidean distance. Near
the border between a dense texture and a sparse texture, the Euclidean locality includes many features from the dense texture and few from the sparse texture. In figure 5-1, for instance, the central feature belongs to the repetitive texture on the right, but only two features from that repetition are included in a Euclidean neighbourhood of ten features. In this example, the Euclidean neighbourhood isolates the feature from one of its repetitive neighbours, which is not a good start when the aim is to find repetitive neighbours! To provide a basis for a balanced locality, we introduce the six-connected graph which is defined below. The extraction of a local neighbourhood involves the following two steps.

1. Compute the modified six-connected graph as defined below.
2. For each feature, perform a breadth-first search on the modified six-connected graph starting at the feature under consideration. Features that are encountered in
the search and have sufficient prominence are entered into the neighbourhood of the central feature until a fixed limit of 50 neighbouring features has been accumulated.

This algorithm obtains a balanced neighbourhood of 50 features surrounding each feature. The prominence test is employed to screen the neighbours so that a clutter of irrelevant features around a dominant feature will not prevent that dominant feature from having neighbouring dominant features in its neighbourhood. Figure 5-2 shows the 50 neighbours of a single feature in the woven cane sample image.

Figure 5-1: Euclidean neighbourhood of a texel feature.
5.2.1 The Six-connected Graph

The six-connected graph is defined as follows.

Definition 5.1 Let P = {p1, p2, ..., pn} be a set of points in the plane. Let θ(pi, pj) be a function which returns the angle of orientation of the vector from pi to pj in degrees, and let D(pi, pj) denote the Euclidean distance from pi to pj. Let Q(pi, pj) be defined by Q(pi, pj) = ⌊θ(pi, pj)/60⌋ where ⌊·⌋ denotes the integer floor operation. It can be seen that Q(pi, pj) returns an integer index that identifies which of the six sixty-degree sectors of angle is occupied by the vector from pi to pj. The six-connected graph is a directional graph which contains an edge from pi to pj if there does not exist another point pk for which D(pi, pk) < D(pi, pj) and Q(pi, pk) = Q(pi, pj).
Figure 5-2: Neighbours of a single feature

The six-connected graph is so called because it generally has six connections for each node. In the above definition, however, more than six connections may be present at a given node if that node has more than one equally close neighbour in a single sixty-degree sector. In our algorithm, we modify the six-connected graph so that it has only one neighbour in each sixty-degree sector. Ties are broken arbitrarily. In the rare cases where two features have identical locations (which could not happen in the set of points P but can happen in real data), the tie is broken in such a manner as to guarantee that the two features do not each see the other at the same orientation.

Figure 5-3 shows the six-connected neighbourhood for the same configuration of points as in figure 5-1. In contrast to the bad characteristics of the Euclidean neighbourhood, the six-connected neighbours are evenly balanced around the central feature. Figures 5-4 and 5-5 illustrate the six-connected graphs of the features extracted from the skyscraper and woven cane sample images (figures 4-1 and 4-2).

Figure 5-3: Six-connected neighbourhood of a texel feature.

Figure 5-4: Six-connected graph of features

Figure 5-5: Six-connected graph of features

One interesting property of the six-connected graph is that it is a super-set of the relative neighbourhood graph [41]. The relative neighbourhood graph has been proposed as an approximation to human perceptual grouping of dot patterns [32,41] and has well-known properties. The six-connected graph includes all the connections of the relative neighbourhood graph, but may include more. We compute the modified six-connected graph by the obvious brute-force method, with computational complexity quadratic in the number of image features involved.¹
¹This is the same computational complexity as the best-known algorithm for the relative neighbourhood graph [41], but the algorithm given there is much more difficult to implement. The relative neighbourhood graph could be computed in O(n²) time by first computing the six-connected graph and then extracting the relative neighbourhood graph from it.

5.2.2 Breadth-First Search

The modified six-connected graph links the features together in a tight web of links and cross-links. This graph is used as the basis for establishing a balanced local neighbourhood consisting of a fixed-size set of features. The neighbourhood is extracted from the modified six-connected graph by a breadth-first search process. Each feature is taken in turn as the central feature for which a neighbourhood is to be extracted. A threshold is set based on the feature's prominence value; other features which are less prominent than the required threshold are not admitted to the neighbourhood when they are encountered in the breadth-first search. The search proceeds until fifty features have been admitted to the neighbourhood. It is not necessarily the case that the original six-connected neighbours of the central feature will be in the neighbourhood; they also must pass the admission test.

In our experiments we set the admission threshold to 1/16 = 0.0625. This threshold is sufficiently small that features which are not admitted to the neighbourhood would have negligible effect on the results if they were added to the neighbourhood. However, it is practically necessary to have such a threshold as many cluttering features are sometimes extracted by the feature detector. An example of cluttering features may be seen in figure 8-13 (b). In our experiments we found that a neighbourhood of fifty features combined with an admission threshold of 1/16 was sufficient to ensure that each dominant feature of a repetitive texture would have access to the neighbouring dominant features of the repetition. Limiting the size of the neighbourhood is purely a computational convenience, however, and different parameters could be used if necessary for particular data.
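The neighbourhood search can be sketched as follows. Whether the search continues through features that fail the admission test is not fixed by the text; this sketch assumes it does, so that a cluttered region does not block access to dominant features beyond it.

```python
from collections import deque

def local_neighbourhood(graph, prominence, centre, limit=50, admit=1/16):
    """Breadth-first search over the modified six-connected graph,
    admitting features whose prominence is at least `admit` times the
    centre's prominence, until `limit` neighbours are collected.
    graph maps a feature index to an iterable of linked feature indices."""
    threshold = admit * prominence[centre]
    neighbourhood = []
    seen = {centre}
    queue = deque([centre])
    while queue and len(neighbourhood) < limit:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt in seen:
                continue
            seen.add(nxt)
            queue.append(nxt)  # search passes through weak features
            if prominence[nxt] >= threshold and len(neighbourhood) < limit:
                neighbourhood.append(nxt)
    return neighbourhood
```

On a chain where the feature adjacent to the centre is too dim to admit, the search still reaches the prominent features behind it.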
5.3 Evaluation of Structural Relationships

After a local neighbourhood has been established, the links between the central feature and each of its neighbours are evaluated to determine how well they adhere to the Prejudice Principle and the Interference Principle. The difficulty in applying these principles is that the data we have available to us (i.e. the features) is not ideal. In particular, we cannot assume that the dominant features in two nearby texels will be equally prominent. This assumption was important to the proof of theorem 3.5 upon which the Prejudice Principle is based. So we have to adjust the Prejudice Principle to allow that a second feature which is somewhat less prominent than the first feature may still be a good feature to which to link. Similarly, in the Interference Principle, we have to consider that a marginally less prominent feature within the dominated region is probably indicative that the link is not a Fundamental Frequency link. These considerations modify the way we evaluate the applicability of the Prejudice Principle and the Interference Principle to real data.
An additional difficulty arises in applying the Interference Principle to real data. While the relative neighbourhood graph has been shown to be equivalent to the fundamental repetitive frequencies of a simple grid, the relative neighbourhood graph is not robust
to small changes in the positions of the grid points. This is especially true when the repetitive pattern is elongated as, for example, in figure 5-6. Part (a) of this figure shows a perfect regular repetition and its relative neighbourhood graph which, as predicted by theory, captures the fundamental frequency vectors of the repetition. Figure 5-6 (b) shows what happens when the central point of the pattern is moved by a very small amount (less than 7% of the longer fundamental frequency vector's length). Variations of this magnitude often occur in real data as a result of distortion and noise effects. Such small changes introduce large changes into the relative neighbourhood graph because the moved point now falls inside the lune of other pairs of points, preventing those points from being linked.

(a) A perfect regular repetition. (b) A slightly disturbed regular repetition.

Figure 5-6: Sensitivity of the relative neighbourhood graph to small disturbances in the location of a point.

The sensitivity of the relative neighbourhood graph to small variations in the point positions results from its use of the lune. To avoid this sensitivity, we use a function other than the lune to represent the dominated region of a pair of features. The actual dominated region function which we use lies inside the lune and is designed to be more robust to small changes in the positions of the grid points. However, because our dominated region is a subset of the lune, the graph extracted is a superset of the relative neighbourhood graph and contains non-fundamental repetitive links which are removed in subsequent processing.
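The classical lune itself reduces to a simple predicate: a point interferes with a pair exactly when it is closer to both endpoints than they are to each other. The thesis replaces this region with a more robust subset, but the lune test is a useful baseline sketch.

```python
import math

def in_lune(p, a, b):
    """Is point p inside the lune of a and b, i.e. closer to both a and b
    than a and b are to each other? This is the dominated region of the
    classical relative neighbourhood graph."""
    dab = math.dist(a, b)
    return math.dist(p, a) < dab and math.dist(p, b) < dab
```

A point hovering just off the segment between a and b falls inside the lune and blocks the link, which is precisely the sensitivity figure 5-6 illustrates.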
One final robustness consideration concerns the use of thresholding functions. In these early stages of processing, the data is not well enough refined to make confident binary decisions. Rather than using step thresholding functions, we use the smooth functions T and T_L to evaluate whether a point is approximately equally prominent with another point and whether it falls in the dominated region of a pair of points. The smooth function yields a numerical confidence evaluation on the range (0,1).
5.3.1 Link Evaluation Algorithm

The link evaluation algorithm combines the Prejudice Principle and the Interference Principle. For each link to be evaluated, the relative contrast suppression function S_R (which implements the Prejudice Principle) and the interference suppression function S_I (which implements the Interference Principle) are evaluated. The results of these two functions are combined to yield the link strength. The strength of the directional link from feature Fi to feature Fj is represented by Γ(Fi, Fj) and is defined as follows.

Γ(Fi, Fj) = [1 − S_R(Fi, Fj)] [1 − S_I(Fi, Fj)]

Note that as either the relative contrast suppression value S_R or the interference suppression value S_I approaches one, the link evaluation approaches zero. A link evaluation of zero represents a basic structural relationship which is deleted.
5.3.2 Application of the Prejudice Principle

The Prejudice Principle states: "Features prefer to link themselves to other features which are equally or more prominent". This principle is applied to the evaluation of the directional link between Fi and Fj. The evaluation uses a smooth threshold function to determine the extent to which the second feature can be said to be equally or more prominent than the first feature. The relative contrast suppression function S_R is defined as follows. Note that this is a suppression function: a value close to one indicates that the link will be suppressed.
The constants used in our experiments were as follows. These constants allow a considerable variation in the prominence values of features: a ratio of eight to one is penalized to a level of 50%.
5.3.3 Application of the Interference Principle

Application of the Interference Principle in our algorithm consists of two parts. The position of the interfering feature in relation to the link is evaluated by the locational interference function S_L(Fi, Fj, Fk) and the relative prominence of the interfering feature is evaluated by the prominence interference function S_P(Fi, Fj, Fk). The details of these two functions are presented in the following sections. For each link to be evaluated, both S_L and S_P are evaluated for all features Fk that are neighbours of both Fi and Fj in the local neighbourhoods defined previously. The values of S_L and S_P are combined in the following equation. The resulting value of S_I(Fi, Fj) represents the combined interference effects of all features Fk on the link (Fi, Fj). Note that this is a suppression function: a value close to one indicates that the link will be suppressed.

S_I(Fi, Fj) = 1 − ∏_k [1 − S_L(Fi, Fj, Fk) S_P(Fi, Fj, Fk)]
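Both combinations can be sketched directly. The product form for S_I follows the equation above; the link-strength combination Γ = (1 − S_R)(1 − S_I) is a reconstruction, consistent with the stated behaviour that the evaluation vanishes as either suppression value approaches one.

```python
def interference_suppression(s_l, s_p):
    """Combine per-interferer locational (S_L) and prominence (S_P)
    values into the overall interference suppression S_I: the link
    survives only if it escapes every interfering feature Fk."""
    product = 1.0
    for sl_k, sp_k in zip(s_l, s_p):
        product *= 1.0 - sl_k * sp_k
    return 1.0 - product

def link_strength(s_r, s_i):
    # Reconstructed combination: strength falls to zero as either the
    # relative contrast or the interference suppression approaches one.
    return (1.0 - s_r) * (1.0 - s_i)
```

A single fully interfering feature (S_L = S_P = 1) drives S_I to one and hence the link strength to zero, matching the deletion rule in section 5.3.1.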
5.3.4 Locational Interference

The dominated region of a pair of features defines the set of locations where other features will interfere with a link between the pair of features under the Interference Principle. Previously, we have shown that the dominated region of the relative neighbourhood graph is the lune. The lune of the two features Fi and Fj is shown by a dashed line in figure 5-7. The lune is sensitive to small changes in the positions of nearby features of the repetitive pattern, as was shown in figure 5-6. In figure 5-7, either of the points F1 or F2
Figure 5-7