Extending Attribute Filters to Color Processing and Multi-Media Applications

Florence Busingye Tushabe

RIJKSUNIVERSITEIT GRONINGEN

Extending Attribute Filters to Color Processing and Multi-Media Applications

PhD Thesis

to obtain the degree of Doctor in Mathematics and Natural Sciences at the University of Groningen, on the authority of the Rector Magnificus, dr. F. Zwarts, to be publicly defended on Monday 22 November 2010 at 14:45

by

Florence Busingye Tushabe

born on 5 July 1978 in Mulago, Uganda

Promotor: Prof. dr. N. Petkov
Copromotor: Dr. M. H. F. Wilkinson

Assessment committee:
Prof. dr. S.T. Acton
Prof. dr. R. Klette
Prof. dr. J.B.T.M. Roerdink

ISBN: 978-90-367-4630-4

This research project was funded by the NPT project "Building a Sustainable ICT Training Capacity in the Public Universities in Uganda".

The work described in this thesis was performed at:
The University of Groningen, P. O. Box 800, 9700 AV Groningen, The Netherlands
Faculty of Computing and IT, Makerere University, P. O. Box 7062, Kampala, Uganda

The front cover shows the filtering results of different attribute filters used to improve image compression. These are the same four images as in Figure 5.5, in full color. Original image courtesy of Bruce Justin Lindbloom and the Image Compression Benchmark (http://www.imagecompression.info/).

Printed by Ipskamp Drukkers B. V., Enschede, The Netherlands.

To God, My protector and provider; Deliverer and defender.


Contents

Acknowledgments                                                ix

1 Introduction                                                  1
  1.1 Summary of the work                                       4
      1.1.1 Watermarking                                        5
      1.1.2 Image Compression                                   5
      1.1.3 Content-based Image Retrieval                       6
      1.1.4 Color Image Processing                              7
  1.2 Thesis Organization                                       8

2 The Effect of Selected Attribute Filters on Watermarks        9
  2.1 Introduction                                              9
  2.2 Watermarking                                             10
  2.3 Attribute Filtering                                      11
      2.3.1 The Max-tree                                       12
      2.3.2 Tested Attributes                                  12
  2.4 Experimental Results                                     14
      2.4.1 Filtering by Gray-level                            15
      2.4.2 Filtering by Area                                  15
      2.4.3 Filtering by Power                                 16
      2.4.4 Filtering by Volume                                16
      2.4.5 Filtering by Vision                                16
  2.5 Discussion                                               16
  2.6 Conclusion                                               17

3 Preprocessing for Compression: Attribute Filtering           19
  3.1 Introduction                                             19
  3.2 Pre-processing for Compression                           20
  3.3 The Proposed Method                                      21
      3.3.1 Binary Attribute Filtering                         21
      3.3.2 The Max-Tree Approach                              22
      3.3.3 The Volume Attribute                               23
      3.3.4 The Vision Attribute                               24
  3.4 Experimental Results                                     24
      3.4.1 At the Same Threshold                              24
      3.4.2 At Different Thresholds                            25
      3.4.3 At the Same Quality                                26
      3.4.4 At the Same Size                                   27
  3.5 Conclusion                                               28
  3.6 Future Work                                              30

4 Content-based Image Retrieval Using Combined 2D Attribute Pattern Spectra  31
  4.1 Introduction                                             31
  4.2 Theory                                                   32
      4.2.1 2-D Pattern Spectra                                33
      4.2.2 Computing the Pattern Spectra                      33
  4.3 Experiments                                              35
  4.4 Results                                                  37
  4.5 Discussion                                               38

5 Color Processing using Max-trees: A Comparison on Image Compression  41
  5.1 Introduction                                             41
  5.2 Theory                                                   44
      5.2.1 Binary and Gray-scale Operators                    44
      5.2.2 The Max-tree                                       45
      5.2.3 Color Connected Filters                            47
      5.2.4 New Extensions to Color Max-trees                  49
      5.2.5 The Proposed Method                                53
  5.3 Algorithms                                               53
      5.3.1 The Building Phase                                 54
      5.3.2 Attribute Management                               56
      5.3.3 Filtering and Restitution                          58
  5.4 Experimental Results                                     59
      5.4.1 Comparison of Filters                              59
      5.4.2 Comparison of Decisions                            61
      5.4.3 Comparing the Preorders                            63
      5.4.4 Comparing the Attributes                           64
  5.5 Discussion and Conclusions                               66

6 Color Vector-Attribute Filtering: an Application to Traffic-Sign Recognition  69
  6.1 Introduction                                             69
  6.2 Previous Work                                            71
      6.2.1 Connected Operators                                71
      6.2.2 Vector Attribute Filters                           73
      6.2.3 The Max-tree Approach                              75
      6.2.4 Color Max-trees                                    77
  6.3 The Color-Vector Attribute Filter                        78
      6.3.1 The Preorders                                      78
      6.3.2 Using Color-Vector Attributes                      79
      6.3.3 The Shape Attributes                               80
  6.4 Experiments                                              84
      6.4.1 Results                                            86
      6.4.2 Combining Color and Shape                          89
  6.5 Conclusions and future work                              90

7 Summary                                                      95

Samenvatting                                                   97

Publications                                                  101

Appendix: Attribute Computation                               103

Bibliography                                                  107

Acknowledgments

I would like to thank dr. Michael H. F. Wilkinson, my supervisor, for who he is. He is the first genius that I have seen face-to-face. Thank you Michael, for your many ideas, constant support, positive attitude, humility and patience. Without you, this work would have taken me an extra five years to finish. May God bless you indeed.

Many thanks go to Prof. Venansius Baryamureeba (Barya). Barya wrote the proposal that gave me the opportunity to pursue this PhD. He repeatedly encouraged me to take it on, despite my initial reluctance, and showed me that I had the potential to take on the task. Thank you Prof. Barya for teaching me how to write and for making my life as a PhD student very comfortable and fulfilling. You provided me with all the moral support that I needed and were always available whenever I needed you, even at short notice.

My dear friend, dr. Erik Urbach. You made my social life in the Netherlands wonderful. You helped me with the difficult C code, especially those initial modifications to the Max-tree implementations for the vision attribute and the color filter. Thank you for having such a sweet heart.

I would like to thank Makerere University, Sida/Sarec and Nuffic for all the financial support that was given to me. Indeed, without the financial assistance I would not have been writing this sentence now.

John Kizito, Fred Rwakijuma and Agaba Gabriel; what would I have done without you? Janat, you are my shining star. Adi, you are an angel without wings and you light up my world. My children, Arnold and Amani, your presence in my life gives me tremendous joy and pride. You motivate me in ways you cannot even imagine.

I really appreciate the love and support that all my colleagues at the Faculty of Computing and IT, Makerere University, extend to me. Special thanks go to Agnes Namulindwa, Richard Mayanja, Agnes Rwashana, Nabukenya Josephine, Julianne Sansa, John Ngubiri, Fred Kiwanuka, Johnson Mwebaza, Emily Bagarukayo, Stuart Katungi and Wycliff.
Thanks to all my colleagues at the University of Groningen and all the members of the Intelligent Systems group, especially Prof. N. Petkov, Dr. M. Biehl, George Ouzounis, Aree Witoelar, Easwar Subramanian, G. Papari and Desiree. Last but not least, special thanks go to all my friends and family for all the support you have given me.

My mother always wanted me to become a doctor. When I refused to take biology in high school, I saw the disappointment in her eyes. But now I have become a doctor, albeit not the one she had in mind. Thank you mother for all your love and sacrifice. You are my backbone.

Florence Tushabe
Groningen, October 19, 2010

Chapter 1

Introduction

Every artist dips his brush in his own soul, and paints his own nature into his pictures.
Henry Ward Beecher

We view the world through a lens of images. In our minds, all kinds of concepts are visualized, given a shape, a color and a style. Emotions are described by color, voices by shapes, and complicated theories are simplified by imagery. Through image processing and analysis techniques, faster solutions to problems have been obtained in many fields, including security, medicine, engineering, law enforcement, astronomy and entertainment. These techniques have been used to meet user needs ranging from understanding what happens inside our bodies and in our environment, to what happens in foreign galaxies and beyond. Numerous studies show how image analysis and filtering techniques have helped save lives by aiding early detection of diseases, including cancer (I. F. Kallel and Garcia 2009).

Image processing is all about manipulating an image in a bid to fulfil a given user need. It involves proper identification of the relevant features of the image, and processing them using appropriate methods at the right level of detail, so that meaningful information can be extracted and used to solve the problem. Morphological processing (Serra 1982) is one of the many techniques of image processing. Its mathematical concepts are modeled around set and lattice theory (Ronse 1990) and are used to describe image components and their relationships (Gonzalez and Woods 2002). Morphological processing involves modifying an image by adding content to, or removing content from, a given pixel or group of pixels. The result is an image whose structures carry more meaningful content or information for the image user or owner. Image processing supports desirable tasks such as denoising, shape or edge detection, segmentation and contrast enhancement.

Filtering, as used in this thesis, means removing part of an image's contents using morphological operations. Morphological operations deal with images either at the local-neighborhood or at the component (a set of pixels) level (Salembier and Wilkinson 2009).
Local-neighborhood morphological methods are often based on erosion and dilation operations, which modify the shapes of image components by either shrinking or enlarging them. This kind of approach has the limitation of changing component edges, and hence can lead to deformations and distortions.

Connected filtering (Heijmans 1999, Salembier and Serra 1995, Salembier and Wilkinson 2009) is a kind of morphological filtering that deals with connected components of the image, not individual pixels. Connected filters do not change the shapes of components; hence they result in better-quality images and do not cause blurring. Within connected filtering, attribute operators (Breen and Jones 1996, Salembier et al. 1998) are used to either preserve or remove connected components of the image, based on some property, or attribute. For each component, an attribute value is calculated. This value can be either a scalar (Breen and Jones 1996) or a vector (Urbach et al. 2005, Naegel and Passat 2009). An attribute filter removes elements of an image that do not satisfy a pre-determined criterion on a given attribute, such as area, perimeter, volume or circularity (Breen and Jones 1996). Usually, filtering removes all components whose attribute value is less than or equal to a pre-determined threshold, although other criteria can also be used. The remaining components are left untouched, which guarantees shape preservation, including edge preservation, for the remaining parts.

Attribute filtering is an attractive option for users because it provides a robust avenue for describing a given user need. It also enables more flexibility in choosing the parts of the image to remove, in defining the attributes that suitably represent those parts, and in applying the desired filtering operations during implementation.
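To make the thresholding rule above concrete, the following is a minimal sketch of a binary area filter: connected components whose pixel count is below the threshold are removed, all others are kept untouched. This is an illustration only, not the Max-tree implementation used in this thesis; the 4-connectivity and BFS labeling are implementation choices.

```python
from collections import deque

def area_filter(image, threshold):
    """Binary area opening: keep 4-connected foreground components
    whose pixel count (area attribute) is at least `threshold`."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if image[y][x] and not seen[y][x]:
                # collect one connected component by breadth-first search
                comp, queue = [], deque([(y, x)])
                seen[y][x] = True
                while queue:
                    cy, cx = queue.popleft()
                    comp.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(comp) >= threshold:  # the attribute criterion
                    for cy, cx in comp:
                        out[cy][cx] = 1     # survivors are copied unchanged
    return out
```

Because surviving components are copied verbatim, their shapes and edges are preserved exactly, which is the defining property of connected filters.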
In addition to being strictly edge preserving, attribute filters can also be used to create strictly causal scale-spaces (Bangham, Ling and Harvey 1996, Bangham, Chardaire, Pye and Ling 1996), can perform low-, intermediate- and high-level processing tasks (Salembier and Wilkinson 2009), and can be given many useful invariance properties such as scale, translation and rotation invariance (Urbach et al. 2007). Attribute filters are appealing because an intuitive link between a user need and an attribute, or a group of attributes, can often be made. For example, size analysis is efficiently performed using the area attribute (Cheng and Venetsanopoulos 1992, Vincent 1993, Meijster and Wilkinson 2002), and texture analysis using the entropy attribute (Salembier et al. 1998). Quite uniquely, attribute filters allow filtering on shape, rather than size. For example, Wilkinson and Westenberg (2001) use non-compactness as an elongation measure to detect blood vessels in 3D angiography. Other shape parameters have been suggested in (Westenberg et al. 2007, Kiwanuka et al. 2009, Kiwanuka and Wilkinson 2010).

One of the challenges encountered in attribute filtering is the identification of the appropriate filter for a given user need. First and foremost, the attributes that accurately differentiate the target from the non-target components have to be identified. The factors to consider when choosing the most appropriate attribute include: the properties of the attribute being sought and whether they accurately lead to the objective, the choice of the optimal number of attributes that sufficiently represent the need, the definition of their relationships, and the mathematical translations of the above. Secondly, if significant meaningful information is to be harnessed, the filtering method also has to be carefully chosen. The type and composition of the attribute values have to be identified, and weighting issues considered if more than one attribute is to be used. Implementation dynamics, including the feature extraction method, dimensionality reduction, the kind of image simplification, and the type of software and hardware platforms, are also influential. Attribute identification is hard because each application seeks unique features, so the selected attributes have to be customized for that particular purpose. For example, a shape filter developed for a domain-specific application in a controlled environment, e.g. MRI images, will not work as well when used on a general-purpose, non-specialized image database. Wrong attribute selection causes unreliable results.

The general contribution of this thesis is twofold. Firstly, we identify suitable attributes for improving image compression and content-based image retrieval, and test their robustness within selected watermarking algorithms. Secondly, we extend gray-scale and binary attribute theory to color, or multivariate, images, and test these theories on traffic-sign detection and within compression.

Figure 1.1: General approach used to identify appropriate attributes

The problem of attribute selection has been tackled by following the approach shown in Figure 1.1. A user need is first identified and translated into an applicable concept. Suppose a user would like to know if an image contains sickle cells; then the user need can be represented as: "Identify the components in this image that resemble a sickle cell".
This is then translated into a corresponding concept, which involves analysis of the image components so as to arrive at a mechanism that distinguishes the sickle-cell components. In the sickle-cell case, the properties of sickle cells that make them different from normal cells are identified. If we assume that the only criterion distinguishing a sickle cell from a normal one is shape, then the corresponding concept is shape. The next step is the selection of the appropriate attribute(s) that best describe the concept (shape, in this context).

Figure 1.2: Sickle and normal red blood cells: (a) original micrograph; (b) sickle cell detected with elongation attribute filter.

Figure 1.2 shows an image containing both sickle and normal red blood cells in (a), and the result after it has been filtered with the elongation attribute at a threshold of 2 in (b). After this, the composition and type of the attribute value is decided upon. This includes formulating the mathematical equation that defines the chosen attribute; in this case, elongation has been defined as the ratio of the length of a component to its width. Finally, implementation and testing of the filter is conducted.

Attribute filters are implemented using the Max-tree approach (Salembier et al. 1998), the pixel-queue algorithm (Vincent 1993) or the union-find method (Meijster and Wilkinson 2002, Tarjan 1975). This work implements attribute filters exclusively using the Max-tree approach because, unlike the other two, the Max-tree approach is suitable for shape attributes (Meijster and Wilkinson 2002, Urbach et al. 2007). It also proposes a new adaptation of the Max-tree to cater for color images. The Max-tree representation is one of the fastest and most flexible ways of implementing attribute filters, and to date the only one allowing parallel execution (Wilkinson et al. 2008). The connected components of all threshold sets of the image are identified, each assigned a single attribute value, and filtered out if that value is less than a pre-determined threshold (Evans and Gimenez 2008, Salembier et al. 1998, Salembier and Serra 1995). After filtering, the image is reconstructed so that only the components satisfying the user requirements remain.
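As an illustration of how such an attribute might look in code, the sketch below computes a bounding-box-based elongation for a set of pixel coordinates. The ratio of the longer to the shorter bounding-box side is only one possible reading of "length of component to width of component"; the measure actually used for Figure 1.2 may differ (e.g. it could be moment-based).

```python
def elongation(component):
    """Elongation of a pixel set, sketched as the ratio of the longer
    to the shorter side of its axis-aligned bounding box.
    `component` is a list of (row, col) pixel coordinates."""
    ys = [p[0] for p in component]
    xs = [p[1] for p in component]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return max(height, width) / min(height, width)
```

A component would then be marked as a sickle-cell candidate when, for example, `elongation(component) > 2`, mirroring the threshold of 2 used in the figure.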

1.1 Summary of the work

This work proposes filters for applications within the domains of watermarking, compression and image retrieval, and an extension to color processing, including an application to traffic-sign recognition. These are briefly explained in the following.

1.1.1 Watermarking

Watermarking is the embedding of information into a physical document or multimedia file so that, under specified circumstances, it can be retrieved for various reasons, such as security, monitoring and control, or intellectual-property verification. Bank notes, art pieces, photographs, music, text documents and any intellectual property are routinely watermarked. Digital watermarking is the process of embedding information into a digital signal such as an audio, video, text or image file. It was first introduced in 1954 and has been widely used in forensic tracking for anti-piracy, audience measurement, TV ratings and advertisement monitoring. Watermarks verify the integrity of images and are also used for copyright control, broadcast monitoring, device control, etc. (Cox and Miller 2002).

Image processing is an inevitable part of the analysis process. This thesis also shows how attribute filtering is beneficial for compression, content-based image retrieval and automatic traffic-sign identification. The question that arises is: what impact do these filtering procedures have on the underlying watermarks of an image? More specifically, what effect do the proposed filters have on existing watermarks of an image? Seven watermarking algorithms and five attributes have been investigated. The study found that out of the 35 filters that were tested, only three turned out to have an impact on the watermarks; the rest do not affect the underlying watermarks.

1.1.2 Image Compression

Image compression is the representation of image data using fewer bits (Gonzalez and Woods 2002). The compression process can be broken down into two sub-processes: encoding and decoding. Encoding, as shown in Figure 1.3, takes as input the original image data and table specifications: a set of parameters used to map the new data from the old and vice versa. By means of a specified set of procedures, the encoder generates the compressed image data as output. Decoding uses the compressed image together with the table specifications to reconstruct the original image (Gonzalez and Woods 2002).

Figure 1.3: Image encoding

Image compression is important because it minimizes storage requirements and allows faster file processing, retrieval and transmission, as well as improving overall system performance. However, some compression algorithms lead to a loss of image quality and cause blurring at high compression ratios. Lossless compression algorithms limit compression ratios, and moreover users are usually not in control of selecting the regions of interest that should be changed. A tradeoff between image size and quality is therefore inevitable.

Pre-processing operations for compression can be employed to leverage the trade-off between image quality and file size. Some methods, such as Morphological Image Cleaning (Peters 1995), which use structuring elements, result in a slight change in the shape of the surviving elements. This may impact image quality negatively. This work proposes a pre-processing method for compression that is based on attribute filters. The proposed attribute filters identify psycho-visually redundant information in an image. This allows huge parts of an image to be removed without the image looking visually different. The proposed filters were tested and found to improve compression results in terms of both file size and quality. The proposed method maintains the shapes of surviving components even after filtering at high levels, and offers users flexibility in selecting regions of interest. Moreover, the method is computationally cheap.

1.1.3 Content-based Image Retrieval

Content-based image retrieval (CBIR) is the retrieval of images from a database based upon the contents of the images. These can include features like color, shape, texture and spatial location (Bober 2001). Recent results show that content-based image retrieval systems perform worse than text-based ones, or than a combination of both (Nardi and Peters 2007, Arni et al. 2008). This shows that content-based image retrieval needs to be further improved if less dependence on the user (vocabulary, interpretation and image annotations) is to be achieved.

This work attempts to address this issue by proposing a filter that enhances the performance of CBIR systems. It addresses the user need: given a sample image, find as many relevant images as possible from a large photographic image collection (Nardi and Peters 2007). In particular, we investigate the similarity of two images based upon the shapes and colors of their individual components. The proposed filter improves on the work of Urbach et al. (Urbach et al. 2007) and customizes it to suit content-based image retrieval for general, everyday vacation pictures. The filter in (Urbach et al. 2007) performs excellently for diatom identification. Unlike diatoms, the requirements for everyday vacation pictures are not domain specific; they are unpredictable and not easily described. Besides, the diatom images were gray scale, whereas the retrieval task concerns color images. In fact, a comparison of that method and the one proposed herein found that the proposed filter gives an improvement of over 30% with respect to the one in (Urbach et al. 2007).

We propose a filter that works for any kind of image, be it one taken indoors or outdoors, at night or in the day, and containing anything, anywhere. It is general enough to represent any structure and yet also specific enough to distinguish the major structures contained, such as buildings, people, animals, birds and common objects like bicycles, cars and flags. This filter discriminates objects using a multiscale analysis of size and shape. It consists of a size-shape spectrum composed of three rotation- and scale-invariant spectra: the area–non-compactness, area–compactness and area–entropy pattern spectra. We extract pattern spectra from the red, green and blue color bands of an image, concatenate them, and train a classifier for application within large-scale image data analysis.
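A crude sketch of how a 2-D (size, shape) pattern spectrum could be accumulated from a list of components is shown below. The log2 area classes, the coarse non-compactness bins and the per-component weights are illustrative assumptions, not the spectra definitions used in Chapter 4.

```python
import math

def pattern_spectrum_2d(components, area_bins=4, shape_bins=3):
    """Accumulate a 2-D (size, shape) pattern spectrum.
    Each component is a tuple (area, non_compactness, weight); its weight
    (e.g. its sum of gray-level contributions) is added to the bin indexed
    by a logarithmic area class and a coarse non-compactness class."""
    spectrum = [[0.0] * shape_bins for _ in range(area_bins)]
    for area, non_compactness, weight in components:
        i = min(int(math.log2(area)), area_bins - 1)      # area >= 1
        j = min(max(int(non_compactness - 1.0), 0), shape_bins - 1)
        spectrum[i][j] += weight
    return spectrum

def combined_feature_vector(spectra_rgb):
    """Concatenate per-band spectra into one flat vector, as a stand-in
    for the feature vector handed to the classifier."""
    return [v for spectrum in spectra_rgb for row in spectrum for v in row]
```

One such spectrum would be computed per shape measure and per color band, then concatenated; the resulting vector is what the classifier is trained on.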

1.1.4 Color Image Processing

Although color undoubtedly plays a major role in object recognition, attribute filtering in color remains largely unexplored. Our research found only one piece of literature that implements attribute filtering for color images using the Max-tree approach (Naegel and Passat 2009). This stems from the fact that the Max-tree requires the pixel values to be totally ordered. The total order used in gray-scale morphology is just the natural order on R or Z. In general, as long as the pixel values have a total order, whether range images, intensities or saturations, Max-trees work. By contrast, vector images, or images of scalars lacking a total order, such as hue or orientation, are not easily given a total order. This poses problems in component-tree-based processing, because the hierarchy in the tree is driven by the total order of the pixel values. To build such trees in color morphology, the ordering has to be decided upon (Lezoray et al. 2005, Angulo 2007, Naegel and Passat 2009).

This work proposes a Max-tree adaptation to color image processing. We propose a vectorial image processing method that uses a total preorder on color, based on several combinations of luminance and saturation. The new image restitution methods proposed here do not result in undesirable color artifacts or visible quantization effects.
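One way such a luminance-saturation preorder could be realized is sketched below. The BT.601 luminance weights, the max-minus-min saturation measure and the mixing weight `alpha` are illustrative choices; the thesis explores several different combinations, not this particular one.

```python
def luminance(rgb):
    """Luminance of an RGB triple (ITU-R BT.601 weights)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def saturation(rgb):
    """One simple saturation measure: spread of the channel values."""
    return max(rgb) - min(rgb)

def preorder_key(rgb, alpha=0.5):
    """Scalar key inducing a total preorder on colors as a weighted
    combination of luminance and saturation. Distinct colors may share
    a key (hence a preorder, not an order), which is exactly what a
    Max-tree needs: comparable, not necessarily distinct, levels."""
    return alpha * luminance(rgb) + (1 - alpha) * saturation(rgb)
```

Sorting pixel values by this key supplies the ordering that drives Max-tree construction for color images.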


Apart from ordering issues, color can be used as a vector attribute, along the lines of (Urbach et al. 2005). This can be done by computing the mean color of a component in a suitable color space and using the resulting values as attributes, either on their own, or combined with shape information. The Max-tree filtering is carried out based on vectorial similarity to a reference color. These methods are tested on a road sign detection application and color image compression.
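A sketch of this idea: compute the mean RGB color of a component as its vector attribute, and keep the component when that mean lies close to a reference color. Euclidean distance in RGB is an illustrative choice of similarity measure; other color spaces and distances could equally be used.

```python
import math

def mean_color(pixels):
    """Mean RGB color of a component (its vector attribute).
    `pixels` is a non-empty list of (r, g, b) triples."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def keep_component(pixels, reference, max_dist):
    """Filter decision: keep the component if its mean color lies
    within max_dist (Euclidean, in RGB) of the reference color."""
    mc = mean_color(pixels)
    dist = math.sqrt(sum((mc[c] - reference[c]) ** 2 for c in range(3)))
    return dist <= max_dist
```

For traffic-sign detection, the reference color would be, for example, the red of a prohibition sign, and the decision could be combined with a shape attribute on the same component.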

1.2 Thesis Organization

This thesis is organized as follows. Chapter 2 tests the effect of five attribute filters on watermarks embedded in an image. The filters are tested for robustness against watermarks and their performance is compared on seven watermarking algorithms. Chapter 3 discusses the details of the filter proposed as a suitable simplification for image compression. It also shows the results after the images are compressed using the JPEG (JPEG 1993), JPEG 2000 (ISO 2000) and LZW (Welch 1984) compression algorithms. Chapter 4 discusses the filter identified for content-based image retrieval on general-purpose image databases. This filter applies a multiscale analysis of the image based on the size and shape of its components, and uses attributes that together form a rotation-, translation- and scale-invariant filter. The theoretical underpinnings are discussed, as well as the experimental results and their implications. Chapter 5 introduces the Max-tree adaptation for color images. It discusses the Max-tree construction that orders colors according to luminance and saturation, and new methods for image reconstruction. Suitable attributes for color compression are also suggested. Chapter 6 extends the methods proposed in Chapter 5 by focussing on color filtering using vectorial similarity. These methods are applied to automatic traffic-sign recognition. The summary and concluding remarks are then provided in Chapter 7.

Published as: F. Tushabe and M. H. F. Wilkinson – “The Effect of Selected Attribute Filters on Watermarks”, in 2nd International Conference on Digital Image Processing. Proceedings of SPIE Vol. 7546, 75463E, SPIE CCC code 0277-786X/10/$18. doi: 10.1117/12.855420.

Chapter 2

The Effect of Selected Attribute Filters on Watermarks

Pictures help to form the mental mind
Robert Collier

Abstract

This paper shows the effect that selected attribute filters have on existing watermarks of an image. Seven transform-domain watermarking algorithms and five attributes have been investigated. The attributes are volume, gray-level, power, area and vision. All of the filters except one have been found not to affect the underlying watermarks.

Keywords: attribute filtering, mathematical morphology, robust watermarking, attacks on watermarks.

2.1 Introduction

Attribute filtering is emerging as a powerful tool for enhancing various applications within the image processing domain. In (Tushabe and Wilkinson 2007b) attribute filters are shown to improve compression results in terms of both size and quality. They also greatly improve retrieval results for large scale content-based image retrieval (Tushabe and Wilkinson 2008), for diatom analysis and classification (Urbach et al. 2007) and early skin cancer detection (Naegel et al. 2007). An attribute filter removes elements of an image that do not satisfy a given criterion Λ, of a given attribute like area, perimeter, volume, circularity, etc (Breen and Jones 1996). The remaining components are then left untouched and this ensures edge preservation. Attribute filtering is an attractive option for users because it offers more flexibility in terms of what parts of an image to remove, the choice of attributes and filtering rules to use, as well as the guarantee of shape preservation for the remaining parts. With the increasing adoption of attribute filters as a pre-processing step for various applications, concerns emerge about the impact this has on existing watermarks. A watermark is data that is embedded into a media or multimedia object to enhance or protect its value (Mintzer et al. 1998). When extracted from an image, a watermark can


prove ownership, verify integrity and convey object-specific information such as the buyer or seller, the date taken, and the number of copies distributed. This work investigates the effect that attribute filtering has on watermarked images. The study sought to find out how the attributes gray-level, area, power, volume and vision affect watermark signatures when they are applied to an image that remains visually lossless. These attributes were selected because they have been used successfully in previous applications: area for shape analysis (Tushabe and Wilkinson 2008), the improvement of image retrieval (Urbach et al. 2007) and the identification of suspicious moles (Naegel et al. 2007); volume, vision and power for the improvement of compression (Tushabe and Wilkinson 2007b); and gray-level within watermark insertion. In this paper, we show that attribute filtering, and in particular filtering by the selected attributes, does not compromise the existence of watermarks in images. Sections 2.2 and 2.3 briefly discuss the theory behind watermarking and attribute filtering respectively. Section 2.4 presents the results of the experiments conducted, Section 2.5 discusses them, and concluding remarks are given in Section 2.6.

2.2 Watermarking

Image watermarking is performed either in the spatial or the transform domain (Wolfgang et al. 1999). Spatial domain techniques directly manipulate pixel contents through masking approaches or least-significant-bit modification, while transform domain techniques convert the image into coefficients of various frequencies. Discrete cosine transforms (DCT) and discrete wavelet transforms (DWT) are the most common representations used by transform domain watermarking schemes. Generally, for an N×N image, an N×N transformation is implemented to produce N×N coefficients. The watermark W is converted into a vector of length n following a preferred distribution, such as a Gaussian distribution, and then added to n chosen coefficients (n ≤ N²) to produce the watermarked image.

Watermarks can be either robust or fragile. A robust watermark is one that is not removed after a slight modification of the image (Mintzer et al. 1998). Such modifications include common image processing operations like re-scanning, addition or multiplication of uncorrelated noise, multiple watermarking, inverse deduction and conspiracy attacks, as well as estimation of the original image and blind modification through operations including quantization, re-sampling, analog-to-digital conversion, rotation, scaling, cropping, etc. (Lin and Delp 1999). On the other hand, fragile watermarks are those designed to alter or be destroyed after slight modifications to the image (Lin and Delp 1999). Fragile watermarks are preferred in circumstances of image authentication or integrity checking, such as fingerprint database forensics, while robust watermarks are preferred for copyright protection, especially proving ownership.

This work tested one DCT and six DWT embedding techniques because of their robustness to attacks. Cox et al. (Cox et al. 1997) is the DCT embedding algorithm; it inserts the signature into the largest n DCT coefficients. The rest are DWT algorithms: (Corvi and Nicchiotti 1997) embeds the watermark into a multiresolution approximation image using an additive embedding formula, while (Wang et al. 1998) adopts design principles of successive subband quantization (SSQ) and bitplane coding. Others add the watermark to a few selected significant coefficients after a three-level decomposition: (Dugad et al. 1998) uses Daubechies-8 filters and (Kim and Moon 1999) applies bi-orthogonal filters and a level-adaptive thresholding scheme. The two-level decompositions are represented by (Xia et al. 1998), in which the Haar wavelet filter produces the watermark that is added to the large coefficients of the high and middle frequency bands of the DWT. Lastly, (Zhu et al. 1998) performs a four-level wavelet decomposition in which all high-pass subband coefficients are selected and modified using the additive embedding formula.
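The spread-spectrum idea common to these transform-domain schemes can be illustrated with a toy sketch. This is not any of the cited implementations: the `embed`/`detect` functions, the stand-in coefficient list and the strength α = 0.1 are all assumptions, and a real scheme would operate on actual DCT or DWT coefficients of the image.

```python
import random
import math

def embed(coeffs, watermark, alpha=0.1):
    """Cox-style multiplicative spread-spectrum embedding:
    v'_i = v_i * (1 + alpha * w_i) on the n largest-magnitude coefficients."""
    marked = list(coeffs)
    order = sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))
    for w, i in zip(watermark, order):
        marked[i] = coeffs[i] * (1.0 + alpha * w)
    return marked

def detect(marked, original, watermark, alpha=0.1):
    """Non-blind detection: recover an estimate of the watermark from the
    marked and original coefficients, then correlate it with the signature."""
    order = sorted(range(len(original)), key=lambda i: -abs(original[i]))
    est = [(marked[i] - original[i]) / (alpha * original[i])
           for i in order[:len(watermark)]]
    num = sum(e * w for e, w in zip(est, watermark))
    den = math.sqrt(sum(e * e for e in est) * sum(w * w for w in watermark))
    return num / den if den else 0.0

random.seed(0)
coeffs = [random.uniform(10, 100) for _ in range(256)]  # stand-in transform coefficients
wm = [random.gauss(0, 1) for _ in range(100)]           # length-100 signature
marked = embed(coeffs, wm)
print(round(detect(marked, coeffs, wm), 3))   # ≈ 1.0 for the embedded signature
print(round(detect(coeffs, coeffs, wm), 3))   # ≈ 0.0 for an unmarked image
```

The detection statistic is a normalised correlation, which is the kind of measure thresholded (here at 0.2) in the experiments of Section 2.4.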

2.3 Attribute Filtering

The attribute-based approach to morphological filtering is a method used to filter out unwanted parts of an image without changing the shapes of the remaining ones (Breen and Jones 1996). This is of particular interest to the process of image authentication/validation because shape, and hence visual quality, is preserved or even improved. The image is decomposed into sets of connected components, each of which is assigned a single attribute value r and is considered for further processing only when r satisfies a given criterion. The attributes are chosen as deemed fit for the application; examples include area, perimeter, elongation, volume and circularity. Attribute filtering comprises binary attribute openings, binary attribute thinnings, gray-scale attribute openings and gray-scale attribute thinnings.

A binary attribute opening is the union of the trivial openings of the connected components of a set X. Breen and Jones (Breen and Jones 1996) define the binary attribute opening ΓΛ of a set X with increasing criterion Λ as:

ΓΛ(X) = ⋃_{x ∈ X} ΓΛ(Γx(X))    (2.1)

where Γx(X) is the connected opening of the set X at point x: it returns the connected component of X that contains x if x ∈ X, and ∅ otherwise. ΓΛ also denotes the trivial opening of a set C ⊆ E for an increasing criterion Λ: it returns C if C satisfies criterion Λ, and ∅ otherwise. Attribute openings are characterized by being increasing (C ⊆ D ⇒ ΓΛ(C) ⊆ ΓΛ(D)), idempotent (ΓΛ(ΓΛ(C)) = ΓΛ(C)) and anti-extensive (ΓΛ(C) ⊆ C). Example attributes include area, perimeter and moment-of-inertia.

On the other hand, a binary attribute thinning is the union of the trivial thinnings of the connected components of a set X. It is defined as (Breen and Jones 1996):

ΦΛ(X) = ⋃_{x ∈ X} ΦΛ(Γx(X))    (2.2)

where ΦΛ represents a trivial thinning, with Λ any (possibly non-increasing) criterion: ΦΛ returns C if C satisfies Λ, and ∅ otherwise. Attribute thinnings are characterized by being idempotent, anti-extensive and non-increasing. Example attributes are perimeter length, compactness, non-compactness, circularity and entropy.
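The definitions above can be made concrete with a minimal sketch of a binary attribute opening using the increasing area criterion: each connected component (grain) is kept whole or removed whole. The grid, 4-connectivity and function name are illustrative assumptions, not the thesis implementation.

```python
from collections import deque

def area_opening(image, lam):
    """Binary attribute opening with the (increasing) area criterion:
    keep each 4-connected foreground component iff its area >= lam."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not seen[r][c]:
                # flood-fill the connected component containing (r, c)
                comp, q = [], deque([(r, c)])
                seen[r][c] = True
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) >= lam:   # trivial opening: keep or delete the whole grain
                    for y, x in comp:
                        out[y][x] = 1
    return out

X = [[1, 1, 0, 0],
     [1, 1, 0, 1],
     [0, 0, 0, 0],
     [1, 0, 0, 0]]
print(area_opening(X, 2))  # the 2x2 block survives; both isolated pixels vanish
```

Note that the surviving component is reproduced pixel-for-pixel, which is exactly the edge-preservation property the text emphasizes.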

2.3.1 The Max-tree

The Max-tree approach (Salembier et al. 1998, Meijster and Wilkinson 2002), which was used to implement the attribute thinnings, can be briefly described as follows. Let the peak components Phk of an image be the connected components of the threshold set at gray level h, with k from some arbitrary index set. These peak components are arranged into a tree structure and filtered by removing nodes whose attribute values are less than a pre-defined attribute threshold λ. Thus, the Max-tree is a rooted tree in which each of its nodes Chk corresponds to a peak component Phk (Meijster and Wilkinson 2002). An example is shown in Figure 2.1, which illustrates the peak components Phk of a 1-D signal, the corresponding Chk at levels h = 0, 1, 2, 3 and the resultant Max-tree. Further details can be found in (Salembier et al. 1998) and (Meijster and Wilkinson 2002).

2.3.2 Tested Attributes

In order to check the robustness of watermarks, five attribute filters have been analyzed.

The Area Attribute: The area attribute is simply the area of a connected component (Cheng and Venetsanopoulos 1992, Vincent 1993). It is given by:


Figure 2.1: Peak components (Phk) (a), possible attribute values (b) and the corresponding Max-tree (Chk) (c).

A(X) = Σ_{x ∈ E} 1X(x),    (2.3)

where X is the set of pixels in the region, and 1X is the characteristic or indicator function of X.

The Grayscale Attribute: We define the grayscale attribute as the average value of the pixels in a given component of a multivariate or color image. It represents how dark or light the color of that component or region is; for 8-bit images there are 256 possible grayscale values. The grayscale attribute is given by:

G(X) = ( Σ_{x ∈ X} f(x) ) / A(X)    (2.4)

where X is the set of pixels in the region, f(x) is the gray-level value of pixel x, and A(X) is defined as above.

The Power Attribute:

The power attribute (Young and Evans 2003) measures the power in the components to be removed. The components are selected by considering both the area where the changes will occur and the change in intensity values that will result. It is based on the theory that a small change over a large area will cause the image to seem visually lossless. The formula for calculating the power removed is:

P(X, f, α) = Σ_{x ∈ X} (f(x) − α)²,    (2.5)

where X is the set of pixels in the region, f the original gray-level image and α the intensity value of the parent in the Max-tree of the region under study.

The Volume Attribute:

The volume attribute (Vachier 1998) behaves in a very similar manner to the power attribute. Previous experiments have shown that the volume attribute removes more psychovisually redundant information at a lower computational cost than the power attribute (Tushabe and Wilkinson 2007b). It is calculated from the change in intensity over the area of Phk components. The volume attribute is given by

V(X, f, α) = Σ_{x ∈ X} (f(x) − α),    (2.6)

where X is the set of pixels in the region, f is the original image and α the intensity value of the parent in the Max-tree of the region under study. The Vision Attribute: The vision attribute (Tushabe and Wilkinson 2007b) works in a similar manner to the volume attribute, but calculates the change in intensity over area for Chk components instead of Phk and hence removes all Chk nodes that do not comply with the conditions.
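For a single component, the four measures above reduce to simple sums over its pixel set. The sketch below is a toy illustration of equations (2.3)–(2.6): the dictionary-based pixel representation, the parent level α = 100 and the function names are assumptions; in the real implementation these values are accumulated incrementally while building the Max-tree.

```python
def area(pixels):
    # eq. (2.3): number of pixels in the component
    return len(pixels)

def grayscale(pixels, f):
    # eq. (2.4): average gray level of the component
    return sum(f[p] for p in pixels) / len(pixels)

def power(pixels, f, alpha):
    # eq. (2.5): squared deviation from the parent level alpha
    return sum((f[p] - alpha) ** 2 for p in pixels)

def volume(pixels, f, alpha):
    # eq. (2.6): linear deviation from the parent level alpha
    return sum(f[p] - alpha for p in pixels)

# a small peak component: pixel -> gray value, parent node at level alpha = 100
f = {(0, 0): 110, (0, 1): 120, (1, 0): 115, (1, 1): 105}
pix = list(f)
print(area(pix), grayscale(pix, f), volume(pix, f, 100), power(pix, f, 100))
# 4 112.5 50 750
```

The contrast between `volume` (linear in the gray-level deviation) and `power` (quadratic) is what makes power the "slower" filter per unit increase of the threshold, as noted in Chapter 3.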

2.4 Experimental Results

We tested the robustness of watermarks against the effects of binary attribute openings and thinnings. Using the direct rule of the Max-tree approach (a node is removed if and only if r < λ, with r the attribute value and λ the attribute threshold), the attributes gray-level, area, power, volume and vision were assessed (Salembier et al. 1998, Urbach and Wilkinson 2002). Twenty-two grayscale test images obtained from (Kominek 2006), 12 small and 10 medium-sized natural grayscale images, were embedded with watermarks of length 100. Seven publicly available watermarking algorithms (Cox et al. 1997, Corvi and Nicchiotti 1997, Dugad et al. 1998, Kim and Moon 1999, Wang et al. 1998, Xia et al. 1998, Zhu et al. 1998) were then investigated using the method below:

• For each watermarking algorithm, create one signature of length 100 and embed it in the 22 images;

• Filter each image using any of the five attributes of gray-level, area, power, vision and volume. Visual losslessness was ensured by requiring PSNR > 26 and was cross-checked by physical examination. For each image, both attribute openings and thinnings were applied (at the same thresholds) in order to improve psychovisual losslessness;


• Extract the watermark from all the images and compare it with the original signature. A correlation coefficient of at least 0.2 between the extracted and embedded signature means that the watermark has been detected (Meerwald 2001).
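The two acceptance tests used in this procedure — the PSNR > 26 dB visual-losslessness criterion and the 0.2 correlation threshold — can be sketched as follows. This is a toy illustration; the flat-list image representation, function names and sample values are assumptions.

```python
import math

def psnr(orig, filt, peak=255.0):
    """Peak signal-to-noise ratio (dB) between original and filtered
    images, given as flat lists of gray values."""
    mse = sum((a - b) ** 2 for a, b in zip(orig, filt)) / len(orig)
    return float('inf') if mse == 0 else 10.0 * math.log10(peak * peak / mse)

def watermark_detected(embedded, extracted, threshold=0.2):
    """Normalised correlation between embedded and extracted signatures;
    values at or above the threshold count as a positive detection."""
    num = sum(a * b for a, b in zip(embedded, extracted))
    den = math.sqrt(sum(a * a for a in embedded) * sum(b * b for b in extracted))
    return den > 0 and num / den >= threshold

orig = [100, 120, 130, 90]
filt = [101, 118, 130, 92]
print(round(psnr(orig, filt), 1))                        # 44.6, well above 26 dB
print(watermark_detected([1, -1, 1], [0.9, -0.8, 1.1]))  # True
```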

2.4.1 Filtering by Gray-level

Filtering by gray-level removes all components whose gray-level is less than or equal to a given threshold. Using a low threshold of T = 50, visual losslessness was ensured for all images. Five of the algorithms reported high detection rates of more than 80%, while (Dugad et al. 1998) showed non-robustness with only 36% accuracy. When a high threshold of T = 150 was applied, the original image was significantly changed and only one of the seven algorithms (Xia et al. 1998) detected the watermarks. Table 1 summarizes the percentage of images (out of 22) that returned positive detection after filtering by gray-level.

Table 1: Detection of watermarks after gray-level filtering (%)

T / Algorithm   Xia   Cox   Dugad   Zhu   Corvi   Wang   Kim
T = 50          100   100    36      86    73      95     82
T = 100         100    86    14      64    36      73     14
T = 150          95    41    14      59    18      41      0

2.4.2 Filtering by Area

Filtering by area removes components of the image whose size is less than a given threshold. Using personalized thresholds for each image, visual quality was preserved. Four of the algorithms showed perfect accuracy, while (Kim and Moon 1999) performed only moderately (55%). Table 2 summarizes the overall performance.

Table 2: Detection of the watermarks after filtering (%)

Filter       Xia   Cox    Dugad   Zhu    Corvi   Wang   Kim
Area         100   100     73      86    100     100     55
Power        100   100     68      86    100     100     86
Volume       100    91     45      86     73      82      9
Vision       100   100     68      95    100     100     68
Gray-level   100   100     36      86     73      95     82
Average      100   98.2    58      87.8   89.2    95.4   60


2.4.3 Filtering by Power

Filtering by power was performed by selecting a low threshold of T = 1000, which maintains visual losslessness for all the images. Four of the algorithms reported perfect detection rates (100%) and the others also performed quite well, as shown in Table 2.

2.4.4 Filtering by Volume

Filtering by volume removes components of the image whose volume is less than a given threshold. Using a general threshold of T = 5000 for all the images, their visual quality was heavily degraded and clearly different from the originals. In spite of the visual lossiness, (Xia et al. 1998) still performed perfectly, while (Kim and Moon 1999) detected the watermarks in only 2 of the images. Table 2 shows further details of the overall performance. Note that at visual losslessness, volume performs similarly to area.

2.4.5 Filtering by Vision

Personalized thresholds were obtained for all the images in order to preserve visual quality, largely because, for vision, a small change in threshold causes a big visual difference. Four of the algorithms showed perfect accuracy and the rest also performed quite well, as illustrated in Table 2.

2.5 Discussion

The results of the experiments conducted above show that the watermarking algorithms tested are quite robust against the attribute filters of gray-level, area, power, volume and vision. This is especially true for the methods in (Xia et al. 1998, Cox et al. 1997, Wang et al. 1998, Corvi and Nicchiotti 1997, Zhu et al. 1998), as evidenced by the fact that under the volume filter (and most likely area and vision) the signatures were detected even after severe degradation. The average performance for (Kim and Moon 1999) (60%) and (Dugad et al. 1998) (58%) should be received with a little less enthusiasm, though. The robustness of these schemes to attribute filtering can be attributed to the preservation of parts with strong signals, like the edges. It is worth noting that (Dugad et al. 1998) is not robust against gray-level filtering, since only 8 of the 22 images returned a positive detection at visual losslessness.

2.6 Conclusion

This work has summarized the effect of selected attribute filters on watermarks. A marked image was modified by applying a dual attribute opening and thinning filter in order to improve and maintain visual losslessness. It can be concluded that of the seven watermarking algorithms tested, only one (Dugad et al. 1998) is not robust, and against only one filter, the gray-level one. In other words, watermarks embedded by transform domain schemes survive the area, volume, vision, power and gray-level filters. As expected, signature detection becomes harder when large thresholds are used, because more image detail has been discarded. It is interesting to note that even at high threshold levels, most of the algorithms detected the signatures, albeit with lower correlation coefficients. This means that these filters can safely be incorporated as a pre-processing step within applications like compression (Tushabe and Wilkinson 2007b) and content-based image retrieval (Tushabe and Wilkinson 2008) without fear of compromising existing signatures. In the future, the robustness of other attributes can be investigated for conclusive remarks about their effect on watermarked images. Spatial domain watermarking algorithms can also be examined.

Published as: F. Tushabe, Michael H. F. Wilkinson – “Preprocessing for Compression: Attribute Filtering,” Voted as the Best student paper for the International Conference on Signal Processing and Imaging Engineering (ICSPIE’07), San Francisco, USA, 24-26 October, 2007.

Chapter 3

Preprocessing for Compression: Attribute Filtering

Everything has beauty, but not everyone sees it
Confucius

Abstract
This work proposes a preprocessing method for image compression based on attribute filtering. The method is completely shape preserving and computationally cheap. Three filters were investigated, including one derived from the power filter of Young and Evans that removes even more perceptually unimportant information. Results are presented for 22 images that were processed in various ways and compressed using the popular JPEG, JPEG 2000 and LZW algorithms. Our experiments show that the filters yield improvements of as much as 11, 10 and 20% for the JPEG, JPEG 2000 and LZW algorithms respectively.
Keywords: Attribute filtering, mathematical morphology, image compression, pre-processing for compression, universal quality index

3.1 Introduction

The amount of compression provided by any process depends on the characteristics of the particular image being compressed, the desired image quality and the speed of the compression. A reduction in file size improves system performance, reduces file processing and transfer time and minimizes data storage requirements. These advantages render data compression a necessary, if not critical, part of file processing. Data compression in images takes place through methods like quantization, alternative coding and filtering. Ratios as high as 50:1 can be achieved, but the trade-off between size and quality largely depends on how much compression is desired: large compression ratios result in poorer quality images compared to those compressed at smaller ratios.

Compression schemes are either lossy or lossless. Lossy schemes like JPEG (JPEG 1993) remove information that the human visual system tends to ignore. These schemes provide higher compression ratios with relatively good quality images. The disadvantage, however, is that they are irreversible: information once lost cannot be recovered. Additionally, image quality decreases as the compression ratio increases. Lossless compression schemes like JPEG 2000 (ISO 2000) and Lempel-Ziv-Welch (LZW) (Welch 1984) re-package information so that less space is utilized. Therefore, although lossless schemes provide better quality and a reversible process, the maximum compression ratios achieved are much lower than those registered by lossy schemes.

Users desire good quality images even after high compression, and preprocessing prior to compression is needed to improve the trade-off between quality and size. Pre-processing methods allow the owners of the images to participate in choosing aspects and sections of the image that can be ignored, over-processed or filtered out. If the right features of an image are chosen and processed at the right levels, then irrelevant data can be discarded to reduce the size of the image while improving its quality. In this paper, we discuss pre-processing methods that can be applied to an image to enhance compression results in terms of size and/or quality. We propose a preprocessing method for compression that uses either the volume attribute or a modified version, which we have called the "vision attribute", to improve compression results in terms of quality and size.

The rest of the paper is organized as follows. In Section 3.2 we briefly survey current preprocessing methods. Section 3.3 discusses the theory behind the proposed attribute and the method of implementation. Section 3.4 explores the experimental results obtained before and after the proposed filtering, including comparisons with power filtering and results after the JPEG, JPEG 2000 and LZW algorithms. We provide concluding remarks in Section 3.5.

3.2 Pre-processing for Compression

Mathematical morphology is a popular tool for gray-scale image analysis. It does not cause blurring even after high-level filtering, it allows user flexibility in the selection of the region of interest, and it is computationally cheap. Peters (Peters 1995) proposed the Morphological Image Cleaning (MIC) algorithm, which removes noise from an image by means of Alternating Sequential Filters (ASF) consisting of a series of morphological openings and closings with structuring elements of increasing sizes. The MIC algorithm first smoothes the image, then calculates the difference between the smoothed image and the original one. That difference is thresholded at a value greater than the amplitude of the noise, further manipulated, and then added to the original image to produce its noise-less version. The noise removal performed by the MIC algorithm improves compression sizes and image quality. However, because it is based on structural morphological openings and closings, which are not strictly shape preserving, the edges of the final image are slightly modified. This is due to the erosion operation, which removes the structures that cannot contain the structuring element while shrinking the remaining ones. The subsequent dilation may not recover those parts of the remaining components that were lost by the erosion.

Connected morphological filtering is advantageous because it is shape preserving, idempotent (the result cannot be degraded any further once it has been processed) and can be made to affect selected parts of the image rather than the entire image. Young and Evans (Young and Evans 2003) proposed a connected morphological filtering method based upon attribute filtering, using the power attribute in particular. This method is based upon ASF filters consisting of attribute openings and closings, and a region cannot grow or shrink if its measured power exceeds some defined threshold. Power filtering provides even better compression ratios than the MIC algorithm or filtering by the area attribute because this filter removes both the noise and the psychovisually redundant information contained in the image.

3.3 The Proposed Method

We propose an attribute-based preprocessing method to enhance image compression. Attribute filtering views the image as sets of pixels (connected components) rather than single pixels or rigidly defined neighborhoods. For each connected component, an attribute r is calculated and compared to a pre-defined threshold T. If r > T, the whole connected component is preserved; otherwise it is removed. Unlike morphological openings and closings, which grow or shrink components, attribute filtering totally preserves the remaining structures by leaving them untouched, resulting in more visually appealing images. In addition to being strictly edge preserving, attribute filtering can be used to create strictly causal scale-spaces, can perform low-, intermediate- and high-level processing tasks, and can be given many useful invariance properties such as scale and rotation invariance. In this paper, the experiments were performed with binary attribute filters, but the work can be extended further to gray-scale (Breen and Jones 1996).

3.3.1 Binary Attribute Filtering

Binary attribute filtering has been defined by (Breen and Jones 1996) as a concept in mathematical morphology that removes connected components from a binary image on the basis of a given criterion of an attribute. It is manifested either through binary attribute openings or binary attribute thinnings. Let C, D be connected components of a set X and Ψ a binary image operator. Attribute openings remove the small, bright parts of the image and are characterized by being increasing (C ⊆ D ⇒ Ψ(C) ⊆ Ψ(D)), idempotent (Ψ(Ψ(C)) = Ψ(C)) and anti-extensive (Ψ(C) ⊆ C); examples include attributes like area, perimeter and moment-of-inertia. On the other hand, attribute thinnings remove the bright parts of an image and are characterized by being idempotent, anti-extensive and non-increasing. Non-increasing attribute thinnings filter on pure shape criteria or on mixed size/shape criteria; examples include attributes like perimeter length, elongation, circularity and concavity.

The binary attribute opening ΓT of a set X, as the union of trivial openings ΓT of the connected openings Γx(X), is defined by (Breen and Jones 1996) as

ΓT(X) = ⋃_{x ∈ X} ΓT(Γx(X))    (3.1)

where the connected opening Γx(X) at point x is

Γx(X) = Ci that contains x, if x ∈ X; ∅, if x ∉ X,    (3.2)

in which Ci is one of the connected components or grains (Serra 1988, Heijmans 1999), and the trivial opening ΓT of a set C ⊆ E, with T an increasing criterion, is given by:

ΓT(C) = C, if C satisfies criterion T; ∅, otherwise.    (3.3)

On the other hand, it is a trivial thinning ΦT if the T in (3.3) is a non-increasing criterion. Therefore a binary attribute thinning ΦT of a set X is the union of the trivial thinnings ΦT of the connected openings Γx(X) at all points x ∈ X, or

ΦT(X) = ⋃_{x ∈ X} ΦT(Γx(X))    (3.4)

Dark features can be removed equivalently by the dual filters, the attribute closings and thickenings, respectively. These can be obtained by first inverting an image, applying the appropriate opening or thinning, and inverting the result.
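This invert-filter-invert construction for the dual operators can be sketched generically. This is a minimal illustration: the `attribute_filter` callable, the `max_val` parameter and the nested-list image representation are hypothetical placeholders for any concrete opening or thinning.

```python
def dual_filter(image, attribute_filter, max_val=255):
    """Attribute closing/thickening by duality: invert the image, apply the
    corresponding opening/thinning, and invert the result back."""
    inverted = [[max_val - v for v in row] for row in image]
    filtered = attribute_filter(inverted)
    return [[max_val - v for v in row] for row in filtered]

# demo with the identity as the "filter": its dual is again the identity
img = [[0, 200], [50, 255]]
print(dual_filter(img, lambda im: im))  # [[0, 200], [50, 255]]
```

With an actual attribute opening passed in, this removes small dark features exactly as the opening removes small bright ones.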

3.3.2 The Max-Tree Approach

There are three major approaches to implementing attribute filtering: the pixel-queue algorithm (Breen and Jones 1996, Vincent 1993), the Max-tree approach (Salembier et al. 1998) and the union-find method (Meijster and Wilkinson 2002, Tarjan 1975). We chose the Max-tree approach because it implements both attribute openings and thinnings at relatively fast computing times and with more flexibility (a variety of filtering rules) (Meijster and Wilkinson 2002). The Max-tree approach consists of arranging the subsets of an image into a tree starting from the root node, which acts as a parent to all subsequent nodes. Each node represents a flat zone Lh, where a set of pixels adopts the single gray-level value of the highest (for a Max-tree) or lowest (for a Min-tree) node within that subset. The image is thresholded at level h to obtain the threshold set consisting of peak components Phk whose gray-level ≥ h (k indexes the individual components). Chk are the pixels in Phk with gray-level h. A Max-tree is therefore defined by (Meijster and Wilkinson 2002) as a rooted tree in which each of the nodes Chk at gray-level h corresponds to a peak component Phk. An example is shown in Figure 3.1, which illustrates the peak components Phk of a 1-D signal, the corresponding Chk at levels h = 0, 1, 2, 3 and the resultant Max-tree. Filtering is implemented by checking whether a node Chk satisfies a given criterion of an attribute. If it does not, the entire node (Phk) is removed; if it does, Phk is preserved.

Figure 3.1: Peak components (Phk), corresponding node members (Chk) and the resultant Max-tree (right).
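The thresholding step that yields the peak components can be sketched for a 1-D signal like the one in Figure 3.1. This is a toy illustration: the signal and the function name are assumptions, and a real Max-tree construction links each Phk to its parent component at the level below in a single pass rather than recomputing every threshold set.

```python
def peak_components(signal):
    """Peak components P_h^k of a 1-D signal: the maximal connected runs of
    the threshold set {x : signal[x] >= h}, computed for each level h."""
    comps = {}
    for h in range(max(signal) + 1):
        runs, run = [], []
        for x, v in enumerate(signal):
            if v >= h:
                run.append(x)          # extend the current connected run
            elif run:
                runs.append(run)       # close the run at a sub-threshold pixel
                run = []
        if run:
            runs.append(run)
        comps[h] = runs
    return comps

sig = [0, 3, 0, 2, 1, 3, 1]            # a toy signal with two peaks
for h, runs in sorted(peak_components(sig).items()):
    print(h, runs)
```

Each component at level h is contained in exactly one component at level h − 1; those nestings are the parent edges of the Max-tree, and filtering amounts to deleting the runs whose attribute value falls below the threshold.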

3.3.3 The Volume Attribute

The volume attribute (Vachier 1998) in this case behaves in a manner very similar to how the human visual system (HVS) operates: the HVS is not sensitive to small changes in intensity over a large area. The volume attribute is therefore calculated from the change in intensity over the area of Phk components. It is given by:

V(Phk, f, α) = Σ_{x ∈ Phk} (f(x) − α),    (3.5)

where Phk is the peak component corresponding to node Chk, f(x) ≥ h is the original gray value of pixel x, and α is the gray value of the parent of Chk. Our experiments have shown that the volume attribute removes more psychovisually redundant information at a lower computational cost than the power attribute (Young and Evans 2003), which calculates:

P(Phk, f, α) = Σ_{x ∈ Phk} (f(x) − α)²    (3.6)


3.3.4 The Vision Attribute

We experimented with an attribute which we have called the vision attribute. It works in a similar manner to the volume attribute, but calculates the change in intensity over area for Chk components instead of Phk, and hence removes all Chk nodes that do not satisfy the criterion.

3.4 Experimental Results

Twenty-two (22) test images obtained from (Kominek 2006), 12 small and 10 medium-sized natural gray-scale images, were used. They were filtered using the direct rule, which removes a node if and only if r < T (Salembier et al. 1998). The experiments were implemented in the C programming language and Matlab 6.5. Quality was measured using the Universal Quality Index (UQI) metric (Wang and Bovik 2002). The objective of the study was to investigate whether attribute filtering can improve the compression of visually lossless images. Power-, volume- and vision-filtered images were compared with unprocessed ones at the same thresholds, at the same quality levels and at the same sizes. The proposed preprocessing method is as follows: perform an attribute opening and thinning based upon the power, vision or volume attribute at a desired threshold; determine whether the resultant image is acceptable; then apply the desired compression algorithm.
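The Universal Quality Index used for the quality measurements can be computed globally as follows. This is a sketch of the Wang-Bovik formula under stated assumptions: in practice UQI is usually computed over sliding windows and averaged, the flat-list image representation is illustrative, and the global version below is undefined for constant images.

```python
def uqi(x, y):
    """Universal Quality Index (Wang & Bovik 2002), global version:
    Q = 4*cov(x,y)*mean(x)*mean(y) / ((var(x)+var(y)) * (mean(x)^2+mean(y)^2)).
    Q = 1 exactly when the two (non-constant) images are identical."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 4 * cov * mx * my / ((vx + vy) * (mx * mx + my * my))

a = [100, 120, 130, 90, 110]
print(round(uqi(a, a), 4))                   # 1.0 for identical images
print(round(uqi(a, [v + 2 for v in a]), 4))  # slightly below 1 after a gray shift
```

The index jointly penalizes loss of correlation, luminance distortion and contrast distortion, which is why it is a stricter quality measure than mean squared error alone.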

3.4.1 At the Same Threshold

When processed at a threshold of T = 100, six of the images were not affected by either type of filter: the difference in intensities between neighboring connected components is large, so 100 was too low a threshold to effect any removal. Our experiments showed that the vision attribute degrades an image very fast, much faster than volume or power. For example, at T = 3 the image boat was totally degraded and visually lossy, while at T = 100 the volume- and power-filtered versions still looked visually lossless. Vision registered an average quality (over all images at T = 50) of 0.5859, compared with volume (0.9229) and power (0.8376). Table 3.1 shows the overall average compression results after processing at thresholds of T = 50 and T = 100. Comparison of the visually lossless filters (i.e. volume and power) showed that both improve the compression ratios for all compression schemes tested, with volume out-performing the power attribute. The percentage improvement exhibited by volume is four times that of power for JPEG and twice for JPEG 2000 and LZW. This implies that the volume attribute removes more psychovisually redundant information compared to the power attribute at similar threshold parameters

Table 3.1: Comparison of the compression results (bits per pixel) at the same thresholds

JPEG:
            None    Power   Volume   Vision
T = 50      1.46    1.41    1.27     0.96
T = 100     1.46    1.39    1.20     0.85

JPEG 2000:
            None    Power   Volume   Vision
T = 50      4.52    4.28    3.99     3.12
T = 100     4.52    4.22    3.85     2.74

LZW:
            None    Power   Volume   Vision
T = 50      6.38    5.83    5.25     3.04
T = 100     6.38    5.68    4.99     2.37

even when the images remain visually lossless. This is because power is a slow filtering attribute that removes relatively small particles per unit increase in T.

3.4.2 At Different Thresholds

To investigate the behavior of an image over a wide range of threshold values, image Bridge was forced to attain a size of 0 KB by processing it at thresholds of increasing order. All the filters reduced file size in a monotonically decreasing manner: for thresholds p, q ∈ R with p < q, s(p) > s(q), where s(p) and s(q) are the sizes of the images at p and q respectively. Figure 3.2 illustrates the findings, showing how an increase in bit rate (lower compression) reduces the distortion (1 − quality) in a monotonically decreasing manner for all filters. This reflects how quality is reduced with an increase in compression. When the three attributes are compared at similar distortion levels, volume registered the highest compression ratios (lowest bit rates), closely followed by power and then vision (for jpeg/jpeg2000). It is observed that when the images are over-filtered up to beyond recognition (98% of nodes deleted), volume and power cause a slow, gradual decrease in size as T increases. On the other hand, vision decreases the size up to a certain point, s(p)*, beyond which an increase in T causes s(p) to increase. This is because the vision attribute is edge enhancing. Figure 3.3 shows the overall behaviour of barbara after vision filters more than 98% of it. It shows how vision attains s(p)* of 6100 at T = 100, unlike volume and power, which continue reducing s(p) as T increases.

3. Preprocessing for Compression: Attribute Filtering


Figure 3.2: Barbara at selected thresholds after jpeg processing

3.4.3 At the Same Quality

Fifteen (15) images were filtered at various T values to obtain a similar quality of UQI = 0.90. At UQI = 0.90, the images remain visually lossless, as shown in Figure 3.4. Table 3.2 shows the thresholds needed to achieve the target quality for the three attributes. It is observed that the power attribute needs much higher thresholds to attain the given quality than volume and vision. For example, image 6 (france) requires a threshold of 250,000 for power, as opposed to 2,900 (volume) and 662 (vision). This means that a desired quality level is reached fastest with the vision attribute, followed by volume and then power. It also emerged that even though a large percentage of each image was filtered, the images remained visually lossless. For example, to achieve UQI = 0.90, the power and volume attributes deleted 73,208 and 80,832 nodes respectively, changing 35% (92,154) and 37% (96,954) of the pixels.

Figure 3.3: Barbara after heavy filtering beyond recognition

Detailed results are presented in Table 3.3, which shows that bit rates decrease for all the filters. The highest improvement in compression ratio for jpeg and jpeg2000 was caused by volume, followed by power and vision, while that for LZW was caused by vision, followed by power and volume. Furthermore, all filters narrowed the data distribution, volume showing the lowest standard deviation for jpeg/jpeg2000 and vision for LZW.

3.4.4 At the Same Size

Table 3.4 shows the quality of 10 randomly selected images forced to attain the same size. For all the images (except bird), the volume-filtered ones registered the best quality. This generally means that, when filtered to the same size, volume exhibits the highest quality, followed by power or vision depending on the image.


Figure 3.4: Barbara at UQI = 0.90 after lzw processing: (a) no filter, (b) power, (c) volume, (d) vision.

3.5 Conclusion

In this chapter, we have discussed an image pre-processing method based on attribute filtering and implemented using the Max-tree approach, so that visual quality is enhanced through shape preservation. This method also offers more flexibility through the choice of attributes and of the different filtering rules. We have applied the power, volume


Table 3.2: r needed to attain UQI = 0.90

Image    Power   Volume  Vision     Image    Power   Volume  Vision
1            295      38       7     9           900     110       8
2             18       8       5    10           480      27      14
3             24       9       4    11           900      64       7
4          1,700     100       8    12        13,100   1,240      17
5             20       8       4    13           144      20      11
6        250,000   2,900     662    14            50      11       5
7            340      21      16    15           430      85       7
8            730      62       7

Table 3.3: Compression results at UQI = 0.90

             Jpeg                  Jpeg2000              LZW
Average   bpp (SD)       %      bpp (SD)       %      bpp (SD)       %
None      1.54 (0.60)    -      4.75 (1.29)    -      6.80 (2.29)    -
Power     1.39 (0.55)   9.96    4.29 (1.27)   9.50    5.70 (1.81)  16.12
Volume    1.37 (0.54)  11.21    4.28 (1.27)   9.84    5.72 (1.83)  15.95
Vision    1.44 (0.57)   7.00    4.43 (1.29)   6.65    5.03 (1.78)  20.27

and vision filters on high quality visually lossless images and have found that all consistently increase compression ratios linearly. Even then, each exhibited strengths and

Table 3.4: UQI indices at selected sizes

Image     Size    Power r     UQI      Volume r    UQI     Vision r    UQI
Barbara   41.0         35    0.9285          9    0.9331         4    0.9205
Barbara   38.7        360    0.7880         31    0.8024         9    0.7415
Barbara   36.1      1,600    0.8011         83    0.8147        14    0.7415
Barbara   10.8     14,000    0.6728        580    0.6964        24    0.6232
Bird       5.6         90    0.7496         30    0.7516         7    0.7812
Boat      31.8        450    0.7819         47    0.7980         7    0.7682
Bridge    15.5        250    0.9388         20    0.9424         7    0.8683
Camera     9.8        100    0.7981         17    0.8231         6    0.8070
France    48.2    315,000    0.8372      2,277    0.8639     1,500    0.5172
Lena       9.5      3,000    0.8463        150    0.8642        13    0.7803


weaknesses depending on the environment in which it was applied. The power attribute needs high parameter values to attain specific sizes or quality levels. The vision attribute performs best with the LZW scheme and requires relatively low threshold values to achieve a particular size or quality. The volume attribute is best suited for jpeg and jpeg2000 compression. Our experiments have shown that, when the three attributes are compared overall, volume consistently produces the best improvements in terms of quality and size after compression. Our preferred filtering rule is the direct rule, since the others (like the minimum rule) can cause unpredictable behavior, especially with the vision attribute. In conclusion, we are convinced that attribute filtering using the power, volume and vision attributes is a viable preprocessing method for compression.

3.6 Future Work

This work can be extended to gray-scale attribute filtering, and a super filter consisting of a combination of multiple attributes can be explored.

Published as: F. Tushabe, M. H. F. Wilkinson – “Content-based Image Retrieval Using Combined 2D Attribute Pattern Spectra,” In: Advances in Multilingual and Multimodal Information Retrieval, Vol. 5152/2008, pp 554-561, LNCS, Springer - Verlag Berlin Heidelberg, 2008

Chapter 4

Content-based Image Retrieval Using Combined 2D Attribute Pattern Spectra

The question is not what you look at, but what you see.
Henry David Thoreau

Abstract This work proposes a region-based shape signature that uses a combination of three different types of pattern spectra. The proposed method is inspired by the connected shape filter proposed by Urbach et al. We extract pattern spectra from the red, green and blue color bands of an image and then incorporate machine learning techniques for application in photographic image retrieval. Our experiments show that the combined pattern spectrum gives an improvement of approximately 30% in terms of mean average precision and precision at 20 with respect to Urbach et al.'s method.

4.1 Introduction

The most popular content-based image retrieval descriptors follow the standard MPEG-7 visual tool-set (Bober 2001). They include descriptors based on color, texture, shape, motion and localization. We test an alternative method of obtaining the image descriptor through the application of granulometric operations and machine learning techniques. Granulometric operations are applied to the image at different scales and levels of complexity to derive information about the distribution of its contents (Matheron 1975). Attribute filtering is a relatively new and efficient way of implementing granulometry. Desired descriptors like size, spatial location or shape can be well represented with appropriate attributes like area (Maragos 1989), moments (Wilkinson 2002), (Hu 1962) or shape (Urbach et al. 2007). A size granulometry, for example, uses sieves of increasing sizes to obtain the size distribution of the image. Previous works like (Bagdanov and Worring 2002), (Garcia et al. n.d.) use a structuring element (SE) approach for the granulometric operations. However, recent studies have found connected filtering to be faster and equal or sometimes better in performance than the SE approach (Urbach


et al. 2007). In (Urbach et al. 2007), a shape filter based on a 2-D pattern spectrum consisting of an area and a non-compactness spectrum is proposed. We extend the shape spectrum proposed in (Urbach et al. 2007) and apply it to a photographic data set containing everyday vacation pictures (Grubinger et al. 2006), because most shape-based image retrieval studies concentrate on artificial images or highly specialized, domain-specific image data sets. The proposed shape spectrum consists of three rotation- and scale-invariant spectra: the area–non-compactness (Urbach and Wilkinson 2002), area–compactness and area–entropy pattern spectra. They are weighted, combined and used for image retrieval within large-scale databases. The rest of the paper is organized as follows: Section 4.2 briefly describes the theory of the method employed, Section 4.3 contains the experimental set-up, and Sections 4.4 and 4.5 give the results, discussions and concluding remarks.

4.2 Theory

Connected attribute filtering decomposes an image into sets of connected components. Each component adopts a single attribute value, r, and is considered for further processing only when r satisfies a given criterion. Attribute filtering is manifested through attribute openings or thinnings and is extensively discussed in (Breen and Jones 1996). Let C, D be connected components of a set X and Ψ a binary image operator. Attribute openings are characterized by being increasing (C ⊆ D ⇒ Ψ(C) ⊆ Ψ(D)), idempotent (Ψ(Ψ(C)) = Ψ(C)) and anti-extensive (Ψ(C) ⊆ C). Example attributes include area, perimeter and moment of inertia. Attribute thinnings, on the other hand, are characterized by being idempotent, anti-extensive and non-increasing (C ⊆ D does not imply Ψ(C) ⊆ Ψ(D)). Example attributes are length, compactness, non-compactness, circularity and entropy. If X, Y represent images, then a size granulometry is a set of filters {Γr} with r from some totally ordered set Λ (usually Λ ⊂ R or Z) satisfying the properties:

Γr(X) ⊆ X    (4.1)

X ⊆ Y ⇒ Γr(X) ⊆ Γr(Y)    (4.2)

Γs(Γr(X)) = Γmax(r,s)(X)    (4.3)

for all r, s ∈ Λ. In (Breen and Jones 1996), it is shown that attribute openings indeed provide size granulometries, since equations (4.1), (4.2) and (4.3) define Γr as anti-extensive, increasing and idempotent, respectively. Similarly, (Urbach and Wilkinson 2002) show that a shape granulometry can be obtained from attribute thinnings. The shape granulometry,


of X, is a family of filters, {Φr}, with shape parameter r from some totally ordered set Λ (usually Λ ⊂ R or Z) with the following properties:

Φr(X) ⊆ X    (4.4)

Φr(tX) = t(Φr(X))    (4.5)

Φs(Φr(X)) = Φmax(r,s)(X)    (4.6)

for all r, s ∈ Λ and t > 0. Equations (4.4), (4.5) and (4.6) define Φr as anti-extensive, scale-invariant and idempotent, respectively.

4.2.1 2-D Pattern Spectra

The results of the application of granulometry to an image can be stored in a pattern spectrum (Maragos 1989). A 2-D pattern spectrum represents the results of two granulometric operations in a single 2-dimensional histogram. The shape filter proposed in this work consists of a size-shape pattern spectrum. The size pattern spectrum, sΓ(X), obtained by applying the size granulometry {Γr} to a binary image X, is defined by (Maragos 1989) as:

(sΓ(X))(u) = − dA(Γr(X))/dr, evaluated at r = u    (4.7)

where A(X) is the area of X. The shape pattern spectrum, sΦ(X), obtained by applying the shape granulometry {Φr} to a binary image X, is defined by (Urbach et al. 2007) as:

(sΦ(X))(u) = − dA(Φr(X))/dr, evaluated at r = u    (4.8)

where the difference with (4.7) is in the use of the shape granulometry.

4.2.2 Computing the Pattern Spectra

The Max-tree approach (Meijster and Wilkinson 2002, Salembier et al. 1998) was used to implement the attribute thinnings and openings. Let the peak components, Phk of an image represent the connected components of the threshold set at gray level h with k from some arbitrary index set. These peak components are arranged into a tree structure and filtered by removing nodes whose attribute values are less than a pre-defined threshold,


Figure 4.1: Peak components (Phk), their attributes, the corresponding nodes Chk (the Max-tree), and the resulting 2-D pattern spectrum.

T. Thus, the Max-tree is a rooted tree in which each of its nodes, Chk, at gray level h corresponds to a peak component Phk (Meijster and Wilkinson 2002). An example is shown in Figure 4.1, which illustrates the peak components Phk of a 1-D signal, the corresponding Chk at levels h = 0, 1, 2, 3, the resultant Max-tree and the corresponding spectrum. Note that two attributes are shown per node; the first is the size attribute, which increases as the tree is descended, while the second, the shape attribute, is not increasing. The method of generating the 2-D spectrum has been adopted from (Urbach et al. 2007). Let {Γr} be a size distribution with r from some finite set Λr and {Φs} a shape distribution with s from some index set Λs. If S is the 2-D array that stores the final 2-D spectrum, then each cell S(r, s) contains the sum of the gray-level contributions of the nodes Chk that fall within size class r and shape class s. The 2-D pattern spectrum is then computed from the Max-tree as follows:

• Set all elements of the array S to zero.
• Compute the Max-tree according to the algorithm in (Salembier et al. 1998).
• As the Max-tree is built, compute the area A(Phk), perimeter P(Phk), histogram of the gray levels and moment of inertia I(Phk) of each node.
• For each node Chk:
  – Compute the size class r from the area of Phk.
  – Compute the shape class s from the shape attribute of Phk.
  – Compute the gray level difference δh between the current node and its parent.
  – Add the product of δh and A(Phk) to S(r, s).


The shape attributes chosen are non-compactness, N, defined as

N = I(Phk) / A²(Phk),    (4.9)

compactness, C, defined as

C = P²(Phk) / A(Phk),    (4.10)

and finally, Shannon entropy (Shannon 1948)

H = − Σᵢ p(i) log₂ p(i),    (4.11)

with p(i) the probability with which gray level i occurs in Phk.

4.3 Experiments

The objective of our experiments was: given a sample image, find as many relevant images as possible from the IAPR TC-12 photographic collection (Nardi and Peters 2007). Our method uses the three query images that were provided per topic. The detailed methodology is as follows:

1. Separate the jpeg image into three different images, each representing its red, green and blue color bands. Initial analysis showed that the RGB representation improves results, unlike YUV and XYZ, which performed worse than not separating the images at all.

2. Extract the desired pattern spectra from all the images, including the query images. A 20 by 15 bin histogram that eventually translates into a 1 × 600 array representation was chosen. When concatenated, the spectra retrieved from the three color bands form a 1 × 1800 vector per spectrum type. The three spectra that were tested are:

(a) Area and Non-Compactness (A-N) spectrum
• Area: A size attribute that represents the number of pixels in the component. Initial experiments in (Tushabe and Wilkinson 2007a), (Tushabe and Wilkinson 2008) showed that the discriminative power lies more in the larger particles than in the smaller ones; therefore all particles smaller than 30% of the total image size were ignored.
• Non-Compactness: Thresholds of 1–53 were used for the non-compactness attribute, since this range gave the best MAP when compared with other thresholds in the range T = 1:100.

(b) Area and Compactness (A-C) spectrum
• Area: Same thresholds as above.

• Compactness: The threshold chosen for compactness is T = 600, since it registered the highest MAP when compared with other thresholds in the range T = 1:1000.

(c) Area-Entropy (A-E) spectrum
• Area: Same thresholds as above.

• Entropy: A threshold of T = 8 was chosen because it is the maximum entropy that any component can achieve.

3. The spectra were separated into two equal parts, A and B, referring to larger and smaller features in the images.

4. The baseline distance, d(x,j), of any two images x and j is given by:

d(x,j) = wa dA(x,j) + wb dB(x,j)    (4.12)

where wa and wb are the weights of parts A and B of the spectrum, and dα(x,j) is the L1-norm distance between images x and j computed from part α of the spectrum. The weights chosen for area–non-compactness are wa = 0.7 and wb = 0.3; for area–compactness wa = 0.7 and wb = 0.3; and for area–entropy wa = 0.5 and wb = 0.5. These weights were found by trial and error.

5. The 250 most significant features from the 1 × 1800 spectra are selected and used to train the query images using the naive Bayesian classifier from (Kira and Rendell 1992, Demsar et al. 2004). The images are then classified by each of the spectra into classes consisting of the 60 topics. The distance, dx, of an image from a particular topic is reduced by a given percentage, p, if it has been classified within that topic. This is done because we wish to obtain a single distance, and the Bayesian classifier from (Kira and Rendell 1992) works with a different distance measure than dx. Parameter p is the classification weight and is 20% for the A-N and A-E feature sets and 70% for A-C. These percentages were also determined empirically.

6. The distance, Dx, of image x from topic T is the minimum of its distances from the three topic images, where the distance from each topic image j is the weighted sum of its distances under the three spectra:

Dx = min over j ∈ T of { 0.75 dN(x,j) + 0.20 dC(x,j) + 0.05 dH(x,j) }    (4.13)

Table 4.1: Performance of the spectra.

Run                 MAP      P20      Relevant   % improvement
A-N                 0.0444   0.1258      830          -
A-C                 0.0338   0.1100      819          -
A-E                 0.0265   0.0767      622          -
A-N and A-C         0.0539   0.1508      932         21.4
A-N and A-E         0.0479   0.1358      846          7.9
A-N, A-C and A-E    0.0571   0.1608      926         28.6

where dN(x,j), dC(x,j) and dH(x,j) are the distances between x and j based on their A-N, A-C and A-E spectra, respectively.

7. The similarity measure between images X and Y is then calculated using:

Sim(X, Y) = 1 − Dx / Dmax    (4.14)

where Dmax is the maximum of Dx over the data set, which helps in normalizing Dx.

4.4 Results

The experiments were implemented in C and Matlab and run on an AMD Opteron-based machine. Feature extraction took approximately 3 seconds per image. Performance has been measured using precision, recall and the Mean Average Precision. Precision is a measure of the ability of a system to present only relevant items; it is the ratio of the number of relevant items retrieved to the total number of items retrieved (Zhu 2004). Recall is a measure of the ability of a system to present all relevant items; it is the ratio of the number of relevant items retrieved to the number of relevant items in the collection. The Mean Average Precision (MAP) combines precision and recall results over the entire recall interval from r = 0 to r = 1 (Zhu 2004). The overall performance of this method has shown that combining the three spectra improves the MAP of the best performing single spectrum by over 28%. Table 4.1 gives the detailed results of the different combinations that were performed. They show that the A-N spectrum has the highest discriminative power, followed by A-C and A-E respectively. Figure 4.2 illustrates the interpolated precision-recall average for the three separate and the combined spectra. As expected, at any given point, the precision of the combined spectrum is much higher than that of any of the individual ones. Initial results

Figure 4.2: Interpolated Precision - Recall Averages (curves: non-compactness, compactness, entropy, and all three combined).

showed that Bayesian classification outperformed k-nearest neighbor and decision tree classifiers. Bayesian classification improves the MAP of the combined filter by 28%, from 0.0444 to 0.0571, and precision at 20 from 0.1258 to 0.1333.

4.5 Discussion

Our experiments have shown that using only one technique, i.e., the 2D pattern spectra, produces very promising results for CBIR. There is no doubt that combining it with other visual descriptors like color or texture will further enhance performance for image retrieval. This work proposes a feature vector that combines three 2D pattern spectra: the area–non-compactness, area–compactness and area–entropy spectra. The combined spectrum translates into improved performance in terms of both the mean average precision and precision at 20. Given the small training set used and the simple retrieval scheme,


the registered performance indicates that this feature set shows promise and should be developed further. In (Urbach et al. 2007) the area–non-compactness spectrum is shown to be very robust against noise in the application of diatom identification. The difference in performance between the different pattern spectra may be attributed to differences in robustness to noise. Compactness is probably less robust due to its use of the perimeter. The fact that the A-C spectrum required a classification weight of 70%, compared to 20% for A-N and A-E, could indicate that the decision boundary with the simple nearest-neighbor classifier is less reliable in the case of compactness. The relatively poor performance of entropy may mean that shape is more important than variation in gray level. We believe that choosing features using more advanced relevance learning techniques (Hammer and Villmann 2002, Hammer et al. 2005), as well as using a larger training set, will enhance the MAP scores registered here. Secondly, obtaining the spectra from specific objects (e.g., the cartoon component of the image) as opposed to the whole image can also be tried (Maragos and Evangelopoulos 2007, Sofou et al. 2005). Further advancements should include relevance feedback by users.

Submitted as: F. Tushabe and M. H. F. Wilkinson – “Color Image Processing using Component Trees: A Comparison on Image Compression,”, to Pattern Recognition.

Chapter 5

Color Processing using Max-trees: A Comparison on Image Compression

I know the color of truth
Lyric in a song by Boyz II Men

Abstract This paper proposes a new method of implementing color connected filters, using component trees, through total preorders. It adapts the Max-tree image representation to accommodate color and other vectorial images. The proposed algorithm allows any total order or preorder to be used. The work extends earlier work by Naegel and Passat with three new color restitution decisions to improve color fidelity. Tests are performed on six different color preordering schemes based on luminance, chromaticity, saturation, or combinations of these. Finally, an algorithm to compute the entropy attribute for all nodes in O(N) memory is proposed, as opposed to O(2^B N), with B the number of bits per pixel. The latter is essential when using preorders which require more than 8 bits per pixel to order the pixels. Comparison with an earlier method shows that the proposed methods result in a quality improvement by as much as 15%.
Keywords: Mathematical morphology, color connected filters, attribute filters, component tree, color image compression.

5.1 Introduction

Connected filtering is a branch of mathematical morphology that filters connected components (connected sets of pixels of maximal extent) instead of individual pixels (Serra 1988), (Serra 1998), (Salembier et al. 1998), (Salembier and Serra 1995), (Heijmans 1999). A component is either retained as it is, or is removed if it does not satisfy a given condition, or attribute criterion (Breen and Jones 1996). Connected filters are therefore shape preserving and do not cause blurring even at high filtering levels. They allow users to choose properties of sections of the image that can be ignored, over-processed or filtered out. Connected filters have been used in various applications, including image filtering and noise reduction (Angulo 2007), (Gimenez and Evans 2008), (Naegel and Passat


2009), image simplification for compression (Tushabe and Wilkinson 2007b), (Young and Evans 2003), video processing (Young and Evans 2003), (Salembier et al. 1998), vessel enhancement filtering (Ouzounis and Wilkinson 2006), (Purnama et al. 2010), (Wilkinson and Westenberg 2001), and image analysis in microscopy (Urbach et al. 2007). Recent reviews can be found in (Salembier and Wilkinson 2009), (Wilkinson and Ouzounis 2010). An important class of connected filters is based on the Max-tree and its dual, the Min-tree (Salembier et al. 1998). Both are also referred to as component trees (Jones 1999), (Najman and Couprie 2006), (Naegel and Passat 2009). Auto-dual variants such as the level-line tree have also been developed (Monasse and Guichard 2000a), (Monasse and Guichard 2000b). The aim of these trees is to encode a hierarchy of connected components at different levels to allow fast filtering, or for analysis as a multi-scale representation of the image or volume under study. In the framework of mathematical morphology, the basic working structure is a complete lattice (Ronse 1990), (Serra 1998). A complete lattice is a set of ordered elements (either partially or totally ordered) in which each family of elements possesses a supremum (sup) and an infimum (inf) (Ronse 1990), (Serra 1998). Examples in image analysis are the lattice of subsets of the image domain in the case of binary images, and the lattice of scalar functions on the image domain in the case of gray-scale images. The choice of suprema and infima in gray-scale morphology is straightforward because gray-level intensity values are completely ordered from black to white. In general, as long as the pixel values have a total order, whether range images, intensities, or saturations, choosing the supremum (infimum) over the lattice of images consists of choosing uniform images filled with the supremum (or infimum) pixel value.
By contrast, vector images, or images of scalars lacking total order such as hue or orientation, are not easily given a partial order. This poses problems in component-tree-based processing, because the hierarchy in the tree is driven by the total order of the pixel values. To build such trees in color morphology, the ordering has to be decided upon (Lezoray et al. 2005, Angulo 2007, Naegel and Passat 2009). There are several types of multidimensional vector orderings (Plataniotis and Venetsanopoulos 2000, Lezoray et al. 2005, Angulo 2007). A marginal ordering deals with each component independently and then later concatenates the scalars back together. This method has been shown to introduce new colors (Naegel and Passat 2009). Reduced ordering obtains a scalar value from the vectorial components. Most researchers who adopt this approach calculate distances from a reference vector. In (Witte et al. 2005) colors are ordered with respect to their distances from white or black. In (Angulo 2007) color is ordered based on its distance from other reference colors which are not necessarily white or black. In (Lezoray et al. 2005) the minimum spanning tree of a region adjacency graph (RAG) is used. Partial ordering groups subsets of the data that form a convex hull (Plataniotis and

5.1. Introduction

43

Venetsanopoulos 2000, Gibson et al. 2004) while lexicographical (or conditional or hierarchical) ordering first prioritizes components and then executes comparisons beginning with the first priority component. Other orderings are based on combining any of the above types for example in (Angulo 2007) reduced and lexicographic orders are combined. Multivariate processing in color is generally approached in two major ways (Wheeler and Zmuda 2000). In marginal processing, each channel is processed independently, filtered using regular gray-scale morphology and then merged back into a single color image again. This approach has been found to be very efficient for denoising applications but poor at object detection (Naegel and Passat 2009). The second approach is the vectorial one which transforms the multichannel data into a single channel based on one or more channels, processes it and then performs the color reconstruction. Image filtering implemented using the Max-tree approach (Salembier et al. 1998) is one of the fastest and most flexible ways of implementing connected filters (Meijster and Wilkinson 2002). Unfortunately, very little literature is available about how connected color processing is implemented by using the Max-tree approach. In (Naegel and Passat 2009), several orderings are investigated including marginal, lexicographic and reduced orderings. Four of the five tested approaches were found to produce undesirable colored artifacts and the one that did not suffered from very visible quantization effects. This paper proposes a new Max-tree adaptation to color image processing that does not result in undesirable color artifacts or visible quantization effects. We propose a vectorial image processing method in which the color vector is transformed into a scalar channel through a reduced ordering. Image reconstruction is carried out in such a way that it does not result in undesirable visible quantization effects or color artifacts. 
Like (Naegel and Passat 2009), we focus on component trees for two important reasons: (i) fast algorithms are available (Salembier et al. 1998), (Najman and Couprie 2006) including parallel ones (Wilkinson et al. 2008), and (ii) the methods can readily be extended to forms of connectivity based on clustering or partitioning (Serra 1998), (Ouzounis and Wilkinson 2007). The rest of the paper is organized as follows. In Section 5.2, we discuss the theoretical aspects of the proposed method. Section 5.3 discusses the algorithms that were used in detail while Section 5.4 presents the results obtained after the proposed method is tested on an image compression benchmarking dataset. Conclusions are then provided in Section 5.5.


5.2 Theory

5.2.1 Binary and Gray-scale Operators

Connected operators are image transformations that result in removal or retention of the connected components of an image. Salembier et al. (Salembier et al. 1998) define a binary operator ψ for a binary image X as connected when the set difference X \ ψ(X) is exclusively composed of connected components of X or of its complement X^c. All of the following connected operators are based on connectivity openings or connected openings (Serra 1988). For a given image domain E, connectivity openings Γx are defined for all x ∈ E. In the binary case, Γx(X) returns the foreground component to which x belongs if x ∈ X, and ∅ otherwise. After extracting the connected components using these connectivity openings, a trivial filter ΨΛ, based on an attribute criterion Λ, is applied to each. It is defined as

ΨΛ(C) = C if Λ(C) is true, and ΨΛ(C) = ∅ otherwise,    (5.1)

with C the connected component. The attribute criterion (Breen and Jones 1996) usually has the form

Λ(C) = (Attr(C) ≥ λ)    (5.2)

with Attr(C) some real-valued attribute of C and λ the attribute threshold. Finally, the attribute filter ΨΛ based on criterion Λ is defined as

ΨΛ(X) = ⋃ over x ∈ E of ΨΛ(Γx(X)).    (5.3)

Thus, the result of the attribute filter is the union of all connected components which meet the criterion Λ. If the criterion Λ is increasing, i.e. if C ⊆ D then Λ(C) implies Λ(D), the above operator is an attribute opening, because it is idempotent, increasing and anti-extensive (Breen and Jones 1996); otherwise it is an attribute thinning, which is idempotent and anti-extensive, but not increasing. Both remove connected foreground components which fail the criterion. Their dual operators are the attribute closings and thickenings respectively, which are defined as

\[
\Phi^\Lambda(X) = (\Psi^\Lambda(X^c))^c \tag{5.4}
\]

with X^c = E \ X denoting the set complement. Connected operators are also defined for gray-scale images in (Cheng and Venetsanopoulos 1992), (Vincent 1993), (Breen and Jones 1996), (Salembier et al. 1998). In gray scale, connected filters work on connected components of the level sets

\[
L_h(f) = \{x \in E \mid f(x) = h\}. \tag{5.5}
\]


The connected components L_h^k of these level sets for all levels h, or flat zones, partition the image domain E, i.e., their union is E and the intersection of any two different flat zones is empty. A gray-level operator ψ acting on gray-level functions is connected if, for any function f, the partition of E into flat zones of f is finer than the partition into flat zones of its transformation ψ(f) (Salembier et al. 1998). This means that any flat zone of f is contained entirely in a single flat zone of ψ(f). Conceptually, and in the earliest implementation of the area opening (Cheng and Venetsanopoulos 1992), attribute filters in grey scale work by thresholding the image at all possible grey levels, applying the filter to each level and stacking the results. Let the threshold set T_h at level h be defined as

\[
T_h(f) = \{x \in E \mid f(x) \geq h\}. \tag{5.6}
\]

The grey-scale variant ψ^Λ of any binary attribute filter Ψ^Λ can be defined as

\[
\psi^\Lambda(f)(x) = \sup\{h \mid x \in \Psi^\Lambda(T_h(f))\}, \tag{5.7}
\]

which means that the highest gray value h is chosen such that x is still a member of a connected component of the threshold set which meets the criterion Λ (Breen and Jones 1996). Connected components P_h^k of threshold sets T_h are referred to as peak components (Meijster and Wilkinson 2002). Other definitions of gray-scale attribute filters can be found in (Breen and Jones 1996), (Salembier et al. 1998), (Urbach et al. 2007), and are often based on the Max-tree or component-tree structure discussed in the next section. Practical implementations of gray-scale connected filtering follow three major approaches: the pixel-queue algorithm (Breen and Jones 1996), (Jones 1999, Vincent 1993), the Max-tree approach (Salembier et al. 1998), (Najman and Couprie 2006), and the union-find method (Meijster and Wilkinson 2002), (Tarjan 1975). This work deals exclusively with the Max-tree implementations.
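To make (5.1)–(5.3) and the threshold decomposition of (5.6)–(5.7) concrete, the following is a minimal, deliberately naive sketch, not the Max-tree algorithm used in this chapter. The function names, the 4-connectivity, and the choice of area as attribute are our own illustrative assumptions:

```python
def binary_area_opening(img, lam):
    """Binary attribute opening (5.1)-(5.3): keep each 4-connected foreground
    component C for which the attribute criterion area(C) >= lam holds."""
    h, w = len(img), len(img[0])
    label = [[0] * w for _ in range(h)]
    area, nid = {0: 0}, 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not label[y][x]:
                nid += 1
                area[nid], stack, label[y][x] = 0, [(y, x)], nid
                while stack:                      # flood-fill one component
                    cy, cx = stack.pop()
                    area[nid] += 1
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx),
                                   (cy, cx + 1), (cy, cx - 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and img[ny][nx] and not label[ny][nx]:
                            label[ny][nx] = nid
                            stack.append((ny, nx))
    # union of all components meeting the criterion, cf. (5.3)
    return [[1 if img[y][x] and area[label[y][x]] >= lam else 0
             for x in range(w)] for y in range(h)]

def grayscale_area_opening(f, lam):
    """Grey-scale area opening by threshold decomposition, cf. (5.6)-(5.7):
    the output at x is the highest level h at which x survives in T_h(f)."""
    h, w = len(f), len(f[0])
    out = [[0] * w for _ in range(h)]
    for level in range(1, max(max(row) for row in f) + 1):
        t = [[1 if f[y][x] >= level else 0 for x in range(w)]
             for y in range(h)]                   # threshold set, cf. (5.6)
        opened = binary_area_opening(t, lam)
        for y in range(h):
            for x in range(w):
                if opened[y][x]:
                    out[y][x] = level             # stack the results
    return out
```

This costs one binary filtering per grey level, which is exactly why the single-pass Max-tree representation discussed next is attractive.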

5.2.2 The Max-tree

The Max-tree (Salembier et al. 1998) data structure is an efficient multi-scale representation of a grey-scale image. The nodes C_h^k of the Max-Tree, with k the node index and h the gray level, represent peak components P_h^k for all threshold levels in a data set. The root node represents the set of pixels belonging to the background, and each node has a pointer to its parent. An example of a Max-Tree of a 1-D signal is given in Fig. 5.1. Each node contains a reference to its parent, its original and filtered grey level, and its attribute value, or values in the case of vector-attribute filtering (Urbach et al. 2005). The filtering process is separated into three stages: construction, filtering and restitution. During the construction phase, the Max-tree is built from the flat zones of the image, collecting auxiliary data used for computing the node attributes at a later stage.


Figure 5.1: A 1-D signal f (left), the corresponding peak components (middle) and the Max-Tree (right). Figure after (Wilkinson et al. 2008).

Once the attributes have been stored in the Max-Tree nodes, we can apply the attribute criterion of choice to each node to decide whether or not it should be retained. Various filtering strategies are discussed in (Salembier et al. 1998), (Urbach et al. 2007), (Ouzounis and Wilkinson 2010). In all cases, filtering is performed by identifying and removing the nodes that do not fulfil the attribute criterion Λ. The final phase is restitution, which consists of transforming the output Max-tree into an output image. If a node has been removed, a new gray level has to be assigned to it. Generally this is the gray level of the nearest preserved ancestor in the tree (Salembier et al. 1998). As a result, the gray-level values of the original image are assigned to the pixels of the preserved nodes, and no new gray levels appear in the image. If the criterion Λ is increasing, restitution is simple, because the tree is always pruned: i.e. if a node is rejected, all its descendants are also rejected. However, if the criterion is not increasing, as in the case of scale-invariant filters (Urbach and Wilkinson 2002), (Urbach et al. 2007), there is a problem: some rejected nodes have preserved descendants. There are several possible restitution decisions that can be made (Breen and Jones 1996), (Urbach and Wilkinson 2002), (Urbach et al. 2007), (Salembier et al. 1998), three of which prune the tree, by occasionally overruling the attribute criterion:

Min: removes a node if any of its ancestors is removed (Breen and Jones 1996)

Max: preserves a node if any of its descendants is preserved (Breen and Jones 1996)

Viterbi: treats selecting a correct pruning point in a branch of the tree as an optimization problem (Salembier et al. 1998).

Two others do not prune the tree, but preserve all nodes that meet the criterion, and remove the others:

Direct: simply implements (5.7), leaving all preserved nodes at their original gray value (Breen and Jones 1996)


Subtractive: if a node is removed, all its descendants are lowered by the same amount, effectively changing (5.7) to

\[
\psi^\Lambda(f)(x) = \sum_{h=1}^{h_{\max}} \chi(\Psi^\Lambda(T_h(f)))(x), \tag{5.8}
\]

i.e., summing the results per threshold rather than taking the maximum (Urbach et al. 2007), (Urbach and Wilkinson 2002).

Several variants of these rules exist for grey scale, extending to so-called hyperconnected filters (Ouzounis and Wilkinson 2010). In the following, we will generally use the Direct decision, or its color equivalent, developed below. Until now we have only discussed the Max-tree, which is used for anti-extensive filtering (i.e., for removing bright features). Removing dark features is done using a Min-tree, which is just the Max-tree of the inverted image. Where the Max-tree is used for attribute thinnings and openings, the Min-tree is used for attribute thickenings and closings.
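On a toy tree, the difference between the pruning Min rule and the non-pruning Direct rule can be illustrated as follows. This is a simplified sketch with our own array representation (nodes are indexed so that parent[i] < i, the root is at index 0 and assumed preserved):

```python
def direct_rule(parent, level, keep):
    """Direct decision: a preserved node keeps its own level; a rejected node
    receives the level of its nearest preserved ancestor."""
    out = []
    for i in range(len(level)):
        # out[parent[i]] already equals the nearest preserved ancestor's level
        out.append(level[i] if keep[i] else out[parent[i]])
    return out

def min_rule(parent, keep):
    """Min decision: a node survives only if it meets the criterion and
    none of its ancestors is removed."""
    kept = []
    for i in range(len(keep)):
        kept.append(keep[i] and (i == 0 or kept[parent[i]]))
    return kept
```

Note that with Direct, a preserved node below a rejected ancestor stays at its original level, whereas Min removes it together with the ancestor.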

5.2.3 Color Connected Filters

In linear filtering and many other forms of image processing, color image processing differs little from gray-scale processing. The image is first split into its component R, G, and B channels, then processed by conventional means before the results are recombined into a color image again. This so-called marginal processing is simple, and often effective, but can result in the appearance of new colors not previously present in the image. These “false colors” can produce highly objectionable artefacts in images, and can be avoided by applying vectorial processing. Despite these objections, in noise removal using connected filters, marginal processing yielded the best results (Naegel and Passat 2009). This is not surprising for two reasons: (i) because the filters are connected, no false edges can appear, as in normal morphological or linear filters (Naegel and Passat 2009), and (ii) noise is typically generated by independent processes in each of the R, G, and B channels, so no correlations should exist. However, in HLS or L*a*b* spaces correlations in the noise do exist, and in that case vectorial processing should be better. In more general filtering tasks, even RGB representations should probably be treated in a vectorial way. Several approaches have been proposed to solve this problem. The implementation issues involved in color connected operators can be divided into three parts: identification of the extremal points, the merging criteria of the removed region(s), and the color assignment to both the flat zones and the new merged region (Evans and Gimenez 2008). A few color connected filters have been implemented. In the vector area morphology sieves approach (VAMS) (Evans 2003), the supremum region is


obtained by calculating aggregate distances between each flat zone and its connected neighbors. The extremal node is chosen as the one with the greatest aggregate distance. Merging is to the nearest-neighbor node, and the merged node takes on the color of the nearest neighbor, while the flat zones adopt the mean color value in a given node. Another connected color filter is the convex color sieves (CCS) approach (Gibson et al. 2004). CCS is similar to VAMS except in the way that the extremal points are determined. In CCS, ordering is performed by first constructing a convex hull of each region and its connected neighbors. The extremal region is then defined as the one that lies on the edge of the hull. It is interesting to note that VAMS (Evans 2003) and CCS (Gibson et al. 2004) process extrema without necessarily classifying them as either maxima or minima. This is because it is possible for these approaches to obtain several connected extrema. This weakness has been dealt with by the introduction of VAMOCS (Gimenez and Evans 2008), which combines the strengths of the VAMS and CCS methods, and provides area openings and closings. None of these three methods explicitly builds a tree, because they use a local order. This means that it is more difficult to perform fast multi-scale analysis as in (Meijster and Wilkinson 2002), (Urbach et al. 2007). Another approach is that of the binary partition tree (BPT) (Salembier and Garrido 2000, Vilaplana et al. 2008). In this case the tree does not contain regional maxima (which by their nature require a total order or preorder) as its leaves, but the flat zones of the image; these are hierarchically merged using some measure of color difference to determine the merging order. Though highly effective in color image processing, their computation is not as fast as that of Max-trees, which explains the continuing interest in the latter.
One implementation of color connected filters that uses the Max-tree has been conducted by Naegel and Passat (Naegel and Passat 2009). The Max-tree method requires that the data are ordered, which is easy in grey scale, but non-trivial in color space. Therefore, (Naegel and Passat 2009) impose either a total order or a total preorder on the color data. Let T be our color space. A total order on T is any binary relation ≤ which is

1. reflexive: a ≤ a is true

2. transitive: (a ≤ b) ∧ (b ≤ c) ⇒ a ≤ c

3. total: (a ≤ b) ∨ (b ≤ a) is true

4. antisymmetric: (a ≤ b) ∧ (b ≤ a) ⇒ a = b

In the case of a total preorder the last property (antisymmetry) does not hold. This means that if we use a total preorder on the color space to sort the pixels into different levels


of the Max-tree, pixels with different colors can end up in the same node. A simple example would be sorting by the luminance of each pixel. Obviously, the first three properties hold, due to the total ordering of luminance, but the last does not, because different color stimuli may have the same luminance. By contrast, the hue component from the HSV or HLS color spaces cannot be used because it is not totally ordered. In (Naegel and Passat 2009), the performance on noise removal of five different color connected area filters was tested, using four (pre)ordering schemes: (i) marginal processing, (ii) lexicographic ordering giving priority to the R, G and B bands (in order of priority), (iii) a total order built by combining a total preorder based on the distance to the color white with a complementary lexicographic order, and (iv) a total preordering that calculates the distance of a node from the color white (which is more or less equivalent to luminance). Multiple color assignments within the restitution decisions were also tested. These apply to the preorders only, because in the case of a total order we can simply use the existing rules for gray scale. The Pmean restitution rule assigns each node of the tree the mean value of its constituent pixels as its representative color, and then uses this representative value to restitute the nodes. This means that the rejected nodes obtain the representative color of the nearest preserved ancestor, and that the preserved nodes are assigned their own representative value. The Pmedian decision is similar but uses the median color of the pixels in the region after sorting using lexicographic order. The results from (Naegel and Passat 2009) show that of all the methods that were tested, only the Pmean reconstruction did not introduce undesired color artifacts. It did, however, alter the image quantization so much that the effect was clearly visible even at low thresholds.
Indeed, even if the area threshold is set to zero, and all nodes are preserved, the colors of pixels change. This is highly undesirable. This work proposes different restitution decisions for color filtering using the Max-tree and compares the results with those obtained using Pmean (Naegel and Passat 2009).
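The failure of antisymmetry for a luminance preorder, mentioned above, is easy to demonstrate: two distinct colors can share one luminance value, so they compare "equal" and would land in the same Max-tree node. A two-line check (helper names are ours; the luma weights are those of (5.10)):

```python
def luminance(color):
    """Luma-style luminance used as a scalar order key."""
    r, g, b = color
    return 0.299 * r + 0.587 * g + 0.114 * b

def leq(a, b):
    """Total preorder on colors induced by luminance: reflexive, transitive
    and total, but not antisymmetric."""
    return luminance(a) <= luminance(b)
```

For example, a pure red and a pure green of suitable intensities compare equal in both directions while being different colors.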

5.2.4 New Extensions to Color Max-trees

In this work we explore several different extensions to color Max-trees. Because marginal processing proved best in noise filtering, we focus on the application of image simplification for compression, extending the gray-scale work in (Tushabe and Wilkinson 2007b). First we explore different preorders in Section 5.2.4, trying to find simple schemes with psycho-visually sensible orderings. We then discuss extensions to the restitution decisions for color processing based on preorders, improving the results of (Naegel and Passat 2009). Next, we explore different attributes suitable for simplification in Section 5.2.4. Some of these are not new, but new algorithms for computing them needed to be developed.


Orders and Preorders

The kind of ordering that is proposed in VAMS (Evans 2003) and CCS (Gibson et al. 2004) is not a total ordering because it is local. The same holds for the ordering proposed in VAMOCS (Gimenez and Evans 2008). The ordering in (Naegel and Passat 2009) is according to how far a color is from the color white. This gives a higher priority to white, thereby implying that white is more important than other colors, and poses the question of which color to use as reference. We suggest that color is ordered according to the meaning behind it. Each channel of a given color space represents a more generalized concept. For example, saturation or chromaticity is represented in the S channel of the HLS and HSV color spaces and in the second and third channels of the L*a*b* color space; luminance is represented in the L channel of the HLS and L*a*b* color spaces, etc. Preordering color based on these channels gives the user better intuition of which channel would achieve the best result. We propose preorderings based on:

• Chromaticity (C); defined as the length of the vector formed by the two chromaticity (color) components in the CIE L*a*b* color space (HunterLab 2008):

\[
C = \sqrt{a^2 + b^2}, \tag{5.9}
\]

with a and b the second and third components of the L*a*b* color space.

• Luminance (L_Lab); defined as the first component of the CIE L*a*b* color scheme (HunterLab 2008).

• HLS Luminance (L_HLS); defined as the second component of the HLS color scheme (Plataniotis and Venetsanopoulos 2000):

\[
L_{HLS} = 0.299R + 0.587G + 0.114B. \tag{5.10}
\]

• Saturation (S); the third component in the HLS color space (Plataniotis and Venetsanopoulos 2000):

\[
S = \begin{cases} 0 & \text{if } V = 0\\ (V - X)/V & \text{otherwise,}\end{cases} \tag{5.11}
\]

with V = max(R, G, B) and X = min(R, G, B).

• Weighted Luminance (wL); luminance and chromaticity are combined by giving luminance a higher weight according to Equation 5.12. In these experiments, w1 = 1 and w2 = 256:

\[
wL = w_1 C + w_2 L_{Lab}. \tag{5.12}
\]


• Weighted Chromaticity (wC); luminance and chromaticity are combined by giving chromaticity a higher weight according to Equation 5.13. In these experiments, w1 = 256 and w2 = 1:

\[
wC = w_1 C + w_2 L_{Lab}. \tag{5.13}
\]
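The order image is simply one scalar key per pixel. A sketch of the keys (5.10)–(5.13); note that computing C and L_Lab requires a prior RGB→L*a*b* conversion, which we omit here and instead take a* and b* as given:

```python
def chromaticity(a, b):
    """C = sqrt(a^2 + b^2), cf. (5.9), with a and b the L*a*b* color components."""
    return (a * a + b * b) ** 0.5

def hls_luminance(r, g, b):
    """L_HLS = 0.299 R + 0.587 G + 0.114 B, cf. (5.10)."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def saturation(r, g, b):
    """S per (5.11), with V = max(R, G, B) and X = min(R, G, B)."""
    v, x = max(r, g, b), min(r, g, b)
    return 0.0 if v == 0 else (v - x) / v

def weighted_key(c, l_lab, w1, w2):
    """wL uses (w1, w2) = (1, 256) and wC uses (256, 1), cf. (5.12)-(5.13)."""
    return w1 * c + w2 * l_lab
```

With w2 = 256 in wL, luminance dominates and chromaticity only breaks ties between pixels of equal luminance (and vice versa for wC).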

In all cases, we need to address the consequences for restitution. One simple improvement on the color restitution of (Naegel and Passat 2009) is changing the way preserved nodes are dealt with. In our case we copy the effect of the Direct decision in gray scale, namely that a preserved node retains its original color. Removed nodes must, however, be assigned a new color. For this we can copy the strategy of the Pmean decision for removed nodes, and assign the mean color of the closest preserved ancestor. This rule is referred to as the Mean of Parent decision (MP), and only differs from Pmean in the treatment of preserved nodes. A similar approach could be used with the Pmedian decision, but this is computationally more expensive, and its performance in (Naegel and Passat 2009) was inferior. Alternatively, we propose the Nearest Color (NC) approach. In this case a removed node selects the color closest to its own mean color from those adjacent pixels that belong to the nearest preserved ancestor. This guarantees that no new or false colors appear, because the color selected was always present in the image. We simultaneously minimize the color change while guaranteeing that the output image contains no unwanted structures. The final rule is Nearest Neighbor (NN). This assigns to each pixel in the node to be filtered the color of the spatially nearest pixel in the first preserved ancestor. Unlike the other restitution rules, this splits up the removed nodes into different zones. This means it is no longer a connected filter in the classical sense, but it does prevent false colors and minimizes the edge strength along the boundary of the removed region. In a way, it could be seen as a quick-and-dirty version of the image inpainting proposed in (Dimiccoli and Salembier 2007), which also aims at reducing the boundary between removed and preserved regions.
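The NC selection step above reduces to a simple minimization. A sketch; the Euclidean RGB distance used here is our own assumption, as the text only requires "the color closest to the node's mean color":

```python
def nearest_color(mean_color, ancestor_boundary_colors):
    """NC restitution: among the colors of the preserved ancestor's pixels
    adjacent to the removed node, pick the one closest to its mean color."""
    def dist2(c1, c2):
        # squared Euclidean distance in RGB (assumed metric)
        return sum((u - v) ** 2 for u, v in zip(c1, c2))
    return min(ancestor_boundary_colors, key=lambda c: dist2(c, mean_color))
```

Because the returned color is taken from pixels already present in the image, no false colors can be introduced.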
Unlike (Dimiccoli and Salembier 2007), our method guarantees idempotence, because according to the preorder the new region is completely flat. This is because all colors used come from a single node in the Max-tree. A simple result on the image Lenna contaminated with salt-and-pepper noise is shown in Fig. 5.2. Note how all three new methods are competitive in quality with the marginal approach, whereas Pmean creates color artefacts.

Attributes

Filtering is achieved by determining whether the attribute value of each node satisfies a given attribute criterion. These experiments tested the following attributes:


Figure 5.2: Denoising using different restitution rules, using area open/close with area threshold 5: (a) Lenna with color salt-and-pepper noise; (b) marginal processing; (c) NC; (d) NN; (e) MP; (f) Pmean.

• Area: The area attribute calculates the size of a component and has been defined as (Cheng and Venetsanopoulos 1992), (Vincent 1993), (Meijster and Wilkinson 2002):

\[
A(X) = \sum_{x \in X} \chi(X)(x), \tag{5.14}
\]

where X is the set of pixels in the region, and χ(X) is the characteristic function of X.

• Volume: The volume attribute (Vachier 1998) is the change in intensity over the area of a node and is given as:

\[
V(X, f, h_{parent}) = \sum_{x \in X} (f(x) - h_{parent}), \tag{5.15}
\]

where f is the intensity value within the original region and h_parent is the grey level of the parent.


• Power: The power attribute (Young and Evans 2003) calculates the square of the change in intensity over the area of a node. It is defined as:

\[
P(X, f, h_{parent}) = \sum_{x \in X} (f(x) - h_{parent})^2. \tag{5.16}
\]

• Entropy: The entropy attribute (Shannon 1948) measures the information content in the grey-level distribution of a node and is defined as:

\[
E(X) = -\sum_{h} p(h) \log_2(p(h)), \tag{5.17}
\]

with p(h) the probability that grey level h occurs within X.

• Vision: The Vision attribute (Tushabe and Wilkinson 2007b) calculates the volume of all nodes, but only components whose volume is equal to the threshold are considered.

• VisionP: We define the VisionP attribute to calculate the power of all nodes, but only components whose power is equal to the threshold are considered.
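For a single node, the scalar attributes (5.14)–(5.17) reduce to simple sums over its pixel values. A sketch with a hypothetical helper taking the pixel grey values of a component and the parent's level:

```python
from math import log2
from collections import Counter

def node_attributes(values, h_parent):
    """Area, volume, power and entropy of one component, cf. (5.14)-(5.17)."""
    area = len(values)                                   # (5.14)
    volume = sum(v - h_parent for v in values)           # (5.15)
    power = sum((v - h_parent) ** 2 for v in values)     # (5.16)
    counts = Counter(values)                             # grey-level histogram
    entropy = -sum((n / area) * log2(n / area)           # (5.17)
                   for n in counts.values())
    return area, volume, power, entropy
```

In the Max-tree these quantities are of course accumulated incrementally per node rather than recomputed from pixel lists, as described in Section 5.3.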

5.2.5 The Proposed Method

In general, the proposed method first converts the color image into a gray-scale one based on a general concept like its luminance, chromaticity or saturation. This image, called the order image, is the foundation on which the Max-tree is built. Once the tree has been constructed, the nodes that do not fulfil the user requirements are removed using the conventional gray-scale filtering rules as discussed in Sections 5.2.1 and 5.2.2. Image restitution is then performed using one of the four restitution methods from Section 5.2.4.

5.3 Algorithms

To implement the above methods, we adapted the algorithm from (Salembier et al. 1998) to color filtering. Two essential steps must be made. First we need to provide the algorithm with two images: (i) the original color image (ORI), and (ii) the image indicating the (pre)order of the pixels (order). In the current implementation we tacitly assume that ORI uses the RGB color space. The rationale is that this represents the physical stimulus, and that the averages of the R, G, and B values over each node in the Max-Tree best represent the observed stimulus if the image were rescaled in such a way that the entire node consisted of just one pixel. However, the algorithm does allow the use of other color spaces for ORI. The order image is computed from ORI by any of the equations


in Section 5.2.4. Min-Trees are not built explicitly. Instead, we simply invert the order image and compute and filter the Max-Tree using the inverted order. The second adaptation is to the Max-Tree nodes themselves, which need to maintain the color information of the node, both of the node itself and of those neighboring pixels which belong to its parent in the tree. In all, a Max-tree node contains the following fields:

• Parent: reference to the parent of C_h^k

• Attribute: pointer to attribute data

• Preserve: preserve/reject status of the node

• Level: level according to the initial ordering (grey level in the order image)

• NewLevel: ordering level after filtering

• Color: original color

• NewColor: color after filtering

• FirstRejected: reference to the first rejected ancestor

• ParColor: parent color after filtering

• Neighbors: list of neighboring pixels which are members of the parent node

The first six fields are identical to those in the original algorithm; the remainder are needed for color processing. In our implementation, the nodes are stored in an array Tree and all references to nodes are stored as array indices.
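In Python terms, the node record above could look as follows. This is a sketch only: the field types are simplified and the default values are our own choices, not part of the C implementation:

```python
from dataclasses import dataclass, field

@dataclass
class MaxTreeNode:
    """Color Max-tree node mirroring the fields listed above."""
    parent: int = -1                 # index of the parent node in the Tree array
    attribute: object = None         # pointer to auxiliary attribute data
    preserve: bool = True            # preserve/reject status of the node
    level: int = 0                   # level according to the initial (pre)ordering
    new_level: int = 0               # ordering level after filtering
    color: tuple = (0, 0, 0)         # original mean color
    new_color: tuple = (0, 0, 0)     # color after filtering
    first_rejected: int = -1         # reference to the first rejected ancestor
    par_color: tuple = (0, 0, 0)     # parent color after filtering
    neighbors: list = field(default_factory=list)  # pixels belonging to the parent
```

Storing parent and first_rejected as array indices, as in the C implementation, keeps the tree compact and trivially serializable.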

5.3.1 The Building Phase

In the building phase of the Max-Tree, using the flood-filling algorithm from (Salembier et al. 1998), only one adaptation is needed: we have to compute the mean color of each node. To do this, we initialize a Color vector and the node's own area, stored in OwnArea, to zero, and update them as each pixel is added to the node, as shown in Alg. 1. Finally we divide each element of Color by the node's area, and assign this to the Color field of the final node. After building the tree, the Neighbors list for each node is built. This is done by a single pass through the image. At each point, we obtain the node curnode belonging to the current pixel p. We then visit all the neighbors q of p, and test if order[q] is larger than order[p]. If node, the node corresponding to pixel q, is a descendant


Algorithm 1 The flooding function of the Color Max-Tree algorithm adapted for attribute openings and thinnings. The parameters h and m are the current and child node gray levels, while attr holds the attribute data at level h within the same connected component. The parameter thisAttribute is used to pass child attributes to parent nodes.

flood(h, thisAttribute) {                           /* Flooding function at level h */
    Color = {0,0,0}                                 /* Initialize color vector */
    OwnArea = 0
    while (not HQueue-empty(h)) {                   /* First step: propagation */
        p = HQueue-first(h)                         /* Retrieve priority pixel */
        STATUS[p] = NumberOfNodes[order[p]]         /* STATUS = the node index */
        x = x_coord_of_p                            /* Retrieve x, y coordinates of p */
        y = y_coord_of_p
        if (attr not initialized) {
            attr = NewAuxData(x, y)                 /* Initialize attr */
            if (thisAttribute)
                MergeAuxData(attr, thisAttribute)   /* Account for child attributes */
        } else {
            AddToAuxData(attr, x, y)                /* Add pixel to current node */
        }
        Color = Color + ORI[p]                      /* Add original color of p to color data */
        OwnArea++                                   /* Increment own area */
        for (every neighbor q of p) {               /* Process the neighbors */
            if (STATUS[q] == "NotAnalyzed") {
                HQueue-add(order[q], q)             /* Add q to the queue */
                STATUS[q] = "InTheQueue"
                NodeAtLevel[order[q]] = TRUE        /* Confirm node existence */
                if (order[q] > order[p]) {          /* Check for child nodes */
                    m = order[q]
                    child_attribute = NULL
                    do {
                        m = flood(m, child_attribute)       /* Recursive child flood */
                    } while (m != h)
                    MergeAuxData(attr, child_attribute)     /* Merge auxiliary data of children */
                }
            }
        }
    }
    NumberOfNodes[h] = NumberOfNodes[h] + 1         /* Update the node index */
    m = h - 1                                       /* Second step: define the parent */
    while ((m >= 0) and (NodeAtLevel[m] == FALSE))
        m = m - 1
    if (m >= 0) {                                   /* Node parent is not the background */
        idx = NodeOffsetAtLevel[h] - 1
        Tree[idx]->Parent = NodeOffsetAtLevel[m]    /* Compute the parent node */
        PostAuxData(attr, m)                        /* Signal parent level to attr */
    } else {                                        /* Node parent is the background */
        idx = NodeOffsetAtLevel[h]                  /* Check if node exists and create if not */
        Tree[idx]->Parent = idx                     /* Compute the parent node */
        PostAuxData(attr, h)                        /* Signal parent level to attr */
    }
    MergeAuxData(Tree[idx]->Attribute, attr)        /* Merge node attributes */
    Tree[idx]->Color = Color / OwnArea              /* Compute mean color */
    Tree[idx]->Status = Finalized                   /* Finalize node */
    Tree[idx]->Level = h
    NodeAtLevel[h] = FALSE
    thisAttribute = Tree[idx]->Attribute            /* Set 'thisAttribute' for recursion */
    return (m)
}


of curnode, we follow the root path from node until we find its ancestor which has curnode as parent. Pixel p is then appended to the neighbor list of this node using the AddToPixList function. This function first checks if the last pixel in the neighbor list is equal to p, and if not, appends p to the list. In this way, the final pixel list never contains duplicates. The code is shown in Alg. 2.

Algorithm 2 Setting the neighbors lists.

SetNeighbors(mt, order) {
    for (all pixels p) {                                /* Build neighbor lists */
        curnode = mt->getNode(p)
        for (every neighbor q of p) {
            if (order[q] > order[p]) {                  /* Descendant of p */
                node = mt->getNode(q)
                while (mt->Tree[node].Parent != curnode) {  /* Find ancestor of q which */
                    node = mt->Tree[node].Parent            /* is direct descendant of p */
                }
                AddToPixList(mt->Tree[node].Neighbors, p)   /* Add p to Neighbor list */
            }
        }
    }
}

5.3.2 Attribute Management

To allow code reuse for different attributes with the same Max-Tree, we use the approach of (Breen and Jones 1996), in which pointers to functions for attribute administration are passed to the Max-Tree building and filtering routines. Auxiliary data sets are created and updated during the building phase for each node, and the final attribute is computed from these data in the filtering phase. Pointers to the following functions are required:

• NewAuxData, which initializes the auxiliary data and inserts the first point

• DisposeAuxData, which discards them

• AddToAuxData, which adds a pixel to the auxiliary data

• MergeAuxData, which merges two sets of auxiliary data

• PostAuxData, which stores the parent's level in order in the auxiliary data (if necessary)

• Attribute, which computes the attribute based on the auxiliary data

In the case of the area attribute, the auxiliary data only contains the area of the node, which is initialized to zero. In this case AddToAuxData simply increments the area


value, MergeAuxData adds the area values, and Attribute simply returns the area value. PostAuxData is a dummy routine in this case, because the area attribute does not depend on the parent grey level. Other attributes require more advanced handling. Previously, entropy was computed by maintaining a histogram of the grey levels (or order levels) within each peak component P_h^k in the auxiliary data, and computing the entropy using (5.17). In the case of eight-bits-per-pixel grey-level images this is feasible, even though the worst-case memory usage is some 1024 times the image size, just for the histograms. In the case of 16 or more bits-per-pixel grey-level or order images, such as in the case of wL or wC, the cost becomes prohibitive. We therefore propose a new algorithm to compute this attribute. Instead of allocating histograms for each node, we store the necessary information in a linked list. This in itself does not reduce the memory cost. However, the linked lists can be nested within each other using the nesting relationships between the Max-Tree nodes. Thus, all the attribute data are stored in a single linked list, and the auxiliary data of each Max-Tree node indicate which section of the list is relevant to its entropy. In our implementation, the entropy auxiliary data structure EntropyData consists of only four fields: an integer field Area denoting the area of the current node C_h^k, a grey-level field Hcurrent containing the grey level h of C_h^k, and two pointers to EntropyData structures named next and last. The first points to the next item in the list, the last to the last item pertaining to the current node's entropy. This means the worst-case storage requirement is O(N) (i.e., independent of the number of grey levels) as opposed to O(GN), with G the number of grey levels and N the number of pixels. At initialization of a node, an EntropyData structure is created, using the function NewEntropyData, as shown in Alg. 3. Its Hcurrent grey level is assigned the order level of the current node, and its area is set to one. The next pointer is set to NULL, and the last pointer to the memory location of the current EntropyData structure, to indicate a singleton list. As pixels at the current grey level are added, using AddToEntropyData, all we have to do is increment the Area field. If a child node exists, what needs to be done is to append the EntropyData linked list of the sub-tree to that of the current node. This is done in MergeEntropyData, by updating the next pointer of the last item of the list, followed by an update of the last pointer of the current list itself. Once all the data have been gathered, the final attribute can be computed node by node, by traversing the appropriate EntropyData linked list, building the histogram from the area and grey-level information, and computing the entropy as usual. This does mean the worst-case computational complexity is O(GN), because we visit each node O(G) times. The algorithms for the other attributes used can be found in the Appendix.
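The nested-list idea can be sketched in a few lines. This mirrors the EntropyData structure of Alg. 3, with one (level, area) item per node, linked via next and last (Python naming is ours):

```python
from math import log2

class EntropyData:
    """Linked-list auxiliary data for the entropy attribute."""
    def __init__(self, level):
        self.level = level      # Hcurrent: grey (order) level of the node
        self.area = 1           # number of pixels at this level
        self.next = None
        self.last = self        # singleton list initially

def add_pixel(d):
    d.area += 1                 # AddToEntropyData: just increment the area

def merge(d, child):
    d.last.next = child         # append the child's list to the end
    d.last = child.last         # update last

def entropy(d):
    """Traverse this node's section of the list, build the grey-level
    histogram, and compute the entropy as in (5.17)."""
    hist, total = {}, 0
    cur, stop = d, d.last
    while True:
        hist[cur.level] = hist.get(cur.level, 0) + cur.area
        total += cur.area
        if cur is stop:
            break
        cur = cur.next
    return -sum((n / total) * log2(n / total) for n in hist.values())
```

Because a node's section of the list ends at its own last pointer, the same storage serves every node, giving the O(N) worst-case memory stated above.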


Algorithm 3 The key functions for the entropy attribute.

void *NewEntropyData(x, y) {
    allocate *entropydata;
    entropydata->Hcurrent = order[x,y];
    entropydata->Area = 1;
    entropydata->next = NULL;
    entropydata->last = entropydata;
    return(entropydata);
} /* NewEntropyData */

void AddToEntropyData(*entropydata, x, y) {
    entropydata->Area++;
} /* AddToEntropyData */

void MergeEntropyData(*entropydata, *childdata) {
    entropydata->last->next = childdata;            /* append to end of list */
    entropydata->last = childdata->last;            /* update last */
} /* MergeEntropyData */

double EntropyAttribute(*entropydata) {
    initialize array p to zero
    totalpixels = 0
    current = entropydata;
    while (current != entropydata->last) {
        p[current->Hcurrent] += current->Area;      /* add to histogram */
        totalpixels += current->Area;               /* increment total pixels */
        current = current->next;                    /* move to next */
    }
    p[current->Hcurrent] += current->Area;          /* process last */
    totalpixels += current->Area;
    entropy = 0;
    for all grey levels i >= entropydata->Hcurrent {    /* compute entropy */
        if (p[i] > 0) {
            entropy += (p[i]/totalpixels)*log2(p[i]/totalpixels);
        }
    }
    return(-entropy);
} /* EntropyAttribute */

Filtering and Restitution

The biggest differences between grey-scale and color filtering are found in the filtering and restitution phase. As discussed before, there are many choices we can make concerning the final colors assigned to each pixel in the case of total preorders. This is because a single level in the order image may correspond to multiple colors in ORI. If


order represents a total order, this is not the case. Instead we have

    order[p] = order[q]  ⇒  ORI[p] = ORI[q],

and all restitution strategies found in (Salembier et al. 1998, Urbach and Wilkinson 2002) can be used without modification. In the current implementation, we only consider the Direct filtering rule. This means that in all cases we preserve the current color of all preserved nodes in the tree. Naegel and Passat (Naegel and Passat 2009) use the vector mean color, or the median color based on lexicographic ordering, to represent preserved nodes, which leads to severe color artefacts even in preserved regions, as noted in (Naegel and Passat 2009). Here we propose a different approach: simply retain the original color of each pixel in each preserved node. This avoids all color artefacts. In the direct rule, any node which is not preserved needs to be assigned the level of the first ancestor FirstPreserved along the root path which has been preserved. However, we do not compute a pointer to FirstPreserved, but to its immediate descendant along the root path, and store this in the node in the FirstRejected field. We do this because in some restitution methods we need those pixels of the FirstPreserved node which are adjacent to the FirstRejected node (which denotes the root of the subtree we are changing).

5.4 Experimental Results

The experiments used the fourteen (14) test images obtained from the image compression benchmark database in (Garg 2008). Quality is measured using the mean Structural SIMilarity (SSIM) quality index (Wang et al. 2005). The overall SSIM value of a given image is determined as the average of the SSIM values from the three color bands, i.e. red, green and blue. The objective of the study was twofold: firstly, to see how the proposed methods work and to identify the circumstances under which the different operations, orders and decisions work best; secondly, to compare the behavior of selected attributes when used as a pre-filter before JPEG compression (JPEG 1993) at a quality of 75. The attributes that are compared are Area, Power, Volume, Vision, VisionP and Entropy. The implementation was in the C programming language on the Cygwin platform and in Matlab 6.5 under Windows.

5.4.1 Comparison of Filters

Four different filter combinations have been tested: an area opening (denoted Max), an area closing (Min), as well as an opening followed by a closing (MaxMin) and a closing followed by an opening (MinMax). In all cases the attribute threshold T = 150.

Figure 5.3: Some of the original images used: a) Artificial and b) Leaves

The results show that there is no difference in quality between the MaxMin and MinMax operations. This is because both filter out the same content. Under a similar environment, the quality of either a Max or a Min operation is predictably better than using the joint MaxMin or MinMax operation, because less content has been filtered out for a given attribute threshold. The overall observations reveal that Min operations work best in conjunction with luminance-based orders, while saturation and chromaticity are better associated with Max operations. An example of this is illustrated in Figure 5.4, which shows the quality bar chart (SSIM per order, for the Max, MaxMin, Min and MinMax operations) of image Leaves.

Figure 5.4: The various filters tested on image Leaves

Image Leaves contains few colors, mainly green, as shown in Figure 5.3(b). It can be seen that, for this particular image, the luminance-based operations performed better than the rest, especially those boosted by the Min operation. This is because the image is more-or-less monochrome, thus causing luminance to play a more dominant role.

5.4.2 Comparison of Decisions

Four restitution decisions discussed in Section 5.2.4 were tested: the new NC, NN and MP decisions, and Pmean, the best of those tested in (Naegel and Passat 2009). The images were filtered using the MaxMin operation at an area threshold of T = 150.

Table 5.1: Quality obtained after filtering using the different decisions

Order       NN      NC      MP      Pmean   AvQuality
LLab        0.80    0.80    0.79    0.76    0.786
C           0.81    0.82    0.72    0.48    0.705
wL          0.79    0.79    0.79    0.79    0.789
wC          0.75    0.75    0.75    0.74    0.749
LHLS        0.79    0.79    0.78    0.76    0.781
S           0.71    0.71    0.62    0.52    0.642
AvQuality   0.775   0.777   0.742   0.674

Table 5.2: Compression ratios obtained after filtering using the different decisions

Order       NN      NC      MP      Pmean   AvCR
LLab        1.63    1.64    1.64    1.70    1.652
C           1.12    1.19    1.05    1.75    1.276
wL          1.64    1.64    1.64    1.65    1.643
wC          1.19    1.19    1.18    1.17    1.181
LHLS        1.64    1.65    1.65    1.70    1.661
S           1.65    1.65    1.40    1.72    1.601
AvCR        1.478   1.491   1.423   1.616

The results of filtering image Artificial using all four methods are demonstrated in Figure 5.5. It is clear that visibly different colors have been introduced into the image filtered using Pmean, which is not the case with the new methods. A similar effect is shown in Figure 5.2. This corresponds to the results in (Naegel and Passat 2009) that show visible quantization effects. Further scrutiny also shows that Pmean performs worse when the order used is based upon saturation or chromaticity. The average results obtained from all the images are given in Table 5.1 and Table 5.2. Table 5.1 shows that, when filtering at the same threshold, the decision that produces the best quality is NC, closely followed by NN, MP and then Pmean. The average quality registered by using the nearest color is 0.777, which is equivalent to a 15% improvement when compared to using Pmean. Although the quality difference between the NC and NN filters is not statistically significant, Table 5.2 shows that NC gives slightly better compression ratios. On the other hand, Pmean registers the highest compression ratios. The overall results show that the NC and NN decisions produce high-quality images devoid of colored artifacts.



Figure 5.5: Artificial after chromaticity filtering using (a) NC, (b) NN, (c) MP, and (d) Pmean , using area filter and area threshold 100. Full-color versions are on the front cover.

5.4.3 Comparing the Preorders

In order to test the different types of preordering, all 14 images were filtered using the MaxMin operation, the NN decision, and an area threshold of T = 150. It emerged that there is no statistically significant difference between the performance of LHLS and LLab . This can be observed in Figures 5.4, 5.6 and 5.7 and Table 5.1. These results show that the LHLS and LLab orders result in the best quality images at the lowest sizes, i.e. the best compression ratios. This therefore makes luminance-based orders, on average, better than saturation or chromaticity ones. However, scrutiny of individual images shows that some are better off being filtered with chromaticity- and saturation-related orders. Figures 5.6 and 5.7 show a comparison of two images filtered at different thresholds ranging from 100 to 2000. Image Artificial, shown in Figure 5.3(a), is an image rich in colors, and its analysis in Figure 5.6 shows that the chromaticity order returns the best quality.

Figure 5.6: Testing the Orders on image Artificial

On the other hand, image Leaves, shown in Figure 5.3(b), is mainly monochromatic, and its analysis in Figure 5.7 shows that the luminance-based orders register the best quality while chromaticity takes a back seat. On average, the images that registered the worst quality are those filtered by saturation, while chromaticity-related filters register the lowest compression ratios, as shown in Table 5.2. Incidentally, the combination of luminance and chromaticity orders was always almost as good as using either of the two orders separately. This means that the proposed combination strategy needs further improvement. We therefore recommend that users choose either luminance, saturation or chromaticity orders until a more effective combination method is discovered.

5.4.4 Comparing the Attributes

The six attributes that were tested are Area, Volume, Power, Vision, VisionP and Entropy. All the images were filtered at 10 different thresholds using the NC decision, the LLab preorder and a MaxMin operation, and then compressed. In order to obtain similar bit rates (size in bytes × 8 / number of pixels), different threshold ranges for each of the attributes had to be obtained. The threshold range varied from image to image and from attribute to attribute.

Figure 5.7: Testing the Orders on image Leaves

Our experiments show that all these attributes are suitable as a preprocessing filter for color compression, although some attributes are better than others. For all images, quality reduces with an increase in compression, or a reduction in bit rate. These results are shown in Figure 5.8, which illustrates how quality changes with a decrease in bit rate. It can be observed that the quality of the Area, Volume and Power filters reduces gradually and predictably, unlike Entropy, Vision and VisionP, which at one point show drastic reductions with a small increase in filtering threshold. The quality of Entropy filters declines sharply after an average bit rate of approximately 0.31 has been attained. The Vision and VisionP attributes give interesting results. An increase in filtering threshold causes a reduction in quality until a turning point, after which it becomes fairly unpredictable. In some images, the quality begins to improve despite an increase in compression ratio. This happens at very low bit rates and after the image has been severely degraded. This was also shown to happen in (Tushabe and Wilkinson 2007b) and attributed to their edge-enhancing properties.

Figure 5.8: Average quality expressed as SSIM over the 14 images in the data set as a function of bit-rate

When the Vision and VisionP attributes are compared, the quality of Vision-filtered images is consistently better than that of VisionP-filtered ones. This can be observed in Figure 5.8. This could mean that Volume is a more robust attribute than Power. In general, the attribute filter that results in the highest quality images is Area, followed by Volume, Power, Vision and VisionP, while Entropy performs well below bit rates of 0.3.

5.5 Discussion and Conclusions

This work proposes a method for extending the Max-tree representation to color image processing. The methods that have been discussed result in high-quality images, especially since no new colors or artifacts are introduced. The color image is first transformed into a grey-scale (pre)order image, using luminance, saturation or chromaticity, or combinations thereof. The results have shown that, generally, luminance-based orders give better quality and


lower-sized images in image compression. However, depending on the nature of the input image, chromaticity can play a better role too. Though the main focus of this work was on quasi-auto-dual filters (MinMax and MaxMin), filtering the image using the Min operation has been found to suit the luminance order, while saturation and chromaticity are better suited to the Max operation, on several images. This work shows that color image reconstruction using the nearest color or nearest neighbor decisions is superior to the other decisions that were tested, though the MP decision came in a good third. As a preprocessing method for compression, the performance of the attributes is quite similar to that found in (Tushabe and Wilkinson 2007b). Area and Volume have been found to be the most suitable attributes for compression. However, in the previous work area was outperformed by volume, whereas in this case area was the better performer. This may be the result of the different quality criterion, because the Universal Quality Index (UQI) (Wang and Bovik 2002) was used in (Tushabe and Wilkinson 2007b), whereas we used SSIM here (Wang et al. 2005). It should be noted that the other attributes tested could well be of great importance to other image processing applications. The availability of a memory-efficient algorithm for the entropy attribute may be of importance in such cases. The results of this approach can be further improved by investigating better ways of combining the luminance, chromaticity and saturation orders. Furthermore, the approach could be extended to other connectivities, by including these adaptations in the dual-input Max-tree algorithm (Ouzounis and Wilkinson 2007). In the future we will extend this work to vector-attribute filtering (Urbach et al. 2005) for object recognition.

Submitted as: F. Tushabe and M. H. F. Wilkinson, "Color Image Processing using Component Trees: A Comparison on Image Compression," to Pattern Recognition.

Chapter 6

Color Vector-Attribute Filtering: an Application to Traffic-Sign Recognition

Abstract

This paper introduces a connected vector-attribute operator that filters image components based upon their color. Instead of using vectors of shape descriptors to detect objects, color information is used. The proposed filter is useful for situations in which it is desirable to recognize objects based on how similar their color is to a reference color. We tested the performance of the proposed filter using six color spaces, and applied it to automatic traffic-sign recognition. Our experiments show that object recognition using the color filter was better than using a similar shape filter. The best color filter gives an AUC of 85%, compared to an AUC of 71% for the best shape filter that was tested.

Keywords: color filter, color connected operator, shape filter, Max-tree, binary partition tree, vector attributes, attribute filtering, traffic-sign recognition, traffic-sign detection, road-sign detection.

6.1 Introduction

Color plays an important role within object recognition. It is one of the simplest and most intuitive ingredients of the object detection process. In this paper we focus on how to include color in connected attribute filters based on Max-trees (Salembier et al. 1998), also called component trees (Jones 1999, Najman and Couprie 2006). This chapter is part of a larger study of traffic-sign recognition based on connected filters. The full study will also include other filters, such as those based on the binary partition tree (Salembier and Garrido 2000, Vilaplana et al. 2008), on level-line trees (Monasse and Guichard 2000a, Monasse and Guichard 2000b), and those that go beyond autoduality (Soille 2005, Soille 2008). It will also include more advanced object recognition methods such as Learning Vector Quantization (Kohonen 1990) and its generalizations (Hammer and Villmann 2002). In this paper, we will limit ourselves to optimal strategies to include color information in attribute filtering based on Max-trees. In connected attribute filtering the usual way to describe objects is through size (Cheng and Venetsanopoulos 1992, Vincent 1993, Breen and Jones 1996, Salembier et al. 1998) or shape (Urbach and Wilkinson 2002, Urbach et al. 2007, Breen and Jones 1996,


Salembier et al. 1998), or a combination (Urbach et al. 2007, Urbach et al. 2005, Naegel et al. 2007). Attribute filters on grey scale images compute some property or attribute of each connected component of each threshold set of the image, and remove those which do not meet some criterion. Initially a single attribute was used (Breen and Jones 1996, Salembier et al. 1998); more recently, filtering based on attribute vectors has been introduced (Urbach et al. 2005, Naegel et al. 2007). Though vector-attribute filters using shape descriptors can be quite powerful, color information could add to this, in particular because it is independent of, and therefore complementary to, shape information. One area in which it has long been recognized that the combination of color and shape is important is the application of image analysis to traffic-sign recognition (Piccioli et al. 1996, Bahlmann et al. 2005, Goedemé 2008, de la Escalera et al. 2003). Traffic-sign recognition is useful both for the design of autonomous vehicles and for driver assistance in intelligent vehicles (Klette et al. 2009). Important issues are speed of computation, due to the real-time requirements involved, and invariance to scale, translation, rotation and projection effects (Piccioli et al. 1996, Bahlmann et al. 2005, de la Escalera et al. 2003). For this reason (Goedemé 2008) used SURF-based features in this context, because they are fast, and rotation and scale invariant. This makes filtering based on component trees interesting for this application, because they provide an efficient, compact, multi-scale representation of the image, and allow any invariance to be included easily (Urbach et al. 2007). In images, color is represented by a 3-D vector whose components describe color, either in terms of the three basic stimuli red, green, and blue, or using descriptors such as chromaticity, saturation, luminance, hue, etc.
Thus, the RGB color space defines color using red, green and blue components. The HSL color space defines color as a combination of hue, saturation and lightness, while in L*a*b* (HunterLab 2008) color is defined as a combination of luminance and two chromaticity components. This vector representation has consequences for the morphological tools we want to use. Although color filters are well defined within many image processing fields, this is not the case in connected filtering based on component trees. This is because a total ordering of pixel values is an essential ingredient in this implementation of component trees. Therefore, this work builds on earlier work to build component trees on color images (Naegel and Passat 2009, Tushabe and Wilkinson 2010). Both papers focus on how to build optimal component trees by defining total orders or total preorders on color image data. This is necessary to define threshold sets properly. It is shown that total preorders suffice to build component trees, provided appropriate image restitution rules are set in place (Naegel and Passat 2009, Tushabe and Wilkinson 2010). For filtering purposes, both previous papers used classical shape and size attributes from their grey-scale counterparts (Naegel and Passat 2009, Tushabe and Wilkinson


2010). In this paper we explore the use of color as attribute vectors. Color has been used as an attribute vector before, but in a different kind of filter, based on the binary partition tree (Salembier and Garrido 2000), which is considerably more expensive computationally to build than the Max-tree or component tree on which the current paper focuses. This report is divided into five sections. Section 6.2 briefly describes the theory behind the proposed method, specifically a discussion of connected attribute filtering and its implementation using the Max-tree. Section 6.3 describes the proposed color filter and the method of execution. Section 6.4 reports on the experimental set-up and the results obtained when the filters are applied to traffic-sign recognition. Conclusions and future work are discussed in Section 6.5.

6.2 Previous Work

In this section we first discuss the concept of connected attribute filters in the binary case, and then extend them to vector attributes in Section 6.2.2. We then extend these notions to grey scale using the Max-tree approach in Section 6.2.3, followed by a discussion of color connected filtering in Section 6.2.4. As is usual in mathematical morphology, binary images are considered subsets of the image domain E (Serra 1982). Grey scale images are mappings from E to some completely ordered set T; color images are mappings from E to some vector space V (usually V is a subset of Z^3 or R^3).

6.2.1 Connected Operators

Connected operators are image transformations that result from the identification and manipulation of the connected components of an image. Salembier and Serra (Salembier and Serra 1995) define an operator ψ as connected when, for any binary image X, the set difference X \ ψ(X) is exclusively composed of connected components of X or of its complement, X^c = E \ X. The set of all subsets of E is denoted as P(E). Attribute filters are based on connectivity openings or connected openings (Serra 1988). These are related to connectivity classes or connections. Informally, in the binary case these are the collections of all the connected subsets of E. More formally, a connection is defined as follows.

1. DEFINITION. A connectivity class C ⊆ P(E) is a set of sets with the following two properties:

1. ∅ ∈ C and {x} ∈ C for all x ∈ E
2. for each family {Ci} ⊂ C, ∩Ci ≠ ∅ implies ∪Ci ∈ C.


This means that singletons and the empty set are considered to be connected, and that any collection of connected sets which have a non-empty intersection has a connected union. Connectivity openings Γx for any x ∈ E can be defined as follows.

2. DEFINITION. The binary connectivity opening Γx of X at a point x ∈ E is given by

    Γx(X) = ∪ {Ci ∈ C | x ∈ Ci ∧ Ci ⊆ X}   if x ∈ X,
    Γx(X) = ∅                              otherwise.    (6.1)

Thus connectivity openings return the largest possible connected set containing x which is still a subset of X, if x ∈ X, and ∅ otherwise. In this way, the family of operators {Γx, x ∈ E} extracts the connected components of X. These connected components are either retained or removed using a trivial filter ΨΛ, based on attribute criterion Λ, defined as

    ΨΛ(C) = C if Λ(C) is true, and ∅ otherwise,    (6.2)

with C the connected component. In most cases, the attribute criterion (Breen and Jones 1996) has the form

    Λ(C) = (Attr(C) ≥ λ),    (6.3)

with Attr(C) some real-valued attribute of C, and λ the attribute threshold. The complete attribute filter ΨΛ based on criterion Λ can then be defined as

    ΨΛ(X) = ∪_{x ∈ E} ΨΛ(Γx(X)).    (6.4)

We can see that the output of the attribute filter is the union of all connected components which meet Λ. The above operator is an attribute opening only if the criterion Λ is increasing, i.e. if C ⊆ D then Λ(C) implies Λ(D) (Breen and Jones 1996). In all other cases it is an attribute thinning. The dual operators of attribute openings and thinnings are the attribute closings and thickenings, respectively, which are defined as

    ΦΛ(X) = (ΨΛ(X^c))^c.    (6.5)

These remove connected background components which do not meet Λ. Connected operators can readily be extended to gray-scale images (Cheng and Venetsanopoulos 1992, Vincent 1993, Breen and Jones 1996, Salembier et al. 1998). In gray scale, connected filters work at the level of connected components of level-set images Lh, defined as

    Lh(f) = {x ∈ E | f(x) = h}.    (6.6)


Because Lh is a binary image, we can define connected components and connectivity openings as before. The connected components Lhk at level h are called flat zones. The flat zones partition the image domain E, because the union of the flat zones is E, and the intersection of any two different flat zones is empty. According to the definition in (Salembier and Serra 1995), a gray level operator ψ is connected if, for any image f, the partition of E into flat zones of f is finer than the partition into flat zones of its transformation ψ(f). This means that any flat zone of f is contained entirely in a single flat zone of ψ(f). The simplest way to extend increasing morphological filters to gray scale is through threshold decomposition (Maragos and Ziff 1990), and this principle has been applied to attribute filters as well (Breen and Jones 1996, Cheng and Venetsanopoulos 1992, Vincent 1993). The principle works by thresholding the image at all possible grey levels, then applying the filter to each level, and finally stacking the results. The threshold set Th at level h is defined as

    Th(f) = {x ∈ E | f(x) ≥ h}.    (6.7)

The grey scale variant ψΛ of any increasing, binary attribute filter ΨΛ can be defined as

    ψΛ(f)(x) = sup{h | x ∈ ΨΛ(Th(f))}.    (6.8)

This means that for each pixel the highest gray value h is chosen such that x is still a member of a connected component in the threshold set which meets the criterion Λ (Breen and Jones 1996). Connected components Phk of threshold sets Th are referred to as peak components (Meijster and Wilkinson 2002). In the non-increasing case, (6.8) can be used directly, or it can be replaced by other filtering rules (Breen and Jones 1996, Salembier et al. 1998, Urbach et al. 2007). Many of these are based on the Max-tree or component tree structure discussed in Section 6.2.3.

6.2.2 Vector Attribute Filters

Using a single value to characterize each component is useful if the desired structures can be separated easily from the undesired structures. In the case of noise filtering, small, low-contrast structures can readily be removed using area (Cheng and Venetsanopoulos 1992, Vincent 1993) or volume (Vachier 1998). Blood vessel segmentation in 3-D angiography has been achieved using a 3-D counterpart of Hu's first moment invariant (Wilkinson and Westenberg 2001, Westenberg et al. 2007). In general, however, a single value has comparatively little discriminating power. For this reason, vector-attribute filters have been proposed by Urbach et al. (Urbach et al. 2005). Urbach et al. (Urbach et al. 2005) start out by considering multivariate attribute thinnings. A multivariate attribute thinning Φ^{Λi}(X) with a set of scalar attributes {τi} and


their corresponding criteria {Λi}, with 1 ≤ i ≤ N, is defined such that connected components are preserved if at least one of the criteria Λi(C) is met, and are removed otherwise:

    Φ1^{Λi}(X) = ∪_{i=1}^{N} Φ^{Λi}(X),    (6.9)

or alternatively, components are preserved if they meet all of the criteria, yielding

    Φ2^{Λi}(X) = ∩_{i=1}^{N} Φ^{Λi}(X).    (6.10)

This does not really add anything new, in the sense that each property is considered in isolation, and that if the criteria have the form

    Λi(C) = (τi(C) ≥ λi),    (6.11)

with λi the attribute thresholds, we have N parameters to set, which is cumbersome. In the feature space spanned by the attribute values, we can only select or reject objects in some oblong region. Alternatively, the set of scalar attributes {τi} can also be considered as a single vector-attribute ~τ = (τ1, τ2, ..., τN), in which case the multivariate attribute thinning in (6.9) can be defined using a vector criterion

    Λ^~τ_~λ(C) = ∃i : τi(C) ≥ λi,  for 1 ≤ i ≤ N,    (6.12)

with ~λ the attribute threshold vector. Given the reasoning above, a more useful criterion, given by (Urbach et al. 2005), is

    Λ^~τ_{~r,ε}(C) = d(~τ(C), ~r) ≥ ε,    (6.13)

in which ~r is a reference vector, ε is a dissimilarity threshold, and d : R^N × R^N → R is a dissimilarity measure, i.e. d(~a, ~b) ≥ 0, and d(~a, ~b) = 0 implies ~a = ~b. Inserting (6.13) into (6.4) yields the binary vector-attribute thinning Φ^~τ_{~r,ε}(X). This removes the connected components of a binary image X whose vector-attributes differ less than a given quantity from a reference vector ~r ∈ R^N. Alternatively, as suggested by (Naegel and Passat 2009), we change the inequality in (6.13) to

    Λ^~τ_{~r,ε}(C) = d(~τ(C), ~r) ≤ ε,    (6.14)

in which case we retain the objects which are more similar to a given reference vector than a given threshold. This is useful if we want to detect structures or objects in images rather than remove them. In the following we will use (6.14) for detection of traffic signs. The simplest way to obtain useful reference vectors is by computing the vector attributes for some reference object S, i.e.

    Λ^~τ_{S,ε}(C) = d(~τ(C), ~τ(S)) ≤ ε.    (6.15)

Finally, as remarked in (Urbach et al. 2005), we can use multiple reference objects {Si}, and use the criterion

    Λ^~τ_{{Si},ε}(C) = ∧_i ( d(~τ(C), ~τ(Si)) ≤ ε ),    (6.16)

i.e., the lowest dissimilarity determines whether or not a connected component is preserved. Any dissimilarity measure d can be used to quantify the difference between the two vectors. The most obvious choices suggested in (Urbach et al. 2005) are distances in R^N. In our software, we implemented the L1, L2 and L∞ norms of ~τ(C) − ~τ(Si). When multiple references are available (at least N + 1), the Mahalanobis distance to the centroid of the distribution of references can also be used (Naegel and Passat 2009). Key issues in developing vector-attribute filters are selecting the most appropriate feature space by choosing the right ~τ, and determining the dissimilarity threshold ε. Vector attribute filters can be extended to gray scale in the same way as regular attribute filters. The main algorithmic approaches are the pixel-queue algorithm (Breen and Jones 1996, Jones 1999, Vincent 1993), the Max-tree or component-tree approach (Salembier et al. 1998, Najman and Couprie 2006) and the union-find method (Meijster and Wilkinson 2002, Tarjan 1975). Because most vector-attributes used here are based on shape, and are therefore scale-invariant, we must use an approach which deals well with non-increasing attributes (Urbach et al. 2007). This means the Max-tree approach is best (Meijster and Wilkinson 2002, Salembier and Wilkinson 2009).

6.2.3 The Max-tree Approach

The Max-tree image representation arranges all peak components of the image into a tree with the root node representing the image domain E. Each leaf corresponds to a regional maximum. Once the tree is constructed, it can be processed further by modifying the tree (Urbach and Wilkinson 2002, Salembier et al. 1998). The filtering process is separated into three stages: construction, filtering and restitution. During the construction phase, the Max-tree is built from the flat zones of the image, collecting auxiliary data used for computing the node attributes at a later stage. Several algorithms to compute Max-trees have been published, for details see (Salembier et al. 1998, Najman and Couprie 2006, Wilkinson et al. 2008). Once the attribute values have been stored in the Max-Tree nodes, comparisons are applied to decide whether or not they should be retained. Filtering is performed by

6. Color Vector-Attribute Filtering: an Application to Traffic-Sign Recognition


Figure 6.1: Original image (a), corresponding peak components (b) and the resultant max-tree (c).

identifying and removing the nodes that do not fulfil the attribute criterion Λ (Salembier et al. 1998, Urbach et al. 2007, Ouzounis and Wilkinson 2010). The final phase is restitution, which consists of transforming the modified Max-tree into an output image. The simplest, or Direct, rule restitutes an image by direct implementation of (6.8). In this case, if a node has been preserved, it retains its original gray value. If a node has been removed, it is assigned the gray level of the nearest preserved ancestor in the tree (Salembier et al. 1998). As a result, the gray-level values of the original image are assigned to the pixels of the preserved nodes, and no new gray levels appear in the image. Within restitution, a decision also has to be made concerning the descendants of a removed node. The Min decision removes a node if any of its ancestors is removed (Breen and Jones 1996), Max preserves a node if any ancestor is preserved (Breen and Jones 1996), while Viterbi uses the Viterbi optimization algorithm to select a correct pruning point in a branch of the tree (Salembier et al. 1998). The Subtractive rule lowers a node's descendants by the same amount as the node itself. Figure 6.1 illustrates an artificial image with six connected components, each labeled with a letter and a flat-zone value. The corresponding peak components at levels h = 0, 1, 2, 3 are shown in (b), and the resultant Max-tree in (c). Filters based on the Max-tree remove or detect bright features in the image. To remove or detect dark features, a Min-tree needs to be constructed (Salembier et al. 1998), in which the order is reversed, i.e., the leaves are the regional minima. The simplest way to construct these is by inverting the image, performing Max-tree-based processing, and inverting the result.
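The Direct restitution rule can be sketched on a toy tree (illustrative Python, not the thesis software; the Node fields are our assumptions, and a real Max-tree also stores auxiliary attribute data per node):

```python
# Minimal sketch of the Direct restitution rule on a toy tree.

class Node:
    def __init__(self, level, parent=None):
        self.level = level    # gray level h of the node
        self.parent = parent  # nearest ancestor in the tree (None for root)
        self.keep = True      # outcome of the attribute criterion Lambda

def restitute(node):
    """Direct rule: a removed node takes the gray level of its nearest
    preserved ancestor, so no new gray levels appear in the output."""
    while not node.keep:
        node = node.parent    # the root is assumed to be preserved
    return node.level

root = Node(0)        # component at level 0
b = Node(1, root)     # component at level 1
c = Node(2, b)        # component at level 2

c.keep = False
print(restitute(c))   # c is lowered to the level of b -> 1
b.keep = False
print(restitute(c))   # now falls back to the root -> 0
```

The Min, Max, Viterbi and Subtractive decisions differ only in how `keep` is propagated along a branch before this lookup is performed.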

6.2.4 Color Max-trees

Color Max-trees are more difficult to construct, because the entire notion of a Max-tree is based on thresholding, and hence on a total order (Naegel and Passat 2009, Angulo 2007, Tushabe and Wilkinson 2010). Defining a meaningful total order for color connected operators is not trivial, because multi-variate color information is not inherently ordered. Color processing using Max-trees can be achieved through marginal processing, in which the individual components are processed separately and the results combined later. This kind of ordering has been found to ignore the intricate inter-component dependencies and to cause unexpected color changes in the resulting images (Naegel and Passat 2009). Therefore, other vectorial orderings have been proposed. Lexicographical (or conditional, or hierarchical) orderings prioritize some components over others and execute comparisons beginning with the first-priority component. In (Angulo and Serra 2003), priority is given to the red component, followed by the green and blue components. In (Yu et al. 2004), it is the lightness, saturation and hue, in order of priority. In lexicographical ordering, one component is more important than the others, which can lead to counterintuitive results. Alternatively, we can use a total preorder, e.g. by deriving a scalar value from the 3-D vectorial components (Naegel and Passat 2009, Angulo 2007), and performing the processing based on this scalar image, as was done in component-tree-based image processing in (Naegel and Passat 2009, Tushabe and Wilkinson 2010). A total preorder on our color space T is any binary relation ≤ which is

1. reflexive: a ≤ a is true,
2. transitive: a ≤ b ∧ b ≤ c ⇒ a ≤ c,
3. total: (a ≤ b) ∨ (b ≤ a) is true.

In order for a total preorder to become a total order, an extra property is needed:

4. antisymmetric: (a ≤ b) ∧ (b ≤ a) ⇒ a = b.
In practice this means that in a Max-tree based on a total preorder, rather than a total order, pixels with different colors may end up in the same node. Care must be taken when restituting the image to ensure color fidelity, as discussed in (Tushabe and Wilkinson 2010). In the following we use software developed in (Tushabe and Wilkinson 2010), and adapt it to vector-attribute filtering.
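The difference between a total preorder and a total order can be demonstrated with a small sketch (illustrative Python, not the thesis code), using a luminance-based scalar quantized to integer gray levels as the order image: two distinct colors may receive the same order value, and would then share a Max-tree node.

```python
# Sketch of a total preorder on RGB colors induced by a scalar order image
# (quantized luminance, eq. (6.19)); names and example colors are ours.

def beta(c):
    r, g, b = c
    return int(0.299 * r + 0.587 * g + 0.114 * b)   # quantized luminance

def leq(a, b):
    """a <= b in the preorder iff beta(a) <= beta(b)."""
    return beta(a) <= beta(b)

red, green = (100, 0, 0), (0, 50, 0)

print(leq(red, red))                       # reflexivity: True
print(leq(red, green) or leq(green, red))  # totality: True
# antisymmetry fails: both orderings hold, yet the colors differ, so
# these two colors would end up in the same Max-tree node:
print(leq(red, green) and leq(green, red), red == green)  # True False
```

This is exactly why the restitution step must take extra care to preserve color fidelity: a node's pixels need not all share one color.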

6.3 The Color-Vector Attribute Filter

This work proposes a filter that enables processing of a multi-variate image according to its color information. It extends the color Max-tree described in (Tushabe and Wilkinson 2010) to vector attributes, and includes filtering based on color and shape similarities. The main extensions concern the construction and the filtering of the Max-tree based on color information.

6.3.1 The Preorders

As in (Tushabe and Wilkinson 2010), we build a Max-tree by computing a scalar image from the color image, using luminance, saturation, or combinations of the two, combined with the usual total order on gray-scale pixels. This scalar image is used to drive the flooding algorithm from (Salembier et al. 1998), adapted in such a way that color information is stored in each Max-tree node (Tushabe and Wilkinson 2010). In this way, the structure of the Max-tree is dictated by the order image, whereas the original color image is used to store the mean R, G and B values of the pixels within each node Chk. Luminance is the brightness or intensity of the color, while saturation is how concentrated a color is, i.e. how strongly it deviates from grey, for a given luminance. This means that if we use luminance to define the preorder on the color data, we build a Max-tree with the same topology as in the case of an ordinary gray-scale image of the same scene. Bright features end up in the leaves of the Max-tree, and conversely, dark features end up in the leaves of the Min-tree. By contrast, if we use saturation as the basis for a preorder, a primary red feature may end up in a Max-tree leaf, but a white feature would end up in the root of the tree, or in the leaves of the Min-tree. If we want to use shape to characterize features, we must try to ensure these features are near the leaves of the Max-tree, so that their shape is unchanged by merger with other image features. If we know the color of the feature as well, we can facilitate detection by choosing the appropriate order. An example can be seen in Fig. 6.2, which shows a reference image of a traffic sign. The most distinctive white component, shown in Fig. 6.2(b), is best detected using the luminance preorder, and is particularly suited to shape-based filtering. By contrast, the highly saturated blue component shown in Fig. 6.2(c) is best detected using a saturation-based order, and is best used for color-based filtering, though its shape can also be used for filtering. In (Tushabe and Wilkinson 2010), lexicographic ordering of luminance and saturation was tested to determine whether filtering could be improved by combining the two in a single preorder. The results showed that the most significant component completely dominated the result, and the additional information given by the least significant component was negligible. Therefore we explore a different combination of the two in


this work. One problem is that saturation or chromaticity information becomes unreliable at low luminance values, due to a poor signal-to-noise ratio. We can therefore create a new scalar image βc from the luminance and saturation images, by using luminance at low luminance levels and switching to saturation at high luminance levels, as follows:

βc = ws(L) S + (1 − ws(L)) L    (6.17)

with L the luminance, S the saturation, and ws(L) a luminance-dependent, sigmoidal weight function given by

ws(L) = L^p / (L^p + K^p)    (6.18)

with K a tunable balance point, at which ws = 0.5, and p a parameter determining the steepness of the sigmoidal function. In our case we selected p = 6, quite arbitrarily. This definition of βc allows us to combine the two components in a flexible way. Some color spaces have an inbuilt luminance value, like the first component of the L*a*b*, YIQ, or YCbCr color spaces, and the third component of the HSL space. For the color spaces that do not, such as XYZ and RGB, we define luminance as a weighted average of the three components. For example, the luminance of an RGB image is defined as:

L = 0.299 R + 0.587 G + 0.114 B,    (6.19)

which is the same as in the HSL color space. Some color spaces have an inbuilt saturation value, like the second component of the HSV or HSL color space. For classical tristimulus spaces such as RGB, we define saturation as:

S = 0 if max(R, G, B) = 0,
S = (max(R, G, B) − min(R, G, B)) / max(R, G, B) otherwise,    (6.20)

and similarly for XYZ. For the luminance/chromaticity spaces L*a*b* and YCbCr we use the Euclidean norms of the chromaticity vectors (Cr, Cb)^T and (a*, b*)^T.

6.3.2 Using Color-Vector Attributes

As stated before, each Max-tree node contains the mean R, G, and B values of the pixels in node Chk, i.e. the pixels in the peak component Phk for which the preorder scalar value equals h. Thus, any pixels within this peak component with higher preorder values are not included in this average. In (Tushabe and Wilkinson 2010), this color information is only used for restitution purposes. Here we use it to compute color attribute vectors for each node. This is done in a single pass through the Max-tree, computing the desired color attribute vector for each node. These color attribute vectors are computed



Figure 6.2: An example reference traffic-sign image: (a) original image; (b) reference shape for most distinctive white component; (c) reference shape for most distinctive blue component.

by converting the RGB data to one of six color spaces: RGB, CIE XYZ, HSV, CIE L*a*b*, YCrCb and YIQ. To reduce the effect of variability in exposure and illumination, the luminance component in the latter three color spaces can be given a lower weight in the dissimilarity measure. For details on color spaces see (HunterLab 2008, Tushabe and Wilkinson 2010). Once the Max-tree has been constructed, node pruning is based on how similar the color vector of a component is to a reference color vector. This approach is inspired by (Urbach et al. 2005), and the Max-tree filtering method proceeds as follows:

1. Read the reference image and derive its corresponding reference vector ~r. In these experiments, we offer the program two images: a color traffic-sign reference image as in Figure 6.2(a), and a binary image containing only one connected component, as in Figure 6.2(b) and (c). The program then computes ~r from the color values of the pixels corresponding to the white pixels in the binary image.

2. Derive the associated color vectors ~τc of all the connected components represented in the Max-tree of the input image.

3. Prune the tree based on the similarity of the vectors ~r and ~τc. Similarity of the two vectors is measured using the Euclidean distance. Nodes are preserved when a component has a dissimilarity value less than the desired similarity threshold ǫ.

In all cases we use the direct filtering rule, which implements (6.8).
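The three steps above can be illustrated on toy data (a Python sketch of our own; the flat pixel layout, mask and example values are assumptions, not thesis data):

```python
import math

def reference_vector(color_img, mask):
    """Step 1: mean (R, G, B) over the white pixels of the binary mask."""
    pix = [c for c, m in zip(color_img, mask) if m]
    n = len(pix)
    return tuple(sum(p[i] for p in pix) / n for i in range(3))

def euclid(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def prune(node_vectors, ref, eps):
    """Step 3: keep the nodes whose color vector lies within eps of ref."""
    return [i for i, tau in enumerate(node_vectors) if euclid(tau, ref) < eps]

# toy 4-pixel "image" with a mask selecting the last two pixels
img  = [(0.9, 0.1, 0.1), (0.1, 0.1, 0.9), (0.0, 0.4, 1.0), (0.0, 0.4, 0.9)]
mask = [0, 0, 1, 1]
ref  = reference_vector(img, mask)            # approx (0.0, 0.4, 0.95)

nodes = [(0.05, 0.4, 0.9), (0.9, 0.0, 0.0)]   # hypothetical node color vectors
print(prune(nodes, ref, eps=0.2))             # only the bluish node: [0]
```

In the real filter, `nodes` corresponds to the per-node mean colors gathered during Max-tree construction, and step 2 is the single pass over the tree described above.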

6.3.3 The Shape Attributes

For comparison purposes, traffic-sign recognition based on four shape filters has been tested. For speed of computation, all are based on moment invariants, which are amongst


the most classical shape descriptors (Hu 1962). The moments used are described in the following.

Normalized Central Moments

The normalized central moments of order p + q (Gonzalez and Woods 2002) are translation and scale invariant, and are derived from the central moments, which are defined as:

µpq = Σ_{x=0..X−1} Σ_{y=0..Y−1} (x − x̄)^p (y − ȳ)^q f(x, y)    (6.21)

with X and Y the width and height of the image, and x̄ = m10/m00 and ȳ = m01/m00 the coordinates of the centroid, where mpq represents the geometric moment of order (p, q), defined as:

mpq = Σ_{x=0..X−1} Σ_{y=0..Y−1} x^p y^q f(x, y)    (6.22)

Central moments are translation invariant, but not scale invariant. For reasons of efficiency, we compute central moments directly from geometric moments. The reason for this is that geometric moments can be computed very efficiently within the Max-tree attribute management scheme, because any geometric moment of the union of two disjoint regions is simply the sum of the moments of each region. This is not the case for central moments, due to the shift in the center of mass. Once the geometric moments have been computed for each node of the Max-tree, simple post-processing can be used to compute the central moments from the geometric moments (Gonzalez and Woods 2002). The normalized central moments can be computed from the central moments as:

ηpq = µpq / µ00^γ    (6.23)

where γ = (p + q)/2 + 1.
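Equations (6.21)-(6.23), and the additivity of geometric moments that makes them cheap to maintain during Max-tree construction, can be sketched as follows (illustrative Python of our own, for a binary shape given as a list of foreground pixels with f = 1):

```python
# Illustrative sketch of eqs. (6.21)-(6.23); helper names are ours.

def geometric_moment(pixels, p, q):
    """Eq. (6.22): m_pq as a plain sum over foreground pixels."""
    return float(sum(x ** p * y ** q for x, y in pixels))

def central_moment(pixels, p, q):
    """Eq. (6.21), via the centroid obtained from geometric moments."""
    m00 = geometric_moment(pixels, 0, 0)
    xb = geometric_moment(pixels, 1, 0) / m00
    yb = geometric_moment(pixels, 0, 1) / m00
    return sum((x - xb) ** p * (y - yb) ** q for x, y in pixels)

def normalized_central_moment(pixels, p, q):
    """Eq. (6.23): eta_pq = mu_pq / mu_00^gamma, gamma = (p + q)/2 + 1."""
    gamma = (p + q) / 2 + 1
    return central_moment(pixels, p, q) / central_moment(pixels, 0, 0) ** gamma

square  = [(x, y) for x in range(4) for y in range(4)]
shifted = [(x + 7, y + 3) for x, y in square]

# central moments are translation invariant:
print(central_moment(square, 2, 0) == central_moment(shifted, 2, 0))  # True

# geometric moments are additive over disjoint regions, which is what
# makes them cheap to update when Max-tree nodes are merged:
left  = [p for p in square if p[0] < 2]
right = [p for p in square if p[0] >= 2]
print(geometric_moment(square, 2, 1) ==
      geometric_moment(left, 2, 1) + geometric_moment(right, 2, 1))   # True
```

Exact scale invariance of ηpq holds only in the continuous domain; on small discrete shapes it is approximate.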

Hu’s moment invariants Hu’s moment invariants (Hu 1962) are invariant to rotation, scaling and translation. They consist of seven moment invariants that are derived from the normalized central

82

6. Color Vector-Attribute Filtering: an Application to Traffic-Sign Recognition

moments and defined as (Hu 1962): φ1 = η20 + η02 , 2 φ2 = (η20 − η02 )2 + 4η11 ,

φ3 = (η30 − 3η12 )2 + (3η21 − η03 )2 ,

φ4 = (η30 + η12 )2 + (η21 + η03 )2 ,

φ5 = (η30 − 3η12 )(η30 + η12 )[(η30 + η12 )2 − 3(η21 + η03 )2 ]+ (3η21 − η03 )(η21 + η03 )[3(η30 + η12 )2 − (η21 + η03 )2 ],

φ6 = (η20 − η02 )[(η30 + η12 )2 − (η21 + η03 )2 + 4η11 (η30 + η12 )(η21 + η03 )],

φ7 = (3η21 − η03 )(η30 + η12 )[(η30 + η12 )2 − 3(η21 + η03 )2 ]+ (η30 − 3η12 )(η21 + η03 )[3(η30 + η12 )2 − (η21 + η03 )2 ].

(6.24)
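For illustration, the first two of Hu's invariants can be checked numerically for rotation invariance (a Python sketch of our own, treating a small cloud of unit-mass points as the shape; not the thesis implementation):

```python
import math

# Sketch: phi_1 and phi_2 of eq. (6.24) for unit-mass points, with a
# numerical check of rotation invariance; helper names are ours.

def eta(points, p, q):
    """Normalized central moment eta_pq of a unit-mass point cloud."""
    n = len(points)
    xb = sum(x for x, _ in points) / n
    yb = sum(y for _, y in points) / n
    mu = sum((x - xb) ** p * (y - yb) ** q for x, y in points)
    return mu / n ** ((p + q) / 2 + 1)

def hu_phi12(points):
    e20, e02, e11 = eta(points, 2, 0), eta(points, 0, 2), eta(points, 1, 1)
    return e20 + e02, (e20 - e02) ** 2 + 4 * e11 ** 2

def rotate(points, theta):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

blob = [(0, 0), (4, 0), (4, 1), (1, 3), (2, 2)]
a1, a2 = hu_phi12(blob)
b1, b2 = hu_phi12(rotate(blob, 0.7))
print(abs(a1 - b1) < 1e-12 and abs(a2 - b2) < 1e-12)   # True
```

Rotation invariance is exact here because φ1 and φ2 are the trace and squared eigenvalue difference of the second-moment matrix, both unchanged by rotation; on pixel grids it only holds up to resampling error.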

Note that Hu’s moment invariants are limited to orders 2 and 3. Complex moment invariants Complex moments (Flusser and Suk 2003) can be defined for any order, and are defined as ∞ X ∞ X cpq = (x + iy)p (x − iy)q f (x, y) (6.25) −∞ −∞

where i denotes the imaginary unit. For our purposes, a more efficient representation is one using the normalized central moments ηpq from (6.23), i.e.,

cpq = Σ_{k=0..p} Σ_{j=0..q} (p choose k)(q choose j) (−1)^(q−j) · i^(p+q−k−j) · η_{k+j, p+q−k−j}.    (6.26)

Note that this differs from the definition given by Flusser and Suk (Flusser and Suk 2003). In (Flusser and Suk 2003) the notation suggests that the ordinary geometric moments from (6.22) should be used, which yields neither scale nor translation invariance. Flusser and Suk show that the cpq given by (6.26) are rotation invariant by expressing (6.26) in polar coordinates. Equation (6.26) yields an infinite number of moment invariants of any order, but not all are independent. Flusser and Suk go on to construct a basis of independent complex moment invariants. Considering complex moments up to some order r ≥ 2, an invariant basis set B can be constructed as follows:

Φ(p, q) ≡ cpq c_{q0 p0}^(p−q) ∈ B    ∀p, q | p ≥ q ∧ p + q ≤ r,    (6.27)

where p0 and q0 are arbitrary indices such that p0 + q0 ≤ r, p0 − q0 = 1, and c_{p0 q0} ≠ 0. When using, for instance, p0 = 2 and q0 = 1, the basis invariants of the second and third order are:

Φ(1, 1) = c11,
Φ(2, 1) = c21 c12,
Φ(2, 0) = c20 c12²,
Φ(3, 0) = c30 c12³.    (6.28)

As noted in (Flusser and Suk 2003), Hu's invariants can be expressed in terms of complex moments as follows:

φ1 = c11,
φ2 = c20 c02,
φ3 = c30 c03,
φ4 = c21 c12,
φ5 = Re(c30 c12³),
φ6 = Re(c20 c12²),
φ7 = Im(c30 c12³),    (6.29)

where Re(·) and Im(·) denote the real and imaginary parts, respectively. This observation means that complex moment invariants should have the same expressive power as those of Hu, when limited to orders 2 and 3. Complex moments can be extended to higher orders, however.

Affine moment invariants

In (Flusser and Suk 1993), Flusser and Suk derive moment invariants which are invariant under general affine transformations, given by

u = a0 + a1 x + a2 y,
v = b0 + b1 x + b2 y,    (6.30)

in which (x, y) and (u, v) are the coordinates before and after the transformation, respectively. Any affine transformation can be composed of zero or more linear transformations, such as scaling, rotation, shearing, and translation. The four invariants up to


order three are

I1 = (µ20 µ02 − µ11²) / µ00⁴,

I2 = (µ30² µ03² − 6 µ30 µ21 µ12 µ03 + 4 µ30 µ12³ + 4 µ03 µ21³ − 3 µ21² µ12²) / µ00¹⁰,

I3 = (µ20 (µ21 µ03 − µ12²) − µ11 (µ30 µ03 − µ21 µ12) + µ02 (µ30 µ12 − µ21²)) / µ00⁷,

I4 = (µ20³ µ03² − 6 µ20² µ11 µ12 µ03 − 6 µ20² µ02 µ21 µ03 + 9 µ20² µ02 µ12² + 12 µ20 µ11² µ21 µ03 + 6 µ20 µ11 µ02 µ30 µ03 − 18 µ20 µ11 µ02 µ21 µ12 − 8 µ11³ µ30 µ03 − 6 µ20 µ02² µ30 µ12 + 9 µ20 µ02² µ21² + 12 µ11² µ02 µ30 µ12 − 6 µ11 µ02² µ30 µ21 + µ02³ µ30²) / µ00¹¹.    (6.31)

More invariants can be derived; however, the higher-order invariants are not all independent. Experiments on the recognition of satellite images show that affine moments can also be used for projectively deformed objects (Flusser and Suk 1993), which may be relevant to the problem of traffic-sign detection whenever the traffic signs are not seen face-on.

6.4 Experiments

We implemented the shape and color vector-attribute filters in a single C program, with a graphical user interface developed in fluid/fltk. The GUI was customized to enable a user to easily select the dissimilarity threshold ǫ, the luminance weight, the tipping point K, the color space used (RGB, XYZ, HSV, YCrCb, YIQ, and L*a*b*), the moment invariants (normalized central, Hu, complex and affine), and the number of orders, where applicable. Initial tests showed that many false positives occurred if all nodes of a Max-tree were used. This was mainly due to very many small nodes, usually caused by noise, or by small texture features in foliage, etc. This affected the shape-attribute filters more than the color attributes, though when looking for red structures, the rear lights of cars could cause similar false positives. Therefore, we introduced an area threshold, set by default to 10 pixels, on the nodes to be considered by the vector-attribute filters. This has two advantages: first, we remove many spurious responses to noise and texture features, and second, we considerably reduce the number of nodes on which we need to compute the vector attributes and dissimilarity measure, leading to a modest speed increase. The image database consisted of 48 PPM/PGM images obtained from (Grigorescu and Petkov 2003), see Figure 6.3. The images are divided into three equal classes, containing the traffic signs mandatory bike lane, crossing with right of way, and pedestrian crossing, respectively, some of which have multiple variants, as shown in Figure 6.3.



Figure 6.3: The reference images of traffic signs, and examples from the database: (a) mandatory bike lane, (b) crossing with right of way (three variants), (c) pedestrian crossing, and (d) and (e) example images from the database from (Grigorescu and Petkov 2003), showing multiple traffic signs.

A ground truth was manually identified for all the images, by marking a region of interest (ROI) containing the expected traffic sign. A true positive occurs when a node in the ROI is correctly identified by the proposed filter; the true positive rate (TPR) is defined as

TPR = I / J    (6.32)

where I is the total area of the true positive nodes identified in the ROI and J is the total area of true positive nodes possible. Similarly, the false positive rate (FPR) is defined as

FPR = I′ / J′    (6.33)

where I′ is the total area of the false positive nodes identified and J′ is the total area of false positive nodes possible in the whole image. The experiments tested the performance of the proposed color filter and compared it with a shape filter implemented using the same method.
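Equations (6.32) and (6.33) can be sketched on toy node data (illustrative Python; the areas, dissimilarities and ROI split are invented for the example):

```python
# Sketch of eqs. (6.32)-(6.33) over a list of (area, dissimilarity, in_roi)
# node records; all values below are toy data, not thesis measurements.

def rates(node_records, eps):
    """TPR = detected true-positive area / total true-positive area,
       FPR = detected false-positive area / total false-positive area."""
    tp = tp_all = fp = fp_all = 0
    for area, d, in_roi in node_records:
        if in_roi:
            tp_all += area
            if d < eps:
                tp += area
        else:
            fp_all += area
            if d < eps:
                fp += area
    return tp / tp_all, fp / fp_all

node_records = [(120, 0.05, True), (30, 0.60, True),    # sign components (ROI)
                (500, 0.70, False), (40, 0.08, False)]  # background components
tpr, fpr = rates(node_records, eps=0.10)
print(tpr, fpr)   # 0.8 and about 0.074
```

Sweeping ǫ over [0, 1] and plotting the resulting (FPR, TPR) pairs yields the ROC curves used in the comparisons below.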



Figure 6.4: Image filtered using RGB reference color (0, 102, 255): (a) input image; (b) filtered at ǫ = 0.10; (c) filtered at ǫ = 0.38.

6.4.1 Results

Color Vector-Attribute Filters

As the dissimilarity threshold ǫ increases, the color filter first quickly recognizes the components that are most similar to the reference color. Figure 6.4 shows an input image containing a blue traffic sign, objects of lighter shades of blue (the balconies), and objects of other colors. It can be observed that among the first components detected at ǫ = 0.10 is the blue traffic sign, followed by the other blue objects at ǫ = 0.38. At ǫ = 0.10, recognition of the traffic sign is not perfect, due to specular reflections in the traffic sign. As the dissimilarity threshold increases, the other components are also identified. A comparison of the luminance and saturation preorders was conducted. The results show that, for this data set, the color filter using the saturation preorder detects the desired structures at a lower threshold (ǫ = 0.10) than when using the luminance preorder. This can be explained by the fact that in the saturation preorder the structures of interest are in the


leaves of the tree, because these traffic-sign structures were deliberately designed with highly saturated colors. When the structures are not in the leaves of the tree, they are more likely to be contaminated with other colors, because we use the mean R, G, and B values of each node to compute the final color attribute vector. Figure 6.5 shows the difference between the preorders, using red and blue reference colors to identify image components based on pure luminance and pure saturation. Figure 6.5(b) shows how the luminance filter recognized the red parts of a traffic sign at ǫ = 0.57, while in Figure 6.5(c) saturation identified it at ǫ = 0.3. The same holds for blue traffic signs in Figure 6.5(d) and (e), respectively. In Figure 6.5(c), red parts of the cabin of the truck are also detected, unsurprisingly indicating that color alone is unable to detect traffic signs unambiguously. A curious feature of the filter is that when using the luminance preorder, the filter appears to detect structures nearer the leaves which are in fact white. This is because the color vector attribute is computed only from the pixels belonging to the node itself, without mixing in any color information from peak components nested within it. Thus, when light blue pixels at the edge of the white region together form a peak component, only the color of the lowest-luminance pixels is included in the color vector attribute. Therefore, when filtering based on color, they are among the first to be picked up. The saturation attribute calculated from six types of color images was also tested; these are from the RGB, HSV, XYZ, YIQ, YCbCr and L*a*b* color spaces. The 48 images were filtered at similarity levels ranging over 0 ≤ ǫ ≤ 1 and their performance quantified by plotting ROC curves. The results are summarized in Figure 6.6, which shows the FPR vs. TPR of all the color spaces. The best performance was registered by the HSV color space, which has an AUC of 0.84, followed by RGB at AUC = 0.82, while the worst performance was by XYZ at AUC = 0.615.

Comparison with Shape Filters

Figure 6.8 shows an image which is filtered using shape and color at a dissimilarity threshold ǫ = 0.46. It can be observed that the shape filter accurately recognizes the traffic signs based on their shapes. A comparison of the four shape filters, using Hu, complex and affine moment invariants as well as the normalized central moments, was conducted. The best performance was registered by the normalized central moments with an AUC = 0.71, followed by the Hu and complex moments at AUC = 0.64, and the affine moments at AUC = 0.62. Details are illustrated in Figure 6.7. These results mean that, with the current implementation, the color filter performs better than the shape filter, since the best color filter gives an AUC = 0.84 in comparison with the best shape filter at AUC = 0.71. Additionally, the results show that the shape filter is better suited to the luminance preorder for the Max-tree, unlike the color



Figure 6.5: Color filtering using different preorders: (a) input image; (b) using luminance preorder, and reference color (R, G, B) = (255, 0, 0); (c) using saturation preorder and same reference color; (d) and (e) same as (b), and (c) but for reference color (R, G, B) = (0, 102, 255).

filter which thrives better with saturation preorder.



Figure 6.6: Comparison of the different color spaces

6.4.2 Combining Color and Shape

We tried a simple combination of color and shape filtering by concatenating the attribute vectors and assigning equal weight to shape and color information. We used a compound scalar according to (6.17) to provide the preorder, and tested the filters using the best shape and color attributes from the above comparison. The tipping-point constant K was set to 256, which means the order is largely based on luminance, and that only at the highest intensities does saturation start influencing the ordering. The results can be seen in Figure 6.9. As can be seen, this simple scheme to combine both forms of information performs worse than color alone. A more sophisticated method of combining shape and color information is needed.


Figure 6.7: Results from the four shape filters

6.5 Conclusions and future work

This work proposes a color vector-attribute filter that identifies image components based on their colors. To deal with the lack of order in color spaces, the color vector-attribute filter uses a preorder based on a scalar value calculated from the luminance and saturation components of the image, as in (Naegel and Passat 2009, Tushabe and Wilkinson 2010). Filtering of the image is performed by comparing the original color vector with a pre-determined color reference vector. The level of dissimilarity between the two vectors determines whether or not a node is preserved. Any color space can be used for the vector attribute; the optimal choice will depend on the application. Shape information from moment invariants is also useful in the traffic-sign detection method shown. Somewhat surprisingly, normalized central moments performed better than the more advanced moment invariants. This may be due to the fact that the orientation of a traffic sign is important, so rotation invariance may be unnecessary, or



Figure 6.8: Examples of shape and color filtering: (a) input image, (b) filtered using shape, and (c) filtered using color.

even detrimental to performance. It is difficult to find an optimal dissimilarity threshold ǫ which suits all traffic-sign components equally, as reflected in the ROC curves and AUC values obtained. Setting individual thresholds would be better, but does mean more parameters need to be set. The same holds for color, though the effect appears to be smaller. These problems could be approached by using a distance-based machine-learning scheme, such as LVQ and its generalizations (Kohonen 1990, Hammer et al. 2005), which could learn from examples. Though considerable training would be needed, the filtering framework need not be changed at all, because the trained classifier could simply be inserted into the filtering algorithm. This extension is currently under development. The proposed color filter is fast, reaching up to 50 frames per second on our 360×270 images, and computationally cheap on 8-bit-per-pixel preorder images (O(GN), with N the number of pixels and G the number of preorder levels). It provides a flexible way of choosing preorders, color spaces, and moment invariants, depending on the nature


Figure 6.9: ROC curves for compound preorder according to (6.17), color (HSV), shape (normalized central) and a combination with equal weight assigned to color and shape in the combined attribute vector.

of the image data set in use. When tested for road traffic-sign recognition, the color attribute proved to be superior to shape. Evidently, the saturated colors used in road signs stand out in natural images. However, the most striking shape features in traffic signs seem to be best detected in the luminance preorders, whereas the colored components stand out in the saturation preorders. This difference complicates the obvious extension of the filtering scheme, i.e. combining color and shape. Two solutions are possible: (i) computing two different Max-trees, one for each preorder, or (ii) computing different types of trees, such as the binary partition tree (Salembier and Garrido 2000), the auto-dual level-line tree (Monasse and Guichard 2000a, Monasse and Guichard 2000b), or trees based on quasi-flat-zone hierarchies as in (Soille 2005, Soille 2008). The advantage of the former is that it is fast, and we can make use of an available parallel algorithm to speed up computation if necessary. The alternative may be better at detecting both kinds of features. For example, the level-line tree is auto-dual and puts both minima and maxima in the leaves. Like the Min-tree and Max-tree from which it is computed, it needs a preorder. We could compute a single scalar to obtain our preorder, based


on distances to reference colors (Angulo 2007). These distances are then combined in such a way that the saturated components end up in the maxima, and the distinctive white components in the minima. This would not only allow detection of both types of structures in a single tree, but would also encode the inclusion relations between the components of a traffic sign, which could facilitate detection. A comparison of these methods is ongoing. Since the proposed method is in principle able to distinguish all types of traffic signs (pedestrian crossing, bicycle, stop, railway, etc.) from a single component tree, the method could be an improvement over the current state of the art in production cars, which only automatically distinguishes speed-limit and stop signs (Klette et al. 2009). A head-to-head comparison with other methods (Goedemé 2008, Piccioli et al. 1996, Bahlmann et al. 2005) is needed once the key improvements discussed above have been made. The current code base has a number of features that we have yet to test thoroughly. In particular, combining luminance and saturation in the preorder was not tested fully. Furthermore, we have started combining shape and color information through a system of weights assigned to each set of vector attributes. The source code and data are available upon request.

Chapter 7

Summary

Even the best dancer leaves the stage.
— A Ugandan proverb

This thesis uses an experimental approach to the practical applicability of connected attribute filtering. The effect of five attribute filters on existing watermarks of an image has been investigated. The image is first embedded with a watermark, filtered to visual losslessness and then the presence of the watermark detected. Seven watermarking algorithms and five attribute filters have been used. The attributes are Volume, Graylevel, Power, Area and Vision. The results showed that on average, 92% of the watermarks were detected. This means that attribute filtering is a largely safe procedure as a preprocessing method for any given application. This work also proposes attribute filters that improve image compression results in terms of file size and image quality. These filters remove psycho-visually redundant data from an image by mimicking how the human visual system works. The image still retains its visual losslessness despite modification of as much as 35 - 40% of its content. The Volume attribute provided the best compression results for jpeg and jpeg2000 compression algorithms while the Vision attribute produced the best quality for LZW compression algorithm and improved compression ratios by as much as 20%. A shape filter that improves content-based image retrieval results has also been suggested. This filter has a high object discriminating power and is scale, rotation and translation invariant. It has been developed for applications that identify objects in general purpose vacation pictures. A multi-scale analysis of the image based on shape, size and color information is conducted. This granulometric operation extracts information about object size, entropy, compactness and non-compactness. Color information is included by conducting a marginal processing of data from the red, green and blue channels of the image. The theoretical contributions of this work revolve around the implementation of connected attribute filters by using the Max-tree approach. 
Two attribute filters that modify individual Max-tree nodes, rather than pruning entire peak branches, have been defined. These are the Vision and VisionP filters, based on the Volume and Power attributes respectively. These filters have produced outstanding performance when applied to both grayscale and color compression.


Secondly, a new method of implementing color connected filters using component trees, through total preorders, has been proposed. The Max-tree image representation is adapted to accommodate color and other vectorial images, and the proposed algorithms allow any total order or preorder to be used. An algorithm is given that computes the entropy attribute for all nodes in O(N) memory complexity, as opposed to O(N·2^B), with B the number of bits per pixel; this is essential when using preorders that require more than 8 bits per pixel to order the pixels.

Two color filtering methods were explored. The first uses vector similarity as measured by the Euclidean distance. When tested on automatic traffic-sign recognition, the proposed method achieved a recognition rate of 85%. The second filtering method uses a scalar value computed using six different color preordering schemes based on luminance, chromaticity, saturation, or combinations of these, and improves color fidelity by applying alternative restitution methods. Three image restitution rules have been introduced. The first is the mean-of-the-parent decision, which assigns a removed node the mean color of its closest surviving ancestor. The second is the nearest-color decision, which assigns a removed node the closest color among the pixels of its nearest preserved ancestor. The third is the nearest-neighbor decision, which assigns each pixel of a removed node the color of the spatially nearest pixel in the first preserved ancestor. These restitution rules improve color fidelity, since they do not introduce new colors or artifacts. Although the main focus of this work was on quasi-auto-dual filters (MinMax and MaxMin), on several images filtering with the Min operation was found to suit the luminance order, while saturation and chromaticity are better suited to the Max operation.

The results also show that, in general, luminance orders result in higher-quality images as well as smaller file sizes. Comparison with an earlier method shows that the proposed methods improve quality by as much as 15%.
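The three restitution decisions described above can be illustrated with a minimal sketch. All helper names and data layouts here are hypothetical (ours, not the thesis implementation); colors are RGB tuples, and the "ancestor" arguments stand for the nearest preserved ancestor of a removed node.

```python
# Illustrative sketch of the three restitution decisions for a removed node.
# Names and data layouts are hypothetical; colors are RGB tuples.
def mean_of_parent(ancestor_pixels):
    # Rule 1: assign the mean color of the closest surviving ancestor.
    n = len(ancestor_pixels)
    return tuple(sum(c[i] for c in ancestor_pixels) / n for i in range(3))

def nearest_color(ancestor_pixels, removed_mean):
    # Rule 2: assign the ancestor color closest (in Euclidean distance)
    # to the removed node's mean color.
    return min(ancestor_pixels,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(c, removed_mean)))

def nearest_neighbor(ancestor_pixels, ancestor_coords, xy):
    # Rule 3: assign the color of the spatially nearest ancestor pixel.
    i = min(range(len(ancestor_coords)),
            key=lambda k: (ancestor_coords[k][0] - xy[0]) ** 2
                        + (ancestor_coords[k][1] - xy[1]) ** 2)
    return ancestor_pixels[i]

palette = [(10, 0, 0), (0, 10, 0)]
assert mean_of_parent(palette) == (5.0, 5.0, 0.0)
assert nearest_color(palette, (9, 1, 0)) == (10, 0, 0)
assert nearest_neighbor(palette, [(0, 0), (5, 5)], (4, 4)) == (0, 10, 0)
```

Rules 2 and 3 only pick colors already present in the preserved ancestor, so no new colors enter the image; rule 1 assigns a single averaged color per removed node.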

Samenvatting

This thesis uses an experimental approach to investigate the applicability of so-called attribute filters. Attribute filters are morphological filters that operate on connected components of the image, rather than on individual pixels. First, it was investigated whether attribute filters are able to remove digital watermarks. Digital watermarks are embedded in images to safeguard provenance and copyright. The goal of this experiment was to determine whether attribute filters can remove the watermarks without degrading image quality. Seven watermarking methods were tested against five attribute filters. These filters use volume, gray-level, power, area and vision criteria, respectively, during filtering. In 92% of the cases the watermarks proved robust against this "attack". This means that attribute filters can in most cases be applied safely without removing the copyright protection provided by watermarks.

In addition, it has been shown that attribute filters improve compression when used as a preprocessing method. Attribute filters are able to remove psycho-visually unimportant structures from the image, so that its information content decreases without changing the visual impression. Even when 35 - 40% of the (mostly small) image structures are removed, the images appear unchanged. The best results on JPEG and JPEG 2000 compression were obtained with the volume-attribute filter, while the vision attribute scored best on Lempel-Ziv-Welch (LZW) compression. At the same visual quality, compression improved by up to 20%.

Attribute filters based on shape have also been used for so-called content-based image retrieval (CBIR). Such a filter can perform image analysis that is invariant to changes in scale, rotation and translation. The method has been adapted for automatically retrieving photographs from a database, based purely on the image content of an example photograph. To this end, a multi-scale and multi-shape analysis is performed on color images, and the results are summarized in a scale/shape spectrum that also contains color information. These granulometries extract information about size, entropy, compactness and elongation in the image, on each of the three color bands (red, green and blue).

The theoretical contribution of this thesis mainly concerns the development of attribute filters based on component trees for color images. Two new attribute filters based on the "vision" and "visionP" attributes have been developed. These have been used to improve compression of color images, by adjusting details "in the middle of" the tree, instead of exclusively "pruning" the tree. Furthermore, the component tree has been adapted for color images. This is problematic, because component trees assume pixel values with a total order (such as gray values). The problem can be solved by so-called preorders, for example by ordering colors on luminance or saturation. These adaptations require changes to the component tree in order to store vector information. Any total (pre)order can then be used to filter the images. An algorithm has also been developed to compute the entropy attribute for all nodes in the tree with a memory complexity of O(N), as opposed to O(N·2^B), with B the number of bits per pixel. The latter is required when the image storing the preorder needs more than 8 bits per pixel. In addition, new filtering methods have been developed that prevent color artifacts in the filtered images.

Besides filtering color images on the basis of shape information, it is also possible to detect objects using color information. Two methods were investigated. The first uses the Euclidean distance in color space. Applied to traffic-sign recognition, this method achieved an accuracy of 85%, considerably better than with shape information alone. The second method was tested on six different color preorders, based on luminance, chromaticity or saturation, or combinations thereof.

Finally, three image restitution rules have been formulated. The "mean of the ancestor" rule assigns to a removed node of the tree the mean color of the nearest unfiltered ancestor in the tree. The "most similar color" rule assigns the color from the nearest unfiltered ancestor that most resembles the mean color of the removed node. The "nearest pixel" rule assigns to each pixel in a removed node the color of the closest pixel in the nearest unfiltered ancestor. These methods result in improved quality of the filtered images, particularly with respect to faithful color reproduction. The emphasis in these filters lies on "quasi-auto-dual" filters (MinMax or MaxMin). Notably, in many cases the Min operation works best with a preorder based on luminance, while the Max operator works best with chromaticity or saturation as the preorder. For image compression, luminance usually works best, yielding more compact files with better image quality. Compared with earlier results, this amounts to a quality improvement of up to 15%.

Publications

Conference Proceedings

• F. Tushabe and V. Baryamureeba (2010): "Implications of Cybercrime on Socio-Economic Development", at the launch of the African Center for Cyberlaw and Cybercrime Prevention, the United Nations African Institute for the Prevention of Crime and the Treatment of Offenders, 26 - 27 August 2010, Kampala, Uganda.

• F. Tushabe, V. Baryamureeba and F. Katushemererwe (2010): "The Translation of the Google Interface into Runyakitara", Proc. International Conference for Computing and ICT Research (ICCIR), August 1 - 4, 2010, Kampala, Uganda.

• F. Tushabe and M. H. F. Wilkinson (2010): "Effect of Selected Attribute Filters on Watermarks", Proc. SPIE Vol. 7546, No. 1, 75463E, DOI:10.1117/12.855420.

• F. Tushabe, P. Jehopio, V. Baryamureeba, P. Bagyenda and C. Ogwang (2008): "The Status of Software Usability in Uganda", Proc. International Conference of Computing and ICT Research, August 3 - 5, 2008, Kampala, Uganda.

• F. Tushabe and M. H. F. Wilkinson (2008): "Content-based Image Retrieval Using Combined 2D Attribute Pattern Spectra", in: Advances in Multilingual and Multimodal Information Retrieval, LNCS Vol. 5152, Springer, ISBN: 978-3-540-85759-4.

• F. Tushabe and M. H. F. Wilkinson (2007): "Content-based Image Retrieval Using Shape-Size Pattern Spectra", Proc. Cross Language Evaluation Forum 2007 Workshop, ImageCLEF Track, Budapest, Hungary, 19 - 21 September 2007.


• F. Tushabe and M. H. F. Wilkinson (2007): "Image Preprocessing for Compression: Attribute Filtering", Proc. Signal Processing and Imaging Engineering (ICSPIE'07), San Francisco, USA, 24 - 26 October 2007.

• V. Baryamureeba and F. Tushabe (2004): "The Enhanced Digital Investigation Process Model", Proc. Digital Forensic Research Workshop, Maryland, USA, August 11 - 13, 2004.

Journal Publications

1. V. Baryamureeba and F. Tushabe (2006): "The Enhanced Digital Investigation Process Model", Asian Journal of Information Technology, Vol. 5, Issue 7, pp. 790 - 794.

2. V. Baryamureeba and F. Tushabe (2005): "Cyber Crime in Uganda: Myth or Reality", The World Academy of Science, Engineering and Technology, Vol. 8, pp. 66 - 70, ISSN: 1307-6884.

Unpublished

1. F. Tushabe and M. H. F. Wilkinson (2010): "Color Image Processing using Component Trees: A Comparison on Image Compression". Submitted to Pattern Recognition.

2. F. Tushabe, E. R. Urbach and M. H. F. Wilkinson: "The Color Attribute Filter". In preparation for IEEE Transactions on Image Processing.

Appendix: Attribute Computation

The volume and power attribute filtering functions are given in Alg. 4 and Alg. 5, respectively. To compute the volume attribute, we need a data set which contains just four fields: Hcurrent and Hparent, storing the order values of the current node and its parent, respectively; Area, storing the area; and Volume, storing the sum of the volumes of its children (with respect to its own order level Hcurrent). Thus,

    Volume = Σ_{x ∈ C} (order[x] − Hcurrent)                               (7.1)

with C the set of pixels in the children of the current node. The algorithm is based on the following observation:

    V(X, f, Hparent) = Σ_{x ∈ X} (f′(x) + Δh) = Σ_{x ∈ X} f′(x) + Δh A(X)  (7.2)

with f′(x) = order[x] − Hcurrent, Δh = Hcurrent − Hparent, and A(X) the area of X. Because within X, f′ is only non-zero in C, this equates to

    V(X, f, Hparent) = Volume + Area (Hcurrent − Hparent).                 (7.3)

This equation is implemented in function VolumeAttribute in Alg. 4. Initializing the auxiliary data with the first pixel is simply a matter of allocating the structure, setting Hcurrent to the current pixel's order value, setting Area to one, and the Volume to zero. AddToVolumeData just increments the area, because it always adds pixels at the current level. The MergeVolumeData function is more complicated, as it needs to compute the volume of the child node and add it to its Volume field. The areas are merged as usual. The PostVolumeData function only updates the parent grey-level field.


Algorithm 4 The auxiliary functions for the volume attribute

    void *NewVolumeData(x, y) {
        allocate *volumedata;
        volumedata->HCurrent = order[x,y];
        volumedata->Area = 1;
        volumedata->Volume = 0;
        return (volumedata);
    } /* NewVolumeData */

    void AddToVolumeData(*volumedata, x, y) {
        volumedata->Area++;
    } /* AddToVolumeData */

    void MergeVolumeData(*volumedata, *childdata) {
        dh = childdata->HCurrent - childdata->HParent;
        volumedata->Area += childdata->Area;
        volumedata->Volume += childdata->Volume + dh*childdata->Area;
    } /* MergeVolumeData */

    void PostVolumeData(*volumedata, hparent) {
        volumedata->HParent = hparent;
    } /* PostVolumeData */

    double VolumeAttribute(*volumedata) {
        dh = volumedata->HCurrent - volumedata->HParent;
        return (volumedata->Volume + dh*volumedata->Area);
    } /* VolumeAttribute */
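As an illustration, the bookkeeping of Alg. 4 can be mirrored in a short runnable sketch (ours, not the thesis implementation; all names are hypothetical) and checked against a brute-force evaluation of the volume on a toy branch of two nodes:

```python
# Hypothetical mirror of Alg. 4: incremental volume bookkeeping, checked
# against a brute-force sum over the component's pixels.
class VolumeData:
    def __init__(self, h):                  # NewVolumeData
        self.hcur, self.hpar = h, 0
        self.area, self.volume = 1, 0

    def add_pixel(self):                    # AddToVolumeData
        self.area += 1

    def merge(self, child):                 # MergeVolumeData
        dh = child.hcur - child.hpar
        self.area += child.area
        self.volume += child.volume + dh * child.area

    def post(self, hparent):                # PostVolumeData
        self.hpar = hparent

    def attribute(self):                    # VolumeAttribute, eq. (7.3)
        return self.volume + (self.hcur - self.hpar) * self.area

# Toy branch: a node at order level 5 holding 2 pixels, inside a node at
# level 2 holding 3 further pixels, whose parent sits at level 0.
child = VolumeData(5)
child.add_pixel()
child.post(2)
parent = VolumeData(2)
parent.add_pixel()
parent.add_pixel()
parent.merge(child)
parent.post(0)

# Brute force: sum of (order[x] - Hparent) over all 5 pixels.
assert parent.attribute() == 3 * (2 - 0) + 2 * (5 - 0) == 16
```

The merge adds dh·Area to the child's Volume field exactly as in eq. (7.2), so the final attribute agrees with the direct sum.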

In the case of the power attribute, we add one field to the auxiliary data: Power, which contains the sum of the powers of its children, with respect to its own order level, i.e.

    Power = Σ_{x ∈ C} (order[x] − Hcurrent)²                               (7.4)

The computation of the power attribute is based on a similar decomposition as in the case of volume:

    P(X, f, Hparent) = Σ_{x ∈ X} (f′(x) + Δh)²
                     = Σ_{x ∈ X} (f′(x))² + 2Δh Σ_{x ∈ X} f′(x) + Δh² A(X). (7.5)

Therefore, the power attribute can be computed as

    P(X, f, Hparent) = Power + 2Δh Volume + Area Δh².                      (7.6)

This expression is used in PowerAttribute to compute the final attribute value, as can be seen in Alg. 5. Initialization of the auxiliary data is identical to the case of the volume attribute, except that the Power field must be set to zero as well. Adding pixels to the current level is identical to the volume case, as is the post-processing through PostPowerData. Merging two data sets is a bit more complicated, because the Power fields must also be merged using (7.6).

Algorithm 5 The auxiliary functions for the power attribute

    void *NewPowerData(x, y) {
        allocate *powerdata;
        powerdata->HCurrent = order[x,y];
        powerdata->Area = 1;
        powerdata->Volume = 0;
        powerdata->Power = 0;
        return (powerdata);
    } /* NewPowerData */

    void AddToPowerData(*powerdata, x, y) {
        powerdata->Area++;
    } /* AddToPowerData */

    void MergePowerData(*powerdata, *childdata) {
        dh = childdata->HCurrent - childdata->HParent;
        powerdata->Area += childdata->Area;
        powerdata->Power += childdata->Power + 2*dh*childdata->Volume
                            + dh*dh*childdata->Area;
        powerdata->Volume += childdata->Volume + dh*childdata->Area;
    } /* MergePowerData */

    void PostPowerData(*powerdata, hparent) {
        powerdata->HParent = hparent;
    } /* PostPowerData */

    double PowerAttribute(*powerdata) {
        dh = powerdata->HCurrent - powerdata->HParent;
        return (powerdata->Power + 2*dh*powerdata->Volume
                + dh*dh*powerdata->Area);
    } /* PowerAttribute */

Computing the Vision and VisionP attributes can be done with an auxiliary data set which contains only the Hcurrent, Hparent, and Area fields. The latter stores only the area of the current node, excluding the area of the children. Initialization is similar to the volume case, except that no Volume field need be initialized. AddToVisionData and AddToVisionPData are identical to their volume counterparts, as are the post-processing functions PostVisionData and PostVisionPData. The merging functions do nothing, because both attributes are independent of any children. The output attributes are computed as

    Vision(X, f, Hparent) = Area × (Hcurrent − Hparent),                   (7.7)

and

    VisionP(X, f, Hparent) = Area × (Hcurrent − Hparent)².                 (7.8)

Storage-wise, each of these algorithms has complexity O(N), with N the number of pixels, because the number of Max-tree nodes cannot be larger than the number of pixels, and the auxiliary storage per node is fixed. Likewise, the computational complexity is O(N), since all of the attribute handling functions have O(1) complexity.
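Eq. (7.6) can be checked numerically with a small sketch (hypothetical code, not the thesis implementation): the per-node (Area, Volume, Power) bookkeeping, merged per the decomposition in (7.5), must reproduce the direct sum of squared level differences over the component's pixels.

```python
# Hypothetical check of eq. (7.6): the incrementally maintained power
# attribute equals the brute-force sum of (order[x] - Hparent)^2.
class AuxData:
    def __init__(self, h):
        self.hcur, self.hpar = h, 0
        self.area, self.volume, self.power = 1, 0, 0

    def add_pixel(self):
        self.area += 1

    def merge(self, c):
        # Decomposition of eq. (7.5) applied to the child's fields.
        dh = c.hcur - c.hpar
        self.area += c.area
        self.power += c.power + 2 * dh * c.volume + dh * dh * c.area
        self.volume += c.volume + dh * c.area

    def post(self, hparent):
        self.hpar = hparent

    def power_attribute(self):              # eq. (7.6)
        dh = self.hcur - self.hpar
        return self.power + 2 * dh * self.volume + dh * dh * self.area

# Toy branch: 2 pixels at order level 5 under 3 pixels at level 2, with
# the grandparent at level 0.
child = AuxData(5)
child.add_pixel()
child.post(2)
parent = AuxData(2)
parent.add_pixel()
parent.add_pixel()
parent.merge(child)
parent.post(0)

levels = [2, 2, 2, 5, 5]                    # order value of every pixel
assert parent.power_attribute() == sum((h - 0) ** 2 for h in levels) == 62
```

The same per-node data also yields the volume attribute (drop the power field), so both attributes share one O(N) pass over the tree.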

