
ROBUST FEATURE DETECTION AND TRACKING IN THERMAL-INFRARED VIDEO

VU HOANG MINH

SCHOOL OF ELECTRICAL AND ELECTRONIC ENGINEERING

A DISSERTATION SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER CONTROL AND AUTOMATION

2015

Abstract

In this thesis, popular techniques within the area of machine vision, namely noise reduction, feature detection, edge detection and feature tracking, have been studied. This project is concerned with the use of thermal-infrared cameras, which are much less affected by changes in lighting, shadows and out-of-view motion than visible cameras. The main research focus of this thesis is how to deal with the low signal-to-noise ratio of thermal-infrared video in developing a novel real-time methodology for robust feature detection and tracking. The thesis first reviews the background of thermal-infrared imagery. It then covers the necessity of a noise reduction filter in thermal-infrared video. Next, it presents a number of existing approaches in edge and feature detection followed by four proposed techniques. Finally, results reveal that the proposed techniques perform well in thermal-infrared video.


Acknowledgments

Firstly, the author would like to express sincere thanks and appreciation to Professor Cheah Chien Chern for his knowledge and guidance. Without him, this thesis would not have been completed or written. Secondly, special gratitude goes to Doctor Stephen Vidas. Thanks to his patience and knowledge, the author found inspiration in his words of encouragement. One simply could not wish for a better or friendlier supervisor. Most of all, the author thanks his parents for their constant love, support and encouragement.


Contents

Abstract
Acknowledgments
Contents
List of Figures
List of Algorithms
List of Acronyms

1 Introduction
  1.1 Overview
  1.2 Motivation
  1.3 Objectives
  1.4 Structure of the Document

2 Background
  2.1 Thermal-infrared Radiation
    2.1.1 Categorization
    2.1.2 Electromagnetic Radiation
  2.2 Thermal-infrared Cameras
    2.2.1 Optris PI450
    2.2.2 Noise
  2.3 Thermal-infrared Imagery
    2.3.1 Variation in SNR
    2.3.2 Degrees per Graylevel

3 Noise Reduction Filters
  3.1 Least Square Error Wiener-Kolmogorov Filter
    3.1.1 Background
    3.1.2 Wiener Filter
  3.2 Median Filter
    3.2.1 Background
    3.2.2 Median Filter
  3.3 Non-local Means Filter
    3.3.1 Background
    3.3.2 Non-local Means Filter
  3.4 Sparse 3D Transform-domain Collaborative Filter
    3.4.1 Background
    3.4.2 BM3D Filter

4 Edge Detectors
  4.1 Roberts Cross Detector
    4.1.1 Background
    4.1.2 Roberts Cross Detector
  4.2 Sobel Detector
    4.2.1 Background
    4.2.2 Sobel Detector
  4.3 Laplacian of Gaussian Detector
    4.3.1 Background
    4.3.2 Laplacian of Gaussian Detector
  4.4 Canny Detector
    4.4.1 Background
    4.4.2 Canny Detector
  4.5 Enhanced Canny Detector - A Smoothness-based Detector
    4.5.1 Outline
    4.5.2 Algorithm

5 Feature Detectors
  5.1 Harris Detector
    5.1.1 Background
    5.1.2 Harris Detector
  5.2 Susan Detector
    5.2.1 Background
    5.2.2 Susan Detector
  5.3 Features from Accelerated Segment Test Detector
    5.3.1 Background
    5.3.2 FAST Detector
  5.4 Global and Local Curvature Properties Detector
    5.4.1 Background
    5.4.2 Global and Local Curvature Properties Detector
  5.5 Enhanced Curvature Detector
    5.5.1 Outline
    5.5.2 Algorithm

6 Feature Tracking
  6.1 Feature Detectors
    6.1.1 Proposed Gradient-based Feature Detector
    6.1.2 Proposed Gradient/Edge-based Feature Detector
  6.2 Feature Descriptor - Oriented FAST and Rotated BRIEF
    6.2.1 Background
    6.2.2 Oriented FAST and Rotated BRIEF Feature Detector
  6.3 Feature Tracker - Lucas-Kanade Tracker
    6.3.1 Outline
    6.3.2 Lucas-Kanade Tracker

7 Evaluation
  7.1 Data Sets
  7.2 Noise Reduction Filters
  7.3 Edge Detectors
  7.4 Feature Detectors
  7.5 Feature Tracking
    7.5.1 Restructure main data directory
    7.5.2 Prepare image processing scripts
    7.5.3 Prepare image processing validation scripts
    7.5.4 Prepare feature detection scripts
    7.5.5 Prepare feature detection validation scripts
    7.5.6 Prepare feature tracking scripts
    7.5.7 Prepare feature tracking validation scripts
    7.5.8 Denoising methods analysis script
    7.5.9 Feature detection analysis script
    7.5.10 Results

8 Conclusions and Future Work
  8.1 Contributions
    8.1.1 Noise Reduction Filters
    8.1.2 Feature Detectors
    8.1.3 Edge Detectors
    8.1.4 Feature Tracking
  8.2 Future Work
    8.2.1 Implementation of Denoising Methods in C++ Code
    8.2.2 Improve Performance of Custom Feature Detectors
    8.2.3 Performance Evaluation for Monocular SLAM

References

List of Figures

2.1 Background: Visible and Thermal-infrared EM Spectrum
2.2 Background: Wien's Law of Radiation
2.3 Background: Thermal-infrared Camera
2.4 Background: Comparison of Visible and Thermal-infrared Images
2.5 Background: Example of High and Low-SNR Images
3.1 Noise Reduction: Noise Classification
3.2 Noise Reduction: Block Diagram of Adaptive Wiener Filter
3.3 Noise Reduction: Median Filter's Kernel
3.4 Noise Reduction: Non-local Means Filter's Application 1
3.5 Noise Reduction: Non-local Means Filter's Application 2
3.6 Noise Reduction: Non-local Means Filter's Weight
3.7 Noise Reduction: Scheme of BM3D
3.8 Noise Reduction: Patches and Search Window in BM3D
4.1 Edge Detection: Discrete Approximation of LoG Function
4.2 Edge Detection: Four Directions in 3 × 3 Mask
4.3 Edge Detection: Enhanced Edge Detector: Step 1 for High-SNR
4.4 Edge Detection: Enhanced Edge Detector: Step 1 for Low-SNR
4.5 Edge Detection: Enhanced Edge Detector: Step 2 for High-SNR
4.6 Edge Detection: Enhanced Edge Detector: Step 2 for Low-SNR
4.7 Edge Detection: Enhanced Edge Detector: Step 3 for High-SNR
4.8 Edge Detection: Enhanced Edge Detector: Step 3 for Low-SNR
4.9 Edge Detection: Flowchart of Enhanced Edge Detector
5.1 Feature Detection: Requirements of Good Corner Detectors
5.2 Feature Detection: Example of Feature Points
5.3 Feature Detection: Harris Corner Response
5.4 Feature Detection: Example of USAN
5.5 Feature Detection: FAST Algorithm
5.6 Feature Detection: Flowchart of Enhanced Feature Detector
6.1 Feature Tracking: Flowchart of Gradient-based Feature Detector
6.2 Feature Tracking: Flowchart of Gradient/Edge-based Feature Detector
6.3 Feature Tracking: Movement of Feature Points
7.1 Evaluation: Datasets for Evaluation
7.2 Evaluation: Comparison of Denoising Methods
7.3 Evaluation: Comparison of Denoising Methods
7.4 Evaluation: Comparison of Edge Detectors at High-SNR
7.5 Evaluation: Comparison of Edge Detectors at Low-SNR
7.6 Evaluation: Comparison of Corner Detectors at High-SNR
7.7 Evaluation: Comparison of Corner Detectors at Low-SNR
7.8 Evaluation: Comparison of Different DPGs
7.9 Evaluation: Video Validation
7.10 Evaluation: Corner Detection Validation
7.11 Evaluation: Corner Tracking Validation
7.12 Evaluation: Drift of Low-SNR at DPG=0.01
7.13 Evaluation: Survival Rate of Low-SNR at DPG=0.01
7.14 Evaluation: Drift of Low-SNR at DPG=0.005
7.15 Evaluation: Survival Rate of Low-SNR at DPG=0.005
7.16 Evaluation: Drift of Original Low-SNR at DPG=0.01
7.17 Evaluation: Drift of NL-means Low-SNR at DPG=0.01
7.18 Evaluation: Drift of BM3D Low-SNR at DPG=0.01
7.19 Evaluation: Survival Rate of Original Low-SNR at DPG=0.01
7.20 Evaluation: Survival Rate of NL-Means Low-SNR at DPG=0.01
7.21 Evaluation: Survival Rate of BM3D Low-SNR at DPG=0.01
7.22 Evaluation: Survival Rate in N-frame Sequence
7.23 Evaluation: Flowchart of Feature Tracking Process

List of Algorithms

1  Evaluation: Scale DPG
2  Noise Reduction: Median Filter
3  Noise Reduction: Non-local Means Filter
4  Noise Reduction: BM3D Filter
5  Edge Detection: Roberts Cross Operator
6  Edge Detection: Sobel Operator
7  Edge Detection: Laplacian of Gaussian Edge Detector
8  Edge Detection: Canny Edge Detector
9  Edge Detection: Link Edges
10 Edge Detection: Proposed Smoothness-based Detector
11 Feature Detection: Harris Feature Detector
12 Feature Detection: Susan Feature Detector
13 Feature Detection: FAST Detector
14 Feature Detection: CSS Algorithm
15 Feature Detection: GLCP Detector
16 Feature Detection: Remove Weaker Corners
17 Feature Detection: Proposed Corner Ranking
18 Feature Tracking: Proposed Gradient-based Feature Detector
19 Feature Tracking: Proposed Gradient/Edge-based Feature Detector
20 Feature Tracking: Oriented FAST and Rotated BRIEF
21 Feature Tracking: Lucas-Kanade Tracker
22 Evaluation: Frame Re-indexing Procedure
23 Evaluation: Generate Different DPG Sequences
24 Evaluation: Video Validation
25 Evaluation: Feature Detection
26 Evaluation: Feature Detection Validation
27 Evaluation: Feature Tracking
28 Evaluation: Feature Tracking Validation
29 Evaluation: Denoising Methods Analysis
30 Evaluation: Feature Detection Analysis

Acronyms

BLS-GSM  Bayes Least Squares with a Gaussian Scale-Mixture
BM3D  Sparse 3D Transform-domain Collaborative Filter
BRIEF  Binary Robust Independent Elementary Features
CSS  Curvature Scale Space
DPG  Degrees Per Graylevel
DSNU  Dark Signal Non-Uniformity
EM  Electro-Magnetic
FAST  Features from Accelerated Segment Test
FPN  Fixed Pattern Noise
GLCPD  Global and Local Curvature Properties Detector
LKT  Lucas-Kanade Tracker
LWIF  Long-Wave Infrared
MATLAB  Matrix Laboratory
MMSE  Minimum Mean-Square Error
MWIF  Mid-Wave Infrared
NIR  Near Infrared
NL-means  Non-local Means
NUC  Non-Uniformity Correction
ORB  Oriented FAST and Rotated BRIEF
PRNU  Photo Response Non-Uniformity
RCRS  Rank-Conditioned Rank-Selection
ROS  Region of Support
RS  Rank-Selection
SLAM  Simultaneous Localization and Mapping
SNR  Signal-Noise Ratio
SSD  Sum of Squared Differences
SURF  Speeded Up Robust Features
SUSAN  Smallest Univalue Segment Assimilating Nucleus
SWIF  Short-Wave Infrared
USAN  Univalue Segment Assimilating Nucleus
VLWIF  Very Long-Wave Infrared

Chapter 1
Introduction

This thesis aims to investigate critical problems relating to thermal-infrared image processing. The problems are identified in four key areas: noise reduction in low Signal-Noise Ratio (SNR) thermal-infrared images, feature detection, edge detection, and feature tracking. The applications of this work include search and rescue, medical imaging, exploration and the like. The structure of this chapter is as follows. Firstly, an overview of the aims of the thesis is provided in Section 1.1. The motivation of the project is then explored in Section 1.2. Next, the objectives of the thesis are described in depth in Section 1.3. Finally, the structure of the document is discussed in Section 1.4.

1.1 Overview

The importance of computer science has grown rapidly in recent decades. This comes from the fact that computer science is a crucial part of shaping industry and society with the intent of improving people's quality of life. Machine vision is one field of computer science. It aims at extracting and analyzing useful information from images and videos for industrial and medical applications, for example, medical imaging [53], signature identification [77], optical character recognition [79], handwriting recognition [89], object recognition [43], pattern recognition [26], Simultaneous Localization and Mapping (SLAM) [23] and so on.

Feature detection is a fundamental process in image processing, which is one part of machine vision. There are four common types of image features: edges, interest points, interest regions and ridges. Feature detection is a technique whose purpose is to detect interest points that infer the contents of an image. To be more specific, these interest points are distinctive, take the shape of isolated points, and show significant differences in all directions within their neighborhood. A large number of feature detection approaches have been proposed in the literature. For example, an intensity-based method based on the auto-correlation of the signal was proposed by Moravec in 1980 [56]; a corner-score-based technique that considers direction directly was developed by Harris and Stephens in 1988 [29]; and a real-time machine learning approach was developed by Rosten and Drummond in 2006 [66]. Feature points of an image are used for image registration [31], object detection [83] and classification [27], tracking [70], and motion estimation [92].

Similarly, feature tracking is an elementary technique in image processing which targets identifying and tracking feature points from frame to frame. In fact, no feature-based tracking system performs successfully unless interest points are well-detected. Therefore, selecting features is a key step in a tracking framework. Feature tracking approaches can be categorized into two categories: correspondence-based methods and texture-correlation-based methods [37]. Correspondence-based methods, such as [73], detect a set of interest points in every frame and aim to match the two sets of points in two consecutive frames. On the contrary, texture-correlation-based methods, for example [70] and [78], identify feature points only in the first frame and track them over succeeding ones. Feature tracking is applicable in 3D pose estimation [45] and SLAM [6].

In normal circumstances in machine vision, visible-optical cameras are used due to their high resolution and SNR, rich spatial distribution of texture, lack of data interruptions and cheap price. However, in failure conditions where the view is blocked by dust, smoke or fog, color cameras do not work well. Therefore, there is a certain need to use a "see-through" camera, such as a thermal-infrared camera. To a certain extent, the operation of thermal-infrared cameras is similar to visible-light cameras. However, in terms of capturing thermal-infrared radiation, thermal-infrared cameras are complex in design. The use of a thermal-infrared camera in feature detection and tracking is very new and challenging, because such cameras produce very low-resolution and poor-SNR outputs. As a result, it is essential to process the raw images before further use.

The principal objectives of this Master's thesis are categorized according to four active areas of research that have been identified relating to image and video processing:
– Noise Reduction Filter
– Feature Detection
– Edge Detection
– Feature Tracking

The development of this project is concerned with the robustness and tractability issues of thermal-infrared image and video. The principal contributions of this thesis focus on: the enhancement of existing approaches; the proposal of a number of novel techniques in low-SNR image processing; and the evaluation of existing and proposed methods.

1.2 Motivation

The generation of effective feature tracking results using visible-optical cameras is still an active field of research, though quite mature. However, the use of a thermal-infrared camera for the same problem is relatively young and challenging. Recently the usage of thermal-infrared cameras in research has increased due to their ability to detect normally unseen objects by obtaining radiation information in failure conditions, for example, fog, dust or dark areas. From this identified problem, this thesis has been motivated by two questions:
– How can a conventional camera be replaced by a thermal-infrared camera in normal conditions?
– How can the disadvantages of a thermal-infrared camera, namely low-SNR output and poor resolution, be handled?

By investigating the capability of thermal-infrared video in a feature detection and tracking system, this project contributes to the performance of SLAM and robot navigation systems in unknown environments.

1.3 Objectives

In this thesis, a few popular approaches in the field of image processing are studied. In particular, this project is focused on low-SNR thermal-infrared imagery. One of the core objectives of the project is to investigate the fundamental differences between the visual and thermal-infrared modalities with the purpose of proposing an effective methodology for feature detection and tracking in thermal-infrared video. Most current techniques aim at the visual modality, so they do not perform consistently on thermal-infrared images. Therefore, the objectives of this project are:
– Explore the effects of noise reduction filters in low-SNR videos, and propose one to two techniques to improve edge and feature detection and tracking.
– Propose an effective, specialized feature detection algorithm for low-SNR thermal-infrared images.
– Propose an edge-strength ranking process in a standard framework.
– Propose a corner-strength ranking technique in a traditional framework.
– Implement the most common edge and feature detectors and feature tracking techniques to gain insightful knowledge of their performance in thermal-infrared.
– Evaluate the positive effects of noise reduction filters and the proposed techniques on a feature detection and tracking framework.

1.4 Structure of the Document

Chapter 2 gives a detailed background of the nature of the thermal-infrared modality as an alternative imaging domain, and the relationship between thermal-infrared imagery and temperature estimation. Chapter 3 presents the importance of a noise reduction filter in common low-SNR thermal-infrared scenarios and reviews a number of well-known denoising methods that are practically applicable. Chapter 4 addresses popular edge detectors and presents an enhanced Canny algorithm which aims to remove spurious edges caused by noise. Chapter 5 examines existing feature detectors and reviews their algorithms; a feature ranking representation is developed based on curvature corner detection. Chapter 6 includes research into the problem of feature detection and tracking in the thermal-infrared modality. Chapter 7 includes the evaluation of existing image processing algorithms in the thermal-infrared modality, based on a unique and extensive dataset collected as part of the thesis. Chapter 8 gives conclusions and future directions for the completion and extension of this work.

Chapter 2
Background

This chapter provides background on the nature of the thermal-infrared modality as an alternative imaging domain and the properties of thermal-infrared cameras. This chapter is recommended reading in order to fully appreciate the problems in thermal-infrared imagery. The background of thermal-infrared radiation is first introduced in Section 2.1. Secondly, the properties of thermal-infrared cameras are discussed in Section 2.2. Finally, the properties of thermal-infrared imagery and a discussion of degrees per graylevel in thermal-infrared images are explored in Section 2.3.

2.1 Thermal-infrared Radiation

2.1.1 Categorization

Objects at temperatures greater than absolute zero (0 degrees Kelvin) produce thermal radiation. The temperature directly affects the wavelengths of this form of Electro-Magnetic (EM) radiation. This leads to a classification scheme by wavelength [54]:
– Near Infrared (NIR): 0.7 to 1.0 µm
– Short-Wave Infrared (SWIF): 1.0 to 3.0 µm
– Mid-Wave Infrared (MWIF): 3.0 to 5.0 µm
– Long-Wave Infrared (LWIF): 8.0 to 12.0 µm
– Very Long-Wave Infrared (VLWIF): 12.0 to 30.0 µm

In the context of digital output, LWIF is normally referred to as thermal infrared, while NIR is used interchangeably with the term visible to the human eye.

Visible and infrared subsections of the EM spectrum are shown in Figure 2.1.

Figure 2.1: Visible and infrared subsections of the EM spectrum

The wavelength of the radiation influences the absorptivity, reflectivity, and emissivity of all objects. In fact, the properties of thermal radiation are expressed by Kirchhoff's Law [11]. In his research, Beiser et al. indicated that under normal circumstances most materials do not reflect much LWIF. Therefore, only emitted radiation originating from the material itself is detected. This phenomenon principally explains why thermal-infrared images are less affected by changes in the environment, such as lighting, shadows and out-of-view motion.

2.1.2 Electromagnetic Radiation

All objects emit radiation as a function of their material and temperature. Objects that ideally absorb all radiation are referred to as blackbodies. Planck's Law describes the EM radiation of a blackbody object [48].

B_\lambda(\lambda, T) = \frac{2hc^2}{\lambda^5} \, \frac{1}{e^{hc/(\lambda k_B T)} - 1}    (2.1)

Here B is the spectral radiance of the blackbody surface, T is the absolute temperature, λ is the wavelength, k_B is the Boltzmann constant, h is the Planck constant and c is the speed of light.

Figure 2.1 is a modification of one contributed by user Magnus Manske under the CC BY-SA 3.0 license at http://en.wikipedia.org/wiki/File:Infrared_spectrum.gif


Wien's Displacement Law (see Figure 2.2) also states that:

\lambda_{max} = \frac{b}{T}    (2.2)

Here T is the absolute temperature in Kelvin, and b is Wien’s displacement constant.
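As a quick sense check of the quantities in Eq. (2.2), taking Wien's displacement constant b ≈ 2.898 × 10⁻³ m·K and an object near room temperature (T ≈ 300 K) gives

\lambda_{max} = \frac{2.898 \times 10^{-3}\ \mathrm{m\,K}}{300\ \mathrm{K}} \approx 9.7\ \mu m,

which lies inside the LWIF band captured by thermal-infrared cameras.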

Figure 2.2: Wien's Law of radiation (figure contributed by user M0tty under the CC BY-SA 3.0 license at http://en.wikipedia.org/wiki/File:Wiens_law.svg)

This law relates to a number of everyday experiences:
– Objects with a surface temperature similar to that of the sun, in dark red color, have peak radiant emittance.
– Objects at orange-red temperatures, for example a piece of metal heated by a torch (4500 to 5000 degrees Kelvin), predominate in radiant emittance and are invisible.
– Objects at dimmer temperatures (4000 degrees Kelvin), such as a light bulb, the human body, many machines and the like, emit a considerable amount of radiation energy and are detectable by thermal-infrared cameras.
– At 1500 degrees Kelvin, a campfire serves a warming purpose but not visibility.

2.2 Thermal-infrared Cameras

To a certain extent, the operation of thermal-infrared cameras is similar to that of visible-light cameras. However, in terms of capturing thermal-infrared radiation, thermal-infrared cameras are complex in design and expensive, and they produce very low resolutions. In general, there are two types of thermal-infrared cameras, depending on the detectors they use: quantum or thermal [35].

2.2.1 Optris PI450

An Optris PI450 thermal-infrared camera is used in this project. Its specifications are shown in Figure 2.3.

Figure 2.3: Thermal-infrared camera and specifications: (a) Optris PI450 camera; (b) connection option; (c) Optris PI450 specifications. (Figure 2.3 is contributed by the Optris supplier in the PI450 datasheet at http://www.optris.com/pi-lightweight-netbox)


2.2.2 Noise

Image noise is the effect of errors in the data acquisition of a thermal-infrared camera. It may arise from the sensor or the bolometer pixels. In the thermal-infrared imagery context, there is a phenomenon called Fixed Pattern Noise (FPN) which is small but unavoidable. FPN is affected by two factors: Dark Signal Non-Uniformity (DSNU) and Photo Response Non-Uniformity (PRNU). It is more challenging because the non-uniformities vary over time. In practical applications, for example SLAM, images with FPN are unusable. With the purpose of recovering the original thermal-infrared signal, initial calibration, ongoing real-time processing, or a combination of both methods is employed. This project focuses only on real-time image processing. Specifically, natural image denoising is used as a process of removing noise from a corrupted image.

Figure 2.4: Comparison of visible and thermal-infrared images taken at the same time and under the same conditions: (a) visible image; (b) thermal-infrared image

2.3 Thermal-infrared Imagery

In principle, thermal-infrared images can be viewed and processed in much the same way as regular greyscale images captured from a conventional visible-spectrum camera. However, there are several important differences between images in these two modalities. Considering these differences can ensure that the strengths of thermal-infrared images are not squandered, and that the weaknesses are effectively managed.


2.3.1 Variation in SNR

Basically, thermal-infrared cameras produce low-SNR images [33], so thermal-infrared output tends to be noisy. However, the SNR depends enormously on the environmental conditions and viewpoint. When the temperature of an object increases, it produces more radiation and the wavelength of this radiation becomes shorter; when a scene contains only passive objects, FPN swamps the useful information in the image. Consequently, in those conditions the SNR of the thermal-infrared image is low. Figure 2.5 demonstrates the variation in SNR of thermal-infrared imagery that occurs between different viewpoints within the same environment.

Figure 2.5: Example of high and low SNR images: (a) high SNR raw image; (b) low SNR raw image; (c) high SNR gray-scale image; (d) low SNR gray-scale image

2.3.2 Degrees per Graylevel

It is worth getting a clear concept of Degrees Per Graylevel (DPG). The raw 14-bit images from the Optris have a DPG of 0.01. This means that each change in graylevel in the output image represents 0.01 degrees Celsius. Therefore, a difference of 100 graylevels represents 1 degree Celsius. Hence, for example, if the input image has graylevels ranging from 2217 to 3342, it has a temperature range of

(3342 - 2217) / 100 = 11.25 °C

This would be a fairly high-SNR image. A low-SNR image with graylevels ranging from 2560 to 2692 would represent a temperature range of

(2692 - 2560) / 100 = 1.32 °C

Within one raw 16-bit sequence, the temperature range of each image may change, but the DPG stays the same: fixed at 0.01. Depending on the method of converting from 14-bit to 8-bit images, the DPG may be modified. In fact, many conversion methods will result in a different DPG for every output image in the sequence, which is not desirable. Effectively, a lower DPG means a higher magnitude of noise. Hence, an approach that maintains the same DPG in the 8-bit output image, even when the SNR is very different, is recommended. For example, one approach is to find the median intensity of the 14-bit image, then scale the raw graylevels by a fixed factor, such as 0.5, and threshold to keep all values within the range 0-255. The pseudo-code of the scaling algorithm is shown in Algorithm 1.

Algorithm 1 Scale DPG

procedure SCALEDPG(image)
  input: raw image R
  find the median value M of R
  for every pixel with intensity R(x, y) do
    A = (R(x, y) - M) × 0.5 + 127
    B = min(A, 255)
    G(x, y) = max(B, 0)
  end for
  output: scaled image G
end procedure
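As an illustration, a minimal NumPy sketch of the same median-centred scaling is given below. The function name and the synthetic frame are only for demonstration; the 0.5 factor and the 127 offset follow Algorithm 1.

import numpy as np

def scale_dpg(raw, factor=0.5, offset=127):
    # Convert a raw 14-bit frame to 8-bit while keeping the DPG fixed across a
    # sequence: each output graylevel then represents 0.01 / factor degrees Celsius.
    m = np.median(raw)
    scaled = (raw.astype(np.float64) - m) * factor + offset
    return np.clip(scaled, 0, 255).astype(np.uint8)

raw_frame = np.random.default_rng(0).integers(2217, 3343, size=(288, 382))  # stand-in raw image
frame_8bit = scale_dpg(raw_frame)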

For low-SNR images, the result may appear to a human viewer to have very low contrast, but image processing and edge detection algorithms are able to be more sensitive. The important thing is to use the same formula for converting all types of images to 8-bit. Therefore, a script that generates a directory of 8-bit images from the raw 16-bit images using the default scaling will ensure the same DPG for all output images.


Chapter 3
Noise Reduction Filters

Image noise is a random variation of brightness, visible as grain. It may arise from the sensor and circuitry of a digital camera during capture or image transmission, adding spurious and extraneous information [82]. Noise in an image is defined as pixels showing false or different intensity values instead of the true or expected values. Natural image denoising is a process of reducing or removing noise from an image. In other words, it is defined as a process of estimating an original clean version of a noise-corrupted image [40]. The common types of noise that arise in an image are impulse noise (salt-and-pepper noise), amplifier noise (Gaussian noise), shot noise, quantization noise (uniform noise), film grain, non-isotropic noise, multiplicative noise (speckle noise) and periodic noise [58]. Since each type of noise has different characteristics which make it distinguishable, each type can afflict an image in a different context. As a result, noise reduction filters are developed to minimize the effects of noise in order to improve subsequent image processing.

Image denoising algorithms have significantly advanced over the last few decades [14, 49, 57]. However, it seems that the performance of denoising methods is commencing to converge [40]. Some approaches transfer the image signal to a substitute domain where noise can be straightforwardly removed from the signal [39]. Portilla et al. [62] propound a wavelet-based Bayes Least Squares with a Gaussian Scale-Mixture (BLS-GSM) method. More recent techniques exploit the non-local statistics of images [13, 14]. To the best of my knowledge, the class of well-engineered algorithms around BM3D [19] is effective for both grayscale and color images.

Although there are numerous proposed techniques, most of them are designed for a particular sort of noise or require assumptions about the statistical properties of the noise.

Figure 3.1: Six common types of noise: (a) original image; (b) impulse noise; (c) amplifier noise; (d) shot noise; (e) speckle noise; (f) Gaussian noise

For example, the Wiener filter [86] (Section 3.1) performs effectively at eliminating speckle and Gaussian noise, but the input signal and noise are assumed to be wide-sense stationary processes with known autocorrelation functions. Median filtering (Section 3.2) excels on salt-and-pepper noise, but is not effective for additive Gaussian noise. Even the popular BM3D approach [19] (Section 3.4) targets primarily Gaussian noise [2].

3.1 Least Square Error Wiener-Kolmogorov Filter

3.1.1 Background

Andrei Kolmogorov and Norbert Wiener developed least-squared-error filter theory in 1941 and 1949, respectively. In particular, time-domain analysis was developed by Kolmogorov, while frequency-domain analysis was formed by Wiener. Least-squared-error filter theory forms the foundation of data-dependent adaptive linear filters and plays a vital role in broad areas such as linear prediction, echo cancellation, signal restoration, channel equalisation, radar signal processing and system identification [81].

In image processing, the Wiener filter is employed to acquire an approximation of a desired or target random process by linear time-invariant filtering of an observed noisy input. Specifically, the Wiener filter minimizes the mean square error between the estimated random process and the desired process by computing the coefficients of a least-squared-error filter.

The Wiener filter approach can be characterized as follows [12]:
– Assumption: signal and (additive) noise are stationary linear stochastic processes with known spectral characteristics or known autocorrelation and cross-correlation.
– Requirement: the filter must be physically realizable/causal.
– Performance criterion: minimum mean-square error (MMSE).

In general, the block diagram of the Wiener filter problem is shown in Figure 3.2.

Figure 3.2: Block diagram of Adaptive Wiener Filter

3.1.2 Wiener Filter

The Wiener filter in 2D image processing [28] is considered in the following. Given a system:

y(n,m) = h(n,m) * x(n,m) + v(n,m)    (3.1)

Here * denotes convolution, x(n,m) is the input signal, h(n,m) is the known impulse response of a linear time-invariant system, v(n,m) is unknown additive noise independent of x(n,m), and y(n,m) is the observed signal. The deconvolution filter g(n,m) can be found by estimating x(n,m) as follows:

Figure 3.2 is contributed by user Wdwd under the CC-SA 1.2 license at http://en.wikipedia.org/wiki/File:AdaptiveFilter-C.png.


\hat{x}(n,m) = g(n,m) * y(n,m)    (3.2)

Here \hat{x}(n,m) is an estimate of x(n,m) that minimizes the mean-square error. In the frequency domain, the transfer function of g(n,m) is:

G(\omega_1,\omega_2) = \frac{H^*(\omega_1,\omega_2)\, S(\omega_1,\omega_2)}{|H(\omega_1,\omega_2)|^2\, S(\omega_1,\omega_2) + N(\omega_1,\omega_2)}    (3.3)

Here G(ω1 ,ω2 ) and H(ω1 ,ω2 ) are the Fourier transforms of g(n,m) and h(n,m), respectively, S(ω1 ,ω2 ) is the mean power spectral density of the input signal s(n,m), N (ω1 ,ω2 ) is the mean power spectral density of the noise n(n,m) and the superscript ∗ denotes complex conjugation. The equation for G(ω1 ,ω2 ) can be re-written as, 

G(\omega_1,\omega_2) = \frac{1}{H(\omega_1,\omega_2)} \left[ \frac{|H(\omega_1,\omega_2)|^2}{|H(\omega_1,\omega_2)|^2 + \frac{N(\omega_1,\omega_2)}{S(\omega_1,\omega_2)}} \right]    (3.4)

It leads to the solution for the minimum error value:

G^*(\omega_1,\omega_2)\, N(\omega_1,\omega_2) - H(\omega_1,\omega_2) \left[ 1 - G(\omega_1,\omega_2) H(\omega_1,\omega_2) \right]^* S(\omega_1,\omega_2) = 0    (3.5)
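To make Eq. (3.3) concrete, the NumPy sketch below implements the frequency-domain Wiener deconvolution filter under the simplifying assumption that the noise-to-signal power ratio N/S is a single constant; all names here are illustrative and not part of any library.

import numpy as np

def wiener_deconvolve(y, h, nsr=0.05):
    # Estimate x from y = h * x + v using Eq. (3.3), with N/S approximated by nsr.
    H = np.fft.fft2(h, s=y.shape)             # transfer function of the known PSF
    Y = np.fft.fft2(y)
    G = np.conj(H) / (np.abs(H) ** 2 + nsr)   # Eq. (3.3) divided through by S
    return np.real(np.fft.ifft2(G * Y))

rng = np.random.default_rng(0)
clean = np.zeros((64, 64)); clean[24:40, 24:40] = 1.0
psf = np.ones((3, 3)) / 9.0                   # assumed known impulse response
blurred = np.real(np.fft.ifft2(np.fft.fft2(clean) * np.fft.fft2(psf, s=clean.shape)))
noisy = blurred + 0.05 * rng.standard_normal(clean.shape)
restored = wiener_deconvolve(noisy, psf)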

3.2 Median Filter

3.2.1 Background

The median filter is a nonlinear smoothing technique. Under certain conditions, the median filter preserves edges while most other noise-removing techniques do not. Since edges are important features for the visual appearance of images, and the concept and implementation of the median filter are straightforward, it is widely used in 2D digital image processing.

For low levels of noise, such as Gaussian noise, the median filter is provably better than a Gaussian filter at removing noise while maintaining edges over a specified window size [5]. However, for high levels of noise, the performances of the median and Gaussian filters are nearly analogous, whereas for impulse noise, such as speckle and salt-and-pepper noise, the median filter is particularly efficient [4].

The median filter is a rank-selection (RS) filter, a member of the family of rank-conditioned rank-selection (RCRS) filters, which also includes the mean, minimum and maximum filters. The concept of the median filter is to run through an image pixel by pixel and replace each pixel by the median value of its neighbors in a fixed square window, namely 3 × 3, 5 × 5, 7 × 7 and so on.

3.2.2 Median Filter

The pseudo-code of the median filter is intuitive [41].

Algorithm 2 Median filtering

procedure MEDIANFILTERING(image)
  input: noisy image
  for every pixel in the image do
    sort the values in the window
    pick the median value in the sorted list
    replace the pixel value with the median one
  end for
  output: filtered image
end procedure
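For reference, the same operation is available directly in SciPy; the snippet below is a minimal sketch assuming an 8-bit grayscale frame (the stand-in random frame is only for demonstration).

import numpy as np
from scipy.ndimage import median_filter

frame = np.random.default_rng(0).integers(0, 256, (240, 320)).astype(np.uint8)  # stand-in noisy frame
denoised = median_filter(frame, size=3)  # 3 x 3 window; size=5 or 7 gives stronger smoothing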

An example is illustrated in Figure 3.3.

Figure 3.3: 3 × 3 kernel in the median filter

Figure 3.3 is contributed at http://cwcaribbean.aoml.noaa.gov/bilko/intro4.html


3.3 Non-local Means Filter

3.3.1 Background

Non-local means (NL-means) is a linear smoothing technique. NL-means filtering is an algorithm that computes, for every pixel, an average over all pixels in the image, weighted by how similar they are to the target pixel. The main difference between NL-means and other filters is the meticulous employment of all possible self-predictions an image is able to provide. Compared to local filters, which process pixels within a local square window to aim for a reconstruction of the main geometrical configurations, NL-means is more effective in post-filtering intelligibility and in preserving image details and fine structure [14]. Figures 3.4 and 3.5 demonstrate applications of NL-means in the image processing area.

Figure 3.4: Changing the content of aerial views: (a) original image; (b) denoised image

3.3.2 Non-local Means Filter

The NL-means algorithm is based on the use of "neighborhood" pixels to predict the center point. To the best of the author's knowledge, the most well-known and popular NL-means approach is the one proposed by Buades in [13] and extended in [14]. The algorithm is summarized as follows.

Figures 3.4 and 3.5 were contributed by Jean-Michel Morel at http://www.farman.ens-cachan.fr/6_JeanMichel_Morel.pdf


Figure 3.5: Removing text from images by the Efros-Leung algorithm (Ballester et al., 2003)

Step 1
Let v be a discrete noisy image; the NL-means estimate of v(i) is computed as

NL[v](i) = \sum_{j \in I} w(i, j)\, v(j)    (3.6)

Here the weights of the neighborhood windows {w(i, j)}_j depend on the correlation between the pixels i and j, and satisfy the conditions 0 ≤ w(i, j) ≤ 1 and \sum_j w(i, j) = 1.

Step 2
The similarity between two pixels i and j is determined as a decreasing function of a weighted Euclidean distance, for which the following equality, showing the robustness of the algorithm, holds:

E\|v(N_i) - v(N_j)\|^2_{2,a} = \|u(N_i) - u(N_j)\|^2_{2,a} + 2\sigma^2    (3.7)

where u denotes the underlying noise-free image, N_k denotes a square neighborhood of fixed size centered at a pixel k, and a > 0 is the standard deviation of the Gaussian kernel.


Figure 3.6: Display of the Non-local means weight distribution used to estimate the central pixel of every image. The weights go from 1(white) to zero(black) (Buades et al., 2005b)

Step 3
The similarity weights are defined as (Figure 3.6)

w(i, j) = \frac{1}{Z(i)}\, e^{-\frac{\|v(N_i) - v(N_j)\|^2_{2,a}}{h^2}}    (3.8)

Here Z(i) is the normalizing constant.

Z(i) = \sum_j e^{-\frac{\|v(N_i) - v(N_j)\|^2_{2,a}}{h^2}}    (3.9)

Here h is a degree of filtering. In short, NL-means is defined by the simple formula

NL[u](x) = \frac{1}{C(x)} \int e^{-\frac{(G_a * |u(x+\cdot) - u(y+\cdot)|^2)(0)}{h^2}}\, u(y)\, dy    (3.10)

where C(x) is a normalizing constant, G_a is a Gaussian kernel and h acts as a filtering parameter.

Figure 3.6 was contributed by Buades et al. at http://bengal.missouri.edu/~kes25c/nl2.pdf


Pseudo-code

The pseudo-code of Non-local means is illustrated in Algorithm 3 [1].

Algorithm 3 Non-local means filter
procedure NLMEANS(image)
  input: noisy image
  for every pixel x in the image do
    take a search window centered at x with size (2m + 1) × (2m + 1)    ▷ A(x, m)
    take a comparison window centered at x with size (2n + 1) × (2n + 1)    ▷ W(x, n)
    for each pixel y in A(x, m) with y ≠ x do
      compute the difference between W(x, n) and W(y, n)    ▷ d(x, y)
      if w(x, y) > wmax then
        wmax = w(x, y)
      end if
      compute the average of w(x, y)
      compute the sum of weights
    end for
    give to x the maximum of the other weights
    compute the total weights
    compute the restored value
    compute the distance
  end for
  output: filtered image
end procedure

procedure COMPUTEDISTANCE(x, y, n)
  input: x, y, n
  distancetotal = 0
  distance = (u(x) - u(y))²
  for k = 1 → n do
    for each pair of integers i = (i1, i2) such that max(|i1|, |i2|) = k do
      distance += (u(x + i) - u(y + i))²
    end for
    aux = distance / (2k + 1)²
    distancetotal += aux
  end for
  distancetotal /= n
  output: distance
end procedure
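For comparison with the pseudo-code above, OpenCV ships a fast approximation of NL-means; the minimal sketch below assumes an 8-bit grayscale frame, where h plays the role of the filtering parameter and the two window sizes correspond to W(x, n) and A(x, m) (the values shown are illustrative).

import cv2
import numpy as np

frame = np.random.default_rng(0).integers(0, 256, (240, 320), dtype=np.uint8)  # stand-in noisy frame
denoised = cv2.fastNlMeansDenoising(frame, h=10, templateWindowSize=7, searchWindowSize=21)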


3.4 Sparse 3D Transform-domain Collaborative Filter

3.4.1 Background

The Sparse 3D Transform-domain Collaborative Filter (BM3D) is currently a well-known image denoising algorithm, introduced by Dabov et al. in 2007 [19]. The basic principles of this algorithm are grouping and collaborative Wiener filtering in a two-stage estimation. There are a number of algorithms developed from the concepts of BM3D, such as [38], [34], [90], and [17]; however, BM3D is still the most successful approach, especially for low-SNR thermal-infrared images.

3.4.2 BM3D Filter

According to Dabov et al., BM3D has two major steps [19]. They are clearly shown in Figure 3.7 and Figure 3.8.

Figure 3.7: Scheme of the BM3D algorithm (Dabov et al., 2007)

The two major steps are as follows [19]:

Step 1: Basic estimate
(a) Block-wise estimates. For each patch in the noisy image:
    (i) Grouping. Identify similar blocks and collect them together.
    (ii) Collaborative hard-thresholding. Apply a 3D transform to estimate the patches' initial locations.
(b) Aggregation. Estimate the basis of the original image.

Figure 3.8: Patches, search window and overlapping (Dabov et al., 2007)

(Figure 3.7 and Figure 3.8 are adapted from the original published article of Dabov et al. [19].)

Step 2: Final estimate
Combine the basic estimate and collaborative Wiener filtering to improve the initial estimate.
(a) Block-wise estimates. For each patch in the noisy image:
    (i) Grouping. Group the noisy image and the initial estimate by the found positions.
    (ii) Collaborative Wiener filtering. Perform the inverse 3D transform after applying Wiener filtering to estimate the patches' final positions.
(b) Aggregation. Estimate the original image.

Pseudo-code

The pseudo-code of BM3D is shown in Algorithm 4 [38].


Algorithm 4 BM3D filtering
procedure BM3DFILTERING(image)
  input: noisy image
  form blocks
  for each block in the noisy image do
    group matched blocks in a 3D array
    keep the Nhard blocks closest to the processed one
  end for
  apply a 3D isometric linear transform
  apply shrinkage of the transform spectrum
  apply the inverse linear transform
  for each block do
    basic estimate
    save in buffer
  end for
  for each block in the basic estimate do
    group matched blocks in a 3D array
    keep the Nwien blocks closest to the processed one
  end for
  compute Wiener coefficients
  for each block do
    final estimate
  end for
  output: filtered image
end procedure
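As a usage sketch only (this package is an assumption here, not something the thesis uses): the open-source Python bm3d package exposes the two-stage pipeline of Algorithm 4 through a single call, where sigma_psd is the assumed noise standard deviation on the image's intensity scale.

import numpy as np
import bm3d  # third-party implementation of Dabov et al.'s algorithm (assumed installed)

noisy = np.random.default_rng(0).random((128, 128)).astype(np.float32)  # stand-in frame in [0, 1]
denoised = bm3d.bm3d(noisy, sigma_psd=0.1)  # hard-thresholding stage followed by the Wiener stage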


Chapter 4
Edge Detectors

Edge detection is a process aiming at locating points in an image at which the image brightness has discontinuities. Edge detection is a fundamental and essential tool in image processing, since edges contain a wealth of internal information about the image and detecting them reduces the amount of data by filtering out useless information. The goals of edge detection are to produce a binary image in which there are curves separating different regions, and to extract feature points, lines and so on.

With the rapid development of image processing, edge detection has become more important, and numerous techniques have been proposed over the past 50 years [21]. Early works [22, 25, 61, 76] focused on the detection of intensity or color gradients. Amongst all of them, the Canny edge detector [16], which uses a dynamic template size, outperforms the others and is widely used in recent research. More recently, works exploiting the presence of textures [52], combining low-, mid- and high-level cues [91] and computing gradients across learned sparse codes of patch gradients [88] have been proposed.

The chapter first provides a background on classical edge detection methods, Roberts cross, Sobel, Laplacian of Gaussian and Canny, in Sections 4.1, 4.2, 4.3, and 4.4, respectively. Next, a novel approach is explored in Section 4.5.

4.1 Roberts Cross Detector

4.1.1 Background

The Roberts cross operator is one of the first edge detectors and was originally proposed by Lawrence Roberts in 1963 in his PhD thesis, titled Machine Perception of Three-Dimensional Solids [64]. The purpose of the Roberts cross operator is to sum up the squares of the differences between diagonally adjacent pixels in order to estimate the gradient of an image. To compute the gradient, two 2 × 2 kernels are used to highlight changes in the diagonal directions. Its simplicity and low complexity were its most attractive features at the time. However, with the rapid development of computers in terms of speed, the Roberts cross operator is no longer widely used, since it is too sensitive to noise.

4.1.2 Roberts Cross Detector

The Roberts cross operator includes three steps.

Step 1
The Roberts cross operator begins by convolving the original image, I, with two 2 × 2 kernels:

G_x = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} * I \quad \text{and} \quad G_y = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} * I    (4.1)

Step 2
The gradient of a pixel at (x, y) is defined as

\nabla I(x, y) = G(x, y) = \sqrt{G_x^2(x, y) + G_y^2(x, y)}    (4.2)

Step 3
The direction of the gradient is computed as

\Theta(x, y) = \arctan\left(\frac{G_y(x, y)}{G_x(x, y)}\right)    (4.3)

The pseudo-code of the Roberts cross operator is shown in Algorithm 5.

Algorithm 5 Roberts cross operator
procedure ROBERTSEDGE(image)
  input: image
  for every pixel in the image do
    convolve with the two 2 × 2 kernels
    compute the gradient
    compute the direction of the gradient
  end for
  output: binary image
end procedure
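A minimal NumPy/SciPy sketch of the operator follows; the function name is illustrative and a grayscale float image is assumed.

import numpy as np
from scipy.ndimage import convolve

def roberts_cross(image):
    img = image.astype(float)
    kx = np.array([[1.0, 0.0], [0.0, -1.0]])
    ky = np.array([[0.0, 1.0], [-1.0, 0.0]])
    gx = convolve(img, kx)           # Eq. (4.1), first kernel
    gy = convolve(img, ky)           # Eq. (4.1), second kernel
    magnitude = np.hypot(gx, gy)     # Eq. (4.2)
    direction = np.arctan2(gy, gx)   # Eq. (4.3)
    return magnitude, direction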

4.2 Sobel Detector

4.2.1 Background

The Sobel filter was developed by Sobel and Feldman in 1968. In 2014, Sobel, one of the authors of the operator, reviewed his work in an article titled History and Definition of the Sobel Operator [75]. Technically, the concept of the Sobel filter is similar to that of the Roberts cross operator. Instead of using two 2 × 2 matrices, Sobel proposed convolving with two 3 × 3 matrices, in the horizontal and vertical directions, to find regions of high spatial frequency that correspond to edges. A number of articles have improved on the Sobel approach. In 2009, Jin-Yu et al. [32] improved the existing operator by finding an optimal threshold, while Wenshuo Gao and Liu suggested combining the Sobel filter with soft-threshold wavelet de-noising in 2010 [85].

4.2.2 Sobel Detector

Similar to the Roberts cross operator, the Sobel filter includes three steps.

Step 1
Convolving the original image, I, with two 3 × 3 kernels gives

G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} * I \quad \text{and} \quad G_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} * I    (4.4)


Instead of convolving with the two 3 × 3 kernels above, the Sobel operator can be computed by convolution with four decomposed (separable) kernels as follows:

\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \begin{bmatrix} -1 & 0 & 1 \end{bmatrix}
= \left( \begin{bmatrix} 1 \\ 1 \end{bmatrix} * \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right) \left( \begin{bmatrix} -1 & 1 \end{bmatrix} * \begin{bmatrix} 1 & 1 \end{bmatrix} \right)    (4.5)

and

\begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}
= \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 \end{bmatrix}
= \left( \begin{bmatrix} 1 \\ -1 \end{bmatrix} * \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right) \left( \begin{bmatrix} 1 & 1 \end{bmatrix} * \begin{bmatrix} 1 & 1 \end{bmatrix} \right)    (4.6)

Step 2
Similar to the Roberts cross operator, the gradient of the image can be computed as

\nabla I(x, y) = G(x, y) = \sqrt{G_x^2(x, y) + G_y^2(x, y)}    (4.7)

Step 3
Using this information, the gradient's direction is calculated as

\Theta(x, y) = \arctan\left(\frac{G_y(x, y)}{G_x(x, y)}\right)    (4.8)

Pseudo-code

The pseudo-code of the Sobel operator is shown below.


Algorithm 6 Sobel operator
procedure SOBELEDGE(image)
  input: image
  for every pixel in the image do
    convolve with the two 3 × 3 kernels or the decomposed kernels
    compute the gradient
    compute the direction of the gradient
  end for
  output: binary image
end procedure
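An equivalent sketch using SciPy's built-in Sobel derivatives (axis=1 gives the horizontal derivative G_x, axis=0 the vertical derivative G_y); the threshold used to binarize the magnitude is an illustrative value, not taken from the thesis.

import numpy as np
from scipy.ndimage import sobel

def sobel_edges(image, threshold=100.0):
    img = image.astype(float)
    gx = sobel(img, axis=1)           # Eq. (4.4), horizontal kernel
    gy = sobel(img, axis=0)           # Eq. (4.4), vertical kernel
    magnitude = np.hypot(gx, gy)      # Eq. (4.7)
    direction = np.arctan2(gy, gx)    # Eq. (4.8)
    return magnitude > threshold, direction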

4.3 Laplacian of Gaussian Detector

4.3.1 Background

The Laplacian of Gaussian is a 2-D isotropic operator approximating the second spatial derivative of an image. The Laplacian detector is normally applied after an image has been smoothed [50], with the purpose of reducing the effects of noise. The Laplacian detector is effective in blob detection.

4.3.2 Laplacian of Gaussian Detector

Given an image, I, the Laplacian is computed by

L(x, y) = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}    (4.9)

Step 1
This step can be done with one of three traditional convolution filters:

L = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{bmatrix} * I \quad \text{or} \quad L = \begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix} * I \quad \text{or} \quad L = \begin{bmatrix} -1 & 2 & -1 \\ 2 & -4 & 2 \\ -1 & 2 & -1 \end{bmatrix} * I    (4.10)


Step 2
To reduce the complexity, it is common to convolve the Laplacian with a Gaussian kernel before processing the image. The 2D LoG with standard deviation σ is represented as

LoG(x, y) = -\frac{1}{\pi\sigma^4}\left(1 - \frac{x^2 + y^2}{2\sigma^2}\right) e^{-\frac{x^2 + y^2}{2\sigma^2}}    (4.11)
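As a sketch, Eq. (4.11) can be sampled directly on a discrete grid to build a LoG convolution kernel; the kernel size, σ and the zero-mean correction below are illustrative choices, not taken from the thesis.

import numpy as np

def log_kernel(size=9, sigma=1.4):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x ** 2 + y ** 2
    k = -(1.0 / (np.pi * sigma ** 4)) * (1 - r2 / (2 * sigma ** 2)) * np.exp(-r2 / (2 * sigma ** 2))
    return k - k.mean()  # force a zero response on uniform regions

kernel = log_kernel()  # convolve with the image, then look for zero crossings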

A discrete approximation kernel with standard deviation σ = 1.4 is shown in Figure 4.1.

Figure 4.1: Discrete approximation to the LoG function with Gaussian standard deviation σ = 1.4 (figure contributed at http://www.math.tau.ac.il/~turkel/notes/Maini.pdf)

Pseudo-code

To summarize, the pseudo-code of LoG is represented in Algorithm 7.

Algorithm 7 Laplacian of Gaussian
procedure LOG(image)
  input: image
  convolve the Gaussian filter with the Laplacian kernel
  for every pixel in the image do
    convolve with the Laplacian of Gaussian kernel
  end for
  output: binary image
end procedure

4.4 Canny Detector

4.4.1 Background

The Canny edge detector was developed by Canny in 1986 in a paper named "A Computational Approach to Edge Detection" [16]. This technique is considered one of the most successful edge detection methods and is widely used in engineering applications. Canny defined three criteria of good edge detection: good detection, the ability of an algorithm to mark actual edges; good localization, which shows how near a marked edge is to the real edge; and minimal response, meaning that noise should not construct false edges. The Canny edge detector was enhanced by Bao et al., who improved localization using scale multiplication in 2005 [8]. In 2009, an improved Canny edge detection was proposed by Wang [84]. Specifically, Wang propounded using a self-adaptive filter instead of a Gaussian one to refine the original approach.

4.4.2 Canny Detector

The Canny algorithm is demonstrated in six major steps.

Step 1
Since the noisy, unprocessed raw image is the input, the Canny edge detector uses Gaussian blur to filter out noise. A convolution mask, which is much smaller than the raw image, is slid over the image. The greater the size of the Gaussian kernel, the less sensitive the detector is to noise. Hence, the choice of window size is very important. An example of a 5 × 5 Gaussian kernel is

K = \frac{1}{159}\begin{bmatrix} 2 & 4 & 5 & 4 & 2 \\ 4 & 9 & 12 & 9 & 4 \\ 5 & 12 & 15 & 12 & 5 \\ 4 & 9 & 12 & 9 & 4 \\ 2 & 4 & 5 & 4 & 2 \end{bmatrix}    (4.12)


Step 2

After smoothing the raw image, the gradient and gradient direction of each pixel are computed. This step is similar to the Sobel method:

G_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix} * I \quad\text{and}\quad G_y = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix} * I    (4.13)

Step 3

The gradient magnitude is computed as:

∇I(x, y) = G(x, y) = \sqrt{G_x^2(x, y) + G_y^2(x, y)}    (4.14)

Step 4

The gradient direction follows as:

Θ(x, y) = \arctan\left( \frac{G_y(x, y)}{G_x(x, y)} \right)    (4.15)

The direction is rounded to one of four feasible angles, namely 0, 45, 90 and 135 degrees, as illustrated in Figure 4.2.

Figure 4.2: Four directions of two sets of pixels in a 3×3 mask (Oskoei and Hu, 2010; adapted from http://cswww.essex.ac.uk/staff/hhu/Papers/CES-506.pdf)

Step 5

After the gradient direction is found, non-maximum suppression is applied to acquire thin lines in the binary image. Non-maximum suppression considers every pixel given the image gradients [87].

Step 6

The final step of the Canny detector is hysteresis, which involves a high and a low threshold. Specifically, a pixel is considered to belong to an edge if its gradient is higher than the high threshold, and not to belong to an edge if its gradient is smaller than the low threshold. If a pixel gradient lies between the two thresholds, it is accepted as belonging to an edge if and only if it is connected to a pixel whose gradient is above the high threshold.
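A minimal Python/OpenCV sketch of the six-step pipeline above is given here for orientation only; cv2.Canny internally performs the gradient, non-maximum suppression and hysteresis steps, and the two thresholds are illustrative assumptions rather than the values tuned in this work.

```python
import cv2

def canny_binary(gray_u8, low=40, high=120, ksize=5):
    """Step 1: Gaussian blur; Steps 2-6: gradients, non-maximum suppression
    and hysteresis, all handled inside cv2.Canny."""
    blurred = cv2.GaussianBlur(gray_u8, (ksize, ksize), 0)
    return cv2.Canny(blurred, low, high)
```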

Pseudo-code The pseudo-code of Canny edge detector is shown in Algorithm 8.

Algorithm 8 Canny edge detector
1: procedure cannyedge(image)
2:   input: image
3:   apply Gaussian blur to remove noise
4:   for every pixel in the image do
5:     compute gradient
6:     compute direction of gradient
7:   end for
8:   quantize gradient direction      ▷ 0, 45, 90 or 135 degrees
9:   for every pixel in the image do
10:    suppress non-maximum
11:  end for
12:  for every pixel in the image do
13:    if gradient is greater than high threshold then
14:      this pixel is included in binary image
15:    else if gradient is less than low threshold then
16:      this pixel is NOT included in binary image
17:    else
18:      if connected to a pixel having gradient greater than high threshold then
19:        this pixel is included in binary image
20:      else
21:        this pixel is NOT included in binary image
22:      end if
23:    end if
24:  end for
25:  output: binary image
26: end procedure

4.5 Enhanced Canny Detector - A Smoothness-based Detector

A novel edge detection method is proposed to improve on the existing Canny edge detector. This algorithm enhances the robustness and quality of the output binary image by removing most of the false edges generated by noise. The technique is shown to be effective in low-SNR video.

4.5.1 Outline

The proposed detector has been implemented in Matlab using a combination of open-source libraries provided by the Matlab community. It has been designed as a process which takes a raw image as input and outputs a binary image with a collection of indexed tracked edges. The proposed technique is inspired by the Canny edge detector [16] and the Global and Local Curvature Properties Detector [30]. The performance is promising, since most of the false edges are filtered out.

4.5.2 Algorithm

There are seven major steps in the proposed smoothness-based detector.


Step 1

The first step is to denoise the image with a Gaussian kernel or a well-engineered noise reduction filter, such as NL-means or BM3D, discussed in Chapter 3. The output after Step 1 is demonstrated in Figure 4.3 and Figure 4.4. For the high-SNR image (Figure 4.3), there is no big difference between the raw and denoised images. However, in the low-SNR scenario (Figure 4.4), BM3D gives the best output and uniformly performs better than the others, because its output contains structured objects which allow productive grouping and collaborative filtering.

Figure 4.3: Outputs of proposed edge detection after Step 1 for high SNR image — (a) raw image, (b) grayscale image, (c) denoised by NL-means, (d) denoised by BM3D

Step 2

The second step is to apply the Canny algorithm (see Section 4.4) to detect edges. Figures 4.5 and 4.6 show the binary images after denoising. BM3D once again shows its effectiveness in terms of preservation of details and reduction of noise.


Figure 4.4: Outputs of proposed edge detection after Step 1 for low SNR image — (a) raw image, (b) grayscale image, (c) denoised by NL-means, (d) denoised by BM3D

Step 3

Next, the binary edge map found in Step 2 is the input of Step 3 [30], in order to extract contours as in the CSS method [63]. After that, some steps must be followed in order to find true corners and remove false ones; details are given in Section 5.4. According to Rattarangsi and Chin, the six steps of the CSS method are [30]:
(a) Detect the binary map.
(b) Extract edges.
(c) Compute the curvature for each edge.
(d) Remove round corners by comparing to an adaptive threshold.
(e) Remove false corners.
(f) Set end points as corners.
True corners in different types of images are shown in Figures 4.7 and 4.8.


Figure 4.5: Outputs of proposed edge detection after Step 2 for high SNR image. The inputs of (a) and (b) are taken from Step 1 using NL-means and BM3D, respectively

Figure 4.6: Outputs of proposed edge detection after Step 2 for low SNR image. The inputs of (a) and (b) are taken from Step 1 using NL-means and BM3D, respectively

Step 4

This step is to link the edges provided by Step 3 and update the corners accordingly. There are three types of linking:
– Type 1: the ending of an edge to the ending of another edge. This type of linking combines two edges into a longer one.
– Type 2: the ending of an edge to a mid point of another edge. This type creates a T-junction.
– Type 3: the beginning to the ending of an edge. This type of linking changes the curve mode from line to loop.

Figure 4.7: Outputs of proposed edge detection after Step 3 for high SNR image. The inputs of (a) NL-means and (b) BM3D are taken from Step 2

Figure 4.8: Outputs of proposed edge detection after Step 3 for low SNR image. The inputs of (a) NL-means and (b) BM3D are taken from Step 2

To recall, a contour A_j = \{P_1^j, P_2^j, \ldots, P_N^j\} is determined to be closed (loop) or open (line):

A_j is closed if |P_1^j P_N^j| < T, and open if |P_1^j P_N^j| > T.

By default, T is set at 2-3 pixels. The algorithm of Step 4 is shown in Algorithm 9.

Step 5

The next step is to compute the edges' metrics. This step is described as follows.


Algorithm 9 Link edges
1: procedure linkedge
2:   input: binary image, GAP_LINK
3:   extract contours from binary image
4:   for every edge in the image do
5:     for each point A in the edge do
6:       if there exists a pixel B where the distance A-B is smaller than GAP_LINK then
7:         if A and B are beginning points of different edges then      ▷ Type 1
8:           update edges
9:           update corners
10:        else if A is a mid point and B is a beginning point then      ▷ Type 2
11:          update edges
12:          update corners
13:        else if A and B are the beginning and ending points of one edge then      ▷ Type 3
14:          update curve-mode to loop
15:          update edges
16:          update corners
17:        end if
18:      end if
19:    end for
20:  end for
21:  output: binary image with contours
22: end procedure

Firstly, for the given contours and corners from Step 4, each corner must be assigned to the contour it lies on. At the end of this sub-step, a table listing the edges with the corners lying on them is expected. Since end points are selected as corners [30], every edge has at least two corners. For each edge, its metric is computed from its corners. An edge metric depends on four aspects:
– Length of the edge. The longer the edge, the less likely it was generated by noise.
– Smoothness of the edge. An edge generated by a real object is likely to be smoother than one generated by noise.
– Number of corners lying on the edge.
– Curve mode (line or loop). This factor represents how much a loop is preferred over a line.

Consider an N-pixel edge

A_j = \{P_1^j, P_2^j, \ldots, P_N^j\}

with M corners in order

C_j = \{C_1^j, C_2^j, \ldots, C_M^j\}

Clearly M ≤ N, P_1^j = C_1^j and P_N^j = C_M^j. The strategy for computing the edge-score is summarized as follows.

(a) From two adjacent corners, extract the small curve between them.
(b) Fit the extracted curve to a cubic model:

f(x) = ax^3 + bx^2 + cx + d    (4.16)

(c) Compute the sum of squared residuals:

RSS_j = \sum_{i=1}^{N} (y_i^j - f_j(x_i))^2    (4.17)

where y_i is the real index (of the extracted edge) and f(x_i) is the estimated index (of the fitted model).
(d) Compute the sum of the residual sums over all divided parts:

RSS = \sum_{j=1}^{M-1} RSS_j    (4.18)

(e) Compute the average residual (AR):

AR = \frac{RSS}{N}    (4.19)

(f) The Edge-Score (ES) is computed as:

ES = \frac{AR \times W \times L \times M^2}{N^2 \times N_0 \times \Phi}    (4.20)

Here W and L are the width and length of the image, respectively; N is the number of pixels of the edge; M is the number of corners lying on the edge; AR is the average residual; N_0 is the total number of pixels of all edges connected to this edge, including this edge itself; and Φ is the loop-line ratio.
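To make the scoring concrete, the sketch below (Python, not the Matlab implementation of this work) evaluates Equations (4.16)-(4.20) for a single edge; the inputs n_connected (N_0) and phi (Φ) are assumed to be computed elsewhere.

```python
import numpy as np

def edge_score(points, corner_idx, img_w, img_h, n_connected, phi):
    """points: (N, 2) array of (x, y) pixels along one edge; corner_idx:
    indices of its M corners in order along the edge."""
    n_pixels, n_corners = len(points), len(corner_idx)
    rss = 0.0
    for a, b in zip(corner_idx[:-1], corner_idx[1:]):
        seg = points[a:b + 1].astype(float)
        x, y = seg[:, 0], seg[:, 1]
        if len(seg) >= 4:                                     # cubic fit needs 4 points
            coeffs = np.polyfit(x, y, 3)                      # Eq. (4.16)
            rss += np.sum((y - np.polyval(coeffs, x)) ** 2)   # Eq. (4.17)
    ar = rss / n_pixels                                       # Eq. (4.19)
    return (ar * img_w * img_h * n_corners ** 2) / (n_pixels ** 2 * n_connected * phi)  # Eq. (4.20)
```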

Step 6 Next, edges are ranked based on their metrics. The smaller the metric, the better the edge.

Step 7

Finally, the E_T strongest edges are kept and the weaker ones are removed. E_T is a threshold on the number of edges to be kept.

Pseudo-code

The flowchart and pseudo-code of the proposed Smoothness-based Detector are shown in Figure 4.9 and Algorithm 10, respectively.



Algorithm 10 Edge Detection: Proposed Smoothness-based Detector
1: procedure smoothnessbasededge(image)
2:   input: image, E_T - edge threshold
3:   filter noise
4:   apply Canny edge detection to find binary image
5:   apply Global and Local Curvature Properties Detector to find corners
6:   link edges
7:   find which corner belongs to which edge
8:   for every edge in the provided binary image do
9:     extract this edge into M − 1 parts
10:    for every part do
11:      fit this part to a cubic polynomial      ▷ ax^3 + bx^2 + cx + d
12:      find a, b, c and d
13:      estimate y index given x index
14:      compute sum of squared residuals
15:    end for
16:    compute sum of all residual sums
17:    compute average residual
18:    compute edge-score
19:  end for
20:  rank edges
21:  if E_T ≥ N (number of edges) then
22:    keep all edges
23:  else
24:    keep E_T strongest edges
25:  end if
26:  output: binary image
27: end procedure



Figure 4.9: Flowchart of Enhanced Edge Detector


Chapter 5

Feature Detectors

Corner detection is a technique used in image processing to detect interest points that infer the contents of an image. These points must be uniquely recognizable, since they represent the intersection of two edges. Intuitively, corner points are junctions of contours which show large differences in all directions within the neighborhood. According to Tuytelaars and Mikolajczyk, good corner points should have the following properties [80]:
– Repeatability. Given two images of the same scene taken from different angles, a good corner detector should find a high percentage of interest points corresponding to the same real-world locations.
– Distinctiveness/informativeness. Local features should be sufficiently informative.
– Locality. The corner points should depend on a small spatial area.
– Quantity. The number of detected points should be adequately large.
– Accuracy. Local features should be accurately localized.
– Efficiency. A corner detector should be fast enough to be applicable.
Figure 5.1 shows a visual representation of how a good corner point is defined (figure contributed by Kim at http://cdn.intechopen.com/pdfs/41040/InTechRobust_corner_detection_by_image_based_direct_curvature_field_estimation_for_mobile_robot_navigation.pdf). Corners detected by a feature detector are shown in Figure 5.2 (contributed by user Retardo at http://en.wikipedia.org/wiki/File:Corner.png).

Figure 5.1: Requirements of good corner detectors (Kim, 2012)

Figure 5.2: Example of feature points

The structure of this chapter is as follows. First, it provides a background of the Harris detector in Section 5.1. Next, SUSAN, Features from Accelerated Segment Test (FAST), and the Global and Local Curvature Properties Detector (GLCPD) are reviewed in Sections 5.2, 5.3 and 5.4, respectively. Finally, a novel approach, which is inspired by Canny detection and GLCPD, is discussed in Section 5.5.


5.1 Harris Detector

5.1.1 Background

In 1980, one of the first interest point detectors was developed by Moravec [56]. His detector is an intensity-based method built on the auto-correlation of the signal. Four discrete windows, shifted in directions parallel to the rows and columns, are employed to compute the gray-value differences; if the minimum of these four differences is above a threshold, a feature point is identified. Instead of using discrete patches as Moravec did, Harris and Stephens considered the differential of the corner score with respect to direction directly [29]. Similar to the Moravec detector, Harris also uses a local auto-correlation function to determine the local changes in an image. Despite the appearance of a large number of corner detectors, the Harris detector is widely utilized owing to its strong invariance to rotation, scale, illumination variation and image noise [69].

5.1.2 Harris Detector

Let I be a grayscale two-dimensional image.

Step 1

The weighted sum of squared differences (SSD) between two patches, proposed by Harris, is defined as:

E(x, y) = \sum_{u} \sum_{v} w(u, v) \, \big( I(x + u, y + v) - I(u, v) \big)^2    (5.1)

Step 2

A Taylor expansion is employed to approximate I(x + u, y + v); only the first partial derivatives are kept:

I(u + x, v + y) ≈ I(u, v) + I_x(u, v)\,x + I_y(u, v)\,y    (5.2)


This produces:

E(x, y) ≈ \sum_{u} \sum_{v} w(u, v) \, \big( I_x(u, v)\,x + I_y(u, v)\,y \big)^2    (5.3)

which can be represented as:

E(x, y) ≈ (x \;\; y) \, A \begin{pmatrix} x \\ y \end{pmatrix}    (5.4)

where A is the second-moment matrix (structure tensor), computed as:

A = \sum_{u} \sum_{v} w(u, v) \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix} = \begin{pmatrix} \langle I_x^2 \rangle & \langle I_x I_y \rangle \\ \langle I_x I_y \rangle & \langle I_y^2 \rangle \end{pmatrix}

Step 3

With this information, the corner response is measured as:

M_c = λ_1 λ_2 − κ (λ_1 + λ_2)^2 = \det(A) − κ \,\mathrm{trace}^2(A)    (5.5)

Here κ is a tunable sensitivity parameter, typically in the range 0.04-0.15. By examining the eigenvalues of A (Figure 5.3), the following inferences can be made:
(a) If λ_1 ≈ 0 and λ_2 ≈ 0, this pixel is not a feature point.
(b) If λ_1 ≈ 0 and λ_2 >> λ_1, this pixel belongs to an edge.
(c) If λ_1 and λ_2 are both large, this pixel is a feature point.
In 1994, Shi and Tomasi concluded that using the smallest eigenvalue of A provides a better measure, under the assumption of affine image deformation [70]:

M_c = \min(λ_1, λ_2)    (5.6)

(Figure 5.3 is contributed by Robert Collins at http://www.cse.psu.edu/ rcollins/CSE486/lecture06.pdf)



Figure 5.3: Classification of image points using the eigenvalues of M

Pseudo-code

The pseudo-code of Harris feature detector is introduced in Algorithm 11.

Algorithm 11 Harris feature detector
1: procedure harrisdetector(image)
2:   input: image
3:   compute x derivative of image      ▷ I_x = G_σ^x * I
4:   compute y derivative of image      ▷ I_y = G_σ^y * I
5:   for every pixel in the image do
6:     compute products of x derivatives      ▷ I_x² = I_x · I_x
7:     compute products of y derivatives      ▷ I_y² = I_y · I_y
8:     compute products of x and y derivatives      ▷ I_xy = I_x · I_y
9:     compute sums of products of x derivatives      ▷ S_x² = G_σ * I_x²
10:    compute sums of products of y derivatives      ▷ S_y² = G_σ * I_y²
11:    compute sums of products of x and y derivatives      ▷ S_xy = G_σ * I_xy
12:    form the matrix H(x, y) = [ S_x²(x, y)  S_xy(x, y) ; S_xy(x, y)  S_y²(x, y) ]
13:    compute the response of the detector      ▷ R = det(H) − k · trace²(H)
14:  end for
15:  threshold on R
16:  apply non-maximum suppression
17:  find feature points
18:  output: binary image
19: end procedure
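As a usage sketch (OpenCV's built-in Harris response rather than the Matlab code of this thesis), the following Python fragment computes R for every pixel and keeps the strongest responses; the 0.01 relative threshold is an assumption and non-maximum suppression is omitted.

```python
import cv2
import numpy as np

def harris_corners(gray_u8, k=0.04, block_size=3, ksize=3, top_n=100):
    """Harris response via cv2.cornerHarris, then a simple relative threshold."""
    response = cv2.cornerHarris(np.float32(gray_u8), block_size, ksize, k)
    ys, xs = np.where(response > 0.01 * response.max())
    order = np.argsort(response[ys, xs])[::-1][:top_n]   # strongest first
    return list(zip(xs[order], ys[order]))
```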


5.2 Susan Detector

5.2.1 Background

The Smallest Univalue Segment Assimilating Nucleus (SUSAN) corner detector was proposed by Smith and Brady in 1997 [74]. In this approach, a mask is used to compare the brightness of pixels within an area. This area is known as the Univalue Segment Assimilating Nucleus (USAN).

Figure 5.4: Four circular masks at different places on a simple image (Liu et al., 2009)

Instead of making assumptions about the form of local and internal image structures, SUSAN studies different regions, uses direct local computations, and finds locations where individual region boundaries have high curvature. In the early literature, corner finding is considered 2D feature detection, while edge finding is considered 1D feature detection.

5.2.2 Susan Detector

According to Smith and Brady, feature points are detected in five steps.

Step 1

In order to obtain an isotropic response, Smith and Brady proposed to compare every pixel within a circular mask against the mask's nucleus using the following function:

c(\vec{m}) = e^{-\left( \frac{I(\vec{m}) - I(\vec{m}_0)}{t} \right)^6}    (5.7)

where I(\vec{m}) is the brightness of any pixel within the mask, I(\vec{m}_0) is the brightness of the nucleus (the mask centre), and t is the brightness difference threshold.

Step 2

The next step is to compute the number of pixels lying within the circular mask by calculating the area of the USAN:

n(M) = \sum_{\vec{m} \in M} c(\vec{m})    (5.8)

Step 3

Then, the edge strength image is computed by subtracting the USAN size from the geometric threshold, according to the following rule:

R(M) = \begin{cases} g - n(M) & \text{if } n(M) < g \\ 0 & \text{otherwise} \end{cases}    (5.9)

Step 4

Find edge direction by moment calculations.

Step 5

Apply non-maximum suppression to thin edge image.

Pseudo-code

The pseudo-code of the SUSAN feature detector is introduced in Algorithm 12.


Algorithm 12 Susan feature detector
1: procedure susandetector(image)
2:   input: image
3:   for every pixel in the image do
4:     place a circular mask around the pixel
5:     find the number of pixels in the USAN having similar brightness
6:   end for
7:   subtract the USAN size from the geometric threshold (set lower than when finding edges) to produce a corner strength image
8:   test for false positives by finding the USAN's centroid and its contiguity
9:   use non-maximum suppression to find corners
10:  output: binary image
11: end procedure
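The USAN computation of Equations (5.7)-(5.8) can be sketched as follows (illustrative Python only); the mask radius and the brightness threshold t are assumed values.

```python
import numpy as np

def usan_area(gray, x0, y0, radius=3, t=27.0):
    """Sum the comparison function c(m) of Eq. (5.7) over a circular mask
    centred on the nucleus (x0, y0), giving n(M) of Eq. (5.8)."""
    h, w = gray.shape
    nucleus = float(gray[y0, x0])
    area = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx * dx + dy * dy > radius * radius:
                continue                      # outside the circular mask
            y, x = y0 + dy, x0 + dx
            if 0 <= y < h and 0 <= x < w:
                area += np.exp(-((float(gray[y, x]) - nucleus) / t) ** 6)
    return area
```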

5.3 Features from Accelerated Segment Test Detector

5.3.1 Background

FAST was published at the European Conference on Computer Vision by Rosten and Drummond in an article titled "Machine learning for high-speed corner detection" [66]. The technique was improved by Rosten et al. in "Faster and better: A machine learning approach to corner detection" in 2010 [67]. FAST has been cited in over 1300 articles and has become one of the state-of-the-art feature detectors due to its suitability for high-speed, real-time frame-rate applications. As its name suggests, FAST takes less time and fewer computational resources than most existing approaches.

5.3.2 FAST Detector

According to Rosten and Drummond, feature points are detected as follows.

Step 1

Select a pixel p with intensity I_p.

Step 2

Choose a threshold t.


Step 3

Consider the 16 pixels on a circle around p, as shown in Figure 5.5.

Figure 5.5: FAST point processing (Rosten and Drummond, 2006; image from http://www.edwardrosten.com/work/fast.html)

Step 4

p is a feature point if there are n contiguous points lying on the circle (the dashed line in Figure 5.5) that are all brighter than I_p + t or all darker than I_p − t. n is normally chosen as 12.

Step 5

Rosten and Drummond also proposed a high-speed test to check whether a pixel is a corner or not. Specifically, only four points around p, numbered 1, 5, 9 and 13, are examined; p is considered a corner only if at least three of these points are brighter than I_p + t or darker than I_p − t.

Pseudo-code

The pseudo-code of Features from Accelerated Segment Test is introduced in Algorithm 13.


Algorithm 13 FAST detector
1: procedure fastdetector(image)
2:   input: image
3:   choose threshold t
4:   for every pixel p in the image do
5:     high-speed test
6:     if 3 of the 4 test points are brighter than I_p + t or darker than I_p − t then
7:       consider the 16 points around p
8:       if 12 points are brighter than I_p + t or darker than I_p − t then
9:         p is a corner
10:      else
11:        p is not a corner
12:      end if
13:    else
14:      p is not a corner
15:    end if
16:  end for
17:  output: feature points
18: end procedure
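For reference, FAST is available directly in OpenCV; the short Python sketch below (an illustration, not the thesis implementation) detects keypoints with the segment test and keeps the strongest ones. The threshold value is an assumption.

```python
import cv2

def fast_keypoints(gray_u8, threshold=20, top_n=100):
    """FAST detection with non-maximum suppression; keypoints sorted by response."""
    detector = cv2.FastFeatureDetector_create(threshold=threshold, nonmaxSuppression=True)
    keypoints = detector.detect(gray_u8, None)
    keypoints.sort(key=lambda kp: kp.response, reverse=True)
    return keypoints[:top_n]
```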

5.4 Global and Local Curvature Properties Detector

5.4.1 Background

GLCPD was published in the Optical Engineering journal in May 2008, proposed by He and Yung in an article named "Corner detector based on global and local curvature properties" [30]. GLCPD performs well in terms of feature correspondence and subjective evaluation. The algorithm was developed from an earlier technique, Curvature Scale Space (CSS), proposed by Mokhtarian and Suomela in 1998 [55]. Compared to the original CSS method, GLCPD demonstrates the ability to detect true corners and eliminate false ones. In addition, GLCPD outperforms the technique that inspired it by improving localization within a small neighborhood. According to He and Yung, the main contributions of their approach are to detect feature points by computing global and local curvature properties, to distinguish round corners from obtuse corners, and to parameterize the approach. GLCPD shows promising performance, in spite of its dependence on an edge detector, namely the Canny method, in the preprocessing step. Intuitively speaking, GLCPD is a powerful technique with regard to its capabilities of distinguishing true corners and minimizing localization errors.


5.4.2 Global and Local Curvature Properties Detector

He and Yung developed GLCPD with the purpose of utilizing global and local curvature properties. With this philosophy, a new procedure is proposed based on [55].

Step 1

Apply Canny detector to obtain a binary edge map. A comprehensive process of Canny detector is shown in Section 4.4.

Step 2

According to the CSS method proposed by Mokhtarian and Suomela, contours are extracted by Algorithm 14.

Algorithm 14 CSS algorithm
procedure extractcontour(binary image)
  input: binary image
  for every point not yet marked do
    store index
    set current point to this point
    while not connected to an end-point do
      fill the gap
      change current point to the filled point
    end while
    if connected to an end-point then
      extract contour
    end if
    if the end-point is connected to an extracted edge then
      set as T-junction
    else
      set as end-point
    end if
  end for
  output: contours
end procedure



Step 3

Next, He and Yung proposed to compute the curvature value of each pixel of the contours in order to find the true corners. The curvature of pixel i in contour j is given by:

K_i^j = \frac{\Delta x_i^j \, \Delta^2 y_i^j - \Delta^2 x_i^j \, \Delta y_i^j}{\left[ (\Delta x_i^j)^2 + (\Delta y_i^j)^2 \right]^{1.5}} \quad \text{for } i = 1, 2, \ldots, N    (5.10)

Step 4

According to He and Yung, the Region of Support (ROS) is defined as the portion of a contour bounded by the two nearest curvature minima. Consequently, the adaptive threshold in a ROS is calculated as:

T(u) = R \times \bar{K} = R \times \frac{1}{L_1 + L_2 + 1} \sum_{i = u - L_2}^{u + L_1} |K_i|    (5.11)

where L_1 + L_2 is the size of the ROS centered at u, R denotes the minimum ratio of the major axis to the minor axis of an ellipse (the default value of R is 1.5), and \bar{K} is the mean curvature of the ROS. Round corners are removed if their curvature values are smaller than the adaptive thresholds in the ROS; otherwise, the corners are defined as true corners.
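A small Python sketch of Equations (5.10)-(5.11) is given below for clarity; it approximates the discrete differences with numpy.gradient and assumes the ROS bounds L1 and L2 are supplied by the caller.

```python
import numpy as np

def contour_curvature(xs, ys):
    """Discrete curvature of Eq. (5.10) along one contour."""
    dx, dy = np.gradient(xs.astype(float)), np.gradient(ys.astype(float))
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    return (dx * ddy - ddx * dy) / (np.power(dx ** 2 + dy ** 2, 1.5) + 1e-12)

def adaptive_threshold(curvature, u, l1, l2, r=1.5):
    """Adaptive threshold T(u) of Eq. (5.11) over the region of support."""
    lo, hi = max(0, u - l2), min(len(curvature), u + l1 + 1)
    return r * np.mean(np.abs(curvature[lo:hi]))
```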

Step 5

The next step is to remove false corners. He and Yung note that a well-defined corner should have a relatively sharp angle; obtuse corners greater than θ_obtuse = 162° should therefore be eliminated. To this end, He and Yung determine the angle of an arbitrary point as:

∠C = \begin{cases} |γ_1 − γ_2| & \text{if } |γ_1 − γ_2| < π \\ 2π − |γ_1 − γ_2| & \text{otherwise} \end{cases}    (5.12)

where γ_1 and γ_2 are the tangents from this point to the two end-points of the ROS.


Step 6

Finally, a contour A_j = \{P_1^j, P_2^j, \ldots, P_N^j\} is determined to be closed (loop) or open (line):

A_j is closed if |P_1^j P_N^j| < T, and open if |P_1^j P_N^j| > T.

Pseudo-code

To summarize, the pseudo-code of the GLCPD algorithm is illustrated in Algorithm 15.

Algorithm 15 GLCP detector
1: procedure curvaturecorner(image)
2:   input: an image
3:   extract binary image by Algorithm 8
4:   extract contours from binary edge map by Algorithm 14
5:   for every pixel in the contours do
6:     compute curvature value by Equation 5.10      ▷ K_i^j
7:   end for
8:   for each pixel in the contours do
9:     compute adaptive threshold in ROS by Equation 5.11      ▷ T(u)
10:    if K_i^j < T(u) then
11:      remove round corner
12:    end if
13:  end for
14:  for each corner do
15:    compute angle of corner      ▷ ∠C
16:    if ∠C > θ_obtuse then
17:      remove false corner
18:    end if
19:  end for
20:  output: corner points
21: end procedure


5.5 Enhanced Curvature Detector

A novel feature detection method is proposed to improve on the existing GLCPD developed by He and Yung [30]. This algorithm enhances the robustness and quality of the corners by removing the weaker ones based on a proposed ranking method. The technique is shown to significantly outperform conventional approaches.

5.5.1 Outline

The proposed detector has been implemented in Matlab using a combination of open-source libraries provided by the Matlab community. It has been designed as a process which takes a raw image as input and outputs a list of indexed corners. The proposed technique is inspired by the Canny edge detector [16] and the Global and Local Curvature Properties Detector [30].

5.5.2 Algorithm

There are nine major steps in the enhanced curvature detector. Steps 1-6 of the proposed corner ranking are similar to Steps 1-6 of the proposed Smoothness-based Detector described in Section 4.5.

Step 1 The first step is to filter out any noise using Gaussian kernel or well-engineered noise reduction filters, such as NL-means and BM3D, which were mentioned in Chapter 3.

Step 2 The second step is to apply Canny edge detector (refer to Section 4.4) to detect edges.

Step 3

Next, the binary edge map found in Step 2 is the input of Step 3 [30], in order to extract contours as in the CSS method [63]. After that, some steps must be followed in order to find true corners and remove false ones. Details are given in Section 5.4.

Step 4

This step is to link the edges provided by Step 3 and update the corners accordingly. There are three types of linking:
– Type 1: the ending of an edge to the ending of another edge. This type of linking combines two edges into a longer one.
– Type 2: the ending of an edge to a mid point of another edge. This type creates a T-junction.
– Type 3: the beginning to the ending of an edge. This type of linking changes the curve mode from line to loop.

Step 5

The next step is to compute the edges' metrics. This step is described as follows. For the given contours and corners from Step 4, the assignment of each corner to the contour it lies on is found. At the end of this sub-step, a table listing the edges with the corners lying on them is expected. Since end points are selected as corners [30], every edge has at least two corners. For each edge, its metric is calculated from its corners. An edge metric depends on four aspects:
– Length of the edge. The longer the edge, the less likely it was generated by noise.
– Smoothness of the edge. An edge generated by a real object is likely to be smoother than one generated by noise.
– Number of corners lying on the edge.
– Curve mode (line or loop). This factor represents how much a loop is preferred over a line.

Consider an N-pixel edge

A_j = \{P_1^j, P_2^j, \ldots, P_N^j\}

with M corners in order

C_j = \{C_1^j, C_2^j, \ldots, C_M^j\}

Clearly M ≤ N, P_1^j = C_1^j and P_N^j = C_M^j. The strategy for computing the edge-score is summarized as follows.

(a) From two adjacent corners, extract the small curve between them.
(b) Fit the extracted curve to a cubic model:

f(x) = ax^3 + bx^2 + cx + d    (5.13)

(c) Compute the sum of squared residuals:

RSS_j = \sum_{i=1}^{N} (y_i^j - f_j(x_i))^2    (5.14)

where y_i is the real index (of the extracted edge) and f(x_i) is the estimated index (of the fitted model).
(d) Compute the sum of the residual sums over all divided parts:

RSS = \sum_{j=1}^{M-1} RSS_j    (5.15)

(e) Compute the average residual (AR):

AR = \frac{RSS}{N}    (5.16)

(f) The Edge-Score (ES) is computed as:

ES = \frac{AR \times W \times L \times M^2}{N^2 \times N_0 \times \Phi}    (5.17)

where W and L are the width and length of the image, respectively; N is the number of pixels of the edge; M is the number of corners lying on the edge; AR is the average residual; N_0 is the total number of pixels of all edges connected to this edge, including this edge itself; and Φ is the loop-line ratio.


Step 6

Next, edges are ranked based on their metrics.

Step 7

The next step is to compute the corners' metrics. This step is described as follows. From Step 5 above, a list categorizing which corner belongs to which contour is available. In addition, the gradient value of each pixel was computed in Equation 4.14 of Step 2, and from Step 3 the curvature value of each corner is given by:

K_i^j = \frac{\Delta x_i^j \, \Delta^2 y_i^j - \Delta^2 x_i^j \, \Delta y_i^j}{\left[ (\Delta x_i^j)^2 + (\Delta y_i^j)^2 \right]^{1.5}} \quad \text{for } i = 1, 2, \ldots, N

For each corner, based on the contour it belongs to, the corner's metric is evaluated. In general, a corner metric depends on four aspects:
– Its contour's metric, ES.
– Its curvature value, K_i^j.
– Its gradient value, ∇I(x, y).
– Position mode, Θ (mid-point or ending-point). This factor represents how much a mid-point is preferred over an ending-point.
To summarize, the metric of a corner is computed as:

CS = \frac{ES}{K \times ∇I \times Θ}    (5.18)
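Equation (5.18) reduces to a one-line computation; the sketch below (illustrative Python, not the Matlab code) assumes all four inputs have already been evaluated for the corner in question.

```python
def corner_score(edge_score, curvature, gradient, position_factor):
    """Corner score CS of Eq. (5.18): smaller values indicate stronger corners.
    position_factor (Θ) weights mid-points above end-points."""
    return edge_score / (curvature * gradient * position_factor)
```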

Step 8

Rank the corners based on their metrics. The smaller the corner score, the higher the ranking.


Step 9

This step involves removing weaker and border corners. The final result is the N strongest corners. The pseudo-code for removing weaker corners is shown in Algorithm 16.

Algorithm 16 Remove weaker and border corners
1: procedure smoothnessbasedcorner(corners)
2:   input: ranked corners, BORDER_GAP, REMOVE_RADIUS, N - number of kept corners
3:   for every corner do
4:     if distance to borders < BORDER_GAP then
5:       remove from the list
6:     end if
7:   end for
8:   set K = 0      ▷ K is a counter
9:   initialize a queue Q
10:  for every corner from strongest to weakest do
11:    for every corner in Q do
12:      compute d - distance between the two pixels
13:      if d < REMOVE_RADIUS then
14:        REMOVE is true
15:        break for-loop
16:      end if
17:    end for
18:    if REMOVE is false then
19:      K = K + 1
20:      push this corner to Q
21:    end if
22:    if K = N then
23:      break for-loop
24:    end if
25:  end for
26:  output: N corners
27: end procedure

Pseudo-code

The flowchart and pseudo-code of the proposed corner ranking are shown in Figure 5.6 and Algorithm 17, respectively.



Algorithm 17 Proposed corner ranking
1: procedure cornerranking(image)
2:   input: image, C_T - corner threshold
3:   filter noise
4:   apply Canny edge detection to find binary image
5:   apply Global and Local Curvature Properties Detector to find corners
6:   link edges
7:   find which corner belongs to which edge
8:   for every edge in the provided binary image do
9:     extract this edge into M − 1 parts
10:    for every part do
11:      fit this part to a cubic polynomial      ▷ ax^3 + bx^2 + cx + d
12:      find a, b, c and d
13:      estimate y index given x index
14:      compute sum of squared residuals
15:    end for
16:    compute sum of all residual sums
17:    compute average residual
18:    compute edge-score
19:  end for
20:  rank edges
21:  for every corner do
22:    compute corner-score
23:  end for
24:  rank corners
25:  remove weaker corners
26:  output: C_T corners
27: end procedure



Figure 5.6: Flowchart of Enhanced Feature Detector


Chapter 6

Feature Tracking

Feature tracking is a fundamental process in image processing, with the purpose of extracting motion information of feature points over multiple consecutive frames. This process is essential in many vision-based real-world applications such as control systems [60], human-computer interaction [20], or medical imaging [9] [71]. In feature tracking, there are some recognizable challenges:
– What is a good feature to track?
– How to track efficiently?
– How to deal with points that tend to deform over time?
– How to take care of a low-SNR sequence? Is there a need to denoise every frame or only specific frames?
– How to handle accumulated noise?
– When does a feature point disappear? How often should feature points be re-detected, and under what conditions?
This chapter covers three important aspects of the feature tracking process. First, two simple and effective feature detectors are proposed in Section 6.1. Second, the Oriented FAST and Rotated BRIEF feature descriptor is discussed in Section 6.2. Finally, the Lucas-Kanade Tracker is reviewed in Section 6.3.

6.1 Feature Detectors

A novel feature detection method is proposed to improve on the existing GLCPD developed by He and Yung [30]. This algorithm enhances the robustness and quality of the corners by removing the weaker ones based on a proposed ranking method. This


technique is shown to significantly outperform conventional approaches. The in-depth evaluation is discussed in Chapter 7.

6.1.1 Proposed Gradient-based Feature Detector

6.1.1.1 Outline

The proposed detector has been implemented in Matlab using a combination of open-source libraries provided by the Matlab community. It has been designed as a process which takes a raw image as input and outputs a list of indexed corners. The proposed technique is inspired by the Canny edge detector [16] and the Global and Local Curvature Properties Detector [30]. A simple and efficient feature detection method is proposed to improve feature tracking performance.

6.1.1.2 Algorithm

In order to implement the proposed feature detector, a series of steps must be followed.

Step 1

The first step is to remove noise. NL-means, BM3D or the 5 × 5 Gaussian kernel below should be employed. This step is optional; its effects will be discussed later.

K = \frac{1}{159} \begin{pmatrix} 2 & 4 & 5 & 4 & 2 \\ 4 & 9 & 12 & 9 & 4 \\ 5 & 12 & 15 & 12 & 5 \\ 4 & 9 & 12 & 9 & 4 \\ 2 & 4 & 5 & 4 & 2 \end{pmatrix}

Step 2

After smoothing the raw image, the gradients in the horizontal and vertical directions of each pixel are computed:

G_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix} * I \quad\text{and}\quad G_y = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix} * I

Step 3

Next, the gradient magnitude of each pixel is computed as:

∇I(x, y) = G(x, y) = \sqrt{G_x^2(x, y) + G_y^2(x, y)}

Step 4 Rank pixel based on its gradient.

Step 5 Remove weaker corners. Remove border corners if applicable.

Pseudo-code

The pseudo-code and flowchart of the proposed gradient-based feature detector are shown in Algorithm 18 and Figure 6.1, respectively.

Algorithm 18 Proposed gradient-based feature detector
1: procedure cornerranking(image)
2:   input: image, C_T - corner threshold
3:   filter noise
4:   for every pixel in the provided image do
5:     compute gradient
6:   end for
7:   rank corners
8:   remove weaker corners
9:   remove border corners
10:  output: C_T corners
11: end procedure
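The following Python/OpenCV sketch mirrors Algorithm 18 (the actual implementation is in Matlab); the border margin, minimum spacing and corner count are assumed parameters.

```python
import cv2
import numpy as np

def gradient_features(gray_u8, top_n=100, border=10, min_dist=8):
    """Smooth, rank pixels by Sobel gradient magnitude, then keep the strongest
    pixels that are away from the border and from already accepted features."""
    blurred = cv2.GaussianBlur(gray_u8, (5, 5), 0)
    gx = cv2.Sobel(blurred, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(blurred, cv2.CV_64F, 0, 1, ksize=3)
    magnitude = np.hypot(gx, gy)
    h, w = magnitude.shape
    kept = []
    for idx in np.argsort(magnitude.ravel())[::-1]:       # strongest first
        y, x = divmod(int(idx), w)
        if x < border or y < border or x >= w - border or y >= h - border:
            continue
        if all((x - kx) ** 2 + (y - ky) ** 2 >= min_dist ** 2 for kx, ky in kept):
            kept.append((x, y))
        if len(kept) == top_n:
            break
    return kept
```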



Figure 6.1: Flowchart of Gradient-based Feature Detector

6.1.2 Proposed Gradient/Edge-based Feature Detector

6.1.2.1 Outline

The proposed detector has been implemented in Matlab using a combination of open-source libraries provided by the Matlab community. It has been designed as a process which takes a raw image as input and outputs a list of indexed corners. The proposed technique is inspired by the Canny edge detector [16] and the Global and Local Curvature Properties Detector [30]. A simple and efficient feature detection method is proposed to improve feature tracking performance.


6.1.2.2 Algorithm

In order to implement the proposed feature detector, a series of steps must be followed.

Step 1 The first step is to remove noise. In specific, NL-means, BM3D or Gaussian kernel should be employed.

Step 2 Apply Canny edge detector. This step returns a binary image, where 0 represents edge and 1 represents background, or vice versa.

Step 3

After smoothing the raw image, the gradients in the horizontal and vertical directions are computed for each pixel belonging to an edge found in Step 2:

G_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix} * I \quad\text{and}\quad G_y = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix} * I

Step 4

Next, the gradient magnitude of each edge pixel is computed as:

∇I(x, y) = G(x, y) = \sqrt{G_x^2(x, y) + G_y^2(x, y)}

Step 5

Rank each edge pixel based on its gradient.


Step 6 Remove weaker corners. Remove border corners if applicable.

Pseudo-code

The pseudo-code and flowchart of the proposed gradient/edge-based feature detector are shown in Algorithm 19 and Figure 6.2, respectively.

Algorithm 19 Proposed gradient/edge-based feature detector
1: procedure cornerranking(image)
2:   input: image, C_T - corner threshold
3:   filter noise
4:   apply Canny edge detector
5:   for every pixel lying on an edge do
6:     compute gradient
7:   end for
8:   rank corners
9:   remove weaker corners
10:  remove border corners
11:  output: C_T corners
12: end procedure

6.2 Feature Descriptor - Oriented FAST and Rotated BRIEF

6.2.1 Background

Feature matching plays an important role in computer vision applications such as object recognition, robotic mapping, video tracking and 3D reconstruction. Corner detectors, such as FAST or Harris, are rotation-invariant but not scale-invariant. Consequently, feature descriptors were introduced to locate and identify a unique object in a training image containing a diversity of objects. In 2004, Lowe proposed Distinctive Image Features from Scale-Invariant Keypoints [44]. Based on Lowe's concept, Bay et al. introduced Speeded Up Robust Features (SURF) in 2006 with the purpose of reducing computational cost [10]. In 2011, Rublee et al. proposed Oriented FAST and Rotated BRIEF, aiming at real-time applications [68]. Having been cited more than 800 times over the last five years, Oriented FAST and Rotated BRIEF (ORB) has become one of the most successful feature descriptors of all time.

Figure 6.2: Flowchart of Gradient/Edge-based Feature Detector

6.2.2 Oriented FAST and Rotated BRIEF Feature Detector

ORB is a combination of Binary Robust Independent Elementary Features (BRIEF) [15] and FAST, with a number of modifications to improve the performance. The following steps are carried out in ORB.


Step 1

The first step is to detect feature points in the image with FAST. Rublee et al. encourage the use of FAST-9, since it has good performance. As FAST does not produce a measure of cornerness or multi-scale features, Rublee et al. proposed to employ the Harris corner measure to find the best N key-points.

Step 2

Next, Rublee et al. compute orientations by the intensity centroid. First, the moments of a patch are defined by Rosin [65] as:

m_{pq} = \sum_{x, y} x^p y^q I(x, y)    (6.1)

With these moments, the centroid is found:

C = \left( \frac{m_{10}}{m_{00}}, \frac{m_{01}}{m_{00}} \right)    (6.2)

Hence, Rublee et al. compute the orientation of the patch by constructing a vector from the corner's center O to the centroid, \vec{OC}:

θ = atan2(m01 ,m10 )

(6.3)

where atan2 is the quadrant-aware version of arctan.

Step 3

In order to make BRIEF invariant to in-plane rotation, Rublee et al. introduced a novel technique, steered BRIEF. For any feature set of n binary tests at locations (x_i, y_i), Rublee et al. define a 2 × n matrix:

S = \begin{pmatrix} x_1, \ldots, x_n \\ y_1, \ldots, y_n \end{pmatrix}    (6.4)

Using the patch orientation θ, a steered version is constructed:

S_θ = R_θ S    (6.5)

Hence, the steered BRIEF operator becomes:

g_n(p, θ) := f_n(p) \,|\, (x_i, y_i) \in S_θ    (6.6)

Step 4

The final step of ORB is learning the sampling pairs. The learning is done by applying the following greedy algorithm [24]:
(a) Run each test on the training dataset.
(b) Form the vector T.
(c) Apply a greedy search.
Once this algorithm terminates, a set of 256 relatively uncorrelated tests with high variance is obtained.
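As a usage illustration of the descriptor (OpenCV's ORB implementation, not code from this thesis), the sketch below detects and describes keypoints on two frames and matches them with a Hamming-distance brute-force matcher.

```python
import cv2

def orb_match(img1_u8, img2_u8, n_features=500):
    """ORB keypoints + binary descriptors, matched by Hamming distance."""
    orb = cv2.ORB_create(nfeatures=n_features)
    kp1, des1 = orb.detectAndCompute(img1_u8, None)
    kp2, des2 = orb.detectAndCompute(img2_u8, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    return kp1, kp2, matches
```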

Pseudo-code The pseudo-code of Oriented FAST and Rotated BRIEF is illustrated in Algorithm 20 [68].

Algorithm 20 Oriented FAST and Rotated BRIEF
1: procedure orb(image)
2:   input: image
3:   detect N feature points by FAST-9
4:   for every pixel do
5:     compute moments of the patch
6:     find the centroid
7:   end for
8:   construct a steered BRIEF
9:   learn sampling pairs
10:  form vector T
11:  greedy search
12:  output: feature descriptors
13: end procedure

6.3 Feature Tracker - Lucas-Kanade Tracker

6.3.1 Outline

In 1981, Lucas et al. developed the Lucas-Kanade Tracker (LKT) method for optical flow estimation. The key principle of LKT is the use of least squares approximation

to estimate the warped feature points' movement based on a previous frame [46, 47]. There are three important assumptions in the Lucas-Kanade Tracker:
– Brightness constancy, which guarantees that the projection of the same point looks similar in each frame.
– Small motion, meaning LKT works if and only if each point makes a reasonably small movement between two adjacent frames.
– Spatial coherence, which means that points move like their neighbors.
An example of the movement of feature points is shown in Figure 6.3.

Figure 6.3: Feature movement in two adjacent frames (contributed by Derek Hoiem in his lecture notes at https://courses.engr.illinois.edu/cs543/sp2012/lectures/)


6.3.2 Lucas-Kanade Tracker

Step 1

The first step is to find good features to track. Lucas et al. suggested employing the Harris-style corners proposed by Shi and Tomasi [70] in this step (refer to Section 5.1 for more details).

Step 2

Use the intensity second-moment matrix and the difference across frames to find the displacement.
(a) With the assumption of brightness constancy:

I(x, y, t) = I(x + u, y + v, t + 1)    (6.7)

(b) Take a Taylor series approximation to obtain the optical flow equation:

I_x u + I_y v + I_t = 0    (6.8)

where I_x and I_y are the image derivatives along x and y, respectively, and I_t is the difference over frames (time).
(c) Lucas et al. proposed to use a 3 × 3 patch around each pixel, so nine points are assumed to move with the same motion. Plugging them into a least squares formulation, u and v are found:

\begin{pmatrix} I_x(p_1) & I_y(p_1) \\ I_x(p_2) & I_y(p_2) \\ \vdots & \vdots \\ I_x(p_9) & I_y(p_9) \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = - \begin{pmatrix} I_t(p_1) \\ I_t(p_2) \\ \vdots \\ I_t(p_9) \end{pmatrix}    (6.9)

or A \begin{pmatrix} u \\ v \end{pmatrix} = -b. Thus,

\begin{pmatrix} u \\ v \end{pmatrix} = -(A^T A)^{-1} A^T b    (6.10)


Step 3

Iterate and use a coarse-to-fine search to deal with larger movements by iterative refinement:
(a) Initialize (x', y') = (x, y)
(b) Compute (u, v) by Equation 6.10
(c) Shift the window by (u, v): x' = x' + u; y' = y' + v
(d) Recalculate I_t
(e) Repeat steps (b)-(d) until the change is small

Step 4 When creating long tracks, check appearance of registered patch against appearance of initial patch to find points that have drifted.

Pseudo-code

The pseudo-code of the Lucas-Kanade tracker is introduced in Algorithm 21.

Algorithm 21 Lucas-Kanade tracker
1: procedure lktracker(video)
2:   input: video
3:   find Harris corners
4:   compute image derivatives I_x, I_y and I_t
5:   use a 3 × 3 patch to form a least squares problem
6:   estimate the movement of each feature
7:   use coarse-to-fine search for larger movements
8:   find points that have drifted
9:   output: feature tracking record
10: end procedure
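A compact Python/OpenCV sketch of the tracker pipeline (Shi-Tomasi detection followed by pyramidal Lucas-Kanade) is shown below; it is an illustration under assumed parameter values, not the Matlab pipeline evaluated in Chapter 7.

```python
import cv2

def track_features(prev_u8, next_u8, max_corners=200):
    """Detect good features in the previous frame and track them into the next."""
    prev_pts = cv2.goodFeaturesToTrack(prev_u8, maxCorners=max_corners,
                                       qualityLevel=0.01, minDistance=7)
    next_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_u8, next_u8, prev_pts, None,
                                                     winSize=(21, 21), maxLevel=3)
    good = status.ravel() == 1
    return prev_pts[good].reshape(-1, 2), next_pts[good].reshape(-1, 2)
```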


Chapter 7

Evaluation

This chapter presents a comparative study of the consistency and robustness of the discussed approaches on thermal-infrared sequences, and more specifically on low-SNR images. Section 7.1 discusses the datasets used for evaluation in depth. Next, in Section 7.2, noise reduction filters are compared in a variety of circumstances. Edge detectors and feature detectors are then evaluated in Section 7.3 and Section 7.4, respectively. Finally, a detailed procedure for evaluating the feature tracking process is provided in Section 7.5.

7.1

Data Sets

Four data sequences were captured for specifically analyzing the proposed feature detection method and improvements relating to sparse optical flow. These datasets were captured using an Optris PI450 thermal-infrared camera, with an image resolution of 384 by 288 pixels, and a maximum framerate of 80Hz. Each sequence was approximately X minutes in length, with a minimum of Y frames captured. As shown in Figure 7.1, the four sequences explore four different types of environments. Specifically, sequences 1 and 2 (shown in Figures 7.1 (a) and 7.1 (b)) explore low SNR indoor and outdoor scenes respectively. Sequences 3 and 4 (shown in Figures 7.1 (c) and 7.1 (d)) then include footage from different high SNR indoor and outdoor scenes respectively. Each of these sequences was captured for the purpose of evaluating the image processing, feature detection and feature tracking approaches independent of a full SLAM implementation. In order to simplify the evaluation, the physical motion of the camera for each sequence was limited so that mostly the same physical objects were in view for the full duration. 76


Figure 7.1: Datasets for evaluation — (a) low SNR, indoor; (b) low SNR, outdoor; (c) high SNR, indoor; (d) high SNR, outdoor

7.2

Noise Reduction Filters

It can be seen from Figures 7.2 and 7.3 that the BM3D filter outperforms all of the other techniques. Specifically, BM3D demonstrates a significant improvement on the original image by smoothing the intensity transitions caused by speckle noise and making local and global areas more uniform. In addition, for a textured image such as the workspace shown in Figure 7.3, BM3D also shows good preservation of details such as walls, dividers and windows. While the median and NLM filters struggle with extreme levels of noise, the BM3D filter produces a reasonable reconstruction of the original image. In short, BM3D is a robust denoising method in terms of subjective visual quality. Following this comparative study, BM3D is proposed for use on low-SNR thermal-infrared images in feature tracking, and in practice its performance is very promising. However, there is one disadvantage of BM3D that needs to be considered: its computational complexity. In similar circumstances, BM3D is five times slower than NLM; hence, it would be a burden in real-time applications.

Figure 7.2: Fragments of the grayscale electrical wirings denoised by Median, NLM and BM3D filters

7.3

Edge Detectors

As edges are fundamental elements in machine vision, it is crucial to evaluate edge detectors. Here, two low SNR thermal-infrared images are used for evaluation. Figure 7.4 shows a workspace including a number of objects, such as a monitor, laptop, charger and mobile phone, while Figure 7.5 illustrates the same space from a different viewpoint. It can be seen from Figures 7.4 and 7.5 that classical detectors, such as Prewitt, Roberts and Sobel, are very sensitive to noise and inaccurate, especially Roberts' in Figure 7.4. In particular, many short edges, or even dots, are detected that look like impulse noise; the more noise appears, or the lower the magnitude of the edges is, the worse the final result becomes [72] [51]. In fact, these edges should be removed. However, classical operators are very simple since they only use two steps of convolutions. Furthermore, the basic detected edges with orientations are sufficient in some circumstances.

Figure 7.3: Fragments of the grayscale workspace denoised by Median, NLM and BM3D filters

The Laplacian of Gaussian detector demonstrates a better performance compared to the classical operators, although it applies standard convolution methods. The advantages of the Laplacian of Gaussian detector are the capability to find the correct location of edges and the use of a wider Gaussian kernel, 9 × 9 (see Chapter 4); hence, the resulting binary image is better than that of the classical operators. However, the Laplacian of Gaussian detector fails to find the gradients of edges and curves having a variety of intensities. Canny's algorithm is still considered one of the most successful edge detection methods and is widely used in engineering applications. Canny defined three criteria of good edge detection: good detection, the ability of an algorithm to mark actual edges; good localization, which measures how close a marked edge is to the real edge; and minimal response, which requires that noise should not create false edges. However, the Canny detector is computationally complex and time consuming, since it has six steps (see Section 4.4).


Figure 7.4: Detection results on the denoised high SNR image shown in Figure 4.3 — (a) Prewitt, (b) Roberts, (c) Sobel, (d) Laplacian of Gaussian, (e) Canny, (f) proposed detector



Figure 7.5: Detection results on the denoised low SNR image shown in Figure 4.4 — (a) Prewitt, (b) Roberts, (c) Sobel, (d) Laplacian of Gaussian, (e) Canny, (f) proposed detector



The proposed algorithm is an enhanced Canny detector. It can be seen from Figure 7.4 and Figure 7.5 that the proposed approach removes most of the edges caused by noise. Moreover, it preserves the details even in regions with very low intensity differences. Consequently, it outperforms all the other methods in terms of connected edges and structured object detection. However, similar to the Canny method, it is also computationally complex.

7.4

Feature Detectors

To evaluate feature detectors, two low SNR thermal-infrared images are employed. Figure 7.6 shows a workspace including a number of objects, such as monitor, laptop, charger, mobile phone and so on; while Figure 7.7 illustrates the same space but at different viewpoint. The question is "What are the best ten corners in the given images?"

Figure 7.6: Top 10 strongest corners determined by the (a) FAST, (b) Harris, (c) MinEigen and (d) proposed detectors in the high SNR image



Figure 7.7: Top 10 strongest corners determined by the (a) FAST, (b) Harris, (c) MinEigen and (d) proposed detectors in the low SNR image (after being denoised by BM3D)

The Harris corner detector is a simple algorithm: where a corner resides, large horizontal and vertical gradients are detected, and the Harris response generalizes this test to all shift directions. In terms of stability, Harris is better than the SUSAN corner detector, and the same result is observed in the anti-noise experiment of [18]. Besides, the complexity of Harris is also much lower than SUSAN's; in short, Harris is superior to the SUSAN algorithm. The MinEigen detector was developed by Shi and Tomasi in 1994. This method is rotation invariant and uses a scoring function derived from the Harris corner detector. Similar to Harris, the MinEigen algorithm has good repeatability and localization accuracy, and reasonable robustness and efficiency. FAST is a successful method for feature tracking due to its excellent computational efficiency. As a trade-off, FAST only has fair repeatability, localization accuracy and robustness; furthermore, it cannot detect blobs and is affine variant. Like the above detectors, FAST does not detect true corners in the sense of human vision. This problem inspired He and Yung to develop a very good human-corner detector [30].


It can be observed from Figure 7.6 and Figure 7.7 that the proposed detector has a better-spread distribution of corners than the others, such as the FAST, Harris and MinEigen detectors. The most important advantage is that, since the proposed detector is an enhanced curvature detector, it is able to detect most of the obvious true corners by suppressing false and round corners in both simple and complex shapes. Additionally, it works well on objects of different sizes and demonstrates very promising ranking performance. The only disadvantage of the proposed detector is its complexity.

Table 7.1: Comparison of feature trackers

7.5 Feature Tracking

7.5.1 Restructure main data directory

A directory called image_subsequences should be created to store images. The raw captured sequences, which were recorded, will not go in here. Instead, it will contain subdirectories for different re-numbered raw subsequences which were created by the createloopesubseq.m function. Only subsequences of a standard length - 50 frames - that have no problems such as zero motion or Non-Uniformity Correction (NUC) interruptions should be placed in this directory. All images should be stored in raw (16-bit) format. This folder should be shared with the supervisor, so that he can check it. The plan is that subdirectories of this directory should be easy to add and remove if new and better data is captured. The directory structure should look like the following:


– image_subsequences
  + lo_snr_seq_1
    · 000000.png
    · 000001.png
    · ...
    · 000049.png
  + lo_snr_seq_2
    · ...
  + hi_snr_seq_1
    · ...

Hence, each time new data is collected from the camera, some manual work will need to be done to generate one or more subsequences from the original captured sequence to put in this image_subsequences directory. The frame re-indexing procedure is shown in Algorithm 22.

7.5.2

Prepare image processing scripts

A second directory called processed_subsequences should be generated that contains all of the 8-bit versions of the raw subsequences. It is the 8-bit versions that are actually used for the experiments. One or more scripts or launch files will need to be prepared that do the following:
– generate 8-bit versions of each subsequence with different DPG;
– generate denoised versions of each different DPG subsequence.
These scripts may need to call thermalvis launch files. All results will be uploaded to a Github repository. The project is shared between team members: once a member has access to the original image_subsequences directory, he should be able to generate a processed_subsequences directory on his computer that is identical to the developer's. The following directory structure should be used:

– processed_subsequences
  + lo_snr_seq_1
    · dpg_0.005
      · 000000.png
      · 000001.png
      · ...
      · 000049.png
    · dpg_0.01
    · ...
  + lo_snr_seq_2
    · ...

Algorithm 22 Frame re-indexing procedure
1: procedure createloopesubseq
2:   input: sequence of images and N_0, desired number of frames
3:   for each frame in the sequence do
4:     compare to previous frame
5:     if similar then
6:       label this frame as a NUC occurrence
7:     end if
8:   end for
9:   extract subsequences where no NUC occurs
10:  for each subsequence do
11:    find N, the number of frames
12:    while N > N_0 − 1 do
13:      extract an (N_0 − 1)-frame subsequence
14:      N → N − (N_0 − 1)
15:    end while
16:  end for
17:  for each (N_0 − 1)-frame subsequence do
18:    if N_0 is even then
19:      index frames in order 1, 3, 5, ..., N_0 − 1, N_0 − 3, N_0 − 5, ..., 1
20:    else
21:      index frames in order 1, 3, 5, ..., N_0 − 1, N_0 − 2, N_0 − 4, ..., 1
22:    end if
23:  end for
24:  output: N_0-frame subsequences
25: end procedure

The procedure for creating different DPG sequences is shown in Algorithm 23. In the evaluation, four DPG levels, namely 0.005, 0.01, 0.05 and 0.1, are used (see Figure 7.8).


Algorithm 23 Generate different DPG sequences
1: procedure creatediffdpg
2:   input: sequence of images and N_0, desired number of frames
3:   setup system
4:   initialize variables
5:   call thermalvis program
6:   for each DPG do
7:     revise launch file accordingly
8:     for each subsequence do
9:       for each denoising method do
10:        for each image do
11:          generate a new image
12:        end for
13:      end for
14:    end for
15:  end for
16:  output: N_0-frame subsequences
17: end procedure

7.5.3 Prepare image processing validation scripts

A validation script needs to be created to ensure that all of the processed subsequences are fine. The script should automatically load each subsequence and display all frames in order to the user in Matrix Laboratory (MATLAB), making it easy to check whether the DPG is correct. It is important that, when displaying an 8-bit sequence, the graylevels are not rescaled, since the point is to observe the consistency of the images. At the end of each sequence, a figure showing the average intensities is displayed, from which it can be seen whether there is a problem with the denoising methods or the frame processing (see Figure 7.9). The procedure for processed-image validation is shown in Algorithm 24.
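A minimal MATLAB sketch of such a validation loop is given below; the directory names follow the structure above and are illustrative only.

frames = dir(fullfile('processed_subsequences', 'lo_snr_seq_1', 'dpg_0.01', '*.png'));
avgIntensity = zeros(1, numel(frames));
figure;
for k = 1:numel(frames)
    img = imread(fullfile('processed_subsequences', 'lo_snr_seq_1', 'dpg_0.01', frames(k).name));
    imshow(img, [0 255]);                       % fixed display range: no graylevel rescaling
    title(sprintf('Frame %d of %d', k, numel(frames)));
    avgIntensity(k) = mean(double(img(:)));     % record the average intensity of this frame
    pause(0.05);                                % brief pause so the user can inspect each frame
end
figure;
plot(avgIntensity);
xlabel('Frame index'); ylabel('Average intensity');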

7.5.4 Prepare feature detection scripts

Scripts for detecting features need to be prepared. Features, detected by the feature detectors introduced in Section 5, should be stored for the first image only of each processed subsequence. Another directory can be made that has a similar structure to the first two. All detectors need to be run on all processed subsequences.
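As a sketch, detection on the first frame of a processed subsequence and export of the corner data could look like the following, using the Harris detector from the Computer Vision Toolbox as an example; the file paths and the number of retained corners are illustrative assumptions.

firstFrame = imread(fullfile('processed_subsequences', 'lo_snr_seq_1', 'dpg_0.01', '000000.png'));
corners = detectHarrisFeatures(firstFrame);
corners = corners.selectStrongest(100);              % discard the weaker corners

% Write one row per corner: index, x, y, metric (strength).
fid = fopen('lo_snr_seq_1_dpg_0.01_ori_harr.txt', 'w');
for k = 1:corners.Count
    fprintf(fid, '%d %.3f %.3f %.4f\n', k-1, ...
            corners.Location(k,1), corners.Location(k,2), corners.Metric(k));
end
fclose(fid);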


Figure 7.8: Four DPG levels used in the experiments: (a) DPG = 0.005, (b) DPG = 0.01, (c) DPG = 0.05, (d) DPG = 0.1

The directory structure should look like the following:
– features_data
  + indoor_hi_snr1
    · indoor_hi_snr1_seq_02_dpg_0.01_bm3d_harr.txt
    · indoor_hi_snr1_seq_02_dpg_0.01_nlm_harr.txt
    · indoor_hi_snr1_seq_02_dpg_0.01_ori_harr.txt
    · ...
  + indoor_hi_snr2
    · ...
    · ...
  + indoor_lo_snr1
    · ...
    · ...

The feature detection procedure is shown in Algorithm 25.


Figure 7.9: Video validation

Feature Index   x-index   y-index   Metric
0               219       123       7
1               247       105       7
2               139       260       8
3               133       228       8
4               88        209       8
...             ...       ...       ...
98              358       82        71
99              335       50        147

Table 7.2: Feature data text file

7.5.5 Prepare feature detection validation scripts

A validation script for the features data also needs to be prepared. It should loop through each features_data directory and, for each detector, show the features on the first frame of the corresponding processed subsequence. The user can press ENTER to move to the next frame. The features should be color coded according to their strength, e.g. brighter red for stronger features and greener for weaker features. An example of validation of a sequence is demonstrated in Figure 7.10.
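A sketch of the color-coding step is given below; it assumes the feature text file contains one row per corner (index, x, y, metric), as in the format described above.

data   = load('lo_snr_seq_1_dpg_0.01_ori_harr.txt');     % columns: index x y metric
img    = imread(fullfile('processed_subsequences', 'lo_snr_seq_1', 'dpg_0.01', '000000.png'));
metric = data(:, 4);

% Map strength to a red-green color: red for strong corners, green for weak ones.
w      = (metric - min(metric)) / (max(metric) - min(metric) + eps);
colors = [w, 1 - w, zeros(size(w))];

imshow(img); hold on;
scatter(data(:, 2), data(:, 3), 30, colors, 'filled');
title('Detected corners, colored by strength');
pause;                                                    % wait for the user before continuing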


Algorithm 24 Video validation
procedure validvideo
    setup system
    initialize variables
    for each sequence do
        for each DPG do
            for frame 1 → N0 do
                window 1: display original image
                window 2: display NL-means-denoised image
                window 3: display BM3D-denoised image
                window 4: plot average intensities of the three denoising methods
            end for
        end for
    end for
end procedure

The feature detection validation procedure is shown in Algorithm 26.

7.5.6 Prepare feature tracking scripts

Scripts are to be prepared that perform tracking on each of the processed subsequences, using each of the relevant sets of initially detected features. Results should be stored in a parallel directory structure tracking_data, with one file for each detector-subsequence combination. The directory structure should be similar to that of Section 7.5.4.
– tracking_data
  + indoor_hi_snr1
    · indoor_hi_snr1_seq_02_dpg_0.01_bm3d_harr.txt
    · indoor_hi_snr1_seq_02_dpg_0.01_nlm_harr.txt
    · indoor_hi_snr1_seq_02_dpg_0.01_ori_harr.txt
    · ...
  + indoor_hi_snr2
    · ...
    · ...
  + indoor_lo_snr1
    · ...
    · ...

The feature tracking procedure is shown in Algorithm 27.
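A minimal sketch of the tracking step is shown below, using the KLT tracker (vision.PointTracker) from the Computer Vision Toolbox. The thesis scripts call the thermalvis program instead, so this is only an illustrative stand-in; file names follow the convention above.

seqDir  = fullfile('processed_subsequences', 'lo_snr_seq_1', 'dpg_0.01');
frames  = dir(fullfile(seqDir, '*.png'));
corners = load('lo_snr_seq_1_dpg_0.01_ori_harr.txt');     % columns: index x y metric

tracker = vision.PointTracker('MaxBidirectionalError', 2);
initialize(tracker, corners(:, 2:3), imread(fullfile(seqDir, frames(1).name)));

fid = fopen('lo_snr_seq_1_dpg_0.01_ori_harr_tracks.txt', 'w');
for k = 2:numel(frames)
    [pts, valid] = step(tracker, imread(fullfile(seqDir, frames(k).name)));
    for p = 1:size(pts, 1)
        % feature index, frame index, x, y, still tracked?
        fprintf(fid, '%d %d %.3f %.3f %d\n', p-1, k-1, pts(p,1), pts(p,2), valid(p));
    end
end
fclose(fid);
release(tracker);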


Algorithm 25 Feature detection script
procedure featuredetection
    input: sequences of images
    setup system
    initialize variables
    for each DPG do
        revise launch file accordingly
        for each subsequence do
            for each denoising method do
                extract first frame
                for each feature detector do
                    find feature points
                    remove weaker and border feature points
                end for
                write corners to a text file
            end for
        end for
    end for
    output: corner text files
end procedure

7.5.7 Prepare feature tracking validation scripts

Create a tracking validation script. This is similar to the detection validation script, except that it shows all images in each processed subsequence, with the tracked feature locations and a "tail". It should make it possible to see the following:
– Which features are strong and which are weak?
– When do features die?
– How do features move?
Any unexpected behaviour, such as weaker features surviving longer or drifting less than strong features, should become visible here. An example of validation of a sequence is demonstrated in Figure 7.11. The feature tracking validation procedure is shown in Algorithm 28.
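A sketch of drawing the tails follows. The layout of the tracks array is an assumption (frames × features × [x y], with NaN once a feature has died), and readFrameK is a hypothetical frame loader.

tailLength = 5;
for f = tailLength + 1:size(tracks, 1)
    imshow(readFrameK(f)); hold on;                % readFrameK: hypothetical frame loader
    for n = 1:size(tracks, 2)
        if ~isnan(tracks(f, n, 1))                 % feature still alive in this frame
            seg = squeeze(tracks(f-tailLength:f, n, :));
            plot(seg(:, 1), seg(:, 2), 'g-');      % tail over the last few frames
            plot(seg(end, 1), seg(end, 2), 'r+');  % current position
        end
    end
    hold off; drawnow;
end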

7.5.8 Denoising methods analysis script

Figure 7.10: Corner detection validation

The analysis script contains a nested loop over each set, where a set is defined by a unique combination of:
– denoising method;
– DPG level;
– high-SNR or low-SNR group of subsequences.
Hence, one set might be bm3d_0.01_lo_fast, another bm3d_0.01_lo_grad, another nlm_0.01_lo_fast, and so on. For each set, the script generates the following results:
– pixel drift for each decile of features, grouped by feature strength across the entire set, not just for each sequence individually;
– similarly, survival rate for each decile of features, grouped by strength.
It then displays the results in the following plots:
– for the best DPG level only, and low SNR, all three denoising methods on one bar plot comparing pixel drift for each feature strength decile;
– for the best DPG level only, and low SNR, all three denoising methods on one bar plot comparing survival rate for each feature strength decile.
This experiment should determine conclusively the level of improvement achievable by the denoising methods. A table of relevant data, such as the average drift for the strongest 10% of features and the average survival rate for the strongest 10% of features, comparing all three denoising methods, should also be produced.


Algorithm 26 Feature detection validation script
procedure validfeaturedetection
    setup system
    initialize variables
    for each DPG do
        revise launch file accordingly
        for each subsequence do
            for each denoising method do
                extract first frame
                display first frame and detected corners
            end for
            press ENTER or mouse click to continue
        end for
    end for
end procedure

Algorithm 27 Feature tracking script
procedure featuretracking
    input: sequences of images, corner text files
    setup system
    initialize variables
    for each DPG do
        revise launch file accordingly
        for each subsequence do
            for each denoising method do
                call thermalvis program
            end for
        end for
    end for
    output: tracking text files
end procedure

Based on the results, one of the denoising methods is suggested for implementation in C++ so that it can be used in a real-time feature tracking application. The denoising methods analysis procedure is shown in Algorithm 29. In the evaluation, four DPG levels are used: 0.005, 0.01, 0.05 and 0.1 (see Figure 7.12).
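The per-decile aggregation at the heart of this analysis could be sketched as follows; the layout of the features matrix is an assumption (one row per feature, with columns [metric, drift, survived]).

features = sortrows(features, -1);              % strongest first, by metric
n        = size(features, 1);
edges    = round(linspace(0, n, 11));           % decile boundaries
avgDrift     = nan(1, 10);
survivalRate = zeros(1, 10);
for d = 1:10
    block = features(edges(d)+1:edges(d+1), :);
    alive = block(:, 3) == 1;                   % features that survived the sequence
    survivalRate(d) = 100 * mean(alive);
    if any(alive)
        avgDrift(d) = mean(block(alive, 2));    % drift averaged over survivors only
    end
end
bar(avgDrift);
xlabel('Strength decile (1 = strongest)'); ylabel('Average drift (pixels)');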


Feature Index   Frame   x-index   y-index
0               0       219       123
0               1       217.272   122.385
0               2       215.65    121.643
0               3       214.704   120.334
0               4       213.263   118.093
...             ...     ...       ...
0               49      219.082   123.015
1               0       247       105
...             ...     ...       ...
1               49      247.87    104.827
...             ...     ...       ...
99              49      334.826   49.9278

Table 7.3: Feature tracking text file

Feature Index   Lifetime   T-disp.   A-disp.   Drift   Metric   x-index   y-index   Kept?
0               50         67.1      1.34      0.083   7        219       123       0
1               50         69.7      1.39      0.887   7        247       105       0
2               20         26.3      1.32      13.75   8        139       260       0
...             ...        ...       ...       ...     ...      ...       ...       ...
99              50         80.1      1.60      0.188   147      335       50        0

Table 7.4: Processed feature tracking text file

7.5.9 Feature detection analysis script

Using the best denoising method and DPG level, the analysis script contains a nested loop over each set, where a set is defined by a unique combination of:
– feature detector;
– high-SNR or low-SNR group of subsequences.
Subsets are combined into a bigger set; for example, one set might be dpg_0.01_lo_fast, another dpg_0.01_lo_grad, and so on. For each set, the script generates the following results:
– pixel drift for each decile of features (grouped by feature strength across the entire set, not just for each sequence individually);


– similarly, survival rate for each decile of features, grouped by strength.
It then displays the results in the following plots:
– all detector methods on one bar plot comparing pixel drift for each feature strength decile, with one plot for high SNR and one for low SNR;
– all detector methods on one bar plot comparing survival rate for each feature strength decile, with one plot for high SNR and one for low SNR;
– a single scatter plot showing all features that survived the full sequence, with feature strength versus final drift.
Based on the results, one of the new feature detection methods should be proposed for implementation in C++ so that it can be used in a real-time feature tracking application.

Figure 7.11: Corner tracking validation

Metric   Lifetime   Drift   Kept?   T-disp.
7        50         0.083   0       67.1
7        50         0.887   0       69.7
8        20         1.32    0       26.3
...      ...        ...     ...     ...
147      50         0.188   0       80.1

Table 7.5: Processed feature tracking mat file


Algorithm 28 Feature tracking validation script
procedure validfeaturetracking
    setup system
    initialize variables
    for each DPG do
        revise launch file accordingly
        for each subsequence do
            for each denoising method do
                runvalidation
            end for
            press ENTER to jump to the next sequence
            mouse click to advance to the next frame
        end for
    end for
end procedure

procedure runvalidation
    read last text file
    find number of tracks
    store tracks
    read data from last text file
    initialize matrix of feature points
    compute lifetime
    for each feature point do
        extract indices
    end for
    for each frame do
        window 1: show the tails of live features
        window 2: show overall dead and live features in the first frame
        window 3: show red tails of features that died since the last frame
        window 4: show dead features from the previous frame
    end for
end procedure

The feature detection analysis procedure is shown in Algorithm 30. To summarize, the flowchart of the feature tracking process is shown in Figure 7.23.

7.5.10 Results

In this section, the effects of denoising methods and feature detectors are evaluated.


Figure 7.12: Average drift of low SNR at DPG=0.01 (averages: Original = 9.276, NL-means = 4.699, BM3D = 5.276)

Figure 7.13: Average survival rate of low SNR at DPG=0.01 (averages: Original = 4.461, NL-means = 7.581, BM3D = 11.194)

Denoising methods

Figures 7.12-7.15 show the average drift and survival rate of each decile in the low-SNR sequences at two DPG values, 0.01 and 0.005. Drift measures how far a feature point wanders from its true position, while survival rate represents how long a feature point remains tracked throughout the video.

Figures 7.12 and 7.13 show a reasonable positive trend from the strongest to the weakest decile, where each decile represents 10% of the evaluated feature points. The apparent reduction in drift for the weakest 10% is explained by the very low survival rate of that decile: very few feature points survive long enough to be evaluated. The remedy is to ignore any decile whose survival rate is below a preset threshold.

Figures 7.14 and 7.15 demonstrate very good performance when the DPG is set to 0.005. Features from the original images are not ranked as effectively as those from the denoised images, because many of the detected features correspond to noise, and the significant noise affects stronger features almost as much as weaker ones.

Overall, Figures 7.12-7.15 illustrate the improvement brought by the denoising methods on low-SNR thermal-infrared video. In particular, at DPG values of 0.01 and 0.005, the average drift of NL-means and BM3D is reduced by 50%, from 9.276 to 4.699 and 5.276, and by more than 50%, from 8.450 to 4.846 and 3.795, respectively (see Figures 7.12 and 7.14). Similarly, Figures 7.13 and 7.15 demonstrate a dramatic improvement in survival rate: at the same DPG values, the average survival rate increases by nearly three times, from 4.461 to 7.581 and 11.194, and by up to six times, from 2.815 to 6.733 and 13.580, respectively.

Figure 7.14: Average drift of low SNR at DPG=0.005 (averages: Original = 8.450, NL-means = 4.846, BM3D = 3.795)

Figure 7.15: Average survival rate of low SNR at DPG=0.005 (averages: Original = 2.815, NL-means = 6.733, BM3D = 13.580)

Feature detectors

Figures 7.16-7.21 report the average drift and survival rate of each decile in the low-SNR sequences at a DPG value of 0.01, using different feature detectors: FAST, Harris, MinEigen and the two proposed detectors introduced in Section 6.1.

Figures 7.16-7.18 show the performance of the five feature detectors in terms of average drift. In Figure 7.16, the two proposed methods demonstrate superior performance compared to the existing approaches: their average drifts are 7.827 and 8.108, while those of FAST, Harris and MinEigen are 8.290, 9.276 and 9.060. However, on the video denoised by NL-means, the two proposed methods (5.775 and 5.186) are only better than FAST (5.952), not Harris and MinEigen, whose results are 4.699 and 4.486, respectively. On the video denoised by BM3D, the two proposed feature detectors perform worse than the others.

Similarly, Figures 7.19-7.21 show the performance of the five feature detectors in terms of survival rate over the three types of video. In Figure 7.19, the two proposed methods again demonstrate superior performance compared to the existing approaches: their survival rates are 4.176 and 3.867, while those of FAST, Harris and MinEigen are 2.595, 4.461 and 2.763, respectively. However, on the video denoised by NL-means, the gradient-based method (6.516) is only better than MinEigen (6.492), not FAST and Harris, whose results are 7.420 and 7.581, respectively; the gradient-edge-based technique, by contrast, shows a large rise in survival rate on the NL-means video. On the video denoised by BM3D, the two proposed feature detectors again perform worse than the others.

Overall, the rankings produced by FAST, Harris and MinEigen work much better with denoising (NL-means or BM3D), suggesting that the noise greatly affects the effectiveness of the ranking of these standard methods. Our own detectors, on the other hand, seem to rank better without denoising and perform slightly better than the standard methods on the original video, although this reverses when noise levels are reduced by denoising. This suggests that our detection and ranking method may be an alternative to denoising, but as it stands the combination of our method plus denoising is not effective.



Figure 7.16: Comparison of average drift of original low-SNR sequences at DPG=0.01 (FAST = 8.290, Harris = 9.276, MinEigen = 9.060, Gradient = 7.827, GradientEdge = 8.108)

Figure 7.17: Comparison of average drift of NLM low-SNR sequences at DPG=0.01 (FAST = 5.952, Harris = 4.699, MinEigen = 4.486, Gradient = 5.775, GradientEdge = 5.186)



Algorithm 29 Denoising methods analysis script
procedure denoisinganalyze
    setup system
    initialize variables
    for each DPG do
        load mat files
        filter and combine into combinedMatrices
        call plotdrift
        call plotsurvivalrate
    end for
end procedure

procedure plotdrift
    input: combinedMatrices
    extract top numElement points from each subsequence
    delete untrackable points
    sort points by strength
    re-weight
    for each portion do
        extract matrix
        keep survived points
        compute average drift
    end for
    plot figures
    output: drift plots
end procedure

procedure plotsurvivalrate
    input: combinedMatrices
    extract top numElement points from each subsequence
    delete untrackable points
    sort points by strength
    for each portion do
        extract matrix
        keep survived points
        store all survived points
        compute survival rate
    end for
    plot figures
    output: survival rate plots
end procedure



Algorithm 30 Feature detection analysis script
procedure featureanalyze
    setup system
    initialize variables
    for each detector do
        load mat files
        filter and combine into combinedMatrices
        call plotdrift
        call plotsurvivalrate
    end for
end procedure

procedure plotdrift
    input: combinedMatrices
    extract top numElement points from each subsequence
    delete untrackable points
    sort points by strength
    re-weight
    for each portion do
        find number of subsequences
        extract appropriate matrix from each subsequence
        remove stationary subsequences
        delete incorrect points caused by NUC
        keep survived points
        compute average drift
    end for
    plot figures
    output: drift plots
end procedure

procedure plotsurvivalrate
    input: combinedMatrices
    extract top numElement points from each subsequence
    delete untrackable points
    sort points by strength
    re-weight
    for each portion do
        find number of subsequences
        extract appropriate matrix from each subsequence
        remove stationary subsequences
        delete incorrect points caused by NUC
        keep survived points
        store all survived points
        compute survival rate
    end for
    plot figures
    output: survival rate plots
end procedure


Figure 7.18: Comparison of average drift of BM3D low-SNR sequences at DPG=0.01 (FAST = 5.607, Harris = 5.276, MinEigen = 5.381, Gradient = 7.261, GradientEdge = 6.958)

Figure 7.19: Comparison of average survival rate of original low-SNR sequences at DPG=0.01 (FAST = 2.595, Harris = 4.461, MinEigen = 2.763, Gradient = 4.167, GradientEdge = 3.867)


Figure 7.20: Comparison of average survival rate of NLM low-SNR sequences at DPG=0.01 (FAST = 7.420, Harris = 7.581, MinEigen = 6.492, Gradient = 6.516, GradientEdge = 7.661)

Figure 7.21: Comparison of average survival rate of BM3D low-SNR sequences at DPG=0.01 (FAST = 10.181, Harris = 11.194, MinEigen = 9.416, Gradient = 5.665, GradientEdge = 5.956)


Figure 7.22: Survival rate in n-frame sequences at DPG=0.01, comparing Original, NL-means and BM3D under detection, tracking, and detection & tracking


Figure 7.23: Flowchart of Feature Tracking Process


Chapter 8 Conclusions and Future Work

This thesis has proposed a number of methods for feature detection and tracking in thermal-infrared video. Many of these methods have been integrated into a hand-held thermal-infrared camera system for evaluation, which has been demonstrated to be much less affected by changes in environment lighting, shadows or out-of-view motion. The purpose of this final chapter is to provide a summary of the contributions of the thesis (Section 8.1) and to propose future research directions (Section 8.2).

8.1 Contributions

This thesis has contributed a methodology for feature tracking in the low-SNR video produced by thermal-infrared cameras, together with four techniques in the image processing area. The contributions of the thesis are revisited in detail as follows.

8.1.1 Noise Reduction Filters

– Review popular denoising techniques for thermal-infrared cameras.
– Perform noise reduction to improve edge and feature detection and feature tracking in the thermal-infrared modality.
– Evaluate the positive effects of noise reduction filters in low-SNR videos.
– Propose an approach for thermal-infrared scenarios.

8.1.2 Feature Detectors

– Investigate well-known feature detectors used in normal circumstances.


– Implement a number of detectors in MATLAB.
– Evaluate the performance of a number of feature detectors in feature tracking.
– Propose two effective techniques for detecting interest points in thermal-infrared video.
– Propose a corner-strength ranking technique within the traditional framework.

8.1.3 Edge Detectors

– Review popular edge detectors.
– Implement a few edge detectors in MATLAB.
– Evaluate the performance of various edge detectors.
– Propose a smoothness-based edge detector.
– Propose an edge-strength ranking process within the standard framework.

8.1.4 Feature Tracking

– Investigate current methods for point tracking.
– Implement feature tracking using different feature detectors and denoising methods.
– Evaluate the combinations of a number of feature detectors and denoising methods.

8.2 Future Work

8.2.1 Implementation of Denoising Methods in C++ Code

Both denoising methods, namely the Non-local Means Filter and the Sparse 3D Transform-domain Collaborative Filter (BM3D), achieved considerably better results than were possible with the original noisy images. In future work, it must be decided which one can be more easily and efficiently implemented in C++, after which it will be integrated into the existing project. A summary of the comparative qualities of each method should be made so that a reasonable decision can be reached on which one to implement in C++. The following factors are considered:
– How fast is the algorithm?
  + How fast is it likely to be in C++? In MATLAB, NL-means is five times faster than BM3D.

  + Will it be able to denoise every frame, or just the frames used for detection?
– Is there already C++ code?
  + What other libraries does it depend on?
  + How hard will it be to integrate into the current system?
  + Are there any copyright issues with using the code?
  + If there is no code, how complex is the algorithm?
– How good is the performance on low-SNR images compared to the other methods?
– Is there any copyright/patent issue with using the algorithm?
– If there is existing code without copyright/patent issues, the latest C++ code should be checked out, compiled and thoroughly tested.

8.2.2 Improve Performance of Custom Feature Detectors

The following should be achieved:
– Improve the ranking of both feature detectors so that both methods achieve robust feature ranking. Specifically, the two factors below should be considered.
  + Drift decreases with higher feature response or strength.
  + Survival rate increases with higher feature response or strength.
– Achieve better, or comparable but differently advantageous, performance compared to existing OpenCV alternatives.

8.2.3 Performance Evaluation for Monocular SLAM

The following should be done:
– Compile the current system with PCL to make sure that the program can build all components, including SLAM, and runs without crashing.
– Obtain five test sequences and test the five launch files.
– Generate denoised versions of these sequences.
– Determine good initialization settings for each sequence.


