Doctoral Dissertation

Cooperative Anchoring: Sharing Information about Objects in Multi-Robot Systems

Kevin LeBlanc

Örebro Studies in Technology 39
Örebro 2010

© Kevin LeBlanc, 2010
Title: Cooperative Anchoring: Sharing Information about Objects in Multi-Robot Systems
Publisher: Örebro University, 2010
www.publications.oru.se
Printer: Intellecta Infolog, Kållered 10/2010

ISBN 978-91-7668-754-3
ISSN 1650-8580

Abstract

In order to perform most tasks, robots must perceive or interact with physical objects in their environment; often, they must also communicate and reason about objects and their properties. Information about objects is typically produced, represented and used in different ways in various robotic sub-systems. In particular, high-level sub-systems often reason with object names and descriptions, while low-level sub-systems often use representations based on sensor data. In multi-robot systems, object representations are also distributed across robots. Matters are further complicated by the fact that the sets of objects considered by each robot and each sub-system often differ.

Anchoring is the process of creating and maintaining associations between descriptions and perceptual information corresponding to the same physical objects. To illustrate, imagine you are asked to fetch “the large blue book from the bookshelf”. To accomplish this task, you must somehow associate the description of the book you have in your mind with the visual representation of the appropriate book.

Cooperative anchoring deals with associations between descriptions and perceptual information which are distributed across multiple agents. Unlike humans, robots can exchange both descriptions and perceptual information; in a sense, they are able to “see the world through each other’s eyes”. Again, imagine you are asked to fetch a particular book, this time from the library. But now, in addition to your own visual representations, you also have access to information about books observed by others. This can allow you to find the correct book without searching through the entire library yourself.

This thesis proposes an anchoring framework for both single-robot and cooperative anchoring that addresses a number of limitations in existing approaches. The framework represents information using conceptual spaces, allowing various types of object descriptions to be associated with uncertain and heterogeneous perceptual information. An implementation is described which uses fuzzy logic to represent, compare and combine information. The implementation also includes a cooperative object localisation method which takes uncertainty in both observations and self-localisation into account. Experiments using simulated and real robots are used to validate the proposed framework and the cooperative object localisation method.

Acknowledgements

I’ve often said that the acknowledgements section is probably one of the most frequently read parts of a thesis, and that as such, it should be written with particular care. However, like most Ph.D. students before me, I wrote this section at the last minute, and I apologise if in the final rush to get everything finished I’ve overlooked anyone. When you spend as much time doing anything as I’ve spent preparing this thesis, a lot of people are bound to have had the opportunity to help.

First and foremost, I would like to thank my supervisor, Alessandro Saffiotti, for giving me the opportunity to carry out my Ph.D. studies at Örebro University. The breadth and depth of his knowledge have been an immensely valuable resource over the years, and I am extremely grateful for his guidance, encouragement, and patience throughout my studies.

A number of people have helped the development of the ideas and methods contained in this thesis. In particular, I would like to thank Silvia Coradeschi, Amy Loutfi, and Mathias Broxvall for their involvement and interest in this work. I would also like to thank Mathias for help with some of the algorithms in this thesis, and for his work on the Peis middleware, which was an invaluable tool during the experimental phase of this work. The presented experiments would also not have been possible without help from Per Sporrong and Bo-Lennart Silfverdal, who somehow managed to keep the robots and other hardware at AASS up and running despite my software.

For always having answers to administrative questions, I would like to thank Barbro Alvin, Anne Moe, Kicki Ekberg, and Jenny Tiberg; I would also like to thank the countless others who have helped me with all sorts of administrative issues over the years.

This work was partially funded by CUGS (the National Graduate School in Computer Science, Sweden), and I would like to thank the lecturers and my fellow CUGS students for making CUGS courses and seminars both rewarding and entertaining. I would also like to thank the Swedish Knowledge Foundation for financial support, and ETRI (Electronics and Telecommunications Research Institute, Korea) for funding the Peis-Ecology project.


And of course, Ph.D. studies involve more than just taking courses and writing a thesis. I would like to thank everyone at AASS, past and present, for making the work environment so enjoyable. In particular, I would like to thank Robert Lundh for many interesting conversations while travelling to and from CUGS courses, and for listening to my numerous complaints about the oddities of the Swedish language. I would also like to thank the Italians for dragging me out of my office when I needed it most. And thank you to everyone who went skiing, biking, swimming, or running with me over the years. The fact that we chose to torture ourselves the way we did rather than work on our theses shows just how powerful the procrastination instinct is with Ph.D. students in general (and this one in particular). A special thank you goes to my parents and my brother, for their love and support, and for providing me with everything I needed in order to get where I am today. I would also like to thank my extended family and friends, both near and far, for supporting me throughout my studies. And last but not least, I would like to express my loving thanks to Anita for her encouragement, support, patience, and help during the writing of this thesis.

Contents

1 Introduction
  1.1 Motivation
  1.2 Illustration
  1.3 Objectives
  1.4 Challenges
  1.5 Contributions
  1.6 Outline
  1.7 Publications

2 Related Work
  2.1 Anchoring
    2.1.1 Single-Robot Anchoring
    2.1.2 Cooperative Anchoring
    2.1.3 Overcoming the Limitations of Existing Approaches
  2.2 Related Challenges
    2.2.1 Symbol Grounding
    2.2.2 Binding
    2.2.3 Perception Management
    2.2.4 Tracking
    2.2.5 Data Association
    2.2.6 Information Fusion
  2.3 Discussion

3 Problem Formalisation
  3.1 Ingredients
    3.1.1 Information Sources
    3.1.2 Anchors
    3.1.3 Descriptions
  3.2 Problem Definition
    3.2.1 Data Association
    3.2.2 Information Fusion
    3.2.3 Prediction
  3.3 Illustration

4 Anchoring Framework
  4.1 Framework Overview
    4.1.1 Local and Global Anchoring
    4.1.2 A Decentralised Approach
    4.1.3 Illustration
  4.2 Conceptual Spaces
    4.2.1 Interpretations
    4.2.2 Similarity
    4.2.3 Anchor Spaces
  4.3 Local Anchor Management
    4.3.1 Self-anchors
    4.3.2 Local Data Association
    4.3.3 Local Information Fusion
    4.3.4 Local Prediction
    4.3.5 Local Anchor Deletion
    4.3.6 Illustration
  4.4 Global Anchor Management
    4.4.1 Global Data Association
    4.4.2 Global Information Fusion
    4.4.3 Global Prediction
    4.4.4 Global Anchor Deletion
    4.4.5 Illustration
  4.5 Descriptions
    4.5.1 Descriptions and Anchoring
    4.5.2 Descriptions and Interest Filtering
  4.6 Names
    4.6.1 Assigning Names
    4.6.2 Associating Names and Anchors
  4.7 Framework Summary
  4.8 Discussion

5 Framework Realisation Part 1: Representations
  5.1 Implementation Overview
    5.1.1 Representations
    5.1.2 Processes
    5.1.3 Experimental Tool
  5.2 Information Representation
    5.2.1 Fuzzy Sets
    5.2.2 Implementing Fuzzy Sets
    5.2.3 Operations On Fuzzy Sets
  5.3 Domain Choices
    5.3.1 Common Local and Global Anchor Spaces
    5.3.2 Dimensions and Coordinate Systems
  5.4 Descriptions
  5.5 Grounding Functions
  5.6 Conceptual Sensor Models
    5.6.1 Symbolic Conceptual Sensor Models
    5.6.2 Numeric Conceptual Sensor Models
    5.6.3 Negative Information
  5.7 Summary

6 Framework Realisation Part 2: Processes
  6.1 Self-Localisation
    6.1.1 Representation
    6.1.2 Landmark-Based Self-Localisation
    6.1.3 Adaptive Monte-Carlo Localisation
  6.2 Object Localisation
    6.2.1 Relevant Information
    6.2.2 Coordinate Transformation Process
    6.2.3 Approximate Coordinate Transformation
    6.2.4 Coordinate Transformation Complexity
  6.3 Data Association
    6.3.1 Local Data Association
    6.3.2 Global Data Association
    6.3.3 Data Association Algorithm
    6.3.4 Bounded Data Association Algorithm
    6.3.5 Data Association Complexity
  6.4 Information Fusion
    6.4.1 Local Information Fusion
    6.4.2 Global Information Fusion
    6.4.3 Approximating Local Anchors
    6.4.4 Information Fusion Complexity
  6.5 Prediction
    6.5.1 Local Prediction
    6.5.2 Global Prediction
    6.5.3 Anchor Deletion
  6.6 Illustration
    6.6.1 Robot 1: Local Anchor Management
    6.6.2 Robot 2: Local Anchor Management
    6.6.3 Global Anchor Management
  6.7 Summary

7 Cooperative Object Localisation Experiments
  7.1 Methodology
  7.2 Experimental setup
    7.2.1 Robots
    7.2.2 Environment
    7.2.3 Ground truth
    7.2.4 Performance Metrics
    7.2.5 Software Setup
  7.3 Evaluated Methods
  7.4 Exploring The Input-Error Landscape
  7.5 Results
    7.5.1 Artificial Errors on Target Observations
    7.5.2 Artificial Errors on Landmark Observations
    7.5.3 Unaltered Data
  7.6 Discussion

8 Anchoring Experiments
  8.1 Objectives
  8.2 Methodology
  8.3 Common Experimental Setup
    8.3.1 Environment
    8.3.2 Fixed Cameras
    8.3.3 Mobile Robots
    8.3.4 Software Configuration
  8.4 Experiment 1: Find a Parcel (Simulation)
    8.4.1 Goal
    8.4.2 Setup
    8.4.3 Execution
    8.4.4 Results
    8.4.5 Discussion
  8.5 Experiment 2: Find a Parcel (Real Robots)
    8.5.1 Goal
    8.5.2 Setup
    8.5.3 Execution
    8.5.4 Results
    8.5.5 Discussion
  8.6 Experiment 3: Find Multiple Parcels
    8.6.1 Goal
    8.6.2 Setup
    8.6.3 Execution
    8.6.4 Results
    8.6.5 Discussion
  8.7 Experiment 4: Anchoring in a Full Robotic System
    8.7.1 Goal
    8.7.2 Setup
    8.7.3 Approach
    8.7.4 Execution
    8.7.5 Results
    8.7.6 Discussion
  8.8 Summary

9 Conclusions
  9.1 Summary
    9.1.1 Problem Definition
    9.1.2 Framework
    9.1.3 Realisation
    9.1.4 Experiments
  9.2 Limitations and Future Work
    9.2.1 Framework Improvements and Extensions
    9.2.2 Implementation Improvements and Extensions
  9.3 Conclusions

References

List of Figures

1.1 Problem illustration
3.1 Illustration of the problem formalisation
4.1 Framework Overview
4.2 Conceptual space
4.3 Local anchor management
4.4 Global anchor management
4.5 Framework summary
5.1 Uncertainty in fuzzy sets
5.2 Fuzzy sets implemented using bin models
5.3 Parametric ramp membership functions
5.4 Parametric 2D ramp membership functions
5.5 Parametric trapezoidal membership functions
5.6 Parametric 2D trapezoidal membership functions
5.7 Multi-modal parametric membership functions
5.8 Hybrid 2.5D grid
5.9 Matching fuzzy sets
5.10 Fusing fuzzy sets to reach a consensus
5.11 Fusing unreliable information
5.12 Trapezoidal envelope
5.13 Region information
5.14 Near self information
5.15 Symbolic colour information
5.16 Near position information
5.17 Numeric colour information
6.1 Landmark-based self-localisation
6.2 AMCL self-localisation
6.3 Coordinate transformation
6.4 Full versus approximate coordinate transformation
6.5 Table of entities for robot 1
6.6 Local data association search for robot 1
6.7 Table of entities for robot 2
6.8 Local data association search for robot 2
6.9 Global matching example
6.10 Global matching search
6.11 Global anchors
7.1 AIBO robot
7.2 Experimental environment
7.3 Experimental layouts
7.4 Tracker error versus distance from reference
7.5 Systematic bearing errors
7.6 Random bearing errors
7.7 Systematic range errors
7.8 Random range errors
7.9 False positives
7.10 Fused error versus self-localisation errors
7.11 Fused error versus self-orientation errors
7.12 Results for each method
7.13 Self and ball position estimates
7.14 Orientation estimates
7.15 Bearing errors cause averaging to perform poorly
7.16 Range errors cause the proposed method to perform poorly
8.1 The Peis-Home
8.2 Simulation of the Peis-Home
8.3 Fixed cameras and mobile robots
8.4 Images from the fixed cameras in the Peis-Home
8.5 Anchoring monitor tool
8.6 Software configuration using simulator
8.7 Software configuration using real robots
8.8 Experiment 1: domains
8.9 Experiment 1: initial configuration
8.10 Experiment 1: anchors
8.11 Photos of setup for experiments 2 and 3
8.12 Initial configuration for experiments 2 and 3
8.13 Experiment 2 run 1: observations and trajectories
8.14 Experiment 2 run 2: observations and trajectories
8.15 Experiment 2 run 1: observation error versus time
8.16 Experiment 2 run 2: observation error versus time
8.17 Experiment 2 run 1: anchors
8.18 Experiment 2 run 2: anchors
8.19 Experiment 2 run 1: global anchors
8.20 Experiment 2 run 2: global anchors
8.21 Experiment 2 run 1: global anchor error versus time
8.22 Experiment 2 run 2: global anchor error versus time
8.23 Experiment 2 run 3: observations
8.24 Experiment 2 run 5: observations
8.25 Experiment 2 run 3: observation error versus time
8.26 Experiment 2 run 4: observation error versus time
8.27 Experiment 2 run 3: global anchors
8.28 Experiment 2 run 4: global anchors
8.29 Experiment 2 run 3: global anchor error versus time
8.30 Experiment 2 run 4: global anchor error versus time
8.31 Experiment 2 run 3: global anchors (bounded)
8.32 Experiment 2 run 4: global anchors (bounded)
8.33 Experiment 2 run 3: global anchor error versus time (bounded)
8.34 Experiment 2 run 4: global anchor error versus time (bounded)
8.35 Experiment 3 run 1: local timing
8.36 Experiment 3 run 2: local timing
8.37 Experiment 3 run 1: global timing
8.38 Experiment 3 run 2: global timing
8.39 Experiment 3: computation time versus number of associations
8.40 Experiment 4: field of view of the fixed cameras
8.41 Experiment 4: objects
8.42 Experiment 4: shape and SURF signatures
8.43 Experiment 4: range and bearing trapezoids
8.44 Experiment 4: possible object positions

List of Tables

1.1 Examples of various types of information
2.1 Limitations of existing anchoring approaches
5.1 Positive and negative descriptions
6.1 Entities used for local data association
6.2 Entities used for global data association
6.3 Associations of entities
6.4 Hypotheses
6.5 Local associations for robot 1
6.6 Local associations for robot 2
6.7 Associations for global matching
8.1 Full and bounded data association results

List of Algorithms

1 Fuzzy coordinate transformation
2 Approximate fuzzy coordinate transformation
3 Data association algorithm
4 Bounded data association algorithm
5 Analysis of the input-error landscape

Chapter 1

Introduction

1.1 Motivation

Robotic systems are used in an increasing number of application areas today [65]. This trend is supported by numerous advances which allow more useful and complex tasks to be performed. In particular, advances in multi-robot systems [33, 4, 55], and more recently, network robot systems [141, 139], allow a wide range of interesting problems to be addressed. For many of these problems, single-robot systems are either inadequate or inefficient. The advantages of multi-robot and network robot systems arise mainly from their ability to exploit parallelism, heterogeneity, and cooperation [74].

The vast majority of autonomous robot applications require that robots perceive or interact with physical objects in some way. Simply obtaining object properties is the goal of many tasks; this is true for most surveillance and detection tasks, for instance. In many other tasks, knowledge of object properties is required in order to enable identification and meaningful physical interaction; this is the case for tasks such as foraging and manipulation. Information about object positions, in particular, is crucial for most common tasks.

Information about objects is typically produced, represented and used in different ways in the various sub-systems of robotic architectures. In particular, cognitive sub-systems often reason with names and descriptions of objects, while perception and control sub-systems often deal with object representations based on sensor data. In multi-robot systems, object representations are also distributed across robots. Matters are further complicated by the fact that the sets of objects considered by each robot and each sub-system often differ.

Roughly stated, anchoring is the process of creating and maintaining associations between descriptions and perceptual information corresponding to the same physical objects [135, 38]. To illustrate, imagine you are asked to fetch “the large blue book from the bookshelf”. To accomplish this task, you must somehow associate the description of the book you have in your mind with the visual representation of the appropriate book.


When descriptions and perceptual information are distributed across multiple agents, the process is called cooperative anchoring. Unlike humans, robots can exchange both descriptions and perceptual information; in a sense, they are able to “see the world through each other’s eyes”. So not only can they extract and exchange object descriptions, such as “the large blue book”, but they can also directly exchange representations based on perceptual data. Again, imagine you are asked to fetch a particular book, this time from the library. But now, in addition to your own visual representations, you also have access to information about books observed by others. This can allow you to find the correct book without searching through the entire library yourself.

1.2 Illustration

Figure 1.1 illustrates the cooperative anchoring problem. In the depicted scenario, a mobile robot called Astrid is told to fetch “parcel-21” from the entrance of an apartment containing a number of sensors and robots. In order to perform this task, Astrid can use information obtained from a number of different sources.

Figure 1.1: Illustration of the cooperative anchoring problem. Astrid is tasked with finding “parcel-21”, which is located near the entrance of the apartment. In order to identify the correct parcel, information from various sources must be considered.

• The task contains a description of the parcel of interest; Astrid might store this information in a knowledge base, for instance:
  position[parcel-21] = {entrance}

• Astrid’s vision system can detect the colour and approximate positions of two observed objects:
  position ≈ (3.1, 1.5), colour = (0.3, 0.9, 0.8)
  position ≈ (2.9, 1.5), colour = (0.0, 1.0, 0.7)

• An RFID reader called Reader-01, located near the entrance, can detect one RFID-tagged object:
  parcel-21, striped

• Another robot, called PeopleBoy, is equipped with a vision system capable of detecting the colour and texture of objects; however, due to poor self-localisation, PeopleBoy is, for the moment, unable to compute position estimates for detected objects:
  colour = (0.4, 0.9, 0.8), texture = {striped}
  colour = (0.0, 1.0, 0.8), texture = {none}

• A black and white security camera called Camera-01, mounted on the ceiling, can detect object positions; due to its fixed position and elevated perspective, position estimates from the security camera are particularly accurate and precise:
  position = (3.11, 1.58)
  position = (2.82, 1.48)

In the presented scenario, the cooperative anchoring problem with which Astrid is faced involves associating the provided description of “parcel-21” with corresponding perceptual information arriving from a number of heterogeneous and distributed sources. The given description contains both a name (“parcel-21”) and symbolic position information (“near the entrance”). Descriptions often contain names and symbolic information, since they typically originate from cognitive processes which reason with such representations. However, the description could just as easily have contained numeric information – for instance, the task could have been to fetch the parcel located at position (3.1, 1.6).

The available perceptual information originates from a number of different sources. Although perceptual information is often numeric, sources can provide perceptual information at a symbolic level. In the above scenario, for instance, information from the RFID reader, as well as texture information from PeopleBoy’s vision system, are symbolic.
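To make the heterogeneity of these sources concrete, the sketch below writes the items of information from the scenario down as simple tagged records. The record layout and field names are assumptions made purely for illustration (for instance, the RFID reading is encoded here as a name plus a texture, which is only one possible reading of the scenario); this is not the representation developed in this thesis.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple

@dataclass
class Item:
    """One item of information about an object, from one source (illustrative only)."""
    source: str                                              # sensor or sub-system that produced it
    position: Optional[Tuple[float, float]] = None           # numeric position estimate, if any
    symbols: Dict[str, Any] = field(default_factory=dict)    # symbolic properties
    numerics: Dict[str, Any] = field(default_factory=dict)   # numeric properties

# The description extracted from Astrid's task (non-perceptual, symbolic).
task_description = Item(source="knowledge-base",
                        symbols={"name": "parcel-21", "position": "entrance"})

# Perceptual information from the distributed sources in the scenario.
observations = [
    Item(source="astrid-vision", position=(3.1, 1.5),  numerics={"colour": (0.3, 0.9, 0.8)}),
    Item(source="astrid-vision", position=(2.9, 1.5),  numerics={"colour": (0.0, 1.0, 0.7)}),
    Item(source="reader-01",     symbols={"name": "parcel-21", "texture": "striped"}),
    Item(source="peopleboy",     numerics={"colour": (0.4, 0.9, 0.8)}, symbols={"texture": "striped"}),
    Item(source="peopleboy",     numerics={"colour": (0.0, 1.0, 0.8)}, symbols={"texture": "none"}),
    Item(source="camera-01",     position=(3.11, 1.58)),
    Item(source="camera-01",     position=(2.82, 1.48)),
]
# Cooperative anchoring must decide which of these items refer to the same physical
# parcel, and which of them match the task description.
```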


1.3 Objectives

Although most robotic systems must address the anchoring problem in some way, few explicit approaches exist; most systems use application- or system-specific solutions, which fail to address the general problem. Moreover, existing approaches have a number of limitations. In particular, they do not adequately consider uncertainty and heterogeneity in object descriptions and perceptual information; also, cooperative aspects are often ignored.

The main objective of this thesis is to propose a complete and novel anchoring framework which addresses these limitations. The framework should address both the single-robot and cooperative anchoring problems, and it should be able to associate various types of object descriptions with uncertain and heterogeneous perceptual information arriving from distributed sources. The thesis also aims to experimentally validate the proposed framework.

1.4 Challenges

Anchoring involves a number of challenges. One important aspect is the creation and maintenance of object representations based on perceptual information; these will then be associated with descriptions of objects of interest. To address this challenge, the following sub-problems must be solved.

1. Perceptual information should be associated with appropriate object representations. This is data association [12, 129, 60, 8], an important and well-studied problem in robotics. By addressing data association, anchoring ensures that perceptual information about a specific object is correctly associated with the appropriate internal representation of that object.

2. Perceptual information arriving from different sources at different times should be gathered and fused. Gathering properties is related to the binding problem [152, 18]; combining them is related to the information fusion problem [154, 76]. Binding brings the various names, descriptions, and representations of an object together. This allows sub-systems and other robots to easily access all available information about a particular object. Information fusion ensures that estimates of object properties are as complete and accurate as possible. As will be discussed in section 2.2.6, anchoring is mainly concerned with fusion at levels 0 and 1 of the JDL data fusion process model [158, 145, 100].

3. Estimates of object properties should be maintained in time via prediction; prediction, data association, and information fusion are used to perform tracking [8, 10]. Prediction allows items of information arriving at different times to be meaningfully compared and combined. It also provides persistent estimates, which are useful when dealing with occlusions, sensor errors, and scarcity of perceptual resources.
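To see how these three sub-problems interact, the following sketch shows one maintenance cycle over a set of anchors. It is only a schematic illustration under simple assumptions (point positions, constant-velocity prediction, nearest-neighbour association, averaging fusion); it is not the algorithm developed in this thesis.

```python
import math

MATCH_THRESHOLD = 0.5  # assumed cut-off for accepting an association

class Anchor:
    """Minimal stand-in for an anchor: one tracked object with a position estimate."""
    def __init__(self, position, velocity=(0.0, 0.0)):
        self.position, self.velocity = list(position), list(velocity)

    def predict(self, dt):
        # Sub-problem 3 (prediction): move the estimate forward in time.
        self.position[0] += self.velocity[0] * dt
        self.position[1] += self.velocity[1] * dt

    def similarity(self, percept):
        # Similarity decreases with the distance between prediction and observation.
        return 1.0 / (1.0 + math.dist(self.position, percept))

    def fuse(self, percept):
        # Sub-problem 2 (fusion/binding): combine observation and estimate (here, averaged).
        self.position = [(p + o) / 2.0 for p, o in zip(self.position, percept)]

def update_anchors(anchors, percepts, dt=0.1):
    """One maintenance cycle: prediction, data association, information fusion."""
    for anchor in anchors:
        anchor.predict(dt)
    for percept in percepts:
        # Sub-problem 1 (data association): pick the best-matching anchor, if any.
        best = max(anchors, key=lambda a: a.similarity(percept), default=None)
        if best is not None and best.similarity(percept) > MATCH_THRESHOLD:
            best.fuse(percept)
        else:
            anchors.append(Anchor(percept))  # unmatched percepts create new anchors
    return anchors
```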


Solutions to these sub-problems exist for many robotic applications. However, most of these consider only a few different types of information, often selected based on the task at hand or available sensors. Anchoring requires a more general approach, able to deal with the different types of information present in robotic systems. Some approaches to the sub-problems in question consider heterogeneous information; however, these approaches are rarely used in robotic applications. The anchoring approach proposed in this thesis allows the above sub-problems to be addressed despite information heterogeneity.

There are many possible ways to categorise and describe information. Some of the different types of information used in robotic systems include: information from various domains (e.g. colour, position, or shape), information represented in different ways (e.g. grids, samples, or parametric functions), and information with different characteristics (e.g. noisy or unreliable). Uncertainty, in particular, is an important characteristic for robotic systems. Robots often deal with information characterised by various types and amounts of uncertainty.

In table 1.1, a categorisation of information types which is particularly useful for describing the anchoring problem is proposed. The table describes information using the following three dimensions.

• Perceptual versus Non-perceptual: Perceptual information is measured; such information is often, but not always, numeric. Many sensors act as virtual sensors, which “measure” numeric values but produce symbolic abstractions of these values. Non-perceptual information is modelled; for instance, a priori information is non-perceptual. Anchoring can be seen as the problem of associating non-perceptual object descriptions with representations of objects based on perceptual information.

• Symbolic versus Numeric: Typically, higher levels in robotic architectures rely mainly on symbolic information. Symbolic labels, or names, are often used to denote objects, and symbolic predicates are often used to describe them. Some sensors, virtual sensors in particular, may also provide information at a symbolic level. Numeric information is used in many different ways, throughout robotic architectures; in particular, lower levels in robotic architectures often produce and use numeric information for perception and control.

• Interoceptive versus Exteroceptive: Information about oneself is interoceptive; this includes proprioception (e.g. feeling the position of one’s arm) and egoreception (e.g. seeing one’s arm at a given position). Exteroceptive information is about external entities, such as physical objects. This distinction is particularly important in systems such as network robot systems [141, 139], in which observed objects can communicate their properties to observing robots.


Table 1.1: Examples of various types of information. Anchoring associates non-perceptual object descriptions with representations of objects based on perceptual information. This can be particularly challenging given the diversity of the relevant information.

Symbolic (qualitative), Interoceptive (about self)
  Perceptual (measured): {battery-low}, {pan-joint-stuck}, {grasper-open}
  Non-perceptual (modelled): {my_colour=red}, {my_weight=heavy}

Symbolic (qualitative), Exteroceptive (about world)
  Perceptual (measured): {obstacle-near}, {door-open}, {lights-on}
  Non-perceptual (modelled): {colour=green}, {texture=striped}, topological map

Numeric (quantitative), Interoceptive (about self)
  Perceptual (measured): battery_voltage=11.3, pan_position=0.23
  Non-perceptual (modelled): my_colour=(0, 249, 88), my_weight=3.2

Numeric (quantitative), Exteroceptive (about world)
  Perceptual (measured): range_bearing=(2, 9), blob_colour=(68, 99, 84), wall_position=(3, 23)
  Non-perceptual (modelled): cup_volume=0.20, cup_position=(2, 8), geometric map

Many common problems involve only a small subset of the types of information discussed here. For example:

• self-localisation normally relies on numeric perceptual exteroceptive information (e.g. there is a wall 0.92m away) and numeric non-perceptual exteroceptive information (e.g. a geometric map);

• topological self-localisation might use symbolic perceptual exteroceptive information (e.g. “room-10” detected) and symbolic non-perceptual exteroceptive information (e.g. a topological map);

• cooperative self-localisation might use perceptual interoceptive information (own properties) and perceptual exteroceptive information (properties of perceived robots);

• traditional sensor fusion and tracking approaches often consider only numeric perceptual exteroceptive information (e.g. there are objects at positions (12, 35) and (12, 39)).

Most existing works on anchoring consider only symbolic non-perceptual exteroceptive information (e.g. the green box-shaped object), and numeric perceptual exteroceptive information (e.g. an object with colour (67, 200, 177) was observed at position (21, 32)). However, as has been discussed, the anchoring problem can involve all of the types of information discussed here.
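As a purely illustrative reading of this categorisation, the sketch below tags a few of the items from table 1.1 along the three dimensions and separates the descriptions from the perceptual information that anchoring must associate them with. The type and field names are assumptions made for the example, not part of the thesis framework.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InfoType:
    perceptual: bool     # measured (True) vs. modelled (False)
    symbolic: bool       # qualitative (True) vs. quantitative (False)
    interoceptive: bool  # about self (True) vs. about the world (False)

# A few of the examples from table 1.1, tagged along the three dimensions.
examples = {
    "{battery-low}":            InfoType(perceptual=True,  symbolic=True,  interoceptive=True),
    "{colour=green}":           InfoType(perceptual=False, symbolic=True,  interoceptive=False),
    "blob_colour=(68, 99, 84)": InfoType(perceptual=True,  symbolic=False, interoceptive=False),
    "geometric map":            InfoType(perceptual=False, symbolic=False, interoceptive=False),
}

# Anchoring associates non-perceptual descriptions (e.g. "{colour=green}") with
# object representations built from perceptual information (e.g. blob_colour).
descriptions     = {k for k, t in examples.items() if not t.perceptual}
perceptual_items = {k for k, t in examples.items() if t.perceptual}
```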


1.5 Contributions

Although anchoring can be vital for even the most trivial tasks, humans typically perform anchoring without even thinking about it. Perhaps correspondingly, the problem is often overlooked in the robotics literature. In many robotic architectures, anchoring is performed in an ad-hoc manner, where the approach to anchoring is hidden within the implementation. For a number of years, however, the anchoring problem has been gaining recognition as an important challenge in robotics, and several works explicitly address the problem. Despite this, many approaches suffer from a number of important limitations, in particular with respect to the types of information considered. The cooperative anchoring problem has received little attention in the literature, and only a few works consider anchoring from a multi-robot perspective. Given this, the contributions of this thesis are the following.

1. The main contribution of this thesis is the proposal of a complete and novel anchoring framework for robotic systems, which addresses both single-robot anchoring and cooperative anchoring. The proposed framework addresses a number of limitations in current approaches, and provides a unified approach which transparently extends from single-robot to multi-robot scenarios.

2. The thesis presents a “proof of concept” realisation of the proposed framework, which is used to validate its applicability to the anchoring problem. The implementation uses fuzzy logic as a primary tool for representing, comparing, and combining information. The implementation is able to consider various types of information about objects originating from multiple robots. A number of experiments are described which illustrate how the framework addresses the anchoring problem.

3. The thesis proposes a novel data association algorithm, used within the realised anchoring framework, which considers various types of information from various domains. The algorithm allows heterogeneous items of information to be matched and associated, for both single-robot and cooperative anchoring. The algorithm is validated through experiments performed using the presented implementation of the framework.

4. The thesis proposes a novel information fusion algorithm for cooperative object localisation, used within the realised anchoring framework. The approach is based on fuzzy logic, and it fully considers uncertainty both in observations and self-localisation. A set of experiments validates the fusion algorithm using an experimental methodology which systematically tests the algorithm’s robustness with respect to various types of errors on each of the method’s inputs.


1.6 Outline

The rest of this thesis is organised as follows.

Chapter 2 discusses existing approaches to the anchoring and cooperative anchoring problems, and gives an overview of works which address a number of important related problems.

Chapter 3 describes the systems considered in this work, and provides a formal definition of the anchoring and cooperative anchoring problems, in terms of a number of key system components.

Chapter 4 presents the proposed computational framework for single-robot and cooperative anchoring. The chapter also briefly discusses conceptual spaces, which inspired the approach to information representation used in the framework.

Chapter 5 discusses how fuzzy sets are used to represent information in the presented implementation of the proposed framework. A number of transformations are also presented, which are used to convert various types of information into representations within the same conceptual space.

Chapter 6 describes the various processes used in the implementation of the proposed framework. The chapter describes how self-localisation and object localisation are performed, and detailed descriptions of the implemented data association and information fusion algorithms are given.

Chapter 7 presents a number of experiments which examine the performance of the proposed information fusion algorithm, applied to the cooperative object localisation problem. The experiments include an “input-error landscape” analysis, which characterises the performance of the fusion algorithm in response to various types of systematic and random errors applied to the method’s inputs.

Chapter 8 presents a number of experiments which illustrate the applicability of the proposed framework to the anchoring problem. The first experiment was performed in a mid-fidelity simulator; the other three were performed using real robots.

Chapter 9 concludes the thesis with a summary of the work and its contributions, a discussion of the limitations of the proposed framework and the presented implementation of it, and an overview of possible directions for future work.


1.7 Publications

Some of the work presented in this thesis has been published in a number of journal and conference papers, available at http://aass.oru.se.

• D. Herrero-Pérez, H. Martínez-Barberá, K. LeBlanc, and A. Saffiotti. Fuzzy uncertainty modeling for grid based localization of mobile robots. Int Journal of Approximate Reasoning, 51(8):912–932, October 2010.

• K. LeBlanc and A. Saffiotti. Multirobot object localization: A fuzzy fusion approach. IEEE Trans on Systems, Man and Cybernetics B, 39(5):1259–1276, 2009.

• A. Saffiotti, M. Broxvall, M. Gritti, K. LeBlanc, R. Lundh, J. Rashid, B. S. Seo, and Y. J. Cho. The PEIS-ecology project: vision and results. In Procs of the IEEE Int Conf on Intelligent Robots and Systems (IROS), pages 2329–2335, Nice, France, 2008.

• K. LeBlanc and A. Saffiotti. Cooperative anchoring in heterogeneous multi-robot systems. In Procs of the IEEE Int Conf on Robotics and Automation (ICRA), Pasadena, CA, USA, 2008.

• K. LeBlanc and A. Saffiotti. Issues of perceptual anchoring in ubiquitous robotic systems. In Procs of the ICRA-07 Workshop on Omniscient Space, Rome, Italy, 2007.

• K. LeBlanc and A. Saffiotti. Cooperative information fusion in a network robot system. In Procs of the Int Conf on Robot Communication and Coordination (RoboComm), Athens, Greece, 2007.

• J.-P. Cánovas, K. LeBlanc, and A. Saffiotti. Robust multi-robot object localization using fuzzy logic. In D. Nardi, M. Riedmiller, and C. Sammut, editors, RoboCup 2004: Robot Soccer World Cup VIII, LNCS, pages 247–261. Springer, 2005.

• J.-P. Cánovas, K. LeBlanc, and A. Saffiotti. Cooperative object localization using fuzzy logic. In Procs of the IEEE Int Conf on Methods and Models in Automation and Robotics (MMAR), pages 773–778, 2003.

• A. Saffiotti and K. LeBlanc. Active perceptual anchoring of robot behavior in a dynamic environment. In Procs of the IEEE Int Conf on Robotics and Automation (ICRA), pages 3796–3802, San Francisco, CA, 2000.

Chapter 2

Related Work

In this chapter a discussion of related work is presented. Existing single-robot and cooperative approaches to the anchoring problem are first described, and a number of their limitations are discussed. The anchoring problem is then situated with respect to a number of important related problems. Specifically, the relationships between anchoring and symbol grounding, binding, perception management, tracking, data association, and information fusion are discussed.

2.1 Anchoring

In chapter 1, anchoring was described as the process of creating and maintaining associations between descriptions and perceptual information corresponding to the same physical objects. The problem of performing anchoring in robotic systems was originally acknowledged by Saffiotti [135], and a detailed formalisation was first proposed by Coradeschi and Saffiotti [38]. The term “anchor” was borrowed from the field of situation semantics [14], where the term is used to refer to the assignment of variables to individuals, relations, and locations. Although previous works had examined the problem of linking descriptions to their referents from philosophical and linguistic standpoints [63, 134], the relevant concepts had yet to be applied to the corresponding computational problem facing artificial systems equipped with sensors.

The anchoring problem has often been overlooked in robotics, and the problem is often addressed using ad hoc approaches. Anchoring has, however, gradually gained recognition as an important challenge for robotic systems, as is evidenced by a number of workshops, special issues, and surveys which address the topic [39, 41, 42, 102].


2.1.1 Single-Robot Anchoring

Anchoring Foundations

A number of approaches to single-robot anchoring have been proposed over the years, and many of these were inspired by the formalisation proposed by Coradeschi and Saffiotti [38]. Their work has been extended and studied in a number of subsequent works [40, 104]. The later versions of their framework include the following components.

• A symbol system which contains: a set of symbols which denote objects (e.g. “cup-22”); a set of unary predicate symbols, which describe symbolic properties of objects (e.g. “green”); and an inference mechanism which uses these components.

• A perceptual system which contains: a set of percepts (collections of measurements assumed to have originated from the same object) and a set of attributes (measurable properties of percepts).

• A predicate grounding relation, which embodies the correspondence between the unary predicates in the symbol system and the attributes in the perceptual system.

The symbol system assigns unary predicates, such as {green}, to symbols which denote objects, such as “cup-22”. The perceptual system continuously generates percepts, such as regions in images, and associates them with measurable attributes of corresponding objects, such as HSV colour values. The associations between symbols and percepts are reified via structures called anchors. Each anchor contains one symbol, one percept, and estimates of one object’s properties. Anchors are time indexed, since their contents can change over time. Anchors are managed using the following three steps.

• Anchor creation can occur in both bottom-up and top-down manners; both are event-based. Bottom-up anchor creation occurs when the perceptual system generates a percept which matches an a priori description of interesting objects, and which does not match any existing anchors. An arbitrary symbol is assigned to such anchors. Top-down anchor creation occurs when the symbol system provides a symbol and corresponding symbolic description it wants to anchor, and this description matches an existing percept but no existing anchors. Top-down anchoring occurs only when a provided symbolic description does not match the a priori description used to trigger bottom-up anchor creation.

• Anchor maintenance involves periodically assigning newly received percepts to appropriate anchors, as well as updating the object property estimates stored in anchors. These updates can include predictions as well as updates based on newly received percepts.

• Anchor deletion occurs when an anchor has not been updated with perceptual information within a certain time limit. The time to deletion can be decreased if an expected observation did not occur.
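Seen as a data structure, an anchor in this formalisation holds one symbol, one percept, and the current estimates of the object’s properties, indexed by time. The fragment below is only a schematic rendering of that description; the class and field names are illustrative choices, not taken from the cited works.

```python
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class Percept:
    """A collection of measurements assumed to originate from the same object."""
    attributes: Dict[str, Any]   # measurable properties, e.g. {"hsv": (0.3, 0.9, 0.8)}

@dataclass
class Anchor:
    """Reified association between a symbol and a percept (schematic only)."""
    symbol: str                  # e.g. "cup-22"
    percept: Percept             # the percept currently associated with the symbol
    estimates: Dict[str, Any]    # current estimates of the object's properties
    timestamp: float             # anchors are time indexed; contents change over time

# Bottom-up creation: a new percept matches the a priori description of interesting
# objects but no existing anchor, so an arbitrary symbol is assigned to it.
anchor = Anchor(symbol="object-7",
                percept=Percept(attributes={"hsv": (0.3, 0.9, 0.8), "position": (3.1, 1.5)}),
                estimates={"colour": "green", "position": (3.1, 1.5)},
                timestamp=12.4)
```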

Anchoring and Concepts

Chella et al [35] also extend the work by Coradeschi and Saffiotti [38]; their work uses conceptual spaces to represent information. Gärdenfors proposed conceptual spaces as a means to bridge the gap between symbolic and sub-symbolic representations [66]. This makes them well-suited for anchoring, which also deals with both symbolic and sub-symbolic information. Conceptual spaces will be discussed in more detail in section 4.2.

In the work by Chella et al, a conceptual space is defined which includes all dimensions of interest for anchoring in the given application (e.g. hue, saturation, value, x-position and y-position). This space is used to represent both predicates and percepts. The work by Chella et al provides three main advantages compared to the previously described framework.

1. Operations performed on anchors and descriptions can be generalised, since information is always represented in the same type of conceptual space. This can avoid some application or configuration specific operations within the anchoring process itself; it can also make it easier to add new predicates and new sensors to the system in a modular fashion.

2. The integration of symbols and percepts is clarified, since both descriptions and perceptual information are represented in the same conceptual space. Symbols are therefore perceptually grounded, and perceptual information from multiple sensors can be conveniently represented and used by cognitive processes. The common representation also simplifies matching and fusion operations; these are fundamental for anchoring, as will be discussed in later chapters.

3. The temporal representation of anchors is clarified, since each anchor can be represented as a trajectory in a conceptual space. This facilitates prediction.

The framework by Coradeschi and Saffiotti [38] has also been extended by Daoutis et al [45] to incorporate high-level conceptual reasoning about perceived objects. This is achieved by combining anchoring with methods from knowledge representation and reasoning (KRR) [114, 103]. A KRR system is used which reasons using ontologies of concepts which describe “common sense” knowledge [111, 112, 97, 96, 95]. This type of knowledge can be used together with abstractions of perceptual information to allow artificial systems to access contextual and conceptual information about perceived objects. This is particularly useful for communicating with humans, as well as other artificial systems


which reason at a symbolic level. Work by Melchert et al [115] examined the use of spatial relations, in particular, to facilitate interaction with a human user. This work has been combined with a full KRR system by Daoutis et al [45], resulting in a system which is able to reason about perceived objects at a conceptual level. The system can also communicate its knowledge about perceived objects and their properties via a natural language interface.

Bonarini et al [23, 24] propose a framework in which anchoring involves the bottom-up identification of perceived objects as instances of known concepts. Instances are similar to the anchors used in the previously described frameworks. Concepts are effectively structured a priori descriptions of objects of interest. As in the conceptual spaces framework proposed by Gärdenfors, concepts are defined as sets of properties which can be specialisations and generalisations of one another. Instances include estimates of object properties based on parent concepts as well as perceptual information. Several instances of the same concept may exist, and each instance is associated with exactly one concept: the most “specific” concept which matches the perceived properties of the object. A distinction is made between substantial properties, which are unchanging and inherent to a concept (e.g. a ball’s shape is round), and accidental properties, which are dynamic properties associated with a particular instance of a concept (e.g. the ball is at a particular position). This distinction is similar to the distinction between matching and action properties proposed by Saffiotti [135]. Only substantial properties are considered when comparing concepts with observed objects.

Bonarini et al consider two main types of uncertainty. First, uncertainty is considered during the transformation of raw sensor data into single-valued features (similar to attributes in the previously described frameworks). This transformation can include low-level filtering as well as compensation for certain types of sensor errors. In later works, features are also associated with a reliability measure [24]. The second type of uncertainty considered is uncertainty in the match between concepts and their instances. A reliability measure is associated with each instance, which represents the degree of matching between the concept and the instance. This measure takes into consideration the number of domains which match as well as how well they match. As in the conceptual spaces framework by Gärdenfors, the proximity of features and properties can be used as a measure of similarity.

The structured way in which Bonarini et al use concepts allows object types to be conveniently identified, and useful properties can be inferred from this classification. However, the approach is limited in how it represents objects of interest. In particular, accidental properties cannot be used to describe objects of interest. Position information, in particular, is usually an accidental property, and it is often one of the most salient object properties. As such, it is often used to describe and recognise objects.
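To make the kind of matching just described more concrete, the sketch below scores how well an observed object fits a concept using only the concept’s substantial properties, and treats the score as the reliability of the resulting instance. The scoring rule and the example properties are invented for illustration and do not reproduce the actual method of Bonarini et al.

```python
def match_concept(concept, observed, tol=0.1):
    """Return a reliability value in [0, 1] for "observed is an instance of concept",
    considering only the concept's substantial properties (illustrative sketch)."""
    scores = []
    for prop, expected in concept["substantial"].items():
        value = observed.get(prop)
        if value is None:
            continue  # property not observed: it neither supports nor contradicts the match
        if isinstance(expected, (int, float)):
            # Numeric property: similarity decreases with the distance to the expected value.
            scores.append(max(0.0, 1.0 - abs(value - expected) / max(abs(expected), tol)))
        else:
            # Symbolic property: exact match or no match.
            scores.append(1.0 if value == expected else 0.0)
    return sum(scores) / len(scores) if scores else 0.0

# Substantial properties describe the concept itself; accidental properties (such as
# position) belong to an instance and are not used for matching.
ball = {"substantial": {"shape": "round", "diameter": 0.07}, "accidental": ["position"]}
observation = {"shape": "round", "diameter": 0.08, "position": (2.0, 1.5)}
reliability = match_concept(ball, observation)  # ~0.95 for these values
```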


Applied Anchoring

Anchoring is also addressed in a number of works which focus on specific applications, as opposed to the previously discussed works, which focus on the anchoring problem itself. Such works often provide few details regarding how anchoring is performed, although the anchoring problem is nonetheless addressed for the given application. For instance, work by Heintz et al [79] lists anchoring as one of the key components in a knowledge processing middleware used to process perceptual information and make it available to cognitive layers. Shapiro and Ismail [143] describe a general robotic architecture in which perceived objects are represented using tuples of values. High-level object descriptions are aligned with perceived objects, via application-specific alignment functions, in order to allow various objects in the environment to be detected and described. Modayil and Kuipers [117] detect clusters of raw sensor values which correspond to objects in the environment; information extracted from these clusters is used to track and categorise perceived objects. These tracked objects can then be associated with symbols used for high-level reasoning. Fritsch et al [64] perform people-tracking using an extended version of the anchoring framework proposed by Coradeschi and Saffiotti [38]. The extended framework combines individually detected parts of humans into composite objects, which can be detected and tracked.

A number of works focus on grounding linguistic terms to perceptual data in order to facilitate human-robot interaction [101, 160, 133]. In these works, symbols used to refer to objects are associated with some form of perceptual representation; this representation often depends on the sensor configuration used.

2.1.2 Cooperative Anchoring

In chapter 1, cooperative anchoring was described as the process of performing anchoring in systems which have descriptions and perceptual information distributed across multiple agents. Few works have attempted to address this problem explicitly.

Bonarini et al [23, 24] briefly describe an extension of their framework which allows instances of concepts to be exchanged between robots. The approach assumes that all robots share the same set of concepts; exchanged instances are then compared and combined using similar methods to those used when associating existing instances with new perceptual information. The resulting instances are then matched against the set of known concepts. These global instances do not replace exchanged local instances; instead, they complement the locally created instances with information received from other robots.

Some single-robot anchoring approaches are applied in multi-robot systems by treating robots as sensors which belong to the same overall system.


In these approaches, perceptual information is collected from distributed sensors, but descriptions of objects of interest, associations with perceptual information, and the anchoring process itself are not distributed. One example of such an approach is that proposed by Daoutis et al [45]; in their approach, anchoring is performed in a robot ecology, where both robots and fixed sensors are used to detect properties of objects of interest. Another example is an approach by Mastrogiovanni et al [108], which performs symbolic data fusion in an ambient intelligence scenario. The approach allows multiple sensors to contribute to the knowledge of the overall system; sensor data is then associated with entities in a centralised knowledge base.

2.1.3 Overcoming the Limitations of Existing Approaches

In the above discussion, four main approaches which explicitly address the anchoring problem have been described: Coradeschi and Saffiotti [38], Chella et al [35], Daoutis et al [45], and Bonarini et al [23]. Although some of these approaches have been extended and refined in other works, the fundamental characteristics of the approaches remain the same. Table 2.1 provides a summary of some of the limitations of these approaches.

This thesis proposes a novel anchoring framework which addresses these limitations. The proposed framework addresses both the single-robot and cooperative anchoring problems, and it allows heterogeneous and uncertain information to be considered. It also provides a flexible strategy for managing descriptions of objects of interest. Earlier versions of the framework have been described by LeBlanc and Saffiotti [92, 93]. The approach was inspired by several of the previously described approaches, and it inherits advantages from a number of these. In particular, the advantages resulting from the use of conceptual spaces for anchoring, as first proposed by Chella et al [35], are exploited and extended in a number of ways. The key differences between the proposed framework and previous approaches are described here.

Uncertainty

The proposed framework addresses uncertainty more comprehensively than previous approaches. In particular, descriptions of objects, information obtained from sensors, as well as estimates of object properties, are all represented as generic regions in conceptual spaces. These regions can be multimodal and complex, as opposed to the points and crisp regions used in previous approaches. This allows various types of uncertainty to be represented and considered both while tracking perceived objects and when comparing descriptions and perceptual information.


Table 2.1: Limitations of the main approaches to anchoring found in the literature. The table is a matrix whose columns are the approaches of Coradeschi and Saffiotti, Chella et al, Daoutis et al, Bonarini et al, and the proposed approach; limitations marked with an X apply to the method in the corresponding column. The limitations considered are the following:

• Uncertainty in perceptual information is not fully represented; only crisp values are used (possibly with an associated reliability measure).
• Uncertainty in matches between descriptions and perceptual information is ignored, or computed using only the number of domains in which entities match.
• Representations of descriptions and estimates of object properties are in sensor, domain, or application specific formats, making it difficult to generalise operations and add new sensors, domains, concepts and descriptions.
• Representations of perceived information consider only numeric sensor data; symbolic and abstracted perceptual information are not considered.
• Descriptions of objects of interest are represented both as a priori sensor calibrations (for bottom-up anchoring) and symbolic predicates (for top-down anchoring), making them difficult to manage and update.
• Descriptions are only symbolic; support for numeric descriptions of objects of interest is not provided.
• Descriptions only contain substantial properties, used to define concepts; support for descriptions of particular object instances is not provided.
• Multiple descriptions cannot match the same observed object.
• Domains in which information can be represented are not treated separately, resulting in increased computational complexity.
• Names are only used to denote objects; names are not used for data association or for associating descriptions and perceptual representations.
• Cooperative anchoring is either not addressed, or only briefly discussed.


Heterogeneity

The proposed framework exploits the richness of conceptual spaces to allow various types of information to be used as descriptions and perceptual information. In particular, both descriptions and perceptual information can be symbolic or numeric. Previous approaches typically assume that descriptions are symbolic, and perceptual information is numeric. These assumptions are particularly limiting for systems in which many heterogeneous devices are used, such as network robot systems [141] and robot ecologies [138, 139].

Descriptions

The proposed framework is more general and flexible than previous approaches when it comes to representing descriptions of objects of interest. This allows a broader range of applications to be addressed. Descriptions can be symbolic or numeric, and they can span multiple domains. They are represented using generic regions in conceptual spaces, thus avoiding the need for sensor-specific representations or calibrations. The framework also allows groups of descriptions to be activated, deactivated, and stored for particular applications. Only bottom-up anchoring is needed: anchors are created when new objects are detected which match active descriptions of objects of interest. Descriptions can also be named, and names can be used to constrain associations between descriptions and perceptual information.

The proposed approach also allows multiple descriptions to be associated with a single anchor, and a single description to be associated with multiple anchors. None of the previous frameworks support the same object matching two descriptions, for instance: "the green cup in the kitchen" and "objects on the table". In the proposed approach, associations are based solely on the characteristics of descriptions and perceived information, without any artificial constraints.

The proposed treatment of descriptions replaces the a priori information used for bottom-up anchoring in Loutfi et al [104]; it also avoids the need for top-down anchoring, since top-down requests can be handled by creating and activating new descriptions. The proposed treatment of descriptions could also be used to represent the concepts used by Bonarini et al [23, 24] – their approach to representing concepts is similar to the conceptual spaces approach in a number of ways. But unlike the work by Bonarini et al, the proposed framework also allows all object properties, including "accidental" properties (such as object positions), to be used to describe objects of interest. Note that although the conceptual spaces approach does lend itself to the representation of concepts and high-level knowledge, this thesis does not focus on high-level reasoning using concepts and object classes. This is left as a potential direction for future work.


Domains

In the proposed approach, dimensions in conceptual spaces are grouped into domains (such as colour, position, and shape). This grouping is included in the conceptual spaces framework [66], but it was not included in the anchoring framework proposed by Chella et al [35]. Considering domains separately reduces the complexity of matching and fusion operations, and it allows domains to be treated independently with respect to information representation and processing. The separation also makes it easier to add, remove, and update the domains considered by the system.

Names

In previous approaches, names are used only to denote objects. In the proposed framework, names are also used in the anchoring process itself. In particular, the framework allows object names to be perceived (e.g. by an RFID reader in a robot ecology). This allows names to constrain data association when applicable. Object descriptions can also be named, which means that names can also constrain matches between descriptions and perceived information about objects of interest.

Cooperative Anchoring

The proposed framework fully addresses both the single-robot and cooperative anchoring problems. It allows all available information about objects to be accessed transparently, regardless of whether the information was produced locally or received from other robots. This is achieved by having local anchors, which contain only information produced locally, and global anchors, which contain information from all sources. Both local and global representations are always available, and global anchors are created even if information about a given object is based solely on locally obtained information. Local anchors can be reliably maintained independently of other robots, making them particularly useful for task execution. Global anchors improve information completeness and robustness, and they can be particularly useful for building shared representations, which are useful for reasoning and coordination [24, 72].

2.2 Related Challenges

The anchoring problem is broad, and it borders on a number of other important problems in robotics. In this section a number of these problems are briefly discussed. The intention of this section is mainly to situate anchoring with respect to neighbouring problems, rather than to provide a full discussion of existing work in these fields.


2.2.1 Symbol Grounding

Anchoring is sometimes described as a subset of the symbol grounding problem [78], which is the problem of associating symbols with their referents. Object descriptions are indeed often associated with symbols; for instance, the symbol “cup-22” might be used to denote an object described by a particular colour, size, and shape. However, unlike symbol grounding, which considers all types of symbols, anchoring only aims to ground symbols which denote physical objects.

2.2.2 Binding

Another problem which is closely related to anchoring is the binding problem [152, 18], which is the problem of gathering various object properties into one coherent entity. Anchoring allows object properties associated with an object’s description as well as properties obtained from perception to be linked. Properties in this sense can span various domains (such as position, colour, and shape), and they can have widely varying characteristics.

2.2.3 Perception Management

Anchoring can be used to facilitate perception management [132], which is an extension of sensor management [2]. Sensor management involves low-level allocation and control of perceptual resources; perception management extends this to include the use of high-level information to guide perception. In particular, perception management can involve the use of active perception [6] to select sensing actions which are expected to maximise information gain. For instance, in anchoring, this might mean that objects about which available information is thought to be unreliable (for instance, objects which have not been observed for some time) should be examined first. Saffiotti and LeBlanc [140] use anchoring to assist in controlling gaze in a dynamic environment with multiple objects of interest. Guirnaldo et al [74] propose a similar approach, in which anchoring is used to assist in the allocation of sensing resources in a multi-robot system. Perceptual actions can also be triggered by errors or ambiguity in the anchoring process itself. When such problems are communicated to an action planner, corresponding sensing actions or recovery procedures can be initiated [30, 88].
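For illustration only, the following sketch (not from the thesis; all structures and names are hypothetical) selects the next object to examine by preferring the anchor whose information is presumed least reliable, approximated here by the time since it was last observed.

```python
import time

def select_next_gaze_target(anchors, now=None):
    """Pick the anchor to examine next: the one observed least recently.

    `anchors` is assumed to be a list of dicts with a 'last_observed'
    timestamp in seconds; this structure is purely illustrative.
    """
    now = time.time() if now is None else now
    if not anchors:
        return None
    # Larger "staleness" means the estimate is presumed less reliable.
    return max(anchors, key=lambda a: now - a["last_observed"])

if __name__ == "__main__":
    anchors = [
        {"id": "cup-22", "last_observed": 100.0},
        {"id": "box-7", "last_observed": 42.0},
    ]
    print(select_next_gaze_target(anchors, now=120.0)["id"])  # box-7
```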

2.2.4 Tracking

In order to ensure that associations between object descriptions and perceptual representations are up to date, new perceptual information must be taken into account; this is accomplished by tracking [8, 10] objects over time.


Tracking uses state estimation [109, 67, 73, 11] techniques to maintain persistent estimates of object states. An estimate of an object's state in this context is normally referred to as a track.

Tracking is often addressed in two steps. In the first step, estimates are predicted to account for changes which may have occurred since the object was last observed. Predictions are based on assumptions about how object properties can change over time, as well as expected changes based on previous observations; for instance, in the position domain, estimated velocity is often used in the prediction step. In the second step, estimates are updated to account for newly received perceptual information. The update step is usually further divided into two important and well-studied sub-problems. The first is data association, which ensures that arriving information is used to update the correct track; the second is information fusion, which is used to update existing tracks with new information. Data association and information fusion are extremely broad and well-studied problems, and they are addressed in a wide range of applications; they will be discussed in more detail in sections 2.2.5 and 2.2.6.

Most tracking applications are mainly concerned with object positions. In particular, many approaches focus on estimating the positions of multiple fast-moving objects, using one or more sensors. Application-specific assumptions are often exploited to simplify the problem for a particular type of sensor or situation; for instance, the number or positions of sensors may be fixed, or the number of targets may be known in advance. Also, point estimates are often used to represent observations, and state estimates are often uni-modal. Anchoring requires a more general approach to tracking, able to estimate properties of an unknown number of objects in multiple domains, using information obtained from heterogeneous and mobile sensors. Observations and state estimates often have varying granularities and different representations, and different types and amounts of uncertainty need to be considered.
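The two-step structure just described can be summarised in a small skeleton. The sketch below is a generic illustration, not the implementation used in this thesis; `predict`, `associate`, `fuse` and `create` stand for whatever models and measures a particular system supplies.

```python
def track_frame(tracks, measurements, dt, predict, associate, fuse, create):
    """One tracking frame: predict existing tracks, then update with new data.

    All of `predict`, `associate`, `fuse` and `create` are caller-supplied
    functions; this loop only fixes the order of operations described above.
    """
    # Step 1: prediction -- propagate each track forward by dt.
    predicted = [predict(t, dt) for t in tracks]

    # Step 2a: data association -- map measurement index -> track index (or None).
    assignment = associate(predicted, measurements)

    # Step 2b: information fusion -- update matched tracks, create new ones.
    updated = list(predicted)
    for m_idx, t_idx in assignment.items():
        if t_idx is None:
            updated.append(create(measurements[m_idx]))
        else:
            updated[t_idx] = fuse(updated[t_idx], measurements[m_idx])
    return updated
```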

2.2.5 Data Association

Data association is the process of determining which measurements originate from which objects [12, 129, 60, 8]. It is typically performed in order to ensure that estimates of object properties are updated using the correct measurements. In most approaches it is assumed that each measurement originates from a single object, and that measurements obtained during a single frame (also called a scan, or a sampling interval) correspond to unique objects [17, 121]; however, some approaches relax or avoid these assumptions [68].

The vast majority of data association approaches consider only position domain information, and the physical distance between observations and tracks is the primary input to most data association algorithms. There are, however, a number of approaches which consider other domains.


For instance, Megherbi et al [113] consider the position, colour, and audio frequency domains; Rasmussen and Hager [128] consider the position and texture domains; and Pérez et al consider the position and colour domains.

Many approaches begin by setting a gate or threshold for each track; measurements farther from a track than this are assumed not to have originated from the corresponding object [7, 17]. Measurements outside the gates of all tracks are used to initiate new tracks. When objects are far apart, the gating approach can be sufficient. In general, however, multiple measurements might be within the gate of a single track, and a single measurement might be within the gates of multiple tracks. In such cases more sophisticated methods are needed.

One simple technique is to use a nearest neighbour (NN) algorithm [44, 49], which sequentially associates each track with the closest measurement within its gate. Although fast to compute, the NN approach is greedy, and unless objects are far apart it often results in sub-optimal assignments from a global perspective. The global nearest neighbour (GNN) algorithm [7, 51] is a more common approach, which associates tracks and measurements (assuming gating criteria are satisfied) in such a way as to minimise the overall distance between all associated tracks and measurements. Most approaches are single-scan (or zero-backscan) approaches, which consider each frame in isolation. However, some multi-scan (or n-backscan) approaches exist, which consider multiple frames, for instance using a sliding time window [131, 9]. The GNN approach computes globally optimal assignments assuming error-free inputs; however, many GNN approaches lack robustness with respect to false positives and measurement errors. Also, the computational cost of GNN approaches grows quickly as the number of sensors, targets, or considered frames is increased. For this reason, many approaches compute approximations of the optimal solution, which are faster to obtain [46, 47, 126].

Another commonly used approach is the joint probabilistic data association filter (JPDAF) [60], which updates each track with a probabilistically weighted sum of all measurements within its gate. This approach can yield improved robustness with respect to false positives and measurement errors. However, the computational requirements of the approach can be high since all possible associations must be considered; also, in the presence of closely-spaced targets, tracks tend to converge; various improvements to the JPDAF algorithm aim to alleviate this "track coalescence" problem [58, 22].

One of the most popular approaches to data association is multiple hypothesis tracking (MHT) [129, 17]. In MHT, multiple hypotheses about object states are maintained and updated in parallel. When a particular hypothesis becomes inconsistent with new information, it is pruned. The approach effectively allows data association decisions to be deferred until more information is available. The complexity of MHT approaches can be problematic, so careful hypothesis pruning is required.

Most of the discussed approaches use probabilities to assign measurements to tracks; however, some other approaches are used.


For instance, Megherbi et al [113] use an evidence theory approach to data association, which combines evidence from a number of sensing domains; Stubberud and Kramer [147] use fuzzy logic to enforce various constraints on data association decisions; and Zaveri et al [163] use a neural network approach to compute observation-to-track assignment weights.

In this thesis, a fuzzy-logic approach to data association is proposed which is essentially a single-scan best-first GNN approach, and which allows an arbitrary number of domains to be considered. Moreover, the implementation allows multi-modal observations and state representations to be used, and various types and amounts of uncertainty can be taken into consideration. The proposed approach is described in detail in section 6.3.
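For illustration, the sketch below implements a plain gated, greedy best-first assignment of the kind discussed above. It is a generic example rather than the fuzzy-logic method proposed in this thesis, and the Euclidean distance stands in for whatever (possibly multi-domain) matching measure a system provides.

```python
import math

def gated_best_first_assignment(tracks, measurements, gate):
    """Greedy best-first association between tracks and measurements.

    `tracks` and `measurements` are lists of (x, y) points (illustrative only);
    pairs farther apart than `gate` are never associated. Returns a dict
    mapping measurement index -> track index, or None ("start a new track").
    """
    # Collect all candidate pairs within the gate, best (closest) first.
    pairs = []
    for ti, t in enumerate(tracks):
        for mi, m in enumerate(measurements):
            d = math.dist(t, m)
            if d <= gate:
                pairs.append((d, ti, mi))
    pairs.sort()

    assignment = {mi: None for mi in range(len(measurements))}
    used_tracks, used_meas = set(), set()
    for d, ti, mi in pairs:
        if ti in used_tracks or mi in used_meas:
            continue  # each track and each measurement is used at most once
        assignment[mi] = ti
        used_tracks.add(ti)
        used_meas.add(mi)
    return assignment

if __name__ == "__main__":
    tracks = [(0.0, 0.0), (5.0, 5.0)]
    measurements = [(4.6, 5.2), (0.3, -0.1), (9.0, 9.0)]
    print(gated_best_first_assignment(tracks, measurements, gate=1.0))
    # {0: 1, 1: 0, 2: None} -> measurement 2 would start a new track
```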

2.2.6 Information Fusion

Information fusion is an extremely broad topic; it can encompass, among other things, the fusion of sensor data, the combination of heterogeneous information at various levels of abstraction, the merging of databases, and the pooling of expert opinions [20]. It is beyond the scope of this thesis to provide an in-depth overview of the field. This section will simply summarise the problem, and discuss how information fusion is normally used to combine information about objects in robotic systems.

Although a universally accepted definition of information fusion may be elusive [153, 27], the general idea is relatively straightforward. Roughly speaking, the information fusion problem is the problem of combining information arriving from different sources or at different times in order to improve the quality of available information; the term "quality" in this context depends on the application. In robotic applications, information quality often pertains to its accuracy, completeness, and reliability. In general, redundant information can improve accuracy and reliability, while complementary information can improve completeness [74].

The JDL data fusion process model [158, 145, 100] has been proposed as a general approach to information fusion. The model includes a number of layers, which treat information at different levels of abstraction. The work in this thesis is mainly concerned with fusion at the lowest levels, 0 and 1. Level 0 fusion includes the processing and filtering of raw data, as well as the extraction of features from signals. In particular, level 0 fusion usually addresses alignment [76], which ensures that the information to be fused is comparable. Alignment can involve temporal and spatial components. In robotic applications, temporal alignment is usually achieved using prediction, as discussed earlier. Spatial alignment often involves coordinate transformations which ensure that the information being fused is represented using comparable units and coordinate systems. Level 1 information fusion involves estimating the properties of individual entities; the anchoring process itself deals mainly with information at this level.


Higher levels deal with determining relationships between entities (level 2), assessing the impact of available information (level 3), and refining the fusion process (level 4).

Two of the most important aspects of information fusion approaches are how information is represented, and how uncertainty is handled. In robotics, uncertainty can arise from a number of sources, including, but not limited to: environment dynamics, sensor limitations, and approximations in models and computations.

Object Localisation

An important instance of information fusion in robotics in general (and anchoring in particular) is the fusion of information about object positions. The object localisation problem involves both the initial creation and iterative maintenance of object position estimates. Most approaches to the problem, both in single-robot and multi-robot systems, rely on a relatively small number of underlying information fusion techniques.

One of the main advantages of using multi-robot systems for object localisation stems from the fact that different sensors have different characteristics and viewpoints. This allows the overall system's field of view to be virtually extended, reducing the impact of sensor range and accuracy limitations. This advantage comes at a cost, however, since sensors on different robots usually have unknown relative positions; this means that the spatial alignment of position information is non-trivial. However, once object position estimates have been situated in a common reference frame, single-robot and multi-robot object localisation problems can often be addressed using similar methods.

In multi-robot systems, spatial alignment is often achieved with the help of an agreed-upon global coordinate system, although some approaches avoid the need for such a coordinate system [70, 71]. Estimating object positions in a global coordinate system requires that a robot know its own pose in this frame; in other words, self-localisation needs to be addressed. There are many approaches to self-localisation [31, 82, 81, 48, 61, 151], and it is often addressed together with the mapping problem (a survey of this field is presented by Thrun [149]). The self-localisation problem can also be addressed using many of the same underlying fusion techniques as the single-robot and multi-robot object localisation problems. In this thesis, self-localisation is addressed using two different approaches. The first is a fuzzy-logic based approach, which relies on known landmarks in the environment [31, 82]. The other is a probabilistic scan-matching method called Adaptive Monte-Carlo Localisation [151]. Both approaches will be discussed in section 6.1.


Many of the most common approaches to fusing information about object positions are based on Kalman filters [87, 67] and linearised Kalman filters such as the Extended Kalman Filter (EKF) [84, 109] and the Unscented Kalman Filter (UKF) [86]. The Kalman filter algorithm can be seen as a continuous-space implementation of the probabilistic Bayes filter algorithm [62, 150], in which information is represented using Gaussian distributions. In general such methods are accurate, easy to implement, and computationally efficient. They are used both to fuse information arriving at different times, and to combine information arriving from different sources [146, 77, 57, 99, 54, 83]. Despite their widespread use, Kalman filters have a number of limitations. For one thing, they are uni-modal. This limitation can be somewhat offset by using multiple hypothesis tracking (MHT) [129, 142], as discussed earlier, or interacting multiple models (IMM) techniques [110]. Another limitation of methods which use Kalman filters is that since information is combined using what amounts to a weighted average, fused results can be significantly degraded in the presence of false positives and outliers. Various forms of gating [123, 56, 53] can help reduce this effect, although deciding what constitutes an outlier can be difficult.

Markov localisation [61, 157] is another probabilistic method often used in localisation tasks. The algorithm can be viewed as a discrete-space implementation of the Bayes filter algorithm [150]. The idea is to maintain a (possibly multi-modal) discrete probability distribution over the state space; this is normally either grid-based or sample-based. New information increases the probability that an object is in a given region of the distribution, and decreases the probability that it is anywhere else. The resulting distribution will typically have a higher probability in positions which are consistent with the majority of the fused information. In general, Markov localisation is more computationally demanding than Kalman filter based methods, but it has been shown to be extremely robust. There is a trade-off between the accuracy and computational requirements of Markov localisation; sample-based approaches, in particular, allow a smooth transition along the axis of this trade-off. A hybrid method which combines Kalman filters with Markov localisation, called Markov-Kalman object localisation, has been shown to be very effective [51, 75].

Fuzzy logic is another popular tool for addressing information fusion problems [32, 19, 16], although its use in robotics is not as widespread as the probabilistic methods just described. This thesis includes a fuzzy logic approach to multi-robot object localisation [32, 94] as part of the overall anchoring framework. The proposed object localisation method uses fuzzy logic to compute a consensus between sources, so it is not susceptible to the problems which can arise from averaging inconsistent information. One of the key features of the approach is that it fully considers self-localisation uncertainty when computing object position estimates. Most works explicitly assume perfect self-localisation, which is not normally achievable in robotic systems. In particular, even small errors in orientation can cause large errors in object position estimates. Only one other approach was found which addresses self-localisation uncertainty explicitly when performing multi-robot object localisation [122]; in this sample-based approach, a small number of possible object positions are computed taking self-localisation uncertainty into account, and these are exchanged between robots. In a few other approaches, self-localisation uncertainty is used to weight estimates of target object positions. For instance, see the arithmetic mean method examined by Ferrein et al [57]; a similar idea is described, but not implemented, by Stroupe et al [146].
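To make the weighted-average behaviour of Kalman-filter-style fusion mentioned above concrete, the following sketch fuses two independent scalar Gaussian estimates of the same quantity using the standard inverse-variance weighting; it is a textbook illustration, not the localisation method used in this thesis.

```python
def fuse_gaussian(mean_a, var_a, mean_b, var_b):
    """Fuse two independent scalar Gaussian estimates of the same quantity.

    The fused mean is an inverse-variance weighted average, so an outlier
    reported with a small (over-confident) variance can pull the result
    badly -- the limitation noted in the text.
    """
    w_a = var_b / (var_a + var_b)
    w_b = var_a / (var_a + var_b)
    mean = w_a * mean_a + w_b * mean_b
    var = (var_a * var_b) / (var_a + var_b)
    return mean, var

if __name__ == "__main__":
    # Two robots report the same object's x-coordinate with different confidence.
    print(fuse_gaussian(2.0, 0.5, 3.0, 2.0))  # -> (2.2, 0.4)
```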


Fusion of Heterogeneous Information

In addition to requiring the fusion of information about object positions, anchoring also involves fusing various other types of information, as discussed in section 1.4. In particular, both symbolic and numeric information need to be considered. Fusion of heterogeneous information has typically been addressed in research areas such as database systems [34, 119, 50], web services [98, 127], and decision support systems [80]. There are, however, a number of trends which are making the treatment of heterogeneous information an increasingly important consideration for robotic systems.

One such trend is the increasing interest in network robot systems, ubiquitous robotic systems, ambient intelligence, robot ecologies, and other similar research areas [141, 108, 139]. In such systems, the broad range of available sensors and participating devices means that different domains and representations need to be explored. For instance, Mastrogiovanni et al [108] examine the fusion of symbolic information in a distributed ambient intelligence system using an approach based on description logic [5].

Another trend involves the development of intelligent systems which rely on ontologies and knowledge bases in order to develop "common sense" [95, 45]. In such systems, concepts and contextual information can be extremely heterogeneous, and this information needs to be represented and used in various ways. Work by Daoutis et al [45] combines information from various devices, and makes the results available to humans via a high-level knowledge base, which can communicate information about perceived objects using a natural language interface.

A third trend is the growing field of human-robot interaction [28, 59, 118]. In order to facilitate communication with humans, representations must exist which allow humans to understand robots, and robots to understand humans. In particular, humans often communicate using symbolic information, which must somehow be grounded in representations which can be understood by robots. Also, perceived information usually needs to be abstracted to allow humans to understand it. One example of this is the grounding of linguistic terms for spatial relations [115] to allow a human and a robot to communicate about relative positions of objects of interest.

These and other trends are making it important for robotic systems to be able to manage and fuse heterogeneous information. In the anchoring framework proposed in this thesis, information heterogeneity is addressed through the use of conceptual spaces [66], which allow various types of information to be represented; in particular, conceptual spaces provide a common representation for both symbolic and numeric information. This common space facilitates the comparison and fusion of heterogeneous information.


Conceptual spaces will be discussed in more detail in section 4.2. It should be noted that the same fuzzy-logic based approach to information fusion is used for position domain information (i.e. in the object localisation method mentioned previously) and for information in other domains.
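As a schematic contrast with the weighted-average update shown earlier, the sketch below fuses fuzzy position estimates defined over a common grid by intersection (pointwise minimum) and then picks the best-supported cell. This only illustrates the general idea of consensus-style fusion; it is not the method described in chapter 6.

```python
def fuzzy_consensus(grids):
    """Intersect fuzzy membership grids (pointwise min) over the same cells."""
    fused = grids[0][:]
    for g in grids[1:]:
        fused = [min(a, b) for a, b in zip(fused, g)]
    return fused

if __name__ == "__main__":
    # Membership of "the object is in cell i" as reported by two sources.
    source_a = [0.0, 0.2, 0.9, 1.0, 0.3]
    source_b = [0.1, 0.6, 1.0, 0.7, 0.0]
    fused = fuzzy_consensus([source_a, source_b])
    print(fused)                    # [0.0, 0.2, 0.9, 0.7, 0.0]
    print(fused.index(max(fused)))  # cell 2 is the consensus position
```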

2.3 Discussion

This chapter has discussed previous work on anchoring and summarised the main advantages of the proposed framework with respect to existing approaches. In particular, the proposed framework addresses both single-robot and cooperative anchoring, and it considers various types of information and uncertainty. It also allows descriptions of objects of interest to be specified in a flexible manner.

The chapter also briefly described a number of problems related to anchoring. The anchoring problem is broad, and borders on a number of important problems in robotics. The purpose of the presented discussion was to delineate anchoring, and situate it with respect to some neighbouring problems.

Chapter 3

Problem Formalisation

This chapter provides a mathematical formalisation of the single-robot and cooperative anchoring problems. This formalisation is based on a number of ingredients, presented in section 3.1; these ingredients describe the systems considered in this work. Definitions are given in section 3.2, and an illustration is presented in section 3.3.

3.1 Ingredients

The formalisation of the single-robot and cooperative anchoring problems used in this work is based on the following ingredients. Throughout this work, entities denoted using superscripts are per-robot entities, where the superscript indicates the associated robot. Other entities are common to the entire system. Certain ingredients are described as time varying; time indices are omitted to improve readability whenever possible.

• There is a set of N > 0 objects, denoted O = {o_1, ..., o_N}, where the value of N is unknown. The set of objects O can change over time, since objects can be inserted into and removed from the environment. In this work, each object o_n is a physical object, which can be perceived using sensors.

• There is a set of M > 0 robots, denoted R = {r_1, ..., r_M}. It is not assumed that robots know how many other robots there are, and the set of robots can change over time as robots enter and leave the environment. If M = 1, or if no form of communication is available (e.g. peer-to-peer, multicast, or broadcast messages), then the problem transparently reduces to M independent single-robot anchoring problems. In this work, a robot r_m can be any physical system with sensors or actuators. For example, fixed camera systems and RFID readers are considered to be robots, as are traditional autonomous robots.


• Each robot r_m has a set of K_m > 0 information sources, denoted S^m = {s^m_1, ..., s^m_{K_m}}. The set of all information sources across all robots is denoted S. An information source s^m_k can be anything which produces perceptual information about objects. This information may consist of sensor data or abstractions of sensor data. Information sources can be seen as virtual sensors, which can produce both numeric and symbolic information. An information source typically consists of a sensor and some related processing, e.g. a vision system.

• Each information source s^m_k produces, at any time t, a set of J^m_k > 0 percepts, denoted Z^m_k = {z^m_k[1], ..., z^m_k[J^m_k]}. A percept z^m_k[j] contains perceptual information about a single object. Each percept can contain information in one or more of the domains in the set X^m, which contains all of the domains in which robot r_m is interested (e.g. position, shape, and texture). Percepts may contain numeric sensor data or abstractions of sensor data; in particular, percepts can contain both symbolic and numeric information. The set of all percepts produced by a given robot's information sources at time t is denoted Y^m = ∪_{k=1}^{K_m} Z^m_k. The set of all percepts produced by information sources on all robots at time t is denoted Y = ∪_{m=1}^{M} Y^m. The set of all domains in which any robot is interested is denoted X = ∪_{m=1}^{M} X^m. Note that many information sources produce information in the same domains. Percepts can also be either named or unnamed; names will be discussed in more detail in chapter 4.

• Each robot r_m maintains a set of L_m > 0 local anchors, denoted Ψ^m = {ψ^m_1, ..., ψ^m_{L_m}}. The set of all local anchors from all robots is denoted Ψ = ∪_{m=1}^{M} Ψ^m. The set Ψ^m and its contents are time-varying. Local anchors are data structures containing estimates of object properties, computed solely based on information obtained from robot r_m's own information sources. A local anchor ψ^m_l reflects the estimated properties of one particular object which has been perceived by robot r_m. There is one local anchor ψ^m_l for each object of which robot r_m is aware. Each local anchor contains information in one or more of the domains in the set X^m. Like percepts, local anchors can be either named or unnamed.

• Each robot r_m maintains a set of G_m > 0 global anchors, denoted Ω^m = {ω^m_1, ..., ω^m_{G_m}}. The set Ω^m and its contents are time-varying. Global anchors are data structures containing estimates of object properties computed using information obtained from information sources on all robots. A global anchor ω^m_g reflects the estimated properties of an object which has been perceived by at least one robot. If all information is successfully exchanged between robots, each robot has one global anchor for each object of which any robot is aware. Each global anchor contains information in one or more of the domains in the set X. Like percepts and local anchors, global anchors can also be either named or unnamed.


• Each robot r_m maintains a set of I_m descriptions of objects of interest, denoted D^m = {d^m_1, ..., d^m_{I_m}}; this set can be updated at any time. A given description d^m_i may be intended to refer to one or more objects. Objects in which the system is interested are described using positive descriptions; those in which the system is not interested are described using negative descriptions. Descriptions which refer to one specific object are definite (e.g. the green cup on the table, or Kevin's green cup); those which describe general properties of interest are indefinite (e.g. a green cup). Descriptions can also be named or unnamed. Each description includes information in one or more of the domains X, and multiple descriptions can contain information in the same domains.
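Purely as an illustration, the ingredients above could be rendered as data structures along the following lines; the field names and layout are hypothetical and are not prescribed by the formalisation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# A region in one domain (e.g. colour, position); kept abstract in this sketch.
Region = object

@dataclass
class Percept:                     # z^m_k[j]: one observation of one object
    source: str                    # which information source produced it
    domains: Dict[str, Region]     # e.g. {"position": ..., "colour": ...}
    name: Optional[str] = None     # e.g. read from an RFID tag

@dataclass
class Anchor:                      # a local anchor psi^m_l or global anchor omega^m_g
    domains: Dict[str, Region] = field(default_factory=dict)
    name: Optional[str] = None
    last_update: float = 0.0

@dataclass
class Description:                 # d^m_i: describes objects of (dis)interest
    domains: Dict[str, Region]
    positive: bool = True          # negative descriptions mark uninteresting objects
    definite: bool = False         # refers to one specific object
    name: Optional[str] = None
```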

3.1.1 Information Sources

Heterogeneity

Each information source produces percepts which can span multiple domains, and contain information of various types – recall table 1.1. For instance, a vision system might produce numeric information about object positions, colours, and shapes, computed based on captured images. Alternatively, symbolic information might be extracted instead; for example, colour or shape names might be produced, instead of numeric values. Similarly, a laser range finder might produce numeric information about object positions; or, the same sensor might be used to produce symbolic position domain information, such as {object-near}. An RFID reader can produce numeric or symbolic information, read from nearby RFID tags.

Processing

In order to produce percepts, information sources typically perform some processing on raw sensor data. For example, vision systems might use various image processing techniques to detect interesting regions in images, and to extract properties (such as object position and colour estimates) corresponding to these regions. Similarly, raw laser scans are normally segmented in order to produce object position estimates. Low-level filtering can also be performed, to remove outliers and increase the consistency of produced information. This can be particularly useful for noisy sensor data. Since data characteristics are highly source-dependent, it is useful to perform such filtering within each information source. Information sources can also be calibrated to ignore objects which do not meet certain criteria. For instance, an information source might be configured to detect only red objects, or objects within a certain range. This type of interest filtering will be discussed in more detail in section 4.5.


Time

At any time t an information source produces a (possibly empty) set of percepts; these can be produced synchronously or asynchronously. For instance, a vision system might produce information from images at 10 Hz, and a laser scanner might produce information from laser scans at 50 Hz. Alternatively, an information source might only produce information when objects which meet certain criteria are detected, or when a significant change has been detected. There is a trade-off to consider here: constantly sending similar information can increase reliability; however, this can also result in increased computational load.

In this work, time is treated as a sequence of discrete intervals called frames. Frames are the same length for all information sources within the same robot, and they are typically very short. Frame duration is not necessarily the same for all robots, and frames are not necessarily synchronised across robots.

Assumptions

It is assumed that each percept refers to a single object. It is further assumed that two percepts produced by the same information source during the same frame cannot refer to the same object. These assumptions are normally made in data association applications [17], although in general they may not always hold.
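As a toy example of the calibration and interest filtering mentioned earlier, an information source might discard percepts that do not meet configured criteria before emitting them; the specific criteria below (a hue range and a maximum distance) are invented for illustration.

```python
def interest_filter(percepts, hue_range=(0.95, 0.05), max_range=3.0):
    """Discard percepts that do not meet the configured criteria.

    Here: keep only roughly "red" objects (hue near 0, wrapping around 1.0)
    that lie within `max_range` metres of the sensor. Purely illustrative.
    """
    lo, hi = hue_range
    kept = []
    for z in percepts:
        red = z["hue"] >= lo or z["hue"] <= hi   # hue wraps around 1.0
        near = z["distance"] <= max_range
        if red and near:
            kept.append(z)
    return kept

if __name__ == "__main__":
    raw = [{"hue": 0.98, "distance": 1.2},   # red and near -> kept
           {"hue": 0.33, "distance": 1.0},   # green -> dropped
           {"hue": 0.99, "distance": 5.0}]   # red but too far -> dropped
    print(interest_filter(raw))
```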

3.1.2 Anchors

An anchor is a data structure used to store estimates of one particular object’s properties, taking previous perceptual information into account. Anchors correspond to the internal representations of objects used by any system which observes objects. Such internal representations are needed since object properties at time t can depend on any percept produced from time 0 to time t, and it is not feasible to store all of these, or process them when estimates of object properties need to be computed. Instead, it is assumed that an anchor at time t takes into account all relevant percepts produced from time 0 to time t.

3.1.3 Descriptions

The anchoring problem involves associating positive descriptions and perceptual information referring to the same physical objects. Both positive and negative descriptions can also be used to determine whether arriving information is interesting, or should be discarded. Recall that information sources can also be used to discard uninteresting information. Interest filtering will be discussed in more detail in section 4.5.


3.2 Problem Definition

Given the above ingredients, the single-robot anchoring and cooperative anchoring problems can be defined as follows.

Definition 3.2.1. The single-robot anchoring problem is the problem of creating and maintaining associations between positive descriptions and local anchors.

Definition 3.2.2. The cooperative anchoring problem is the problem of creating and maintaining associations between positive descriptions and global anchors.

In other words, addressing the anchoring problem means determining which object descriptions match which perceived objects. Note that multiple descriptions may match a single anchor; for instance, "green objects" and "objects on the table" might both match the green cup on the table. Also, multiple anchors may match a single description; for instance, two boxes might match the description "square objects in the living room".

To ensure that associations between descriptions and anchors are meaningful, anchors must be maintained, so that they reflect the latest information available. This maintenance involves a number of aspects. Most importantly, anchors should be updated in order to account for new perceptual information. In order to accomplish this, two important sub-problems must be addressed: data association and information fusion. Also, changes in object properties which can occur between observations must be taken into consideration through prediction.

3.2.1 Data Association

The data association problem which needs to be addressed in order to enable anchors to be updated with relevant information involves determining which percepts should be used to update which anchors, and which percepts should be used to create new anchors.

Formally, for single-robot anchoring, this means that in order to update local anchors Ψ^m(t − 1) to account for percepts Y^m(t) produced locally during frame t, robot r_m needs to determine which elements in the sets Ψ^m(t − 1) and Y^m(t) refer to the same objects. This determination is typically based on some measure of how well the various elements match. This can be seen as the problem of finding a partitioning of the set

    α^m(t) = Ψ^m(t − 1) ∪ Y^m(t)    (3.1)

such that each partition α^m_l(t) contains all and only elements which refer to the same object o_l. Each such partition may contain at most one local anchor, since anchors are assumed to refer to unique objects. New local anchors should be created and added to each partition which does not already contain one.


Also, each partition may contain at most one percept from a given information source, since it is assumed that any two percepts produced by the same information source during the same frame refer to different objects.

Similarly, for cooperative anchoring, in order to update global anchors in the set Ω^m(t − 1) to account for percepts Y(t) produced by all robots during frame t, robot r_m needs to determine which elements in the sets Ω^m(t − 1) and Y(t) refer to the same objects. This can be seen as the problem of finding a partitioning of the set

    β^m(t) = Ω^m(t − 1) ∪ Y(t)    (3.2)

such that each partition β^m_g(t) contains all and only elements which refer to the same object o_g. In this case each partition may contain at most one global anchor and one percept from each information source. New global anchors should be created and added to partitions which do not already contain one.
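The sketch below shows one hypothetical, greedy way of building such a partitioning for the local case while respecting the stated constraints (at most one local anchor per partition, and at most one percept per information source); the pairwise matching test is left abstract, and this is not the data association algorithm of chapter 6.

```python
def partition_local(anchors, percepts, matches):
    """Greedy partitioning of local anchors and new percepts (illustrative only).

    Each partition is a list of entities believed to refer to one object.
    `anchors` and `percepts` are dicts; percepts carry a "source" key, anchors
    do not. `matches(a, b)` is a caller-supplied boolean similarity test.
    """
    partitions = [[a] for a in anchors]        # at most one anchor per partition
    for z in percepts:
        placed = False
        for part in partitions:
            # At most one percept per information source in each partition.
            if any(e.get("source") == z["source"] for e in part):
                continue
            if matches(part[0], z):            # compare against the partition seed
                part.append(z)
                placed = True
                break
        if not placed:
            partitions.append([z])             # a new local anchor will be created
    return partitions

if __name__ == "__main__":
    near = lambda a, b: abs(a["x"] - b["x"]) < 1.0
    anchors = [{"x": 0.0}, {"x": 5.0}]
    percepts = [{"x": 0.3, "source": "cam"}, {"x": 5.2, "source": "cam"},
                {"x": 9.0, "source": "laser"}]
    print([len(p) for p in partition_local(anchors, percepts, near)])  # [2, 2, 1]
```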

3.2.2 Information Fusion

Once decisions regarding data association have been taken, entities which have been determined to refer to the same objects should be combined in order to obtain more accurate estimates of object properties; this is an instance of the information fusion problem.

Formally, for single-robot anchoring, the elements in each partition α^m_l(t) of set α^m(t) should be fused, in order to exploit information contained in previous and current percepts. The result of the fusion replaces the estimates contained in local anchors Ψ^m(t − 1), yielding new local anchors Ψ^m(t). Similarly, for cooperative anchoring, the elements in each partition β^m_g(t) of set β^m(t) can be fused. The result of the fusion replaces the estimates contained in global anchors Ω^m(t − 1), yielding new global anchors Ω^m(t).

3.2.3 Prediction

In order to account for changes which occur between observations, estimates of object properties at time t must be propagated to subsequent points in time. This allows persistent representations of object properties to be available. Such representations are useful for control and reasoning tasks, and they enable meaningful comparisons between percepts arriving at different times.

Predictions are based on knowledge of how object properties can change over time. Some domains, such as colour and shape, are usually static. Prediction in these domains is often either not used, or extremely simple. Other domains, such as position, can be highly dynamic; in such cases sophisticated prediction models may be applied.

Formally, local anchors Ψ^m(t − 1) should be predicted in order for them to be comparable with percepts Y^m(t). Similarly, global anchors Ω^m(t − 1) should be predicted in order for them to be comparable with percepts Y(t).
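For the position domain, a common concrete choice is a constant-velocity prediction whose uncertainty grows with the elapsed time; the sketch below is a generic example of this idea, not the prediction model used in this work.

```python
def predict_position(pos, vel, var, dt, process_noise=0.05):
    """Propagate a 2-D position estimate forward by dt seconds.

    `pos` and `vel` are (x, y) tuples, `var` is the position variance.
    The variance is inflated so that older estimates count for less.
    """
    new_pos = (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)
    new_var = var + process_noise * dt
    return new_pos, new_var

if __name__ == "__main__":
    print(predict_position((1.0, 2.0), (0.5, 0.0), var=0.1, dt=2.0))
    # ((2.0, 2.0), 0.2)
```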


Figure 3.1: Illustration of some of the main ingredients used in the formalisation of the single-robot and cooperative anchoring problems. The figure is based on figure 1.1.

3.3 Illustration

In figure 3.1 the illustration from section 1.2 is presented using the ingredients described in this chapter. Recall that the scenario involves a robot Astrid being asked to fetch "parcel-21" from the entrance. Astrid has a description of the parcel, represented as d^1_1. Perceptual information from a number of information sources is available:

• Astrid's (r_1) vision system (s^1_1);
• Reader-01's (r_2) RFID reader (s^2_1);
• PeopleBoy's (r_3) vision system (s^3_1); and
• Camera-01's (r_4) vision system (s^4_1).

It is assumed that the percepts shown in the figure were all produced at time t, so the set of percepts produced by Astrid's only information source at time t is Y^1(t) = {z^1_1[1], z^1_1[2]}; the percepts produced by all information sources across all robots at time t are described by the set Y(t) = {z^1_1[1], z^1_1[2], z^2_1[1], z^3_1[1], z^3_1[2], z^4_1[1], z^4_1[2]}.


It is further assumed that Astrid had previously created local anchors not shown in the figure, contained in the set Ψ^m(t − 1) = {ψ^1_1(t − 1), ψ^1_2(t − 1)}; Astrid is also assumed to have previously created two global anchors, also not shown, denoted Ω^m(t − 1) = {ω^1_1(t − 1), ω^1_2(t − 1)}.

Given the above, the single-robot anchoring problem requires that local anchors Ψ^m(t − 1) be predicted, allowing them to be compared to the percepts in set Y^1(t). Data association is then performed, to determine which elements in the set α^1(t) = {z^1_1[1], z^1_1[2], ψ^1_1(t − 1), ψ^1_2(t − 1)} refer to the same objects; this step may result in the creation of new local anchors. Recall that local anchors, as well as percepts produced by the same information source during a single frame, are assumed to refer to separate objects. Associated entities are then fused to obtain updated estimates of object properties; these estimates are stored in updated local anchors Ψ^m(t). Finally, associations between the only positive description d^1_1 and local anchors Ψ^m(t) can be updated.

Similarly, the cooperative anchoring problem requires that previously existing global anchors Ω^m(t − 1) be predicted, in order for them to be compared to the percepts in set Y(t). Data association is then used to determine which elements in the set β^1(t) = {z^1_1[1], z^1_1[2], z^2_1[1], z^3_1[1], z^3_1[2], z^4_1[1], z^4_1[2], ω^1_1(t − 1), ω^1_2(t − 1)} refer to the same objects; new global anchors may be created during this step. Associated entities are then fused, resulting in updated global anchors Ω^m(t). Associations between these global anchors and the positive description d^1_1 are then updated.

Chapter 4

Anchoring Framework

This chapter describes the anchoring framework proposed in this work. An overview of the proposed framework is provided in section 4.1. Conceptual spaces, which are used to represent information in the presented framework, are described in section 4.2. In section 4.3, management of local anchors is discussed. Management of global anchors is discussed in section 4.4. Descriptions are discussed in section 4.5, and names are discussed in section 4.6. The chapter concludes with a summary of the framework, which explains how the computational steps described in this chapter are used.

4.1 Framework Overview

The single-robot and cooperative anchoring problems were formalised in chapter 3. In this chapter, an anchoring framework which addresses both problems is presented. The development of the presented framework is the main contribution of this thesis. The framework allows non-perceptual object descriptions and perceptual representations of object properties to be represented in a consistent manner, and it defines a number of computational steps which allow the anchoring problem to be addressed. In particular, the framework includes a number of steps for the creation and maintenance of anchors, which are representations of object properties based on perceptual data. The framework also enables the association of anchors with matching object descriptions.

4.1.1 Local and Global Anchoring

The proposed framework decomposes the cooperative anchoring problem into two parts. The first part is called local anchor management, and it is essentially single-robot anchoring, per definition 3.2.1; it involves the management of each individual robot's local anchors. These anchors contain only information obtained from a robot's own information sources. The second part is called global anchor management, and it is essentially cooperative anchoring, per definition 3.2.2; it involves the management of global anchors.


Rather than creating global anchors using exchanged percepts, in the proposed framework global anchors are created using exchanged local anchors. The proposed approach is therefore a slight approximation of the cooperative anchoring problem as defined in chapter 3. The decomposition of the problem in this way is natural given the distributed nature of the systems considered, and it offers a number of important advantages.

One advantage relates to the computational complexity of the data association problem. If all robots exchanged every percept from every information source (which would normally only be feasible if one were to disregard latencies, bandwidth limitations, and unreliability in communication channels), the data association problem would involve comparisons between all percepts in the set Y(t). The decomposition of the problem distributes and simplifies this computation, by allowing each robot r_m to first process the percepts in Y^m(t), and summarise the contained information in local anchors. It is these local anchors which are then exchanged. More will be said about the complexity of the data association problem in section 6.3.5.

Another advantage of decomposing the problem into local and global parts is an increased robustness with respect to communication channel latencies and unreliability. Potential communication channel problems make it unfeasible for robots to wait for information from other robots before processing information from their own local information sources, which often produce information at relatively high frequencies. Treating the local anchoring problem independently allows local computations to be performed frequently, regardless of communication channel characteristics. Even if the communication network is extremely unreliable, the proposed framework allows whatever information is successfully exchanged to be exploited. Information loss due to communication channel errors does not affect the processing of local information, nor does it affect the processing of successfully exchanged information.

Finally, the decomposition of the problem allows the amount of information exchanged between robots to be adjusted based on available bandwidth. For instance, this can be accomplished by reducing the frequency at which local anchors are shared, or by compressing the transmitted local anchors (in a lossy or lossless manner).
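One hypothetical way to organise this decomposition in code is sketched below: a fast local step uses only the robot's own percepts, and an independent global step merges whatever local anchors have been received so far. Names and structures are illustrative only.

```python
def anchoring_cycle(state, own_percepts, received, update_local, merge_global):
    """One cycle of the decomposed approach (purely illustrative).

    `state` holds this robot's local and global anchors; `received` maps
    robot id -> list of local anchors received from that robot (possibly
    empty if communication failed -- the local part still runs regardless).
    `update_local` and `merge_global` are caller-supplied functions.
    """
    # Local anchoring: uses only this robot's own percepts.
    state["local"] = update_local(state["local"], own_percepts)

    # Global anchoring: own local anchors plus whatever arrived from others.
    candidates = list(state["local"])
    for anchors in received.values():
        candidates.extend(anchors)
    state["global"] = merge_global(state["global"], candidates)
    return state
```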

4.1.2 A Decentralised Approach

The proposed framework is inherently decentralised; each robot performs local anchoring independently, using percepts from its own sources. Local anchors received from other robots are considered during global anchoring; if no local anchors were received from other robots, global anchoring transparently produces one global anchor for each own local anchor. There is no central node, and there is no dependency on available communication channels.


Alternatively, a centralised approach might have been used. In such an approach, all robots would send their local anchors to one central robot or node, which would then perform global anchor management and send the resulting global anchors back to all robots. The main advantages of the decentralised implementation relate to robustness. In a distributed approach there is no critical central node; failures within such a central node, or failures in communication with this node, can make cooperative anchoring impossible. Also, a central node can become a processing and communication bottleneck. Although centralised approaches typically require less overall bandwidth, most of the communication in a centralised setup is directed to or from the central node. Bottlenecks can result in delays in the overall algorithm, since robots need to wait until they receive results from the central node before they can take advantage of information produced by other robots. In the decentralised case, a robot can, at any time, simply use whatever information it has received so far to create new global anchors. The main advantages of a centralised implementation relate to the facilitation of coordination between robots. If a central node is performing global anchor management, then this node can guarantee that all robots have identical global anchors, which can be referred to using globally consistent names or indices. This can facilitate coordination between robots [72]. Note however that this advantage is weakened if the central node is not explicitly aware of all robots. This is because in order to guarantee identical global anchors, the central node needs to know which robots are sending and receiving information. Both decentralised and centralised alternatives can be useful, depending on the application. Hybrid approaches might also be used; for instance, sub-groups of robots might communicate with each other in a decentralised manner, while a central node could be used to connect sub-groups. Avoiding dependencies on the communication network was the main motivation for choosing a decentralised approach here. However, a centralised version of the proposed framework would be relatively simple to develop.

4.1.3 Illustration

An overview of the framework, which shows the exchange of local and global anchors, is shown in figure 4.1. In the figure, one robot and one camera are shown exchanging information about three objects. The camera could only see two objects on its own, but after exchanging local anchors, both the robot and the camera had global anchors for all three objects. Note that local and global anchors are maintained separately, in order to avoid the circular dependencies which could arise if information received from a particular robot was sent back to that same robot as new information.


Figure 4.1: Overview of the proposed anchoring framework, showing the exchange of local anchors.


4.2 Conceptual Spaces As was mentioned in section 1.4, anchoring can involve a number of different types of information. In chapter 3, it was noted that in order to perform anchoring, percepts and anchors need to be compared and combined. For this to be possible, a common reference frame needs to exist between these different types of information. Such a reference frame can be hard to obtain; in particular, a relationship must be established between symbolic representations, for instance those used by high-level cognitive processes, and numeric representations, which are used in many perception and control sub-systems. Conceptual spaces were proposed by Gärdenfors [66] as a general theory of representation. The theory explores how different types of information can be meaningfully represented, both for developing artificial systems, and for explaining concept formation from a psychological perspective. Conceptual spaces use geometric representations spanning multiple domains to provide an intuitive interpretation of the relationship between symbolic and sub-symbolic information. This is extremely relevant for anchoring, which aims to connect heterogeneous descriptions and perceptual representations. This section provides a brief description of conceptual spaces, and it explains how they are used to represent information in the proposed anchoring framework. The main characteristics of conceptual spaces are as follows. • A conceptual space consists of a number of dimensions, which represent various object “qualities”. For instance, a conceptual space might include dimensions such as width, hue, weight, and saltiness. Dimensions may be constructed in different ways; for instance some dimensions may be circular, strictly positive, or discrete. • Domains are groups of interdependent dimensions. These dimensions are said to be integral or interacting, as opposed to separable. Common domains include position, colour, taste, shape, and weight. Some domains, like weight, can be described using a single dimension. The dimensions and units used to describe a domain can vary. For instance, positions can be described using Cartesian, cylindrical, or spherical coordinates, among others. Colours can be described using various colour spaces (e.g. RGB, YUV, HSV, and HSL). Taste is often described as consisting of five basic dimensions (usually sweet, sour, bitter, salty, and umami), although other proposals exist. The shape domain is a particularly complex domain to represent. A simple approach might be to simply use a discrete set of shape names. Another alternative could be to use an array of values which describe shape signatures according to some metric. More sophisticated approaches have also been proposed. For instance, some works [36] describe the shape domain using dimensions corresponding to parameters of superquadratic equations [13]. Gärdenfors discusses a number of proposals for representing shapes in conceptual spaces [66].


• Properties are regions in a single domain. For instance, the property “red” corresponds to a region in the colour domain. Properties described by convex regions are said to be natural properties; these properties are often particularly useful and intuitive to grasp. Properties can also be represented using non-convex or even disconnected regions. In language, properties often correspond to adjectives. • Concepts are regions in a conceptual space; such regions can span multiple domains. For example, the concept of an apple might be represented as regions in the colour, shape, taste, and weight domains. In language, concepts are often associated with nouns. For certain concepts some domains may be more salient than others [144]. For example, the shape and colour domains are much more useful than the weight domain when it comes to distinguishing apples from oranges. For some concepts, domains may also be correlated; for example, the taste and colour of an apple will often co-vary. • Objects can be described as points in a conceptual space. This approach to describing objects is the inspiration behind the proposed anchoring framework. Real world objects could potentially be described as points in an extremely large (even infinite) number of domains. In practice, only the most salient or useful domains are considered. A sample conceptual space is shown in figure 4.2. In this conceptual space, there are three domains. The colour domain consists of hue, saturation, and value dimensions (only hue and saturation are shown in the figure). The shape domain consists of a set of possible shapes. The weight domain is represented using its only dimension, which is strictly positive. Note that the units used in each dimension could vary. For instance, the HSV dimensions are often represented using either real numbers in the range [0, 1], or integers in the range [0, 255]. Also, note that the hue dimension is circular. In the figure, three objects are represented using corresponding points in each of the three domains. A simple sketch of how such a space might be encoded is given below.
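As an illustration only, the following Python sketch shows one way the sample conceptual space of figure 4.2 might be encoded. The dictionary layout, dimension ranges, and object values are assumptions made for the example; they are not part of the framework itself.

    # Illustrative encoding of the sample conceptual space in figure 4.2.
    # Each domain groups its dimensions; each dimension records its range
    # and whether it is circular (like hue) or discrete (like shape).
    conceptual_space = {
        "colour": {
            "hue":        {"range": (0.0, 1.0), "circular": True},
            "saturation": {"range": (0.0, 1.0), "circular": False},
            "value":      {"range": (0.0, 1.0), "circular": False},
        },
        "shape": {
            "shape": {"values": {"cube", "sphere", "cylinder"}},
        },
        "weight": {
            "weight": {"range": (0.0, float("inf")), "circular": False},
        },
    }

    # Objects are points in the space: one value per dimension they are known in.
    objects = {
        "o1": {"colour": (0.02, 0.9, 0.8), "shape": "sphere",   "weight": 0.3},
        "o2": {"colour": (0.60, 0.7, 0.6), "shape": "cube",     "weight": 1.2},
        "o3": {"colour": (0.33, 0.8, 0.7), "shape": "cylinder", "weight": 0.7},
    }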

4.2.1 Interpretations

Gärdenfors suggests two possible interpretations of conceptual spaces. On one hand, there is a phenomenal (or psychological) interpretation, in which the dimensions and domains of a conceptual space are inferred based on behavioural or psychological considerations. In this work the scientific (or theoretical) interpretation is used, in which dimensions and domains are chosen in order to facilitate the construction of cognitive systems. Simply put, this means that this work is not concerned with the question of which colour spaces are suitable for describing human perception. In most cases, available sensors and the task at hand will be the most important factors in determining which domains and dimensions should be used.


Figure 4.2: A sample conceptual space, with three domains. The three objects are represented as points in each domain of the conceptual space.


4.2.2 Similarity

Similarity is an important notion in the conceptual spaces framework. In metric domains Euclidean distance can often be used as a simple and intuitive similarity measure. For instance, red is closer to orange than it is to blue in the HSV colour space. Topological domains might use other distance measures, such as the number of edges between nodes in a graph or tree. For example, in a tree of kinship relations, a sibling node is closer than a cousin node. Some domains may not have an intuitive similarity measure. For instance, the shape domain in figure 4.2 is simply a set of shape names. This could be seen as a topological domain represented by a fully connected graph – in other words, all nodes are equidistant (and therefore equally similar). In this work, measures of similarity and consistency are needed in order to compare various types of information.
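As a rough illustration of such measures, the sketch below compares points in a metric domain using Euclidean distance mapped to a similarity value with an arbitrary exponential decay, and treats a structureless discrete domain as a set of equidistant elements. The decay constant, and the fact that the circularity of the hue dimension is ignored, are simplifying assumptions made for the example.

    import math

    def metric_similarity(p, q, scale=1.0):
        # Similarity in a metric domain: Euclidean distance mapped into (0, 1].
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
        return math.exp(-dist / scale)

    def discrete_similarity(a, b):
        # A fully connected discrete domain: only identical elements are similar.
        return 1.0 if a == b else 0.0

    # Red is closer to orange than to blue in a (hue, saturation, value) space.
    red, orange, blue = (0.00, 0.9, 0.8), (0.08, 0.9, 0.8), (0.66, 0.9, 0.8)
    assert metric_similarity(red, orange) > metric_similarity(red, blue)
    assert discrete_similarity("cube", "cube") == 1.0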

4.2.3 Anchor Spaces

In the proposed framework, percepts received from information sources, local anchors, global anchors, and descriptions of objects of interest are all represented using anchor spaces. These spaces are a specialisation of conceptual spaces, in which regions in domains correspond to uncertainty about the properties of a particular object. So a large region indicates that there is considerable uncertainty regarding the exact point in the conceptual space which defines the object in question. This means that belief about an object’s properties can be seen as an evolving region in an anchor space. How uncertainty is represented and interpreted will depend on the implementation; one possible interpretation is discussed in section 5.2.

4.3 Local Anchor Management This section describes how local anchors are managed in the proposed framework. Recall that local anchors contain estimates of object properties based on a robot’s own information sources. The creation and maintenance of local anchors is an important part of the proposed framework. Local anchors are associated with object descriptions to perform single-robot anchoring, as described in definition 3.2.1. In addition to the ingredients from section 3.1, the following additional components are introduced. • For each robot rm there is a local anchor space, denoted Cm . The local anchor space describes the domains, dimensions, coordinate systems, and units used by robot rm to represent information about objects. The space includes specifications for all domains in the set Xm – that is, all domains in which robot rm is interested.


• For each of robot rm ’s information sources sm k there is a conceptual sensor model hm k , which maps information produced by the corresponding information source sm k to values over the local anchor space Cm . A given conceptual sensor model hm k includes mappings for all domains in which information source sm k can produce information. These mappings can involve complex, non-linear transformations, and the resulting values can range from single points to multi-modal regions. Conceptual sensor models can be seen as inverse sensor models, which map heterogeneous percepts to estimates of object properties. The term inverse is used since sensor models typically map state estimates to measurements, and not the other way around. Since percepts can include symbolic information, conceptual sensor models can also act as predicate grounding relations [38], which map symbolic predicates to corresponding object property estimates. In general, conceptual sensor models can be used to consider all types of information shown in table 1.1. No assumptions are made about how they are obtained; they can be manually created, derived using learning, or acquired in some other way. In most applications, sensor models and predicate grounding relations are given, or easily obtainable. A simple illustrative sketch of such a mapping follows this list.
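The sketch below converts a hypothetical camera percept (an RGB measurement and a position estimate in metres) into interval-shaped regions in the colour and position domains of the local anchor space. The field names, the interval widths, and the use of plain intervals instead of the fuzzy sets used in the actual implementation are all assumptions made for the example.

    import colorsys

    def camera_sensor_model(percept):
        # Illustrative conceptual sensor model: raw camera percept -> regions
        # (here plain intervals) over the local anchor space.
        r, g, b = percept["rgb"]
        hue, sat, val = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        return {
            "colour":   {"hue": (hue - 0.05, hue + 0.05)},        # measurement noise
            "position": {"x": (percept["x"] - 0.2, percept["x"] + 0.2),
                         "y": (percept["y"] - 0.2, percept["y"] + 0.2)},
        }

    estimate = camera_sensor_model({"rgb": (200, 30, 30), "x": 3.1, "y": 1.4})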

4.3.1 Self-anchors

Some local anchors may be created explicitly at start-up. In particular, in cooperative scenarios it is often useful for a robot to consider itself as an object; in this case, a robot’s estimates of its own properties can be stored as a local anchor in the set Ψm . This type of anchor is called a self-anchor; self-anchors are normally named, where the anchor name is the name of robot rm . In practice, self-anchors should normally be the only named local anchors – more will be said about this in section 4.6. Although self-anchors can be treated as normal anchors, special considerations may be used when maintaining them, since robots often have more information about their own properties than they do about the properties of other objects. By representing its own properties as an anchor, a robot can easily share information about itself with other robots via the anchoring framework. This can be useful for many tasks. For instance, cooperative planning may benefit from having robots exchange self-localisation information. If robots can observe each other, cooperative self-localisation might also be addressed via the anchoring framework. The use of self-anchors also allows the framework to extend naturally to applications where “smart” objects can have information about themselves. In the proposed framework, such objects are essentially treated as robots.

4.3.2 Local Data Association

When a percept zm k [j] arrives from information source sm k , it is mapped into the local anchor space Cm using the conceptual sensor model hm k ; the percept is then buffered. The local data association step is triggered for each frame. Information arriving during each frame must be considered separately, in order to ensure that percepts arriving from the same information source during the same frame are not associated with each other. Local data association involves matching percepts in set Y m (t) with each other and with existing local anchors Ψm (t − 1), in order to determine which entities refer to the same objects. Note that local anchors need not be compared with each other, and percepts produced by the same information source during a given frame need not be compared with each other; these are assumed to correspond to different objects. Any named entities, either percepts or local anchors, can only match entities with the same name, or no name. If two entities do have the same name, they are considered to match perfectly, regardless of their contents; such matches take precedence over other matches. Otherwise, matching involves computing similarity or consistency measures between entities, based on their contents. The resulting “degree of matching” is used to decide which entities refer to the same objects. The matching process is simplified by the fact that all considered entities are represented in the same local anchor space. The result of the local data association step is a partitioning of the set αm (t) such that each partition αm l (t) contains all and only percepts which are believed to refer to object ol . Such a partition can contain at most one percept from each information source, and at most one local anchor ψm l (t − 1). If a partition has no associated local anchor, a new empty local anchor is created and added to the partition. Thus at the end of the local data association step, each partition αm l (t) contains a local anchor ψm l (t − 1). The subscript l is a local index, used by robot rm to denote a given anchor ψm l . The details of the matching operation and the data association algorithm will depend on the implementation. One possible implementation is described in section 6.3.
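To illustrate the flavour of this step, the sketch below uses a deliberately simple greedy grouping; the actual implementation uses a GNN-style search (section 6.3), and the entity layout, match function, and threshold are assumptions made for the example.

    def entities_match(a, b, match_fn, threshold=0.5):
        # Name constraint first: identically named entities match perfectly,
        # differently named entities never match; otherwise compare contents.
        if a.get("name") and b.get("name"):
            return a["name"] == b["name"]
        return match_fn(a, b) >= threshold

    def local_data_association(percepts, anchors, match_fn):
        # Greedy sketch: each percept joins the first compatible partition,
        # respecting "at most one percept per information source per partition".
        partitions = [{"anchor": a, "percepts": []} for a in anchors]
        for z in percepts:
            for part in partitions:
                used_sources = {p["source"] for p in part["percepts"]}
                if z["source"] not in used_sources and \
                        entities_match(z, part["anchor"], match_fn):
                    part["percepts"].append(z)
                    break
            else:
                # No match: create a new, empty local anchor for this percept.
                partitions.append({"anchor": {"name": None}, "percepts": [z]})
        return partitions

Here each percept and anchor is assumed to be a dictionary with an optional "name", a "source" for percepts, and whatever domain data match_fn compares.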

4.3.3 Local Information Fusion

The local information fusion step is triggered immediately following the local data association step. Local information fusion combines the entities in each partition αm l (t). The result of the fusion operation overwrites the contents of the associated local anchor ψm l (t − 1), which becomes ψm l (t). Note that local anchor names are not affected by the fusion process. In particular, if a local anchor ψm l was unnamed, it remains unnamed, even if one or more percepts in the partition αm l (t) were named. More will be said about this in section 4.6.


Fusing information represented in the same local anchor space involves combining the estimates in each domain. The details of the fusion operation will depend on the implementation. One possible implementation is described in section 6.4.
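For concreteness, the sketch below fuses two interval-style estimates domain by domain, using intersection as a conjunctive combination; the actual implementation combines fuzzy sets with the operators described in section 6.4, so this is only an analogy.

    def fuse_intervals(a, b):
        # Conjunctive fusion of two interval estimates in one dimension;
        # None signals that the estimates are mutually inconsistent.
        lo, hi = max(a[0], b[0]), min(a[1], b[1])
        return (lo, hi) if lo <= hi else None

    def fuse(a, b):
        # Combine two anchor-space estimates domain by domain and
        # dimension by dimension.
        out = {}
        for domain in set(a) | set(b):
            dims_a, dims_b = a.get(domain, {}), b.get(domain, {})
            out[domain] = {}
            for dim in set(dims_a) | set(dims_b):
                if dim in dims_a and dim in dims_b:
                    out[domain][dim] = fuse_intervals(dims_a[dim], dims_b[dim])
                else:
                    out[domain][dim] = dims_a.get(dim, dims_b.get(dim))
        return out

    fused = fuse({"position": {"x": (1.0, 3.0)}},
                 {"position": {"x": (2.0, 4.0)}, "colour": {"hue": (0.0, 0.1)}})
    # fused == {"position": {"x": (2.0, 3.0)}, "colour": {"hue": (0.0, 0.1)}}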

4.3.4 Local Prediction

Periodically, local anchors which have not been observed for some time are replaced with predicted estimates, which take into account changes which may have occurred since the last observation. How prediction is performed depends on assumptions about how object properties can change in each domain. Recent rate-of-change information for each domain might be considered, such as velocity in the position domain. Predictions of self-anchors might be performed differently, since a robot might have more information about how its own properties change in time; for instance, self-motion estimates are normally much more accurate than estimates about the motion of foreign objects. The details of the prediction step will depend on the implementation; one possible implementation is presented in section 6.5.
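A minimal sketch of such a prediction step is shown below; it simply widens the position intervals by the worst-case displacement v_max · dt and leaves other domains untouched, mirroring the simple model used later in this work. The maximum speed value is an arbitrary assumption.

    def predict(anchor, dt, v_max=0.5):
        # Widen position intervals by the worst-case displacement v_max * dt;
        # other domains (e.g. colour) are assumed static and copied unchanged.
        predicted = dict(anchor)
        if "position" in anchor:
            predicted["position"] = {
                dim: (lo - v_max * dt, hi + v_max * dt)
                for dim, (lo, hi) in anchor["position"].items()
            }
        return predicted

    older = predict({"position": {"x": (2.0, 3.0), "y": (1.0, 1.5)}}, dt=2.0)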

4.3.5 Local Anchor Deletion

Anchor deletion is a complex issue. Deciding when an anchor is no longer needed or useful is difficult; such decisions will often depend on the application. A number of factors should typically be considered, such as the dynamics of the object, the nature of the task at hand, as well as the domains included in the anchor [104]. Anchors might also be archived, to be retrieved at a later time. The proposed framework does not constrain how local anchor deletion is performed or triggered; the only requirement is that it should be possible for the system to explicitly trigger local anchor deletion. For instance, a planner might need to delete certain local anchors when beginning a new task, or when descriptions of what objects are of interest have been modified. The prediction step could also be used to trigger local anchor deletion, if an object’s predicted properties become too uncertain to be useful. Anchor deletion might also be triggered if an object remains unobserved for some time, or if an expected observation fails to occur; the latter could mean that the anchor was created based on a perceptual glitch, for instance. Finally, deletion of local anchors could also be triggered when the number of maintained local anchors reaches a certain threshold. As will be discussed in section 6.3.5, the number of maintained local anchors can significantly affect the overall complexity of the data association problem, and a maximum number of local anchors could be used to mitigate this. When the maximum number of local anchors is reached, the oldest local anchors could be deleted to make room for new ones.


Figure 4.3: Illustration of the local anchor management steps performed by robot 1.

4.3.6 Illustration

Figure 4.3 shows the information used by robot 1 during one sample iteration of the local anchor management steps. Pre-existing local anchors in Ψ1 (t − 1) are first predicted if needed, so that they can be compared with the percepts in Y 1 (t). Produced percepts are converted from a source-specific format to local anchor space representations using appropriate conceptual sensor models. Local data association is then performed, by comparing local anchors and percepts. The result is a partitioning of the set α1 (t); each partition includes one local anchor, which may be created during the data association step. These partitions are shown using dashed boxes in figure 4.3. Note that the percepts in partition α13 (t) did not match any existing local anchors, so a new local anchor ψ13 was created and added to that partition. The local information fusion step involves combining the information in each partition of α1 (t); the result is stored in the contained local anchor. Note that local anchors can be used at any time by both high-level and low-level modules.


4.4 Global Anchor Management This section describes how global anchors are managed in the proposed framework. Recall that global anchors contain estimates of object properties based on exchanged local anchors. The creation and maintenance of global anchors is an important part of the proposed framework. Global anchors are associated with object descriptions to perform cooperative anchoring, as described in definition 3.2.2. In addition to the ingredients listed in sections 3.1 and 4.3, the proposed approach introduces the following new components. • There is a global anchor space, denoted C. This anchor space describes the domains, dimensions, coordinate systems, and units used to represent shared information about objects. The space includes specifications for all domains which might be used in exchanged local anchors. This is the set X – the set of all domains in which any robot is interested. The global anchor space essentially describes an agreement between robots regarding the “format” used when exchanging information about objects. • For each robot rm , there is a space transformation function, denoted fm , which allows a robot rm to map information from its own local anchor space Cm to the global anchor space C before it is sent to other robots. Space transformation functions must be invertible, so that local anchors received from other robots can be converted from the global anchor space back to the robot’s local anchor space. The functions consider each domain in the local anchor space Cm . The local and global anchor spaces may be identical, in which case the corresponding space transformation function is the identity function. Space transformation functions typically implement coordinate transformations. For instance, if the global anchor space uses the HSV colour space for colour information, and robot rm ’s local anchor space represents colour in the RGB colour space, then the space transformation function fm should include an invertible transformation from the RGB colour space to the HSV colour space. Similarly, if position information in the global anchor space is represented using global coordinates, then any robot which uses local coordinates in its local anchor space should use a space transformation function to perform the coordinate transformation from local to global coordinates. Note that such a coordinate transformation would depend on the robot’s self-localisation estimate; this will be discussed in detail in section 6.2. And again, this transformation should be invertible, so that information received in global coordinates can be mapped to the local coordinates used in the local anchor space.
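As one concrete instance, the sketch below implements the position part of such a space transformation function as a standard 2D rigid transform, using the robot’s self-localisation estimate (x, y, heading); the inverse maps received global positions back into the local frame. The pose representation and parameter names are assumptions made for the example.

    import math

    def local_to_global(x_l, y_l, pose):
        # Map a point from the robot's local frame to global coordinates,
        # given the robot's pose (x, y, theta) in the global frame.
        x_r, y_r, th = pose
        x_g = x_r + x_l * math.cos(th) - y_l * math.sin(th)
        y_g = y_r + x_l * math.sin(th) + y_l * math.cos(th)
        return x_g, y_g

    def global_to_local(x_g, y_g, pose):
        # Inverse transformation, so estimates received in global coordinates
        # can be mapped back into the robot's local anchor space.
        x_r, y_r, th = pose
        dx, dy = x_g - x_r, y_g - y_r
        return (dx * math.cos(th) + dy * math.sin(th),
                -dx * math.sin(th) + dy * math.cos(th))

    p_global = local_to_global(1.0, 0.0, pose=(2.0, 3.0, math.pi / 2))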

4.4.1 Global Data Association

Transmission of local anchors can occur synchronously or asynchronously. Before being sent, local anchors are mapped from the local anchor space Cm to the global anchor space C using the space transformation function fm . Transmitted local anchors are marked with the index m of the sending robot, and the local anchor index l. Received local anchors are first mapped from the global anchor space to the local anchor space, after which they are buffered. Newer anchors from the same robot overwrite older anchors with the same robot index and local index. Buffered anchors are fed to the global data association algorithm either synchronously or asynchronously. Global data association involves matching the local anchors in the set Ψ(t) (recall that the set Ψ(t) is the set of all local anchors available at time t) in order to determine which ones refer to the same objects. As was mentioned previously, local anchors from the same robot are assumed to correspond to different objects, and therefore they need not be compared with each other. As in local data association, named entities can only match entities with the same name, or no name. Two entities with the same name are assumed to match perfectly, regardless of their contents; such matches take precedence over any other matches. In practice, local anchors usually have unique names, since self-anchors are normally the only named anchors, and these should all originate from different robots. Names aside, matching involves determining a “degree of matching”, which is often based on similarity or consistency. This measure is used to determine which local anchors refer to the same objects. Assuming all local anchors are successfully exchanged, the result of the global data association step is a partitioning of the set Ψ such that each partition Ψg contains all and only anchors which are believed to refer to the same object og . Each partition can contain at most one anchor from each robot. For each partition Ψg , a new global anchor ωm g is created, and all previous global anchors are discarded. If all local anchors have not been successfully exchanged, then the global data association step simply creates a partitioning of the set of available local anchors. Note that if no local anchors have been received from other robots, the result of the global data association step is a set which contains exactly one global anchor for each own local anchor. In this case the result of the global anchor management steps will be the same as the result of the local anchor management steps; this transparency is particularly useful, and it allows global anchors to be used instead of local anchors in most situations. Section 6.3 describes a possible implementation of both the matching operation, and the data association algorithm; the same implementation is used for both local and global data association.


Global Index Negotiation

The subscript g is used by robot rm to denote a global anchor ωm g . Generally speaking, this index can be different across robots, and over time, for the same objects. However, a simple approach can be used to stabilise this index in most cases, as follows. The global index g can be computed as

g = (mmin · Lmax) + l,    (4.1)

where Lmax is a constant greater than the maximum number of local anchors which a robot can maintain; the robot index mmin is the lowest index of all robots with a local anchor in the partition Ψg ; and l is the local index of that anchor, provided by the sending robot. In the examples and illustrations in this work, Lmax is assumed to be 10, for simplicity. This allows global indices to contain two parts. All digits except the last one correspond to the robot index mmin ; the last digit corresponds to the local index l. The main benefit of using this approach is that any two robots who have successfully exchanged all local anchors will always have the same index for the same anchor, assuming they are using the same data association algorithm. The indices may differ when the same local anchors are not available to both robots; this is inevitable, since in this case even the number of global anchors may be different. Another benefit of the proposed numbering scheme is that the index will remain stable over time, except when a new robot, with a lower index, adds a local anchor to the given partition Ψg . This stability across robots and over time can be useful when coordinating cooperative activities. Specifically, objects can be referred to using indices which are shared by various robots. More sophisticated negotiation approaches could also be used; see chapter 11 in Coulouris et al. [43] for a discussion of consensus in distributed systems. Aragues et al. [3] propose a decentralised algorithm specifically designed to resolve inconsistencies between local and global instances of data association. However, such approaches require at least one extra round of information exchange. If the communication channel being used is reliable enough to support such an approach, it is most likely reliable enough to allow all local anchors to be exchanged; in such cases, the simple approach proposed above will often be sufficient.
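A small sketch of equation 4.1 and its inverse is shown below, with Lmax set to 10 as in the examples in this work.

    L_MAX = 10  # must exceed the maximum number of local anchors per robot

    def global_index(m_min, l):
        # Equation 4.1: combine the lowest robot index with its local index.
        return m_min * L_MAX + l

    def split_global_index(g):
        # Recover the (robot index, local index) pair from a global index.
        return divmod(g, L_MAX)

    assert global_index(3, 7) == 37
    assert split_global_index(37) == (3, 7)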

4.4.2 Global Information Fusion

Global information fusion is performed immediately after global data association. The fusion step combines the local anchors in each partition Ψg of the set Ψ, and stores the result in the global anchor ωm g . Note that unlike in local information fusion, where names were not passed on from percepts to local anchors, local anchor names are transferred to global anchors. Again, names will be discussed in more detail in section 4.6.


It is important to avoid storing information from the global information fusion step back into local anchors, since these will be sent to other robots again. Failure to keep this information separate could lead to circular dependencies, such that the same information might be treated as new multiple times. In the ideal case, all local anchors are successfully exchanged in a lossless manner, and global anchors are identical across all robots. In general, however, it may happen that due to communication errors, lossy compression of local anchors, or lack of synchronisation, the local anchors available to each robot are not the same. In this case, the set of global anchors created by one robot will be different from the set created by another robot. The fusion step simply uses whichever local anchors are available to compute global anchors. If no local anchors have been exchanged, the result of the global information fusion step is a new set of global anchors, each corresponding to one own local anchor. Global information fusion involves fusing information from each domain of the global anchors in question, in order to obtain improved estimates of object properties. Again, the details of the fusion operation will depend on the implementation. An implementation of the global information fusion step is described in section 6.4.

4.4.3 Global Prediction

Local anchors received from other robots can be subjected to the same prediction steps as own local anchors, as described in section 4.3.4. This implies that robots must agree on how object properties can evolve over time. If large network latencies exist, received local anchors might also be predicted upon receipt, by an amount corresponding to the estimated network latency. Global anchors do not need to be predicted, since they can be re-created based on predicted local anchors instead.

4.4.4 Global Anchor Deletion

Deletion of buffered local anchors received from other robots can be managed in the same way as deletion of own local anchors, as described in section 4.3.5. In addition to this, a robot can also explicitly signal to other robots that a local anchor was deleted, by sending a “deleted” message with the same local index as the deleted anchor. Global anchors do not need to be deleted, since they are discarded whenever new global anchors are created during the global information fusion step. If needed, the creation of new global anchors can be triggered after local anchors have been deleted.


Figure 4.4: Illustration of the global anchor management steps performed by robot 1.

4.4.5 Illustration

Figure 4.4 shows the information used by robot 1 during one sample iteration of the global anchor management steps. Local anchors exchanged between robots are represented in the global anchor space. Received local anchors must therefore be converted to the local anchor space C1 , via space transformation functions (or rather, their inverse). These local anchors are buffered and predicted, so that they can be compared with the latest own local anchors in set Ψ1 (t). Own local anchors are also predicted if needed – this prediction was also used during local anchor management. Note that a robot’s own predicted local anchors are also sent to other robots, after having been converted to the global anchor space via space transformation functions; the transmission of robot r1 ’s local anchors to other robots is not shown in the figure. Global data association involves comparing all available local anchors, in order to determine which ones refer to the same objects. The resulting partitions Ψg (t) are shown using dashed boxes. Global information fusion involves combining the information in each partition Ψg (t). The results are stored in new global anchors, and all previous global anchors are discarded. Note that in the example, global anchor indices are computed using equation 4.1.


4.5 Descriptions Recall from section 3.1 that each robot rm maintains a set of descriptions of objects of interest. Positive descriptions are used to indicate objects which are of interest, and negative descriptions are used to describe objects in which the system is not interested. Specific objects are described using definite descriptions, and indefinite descriptions describe general object properties. Descriptions can also be named or unnamed. It can be convenient to store commonly used descriptions; these can be grouped, and activated and deactivated, depending on the task at hand. In order to support the creation of descriptions, the framework includes one final component, in addition to the previously listed ingredients. • Each robot rm uses a set of Um grounding functions to create descriptions; the set is denoted Gm = {gm 1 , . . . , gm Um }. Grounding functions convert heterogeneous descriptions, which can span multiple domains, into the set of descriptions Dm , which is represented in the local anchor space (a small illustrative sketch is given below). If needed, global anchor space representations of these can also be obtained via space transformation functions. Grounding functions are analogous to conceptual sensor models, in that both map various types of information to the local anchor space. While conceptual sensor models consider perceptual information produced by information sources, grounding functions consider generic object descriptions. In practice, the same mathematical operation can be used for both conceptual sensor models and grounding functions. For instance, both percepts and descriptions might contain information that an object is {red}, or at position (11, 7). The framework does not directly address the exchange of descriptions between robots. If desired, descriptions can be exchanged outside the framework. However, the representations and operations included in the framework could be used to facilitate this, since robots can exchange and compare descriptions in much the same way as they exchange and compare local anchors.
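The sketch below shows a toy grounding function which maps a symbolic description such as {red, in-kitchen} to regions in the colour and position domains. The predicate-to-region tables are invented for the example, and the real implementation would produce fuzzy sets rather than plain intervals.

    # Invented predicate-to-region tables; real regions are application specific.
    COLOUR_PREDICATES = {"red": (0.00, 0.06), "green": (0.25, 0.42)}   # hue intervals
    PLACE_PREDICATES = {"in-kitchen": {"x": (0.0, 4.0), "y": (0.0, 3.0)}}

    def ground_description(symbols):
        # Grounding function: symbolic predicates -> local anchor space regions.
        description = {}
        for s in symbols:
            if s in COLOUR_PREDICATES:
                description.setdefault("colour", {})["hue"] = COLOUR_PREDICATES[s]
            elif s in PLACE_PREDICATES:
                description["position"] = dict(PLACE_PREDICATES[s])
        return description

    d = ground_description({"red", "in-kitchen"})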

4.5.1 Descriptions and Anchoring

Positive descriptions are the main input to the anchoring process; they indicate which objects are of interest, and they are associated with matching local and global anchors in order to address the single-robot and cooperative anchoring problems, respectively. Each association has an associated match value, describing how well the description and anchor match; this value might be used by higher layers in a number of ways, for instance to determine if more perceptual information is needed, or if descriptions need to be updated. Associations between descriptions and anchors can be recomputed whenever either of these are updated. The matching operation will typically be similar to that used for data association; in particular, names can be used to constrain the matching operation.


In order to match, descriptions and anchors should match in all domains which they have in common; other domains do not necessarily affect the match, although the reliability of a match could be decreased if few domains are common. As in data association, the matching process is simplified by the fact that all involved entities are represented using the same anchor space. If a positive description is named, matching anchors are also associated with that name. More will be said about this in section 4.6. Note that multiple anchors might match a single description, and a single anchor might match multiple descriptions. If a definite description is associated with more than one anchor, this indicates that the available information is not sufficient to allow the particular object described by the definite description to be uniquely identified. In such situations, higher layers might choose to perform perceptual actions to disambiguate the situation [30, 88], or a more detailed description of the object in question might be provided.

4.5.2 Descriptions and Interest Filtering

Another important function of descriptions is that they can be used to filter out uninteresting information. Reducing the number of entities considered by the anchoring framework is important given the computational complexity of the data association problem, which will be discussed in section 6.3.5. Produced percepts and received local anchors which match at least one active positive description, and which do not match any active negative descriptions, are processed as described earlier in this chapter; all other arriving entities are discarded. Again, the matching operation will typically be similar to that used for data association. Note that if robots have different descriptions then they may not compute the same global anchors, since some transmitted local anchors might be discarded. As mentioned in section 3.1.1, information sources can also be configured to discard percepts which are not relevant to the current task. Having information sources filter out uninteresting information can be a slightly more computationally efficient solution, since arriving perceptual information can often be rejected with very little processing; in particular, it may be possible to filter out percepts before they are fed to the information source’s conceptual sensor model. However, this approach can be rather inflexible, since information source tuning might be difficult to perform whenever interests change. In particular, information sources will typically have different calibration mechanisms, and one might need to calibrate each source separately. Also, local anchors received from other robots cannot be filtered out via information sources. Using descriptions to filter out information provides a more general and flexible approach, which allows interest to be easily managed and updated. As mentioned previously, descriptions can be grouped, and activated and deactivated, for different tasks. The computational cost of using descriptions to filter out arriving information is linear in the number of descriptions used and in the number of entities being checked.
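The sketch below captures the filtering rule described above: an arriving entity is kept only if it matches at least one active positive description and no active negative description. The match function and threshold are placeholders rather than part of the framework.

    def is_interesting(entity, positives, negatives, match_fn, threshold=0.5):
        # Keep an entity only if it matches some positive description
        # and no negative description.
        if not any(match_fn(entity, d) >= threshold for d in positives):
            return False
        return not any(match_fn(entity, d) >= threshold for d in negatives)

    def filter_entities(entities, positives, negatives, match_fn):
        # Cost is linear in the number of descriptions and entities checked.
        return [e for e in entities
                if is_interesting(e, positives, negatives, match_fn)]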


4.6 Names In philosophy, names have many possible meanings and interpretations [116, 134, 90]. In this work, names are simply symbols used to denote objects and robots; they are assumed to be globally unique, even across robots. This is an admittedly simplistic interpretation of names, but it is adequate for the purposes of this work. Names are useful in many applications; in particular, humans and robots use names to denote objects, both internally and when exchanging information. The proposed framework considers names in two ways.

4.6.1 Assigning Names

First, names can be assigned to percepts by the information sources which produce them, to local anchors at start-up time, and to global anchors during global information fusion. Corresponding entities are considered to be named, and the framework assumes that these names are correctly assigned. This allows names to be used to constrain data association and interest filtering, as mentioned in sections 4.3.2, 4.4.1, and 4.5. Percepts are named if they are known to originate from a specific named object. For instance, an RFID tag might contain an object’s name, as well as a number of object properties; these properties are known to originate from an object with a particular name. Alternatively, a virtual sensor might be able to uniquely identify an object with a known name, perhaps aided by application domain constraints; for example, it may be known that there is only one orange object in the environment, called “ball”. Certain local anchors might be created and named at start-up time. This can be particularly relevant for self-anchors, which are normally the only anchors which can be known with certainty to refer to specific “objects”; more will be said about this shortly. Note that the framework does not allow existing local anchors to be named after start-up; names can be dynamically associated with anchors, as will be discussed in section 4.6.2 – but they can never be assigned to local anchors. Global anchors, on the other hand, can be assigned names during global information fusion. Arriving named entities can optionally be allowed to refine named descriptions, as follows. Received named entities and identically-named descriptions can be fused, and the named description can be replaced by the result of this fusion. This can be used to allow arriving information about a particular named object to improve the description of what is known about that object. The arriving named entity should then be processed as normal.


Names in Local Anchor Management

Named percepts and named local anchors constrain the local data association step as discussed in section 4.3.2. Specifically, a named entity will never match another named entity which has a different name. Local anchor names are not affected by the local fusion step. In particular, unnamed local anchors do not become named even if they matched and were associated with named percepts. The reason for this is that it is generally impossible to ensure that an anchor refers to a specific named object. Once a local anchor has been fused with any unnamed percept, the anchor inherently refers to “the object it represents”. For this reason, named local anchors should be updated with particular care. To illustrate the problem, imagine that an RFID reader produces a named percept for “cup-22”, with the properties {green, in-kitchen}. This percept results in the creation of a local anchor – here, assume that the anchor does inherit the name “cup-22”. Next, a green object is observed at the North end of the kitchen. The corresponding unnamed percept matches the local anchor for “cup-22”, and is associated and fused with this anchor. The anchor now indicates that “cup-22” is green, and located at the North end of the kitchen. Finally, another green object – the real “cup-22” – is observed at the South end of the kitchen. In this case, the real “cup-22” will not match the anchor which is named “cup-22”.

Names in Global Anchor Management

In global anchor management, names can constrain data association, as in local anchor management. Unlike the local case, names are passed on during global information fusion. This is because global anchors are discarded at every global anchor management iteration anyway. This means that errors in global anchor naming will not have cumulative or persistent effects on the anchoring process.

4.6.2 Associating Names and Anchors

The second way in which names are considered is that names can be associated with local and global anchors which match named positive descriptions. It is in this sense that anchoring can be seen as a subset of the symbol grounding problem [78]. These associations are dynamic, and they may change as descriptions and anchors are updated. They can be useful, since they provide a list of candidate anchors, which are consistent with a particular named description at a given time. In order to reduce fluctuations in these associations, hysteresis might be used to force an anchor which matches a particular description to remain associated with that description for some time.


4.7 Framework Summary The various components and steps which form the proposed framework are shown in figure 4.5. The following events and actions summarise the operation of the framework.

EVENT: Descriptions updated. ACTION: Descriptions can be added, removed, or modified by a user of the framework at any time. Grounding functions are used to create new descriptions, and various types of updates might be performed. Recall that named descriptions may also be updated upon receipt of named entities. Since descriptions have two separate functions, they are shown in two places in figure 4.5. On the left hand side of the figure, they are shown filtering out unwanted percepts and incoming local anchors. On the right hand side of the figure, they are associated with matching local and global anchors. Description updates cause currently buffered percepts and local anchors to be re-examined. If existing entities are discovered to be uninteresting, they are deleted.

EVENT: Receive percepts. ACTION: Percepts can arrive from local information sources at any time. In figure 4.5, information sources s11 and s12 are shown as circles in the upper left corner, and the percepts they produce are shown using rectangles pointed to by arrows. Percepts are first fed to the appropriate conceptual sensor model, using which they are represented in the local anchor space. They are then passed through the active descriptions, and uninteresting information is discarded. Surviving percepts are buffered, together with any other percepts received during the current frame.

EVENT: Receive local anchors. ACTION: Local anchors can be received from other robots at any time. Robots r2 and r3 are shown on the left of figure 4.5, and the local anchors sent to robot r1 are shown as rectangles pointed to by arrows. Note that transmission of own local anchors can also occur at any time – in figure 4.5, transmission of robot r1 ’s local anchors to other robots is not shown. Local anchors are converted to the global anchor space before being sent, using appropriate space transformation functions. Arriving local anchors are converted from the global anchor space to the local anchor space via inverse space transformation functions. Received local anchors which do not match the active descriptions are discarded. Remaining anchors are buffered. Local anchors from the same robot with the same local index overwrite previously received anchors with the same identity; a deleted message can be sent for a particular local index to indicate that the corresponding local anchor has been deleted.


EVENT: Prediction cycle end. ACTION: Periodically, local anchors which are older than a certain threshold are replaced with predictions; these predictions are computed using knowledge about the dynamics of each relevant domain. Predicted local anchors may be deleted if the prediction causes deletion criteria to be met. Prediction is applied to both own local anchors and buffered local anchors received from other robots. In figure 4.5, prediction is shown using circular arrows for both own and received local anchors. Another approach could be to perform prediction asynchronously instead; for instance, prediction might be triggered whenever local anchors are to be transmitted or used to create global anchors.

EVENT: Frame end. ACTION: For each frame, local data association and information fusion algorithms are invoked. Buffered percepts received during a particular frame are compared with each other and with own local anchors, during the local data association step. Matching entities are grouped into partitions, each containing at most one local anchor and one percept from each local information source. New local anchors are created for partitions which do not contain a local anchor. Entities in each partition are then fused, and the result overwrites the contents of the corresponding local anchor. Buffered percepts are then discarded, and a new frame begins. In figure 4.5 local data association and information fusion steps are shown near the top of the figure.

EVENT: Anchors requested. ACTION: At any time, a user of the anchoring framework can request all anchors which match the active descriptions; this can happen synchronously or asynchronously. For instance, matching anchors might be requested whenever a certain number of frames have passed. Alternatively, matching anchors might be requested when a new planning action is initiated. In order to satisfy such requests, global data association and global information fusion algorithms are invoked. Own local anchors and buffered local anchors received from other robots are compared, and matching local anchors are fused and stored in newly created global anchors. Previous global anchors are discarded. All local anchors are kept. In figure 4.5 global data association and information fusion steps are shown near the top of the figure. Associations between descriptions and anchors (both local and global) are updated, based on comparisons between them. These associations are shown in the bottom right hand corner of the figure.
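Read as pseudocode, the events above amount to a simple dispatch loop; the sketch below makes that structure explicit. The event field names and the placeholder functions standing in for the anchoring steps are assumptions made purely for illustration.

    def predict_stale_anchors(state):      # placeholder for sections 4.3.4 / 4.4.3
        pass

    def run_local_anchoring(state):        # placeholder for sections 4.3.2 - 4.3.3
        pass

    def run_global_anchoring(state):       # placeholder for sections 4.4.1 - 4.4.2
        return list(state.get("global_anchors", []))

    def handle_event(event, state):
        # Illustrative dispatch of the framework events summarised above.
        kind = event["type"]
        if kind == "descriptions_updated":
            state["descriptions"] = event["descriptions"]
        elif kind == "percept":
            state["percept_buffer"].append(event["percept"])
        elif kind == "local_anchor_received":
            key = (event["robot"], event["index"])
            state["received_anchors"][key] = event["anchor"]   # newer overwrites older
        elif kind == "prediction_cycle_end":
            predict_stale_anchors(state)
        elif kind == "frame_end":
            run_local_anchoring(state)              # local data association + fusion
            state["percept_buffer"].clear()
        elif kind == "anchors_requested":
            return run_global_anchoring(state)      # global data association + fusion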


Figure 4.5: Framework summary which shows the main components and processes of the proposed anchoring framework.


4.8 Discussion This chapter has described the anchoring framework proposed in this thesis. The proposed approach is decentralised, and it allows both single-robot anchoring and cooperative anchoring to be performed in a transparent manner. Local and global anchors are stored and maintained separately; local anchors contain only locally produced information, while global anchors contain information from own information sources and other robots. This separation means that local anchors can safely be exchanged without circular dependencies. Anchor spaces, inspired by conceptual spaces, allow the framework to consider various types of information and domains when managing local and global anchors; anchor spaces are also used to represent various types of descriptions of objects of interest. The use of a common representation facilitates the creation and maintenance of anchors, and it also simplifies the process of associating anchors with descriptions. Local anchors reflect the latest information available from local information sources, and they can be particularly useful for task execution. Global anchors improve the completeness and robustness of available object property estimates, since they include information received from all available sources of information; this makes them particularly useful for building shared representations of the environment. In the ideal case, when all information is successfully exchanged, global anchors are identical across all robots – a situation which can facilitate multi-robot coordination [24, 72]. Otherwise, the framework allows whatever information is successfully exchanged to be used.

Chapter 5

Framework Realisation Part 1: Representations

In the previous chapter the proposed anchoring framework was described in general terms. The framework is quite complex, and it involves a number of components. Many of these could be implemented using a variety of methods. In this chapter a brief overview of the implementation used in this work is presented. The rest of this chapter then describes how information is represented in the implementation. Details about how anchoring processes have been implemented are given in chapter 6. The reader should keep in mind that the choices described here do not detract from the generality of the framework; the presented implementation is merely one way in which the proposed anchoring framework can be realised.

5.1 Implementation Overview This section summarises the presented implementation of the proposed anchoring framework. A number of factors have affected the implementation choices described here; in particular, many choices were made for two reasons: to simplify the implementation, and to facilitate a broad exploration of the problem.

5.1.1 Representations

One of the most significant implementation choices relates to how information is represented in the framework. Anchor spaces are required to handle both high and low level information, as well as various types of uncertainty. Given these requirements, fuzzy sets were chosen as a primary representational tool. Fuzzy sets are well suited for representing various types of information and uncertainty, and they support a wide range of matching and fusion operators. More will be said about the use of fuzzy sets in this work in section 5.2.


As was mentioned earlier, the domains which need to be considered vary depending on the application. In order to simplify the implementation, only two domains are considered: position and colour. Both are important in many robotic applications. The presented implementation is also simplified by the fact that all local anchor spaces and the global anchor space use the same dimensions, units, and coordinate systems. This means that space transformation functions are the identity function, and can be ignored. These choices are discussed in more detail in section 5.3. Conceptual sensor models and grounding functions need to map information to the local anchor space. A number of functions have been implemented which allow various types of information to be represented in this space. The same functions are used for both conceptual sensor models and grounding functions; they are discussed in sections 5.5 and 5.6.

5.1.2 Processes

Self-localisation information is important for most robotic applications; in this work, self-localisation information is mainly needed in order to allow object positions to be determined with respect to a coordinate system which is shared across multiple robots. The approach to self-localisation used in this work is described in section 6.1. The object localisation approach, which uses self-localisation information, is described in section 6.2. Data association has been implemented using a method inspired by global nearest neighbour (GNN) data association approaches. The proposed algorithm considers various types of information spanning multiple domains, and uses names in addition to fuzzy operators to perform matching. The same algorithm is used for both local and global data association. The full method performs a brute-force search, which guarantees that the best match according to the selected matching criteria will be found. This guarantee was important during testing of the overall framework. An approximation of the proposed method is also described, which sacrifices optimality in exchange for a reduction in complexity; this can be necessary for applications in which large numbers of objects need to be considered. Details about the data association implementation are given in section 6.3. The information fusion approach uses fuzzy operators to combine information. Various operators have been tested; the fusion approach is described in section 6.4. The prediction model used in this work is relatively simple. It is assumed that information in the colour domain does not change at all, so this domain is not predicted. For the position domain, it is assumed that all objects can move in any direction at a fixed maximum velocity; current velocities are not used to predict motion. This prediction implementation was used to verify that prediction can be used within the framework. Also, only manual anchor deletion is supported. The implemented approach to prediction is discussed in section 6.5.

5.1.3 Experimental Tool

The proposed framework is functionally decentralised. The implementation, however, is centralised in a monitoring tool, which logs and processes percepts received from multiple robots. This monitoring tool allows various parameters and algorithms to be tested on real and artificial data, both online and offline. Within the tool, information from each robot is maintained separately, and the decentralised nature of the algorithm is preserved. The centralisation of information within the monitoring tool was useful for development and testing. The monitoring tool and the experiments it made possible will be discussed in chapters 7 and 8.

5.2 Information Representation

In the presented implementation of the cooperative anchoring framework, fuzzy sets are used to represent information. Fuzzy sets provide a powerful way to represent information, allowing various types of uncertainty to be taken into account in a convenient and intuitive manner [136]. Also, fuzzy logic provides formal mechanisms for matching and fusing fuzzy sets, and it includes a rich set of operators for these and other operations [19]. Fuzzy logic also lends itself to relatively straightforward and computationally efficient implementations. This section gives a brief overview of fuzzy sets, and it describes how they are used to represent information in the proposed anchoring framework. The reader is referred to one of the many books on fuzzy sets and fuzzy logic for more details (e.g. Klir and Folger [89]).

5.2.1 Fuzzy Sets

Fuzzy sets were proposed by Zadeh [161] as a way to represent non-crisp concepts (e.g. tall, or old) by allowing set elements to have degrees of membership. These degrees of membership are represented by real numbers in the [0, 1] interval. Given an element x belonging to the universal set X, one can denote the degree of membership of the element x to the set described by the property P as µP(x). Mathematically, membership functions are defined as

    µP : X → [0, 1].    (5.1)

A similar definition exists for fuzzy relations, which are fuzzy sets defined over a Cartesian product X1 × X2 × · · · × Xn. For a fuzzy relation R, the level to which elements x1, x2, · · · , xn are in relation R is given by µR(x1, x2, · · · , xn), which is defined as

    µR : X1 × X2 × · · · × Xn → [0, 1].    (5.2)

The original interpretation of degrees of membership was that membership was non-crisp because the concepts themselves were vague. Other useful interpretations have been proposed over the years, and the choice of which interpretation to use will typically depend on the application. In this work, a possibilistic interpretation [162, 52] is used, where µP(x) indicates the degree of possibility that x possesses the property P. For example, if P represents the position of an object, µP(x) is read as “the degree of possibility that the object is at position x”. Note that under this interpretation, low possibility values actually provide more information than high values, since they rule out potential elements. In particular, complete ignorance is represented by a fuzzy set in which all elements have membership values of 1.0 – in other words, all values are equally and fully possible. Conversely, the most informative fuzzy sets are those in which all elements have membership values of 0.0 except for one, which has a membership value of 1.0.

Often, fuzzy sets have a maximum membership value, or height, of 1, and a minimum value, or bias, of 0. Under the possibilistic interpretation used here, a height below 1 indicates that no value is fully possible given the available information, and a bias greater than 0 means that the information is not fully reliable – so all values are possible at least to some degree. The height and bias of a given fuzzy set µ will be denoted height(µ) and bias(µ), respectively.

Figure 5.1: Various types of uncertainty represented using fuzzy sets. Figure from [136], used with permission.

Fuzzy sets can be used to represent several different types of uncertainty. Figure 5.1 illustrates some of these, using a 1D position domain as an example. In (a) the position of the object is known with certainty to be 80; in (b) the position is approximately 80, and it is therefore vague; in (c) the position is between 80 and 160, and it is therefore imprecise; in (d) the position is either 80 or 160, and it is therefore ambiguous; in (e) the position is, with the highest possibility, at 80, but it is also possible that it is elsewhere (e.g. perhaps it was seen at 80 recently, but it might have been moved since then) – this unreliability is represented by setting the bias of the fuzzy set to a non-zero value; finally, in (f), a number of different types of uncertainty are combined. Note that fuzzy sets can represent as special cases crisp values, as in (a), intervals and sets, as in (c) and (d), and parametric shapes, like trapezoids. Also, note that elements can be assigned values independently of each other. This ability to represent information at precisely the level of detail at which it is available is often claimed to be an important property of fuzzy sets.

Sometimes, it is useful to extract a point estimate µ̂P ∈ X from the information contained in a fuzzy set µP. This process is called defuzzification, and it can be performed in a number of ways. One of the most common defuzzification techniques is to use the centre of gravity (CoG) of the fuzzy set µP, which is computed according to:

    µ̂P = ( ∫x∈X x µP(x) dx ) / ( ∫x∈X µP(x) dx ).    (5.3)

Some variations include computing the CoG using only elements which have membership values above a certain threshold; all other values are treated as zero. Another variation involves ensuring that the resulting point estimate corresponds to an element which has a membership value above a certain threshold; in this case, if the membership value at the CoG is below the threshold, the nearest point to the CoG which has a membership value above that same threshold is chosen instead. In this work, the normal CoG is used.
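For a fuzzy set stored as a discretised bin model (described in section 5.2.2), the integrals in equation 5.3 reduce to sums over the bins. The following Python sketch is purely illustrative and is not taken from the thesis implementation; the bin positions, the optional threshold variant, and the example values are assumptions.

```python
def cog_defuzzify(memberships, positions, threshold=0.0):
    """Centre-of-gravity defuzzification of a 1D bin-model fuzzy set.

    memberships[i] is the membership value of the bin centred at positions[i].
    Bins with membership below `threshold` are ignored, mirroring the
    thresholded CoG variant mentioned in the text (threshold=0 gives the
    normal CoG).
    """
    num = 0.0   # approximates the integral of x * mu(x) dx
    den = 0.0   # approximates the integral of mu(x) dx
    for x, mu in zip(positions, memberships):
        if mu < threshold:
            continue
        num += x * mu
        den += mu
    if den == 0.0:
        raise ValueError("cannot defuzzify a fuzzy set with empty support")
    return num / den

# Example: an ambiguous set like figure 5.1(d), with peaks at 80 and 160,
# defuzzifies to a point between the two modes.
xs = list(range(0, 201, 10))
mus = [1.0 if x in (80, 160) else 0.0 for x in xs]
print(cog_defuzzify(mus, xs))   # 120.0
```

As the example shows, the plain CoG of a multi-modal set can fall in a region that is itself impossible, which is exactly the situation the thresholded variants described above are meant to mitigate.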

5.2.2 Implementing Fuzzy Sets

Two common ways to implement fuzzy sets are using the bin model and the parametric model. In the bin model, the universe of discourse is discretised as an array of bins. If a distance metric is available, the bins are usually (but not necessarily) the same size. The value in each bin corresponds to the membership value for that bin. The choice of resolution for such discretised models is crucial in obtaining a balance between accurate representations and low storage and computational requirements.

In the parametric model, a fuzzy set is represented by fixing the parameters of a parametric function. Parametric representations tend to allow more efficient storage and computation; however, they have limited representational power. Trapezoids and Gaussian distributions are examples of parametric representations.

In the presented implementation both bin and parametric models are used, as well as a hybrid model which has both bin and parametric components. Note that if a fuzzy set is used to represent information over a circular space, for example the space of possible orientations, the implementation must take this into consideration.


Figure 5.2: Bin models used to implement fuzzy sets; (a) is a 1D bin model, (b) is a 2D bin model. Both (a) and (b) use a fixed step size. In (b), more possible bins are shown darker, less possible bins are lighter.

Bin Models

Examples of 1D and 2D bin models are shown in figures 5.2(a) and 5.2(b), respectively. Both examples use a fixed step size. A 2D bin model with a fixed step size is sometimes referred to as a grid, and the bins can be referred to as cells. Bins might also be used to implement sample-based models, although this idea is not explored in this work. Bin models are frequently used throughout this work, as they can represent arbitrarily shaped fuzzy sets, albeit at a limited resolution. Fuzzy sets over more than two dimensions can also be represented using bin models; however, such representations quickly become difficult to maintain due to their high requirements in terms of storage and computation.

Parametric Models

Various parametric membership functions are used to implement fuzzy sets. Some of the most commonly used functions are ramp functions, which can be described by the following parameters, shown graphically in figure 5.3:

• Base: The value of the membership function before the ramp; in other words, the value before the ramp is in effect. The base of a ramp fuzzy set µ is denoted base(µ).

• Level: The value of the membership function after the ramp; in other words, the value being set by the ramp. If this value is above the base, then the function is a ramp-up function; if it is below the base, the function is a ramp-down function; if it is equal to the base, the function is a uniform function, which always returns the same value. The level of a ramp fuzzy set µ is denoted level(µ).

• Start: The start of the ramp. The start of a ramp fuzzy set µ is denoted start(µ).

• End: The end of the ramp, which is > the start of the ramp. If this is equal to the start, then the function is a step function. The end of a ramp fuzzy set µ is denoted end(µ).

Figure 5.3: Ramp-up (a) and ramp-down (b) parametric membership functions.

Ramp functions are commonly used in fuzzy logic, although the level and base are usually set to 1 and 0, respectively. Ramp functions can easily be extended to describe multi-dimensional fuzzy sets, by making the start and end parameters include values for each dimension. For instance, instead of just containing a value for x ∈ X, the start and end parameters can also include a value for y ∈ Y. Some sample 2D fuzzy sets which can be described in this way are shown in figure 5.4.

Figure 5.4: Examples of 2D fuzzy sets which can be described using ramp functions. More possible values are shown darker, less possible values are lighter. The most possible values in the figures are 0.8, the least possible values are 0.2.

Another parametric function often used to describe fuzzy sets is the trapezoid, shown in figure 5.5(a); trapezoids can also be inverted, as in figure 5.5(b). Trapezoids and inverted trapezoids can both be described by the following parameters, shown graphically in both figures:

• Base: The value of the membership function at all points outside the trapezoid. The base of a trapezoidal fuzzy set µ is denoted base(µ).

• Level: The value of the membership function which is farthest from the base; in other words, the level being set by the trapezoid. If this value is below the base, then the trapezoid is inverted. The level of a trapezoidal fuzzy set µ is denoted level(µ).

• Core: The width of the set of values at the level of the trapezoid. A wider core means a less precise fuzzy set. The core of a trapezoidal fuzzy set µ is denoted core(µ).

• Support: The width of the set of values which are inside the trapezoid. Note that the support must contain the core. The larger the difference between the core and the support of a trapezoid, the more vague the fuzzy set is. The support of a trapezoidal fuzzy set µ is denoted support(µ).

• Centre: The middle of the set of values at the level of the trapezoid (i.e. the middle of the core). The centre of a trapezoidal fuzzy set µ is denoted centre(µ).

• Support-centre: The middle of the set of values which are inside the trapezoid (i.e. the middle of the support). Note that the support-centre and support of a trapezoid must be set so that the support contains the core. Often, symmetric trapezoids are used, in which case the support-centre is equal to the centre. The support-centre of a trapezoidal fuzzy set µ is denoted support-centre(µ).

Figure 5.5: Parametric trapezoidal membership functions used to implement fuzzy sets; the trapezoid in (b) is inverted.

Trapezoidal fuzzy sets are often used in fuzzy logic, although inverted trapezoids are not commonly used; also, the level is usually 1 and the base is usually 0. Trapezoids are also usually symmetric, meaning that the centre and support-centre are equal; in this work, symmetric trapezoids are used, so the support-centre parameter is omitted. Notice that a single trapezoidal membership function can describe all of the uni-modal uncertainty types illustrated in figure 5.1, as well as many others.

Trapezoidal fuzzy sets can also be extended to multi-dimensional spaces, by extending some of the parameters to include values for each dimension. For example, a 2D trapezoid, or pyramidal frustum, can be implemented simply by extending the normal trapezoid definition so that the centre, core, support, and support-centre are 2D values. Such a function, and its inverted counterpart, are shown in figures 5.6(a) and 5.6(b), respectively. Another approach is to implement a 2D cone, or conical frustum, in which case only the centre and support-centre need to be 2D values; in this case the core and support are the diameters of the circles which make up the cone. Examples are shown in figures 5.6(c) and 5.6(d).

Figure 5.6: Extending trapezoidal membership functions to 2D. More possible values are shown darker, less possible values are lighter. The most possible values in the figures are 0.8, the least possible values are 0.2.

Multi-modal parametric fuzzy sets can also be described using the above functions. To do this, a common base value is set, which indicates the value of the fuzzy set in all areas not explicitly set by one of the modes. Each mode is then described by its type and all remaining parameters. Areas affected by more than one mode can be assigned values which reflect the combination of the values in the relevant modes; combination operators will be discussed in section 5.2.3. Some examples of such multi-modal fuzzy sets are shown in figure 5.7.


Figure 5.7: Examples of 1D (a) and 2D (b) multi-modal parametric functions.
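To make the trapezoid parameters concrete, the sketch below evaluates a symmetric trapezoidal membership function from its centre, core, support, level and base, as defined above. It is an illustrative sketch only, covering the symmetric case (support-centre equal to centre) over a non-circular dimension; the function name and the example values are assumptions.

```python
def trapezoid_membership(x, centre, core, support, level=1.0, base=0.0):
    """Symmetric trapezoidal membership function (see figure 5.5).

    Returns `level` inside the core, `base` outside the support, and a
    linear transition in between.  An inverted trapezoid is obtained simply
    by choosing level < base.  Circular dimensions are not handled here.
    """
    d = abs(x - centre)
    if d <= core / 2.0:
        return level
    if d >= support / 2.0:
        return base
    frac = (d - core / 2.0) / (support / 2.0 - core / 2.0)
    return level + frac * (base - level)

# A vague position "approximately 80", as in figure 5.1(b):
print(trapezoid_membership(80.0, centre=80.0, core=10.0, support=40.0))  # 1.0
print(trapezoid_membership(95.0, centre=80.0, core=10.0, support=40.0))  # 0.33...
```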

Hybrid Models

In some situations it can be useful to implement hybrid models, which combine bin and parametric models. In this work one such hybrid model is used; this model is called a 2.5D grid. In this hybrid model, a 2D bin model is maintained, as described above. However, instead of containing a single number corresponding to a membership value, each bin contains the parameters of a single symmetric trapezoidal membership function, which describes a uni-modal estimate of a third dimension. The motivation for such a representation is that it allows information over a 3D space to be represented using far less storage and computation than would be needed for a 3D grid. The price is that information in the third dimension can only be represented using uni-modal parametric fuzzy sets, one for each cell. This approach was originally proposed as a means to represent pose information by Buschka et al [31]. In this work 2.5D grids are used to represent both pose and colour information, as will be discussed in sections 5.3 and 6.1.

A sample hybrid model, used to represent a robot's pose, is shown in figure 5.8. In this example, the fuzzy set, denoted µ, is over the 3D space of possible poses. A given pose is denoted (x, y, φ), where (x, y) is the robot's position and φ is its orientation. In order to compute the membership value µ(x, y, φ) of a given pose (x, y, φ), one must first determine which cell, denoted c, corresponds to the position (x, y). The value of the trapezoidal membership function contained therein is then read at φ. This trapezoid is denoted µ(x, y) or µ(c). The level of the trapezoid reflects the possibility of the value (x, y) being the true position, disregarding orientation φ. The other parameters describe the possible orientations of the robot, assuming that it is located in cell c. Note that the orientation dimension is circular; the implementation of the trapezoidal membership functions must take this into account.


Figure 5.8: Example of a hybrid 2.5D grid, where each cell in a 2D grid contains a trapezoidal membership function which gives a uni-modal estimate of a third dimension.
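The following is a minimal sketch of how a membership value µ(x, y, φ) could be read out of such a 2.5D grid, with wrap-around handling of the circular orientation dimension. The class layout, cell-size convention and parameter names are assumptions made for illustration, not the thesis implementation.

```python
import math

def circular_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = (a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

class Grid25D:
    """Minimal 2.5D grid: each (x, y) cell holds one symmetric trapezoid over
    a circular third dimension (orientation here, hue in section 5.3.2).
    Cell trapezoids are dicts with keys level, base, centre, core, support."""

    def __init__(self, width, height, cell_size, default):
        self.cell_size = cell_size
        self.cells = [[dict(default) for _ in range(width)] for _ in range(height)]

    def cell(self, x, y):
        return self.cells[int(y // self.cell_size)][int(x // self.cell_size)]

    def value(self, x, y, phi):
        t = self.cell(x, y)
        d = circular_diff(phi, t['centre'])
        if d <= t['core'] / 2.0:
            return t['level']
        if d >= t['support'] / 2.0:
            return t['base']
        frac = (d - t['core'] / 2.0) / (t['support'] / 2.0 - t['core'] / 2.0)
        return t['level'] + frac * (t['base'] - t['level'])

# Complete ignorance: every pose fully possible (level 1, core and support 2*pi).
unknown = {'level': 1.0, 'base': 0.0, 'centre': 0.0,
           'core': 2.0 * math.pi, 'support': 2.0 * math.pi}
grid = Grid25D(width=10, height=10, cell_size=0.5, default=unknown)
print(grid.value(1.2, 3.4, math.pi / 3))   # 1.0
```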

5.2.3 Operations On Fuzzy Sets

There are a number of operations commonly performed on fuzzy sets; some of the most common are intersection, union, and complementation. These are normally defined as

    µP∩Q(x) = µP(x) ⊗ µQ(x)    (5.4)
    µP∪Q(x) = µP(x) ⊕ µQ(x)    (5.5)
    µPᶜ(x) = 1 − µP(x)    (5.6)

where ⊗ denotes a triangular norm, or t-norm, and ⊕ denotes the corresponding t-conorm [156]. T-norms are binary operators which are commutative, associative, non-decreasing (if x ≤ y then x ⊗ z ≤ y ⊗ z) and have 1 as neutral element (x ⊗ 1 = x). The most commonly used t-norms are the minimum, the product, and the Łukasiewicz operator, computed as max(x + y − 1, 0). T-conorms are binary operators which are commutative, associative, non-decreasing and have 0 as neutral element. Common t-conorms include the maximum and the bounded sum, computed as min(x + y, 1).

Various averaging operators can also be applied to fuzzy sets. Averaging operators act as aggregation operators, which produce a result which is in between the minimum and maximum. These are not t-norms or t-conorms, and they do not compute intersections or unions of fuzzy sets; rather, they aggregate the information contained in the inputs. In general, averaging results in a sort of compromise or trade-off between what is believed by all sources, while an intersection reflects a consensus or agreement [19].

Implementations of these operations are normally straightforward and computationally efficient. A book by Klir and Folger [89] examines a wide variety of operations which can be performed on fuzzy sets.
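As a small illustration of equations 5.4 to 5.6, the sketch below applies a few common t-norms and t-conorms pointwise to bin-model fuzzy sets. It is illustrative only and assumes both inputs are defined over the same bins; it is not the thesis implementation.

```python
def t_norm_min(a, b): return min(a, b)
def t_norm_product(a, b): return a * b
def t_norm_lukasiewicz(a, b): return max(a + b - 1.0, 0.0)

def t_conorm_max(a, b): return max(a, b)
def t_conorm_bounded_sum(a, b): return min(a + b, 1.0)

def intersection(mu_p, mu_q, t_norm=t_norm_min):
    """Pointwise intersection of two bin-model fuzzy sets (equation 5.4)."""
    return [t_norm(p, q) for p, q in zip(mu_p, mu_q)]

def union(mu_p, mu_q, t_conorm=t_conorm_max):
    """Pointwise union (equation 5.5)."""
    return [t_conorm(p, q) for p, q in zip(mu_p, mu_q)]

def complement(mu_p):
    """Pointwise complement (equation 5.6)."""
    return [1.0 - p for p in mu_p]

mu_p = [0.0, 0.5, 1.0, 0.5]
mu_q = [0.2, 1.0, 0.8, 0.0]
print(intersection(mu_p, mu_q))   # [0.0, 0.5, 0.8, 0.0]
print(union(mu_p, mu_q))          # [0.2, 1.0, 1.0, 0.5]
print(complement(mu_p))           # [1.0, 0.5, 0.0, 0.5]
```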


Matching Fuzzy Sets

In this work it is often useful to determine how well two fuzzy sets match. A number of matching operators could be used; two of the most common ones have been implemented in this work [89, 37]. The first can be described as the maximum value of the intersection. Formally, if µP and µQ are two fuzzy sets over the domain X, and x is a value in X, a match value can be computed according to the following:

    match1(µP, µQ) = sup_{x∈X} (µP(x) ⊗ µQ(x)),    (5.7)

where sup denotes the supremum (least upper bound) operator, and ⊗ is a t-norm. Note that this equation depends only on the existence of common elements in the input fuzzy sets, and does not try to characterise how similar the fuzzy sets are.

The second matching operator is a measure of how much the fuzzy sets overlap, and it can be defined as:

    match2(µP, µQ) = ( ∫x∈X (µP(x) ⊗ µQ(x)) dx ) / min( ∫x∈X µP(x) dx, ∫x∈X µQ(x) dx ),    (5.8)

where ⊗ is again a t-norm. This equation measures what percentage of the smallest of the input fuzzy sets is contained in the intersection of the fuzzy sets. The minimum t-norm is normally used for both matching operators, since its result is always independent of the number of items being matched (this is not true for the product, for instance). Both matching operators will give a number in [0, 1] which describes how well the fuzzy sets match. The extension of these matching operators to consider an arbitrary number of input fuzzy sets is straightforward.

The choice of operator will depend on the application. In general, equation 5.8 is more discriminating, since it describes how much the inputs must agree; equation 5.7 is slightly faster to compute – it describes how much the inputs might agree. Figure 5.9 shows the results of matching a number of sample 1D fuzzy sets using both matching operators.

Fusing Fuzzy Sets

Intersection can be used to combine or fuse the information contained in two fuzzy sets. For example, if µP and µQ represent two fuzzy sets which contain information about the position of an object according to two different sources, the fusion of those percepts can be computed using:

    fuse(µP, µQ) = µP∩Q = µP ⊗ µQ.    (5.9)

The extension of this definition which considers an arbitrary number of input fuzzy sets is straightforward.
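For bin-model fuzzy sets, the supremum and the integrals in equations 5.7 and 5.8 reduce to a maximum and to sums over the bins, and fusion (equation 5.9) is simply a pointwise t-norm. The sketch below is illustrative only, assumes uniformly sized bins shared by both sets, and uses made-up example values.

```python
def match1(mu_p, mu_q):
    """Equation 5.7: height of the intersection (minimum t-norm)."""
    return max(min(p, q) for p, q in zip(mu_p, mu_q))

def match2(mu_p, mu_q):
    """Equation 5.8: share of the smaller fuzzy set covered by the intersection."""
    inter = sum(min(p, q) for p, q in zip(mu_p, mu_q))
    smaller = min(sum(mu_p), sum(mu_q))
    return inter / smaller if smaller > 0.0 else 0.0

def fuse(mu_p, mu_q, t_norm=min):
    """Equation 5.9: pointwise intersection used as fusion."""
    return [t_norm(p, q) for p, q in zip(mu_p, mu_q)]

# Two overlapping sets: some values are shared (match1 = 1.0), but only part
# of the smaller set lies in the overlap (match2 = 0.6).
a = [0.0, 0.5, 1.0, 1.0, 0.5, 0.0]
b = [0.0, 0.0, 0.0, 1.0, 1.0, 0.5]
print(match1(a, b), match2(a, b))
print(fuse(a, b))   # [0.0, 0.0, 0.0, 1.0, 0.5, 0.0]
```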

Figure 5.9: Matching fuzzy sets using equations 5.7 (match1) and 5.8 (match2). The filled-in area represents the intersection of the fuzzy sets according to the minimum t-norm, which is used in both matching operators. The four example pairs yield (match1, match2) values of (0.000, 0.000), (0.500, 0.175), (1.000, 0.700), and (1.000, 1.000).

Different t-norms will cause the fusion operator to behave differently. The choice of operator is typically domain dependent. For instance, the minimum t-norm is idempotent (so min(x, x) = x), which means that fusing the same information multiple times has the same effect as fusing it once. Therefore, this operator is often used when the independence of sources cannot be assumed. The product t-norm, on the other hand, should be used when independence is granted, since it acts as a reinforcing operator; specifically, it reinforces belief in values which are deemed possible by all sources [16].

As an example, imagine two sources which both report that µP(x) = 0.5 and µP(y) = 1.0. If the sources are dependent, then they are potentially merely repeating the same information. For instance, the information might reflect the beliefs of two people who have read about an event in the same newspaper. In this case, the information should be combined using the idempotent minimum t-norm, which will result in the combined belief being the same as the inputs, µP(x) = 0.5 and µP(y) = 1.0. Since the sources are not independent, there is no reinforcement effect, even though the sources agree.

On the other hand, if the sources are independent, then they are not merely repeating the same information. For instance, the information might reflect the beliefs of two people who have both witnessed an event.


Figure 5.10: Fusing fuzzy sets using intersection (the minimum t-norm) yields a consensus between sources. The peak of the intersection, shown as the filled in region, always corresponds to the point of maximum agreement between the input fuzzy sets.

In this case, a reinforcing t-norm like the product can be applied; this will result in a combined belief of µP(x) = 0.25 and µP(y) = 1.0. This belief is stronger than any of the individual ones, because it narrows the set of possibilities more sharply. In this sense, the two opinions have been reinforced.

There are two things to note about fusing fuzzy sets using intersection. First, only values which are regarded as possible by all sources are retained in the result, which is why an intersection reflects a consensus between sources. This can be seen in figure 5.10, where two fuzzy sets are fused using the minimum t-norm. Notice that the peak of the resulting fuzzy set always reflects the point of maximum agreement between sources.

The second fact to note is that the intersection automatically discounts unreliable information. Consider figure 5.11. The information represented by µQ, drawn as a dashed line, has a high bias (0.8), indicating that this information is rather unreliable; the information represented by µP, drawn as a solid line, only has a small bias (0.1), and is therefore much more reliable. Correspondingly, the result of the fusion – again computed using the minimum t-norm – is similar to the information in µP; it is only marginally influenced by µQ. In practice, this means that this approach to fusion automatically reduces the impact of unreliable information, as long as this unreliability is correctly represented.

It should be noted that operations on trapezoidal fuzzy sets do not normally result in another trapezoid. In this work the result of such operations is represented using a trapezoidal envelope of the result. This approach introduces some extra uncertainty, but it allows the result of the intersection to be stored back into a single trapezoid. An example of this is shown in figure 5.12.


Figure 5.11: The intersection of fuzzy sets discounts unreliable information. The information in µQ (dashed line) is unreliable, as indicated by its high bias; the information in µP (solid line) is much more reliable, as shown by its low bias. Correspondingly, µQ only has a small influence on the result of the fusion.

Normalisation

In some cases, the result of fusing fuzzy sets is a fuzzy set in which no elements are fully possible. This means that the fused information was to some degree inconsistent. Often, however, it is known that some value must be true; this is the case, for instance, for a fuzzy set describing the position of an object which is known to be in the region described by the fuzzy set. In such cases the resulting fuzzy set can be normalised, such that the most possible values become fully possible.

There are two common ways to perform this normalisation. A multiplicative normalisation involves dividing all membership values by the highest membership value. This effectively stretches the fuzzy set upwards until the most possible values are fully possible (have membership values equal to 1). An additive normalisation can also be performed, in which all membership values are shifted upwards until the most possible values are fully possible. Both normalisation methods have been implemented, but the additive approach is normally used in this work. When normalising a fuzzy set implemented as a 2.5D grid (described in section 5.2.2), both the level and base of the trapezoids in each cell are stretched or shifted, until at least one cell has a level of 1.

Both normalisation methods raise the bias of the fuzzy set, which corresponds to decreasing the reliability of the information. This is consistent with the fact that the inputs used to create the result were somehow inconsistent.
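The two normalisation variants are simple to state for a bin-model fuzzy set; the sketch below is illustrative only, with made-up example values, and omits the 2.5D-grid case described above.

```python
def normalise_additive(memberships):
    """Shift all membership values up so the highest becomes 1 (additive
    normalisation, the variant normally used in this work)."""
    shift = 1.0 - max(memberships)
    return [min(m + shift, 1.0) for m in memberships]

def normalise_multiplicative(memberships):
    """Divide by the highest membership value (multiplicative normalisation)."""
    peak = max(memberships)
    if peak == 0.0:
        raise ValueError("cannot normalise an everywhere-impossible fuzzy set")
    return [m / peak for m in memberships]

# A fused result in which no value was fully possible (height 0.6):
fused = [0.1, 0.4, 0.6, 0.3]
print(normalise_additive(fused))        # [0.5, 0.8, 1.0, 0.7]
print(normalise_multiplicative(fused))  # [0.166..., 0.666..., 1.0, 0.5]
```

Note how both variants raise the minimum value of the set (its bias), in line with the observation above that normalising an inconsistent fusion result reduces the reliability of the information.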


Figure 5.12: An example of taking the outer envelope (the thick line) of the result of fusing two trapezoids (the filled area), in order for the result to be another trapezoid.

5.3 Domain Choices

The choice of which domains should be used in the local and global anchor spaces will normally depend on the application. Although many possible domains could be used, in the current implementation two domains which are commonly used in robotics applications are considered: position and colour.

An object's position is often its most important property. Assuming objects are solid and convex, an object's 3D position uniquely identifies it. And even when positions are not unique, information in the position domain can still be very useful in distinguishing objects from one another. Position information is also crucial for nearly all actions which can be performed with respect to an object. Moreover, position information can be obtained using a wide range of sensors, and it is often used in both low-level and high-level processes.

Colour is also an important and salient property in many applications; many objects can be distinguished from one another using colour. Colour is also commonly used by humans to describe objects. Vision systems, which are popular in robotic systems, can usually extract, among other things, both position and colour information about observed objects.

The two implemented domains are enough to demonstrate how both perceptual and non-perceptual information, arriving from high-level and low-level information sources across multiple robots, can be compared and combined to perform anchoring. Other domains such as shape and texture are used in examples and illustrations in this work; these are not part of the implementation.

5.3.1 Common Local and Global Anchor Spaces

Recall from section 4.4 that a space transformation function f^m is used to convert information from robot r^m's local anchor space C^m to the global anchor space C, which is the space used to represent shared information. In the implementation described here, the domains, dimensions and units used in the global space C are exactly the same as those used in all local spaces C^m. This implies that the space transformation functions f^m are all the identity function, and they can be ignored. This choice simplifies the implementation, by having information sources map percepts directly into the format used both locally and for exchanging information.

5.3.2 Dimensions and Coordinate Systems

Position Domain

Position information about objects is represented using two dimensions, corresponding to globally defined 2D Cartesian coordinates, in metres. The representation is realised using a 2D fuzzy set implemented using a bin model with a fixed step size. In other words, position information is stored in a 2D grid, where each cell corresponds to a region in the real world. The value in each cell is a floating point number in the range [0, 1], corresponding to the possibility that the object in question is located in that cell. For a given position (x, y), the possibility that the object is at that position can be computed by looking up the cell in which the position lies, and reading the membership value in that cell.

Colour Domain

Colour information about objects is represented using the HSV colour space, which uses three dimensions: hue, saturation, and value. The values are normalised to be in [0, 1]. A 3D fuzzy set is used to represent colour information. This fuzzy set is implemented using a 2.5D grid, as described in section 5.2.2. The saturation and value components of colour information are represented using a 2D bin model, like the one used for position information. However, instead of containing a single floating point number, each cell contains the parameters of a symmetric trapezoidal membership function, which indicates the possible hue values for that cell. Note that the hue dimension is circular; the implementation takes this into account.

Hue was chosen as the dimension to be stored using the trapezoids in the 2.5D grid because hue, like orientation, can be represented using the circular angle dimension. This meant that the same 2.5D grid implementation used for self-localisation (see section 6.1) could also be used for hue. This simplified the implementation significantly. Otherwise, the value dimension would have been represented using trapezoids, since this dimension is typically less informative than the hue dimension.

For a given colour (h, s, v), represented by a 2.5D fuzzy set µ, the value of µ(h, s, v) reflects the possibility that the object in question is that colour. This value is computed by first finding the cell which corresponds to the (s, v) pair, and then reading the value of the trapezoidal membership function contained in that cell at h.


Other Domains

Although they are not implemented, shape and texture domains are used in some examples and experiments. These are represented using 1D fuzzy sets, implemented as bin models, where each bin corresponds to a specific shape or texture. Note that there is no distance metric in these domains.

5.4 Descriptions

Descriptions are represented in the same anchor space as percepts and anchors, using two grids: a 2D position grid and a 2.5D colour grid. Descriptions can be added, removed, or modified at any time. Arriving percepts and local anchors are discarded if they do not match any positive description, or if they match a negative description. If no positive descriptions exist, a default positive description which accepts everything is assumed.

Associations between global anchors and descriptions are updated whenever descriptions are updated, and whenever global information fusion is performed. These associations are computed by performing matching between each description and each global anchor, using the match1 operator, described in section 5.2.3. If the match value is greater than a certain threshold, an association is created. Associations between descriptions and local anchors are not computed in the presented implementation. Table 5.1 lists how the different types of descriptions are used, for both anchoring and interest filtering.

Table 5.1: Positive and negative descriptions and their uses.

                      Positive                                               Negative
Definite, named       anchoring and interest filtering, e.g. “cup-22”        interest filtering (match name only), e.g. “cup-22”
Definite, unnamed     anchoring and interest filtering, e.g. the green cup
Indefinite, named     anchoring and interest filtering, e.g. green cups
Indefinite, unnamed   interest filtering, e.g. green cups
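As described above, associations between descriptions and global anchors are created whenever the match1 value exceeds a threshold. The sketch below is a simplified, single-domain illustration of that step; the threshold value, the dictionary layout and the example data are assumptions, and the real implementation matches both the position and the colour grids.

```python
def match1(mu_p, mu_q):
    # Height of the intersection, as in equation 5.7 (minimum t-norm).
    return max(min(p, q) for p, q in zip(mu_p, mu_q))

def associate(descriptions, anchors, threshold=0.5):
    """Link descriptions to the global anchors they match (section 5.4).
    Inputs map identifiers to bin-model fuzzy sets over a shared domain."""
    return [(d, a)
            for d, d_mu in descriptions.items()
            for a, a_mu in anchors.items()
            if match1(d_mu, a_mu) > threshold]

descriptions = {"the-green-cup": [0.0, 1.0, 1.0, 0.0]}
anchors = {"anchor-7": [0.0, 0.2, 0.9, 1.0], "anchor-8": [1.0, 0.1, 0.0, 0.0]}
print(associate(descriptions, anchors))   # [('the-green-cup', 'anchor-7')]
```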


5.5 Grounding Functions

Recall from section 4.5 that a grounding function g^m_u maps information from a custom format into the local anchor space C^m. The information to be mapped can span both position and colour domains. In this work, the same underlying operations are used by both grounding functions and conceptual sensor models. The various information types which can be mapped to the local anchor space will be described in detail in section 5.6. The information types mentioned there can all be fed to grounding functions in order to create descriptions, represented by two grids: a 2D position grid and a 2.5D colour grid. Note that a description can contain information in one or both domains.

5.6 Conceptual Sensor Models

Recall from section 4.3 that a conceptual sensor model h^m_k is used to map percepts produced by robot r^m's information source s^m_k from a source-specific format into the local anchor space C^m, taking uncertainty into account. Conceptual sensor models include conversions for each domain in which the corresponding information source can produce information. Information sources can produce percepts containing information in one or both domains.

The output of each conceptual sensor model, in response to an input percept z^m_k[j], is that same percept, converted into two grids: a 2D grid for the position domain, denoted z^m_{k,POS}[j], and a 2.5D grid for the colour domain, denoted z^m_{k,COL}[j]. One of the two output grids may be empty, since percepts need not contain both position and colour information.

Some of the implemented conceptual sensor models take self-localisation information into account in order to compute position grids. A robot's self-localisation estimate is denoted r^m_SELF, and it can be represented in a number of ways. Self-localisation will be discussed in section 6.1. The only requirement from conceptual sensor models is that the estimate r^m_SELF be rich enough to allow a robot to estimate, at any time, how much it believes a given pose (x, y, φ) to be its true pose. Recall that φ denotes robot orientation in the global coordinate system.

Conceptual sensor models are typically source specific; those presented here can be divided into two main categories: symbolic and numeric. In the rest of this section the various types of information produced by information sources will be examined; for each type of information, the creation of the position and colour grids z^m_{k,POS}[j] and z^m_{k,COL}[j] will be described. Position and colour domains are discussed separately; the reader should keep in mind that a single percept can contain both position and colour information.


5.6.1 Symbolic Conceptual Sensor Models

Symbolic conceptual sensor models are functions which map symbolic information to one or more regions in the local anchor space. Objects can be described using symbols such as {red}, {in-kitchen}, or {near-wall}. For instance, a virtual sensor might be able to detect that an object is {in-room-12}, or a sensor might simply be able to sense that an object is {close}. Functions called predicate grounding relations [38] have been used in previous anchoring frameworks to map symbols to numeric values. The symbolic conceptual sensor models implemented in this work are described here. Many other types of symbolic information could be considered using similar techniques.

Symbolic Position Information

Information type: {in-region}

Two types of symbolic position information are considered. The first type is {in-region} information, where an information source produces a set of region names and a flag indicating whether the object in question is inside or outside all the given regions, e.g., {in-region, {kitchen, bedroom}, inside}. Regions can be defined a priori or during execution. Regions typically describe rooms, tables, or otherwise distinct areas in the world. Crisply defined regions are used; they can be either rectangular or circular.

So for a given item of {in-region} information, each cell in the output position grid z^m_{k,POS}[j] is assigned a value of either 1 or 0, depending on two things: whether the cell is inside or outside any of the given regions, and whether the flag indicates that the object is inside or outside these regions. In practice the regions are first combined using a union operation, in this case via the maximum operator. If the flag indicates that the object is not inside the regions, the result of the union is complemented. The result is then copied to the output position grid. This type of information might be used, for instance, by information sources able to detect that an object is in a particular room, or on a particular table. Example grids are shown in figure 5.13. In (a) the information says that the object is inside the given regions; in (b) the object is outside these same regions.

Figure 5.13: Position grids showing information about an object which is inside (a) and outside (b) two given regions. Darker areas in the figures indicate possible positions.
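A minimal sketch of how such an {in-region} position grid could be built is given below. It is illustrative only: it assumes a grid whose origin is at (0, 0), handles only rectangular regions, and uses an invented function name and example regions; circular regions and the actual grid conventions of the implementation are not covered.

```python
def in_region_grid(width, height, cell_size, regions, inside=True):
    """Build a 2D position grid for {in-region} information.

    `regions` is a list of axis-aligned rectangles (xmin, ymin, xmax, ymax)
    in metres.  Cells inside the union of the regions get membership 1
    (union via the maximum); the grid is complemented when the flag says
    the object is outside the regions, as in figure 5.13(b).
    """
    grid = []
    for row in range(height):
        cells = []
        for col in range(width):
            # centre of the cell in world coordinates
            x = (col + 0.5) * cell_size
            y = (row + 0.5) * cell_size
            member = any(xmin <= x <= xmax and ymin <= y <= ymax
                         for (xmin, ymin, xmax, ymax) in regions)
            value = 1.0 if member else 0.0
            cells.append(value if inside else 1.0 - value)
        grid.append(cells)
    return grid

# Object reported to be inside the kitchen or the bedroom:
kitchen = (0.0, 0.0, 3.0, 4.0)
bedroom = (5.0, 0.0, 8.0, 3.0)
grid = in_region_grid(width=20, height=10, cell_size=0.5,
                      regions=[kitchen, bedroom], inside=True)
```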

Information type: {near-self}

The second type of symbolic position information is {near-self} information, where an information source simply produces a flag which indicates whether the object in question is "near" or "not near" the observing robot, e.g., {near-self, yes}. A fuzzy set describing the property {near} is defined, either a priori or during execution, using a ramp-down function, as described in section 5.2.2. The output position grid z^m_{k,POS}[j] is then produced by creating a blurred copy of the self-localisation grid r^m_SELF, where the amount of blurring is based on the definition of the fuzzy set for {near}. The orientation information contained in r^m_SELF is ignored. Blurring is implemented using a fuzzy morphological dilation operation [21], similar to dilation in image processing. If the flag from the information source indicates that the object is not near the robot, the blurred grid is complemented. This information might be used, for instance, if a virtual sensor or RFID reader is able to detect the presence, but not the position, of an object. An example is shown in figure 5.14. In the figure, the self-localisation estimate is shown in (a); an object which is "near self" (within one metre, in this case) is shown in (b); an object which is "not near self" (again within one metre) is shown in (c).

Figure 5.14: An example self-localisation estimate is shown in (a). Corresponding near self (b) and not near self (c) information are also shown. Dark regions correspond to possible positions.

Symbolic Colour Information

Information type: {is-colour}

Only one type of symbolic colour information is considered: {is-colour} information. In this case an information source produces a set of colour names and a flag indicating whether the object in question is one of the given colours, or not. For example, an information source might report that an object is {red}. Alternatively, a source might know that an object is not {blue} and not {green}. As an example, this type of information could be produced by an RFID reader, based on colour information read from an RFID tag.


Figure 5.15: Colour grids, drawn as hue-saturation circles. Darker areas indicate possible colours. In (a), the object in question is red; in (b), the object is not red and not green.

Colour names can be defined a priori or during execution. Each colour is defined by three symmetric trapezoidal fuzzy sets, which describe the colour in the three dimensions of the HSV colour space: hue, saturation, and value. The set of colours contained in a given item of {is-colour} information will be denoted Λ. For a given colour λ in the set Λ, the trapezoids for hue, saturation, and value will be denoted µH[λ], µS[λ], and µV[λ], respectively.

The set of colours Λ is transformed into the 2.5D colour grid z^m_{k,COL}[j] as follows. Recall that each cell in the 2.5D grid corresponds to a saturation-value pair (s, v). Each (s, v) cell contains a trapezoid, denoted z^m_{k,COL}[j](s, v), which represents the possible hue values for that cell. The level of the trapezoid in each cell reflects the possibility of that cell containing the colour of the object in question, disregarding hue. This means the level in a given cell will be high as long as its saturation and value components are consistent with at least one of the colours in Λ. The level is computed by:

    level(z^m_{k,COL}[j](s, v)) = sup_{λ∈Λ} { µS[λ](s) ⊗ µV[λ](v) },    (5.10)

where ⊗ is the minimum t-norm. The rest of the parameters of the trapezoid z^m_{k,COL}[j](s, v) are computed by finding the smallest trapezoidal envelope which contains all hue trapezoids µH[λ] for colours λ in Λ. This envelope is computed by finding the outer envelope of the union (computed using the maximum operator) of the hue trapezoids for each colour. The grid is then inverted if the flag indicates that the object does not have any of the given colours. Inverting the grid in this case means simply inverting the trapezoids in each cell.

In figure 5.15, two example colour grids are shown; the dark areas indicate the possible colours in a hue-saturation circle. In (a), the grid is for a red object; in (b), the grid is for an object which is not red, and not green.
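The per-cell level computation of equation 5.10 can be sketched as follows. The colour definitions and parameter values below are invented for illustration, and the hue-envelope step (and the inversion for negative flags) is omitted; this is not the thesis implementation.

```python
def trapezoid(x, centre, core, support, level=1.0, base=0.0):
    # symmetric trapezoidal membership (non-circular dimension)
    d = abs(x - centre)
    if d <= core / 2.0:
        return level
    if d >= support / 2.0:
        return base
    frac = (d - core / 2.0) / (support / 2.0 - core / 2.0)
    return level + frac * (base - level)

def cell_level(s, v, colours):
    """Equation 5.10: level of the hue trapezoid stored in cell (s, v).

    `colours` maps a colour name to its saturation and value trapezoid
    parameters, each given as (centre, core, support).  The supremum over
    colours and the minimum t-norm follow the text.
    """
    best = 0.0
    for sat_params, val_params in colours.values():
        mu_s = trapezoid(s, *sat_params)
        mu_v = trapezoid(v, *val_params)
        best = max(best, min(mu_s, mu_v))   # sup over colours, min t-norm
    return best

colours = {
    "red":   ((0.8, 0.2, 0.4), (0.7, 0.3, 0.5)),
    "green": ((0.9, 0.1, 0.3), (0.6, 0.2, 0.4)),
}
print(cell_level(0.85, 0.65, colours))   # 1.0 for this (s, v) pair
```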

5.6.2 Numeric Conceptual Sensor Models

Numeric conceptual sensor models are functions which map numeric information to one or more regions in the local anchor space. Numeric information might consist of range-bearing observations in the position domain, or observed HSV values in the colour domain. The numeric conceptual sensor models implemented in this work are described here. Again, many other types of information could be considered using similar techniques.

Numeric Position Information

Information type: {near-position}

Two types of numeric position information are considered. The first is called {near-position} information; in this case, an information source produces a set of positions corresponding to possible object locations. A base value is provided which is used for cells not affected by any of the specified positions. Each position is described using several parameters. First, (x, y) coordinates for the centre of each position are given. The level of each position is also given; it can be higher or lower than the base value. The shape of each mode is either a 2D trapezoid or a 2D cone, as described in section 5.2.2. Rather than using a fixed definition of "nearness", the full shape of each mode is provided by the information source. The produced set of positions is used to compute possibility values for each cell in the position grid. If the same cell is affected by more than one mode, the maximum value is taken.

This type of information is particularly flexible, and it can be used in a number of contexts. For instance, such information might correspond to sensor observations made directly in the global coordinate system (e.g. an object might be detected "within one metre of position (3, 5)"). Note that this type of information could also be used to represent fuzzy sets which have the same shape as those resulting from symbolic {in-region} information. Figure 5.16 shows some examples of grids produced using this type of information. Note that both uni-modal and multi-modal grids can be created; also, modes can be higher or lower than the base value, which is used in cells not affected by any of the specified modes.

Information type: {observed-range-bearing}

The second type of numeric position information implemented in this work is called {observed-range-bearing} information; in this case, an information source indicates that an object has been observed at a given range and bearing, in local polar coordinates. This type of information is produced by many different types of sensors in robotic systems. The mapping of this information into the local anchor space is not as straightforward as for other types of information.


Figure 5.16: Position grids created using near-position information.

In order to be represented in the local anchor space, the arriving information needs to be transformed from local polar coordinates to global Cartesian coordinates. The transformation is complicated by two separate sources of uncertainty: uncertainty in the observation itself, and uncertainty in self-localisation. The uncertainty in the observation is captured by a sensor model in the more traditional sense; uni-modal symmetric trapezoidal fuzzy sets are used to represent uncertainty in both the range and bearing components of the observation. The range sensor model trapezoid is denoted µρ, and the bearing sensor model trapezoid is denoted µθ. The uncertainty in the self-localisation estimate of the robot is contained in the self-localisation grid r^m_SELF.

The sensor model trapezoids are set as follows. The centre of µρ is normally set to ρ; however, the centre can be offset to account for known systematic errors. This offset can be used for sensor calibration. If the range accuracy of the sensor inversely depends on distance, the core and support are set proportionally to the observed range; otherwise, they are given fixed values. The centre of µθ is normally set to θ; again, the value can be offset if there are known systematic errors. The core and support are fixed, since a bearing measurement's error typically does not depend on its absolute value. For both µρ and µθ, the level is set to 1, which means that the most likely values are deemed fully possible. Also, both trapezoids have a non-zero base value, which acts as a bias; this value is usually very low, and it reflects the possibility that errors greater than the trapezoidal support may occur. Typically, fine-tuning these sensor models does not significantly affect performance; in general, intuitively chosen initial values perform well and need not be modified. The sensor model parameters can also be modified at run-time, if needed.

This conceptual sensor model acts as a sort of virtual sensor model; it encapsulates uncertainty in both the observation itself and self-localisation, while performing the coordinate transformation from local polar coordinates to global Cartesian coordinates. More details regarding the described coordinate transformation will be given in section 6.2, along with figures showing the resulting grids. This type of transformation deserves special attention, since it is useful in a wide number of situations and applications.
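The construction of the two sensor model trapezoids from a single range-bearing observation can be sketched as follows. All numeric defaults below are invented placeholders rather than the values used in the thesis; they follow the stated policy of a range core and support proportional to the observed range, fixed bearing parameters, level 1, and a small non-zero base acting as bias.

```python
def range_bearing_sensor_model(rng, bearing,
                               range_offset=0.0, range_core_frac=0.10,
                               range_support_frac=0.30,
                               bearing_core=0.10, bearing_support=0.35,
                               bias=0.05):
    """Build the range (mu_rho) and bearing (mu_theta) trapezoids for an
    {observed-range-bearing} percept.  Trapezoids are returned as dicts
    with keys centre, core, support, level, base."""
    mu_rho = {
        "centre": rng + range_offset,        # offset can absorb systematic error
        "core": range_core_frac * rng,       # widths grow with observed range
        "support": range_support_frac * rng,
        "level": 1.0,
        "base": bias,                        # admits occasional gross errors
    }
    mu_theta = {
        "centre": bearing,                   # bearing error independent of value
        "core": bearing_core,
        "support": bearing_support,
        "level": 1.0,
        "base": bias,
    }
    return mu_rho, mu_theta

mu_rho, mu_theta = range_bearing_sensor_model(rng=2.4, bearing=0.52)
```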


Figure 5.17: Colour grids created using numeric colour information, drawn as hue-saturation circles. Darker areas indicate possible colours.

Numeric Colour Information

Information type: {observed-colour}

The only type of numeric colour information considered in this work is {observed-colour} information. In this case, an information source indicates that an object with given (h, s, v) colour values has been observed. The resulting colour grid is created using an approach which is similar to that used for {is-colour} information, described in section 5.6.1. There are two main differences. First, only one colour can be observed, rather than a set of colours. Second, instead of using predefined trapezoids which describe symbolic colours, a sensor model consisting of three uni-modal symmetric trapezoidal fuzzy sets is used. This sensor model can be modified online.

The sensor model values are set in a similar manner to the position sensor model values set when using {observed-range-bearing} information, described previously. The centre is set according to the observed (h, s, v) values; an offset can be used for calibration. The core and support are given fixed values. The level is set to 1, and the base is assigned a small non-zero value to account for the possibility of large sensor errors. These sensor models have proven to be quite robust. The output grid has a single mode in the 2.5D grid z^m_{k,COL}[j].

This type of information can be used, for instance, to represent object colours as observed by a vision system. Note that although the arriving information could simply be represented using three trapezoids, the transformation to the 2.5D grid is still needed in order to allow the arriving information to be stored in the local anchor space. The local anchor space will often contain information which cannot be stored as simple trapezoids, since an anchor typically contains the result of fusing multiple percepts. Some examples of colour grids which can be created using this type of information are shown in figure 5.17.


5.6.3 Negative Information

Note that many of the conceptual sensor models use a flag which allows “negative” information to be considered. For instance, an object can be observed to be outside a certain region. This might arise in a system which uses, for instance, an RFID reader to detect the presence of tagged objects in a certain area. This negative information must be asserted, however; in other words, not getting information is not always treated as a negative observation. A more sophisticated treatment of negative information might also be used. For instance, the reliability of an object’s position estimate might be reduced when observations are missed, assuming observations could be expected given the object’s estimated position and the field of view of corresponding sensors.

5.7 Summary

In this chapter, an implementation of the framework proposed in chapter 4 was introduced. Specifically, the approach to information representation within anchor spaces was described. Fuzzy sets were chosen as the main representational tool, since they provide a powerful and flexible way to represent heterogeneous and uncertain information. The implemented domains were also defined, and a number of symbolic and numeric information types were discussed; these information types can be used to create descriptions (via grounding functions), and they can be produced by information sources. Given the representations presented in this chapter, an anchor in the implemented framework can be seen as an evolving pair of fuzzy sets, implemented using two grids: a 2D grid for position information, and a 2.5D grid for colour information.

Chapter 6

Framework Realisation Part 2: Processes

In chapter 4, the anchoring framework proposed in this thesis was described. In section 5.1, the implementation of the proposed framework was briefly outlined, and the rest of chapter 5 explained how the various types of information used in the framework are represented. In this chapter, the main processes used in the proposed anchoring framework are described. First, the implemented approaches to self-localisation and object localisation are discussed; these are used by the framework to create position grids which represent object position estimates in the local anchor space. Next, implemented approaches to data association, information fusion, and prediction are described. Finally, an illustration of these processes is presented. Again, it should be noted that the implementation described here merely reflects one possible realisation of the proposed anchoring framework.

6.1 Self-Localisation

As was mentioned earlier, position information is extremely important for many applications. Robots often need to keep track of their own pose (x, y, φ) in the world, as well as the (x, y) positions of relevant objects. Self-localisation is discussed here because it is an important input to some of the conceptual sensor models described in the previous chapter. Specifically, it is needed when considering {near-self} and {observed-range-bearing} information, as described in sections 5.6.1 and 5.6.2, respectively.

In this section two approaches to self-localisation will be described; both have been used together with the anchoring framework. The first method is a landmark-based approach [31, 82, 81], used in experimental environments where recognisable landmarks are available. The second method is an Adaptive Monte-Carlo Localisation (AMCL) approach [151, 150], which performs scan-matching using a laser scanner and a map of the environment.


Two things should be emphasised. First, the anchoring framework does not require that self-localisation be addressed; in the described implementation of the framework, it is needed for only two of the implemented conceptual sensor models. Second, the two described approaches to self-localisation are only provided as examples of how self-localisation might be performed. The only real requirement is that the self-localisation method be able to provide, at any time, a value corresponding to how much a robot believes a given pose (x, y, φ) to be its true pose.

6.1.1 Representation

Self-localisation information is maintained as an estimate over the 3D space of possible (x, y, φ) poses; this estimate is denoted r^m_SELF. The (x, y) dimensions correspond to the global coordinates used in the position domain of the local and global anchor spaces; the φ dimension denotes the robot's orientation. The self-localisation estimate r^m_SELF is represented using a possibilistic fuzzy set, implemented using a 2.5D grid, as described in section 5.2.2. This 2.5D grid is similar to the one used to represent colour information, described in section 5.3.2. A 2D bin model is used to represent (x, y) positions, and each cell contains the parameters of a symmetric trapezoidal membership function indicating which orientations are possible for that cell. Like the hue dimension, the orientation dimension is circular; again, the implementation of the trapezoidal membership function takes this into account.

For a given pose (x, y, φ), the possibility of that pose being the robot's true pose, denoted r^m_SELF(x, y, φ), is computed by finding the cell c which corresponds to that (x, y) position, and reading the value of the corresponding trapezoidal membership function at φ. Recall that the trapezoid in cell c is denoted r^m_SELF(x, y) or r^m_SELF(c). This method of representing pose information was proposed by Buschka et al [31].

6.1.2 Landmark-Based Self-Localisation

The first considered self-localisation method is a landmark-based approach, originally proposed by Buschka et al. [31] and extended by Herrero et al. [82, 81]. The approach proposed by Buschka et al. [31] assumes that a number of recognisable and unique features (landmarks) are at known positions in the environment; Herrero et al. [82, 81] extend the approach to situations where the features are non-unique. This fuzzy self-localisation method has been shown to produce robust results in a highly dynamic domain characterised by significant sensor noise and unpredictable model errors. The method has also proven to be quite insensitive to sensor model tuning [31]. In this work, vision is used to observe landmarks. The localisation process updates the 2.5D self-localisation grid using an iterative predict-update cycle.


Initialisation

The grid is first initialised such that all positions and orientations are fully possible. Specifically, each cell contains a trapezoid with a level of 1.0, a base of 0.0, and a centre of 0.0; both the core and the support are set to 2π.

Prediction

Prediction is performed periodically, and it takes robot motion into account; motion is estimated using odometry. The prediction step consists of a translation, a rotation, and a dilation of the r^m_SELF grid. The translation and rotation are applied together, and the dilation is applied afterwards to model uncertainty in the odometric information. The dilation amount is proportional to the magnitude of the motion update. The transformations are implemented as fuzzy morphological operations [21] in the (x, y) dimensions. Orientation trapezoids are shifted and widened according to the amount of rotation contained in the motion estimate.

In addition to prediction based on motion, it is also possible to apply predictions based on the possibility that the robot is being pushed or kidnapped. To account for the possibility of the robot being pushed, another dilation is applied, irrespective of motion. To account for the possibility of the robot being kidnapped, the overall bias of the grid is increased; in this case, the overall bias is the level of the trapezoids in the least possible cells. These additional predictions may not be needed, depending on the application.

Update

Updates are applied in response to landmark observations. Landmark observations are represented as range-bearing object observations, described in section 5.6.2. Two trapezoidal membership functions are used to represent observed landmarks: µ_ρ for range and µ_θ for bearing. Each landmark observation is used to compute a temporary 2.5D grid Γ which describes the poses that are possible given that landmark observation. The grid Γ is computed as follows. Let l denote the (x, y) position of the observed landmark. For each cell c ∈ Γ, values are assigned to the trapezoid Γ(c) as follows:

    level(Γ(c))   = µ_ρ(‖cl‖)                     (6.1)
    base(Γ(c))    = base(µ_θ)                     (6.2)
    centre(Γ(c))  = ∠(cl) − centre(µ_θ)           (6.3)
    core(Γ(c))    = core(µ_θ)                     (6.4)
    support(Γ(c)) = support(µ_θ),                 (6.5)

where ‖cl‖ and ∠(cl) denote the length and orientation of the segment linking cell c to landmark l, respectively.


The level of the trapezoid corresponds to the possibility of landmark l being observed from cell c, based only on the distance between the two. The centre of the trapezoid is the orientation a robot would need in order to observe landmark l at the observed bearing centre(µ_θ) from cell c. The other parameters of Γ(c) are set according to the bearing sensor model. In order to speed up computation, a look-up table is created at start-up containing the values of ‖cl‖ and ∠(cl) for each cell and each known landmark.

Non-unique landmarks are treated using the extensions proposed by Herrero et al. [82, 81]. Specifically, if a non-unique landmark is observed, the above trapezoid is computed for each possible landmark position, and the trapezoidal outer envelope of the union (computed using the maximum operator) of those trapezoids is stored in each cell of the temporary grid.

The grid Γ is used to update the self-localisation grid r^m_SELF by computing the intersection between the two:

    r^m_SELF = r^m_SELF ⊗ Γ.    (6.6)

To make sure that consistent observations have a reinforcement effect, the non-idempotent product t-norm is used.
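As a rough illustration of this update step, the following Python sketch builds the temporary grid Γ for a single landmark observation, following equations (6.1)–(6.5). It reuses the Trapezoid class from the sketch in section 6.1.1, and the helper names are invented rather than taken from the thesis implementation; the intersection with r^m_SELF (equation (6.6)) would then be applied cell by cell with the product t-norm.

import math

def landmark_observation_grid(cell_centres, landmark_xy, mu_rho, mu_theta):
    """Build the temporary grid Gamma for one landmark observation (sketch).
    cell_centres: iterable of (x, y) cell centres of the pose grid
    landmark_xy:  known (x, y) position of the observed landmark
    mu_rho, mu_theta: range and bearing trapezoids of the observation
    Returns a dict cell -> orientation trapezoid, per eqs. (6.1)-(6.5)."""
    lx, ly = landmark_xy
    gamma = {}
    for (cx, cy) in cell_centres:
        dist = math.hypot(lx - cx, ly - cy)        # ||cl||
        angle = math.atan2(ly - cy, lx - cx)       # orientation of segment cl
        gamma[(cx, cy)] = Trapezoid(
            centre=angle - mu_theta.centre,        # eq. (6.3)
            core=mu_theta.core,                    # eq. (6.4)
            support=mu_theta.support,              # eq. (6.5)
            level=mu_rho.value(dist),              # eq. (6.1)
            base=mu_theta.base,                    # eq. (6.2)
            circular=True)
    return gamma
    # r_SELF(c) <- r_SELF(c) (x) gamma(c) would then follow, as in eq. (6.6)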

Normalisation

If no cells are fully possible after an update, the fuzzy set is normalised by shifting the level and base values in each cell up until at least one cell is fully possible (as discussed in section 5.2.3). This is because it is assumed that the robot must be somewhere in the world. If this normalisation causes the bias of the grid to exceed a (fairly high) threshold called the revision threshold, denoted ζ, it is assumed that the robot has been kidnapped, since this means the observed information is fully inconsistent with the previous self-localisation estimate. In this case a belief revision is triggered, meaning the previous estimate in r^m_SELF is simply replaced with Γ, which contains the estimate of possible poses given the latest landmark observation.

Example

Figure 6.1 shows the evolution of a self-localisation grid in response to three landmark observations and one odometric update. Note that robot orientation is not shown in the figure. In (a) and (b), all poses are possible. In (c), a non-unique landmark observation is shown; the corresponding self-localisation estimate is shown in (d). In (e) and (f), a unique landmark was observed. In (g), another landmark, identical to the one observed in (c), was observed. Finally, in (i) and (j), the landmark position estimates, in local coordinates, as well as the self-localisation grid, are predicted based on an odometric update which included both linear and rotational components. Note that some uncertainty was also added to the self-localisation estimate using dilation, after the translation and rotation were applied.



Figure 6.1: On the left, landmark observations are shown, in polar coordinates. The latest observation is shown using a light outline around the corresponding landmark. On the right, the self-localisation grids which resulted from the observations are shown. Darker areas indicate possible positions. In (i) and (j) an odometric update is applied.



Figure 6.2: Two sample self-localisation grids created using the AMCL self-localisation algorithm. In (a), there are three possible poses; in (b), there is only one. Darker positions indicate more probable positions. Orientation is not shown.

6.1.3 Adaptive Monte-Carlo Localisation

The second self-localisation method is a particle-based probabilistic method called Adaptive Monte-Carlo Localisation [151]; the implementation used in this work is from Player [69, 124], an open-source robot device interface. The Adaptive Monte-Carlo Localisation (AMCL) algorithm [151] maintains a sample-based probability distribution over the 3D space of (x, y, φ) poses. The number of maintained particles varies depending on how certain the robot is about its pose; this is why the algorithm is said to be "adaptive". This allows a trade-off between computational requirements and accuracy to be exploited.

The distribution is updated using the Bayes filter algorithm [150]. Similarly to the landmark-based localisation algorithm, the Bayes filter algorithm performs a prediction step based on motion information, and an update step based on sensor information. Motion information is received from odometry, and the update step is based on laser scans of the environment, which are compared to a map of the environment; this map is created offline.

The output of the algorithm is a set of poses, each with an associated probability. This set of poses is mapped directly onto the 2.5D self-localisation grid using a function which is nearly identical to the conceptual sensor model used to treat numeric {near-position} information, described in section 5.6.2. To map the pose estimates to r^m_SELF, fixed-radius cones are used for each probable pose, where the radii of the core and support are set according to an estimate of the overall accuracy of the AMCL algorithm. The level of each cone is set based on the estimated probability of that given mode being the true pose of the robot. Note that the conceptual sensor model for {near-position} information does not consider orientation information. Orientation is added to the information in the 2.5D grid by setting the parameters of the trapezoids in each cell according to (a) the orientation values contained in the poses received from the AMCL algorithm, and (b) a fixed estimate of orientation uncertainty. Figure 6.2 shows two sample self-localisation grids created using the AMCL algorithm.
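The following Python sketch gives a rough idea of how this mapping could be written. The cone radii, orientation widths and the decision to keep the most possible mode per cell are illustrative assumptions, not values or interfaces from Player or from the thesis implementation; the Trapezoid class is the one from the sketch in section 6.1.1.

import math

def amcl_to_pose_grid(cell_centres, hypotheses,
                      core_radius=0.3, support_radius=0.8,
                      orient_core=0.3, orient_support=0.8):
    """Map AMCL pose hypotheses onto a 2.5D fuzzy pose grid (sketch).
    cell_centres: iterable of (x, y) cell centres
    hypotheses:   list of (x, y, phi, probability) tuples from AMCL
    The radii and orientation widths stand in for the fixed estimates of the
    overall accuracy of the AMCL algorithm."""
    grid = {}
    for (cx, cy) in cell_centres:
        best = None
        for (hx, hy, hphi, prob) in hypotheses:
            d = math.hypot(cx - hx, cy - hy)
            if d >= support_radius:
                continue
            # cone in (x, y): fully possible inside the core radius, then a
            # linear fall-off towards the support radius
            if d <= core_radius:
                level = prob
            else:
                level = prob * (support_radius - d) / (support_radius - core_radius)
            if best is None or level > best.level:
                best = Trapezoid(hphi, orient_core, orient_support,
                                 level, 0.0, True)
        if best is not None:
            grid[(cx, cy)] = best
    return grid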


6.2 Object Localisation

The conceptual sensor model used to convert range-bearing observations from local polar coordinates to global Cartesian coordinates was only briefly described in section 5.6.2. In this section the coordinate transformation is described in detail; this transformation is the core of the proposed approach to multi-robot object localisation.

The coordinate transformation from local polar coordinates to global Cartesian coordinates is extremely important; nearly all single-robot and multi-robot object localisation tasks require such a transformation. This is because observations are often received in local coordinates, and robotic systems often reason using representations in global coordinates. The proposed approach considers uncertainty in both observations and self-localisation. Since the self-localisation estimate typically contains orientation uncertainty, the transformation can be highly non-linear. Most other approaches ignore self-localisation uncertainty, or approximate it very coarsely. Moreover, most other approaches do not account for ambiguity (i.e. multiple modes) in self-localisation. This was discussed in section 2.2.6.

6.2.1 Relevant Information

There are three types of information involved in the transformation. First, there is the percept z^m_k[j]; the percept includes the observed range ρ and the observed bearing θ. The observation sensor model is set as described in section 5.6.2. Specifically, the observation and associated uncertainty are fully represented using two trapezoidal fuzzy sets, denoted µ_ρ and µ_θ. Together, they form a 2D fuzzy set in polar coordinates, denoted µ_{ρ,θ}. Single point, range-only, and bearing-only estimates can all be represented using this fuzzy set. A single point estimate (ρ̂, θ̂) is represented by setting µ_{ρ,θ}(ρ̂, θ̂) = 1 and µ_{ρ,θ}(ρ, θ) = 0 elsewhere. A bearing-only measurement of θ̂ is obtained by setting µ_{ρ,θ}(ρ, θ) = 1 if θ = θ̂, and 0 otherwise. A range-only measurement of ρ̂ is obtained by setting µ_{ρ,θ}(ρ, θ) = 1 if ρ = ρ̂, and 0 otherwise.

The second type of information which is relevant to the coordinate transformation is the self-localisation estimate r^m_SELF. Recall that r^m_SELF is represented using a 2.5D position grid, in which each cell c contains a symmetric trapezoidal fuzzy set representing the possible orientations φ for that cell. The trapezoid in each cell c is denoted r^m_SELF(x, y) or r^m_SELF(c). The maintenance of r^m_SELF is described in section 6.1.

The third type of information is the output of the coordinate transformation, which is the output position grid, denoted z^m_{k,POS}[j]. This is a 2D grid which has the same number of cells and the same global Cartesian coordinate system as the 2.5D grid r^m_SELF; however, instead of a trapezoid, each cell contains only a membership value, indicating the possibility of the observed object being located in that cell.
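For illustration, the three special cases above can be written down directly in terms of the two trapezoids. The Python sketch below reuses the Trapezoid class from section 6.1.1's sketch; the combination of the two components with the product t-norm follows the reinforcing operator used elsewhere in this chapter, and all function names are invented.

import math

class RangeBearingObservation:
    """mu_{rho,theta}: 2D fuzzy set in polar coordinates, built from a range
    trapezoid and a bearing trapezoid."""
    def __init__(self, mu_rho, mu_theta):
        self.mu_rho, self.mu_theta = mu_rho, mu_theta

    def value(self, rho, theta):
        # product t-norm between the range and bearing memberships
        return self.mu_rho.value(rho) * self.mu_theta.value(theta)

# a single point estimate (rho_hat, theta_hat): zero-width trapezoids
def point_estimate(rho_hat, theta_hat):
    return RangeBearingObservation(Trapezoid(rho_hat, 0.0, 0.0),
                                   Trapezoid(theta_hat, 0.0, 0.0, circular=True))

# bearing-only: the bearing is crisp, the range is left unconstrained
def bearing_only(theta_hat, max_range=100.0):
    return RangeBearingObservation(Trapezoid(max_range / 2, max_range, max_range),
                                   Trapezoid(theta_hat, 0.0, 0.0, circular=True))

# range-only: the range is crisp, the bearing covers the full circle
def range_only(rho_hat):
    return RangeBearingObservation(Trapezoid(rho_hat, 0.0, 0.0),
                                   Trapezoid(0.0, 2 * math.pi, 2 * math.pi,
                                             circular=True))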

6.2.2 Coordinate Transformation Process

The transformation is not straightforward, since neither the self-localisation estimate nor the object observations are represented as points. Instead, the self-localisation estimate, which includes uncertainty in both position and orientation, is represented by r^m_SELF, and the observation is represented by µ_{ρ,θ}, which includes uncertainty in both range and bearing. It should be noted that one can address situations in which there is no uncertainty in r^m_SELF or µ_{ρ,θ} by using point estimates for these. This amounts to assuming perfect self-localisation, or a perfect sensor, respectively. The proposed method handles both cases transparently, so no special considerations are needed.

The position grid z^m_{k,POS}[j] is computed as follows. Assume that robot m sees object n at range ρ and bearing θ. The observed range and bearing and the associated uncertainty are encoded in the local observation µ_{ρ,θ}, per the discussion in section 5.6.2. Let p = (x, y, φ) denote an arbitrary 3D pose for robot m; recall that φ represents the robot's orientation. Let q = (x', y') denote an arbitrary 2D position. Then the possibility of object n being at position q according to robot m is given by:

    z^m_{k,POS}[j](q) = sup_p { r^m_SELF(p) ⊗ µ_{ρ,θ}(‖pq‖, ∠(pq) − φ) }.    (6.7)

Equation 6.7 can be explained as follows. The length and orientation of the segment linking p to q are denoted ‖pq‖ and ∠(pq), respectively. The value of r^m_SELF(p) is a measure of the possibility that robot m is at pose p. The value of ∠(pq) − φ is the observed bearing to the target with respect to φ, which is robot m's orientation in the global coordinate system. The value of µ_{ρ,θ}(‖pq‖, ∠(pq) − φ) is the possibility that robot m could observe object n at position q from pose p, according to the observation and the associated uncertainty represented in µ_{ρ,θ}.

The values r^m_SELF(p) and µ_{ρ,θ}(‖pq‖, ∠(pq) − φ) are combined using a t-norm, as indicated by the symbol ⊗; in this work, the reinforcing product operator is used. The supremum of this combination is taken over all possible poses p, as indicated by the sup_p operator. This means that the overall possibility of object n being at position q is based on the pose p which yields the highest possibility of this being true.

The output of the formula is essentially a measure of how possible it is that object n is at the 2D input position q, given: a) the 3D pose of robot m, represented by r^m_SELF, in global Cartesian coordinates; and b) the 2D range and bearing observation of object n made by robot m, represented by µ_{ρ,θ}, in local polar coordinates. The grid z^m_{k,POS}[j] is computed by calculating z^m_{k,POS}[j](c) for each cell c in z^m_{k,POS}[j].
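The following Python sketch is a deliberately naive, direct rendering of equation (6.7): it evaluates the supremum as a maximum over a finite set of candidate robot poses, with the product as the t-norm. Algorithm 1 below is the efficient grid-based formulation actually used; this sketch only illustrates the semantics of the formula, and all names are illustrative.

import math

def transform_observation(r_self, mu_obs, cells, poses):
    """Naive implementation of eq. (6.7).
    r_self: callable (x, y, phi) -> possibility of that robot pose
    mu_obs: callable (rho, theta) -> possibility of the observation at that
            range and bearing (should handle angle wrap-around)
    cells:  iterable of candidate object positions (x', y')
    poses:  iterable of robot poses (x, y, phi) over which to take the sup"""
    out = {}
    for (qx, qy) in cells:
        best = 0.0
        for (px, py, phi) in poses:
            rho = math.hypot(qx - px, qy - py)          # ||pq||
            theta = math.atan2(qy - py, qx - px) - phi  # angle(pq) - phi
            best = max(best, r_self(px, py, phi) * mu_obs(rho, theta))
        out[(qx, qy)] = best
    return out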



Figure 6.3: The figure shows an example of applying the coordinate transformation algorithm. In the first column, the observed object is on the robot’s left; it is shown as a circle, with a light ring around it. Observation uncertainty is not shown. The second column shows the self-localisation grid. The third column shows the resulting 2D position grid for the object. Darker regions indicate more possible positions.

The coordinate transformation can yield a fuzzy set in which no positions are fully possible. This can arise because of inconsistent information in r^m_SELF and µ_{ρ,θ}; for example, an object could be observed at a position which is outside the robot's world model. In such cases, z^m_{k,POS}[j] is normalised by shifting the values of all positions up until the most possible positions are fully possible (as discussed in section 5.2.3). This is intuitive, since there should always be some fully possible position. The fact that the normalised estimate incorporates inconsistent information is reflected in the increased minimum value of the fuzzy set, or bias – meaning that, to some degree, any position is possible.

Figure 6.3 shows an example of applying the coordinate transformation for a number of different self-localisation estimates. Note that the object position estimates are not merely translations of the self-localisation estimates. This is due to the non-linearities introduced by considering orientation and bearing uncertainty.


Algorithm

Algorithm 1 encodes the full computation. Step 1 simply resets z^m_{k,POS}[j] to zero. In step 2, ε is a fixed threshold below which cells in r^m_SELF are ignored; this threshold is typically quite low. This check speeds up the algorithm, since it allows iterations of the main loop which can only produce low possibility values to be skipped. In step 3, uncertainty in orientation and bearing is taken into account using a temporary trapezoid which is set using a combination of the parameters of r^m_SELF(x, y) and µ_θ. In step 4, updated cells are limited to those which are at a distance consistent with the range reading. These cells lie on an annulus with radius r equal to the observed range and width w = support(µ_ρ). This check can be implemented very efficiently using Bresenham's circle drawing algorithm [29]; the algorithm is iteratively run for radii between r − w/2 and r + w/2. Step 5 combines the possibilities from range and bearing using the t-norm ⊗; again, the product is used, in order to reinforce positions which are possible according to both the range and bearing components of the observation. Finally, step 8 normalises z^m_{k,POS}[j] if there are no fully possible values.

Algorithm 1 Fuzzy coordinate transformation.
Require: r^m_SELF = one trapezoid r^m_SELF(c) for each cell c.
Require: µ_{ρ,θ} = two trapezoids, µ_ρ and µ_θ.
Ensure: z^m_{k,POS}[j] = 2D grid of values in [0, 1].
 1: z^m_{k,POS}[j] ← 0
 2: for all cells c such that level(r^m_SELF(c)) > ε do
 3:   centre(µ_TEMP) ← centre(r^m_SELF(c)) − centre(µ_θ)
      core(µ_TEMP) ← core(r^m_SELF(c)) + core(µ_θ)
      support(µ_TEMP) ← support(r^m_SELF(c)) + support(µ_θ)
      level(µ_TEMP) ← min(level(r^m_SELF(c)), level(µ_θ))
      base(µ_TEMP) ← min(base(r^m_SELF(c)), base(µ_θ))
 4:   for all cells q such that µ_ρ(‖cq‖) > bias(µ_ρ) do
 5:     z^m_{k,POS}[j](q) ← max( z^m_{k,POS}[j](q), µ_ρ(‖cq‖) ⊗ µ_TEMP(∠(cq)) )
 6:   end for
 7: end for
 8: normalise z^m_{k,POS}[j]

6.2.3 Approximate Coordinate Transformation

Here an approximation of the coordinate transformation is described, which allows output grids to be derived using less computation than the full method. This can be useful for implementations on platforms with limited computational resources.


Figure 6.4: The grid on the left is computed using the full coordinate transformation; the one on the right is computed using the approximate coordinate transformation. The light dots show the centre of gravity for each grid.

The cost of algorithm 1 critically depends on the width w = support(µ_ρ) of the trapezoid representing range uncertainty. One way to reduce the complexity of the algorithm is to ignore range uncertainty in the algorithm itself, and introduce an approximation of it a posteriori, by performing a blurring operation on the resulting z^m_{k,POS}[j] grid; recall that blurring of grids can be efficiently implemented using a fuzzy morphological dilation operation [21].

The approximation still considers the full uncertainty in the self-localisation grid, as well as uncertainty in the bearing of the observation; it approximates only the range uncertainty in the observation. As observation range uncertainty increases, the accuracy of the approximation decreases. In most cases, however, the approximation's effect on the results of the method is negligible. An example is shown in figure 6.4, where two z^m_{k,POS}[j] grids produced using the same data are shown; the grid on the left was created using the full algorithm, and the one on the right was created using the approximate algorithm. Note that the uncertainty in the approximated grid is slightly less prominent in the horizontal direction; this reflects the fact that the approximation does not consider the full range uncertainty of the observation.
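As an illustration of the blurring step, the Python sketch below implements a fuzzy morphological dilation of a 2D possibility grid using a flat square structuring element; this is only one simple choice of structuring element, and the operators actually used in the thesis follow [21].

def dilate(grid, radius_cells):
    """Fuzzy morphological dilation of a 2D possibility grid (sketch).
    grid: dict keyed by integer (i, j) cell indices, values in [0, 1].
    Each cell takes the maximum value found within a square structuring
    element of half-width 'radius_cells'."""
    out = {}
    for (i, j) in grid:
        best = 0.0
        for di in range(-radius_cells, radius_cells + 1):
            for dj in range(-radius_cells, radius_cells + 1):
                best = max(best, grid.get((i + di, j + dj), 0.0))
        out[(i, j)] = best
    return out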


Algorithm

The approximated coordinate transformation can be implemented using algorithm 2. The approximate algorithm differs from algorithm 1 in two ways. First, in step 4, only cells where ‖cq‖ = ρ are considered; these cells lie on a circle of radius ρ. These cells can quickly be found using a single iteration of Bresenham's circle drawing algorithm [29], as opposed to the multiple iterations used in the full algorithm. Second, in step 8, a fuzzy morphological dilation operation is used to blur the z^m_{k,POS}[j] grid, by an amount proportional to core(µ_ρ). This operation is meant to approximate the range uncertainty in the observation, which was ignored in step 4.

Algorithm 2 Approximate fuzzy coordinate transformation.
Require: r^m_SELF = one trapezoid r^m_SELF(c) for each cell c.
Require: µ_{ρ,θ} = two trapezoids, µ_ρ and µ_θ.
Ensure: z^m_{k,POS}[j] = 2D grid of values in [0, 1].
 1: z^m_{k,POS}[j] ← 0
 2: for all cells c such that level(r^m_SELF(c)) > ε do
 3:   centre(µ_TEMP) ← centre(r^m_SELF(c)) − centre(µ_θ)
      core(µ_TEMP) ← core(r^m_SELF(c)) + core(µ_θ)
      support(µ_TEMP) ← support(r^m_SELF(c)) + support(µ_θ)
      level(µ_TEMP) ← min(level(r^m_SELF(c)), level(µ_θ))
      base(µ_TEMP) ← min(base(r^m_SELF(c)), base(µ_θ))
 4:   for all cells q such that ‖cq‖ = ρ do
 5:     z^m_{k,POS}[j](q) ← max( z^m_{k,POS}[j](q), µ_ρ(‖cq‖) ⊗ µ_TEMP(∠(cq)) )
 6:   end for
 7: end for
 8: dilate z^m_{k,POS}[j] by an amount proportional to core(µ_ρ)
 9: normalise z^m_{k,POS}[j]

6.2.4 Coordinate Transformation Complexity

The computational complexity of the full coordinate transformation for one object is O(ND), where N is the number of cells in r^m_SELF, and D is the number of cells in z^m_{k,POS}[j] which are possible according to the range sensor model (step 4). Since grids of the same size are used for r^m_SELF and for each z^m_{k,POS}[j], the worst case computational complexity is O(N²).

The computational complexity of the approximate coordinate transformation is O(NC + NK), where N is the number of cells in r^m_SELF, C is the number of cells on the circle around the robot which has a radius equal to the observed range ρ, and K is the size of the structuring element used in the dilation.


Since the number of cells C can grow at most as √N, and since K does not depend on N, the asymptotic complexity of algorithm 2 is O(N√N). This is consistent with empirical observations; the approximate algorithm typically requires significantly less time to execute than the full algorithm, especially for observations with large range uncertainty.

6.3 Data Association

This section describes the data association method proposed in this work. The discussion begins by describing the algorithm's inputs; specifically, the arrangement of the entities to be associated is described, for both local and global data association. Next, the proposed data association algorithm is presented; the algorithm is a single-scan best-first global nearest neighbour (GNN) approach to data association, as discussed in section 2.2.5. The implemented framework uses the same algorithm for both local and global data association. An approximated version of the algorithm is also proposed, and the complexity of the data association problem is discussed.

6.3.1 Local Data Association

In local data association, discussed in section 4.3.2, a single robot m is considered. Percepts Z^m_k = {z^m_k[1], . . . , z^m_k[J^m_k]} arriving from robot m's local information sources S^m = {s^m_1, . . . , s^m_{K^m}} are associated with each other, and with robot m's local anchors Ψ^m = {ψ^m_1, . . . , ψ^m_{L^m}}. Section 5.6 explained how each percept z^m_k[j] received from robot m's local information source s^m_k is transformed into two grids: one for the position domain (z^m_{k,POS}[j]) and one for the colour domain (z^m_{k,COL}[j]). Both local and global anchors are represented in exactly the same way – using two grids, one per domain.

Local data association is performed synchronously, at every frame. Recall from section 3.1.1 that a frame reflects the time during which any two percepts arriving from the same information source are considered to refer to different objects. If local data association were not performed at every frame, the algorithm would still need to keep track of when percepts arrived, in order to ensure satisfaction of this constraint.

Initially, the set of local anchors is normally empty, with the possible exception of a single (often named) self-anchor. Arriving percepts are converted into z^m_{k,POS}[j] and z^m_{k,COL}[j] grids as described in section 5.6, and they are buffered until a frame ends, at which point the data association algorithm described in section 6.3.3 is invoked. If no new percepts arrive during a frame, local data association and fusion steps are skipped for that frame.


Table 6.1: Entities used for local data association.

Local Sources     Entities
row 0: Ψ^m        ⊥[0]    ψ^m_1      ψ^m_2
row 1: s^m_1      ⊥[0]    z^m_1[1]   z^m_1[2]
row 2: s^m_2      ⊥[0]    z^m_2[1]   z^m_2[2]   z^m_2[3]
row 3: s^m_3      ⊥[0]    z^m_3[1]   z^m_3[2]

Local Associations

Table 6.1 shows an example of the percepts available for local data association at the end of a sample local frame. The first row corresponds to the robot's local anchors, and the other rows correspond to the robot's sources of information. Note that no two entities in the same row can refer to the same object. This is because a given robot's local anchors, as well as percepts received during the same frame, are always assumed to refer to separate objects. The bottom element ⊥[0] is a placeholder, which is used when no entity from the corresponding row is considered.

A local association is defined as a set of entities consisting of 0 or 1 local anchors, and 0 or 1 percepts from each information source. So an association must contain exactly one entry, possibly the bottom element, from each row in the table of entities – for instance, table 6.1. An association is intended to gather all entities which refer to a particular object. The output of the local data association algorithm is a set of local associations, each assumed to refer to a unique object. Recall from section 4.3 that for local associations which do not contain a local anchor (i.e. associations which contain the bottom element ⊥[0] in row 0), a new empty local anchor is created and added to the association.

6.3.2 Global Data Association

Global data association, discussed in section 4.4.1, involves associating local anchors Ψ^m = {ψ^m_1, . . . , ψ^m_{L^m}} from multiple robots. Recall that both local and global anchors are represented using two grids: one for position and one for colour.

Global data association is performed synchronously, at a pre-defined rate which is typically significantly lower than the frame rate. Note that global data association could technically be performed at any time – for instance at frame rate (assuming execution is fast enough), or asynchronously. Unlike local anchor management, global anchor management does not depend on the results of previous iterations; at any time, new global anchors can be created with the latest available local anchors.


Table 6.2: Entities used for global data association.

Global Sources    Entities
row 0: Ψ^1        ⊥[0]    ψ^1_1    ψ^1_2
row 1: Ψ^2        ⊥[0]    ψ^2_1    ψ^2_2
row 2: Ψ^3        ⊥[0]    ψ^3_1    ψ^3_2    ψ^3_3
row 3: Ψ^4        ⊥[0]    ψ^4_1

These newly created global anchors replace any pre-existing global anchors. At each global data association step, a robot m associates its own local anchors with those received from other robots, using the data association algorithm described in section 6.3.3. If none of the available local anchors were updated since the last global data association invocation, data association and fusion steps are skipped.

Global Associations

Table 6.2 shows the local anchors used for an example instance of global data association. Note that some local anchors may have been sent but not received due to communication channel errors. The proposed approach simply takes as much information as possible (i.e. everything which has been received so far) into consideration. Again, no two entities in the same row can refer to the same object, since local anchors from the same robot are assumed to refer to different objects. As before, the bottom element ⊥[0] is a placeholder, which is used when no entity from the corresponding row is considered.

A global association is defined as a set of entities consisting of 0 or 1 local anchors from each robot. Again, this means that a valid association must contain exactly one entry, possibly the bottom element, from each row of the table of entities – for instance, table 6.2. The output of the global data association algorithm is a set of global associations, where each association refers to a unique object.

The example table 6.2 has the same number of rows and entities as table 6.1; this is just to make it easier to see how the algorithm described in section 6.3.3 can be applied to both local and global data association. In practice, the tables are unrelated.


Table 6.3: Associations of entities.

association ID   row 0   row 1   row 2   row 3   match value (example)
0                0       0       0       0       0.0000
1                0       0       0       1       0.6000
2                0       0       1       0       0.6000
3                0       0       1       1       0.8450
4                0       0       2       0       0.6000
5                0       0       2       1       0.9050
6                0       0       3       0       0.6000
7                0       0       3       1       0.7340
8                0       1       0       0       0.6000
...              ...     ...     ...     ...     ...
65               2       2       0       1       0.5500
66               2       2       1       0       0.6780
67               2       2       1       1       0.0000
68               2       2       2       0       0.0000
69               2       2       2       1       0.2180
70               2       2       3       0       0.7600
71               2       2       3       1       0.0000

6.3.3 Data Association Algorithm

Associations

For both local and global data association, a table of entities is used to represent the information to be associated. In both cases, entities in the same row are assumed to correspond to different objects, which means that these entities need not be compared with each other. All entities are represented using two grids: a 2D grid for position, and a 2.5D grid for colour. These similarities make it possible for the same data association algorithm to be used for both local and global data association.

Both local and global associations can be seen as arrays of indices: one index for each row in the table of entities. Index 0 is assigned to the bottom element in each row. The set of all possible associations is shown in table 6.3; this table could be for either of the two entity tables. The indices for each row in the entity table are shown in the middle of the table. The number of possible associations is the product of the number of entries in each row of the corresponding table of entities, including the bottom element. Each association has a unique ID, computed using the indices of the entries from each row of the table of entities. A sample match value is also shown for each association in the table; this value reflects how well the entities in a given association match.
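One way to generate association IDs consistent with the values shown in table 6.3 is a mixed-radix encoding of the row indices; the following Python sketch is an assumption about the numbering scheme, not a description of the actual implementation.

from itertools import product

def enumerate_associations(entities_per_row):
    """Enumerate all associations for a table of entities.
    entities_per_row[r] is the number of real entities in row r; index 0 in
    each row stands for the bottom element. Yields (association_id, indices)
    pairs, with the ID built as a mixed-radix number from the row indices."""
    radices = [n + 1 for n in entities_per_row]      # +1 for the bottom element
    for indices in product(*[range(r) for r in radices]):
        assoc_id = 0
        for idx, radix in zip(indices, radices):
            assoc_id = assoc_id * radix + idx
        yield assoc_id, indices

# With the row sizes of table 6.2 (2, 2, 3 and 1 entities), this yields
# 3*3*4*2 = 72 associations with IDs 0..71; e.g. indices (0, 1, 0, 0) -> ID 8
# and (2, 2, 3, 1) -> ID 71, matching the values shown in table 6.3.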


In the presented implementation, the following steps are used to compute the match value of each association. Note that these steps are only one of many ways in which such a match value might be computed.

1. Association 0, which contains the bottom element for every row in the table of entities, is assigned a match value of 0. This association is said to be a_empty_invalid.

2. Associations with only one entity (i.e. associations which contain the bottom element for all rows except one) are called singleton associations, and they are assigned a fixed match value. This value is equal to the threshold below which entities are considered not to match, denoted δ; this value is typically around 0.6. The threshold value is assigned to these associations because associations which involve more than one matching entity are preferred; setting a low match value here allows more informative matches to be preferred over singleton associations.

3. For associations with multiple entities, a comparison of the entities is performed. The first aspect of this comparison deals with entity names. All entities with the same name must be in the same association. An unnamed entity can be associated with any other entity, as long as the previous constraint is not violated. In practice, naming constraints translate into the following two rules. First, any association which contains only identically-named entities (and possibly bottom elements) is automatically assigned a match value of 1. Second, for any two rows which contain an entity with the same name, any association which includes one of the entities and not the other is considered a_name_invalid and given a match value of 0. An association which includes at least one named entity, and which is not a_name_invalid, is given the same name as the named entities it contains; other associations are unnamed.

4. Associations with multiple entities which have not been assigned a match value based on one of the previous rules are compared based on their contents. The entities in these associations are compared using the match1 operator, in each domain. Recall that the match1 operator, described in section 5.2.3, simply takes the highest point of the intersection of the input fuzzy sets. This intersection is computed cell by cell; in the colour domain, the hue trapezoids are intersected and the trapezoidal outer envelope of this intersection is used. Because only the maximum value of the intersection is interesting, matching in a given domain can be aborted as soon as any cell fully matches for all entities. Intersection is performed using the product t-norm, which means that entities which fully match are reinforced. The match values from each domain are combined using the minimum t-norm, producing an overall match value; the minimum is taken since the information from the various domains is not necessarily independent. If the overall match value is below δ, the association is a_value_invalid, and it is given a match value of 0.


Table 6.4: Hypotheses.

hypothesis ID    included associations                          sample quality value
                 71    70    ...    2     1     0
0                F     F     ...    F     F     F              0.0000
1                F     F     ...    F     F     T              0.0000
2                F     F     ...    F     T     F              0.6000
3                F     F     ...    F     T     T              0.0000
4                F     F     ...    T     F     F              0.6000
5                F     F     ...    T     F     T              0.0000
6                F     F     ...    T     T     F              0.8450
...              ...   ...   ...    ...   ...   ...            ...
2^72             T     T     ...    T     T     T              0.0000

5. There is one extra consideration to make, with respect to names. If any association contains named entities which have the same name but do not match (according to the above criteria), the contents of these named entities are cleared. The cleared grids are used even during the fusion step which is performed after the data association step. If, in addition to such similarly named but non-matching entities, an association also contains unnamed entities, the contents of the named entities are cleared before the entities in the association are matched. This is done since if two entities have the same name, they must match. If their contents do not match, they are unreliable, and they should not be allowed to make the association a_value_invalid, which is what would happen if the entities in the association were matched normally.

Hypotheses

A hypothesis is defined as a set of associations, intended to reflect a complete picture of which associations are true. Table 6.4 shows a very small portion of the table of all possible hypotheses for the associations in table 6.3. Even for this small example, the number of possible hypotheses is prohibitively large: the set of hypotheses is the power set of the set of associations, so for N associations there are 2^N possible hypotheses. Fortunately, as will be discussed shortly, a search on the space of all hypotheses can be pruned quite effectively.

Note that the hypotheses in table 6.4 have an associated quality value, which is computed as follows.


1. Any hypothesis containing an invalid association is itself invalid, and has a quality value of 0. Such a hypothesis can be h_empty_invalid, h_name_invalid, or h_value_invalid, depending on whether the invalid association is a_empty_invalid, a_name_invalid, or a_value_invalid, respectively.

2. Since associations can be named, it can happen that within a given hypothesis, associations cannot be assigned unique names; for instance, two different associations might have the same name. In this case the hypothesis is given a quality value of 0, and called h_unnameable_invalid.

3. Since a hypothesis is intended to reflect a complete picture of which associations are true, its set of associations should contain each entity in the table of entities exactly once (only bottom elements may be contained multiple times). This is because an association is meant to contain all entities which refer to a particular object, and an entity cannot refer to more than one object. If any entity is contained in more than one association, the hypothesis is called h_multiple_invalid, and it is given a quality value of 0. A hypothesis in which each entity is considered exactly once is called complete. A hypothesis in which some entities are not considered at all is called incomplete. Incomplete hypotheses may be made complete by the inclusion of additional associations.

4. The quality value of a valid hypothesis is computed by taking the average of the match values of the associations it contains. Averaging is used here to compute an overall quality measure of the hypothesis. Using the minimum operator would cause a hypothesis containing associations with match values {0.9, 0.9, 0.6} to have the same quality as one with match values {0.6, 0.6, 0.6}, which is clearly undesirable. If a reinforcing operator like the product were used, match values {0.8, 0.8, 0.8} would be better than {0.8, 0.8, 0.8, 0.8}; this is also undesirable, since the quality value should not depend on the number of entities contained in the hypothesis. In this situation, the "compromise" produced by the averaging operation is exactly what is needed.

Algorithm Overview

The presented algorithm is inspired by global nearest neighbour (GNN) approaches to data association, as discussed in section 2.2.5. The approach adds the ability to consider generic match values from multiple domains, rather than considering only proximity-based matching in the position domain. The approach also allows names to constrain associations, when possible.

A tree-based search is performed, in which each node in the tree is a hypothesis H. Branching involves creating a child hypothesis, denoted H', by adding one extra association A to H. The added association must have a greater ID than all the associations already contained in H.


If H already contains the association with the highest ID, H is a leaf in the search tree. Otherwise, the hypothesis is considered explorable. Fortunately, as was mentioned earlier, there are several ways in which hypotheses can be deemed invalid, which means the search space can be pruned quite aggressively.

The goal of the search is to find all hypotheses which meet the following criteria:

1. a hypothesis should contain only valid associations (i.e. the hypothesis should not be h_empty_invalid, h_name_invalid, or h_value_invalid);

2. a hypothesis should be nameable, which means it should be possible to assign unique names to each included association (i.e. the hypothesis should not be h_unnameable_invalid);

3. a hypothesis should be complete, which means each entity should be considered exactly once (i.e. the hypothesis should be neither incomplete nor h_multiple_invalid).

Note that a hypothesis which meets the above criteria might still be explorable; however, there is no point in further exploring a node which is complete, since adding an association to such a hypothesis will make it h_multiple_invalid.

Search Algorithm

Algorithm 3 is used to find the hypotheses which meet the above criteria. The algorithm proceeds as follows. Step 1 creates an empty list of explorable nodes. Step 2 creates an empty list of solution nodes. Step 3 creates an empty hypothesis H, i.e. a hypothesis which contains no associations. This hypothesis is the root of the tree, and it is pushed into the list of explorable nodes at step 4.

The while loop at step 5 continues until there are no more interesting nodes to explore. The next node to explore is extracted at step 6. Step 7 loops through all the children of this node. Recall that the children of H are hypotheses H', which contain one extra association compared to H; this extra association must have a higher ID than all the associations already contained in H. Observe that if the node extracted at step 6 is a leaf, it is ignored. Such leaves can be ignored since they are known to have no interesting branches, and they cannot be solutions, since complete nodes are never pushed into the list of explorable nodes (as will be seen shortly).

A child hypothesis H' is created in step 8. Steps 9 to 14 check if the newly added association A is a_empty_invalid, a_name_invalid, or a_value_invalid; if A is invalid, hypothesis H' is pruned accordingly. Note that the match value of association A is computed at step 13. Match values are cached during the search, since hypotheses may have many associations in common. Step 15 checks if H' contains any entities more than once; if so, the hypothesis is pruned accordingly.


Algorithm 3 Data association algorithm.
Require: entities and associations
Ensure: all solution hypotheses
 1: explorable_hypotheses = {}
 2: solution_hypotheses = {}
 3: H = {}
 4: push(H, explorable_hypotheses)
 5: while !empty(explorable_hypotheses) do
 6:   H = pop(explorable_hypotheses)
 7:   for all associations A which can be added to H do
 8:     H' = associations(H) ∪ A
 9:     if a_empty_invalid(A) then
10:       prune(H', h_empty_invalid)
11:     else if a_name_invalid(A) then
12:       prune(H', h_name_invalid)
13:     else if compute_match_value(A) < δ then
14:       prune(H', h_value_invalid)
15:     else if h_multiple_invalid(H') then
16:       prune(H', h_multiple_invalid)
17:     else if complete(H') then
18:       if h_unnameable_invalid(H') then
19:         prune(H', h_unnameable_invalid)
20:       else
21:         compute_quality_value(H')
22:         sorted_push(H', solution_hypotheses)
23:       end if
24:     else
25:       push(H', explorable_hypotheses)
26:     end if
27:   end for
28: end while
29: return solution_hypotheses


In practice, the same function is used to verify if H' is complete (step 17), since both steps involve counting the number of times each entity is referenced. If H' is incomplete, it is pushed into the list of explorable nodes (step 25). If H' is complete, a check to see if it is nameable is performed (step 18). If it is not nameable, it is pruned (step 19). If it is nameable, it is a solution node; in this case, its quality value is computed in step 21, and it is pushed into the list of solutions in step 22. This list is sorted, so that the solution with the highest quality value can be extracted first.

The algorithm performs a brute-force search, finding all possible solutions; it is therefore guaranteed to find the solution with the best quality value. There is always at least one solution – the one in which nothing matches. This is the worst case solution, which contains one association per entity; this solution contains only singleton associations. If more than one solution is found, the confidence in the best solution is represented as the difference between the quality value of the best solution and the quality value of the next best solution. This information might be useful for a high-level module; for instance, actions might be taken in order to disambiguate situations in which the confidence in the best solution is low [30, 88].
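The following Python sketch gives a much-simplified picture of this search. It keeps the pruning on invalid associations and doubly-used entities and the average-based quality value, but it uses plain depth-first search rather than the quality-sorted exploration of algorithms 3 and 4, and it omits the naming constraints; all names are illustrative.

def search_hypotheses(associations, entities_of, match_value, all_entities,
                      max_solutions=None):
    """Simplified sketch of the hypothesis search.
    associations : list of association IDs, in increasing order
    entities_of  : maps an association to the set of (row, index) entities it
                   uses (bottom elements excluded)
    match_value  : maps an association to its match value (invalid -> 0)
    all_entities : set of all (row, index) entities that must be covered
    max_solutions: if given, stop after this many solutions (bounded search)"""
    solutions = []

    def expand(start, used, chosen, total):
        if used == all_entities:                      # complete hypothesis
            solutions.append((total / len(chosen), list(chosen)))
            return
        for i in range(start, len(associations)):
            if max_solutions is not None and len(solutions) >= max_solutions:
                return
            a = associations[i]
            value = match_value(a)
            if value == 0:                            # invalid association: prune
                continue
            ents = entities_of(a)
            if used & ents:                           # entity used twice: prune
                continue
            chosen.append(a)
            expand(i + 1, used | ents, chosen, total + value)
            chosen.pop()

    expand(0, frozenset(), [], 0.0)
    solutions.sort(key=lambda s: s[0], reverse=True)  # best quality first
    return solutions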

6.3.4 Bounded Data Association Algorithm

The complexity of the data association problem is inherently problematic. Although the implementation of algorithm 3 performs well when small numbers of objects are considered, once the number of considered associations reaches roughly 125 (e.g. global data association between three robots, each with a single information source which has observed four objects), the time required by the full data association algorithm becomes prohibitively long. In applications which involve large numbers of objects, the exhaustiveness of the search can be sacrificed in favour of processing time.

A bounded version of the search which can accomplish this is presented in algorithm 4. The bounded search returns after a fixed number of solutions have been found (or when all hypotheses have been explored). Although the bounded algorithm is not guaranteed to find the best solution, hypothesis quality values are used to direct the search, such that the best hypothesis so far is always explored first. This heuristic often results in the best solution being found early in the search.

The bounded search described in algorithm 4 differs from the full search presented in algorithm 3 in two ways. First, the quality value of an explorable hypothesis is computed before it is pushed into the list of explorable hypotheses at step 28, which allows the list of explorable hypotheses to be sorted during the push at step 29. Second, steps 23 and 24 cause the search to be halted once the given maximum number of solutions has been reached.


Algorithm 4 Bounded data association algorithm.
Require: entities and associations
Ensure: up to MAX_SOLUTIONS solution hypotheses
 1: explorable_hypotheses = {}
 2: solution_hypotheses = {}
 3: H = {}
 4: push(H, explorable_hypotheses)
 5: while !empty(explorable_hypotheses) do
 6:   H = pop(explorable_hypotheses)
 7:   for all associations A which can be added to H do
 8:     H' = associations(H) ∪ A
 9:     if a_empty_invalid(A) then
10:       prune(H', h_empty_invalid)
11:     else if a_name_invalid(A) then
12:       prune(H', h_name_invalid)
13:     else if compute_match_value(A) < δ then
14:       prune(H', h_value_invalid)
15:     else if h_multiple_invalid(H') then
16:       prune(H', h_multiple_invalid)
17:     else if complete(H') then
18:       if h_unnameable_invalid(H') then
19:         prune(H', h_unnameable_invalid)
20:       else
21:         compute_quality_value(H')
22:         sorted_push(H', solution_hypotheses)
23:         if (count(solution_hypotheses) == MAX_SOLUTIONS) then
24:           return solution_hypotheses
25:         end if
26:       end if
27:     else
28:       compute_quality_value(H')
29:       sorted_push(H', explorable_hypotheses)
30:     end if
31:   end for
32: end while
33: return solution_hypotheses


Two characteristics of the bounded algorithm should be noted. First, non-matching entities will never be erroneously matched. Potential sub-optimality arises solely from the fact that the search can fail to associate entities which do match. This results in either the creation of anchors for objects which do not exist, or the selection of a sub-optimal set of associations. The second thing to note about the bounded algorithm is that if there is a hypothesis with the maximum possible quality value, the bounded version of the algorithm will always find it (or another with the same value, if there is more than one). This is guaranteed since the best hypothesis so far, which is always explored first, will always have the maximum quality value.

Applicability

In general, it makes sense to use the full data association algorithm for local data association, and the bounded algorithm for global data association. There are two main reasons for this.

First, global data association will often involve more associations than local data association. The number of associations considered during local data association is limited by the number of information sources a robot has. The number of associations considered during global data association grows quickly as the number of robots increases. In practice, the full data association algorithm is normally able to address the local data association problem fast enough to make the use of the bounded algorithm unnecessary. The global data association step, on the other hand, can take a long time to complete if the full data association algorithm is used, especially once several robots have created local anchors for a number of objects.

Second, errors in local data association can have long-term and cumulative negative effects. In particular, erroneously created local anchors can negatively affect subsequent local data association iterations. This is because local anchors are persistent, and existing anchors are used during local anchor management. Errors in global data association, on the other hand, can be "undone" during subsequent iterations of the global management steps. This is because previous global anchors are simply discarded when new ones are created.

6.3.5 Data Association Complexity

In order to avoid confusion between exponents and superscripts, the notation used in this section will be simplified. Robot superscripts will be omitted, and it will be assumed that all robots have the same number of information sources, and the same number of local anchors. It will also be assumed that each information source produces the same number of percepts at every frame. These assumptions do not affect the complexity of the problem.

This simplified notation can be summarised as follows. The number of robots will be denoted M, as usual. The number of information sources per robot, normally denoted K^m, will be denoted K. The number of local anchors per robot, normally denoted L^m, will be denoted L.


Finally, the number of percepts produced by a given information source on a given robot, normally denoted J^m_k, will be denoted J.

As was mentioned in section 4.1, the problem of associating all percepts arriving from all information sources across all robots has been decomposed into local and global steps. One reason for this is that it allows local computations to be performed at high frequencies, even when communication channel errors and latencies exist. Another reason for this decomposition is that it reduces the overall complexity of the addressed data association problem. The number of possible associations which each robot would need to consider if the problem were not decomposed is

    ((J + 1)^K)^M.    (6.8)

In the worst case, when nothing matches, it can happen that all possible hypotheses need to be examined. Since the size of the search space is the power set of the set of associations, the worst case complexity of the search would therefore be

    O(2^((J^K)^M)).

With the decomposition, the local data association problem addressed by each robot would require considering at most

    (L + 1)(J + 1)^K    (6.9)

associations. Again, the worst case is when nothing matches: the original local anchors would all still exist, and each percept would result in the creation of a new local anchor, so the number of local anchors used in global data association would be L + J. The number of associations which would need to be considered by each robot in the global data association step, in the worst case, would therefore be

    (L + J + 1)^M.    (6.10)

So the worst case complexity of the decomposed search, including both local and global data association, would be

    O(2^(L J^K) + 2^((L + J)^M)).

Note that the worst case complexity is the same for both the full and bounded versions of the search. This is because in the worst case there is only one solution – the case in which nothing matches. This solution will always have the lowest possible match value, and in the bounded search it will be explored last. The best case complexity, however, will be better for the bounded search.


The full search will always have to find all solutions, including the worst case solution in which nothing matches. The bounded search can stop at the first match, if the maximum number of solutions is set to 1.

In practice, the search space is pruned very aggressively, by eliminating both invalid hypotheses and invalid associations. Invalid associations can be used to quickly eliminate large numbers of hypotheses. The space which is actually searched is therefore considerably smaller than the worst case numbers given above. The bounded version of the search reduces the search space even more, and it usually results in a significant reduction in execution time.

Numeric Example

To illustrate both the significance of the problem's complexity and the computational reduction achieved by decomposing the problem, imagine a scenario involving M = 4 robots, with K = 3 information sources each, which, for a given frame, each produce J = 2 percepts. In total, the scenario would involve MKJ = 24 percepts for a given frame.

Without the decomposition, each robot would have to send 6 percepts to the other robots, and receive 18. The input table of entities for the search algorithm would have 12 rows (one for each information source), and each row would contain 3 entries, including the bottom element. The search would therefore need to consider 531441 associations, per equation 6.8.

With the decomposition, and assuming that there were no pre-existing local anchors initially, the local data association step would require that at most 27 associations be considered, per equation 6.9. Each robot would then need to send 6 local anchors to the other robots, and receive 18. Again, this assumes the worst case, in which no entities matched. The global data association step would then require considering at most 81 associations, per equation 6.10.

The decomposition thus reduces the problem from one large search in a space of 2^531441 hypotheses to two searches in spaces of 2^27 and 2^81 hypotheses. Although the search space is still intimidatingly large, the decomposition provides a drastic complexity reduction. This is true even if several local anchors are maintained before the local data association step begins. For the example scenario, the decomposed search becomes as large as the non-decomposed search only if 24 or more local anchors are maintained by each robot – and this also assumes that none of the local anchors match any of the arriving 24 percepts.

The cost of this decrease in complexity is that a percept matched locally in one robot might have matched better with a percept from another robot. The decomposition can therefore result in sub-optimal matching, from a global perspective. However, the matching operator always ensures that entities associated with each other match at least as well as the match threshold δ. So sub-optimal data association should only arise in situations where objects are nearly indistinguishable with respect to the given matching criteria. These aspects should be kept in mind when choosing both the matching operator and the matching threshold δ.
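The figures in this example follow directly from equations (6.8)–(6.10); as a quick check, in Python:

# Association counts for the numeric example (M = 4 robots, K = 3 sources,
# J = 2 percepts per source, L = 0 pre-existing local anchors)
M, K, J, L = 4, 3, 2, 0

without_decomposition = ((J + 1) ** K) ** M        # eq. (6.8)
local_step            = (L + 1) * (J + 1) ** K     # eq. (6.9)
global_step           = (L + J + 1) ** M           # eq. (6.10)

print(without_decomposition, local_step, global_step)   # 531441 27 81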


6.4 Information Fusion

6.4.1 Local Information Fusion

The output of the local data association step is a set of associations, each containing 0 or more percepts and exactly one local anchor, which may be a newly created (and hence empty) anchor. The local fusion step involves intersecting the percepts and the local anchor contained in each association A. The result is stored back into the local anchor, as follows:

    ψ^m_l = ψ^m_l ⊗ z^m_1[j] ⊗ z^m_2[j] ⊗ . . . ⊗ z^m_{K^m}[j].    (6.11)

In the above equation, z^m_k[j] is the percept from robot m's information source s^m_k which is in association A; note that this may be the (empty) bottom element ⊥[0], as discussed in section 6.3. Bottom elements contain no information, and can be ignored during the fusion step. The operator ⊗ denotes the t-norm used; the reinforcing product operator is used here. The implementation of this fusion operation can be performed using a single scan of the grids in each domain. For the position domain, the cells from each input entity are combined using the product. For the colour domain, the hue trapezoids in each cell of the 2D grid are intersected using the product, and the outer envelope of this intersection is used as the result.

Recall from section 4.3 that an anchor's name is never affected by the fusion process: if an anchor was named, it keeps the same name; if it was unnamed, it remains unnamed.

Despite the fact that the information being fused has been determined to match to a degree of at least the match threshold δ, it can happen that in a particular domain, the result of the fusion is a fuzzy set in which no values are fully possible; the maximum value may in fact be as low as δ. In such cases the fuzzy set in that domain is normalised, by shifting all values up until some cells have possibility values of 1, as discussed in section 5.2.3. Again, this normalisation increases the bias value of the overall estimate, indicating that to some degree any position is possible. As in self-localisation, if the normalisation causes the bias of the fuzzy set to exceed the revision threshold ζ, it is assumed that the object has been moved in a way which makes it inconsistent with previous estimates. In this case, a belief revision is triggered, meaning the information in the local anchor ψ^m_l is replaced with the information from the fused percepts only.

Note that previous information contained in the local anchor is considered in the local fusion step. In order for this information to be relevant at various points in time, prediction must be performed, as discussed in section 6.5.
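A minimal sketch of the fusion and normalisation in the position domain might look as follows (Python; the function and parameter names are invented, and the colour domain, which works on trapezoids rather than scalar cells, is omitted):

def fuse_position_grids(grids, revision_threshold):
    """Sketch of the fusion step for the position domain: cell-wise product
    (reinforcing t-norm) of the anchor grid and the matched percept grids,
    followed by the shift-based normalisation described in section 5.2.3.
    Returns the fused grid and a flag telling the caller whether the bias has
    exceeded the revision threshold (i.e. whether a belief revision is due).
    grids: list of dicts mapping cell -> possibility value in [0, 1]."""
    fused = {}
    for cell in grids[0]:
        value = 1.0
        for g in grids:
            value *= g.get(cell, 0.0)          # product t-norm
        fused[cell] = value
    shift = 1.0 - max(fused.values())          # normalise: best cell -> 1.0
    if shift > 0.0:
        fused = {c: v + shift for c, v in fused.items()}
    bias = min(fused.values())
    return fused, bias > revision_threshold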


6.4.2 Global Information Fusion

The global data association step also returns a set of associations, each containing a number of local anchors. The global fusion step involves intersecting the local anchors in each association A. The result is stored in a new global anchor, as follows:

$$\omega^m_g = \psi^1_l \otimes \psi^2_l \otimes \dots \otimes \psi^M_l, \qquad (6.12)$$

where ψ^m_l is the local anchor from robot m which is in association A. The chosen anchor may be the (empty) bottom element ⊥[0]. Again, bottom elements contain no information, and can be ignored during the fusion step. For ⊗ the reinforcing product operator is used. As in local information fusion, the operation can be performed using a single scan of the grids used to represent information in each domain.

Recall that unlike in local information fusion, where names were not passed on from percepts to local anchors, local anchor names are transferred to global anchors, since these are not re-used by the anchoring framework.

Since a reinforcing operator is used, it is important to avoid circular dependencies which could cause the same information to be considered more than once. This is why the result of the global fusion is stored separately, in a global anchor, and not back in local anchors. Otherwise, a local anchor containing information from robot m might be sent back to robot m and treated as new information.

As in local information fusion, normalisation is applied if the result of the fusion in a particular domain contains no fully possible values. Note that no belief revision is possible, however, since previous information is not considered in the global fusion step: global anchors are always created using only currently available local anchors.

6.4.3 Approximating Local Anchors

In this section, three approximations of the 2D position grids used in local anchors are presented. These approximations reduce the amount of bandwidth needed to exchange position domain information between robots. Such approximations can be useful when bandwidth is limited. There is a trade-off between accuracy and reduced bandwidth, as will be shown in chapter 7. Note that similar approximations could be used for the colour domain; only position domain approximations are discussed here.

Resolution Reduction

Normally, robots exchange full 2D position grids. If these grids have N cells, then the (uncompressed) size of each grid is BN bits, where B is the number of bits used to represent the possibility value in each cell. One simple way to approximate the grids is to use a coarser resolution to represent possibility values. Typically, 1 byte per cell is used. Reducing the number of bits per cell results in a linear reduction in the bandwidth needed to send each grid.

A related approximation, which has not been implemented, could be to reduce the spatial resolution of the grid – i.e., reduce the number of transmitted cells. Inspiration for such an approximation could be drawn from scaling operations performed in image processing.

Single Bounding Box

Another implemented approximation converts 2D grids into a bounding box, which is a very simple uni-modal parametric representation. The bounding box is computed so that it contains all cells which have possibility values greater than some threshold, for instance 0.8. In addition to the four corners of this bounding box, the bias value of the grid can also be sent, which provides a measure of reliability. All cells inside the bounding box are interpreted as having possibility values of 1; all cells outside the bounding box have the bias value. As an example, if 4 bytes are used to represent each corner of the bounding box and 1 byte is used to represent the bias value of the grid, the approximated 2D grid can be represented using only 17 bytes. Most importantly, the size of the bounding box representation does not depend on the size or resolution of the grid.

Double Bounding Box

A second parametric approximation consists of two bounding boxes, one at a higher threshold (e.g. 0.8) and one at a lower threshold (e.g. 0.2). Again, the bias value of the grid can also be sent as a measure of reliability. Together the bounding boxes describe the core and support of a 2D trapezoid, or pyramidal frustum, as described in section 5.2.2. If 4 bytes are used to represent each box corner, and 1 byte is used to represent the grid bias, then this approximation allows a 2D grid to be represented using only 33 bytes – again, regardless of the size of the grid.
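A minimal Python sketch of these two parametric approximations is given below. The names are illustrative; it works in cell indices rather than metric corner coordinates, and assumes the grid is a NumPy array of possibility values in [0, 1].

    import numpy as np

    def bounding_box(grid, threshold):
        """Smallest box (row_min, col_min, row_max, col_max) enclosing all
        cells whose possibility value exceeds the given threshold."""
        rows, cols = np.nonzero(grid > threshold)
        if rows.size == 0:
            return None
        return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())

    def approximate_grid(grid, high=0.8, low=0.2, double=False):
        """Single (1BB) or double (2BB) bounding-box approximation of a grid;
        the grid bias (its minimum value) is sent along as a reliability measure."""
        bias = float(grid.min())
        if not double:
            return {"core": bounding_box(grid, high), "bias": bias}
        return {"core": bounding_box(grid, high),
                "support": bounding_box(grid, low),
                "bias": bias}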

6.4.4 Information Fusion Complexity

The computational complexity of both the local and global fusion steps, for one object and one domain, is O(NM), where N is the number of cells in the grid used for that domain and M is the number of entities being fused. Note that both local and global information fusion computations are carried out by each robot. Since the cost is linear in both the number of robots and in the number of entities to be fused, both the local and global fusion steps scale quite well.


6.5 Prediction

6.5.1 Local Prediction

After local anchors are created, their properties may need to be predicted in order for their contents to remain relevant at various points in time. Properties are updated using fusion whenever matching percepts are received. However, since these percepts may arrive infrequently, and since previous values are used during local anchor management, prediction is needed in order to keep local anchors as reliable as possible between these updates. Recall that in this work it is assumed that only the position domain is dynamic; colour information is not predicted. Local prediction is applied to local anchors which have not been updated in some time. Prediction is applied both to locally created anchors and to local anchors received from other robots.

Each local anchor has two timestamps: a "received" timestamp, which is set to the time when the last percept used to update the anchor was received; and an "updated" timestamp, which is set to the time when the anchor was last updated, either by prediction, or by fusion with an arriving percept (in which case the updated and received timestamps are the same). The updated timestamp is used to check if the anchor needs to be predicted; if this timestamp is old enough, prediction is applied. The difference between the received and updated timestamps is also a useful measure, which indicates how much the anchor's contents are based on received information, as opposed to prediction. If the received timestamp has not been updated in a while, the information contained in the anchor is likely to be less reliable. In some situations, higher layers might choose to improve the reliability of object property estimates by initiating perceptual actions. A gaze control strategy based on this principle was described by Saffiotti and LeBlanc [140].

Two different types of prediction are used; both are applied at a fixed period. The first is a blurring operation, implemented using a fuzzy morphological dilation operation [21]; this is similar to the blurring performed on the self-localisation grid, as discussed in section 6.1, except that orientation can be ignored. This blurring implements an extremely simple prediction model; it simply predicts that objects can move in any direction at a given maximum velocity. The blurring amount is computed using an object's estimated maximum velocity and the time since the anchor was updated. The second type of prediction is called evaporation, and it involves increasing the bias value of position grids by a small amount. The minimum value in the grid is computed, and each cell is assigned a value which is greater than or equal to this minimum value plus the evaporation amount. This prediction aims to account for the possibility that an object has suddenly been moved, at a speed greater than the estimated maximum velocity.
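The sketch below illustrates both operations in Python. It is illustrative only: it uses a crisp maximum filter as a stand-in for the fuzzy morphological dilation of [21], and the cell size, maximum velocity and evaporation amount are hypothetical parameters.

    import numpy as np
    from scipy.ndimage import maximum_filter

    def blur(grid, max_velocity, dt, cell_size):
        """Dilation-style blur: each cell takes the maximum over a square
        neighbourhood covering the distance the object may have moved."""
        radius = int(np.ceil(max_velocity * dt / cell_size))
        return maximum_filter(grid, size=2 * radius + 1)

    def evaporate(grid, amount):
        """Raise every cell to at least the grid minimum plus the evaporation
        amount, capped at 1, so the bias of the estimate slowly increases."""
        return np.maximum(grid, min(grid.min() + amount, 1.0))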


6.5.2 Global Prediction

Global anchors are not predicted; instead, they are simply recomputed when needed, via the data association and fusion steps discussed previously. If a local anchor used to create a global anchor has been subjected to prediction, this will be reflected in the corresponding global anchor.

6.5.3 Anchor Deletion

Anchor deletion is a very complex issue, as discussed in section 4.3.5. No automatic anchor deletion mechanisms have been implemented; only manual deletion is possible, for both locally created and received local anchors. A user of the anchoring framework is required to explicitly trigger deletion of local anchors. Global anchor deletion is not implemented, since global anchors are in any case discarded before each global anchor management iteration. If desired, the creation of new global anchors can be triggered after local anchors have been deleted.

6.6 Illustration

In this section, a simple example is used to illustrate the data association and information fusion steps used in both local and global anchor management. Prediction and names are not considered in the example. The scenario involves two robots dealing with information arriving during a single frame.

6.6.1 Robot 1: Local Anchor Management

Local Entities

Figure 6.5 shows the percepts received by robot 1 during the given frame. The figure presents the percepts in a table of entities, as in section 6.3.1. Initially, robot 1 had no local anchors, so the only entity in the first row is the bottom element.

Robot 1's first information source s^1_1 was an RFID reader, which produced one percept z^1_1[1] during the considered frame. This percept contained both symbolic {in-region} position domain information (e.g. the object is on the table), and symbolic {is-colour} information in the colour domain (the object is cyan). Robot 1's second information source s^1_2 was a vision system, which also produced one percept during the frame, denoted z^1_2[1]. This percept was an object observation, containing {observed-range-bearing} position domain information, and {observed-colour} colour domain information; information in both domains was numeric.


Local Associations

There are four possible local associations, shown in table 6.5. Recall that each local association consists of one entity from each row in the table of entities shown in figure 6.5. The match value for association 0 is 0.0, since it is a_empty_invalid. Associations 1 and 2 are singleton associations, containing only one entity each; they have match values equal to the matching threshold δ, which is 0.6. The match value for association 3 is 0.0, since the two percepts contained in that association do not match; the matching operation fails in both the position and colour domains. This makes the association a_value_invalid.

Local Data Association

The search performed by robot 1 using algorithm 3 is summarised in figure 6.6. Each node in the tree is a hypothesis, labelled with the associations it contains. The number of associations contained in a given hypothesis is equal to the depth of the node in the tree. The yellow diamond shows the only h_empty_invalid hypothesis, which contains association 0 (the only a_empty_invalid association). Orange hexagons are used to denote h_value_invalid hypotheses, which contain a_value_invalid associations. The bright green box shows the best solution hypothesis; its value is shown in the corresponding dotted node. Leaves in the tree are highlighted using additional contours. Recall that a leaf is any hypothesis that contains the association with the highest ID – in this case, association 3.

Note that hypotheses are pruned according to the first pruning criterion they meet, per algorithm 3. The earlier in the algorithm a hypothesis is pruned, the sooner it and its children are excluded from the search space. For example, hypotheses {1, 3} and {2, 3} are both pruned because they are h_value_invalid. Both hypotheses are also h_multiple_invalid; however, no check for this was needed. Of a possible 16 hypotheses, 8 were explored.

The only valid solution hypothesis is the so-called worst case hypothesis, in which nothing matches. It includes associations 1 and 2, which are both singleton associations. The quality value of a hypothesis is the average of the match values of the associations it contains, so the solution hypothesis has a quality value of 0.6. The entities included in each of the two associations contained in the solution hypothesis are shown using dashed red and green boxes in figure 6.5. The finely dashed red boxes indicate the entities contained in association 1; the dashed green boxes indicate the entities contained in association 2. Note that the bottom element in row 0 is considered in both associations. Since neither association contains a local anchor, two new empty local anchors are created, one for each association.


Figure 6.5: The table of entities used by robot 1 for local data association.

Table 6.5: Local associations for robot 1.

association ID   row 0   row 1   row 2   match value
0                0       0       0       0.0000
1                0       0       1       0.6000
2                0       1       0       0.6000
3                0       1       1       0.0000


Figure 6.6: The local data association search performed by robot 1.


Figure 6.7: The table of entities used by robot 2 for local data association.

Table 6.6: Local associations for robot 2.

association ID   row 0   row 1   match value
0                0       0       0.0000
1                0       1       0.6000
2                1       0       0.6000
3                1       1       1.0000


Figure 6.8: The local data association search performed by robot 2.


Local Information Fusion

The local information fusion step involves fusing the entities from each association contained in the solution hypothesis. For singleton associations, this step is trivial, since singleton associations contain only one non-empty entity each. The result of the fusion is therefore just a copy of this entity. Both of the associations in the solution hypothesis include a newly created (empty) local anchor and one percept. After the fusion step, the local anchor is a copy of the percept. The local anchors created by robot 1 are shown in the first row of figure 6.9.

6.6.2 Robot 2: Local Anchor Management

Local Entities

Figure 6.7 shows the local entities available to robot 2 for local anchor management. Initially, robot 2 had only one local anchor, shown in the first row of the figure. This anchor contained both position and colour information. Robot 2 also had one information source s^2_1, which was a vision system. One percept z^2_1[1] was produced during the frame; the percept contained numeric {near-position} information in the position domain (e.g. the object is near the door), as well as symbolic {is-colour} information in the colour domain (e.g. the object is red).

Local Associations

There are again four possible local associations, shown in table 6.6. Each local association consists of one entity from each row in the table of entities shown in figure 6.7. The match values for associations 0 to 2 are the same as for robot 1. Again, association 0 is a_empty_invalid, and associations 1 and 2 are singleton associations. The match value for association 3 is 1.0, since the two entities in that association match perfectly.

Local Data Association

The search performed by robot 2 using algorithm 3 is summarised in figure 6.8. Again, the yellow diamond indicates the h_empty_invalid hypothesis. Pink octagons indicate h_multiple_invalid hypotheses. The bright green box again shows the chosen solution hypothesis. The pale green parallelogram shows another possible solution, which had a lower quality value. Quality values for both solutions are shown in corresponding dotted nodes. Additional contours are used to highlight leaf nodes. Note that hypotheses {1, 3} and {2, 3}, which were pruned because they were h_value_invalid in figure 6.6, are pruned here because they are h_multiple_invalid.


Since all contained associations were valid in this case, the search algorithm had to consider these hypotheses more carefully. Of a possible 16 hypotheses, 8 were explored.

The chosen solution includes only one association, which contains both robot 2's local anchor and the received percept z^2_1[1]. The match value for the association is 1.0; this is also the quality value of the hypothesis. The entities included in association 3 are indicated using the finely dashed red boxes in figure 6.7. Note that no bottom elements are included in this association. No new anchors need to be created, since the association already contains a local anchor.

Local Information Fusion

The local information fusion step involves fusing the two entities included in association 3, which was the only association contained in the solution hypothesis. Entities are fused by taking the intersection of the fuzzy sets in each domain. The resulting local anchor is shown in the second row of figure 6.9.

6.6.3 Global Anchor Management

Global Entities

Figure 6.9 shows the local anchors used for global data association. It is assumed that the robots successfully exchanged their local anchors, so the global anchor management steps performed by each robot are identical. After local anchor management, robot 1 had two local anchors and robot 2 had one. All anchors contained both position and colour information.

Global Associations

There are six global associations, shown in table 6.7. Each global association consists of one entity from each row in the table of entities shown in figure 6.9. The match values for associations 0 to 2 are the same as in the local data association steps for both robots. Once again, association 0 is a_empty_invalid, and associations 1 and 2 are singleton associations. The match value for association 3 is 0.0, since the local anchors contained in the association do not match. Association 4 is another singleton association, with a match value of 0.6. The local anchors contained in association 5 match, but not fully; the match value for this association is 0.8333.

Global Data Association

The search performed for global data association using algorithm 3 is summarised in figure 6.10. Once again the yellow diamond is used to indicate the h_empty_invalid hypothesis; orange hexagons are used to indicate hypotheses which are h_value_invalid; and the pink octagons indicate h_multiple_invalid hypotheses.


Figure 6.9: Global matching example.

Table 6.7: Associations for global matching.

association ID   row 0   row 1   match value
0                0       0       0.0000
1                0       1       0.6000
2                1       0       0.6000
3                1       1       0.0000
4                2       0       0.6000
5                2       1       0.8333



Figure 6.10: Global matching example search.

Figure 6.11: Global anchors resulting from global information fusion.


The blue triangle is an incomplete leaf hypothesis. Again, the bright green box shows the chosen solution hypothesis, and the pale green parallelogram shows a solution which had a lower quality value. Quality values for both solutions are shown in the dotted nodes. Of a possible 64 hypotheses, 20 were explored.

The chosen solution includes two associations: one singleton association, and one association which contains two local anchors. The average match value for this hypothesis is roughly 0.7. The entities included in each association are shown using dashed red and green boxes in figure 6.9. The finely dashed red boxes indicate the entities contained in association 5; the dashed green boxes indicate the entities in association 2.

Global Information Fusion

The global information fusion step involves fusing the local anchors from each association in the solution hypothesis. A new global anchor is created for each association, and all previous global anchors are discarded. The resulting global anchors are shown in figure 6.11. In this case, the global anchors are the same for both robots, since it is assumed that the considered local anchors were identical. Note that since the entities in association 5 did not fully match in the position domain, normalisation was applied as discussed in section 6.4.2. This can be seen in the grey background of the position grid of the corresponding global anchor.

6.7 Summary

In this chapter, implementations of the processes used in the proposed anchoring framework were described. First, two self-localisation approaches were presented; self-localisation is an important input to certain conceptual sensor models used by the implemented framework. A detailed description of one particularly important conceptual sensor model, that used to process numeric {observed-range-bearing} information, was also given. This conceptual sensor model performs a coordinate transformation between local polar coordinates and global Cartesian coordinates in the position domain, taking uncertainty from both self-localisation and observations into account. This conceptual sensor model is the core of the proposed object localisation approach.

Details regarding the processes used in local and global anchor management were also given. In particular, the implemented data association algorithm was described in detail, and a bounded version of the algorithm was proposed. It is important to observe that the inherent complexity of the data association problem is a significant challenge. Any data association algorithm which considers large numbers of objects must either approximate the search space, sacrificing global optimality, or be able to exploit application-dependent assumptions which reduce the complexity of the search.


Information fusion and prediction processes were also described. These processes scale well, and are easy to implement given the information representation choices described in chapter 5. The implemented approach to prediction is simple, but it is sufficient to illustrate how prediction fits into the proposed framework. An example of the data association and fusion steps was also provided. The various processes described in this chapter demonstrate how local and global anchors, represented as 2D and 2.5D grids, are created and maintained over time.

Chapter 7

Cooperative Object Localisation Experiments

The implementation of the proposed anchoring framework uses the intersection of fuzzy sets to fuse information obtained from different sources. In particular, the framework allows robots to exchange and fuse uncertain position information about observed objects; this is essentially cooperative object localisation. One of the key features of the proposed approach is that the information to be fused is spatially aligned using a coordinate transformation which takes both observation and self-localisation uncertainty into account; this coordinate transformation was described in section 6.2.

This chapter reports the results of experiments which evaluate the proposed approach to cooperative object localisation. The experiments provide an empirical analysis of the implementation of one of the key components of the proposed anchoring framework; namely, the information fusion component. The analysis is particularly relevant since it examines the performance of the fusion component in the face of a complex yet well-studied problem. The experiments are carried out using three robots, which exchange information about the location of a uniquely identifiable and static ball. No filtering is applied to the input data; the fact that the ball is uniquely identifiable trivialises data association; the fact that it is static allows prediction to be ignored. These aspects allow the fusion component to be examined in isolation.

The presented experiments were performed with three goals in mind. The first goal was to empirically validate the approach to cooperative object localisation. The second goal was to assess the performance of the approach given various types of input errors; such an assessment allows an informed decision to be taken regarding whether the method is suitable for a given application. The third goal was to evaluate the degradation in performance introduced by various approximations of the exchanged position information; such an evaluation is useful for applications with strict bandwidth limitations.


7.1 Methodology

Initial evaluations of the proposed method showed widely varying results from one experimental setup to another; the experimental setup in this context includes the platforms, sensors and experimental conditions (e.g. lighting). This prompted a more systematic analysis of the method, the results of which are presented here. The analysis is intended to describe an input-error landscape, which shows how the performance of the method varies as different types and amounts of errors are introduced on its inputs. The considered input variables include the self-localisation estimate, as well as the range and bearing values of object observations.

The systematic analysis is achieved empirically, by introducing increasingly large amounts of artificial errors on each input variable independently. Errors in object observations are introduced by corrupting measured range and bearing values. Errors in self-localisation are introduced by corrupting landmark observations. Three types of input errors are examined: systematic errors, random noise, and false positives.

Artificially corrupted observations are based on real data, recorded in real time using a number of experimental layouts. The observations are first idealised offline; in other words, range and bearing measurements to both landmarks and target objects are set to reflect the ground truth. Various types and amounts of artificial input errors are then introduced. For this systematic analysis, the robots and the target object are static; recall that this allows the performance of the fusion process to be examined in isolation.

In addition to the systematic analysis, one other experiment is presented, which uses unaltered real data, in a scenario where one robot is moving. The results of this experiment are consistent with the ones obtained using the systematic analysis; they reflect the performance of the proposed method at one specific point on the input-error landscape.

7.2 Experimental setup

7.2.1 Robots

The robots used are Sony AIBO ERS-210A robots [159]; see figure 7.1. These robots were used in earlier editions of the RoboCup competition [148]. Each robot has 32MB of SDRAM and a 64-bit RISC processor with a clock speed of 384MHz. The robot's main sensor is a 100000 pixel CMOS camera, mounted on the robot's head. The robots communicate via wireless ethernet. Since the robots use legs instead of wheels, odometry is particularly unreliable, mainly due to unpredictable slippage.

The most serious errors in perception, for both landmark and target observations, occur because:


Figure 7.1: AIBO robot.

• range estimates are based on the size of objects in the camera image, hence accuracy crucially depends on lighting conditions; also, the precision of range estimates quickly decreases with distance;

• bearing precision is limited due to uncertainty in the position of the camera; in particular, the pan joint position estimate often contains errors;

• false positives and false negatives are relatively frequent, due to errors in colour segmentation caused by the camera's low resolution and high sensitivity to lighting.

These errors do not affect the systematic analysis, since in this analysis the sources of error are artificial; in these experiments, input error types are isolated, rather than their sources. However, these sources of error should be kept in mind for the final experiment which uses unaltered data.

7.2.2 Environment

The environment is an area of approximately 3 × 5 metres, with eight unique landmarks. The setup is based on one of the playing fields used in earlier editions of the RoboCup competition. All objects of interest are colour coded; this allows the data association problem to be simplified. A photo of the setup is shown in figure 7.2. For all grids, a spatial resolution of 100mm is used; the precision of the method is limited by this choice. It has been verified that using finer resolutions does not significantly affect performance. Given the size of the environment, this means that there are approximately 30 × 50 cells in the position grids used to describe object position estimates. Using a maximum resolution of 8 bits per cell, a full position grid has a size of 1.5kB.

In all the experiments, there are three robots and one static ball, which is the only target object. In all runs, the robots use the gaze control strategy described by Saffiotti and LeBlanc [140] to keep both the target and landmarks under observation.


Figure 7.2: Experimental environment.

Three experimental layouts are used, as shown in figure 7.3. Other layouts were tested, but the results did not vary significantly from one layout to another. The first two layouts, in which the robots are static, are used for the systematic analysis. The third layout is used for the experiment with unaltered data; in this case there is one moving robot. The approximate path taken by the robot is shown by the dotted line.

7.2.3 Ground truth

The true positions of static robots and objects are measured for each experimental layout. In layout 3, the moving robot’s pose is determined using a Polhemus Fastrak 6D magnetic position tracker [125] mounted on the robot’s back. The reference point for this tracker can be seen in figure 7.2. The accuracy of the tracker degrades quite quickly as the distance from the reference point increases. Error estimates are shown in figure 7.4. However, in the relevant experiment, the moving robot is never farther than 500mm away from the reference point. At this distance, the tracker error is always less than 15mm.

7.2.4 Performance Metrics

The performance metric used for all experiments is the distance (in mm) between the fused object position estimate, which is based on information from all three robots, and the ground truth position of the target object, which is known in advance for each layout.


Figure 7.3: Experimental layouts used in the experiments.

7.2.5 Software Setup

The architecture used on the robots is a modular layered architecture, loosely based on the Thinking Cap software architecture [137]. Originally, everything was implemented within this architecture and run on-board the AIBO robots. For the experiments presented here, the software is separated into two parts: an on-board part and an off-board part. The on-board and off-board modules communicate via wireless ethernet.

The on-board software includes modules for perception and motion control. The perception module performs object recognition based on colour segmentation [155], and for each observed object it returns its identity together with an observed range and bearing (ρ, θ). The perception module also takes care of gaze control, determining where and when to look for landmarks and the target object [140]. The motion control module sends motion commands to the low-level controller of the robot, and produces motion estimates based on odometry. Recall that odometric information is typically very poor for the legged AIBO robots.

The off-board software consists of a customisable tool called the anchoring monitor, which implements the multi-robot object localisation method; the tool allows data received from all robots to be logged, processed and analysed in a number of ways. Motion updates, landmark observations, and target object observations are logged by the tool. These logs are used in all the presented experiments. For the experiment using unaltered data, the tool is used only to log, process, and analyse the data. For the systematic analysis using artificially introduced errors, the tool is used to log, idealise, artificially corrupt, process, and analyse the data. The idealisation step involves recomputing the range and bearing components of logged landmark and target observations, so that they correspond to ground truth information.
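A minimal Python sketch of this idealisation step is shown below. It is illustrative only; the pose representation and angle conventions are assumptions, not taken from the thesis.

    import math

    def idealise_observation(robot_pose, object_xy):
        """Recompute the ideal (range in mm, bearing in degrees) of an object
        from the ground-truth robot pose (x, y, heading in degrees)."""
        rx, ry, heading = robot_pose
        ox, oy = object_xy
        rng = math.hypot(ox - rx, oy - ry)
        bearing = math.degrees(math.atan2(oy - ry, ox - rx)) - heading
        bearing = (bearing + 180.0) % 360.0 - 180.0   # wrap to [-180, 180)
        return rng, bearing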


Figure 7.4: Plot of the 6D tracker error versus the distance from the reference point. The dots indicate raw readings, and the line is a trend line computed using regression.

The artificial corruption step applies the various types of input errors discussed previously to the data. Event ordering and timing are preserved. Self-localisation estimates are created and maintained using the landmark-based approach described in section 6.1.2.

In the presented experiments, it is assumed that global anchors are constantly being requested. This means global anchor management steps, and global information fusion in particular, are performed for every frame. In the off-board implementation used for the analysis, this is easily achieved, since the logged data are processed from within the tool, which maintains the estimates for all robots within the same process. In the on-board implementation, this would require that enough bandwidth and computation be available to transmit and process the grids as often as target observations are made. For the case with three robots observing one ball, this is feasible, given the bandwidth available from wireless ethernet and the available processing power on the robots. Note that the software tool used here is also used to examine the entire anchoring framework, as discussed in chapter 8.


7.3 Evaluated Methods

Results have been computed and compared for eight different methods:

• 8BPC-exact: the proposed fuzzy fusion method, using the exact coordinate transformation algorithm 1, and eight bits per cell to represent possibility values;

• 8BPC: same as previous, but using the approximate coordinate transformation algorithm described in section 6.2.3;

• 4BPC: same as previous, but using four bits per cell;

• 2BPC: same as previous, but using two bits per cell;

• 2BB: same as previous, but using the two bounding box approximation for the exchanged position grids described in section 6.4.3;

• 1BB: same as previous, but using the single bounding box approximation;

• IWAVG: an ideally weighted average, used as a reference method; and

• AVG: a non-weighted average, used as a reference method.

As mentioned previously, the 8BPC-exact and 8BPC methods produced very similar results; the 4BPC and 2BPC methods also performed similarly. To keep the figures readable, and to avoid confusion between methods which performed similarly, results achieved using the following six methods are presented: 8BPC, 2BPC, 2BB, 1BB, IWAVG, and AVG.

The IWAVG and AVG methods are the reference methods with which the proposed approach is compared. When evaluating robotic systems, selecting reference methods is always a delicate issue. Rarely do reference methods reflect the latest and best alternative approaches, since the implementations of these are often difficult to achieve. Also, reference methods are typically not implemented with as much care and expertise as the methods under test. In the presented experiments, instead of a single "straw man" reference method, two relatively objective reference methods are used; the selected reference methods produce upper and lower bounds on the results achievable by all averaging-based approaches to cooperative object localisation. Averaging-based methods encompass an important subset of object localisation methods, as discussed in section 2.2.6. In particular, since a static target is used, methods based on Kalman filtering would essentially compute a weighted average of observations.

Recall that neither the proposed method nor the reference methods use filtering or prediction in these experiments; previous state is ignored, and results are computed by fusing estimates produced by all three robots at each time step. If no observation is made during a particular time step, the last observation is used.


For both reference methods, target object position estimates are computed as follows. Range-bearing observations made by each robot are represented as 2D position grids, using the approximate coordinate transformation described in section 6.2.3; this is the same transformation used by all methods under test. Note that for the reference methods, the transformation takes only self-localisation uncertainty into account, since target observations are crisp. The centre of gravity of the resulting grids is then computed per equation 5.3. The results from the three robots are then averaged in x and y.

The lower bound reference method uses a simple non-weighted average (AVG), which is the least informed way to perform averaging. As an upper bound, an "ideally weighted average" (IWAVG) is used; in this case, weights are computed using ground truth information. Note that this method represents an upper bound on the performance achievable by averaging-based approaches; it is not an absolute upper bound on the performance of any alternative method. Also, it is important to keep in mind that the IWAVG method produces results which could not be reproduced in a real system, since robots obviously do not have access to the ground truth information used to compute the idealised weights.

The weights in the IWAVG method are computed as follows. Consider M robots observing a target, where robot m observes the target at p_m. The fused estimate p obtained by IWAVG is given by

$$p = \frac{\sum_{m=1}^{M} w_m \cdot p_m}{\sum_{m=1}^{M} w_m}. \qquad (7.1)$$

Each weight w_m is computed by

$$w_m = \frac{\|p_m q\|^{-1}}{\sum_{i=1}^{M} \|p_i q\|^{-1}}, \qquad (7.2)$$

where q is the ground truth position of the target, and ‖p_m q‖ denotes the distance between p_m and q.
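A minimal Python sketch of the two reference computations is given below (illustrative names; the per-robot estimates are the centre-of-gravity positions mentioned above, and the ground-truth position is only available to IWAVG, which is why IWAVG cannot be realised on a real system).

    import numpy as np

    def avg_estimate(estimates):
        """Lower bound (AVG): plain average of the per-robot (x, y) estimates."""
        return np.mean(np.asarray(estimates, dtype=float), axis=0)

    def iwavg_estimate(estimates, ground_truth):
        """Upper bound (IWAVG): average weighted by inverse distance to truth."""
        p = np.asarray(estimates, dtype=float)
        q = np.asarray(ground_truth, dtype=float)
        inv_dist = 1.0 / np.linalg.norm(p - q, axis=1)
        weights = inv_dist / inv_dist.sum()            # equation (7.2)
        return (weights[:, None] * p).sum(axis=0)      # equation (7.1)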

7.4 Exploring The Input-Error Landscape

To explore the input-error landscape, idealised data were corrupted using different types of artificially introduced input errors; these span the different axes of the landscape. First, 30 to 60 seconds worth of data were logged for a number of runs of layouts 1 and 2 from figure 7.3. These logs contain slightly different event ordering and timing of landmark and target observations. Algorithm 5 was then run on the log files.

The types of input errors considered at step 4 were: systematic errors and random noise on both range and bearing measurements to the ball; false ball detections; and errors in self-localisation. Self-localisation errors were created by introducing errors of all types on landmark observations.


Algorithm 5 Systematic analysis of the input-error landscape.
Require: Set of all log files obtained using layouts 1 and 2
Ensure: Statistics about the input-error landscape
 1: for all logfile F do
 2:     Ideal(F) ← create idealised data from F
 3: end for
 4: for all type T of artificial input errors do
 5:     for i = 0 to 20 do
 6:         for all logfile F do
 7:             Corrupted(F) ← corrupt Ideal(F) data with errors of type T
 8:             Result(F) ← process Corrupted(F) logfile
 9:         end for
10:         Compute statistics for this run of all logfiles
11:     end for
12:     Compute overall statistics for errors of type T
13: end for

Note that the data for each type of input error was corrupted and analysed 20 times in step 5. For runs with only systematic errors (no random noise or false positives), the results do not vary from one iteration to the next; these results only needed to be computed once.

Data corruption (step 7) was performed as follows. In the used experimental setup, range estimates are mainly subject to multiplicative errors, since uncertainty in range estimates is assumed to increase as the range to the target increases. This assumption is justified by two observations. First, range errors often originate from errors in image segmentation; for instance, the width of the object in pixels could be incorrect because the image was blurred due to camera motion. Second, an error of one pixel in object segmentation induces a range error which has a magnitude proportional to the distance to the object. This is because range to an object is computed by comparing its size (in mm) in the real world and its size (in pixels) in the image, modulo the optical parameters of the camera. Given this, an artificially corrupted range estimate ρ_c can be computed from an ideal range estimate ρ_i as follows:

$$\rho_c = \rho_i \cdot (1 + \eta^{sys}_{\rho} + \eta^{ran}_{\rho}), \qquad (7.3)$$

where η^sys_ρ and η^ran_ρ are values in [0, 1], representing the percentages of systematic errors and random noise introduced, respectively. Random noise was uniformly distributed.


Bearing estimates are typically affected by additive errors, e.g., due to pan joint position uncertainty. As such, an artificially corrupted bearing estimate θ_c can be computed from an ideal bearing estimate θ_i as follows:

$$\theta_c = \theta_i + \eta^{sys}_{\theta} + \eta^{ran}_{\theta}, \qquad (7.4)$$

where η^sys_θ and η^ran_θ are the amounts (in degrees) of systematic errors and random noise introduced, respectively.

False positives were introduced by replacing randomly chosen observations with random values within the measurement domain. A value η_fp indicates the percentage of observations which were replaced.
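The corruption step can be summarised by the following Python sketch. It is illustrative only; whether the random noise is drawn symmetrically around zero, and the bounds of the measurement domain used for false positives, are assumptions rather than values from the thesis.

    import random

    def corrupt_range(rho_ideal, eta_sys, eta_ran):
        """Multiplicative range error, as in equation (7.3)."""
        return rho_ideal * (1.0 + eta_sys + random.uniform(-eta_ran, eta_ran))

    def corrupt_bearing(theta_ideal, sys_deg, ran_deg):
        """Additive bearing error in degrees, as in equation (7.4)."""
        return theta_ideal + sys_deg + random.uniform(-ran_deg, ran_deg)

    def maybe_false_positive(observation, fp_fraction, rho_max=3000.0, theta_max=90.0):
        """With probability fp_fraction, replace the observation with a random
        (range, bearing) value drawn from an assumed measurement domain."""
        if random.random() < fp_fraction:
            return (random.uniform(0.0, rho_max),
                    random.uniform(-theta_max, theta_max))
        return observation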

7.5 Results

7.5.1 Artificial Errors on Target Observations

The results for runs in which input errors were introduced only on target observations are presented in figures 7.5 to 7.9. The plots show the average error in the fused object position estimates for each method, versus the amount of introduced input error of the given type (one type of input error per plot). For runs with random noise and false positives, the results reflect an average of 20 runs; runs with only systematic errors needed to be run just once. Note that in these plots, self-localisation is based on perfect landmark observations. This means that the self-localisation estimate was precise and accurate, so these results reflect the performance of the presented methods in response to object observation errors only.

In figure 7.5, the results for systematic bearing errors on ball observations are shown. Here the proposed methods perform considerably better than even the upper bound IWAVG method. Even the most drastic approximations outperform both reference methods for small bearing errors. In particular, note that the 2BB method, which is a uni-modal approximation which requires very little bandwidth, outperforms the reference methods consistently.

Figure 7.6 shows the results for random bearing errors on ball observations. Again, the proposed methods outperform the reference methods, although to a lesser extent. In particular, the 1BB and 2BB approximations no longer outperform the IWAVG reference method.

Results for systematic and random range errors on ball observations are shown in figures 7.7 and 7.8. In these figures even the lower bound AVG reference method outperforms all of the proposed methods, although in the systematic case the results for all methods are quite similar. Interestingly, in figure 7.8, both the 2BPC and 2BB approximations perform nearly as well as the full method. The 1BB approximation collapses suddenly when the input errors reach a fairly small threshold.

Figure 7.9 shows the results obtained when some observations were replaced with false positives. In this case the upper bound IWAVG method drastically outperforms the other methods.


Figure 7.5: Systematic bearing errors added to ball observations. The proposed methods perform considerably better than the best results achievable by weighted averaging, represented by the upper bound IWAVG method.


Figure 7.6: Random bearing errors added to ball observations. The proposed methods perform slightly better than the upper bound IWAVG method.


Figure 7.7: Systematic range errors added to ball observations. The reference methods perform slightly better than the proposed methods.


Figure 7.8: Random range errors added to ball observations. The reference methods perform better than the proposed methods.


Figure 7.9: False positive ball observations added. The methods perform similarly, except for the upper bound IWAVG method, which drastically outperforms all other methods. This is because the IWAVG method has effective knowledge about which observations are false positives, via the weights used for averaging, which are computed using ground truth information.


Figure 7.10: Fused error versus combination of translational self-localisation errors from all robots. Only points where orientation errors are small are considered. The fact that the slope is less than 1 indicates that there is at least some compensation for errors in self-localisation.


Figure 7.11: Fused error versus combination of self-orientation errors from all robots. Only points where translational errors are small are considered. The performance of the proposed methods is visibly better than even the upper bound IWAVG.


This is understandable, since the ground truth information used to compute the averaging weights gives the method effective knowledge about which observations are false positives. The proposed methods do outperform the lower bound AVG reference method. Once again the 2BB method performs particularly well; it matches the performance of the 2BPC method, which requires much more bandwidth. Recall that for all methods, no filtering was used; in a real application, filtering would likely provide a substantial improvement in performance, particularly in the presence of false positives.

7.5.2 Artificial Errors on Landmark Observations

It is not informative to show the results of applying each type of input error to landmark observations separately, since these errors affect the self-localisation estimate in a non-linear fashion. Instead, to examine the impact of self-localisation errors, data is gathered for all runs in which only landmark observations were corrupted. These results are shown in figures 7.10 and 7.11. Note that target observations here correspond to the ground truth. In both figures, each point corresponds to the average error over one iteration of all log files, for one type and amount of introduced error on landmark observations. First order trend lines computed using regression are added to clarify the data.

In figure 7.10, error in the fused object position estimate for each method is plotted against the combined translational (straight-line) errors, also called (x, y) errors, in the self-localisation estimates of all robots. The errors are combined using the square root of the sum of the squares (SRSS) of the errors from the three robots. In other words, the vector length of the self-localisation errors is used. Only runs where the combined orientation errors were below 5 degrees are shown; this is to allow the impact of translational errors to be isolated. The data is fairly sparse because of this constraint; many iterations had both translational and orientation errors. Nonetheless, the IWAVG method seems to perform slightly better than other methods.

In figure 7.11, error in the fused object position estimate for each method is plotted against the combined orientation errors in the self-localisation estimates of all robots. The orientation errors are combined in the same way as the translational errors, using the SRSS method. Only cases where the combined translational errors are below 100mm are considered; again, this allows the impact of orientation errors to be isolated. Again the data is fairly sparse, but the trends are much more apparent here. The proposed methods all outperform both reference methods, and the degradation of the proposed methods as the amount of required bandwidth changes is smooth.
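For reference, the SRSS combination used on both axes of these plots amounts to the following trivial Python sketch, with a made-up numeric example.

    import math

    def srss(errors):
        """Square root of the sum of squares: the vector length of the
        per-robot self-localisation errors."""
        return math.sqrt(sum(e * e for e in errors))

    # Example: translational errors of 80, 120 and 50 mm on the three robots
    # combine to srss([80, 120, 50]), which is approximately 152.6 mm.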


Figure 7.12: Results for each method, averaged over nineteen runs of layout 3.

7.5.3 Unaltered Data

Results of the experiment using unaltered data, logged using layout 3 from figure 7.3, are presented here. These results reflect one particular point on the input-error landscape. In each run, robot 3 moves forward approximately 300mm, turns right approximately 90 degrees, and moves forward again. All robots are observing landmarks and the ball throughout each run. Data was collected for 20 runs of layout 3, but one run was unusable; the average results over the remaining 19 runs are presented.

The average and standard deviation of the fused estimate errors for each method, averaged over the 19 runs of layout 3, are shown in figure 7.12. In this case, the proposed methods perform slightly worse than the reference methods. This is consistent with data from the systematic analysis in the case when the main source of error is range error (recall figure 7.8).

Figure 7.13 shows the self-localisation and target object estimates for each robot during one sample run of layout 3. Here, it appears that the main source of error is indeed range error; in particular, robot 1 consistently and significantly overestimates range to the ball. Orientation estimates are shown in figure 7.14; from this figure, it appears that orientation estimates were accurate throughout the run, even for the moving robot. These plots give a rough indication of where on the input-error landscape the input data was situated.


7.6 Discussion

The experiments presented here have examined the input-error landscape of the proposed approach to cooperative object localisation. Two reference methods were used as a basis for comparison, which reflect upper and lower bounds on the results achievable using averaging approaches. Recall that the upper bound reference method could not be implemented in a real system, since it uses ground truth information to compute the weights used for averaging.

It is apparent that in the presence of bearing and orientation errors, the proposed methods tend to outperform even the upper bound reference method. Conversely, the proposed methods are somewhat less effective when faced with range errors and translational self-localisation errors. In the experiment based on unaltered data, range errors were considerable, while bearing and orientation errors were less apparent; the results of this experiment are consistent with the results of the systematic analysis of the input-error landscape.

Another important observation is that the approximations exhibited graceful degradation as the amount of required bandwidth was decreased. As expected, the approximations yielded less consistent and less accurate results as the amount of shared information was reduced. In a number of cases, the parametric 2BB method performed particularly well, especially considering the small amount of information which needs to be exchanged using this approximation. The smooth degradation allows the appropriate method to be chosen for a particular application, given the trade-off between low bandwidth requirements and accuracy.

The fact that the proposed methods are consistently better at dealing with bearing and orientation errors indicates that the fuzzy fusion approach has a certain robustness with respect to these types of input errors, even beyond the tolerance encoded in the sensor models. This is likely due to the fact that the proposed approach is set-theoretic, which means it is better able to handle the non-linearities which result from bearing and orientation errors. Specifically, the method considers all possible positions for an object during the fusion step; this avoids the heavy loss of information which results from approximating estimates as crisp positions before they are fused. This loss of information is of particular significance when dealing with bearing errors, since even small bearing errors can result in large errors in the overall position estimate. A typical example of a situation in which errors in bearing cause averaging to perform worse than the proposed method is shown in figure 7.15.

The reason the proposed approach is better at dealing with bearing and orientation errors is likely also the reason why it is slightly less effective at dealing with range errors. Tolerance to bearing errors implies that each robot must consider many positions as possible for a given object; these positions are typically laid out on an arc. If range errors occur, most of these positions will be wrong. In such situations, the intersection which reflects the agreement between robots is also likely to be wrong. Figure 7.16 shows a simple example of a situation in which this might occur.


In most cases, best results are achieved when sharing as much information as possible; however, in some cases, sharing less information can yield more accurate results. The performance of the proposed method with respect to range errors can be improved by increasing the width of the range sensor model, at the expense of reduced precision. Alternatively, the width of the bearing sensor model can be reduced, at the expense of a reduction in performance with respect to bearing errors. It has been verified that by changing these parameters, albeit by a fairly large amount, the proposed method can be adapted so that its performance is very close to that of the IWAVG reference method, with respect to both range and bearing errors. For the presented experiments, however, the original tuning is used. This tuning is particularly interesting, since it shows how the proposed method can produce a significant improvement in performance with respect to bearing errors, while suffering only a slight degradation in performance with respect to range errors.

It is also interesting to note that without the systematic analysis using artificially modified data, these aspects would not have been apparent. Using real data is obviously important for testing any robotic system, due to the various sources of uncertainty inherent in such systems. However, as has been shown, it can be useful to use artificial data to more fully characterise the performance of a given method. It is common in robotics to evaluate a method only in typical conditions, with the method's parameters tuned specifically to suit those conditions. However, as robotic systems aim to become more flexible and robust, it is important to know how various sub-systems perform outside their comfort zones. This can make it easier to decide which techniques should be used in which situations.

The computational, memory and bandwidth requirements of the proposed method allow the method to be implemented on a wide range of platforms, especially if one considers the approximations which can be used if resources are particularly limited. Implementations of the approximate coordinate transformation and the 2BB global grid approximation have been successfully tested and used on teams of Sony AIBO robots in earlier RoboCup competitions [148]; in this application domain, multi-robot ball localisation was performed at 1Hz on-board the robots.



Figure 7.13: Plot showing (x, y) estimates of self and ball position, for each robot, during a sample run of layout 3.


Figure 7.14: Plot showing orientation estimates for each robot during the course of a sample run using layout 3.


Figure 7.15: An example of a situation in which bearing errors cause averaging to perform worse than the proposed method. The thick arches indicate positions which are at a fixed distance from the observing robot, at different bearings.

Figure 7.16: An example of a situation in which range overestimation results in the proposed method performing worse than averaging. The thick arches indicate positions which are at a fixed distance from the observing robot, at different bearings.

Chapter 8

Anchoring Experiments

8.1 Objectives

The purpose of the experiments presented in this chapter is to validate the proposed anchoring framework. The presented experiments provide a “proof of concept”, which shows that the proposed framework is indeed able to address the single-robot and cooperative anchoring problems.

Note that the goal of the experiments presented in this chapter is not to provide a quantitative evaluation of the anchoring framework’s performance. Quantitative analysis could be used to evaluate the performance of specific implementations of individual components of the framework. For example, in chapter 7, a particular implementation of the fusion component, which is one of the core components of the framework, was quantitatively evaluated. However, empirical analysis is of little use for evaluating the anchoring framework itself, for a number of reasons. First, it is difficult to obtain meaningful performance metrics for the anchoring problem; this is due mainly to the breadth of the problem. Second, most interesting metrics would most likely fail to reflect the performance of the anchoring framework itself; rather, they would reflect the performance of the implementations of individual components and sub-systems, both inside and outside the framework. Finally, empirical analysis typically involves a comparison with one or more reference methods. Although other anchoring frameworks do exist, none of them are fully comparable with the proposed framework, since the problems being addressed are slightly different. In particular, the proposed framework considers both single-robot and cooperative anchoring transparently; it also includes a more general approach to descriptions of objects of interest; and finally, it provides a more thorough treatment of uncertainty and information heterogeneity.


8.2 Methodology

The proposed anchoring framework is validated using four experiments, which demonstrate how the anchoring problem is addressed. In each experiment, a task is defined which requires that certain aspects of the anchoring problem be addressed. The success criterion for each experiment is that this task is successfully completed.

The first experiment is mainly illustrative, and it uses data collected using a simulated environment, as well as artificially created data, to demonstrate some of the functionalities of the anchoring framework. The implementation used is essentially the one described in the previous chapters, with a few additions. The presented scenario involves finding the position of a particular parcel, given information from various static robots which use two different types of sensors. The task illustrates how a number of important aspects of the anchoring problem can be addressed using the proposed framework.

The second experiment uses data collected using real robots, to demonstrate that the framework is able to cope with the uncertainty present in such data. The fusion and data association components, in particular, must be able to cope with the uncertainty inherent in real sensor observations. The implementation described in the previous chapters is used. The scenario is similar to the first experiment, and it involves finding the position of a particular parcel; however, in this case the robots are moving, in a real environment.

The third experiment also uses data collected using real robots, and again the uncertainty present in real world data must be considered. The implementation used is the same as in the second experiment. The scenario for the third experiment involves the detection of several parcels scattered throughout the experimental environment, using both static and moving robots. The result is the creation of a consistent world model, despite uncertainty in self-localisation and parcel observations. The performance of the bounded version of the data association algorithm is also examined, and compared with the full version.

The fourth and final experiment was performed by Borissov and Janecek [25, 26], and it uses a simplified implementation of the anchoring framework. The differences between their implementation and the implementation described in chapters 5 and 6 will be discussed in section 8.7. The experiment is presented here since it demonstrates how the proposed anchoring framework was used within a complete robotic system in order to address a complex task. The task was inspired by the “Lost and Found” challenge of the RoboCup@Home competition [130], and it involves detecting, localising, and approaching various objects of interest in a real-world environment. The overall approach to the problem is inspired by the Peis-Ecology project [138, 139], and related work on network robot systems [141, 120].



Figure 8.1: Some photos (a, b, c) and the layout (d) of the Peis-Home.

8.3 Common Experimental Setup

The setup used for the four experiments was similar in many respects. This section describes the common aspects of the experimental environment. Details regarding the setup used for each experiment are given in later sections.

8.3.1 Environment

The experimental environment is a small apartment (about 25m²) in which various intelligent devices are embedded. The apartment, referred to as the Peis-Home, is shown in figure 8.1. Devices in the Peis-Home communicate via a middleware developed within the Peis-Ecology project [138, 139]. This middleware allows various Peis, or “Physically Embedded Intelligent Systems”, to exchange various types of information using a distributed tuple-space. These Peis can range from extremely simple devices, such as single sensors or actuators, to fully autonomous mobile robots. This concept coincides with the general notion of robots which has been used throughout this work.

A Peis-Ecology is a set of Peis which have the ability to share resources in order to achieve common goals. A Peis typically consists of a number of software components, called Peis-Components. These components may provide interfaces to sensors or actuators; they may also exchange data with other Peis-Components. A number of standard Peis-Components were used in the presented experiments.
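To make the tuple-space idea concrete, the following toy sketch shows the kind of publish/subscribe interaction such a middleware enables. It is purely illustrative: the class, method names and tuple contents are invented for this example and do not correspond to the actual Peis middleware API.

```python
# Purely illustrative, in-memory stand-in for a distributed tuple space; the
# real Peis middleware differs and also handles networking, discovery, etc.
from collections import defaultdict

class ToyTupleSpace:
    """Maps (owner, key) -> value, with simple change subscriptions."""
    def __init__(self):
        self.tuples = {}
        self.subscribers = defaultdict(list)

    def publish(self, owner, key, value):
        self.tuples[(owner, key)] = value
        for callback in self.subscribers[(owner, key)]:
            callback(value)

    def read(self, owner, key, default=None):
        return self.tuples.get((owner, key), default)

    def subscribe(self, owner, key, callback):
        self.subscribers[(owner, key)].append(callback)

# A camera component publishes an observation; a robot component reacts to it.
space = ToyTupleSpace()
space.subscribe("camera-01", "observed-objects",
                lambda v: print("robot received:", v))
space.publish("camera-01", "observed-objects",
              [{"colour": "green", "range_mm": 1200, "bearing_deg": 15}])
```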



Figure 8.2: The Gazebo simulation of the Peis-Home, used for the first experiment.

In the first experiment, a 3D Gazebo [69, 124] simulation of the Peis-Home apartment [91] was used. The simulated apartment is shown in figure 8.2. The Gazebo simulator provided a mid-fidelity simulation of the experimental environment. In particular, perception was simulated, which means that sensors did not have direct access to properties of the simulated world. The second, third, and fourth experiments were carried out in the real apartment.

8.3.2 Fixed Cameras

In the experiments, two fixed cameras were available, which will be referred to as C1 and C2. These were standard web cameras (Logitech QuickCam Fusion); a photo of one of the cameras is shown in figure 8.3(a). The cameras were mounted at fixed and known positions on the apartment ceiling, and connected to a local workstation via USB. Each camera was controlled and accessed using an instance of a Peis-Component called peiscam.

The peiscam components were hosted on a workstation on which colour-based segmentation was performed [155]. The segmentation algorithm was implemented using another Peis-Component, called peiscsvision; one instance of peiscsvision was used for each camera. The output of the colour segmentation was a set of detected objects, consisting of position and colour estimates, as well as a shape signature. Position estimates were computed relative to the observing camera, using the elevation angle to the relevant region in the segmented image. This computation depended


Figure 8.3: The fixed cameras were Logitech QuickCam Fusion cameras (a). The mobile robots (b) were Astrid (left) and PeopleBoy (right).

on the pose and field of view of the camera, and it assumed that objects were placed on the ground. Colour estimates were provided in the HSV colour space, and they reflected the average colour of the corresponding segmented region.

Simulated Fixed Cameras

In the simulated environment, the cameras were configured to be as similar to the real cameras as possible. The simulated cameras were connected to their respective peiscam Peis-Components via a camera driver from the Player robot device interface [69, 124]. Sample images from the real and simulated cameras are shown in figure 8.4.
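The ground-plane projection described above for the fixed cameras can be sketched as follows. This is only an illustration of the general idea: the field of view, tilt and mounting height below are assumed values, not the calibration of the actual cameras.

```python
# Minimal sketch, assuming hypothetical camera parameters: the pixel row of a
# segmented region gives an elevation angle, and assuming the object lies on
# the floor yields its distance from the camera.
import math

def ground_distance(pixel_row, image_height=240, v_fov_deg=40.0,
                    camera_tilt_deg=-35.0, camera_height_m=2.4):
    """Distance along the floor to a point seen at a given pixel row."""
    # angle of this pixel row relative to the optical axis (rows grow downward)
    row_offset = (pixel_row - image_height / 2) / image_height
    pixel_angle_deg = -row_offset * v_fov_deg
    # angle below the horizontal for the ray through this pixel
    depression_deg = -(camera_tilt_deg + pixel_angle_deg)
    if depression_deg <= 0:
        return float("inf")          # ray never reaches the floor
    return camera_height_m / math.tan(math.radians(depression_deg))

# e.g. a region whose bottom edge is near the lower part of the image
print(round(ground_distance(pixel_row=200), 2), "m")
```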

8.3.3 Mobile Robots

Two mobile robots, called Astrid and PeopleBoy, were used in the experiments. In figures and plots, Astrid will be called R1, and PeopleBoy will be called R2. Both are “PeopleBot” robots, from ActivMedia Robotics [1]. The mobile robots can be seen in figure 8.3(b). Both are 0.51m long and 0.41m wide. PeopleBoy is 1.255m tall; Astrid is 1.115m tall. Both robots are equipped with SICK LMS laser range finders, optical wheel encoders, and pan-tilt-zoom cameras (Canon VC-C4). Astrid was also equipped with a forward-facing RFID reader (not shown in figure 8.3), capable of reading RFID tags within a range of roughly 0.2m. Both robots have small embedded computers on-board, which



Figure 8.4: Images from the real (a, c) and simulated (e, g) cameras in the Peis-Home, as well as examples of the segmented images output from the peiscsvision Peis-Component (b, d, f, h). The red boxes indicate objects which were found by the colour segmentation algorithm to match given shape and colour descriptions.


run Linux using 1.6GHz processors. Both robots also have grippers, sonar sensors, and bumper sensors; however, none of these were used in the experiments presented in this chapter.

The Player robot device interface [69, 124] was used to interface to several of the robots’ sensors and actuators. The wheel encoders and motors were accessed via a Player driver for the PeopleBot’s controller. Player was also used to connect to the laser range finder. The position and laser data were fed to another Player driver, which implemented the AMCL self-localisation algorithm described in section 6.1.3. Note that laser data was only used for self-localisation, not for object detection. A map of the environment and initial positions were given to the self-localisation driver at the beginning of each experiment. This map was created offline, using laser data collected by manually driving Astrid around the Peis-Home.

The robots could be manually controlled using a Player joystick application, called playerjoy, which connected to the Player position driver. The robots could also be controlled using other methods, including Player drivers for path-planning and path-following. A Peis-Component called peisplayer allowed odometry and self-localisation information to be accessible to the Peis-Ecology middleware. A Peis-Component called peispantilt was used to set desired positions for the pan-tilt-zoom units. In the first three experiments, the pan-tilt-zoom units were kept fixed during execution. Astrid’s RFID reader was accessed using a Peis-Component called peisrfid. The cameras mounted on the mobile robots were controlled and accessed using on-board instances of the peiscam Peis-Component, which were connected to separate instances of the peiscsvision Peis-Component. These peiscsvision instances were run on the same workstation which hosted the fixed cameras’ peiscam and peiscsvision Peis-Components.

Simulated Mobile Robots

The Gazebo simulation included a PeopleBot model, which was used in the first experiment. The peispantilt Peis-Component was not needed in the simulated experiment; the pan-tilt-zoom units were simply initialised to their desired positions. Also, Astrid’s RFID reader was unavailable in the simulation. The simulated robot cameras, like the simulated fixed cameras, were connected to their respective peiscam components via Player camera drivers.

8.3.4 Software Configuration

The software configuration was essentially the same for the simulated and real environments; the Player robot device interface and the Gazebo simulator provided transparent access to simulated and real resources, and the Peis-Ecology middleware allowed the various Peis-Components to communicate in a deployment-independent fashion.


In the first three experiments, a Peis-Component called the peisrouter was used to gather information from various Peis-Components, and send it to the anchoring monitor software tool described in section 7.2.5. Recall that this customisable tool is able to log and process various types of data. In the experiments presented here, the tool was used to log odometry updates from the Player position driver, self-localisation updates from the Player AMCL driver, and object observations from vision and RFID sensors. The same tool was used to process these logs offline. This processing included implementing the anchoring framework described in chapters 5 and 6, as well as analysing the results of the experiments.

Some screen-shots of the tool are shown in figure 8.5. In (a), the active tabs are those used for logging and live processing of incoming data. In (b), the tool is ready to process an offline log. In (c), an overview of the currently used environment is shown, with details such as the size of the environment, and known regions of interest. In (d), the list of connected and ignored robots is shown. By allowing certain robots to be ignored, the tool is able to simulate different experimental configurations using a single log. In (e) to (h), various per-robot views are shown. In (e), the first image is a robot-centric view of the positions of all locally detected objects; the next image shows a global view of the positions of these objects; the last image shows the robot’s self-localisation estimate. In (f), position grids are shown for a local anchor and a corresponding global anchor. In (g), colour grids for the same local and global anchors are shown using hue-saturation circles (the maximum value is shown). In (h), the same colour grids are shown as saturation-value squares (the maximum hue is shown).

The deployment of processes and Peis-Components for the first experiment, performed in the simulated environment, is shown in figure 8.6. The deployment of processes and Peis-Components for the second and third experiments, performed in the real environment, is shown in figure 8.7. The deployment for the fourth experiment was similar; the differences will be discussed in section 8.7.

It should be noted that the framework and algorithms outlined in the previous chapters are decentralised. However, since offline processing was needed in order to allow the parameters of the framework and its implementation to be explored, the logging and processing tool is deployed in a centralised manner, in practice. The processing takes the decentralised nature of the framework into account, and the information for each robot is maintained separately. In the presented experiments it is assumed that information is successfully exchanged between robots, which is not a large assumption given the reliability of available wireless Ethernet networks.



Figure 8.5: Screenshots of the anchoring monitor tool. The main tabs used for logging and processing are shown in (a) and (b).



Figure 8.6: An outline of the software configuration used for the simulated experiment. The single bold box delimits the workstation, which is the only physical system in use. The grey boxes are simulated hardware components, such as motors and sensors. The rounded rectangles are processes, for instance the Gazebo simulator and the various Peis-Components. The dashed rectangles reflect logical groupings. The anchoring monitor process can be seen near the centre of the figure.


Figure 8.7: An outline of the software configuration used for the second and third experiments, performed in the real environment. The bold boxes represent the various physical systems: two robots, two cameras, and the workstation. The grey boxes are hardware components, such as motors and sensors. The rounded rectangles are processes, for instance the Player robot device interface, and the various Peis-Components. The dashed rectangles reflect logical groupings. The anchoring monitor process can be seen near the centre of the figure.


8.4 Experiment 1: Find a Parcel (Simulation)

8.4.1 Goal

The task in this experiment was for Astrid to obtain a position estimate for “parcel-22”, located near the entrance of the Peis-Home. This position estimate should be accurate to within 300mm, which was roughly the size of the target parcel. To accomplish this task, a number of different robots needed to exchange various types of information about objects in the environment.

The goal of the experiment was to verify that the proposed anchoring framework is able to support multiple robots in matching and fusing various types of information, in order to anchor a parcel description to its perceptual representation; the parcel of interest was specified using a single positive definite description. The goal was considered to be met if the task was successfully completed.

8.4.2 Setup

For this first experiment, the Gazebo simulation of the Peis-Home was used. The experiment involved four robots – recall that the general notion of robot is used here. One fixed camera, C1, was used; it was positioned so that it could view the main entrance. Simulations of the mobile robots, Astrid and PeopleBoy, were also used. Finally, an RFID reader was assumed to be placed near the entrance. Since the RFID reader was not available in the simulated environment, observations from the RFID reader were manually inserted into the logs.

All three cameras (the fixed camera and the cameras mounted on the mobile robots) were configured with a resolution of 320x240, and they provided images at 1Hz. The vision systems were configured to detect only uniformly coloured parcels, which were lying in fixed positions on the ground throughout the Peis-Home. There were four parcels in the environment.

Self-anchors were not relevant for this experiment, so their creation was disabled. Since the observed objects were static, anchor prediction was also disabled. Information sources were calibrated to detect only objects of interest for the given task. One positive definite named description was used to describe the parcel of interest.

Data was logged online (in simulation), and processing was done offline. During offline processing, the local frame time was set at 100ms – recall that this is the time within which two percepts from the same information source are assumed to refer to different objects. This means that local anchor management steps were performed at 10Hz. The global frame time was set at 1s, so global anchor management steps were performed at 1Hz.


Figure 8.8: The domains considered for the first experiment. The “entrance” region is shown for the position domain. The colour domain is displayed using a hue-saturation circle. The shape and texture domains are represented as discrete arrays of possible values.

Deployment

The deployment of processes and the connections between them are shown in figure 8.6. Note that although the AMCL self-localisation algorithm was available, it was not needed for the presented scenario, since the robots were not moving during the experiment. Also, note that only one camera was used in the experiment. Recall that the anchoring monitor is used to log and process data in a centralised manner, despite the fact that it implements a decentralised framework and decentralised algorithms.

Domains

Four domains of information were considered in the experiment. Position information, in global 2D coordinates, and colour information, in 3D HSV coordinates, were both used, as described in section 5.3. In addition to these, a 1D shape domain, and a 1D texture domain were also considered. Both of these were represented as bin models, where each bin consisted of a possible shape or texture. Information for these 1D domains was added to anchors by hand, since shape and texture domains were not implemented in the anchoring framework; their purpose here is mainly illustrative. The four domains are shown in figure 8.8. Recall that in the implementation of the framework used here, global anchor spaces have the same dimensions and coordinate systems as local anchor spaces.
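As an illustration of how such heterogeneous domains can sit side by side in one anchor, the sketch below holds one possibility distribution per domain and restricts them conjunctively as information arrives. The grid sizes and the shape and texture vocabularies are assumptions made for this example; the data structures in the actual implementation may differ.

```python
# Minimal sketch, assuming grid/bin possibility distributions per domain.
import numpy as np

SHAPES   = ["box", "cylinder", "ball"]      # hypothetical shape bins
TEXTURES = ["plain", "striped", "dotted"]   # hypothetical texture bins

def empty_anchor(pos_cells=(50, 50), col_cells=(8, 8, 8)):
    """Unknown object: every value fully possible in every domain."""
    return {
        "position": np.ones(pos_cells),      # global 2-D grid
        "colour":   np.ones(col_cells),      # discretised HSV
        "shape":    np.ones(len(SHAPES)),    # 1-D bin model
        "texture":  np.ones(len(TEXTURES)),  # 1-D bin model
    }

def restrict(anchor, domain, distribution):
    """Conjunctively combine new information with what is already known."""
    anchor[domain] = np.minimum(anchor[domain], distribution)
    return anchor

a = empty_anchor()
restrict(a, "texture", np.array([0.0, 1.0, 0.0]))   # "striped", e.g. from a tag
print({d: float(g.max()) for d, g in a.items()})
```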


Information Sources

In this experiment, each robot had one information source. The following information sources provided information about objects to the anchoring framework.

• C1: The camera provided {observed-range-bearing} position information about detected objects. The peiscsvision Peis-Component was calibrated to detect only parcels of interest, using both shape and colour. However, colour information was only used for object detection; no colour information was provided to the anchoring framework. This was done in order to simulate that C1 was a black and white camera. Position observations were converted from local coordinates to global coordinates via the coordinate transformation described in section 6.2. The self-localisation estimate for the camera was fixed, accurate and precise; the error was less than 100mm. The position sensor model assumed little uncertainty, since the elevated vantage point of the camera resulted in accurate range-bearing estimates. Overall, the position estimates provided by the camera were both accurate and precise.

• Astrid’s camera (R1.1): Astrid’s on-board camera produced both position ({observed-range-bearing}) and colour ({observed-colour}) information about objects. The peiscsvision Peis-Component was filtering on shape and colour. Position information was mapped to the local anchor space using the coordinate transformation described in section 6.2. Colour information was mapped into the local anchor space using the conceptual sensor model described in section 5.6.2. Astrid’s initial self-localisation estimate was fairly accurate (within 500mm); however, Astrid’s position observations were not as accurate as those from C1.

• PeopleBoy’s camera (R2.1): PeopleBoy’s on-board camera was an information source which produced position, colour and shape information about observed objects. PeopleBoy’s vision system was also filtering on shape and colour, and the position and colour information were treated as for Astrid’s camera. Shape domain information was added to observations by hand, offline. PeopleBoy was not moving, but had poor initial self-localisation (uncertainty of more than 1000mm). As such, produced object position estimates were extremely imprecise.

• RFID-01: The RFID reader was assumed to be positioned near the entrance to the Peis-Home. This reader could detect the presence of nearby RFID-tagged objects. Observations originating from the RFID reader were generated by hand during the experiment, using the peisrouter. In the presented experiment, the RFID reader produced both {in-region} position information, as described in section 5.6.1, and texture domain information (an illustrative sketch of the {in-region} grounding follows this list).
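The following minimal sketch shows one way symbolic {in-region} information, like that produced by RFID-01, could be grounded as a possibility grid over positions. The region boundaries, fringe width and grid resolution are assumed values for illustration only, not those used in the implementation.

```python
# Illustrative grounding of symbolic {in-region} information into a position
# grid: 1.0 inside an assumed "entrance" box, decaying over a soft fringe.
import numpy as np

xs = np.linspace(0.0, 6000.0, 61)        # assumed apartment extent, mm
ys = np.linspace(0.0, 4000.0, 41)
X, Y = np.meshgrid(xs, ys)

def in_region_grid(x_min, x_max, y_min, y_max, fringe=300.0):
    """Possibility of each cell being 'in the region', with a soft boundary."""
    dx = np.maximum(np.maximum(x_min - X, X - x_max), 0.0)   # distance outside in x
    dy = np.maximum(np.maximum(y_min - Y, Y - y_max), 0.0)   # distance outside in y
    return np.clip(1.0 - np.hypot(dx, dy) / fringe, 0.0, 1.0)

entrance = in_region_grid(4500.0, 6000.0, 0.0, 1500.0)       # hypothetical region
print("fully possible cells:", int((entrance == 1.0).sum()))
```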

8.4.3 Execution

Initial Configuration

Initial robot and parcel positions are shown in figure 8.9. The parcels are the two red and two green boxes. At the beginning of the experiment, none of the participants knew the number or properties of these parcels.

Actions

The sequence of actions carried out during the experiment is presented here. Astrid’s local (both own and received) and global anchors are shown for various points in time in figure 8.10.

• Time t0: Astrid is given the task of finding the position of “parcel-22”, located near the entrance of the Peis-Home. The task is used by Astrid’s planner to create a description, which specifies that the anchoring module should consider only objects in the “entrance” region of the Peis-Home. This description was created using a grounding function able to convert {in-region} information to the local anchor space. The description can be seen at t0 of figure 8.10.

• Time t1: Astrid receives a local anchor ψ11 from the RFID reader. The reader has detected an RFID-tagged object called “parcel-22”. The received local anchor contains position, colour, and texture information. Note that the name of the object does not get included in the local anchor, as discussed in section 4.6. The contained position information is symbolic {in-region} information, which says that the object is near the entrance. This is known since RFID-01 can only detect objects which are near the entrance. The provided colour information indicates that the detected parcel is green; the symbolic description of green is imprecise, since the definition of “green” could correspond to a relatively wide range of colours. The anchor also contains the information that the detected object is striped; this information was added by hand – it is assumed that it was written on the RFID tag. Anchor ψ11 matches the active description, and is accepted. The matching and fusion steps are trivial, since there are no other anchors. The result is the creation of global anchor ω111, which is a copy of ψ11. The anchors are shown at t1 in figure 8.10. The dashed boxes around the local and global anchors at time t1 indicate that they refer to the same object.

• Time t2: Astrid receives three local anchors from PeopleBoy. One of these, ψ23, is not in the entrance region; it is therefore discarded, since it does not match the only active positive description. The other two anchors do match the description, so they are accepted. Anchor ψ21 does not match the local anchor received from RFID-01, since the colours are different.


As such, ψ21 is trivially matched and fused as a new global anchor ω121. PeopleBoy’s local anchor ψ22, on the other hand, does match the local anchor ψ11 received from RFID-01, and the result of the fusion of the two is stored in global anchor ω111. Note that this anchor now contains position and colour information from both RFID-01 and PeopleBoy, texture information from RFID-01, and shape information from PeopleBoy. The anchors are shown at t2 in figure 8.10. Note that the local anchor ψ11 is still available at time t2, but that the previous global anchor was discarded; newly created global anchors always replace previous ones. Again, the dashed boxes indicate which local and global anchors refer to the same objects.

• Time t3: Astrid’s vision system detects two differently coloured objects near the entrance. Both of them match the active description, so two new local anchors, ψ31 and ψ32, are created. Anchor ψ31 matches ψ21, and the resulting new global anchor is ω121. Anchor ψ32 matches both ψ11 and ψ22, and the resulting global anchor is ω111. These anchors have improved position estimates, thanks to the position information provided by Astrid’s vision system. The anchors are shown at t3 in figure 8.10, and the dashed boxes are again used to indicate which anchors refer to the same objects.

• Time t4: Astrid receives four local anchors from C1. Two of these, ψ43 and ψ44, are rejected by the position constraint of the description. The other two are accepted. Anchor ψ41 matches the parcel described by anchors ψ21 and ψ31, and these are fused to form ω121. Local anchor ψ42 matches ψ11, ψ22 and ψ32, and these are fused to create global anchor ω111. Both of the resulting global anchors match the description, so both are returned as possible candidates for “parcel-22”. These anchors are shown at t4 in figure 8.10.

• Time t5: At this point there are two anchors which match the description. This might be fine if the planner was looking for an indefinite object (e.g. “a parcel near the entrance”). However, in this case the definite “parcel-22” is desired, and more information is needed in order to disambiguate the situation. Saliency information [144] could potentially be used to aid in determining the right question to ask [45]. For now, however, it is simply assumed that the planner asks for more information about the desired parcel, and it is informed that the desired parcel is green. This allows the planner to add that constraint to the description, as shown at t5 of figure 8.10. At this point, only one global anchor, ω111, which is the correct one, is returned as matching the description for “parcel-22”. (A minimal sketch of the match-then-fuse step used throughout this walkthrough follows this list.)
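The following toy sketch, which is not the thesis implementation, illustrates the match-then-fuse logic referred to above: two anchors are considered compatible when their distributions overlap in every domain they share, and fusing them intersects shared domains while carrying unshared ones over. The three-cell grids and the 0.1 threshold are assumed values.

```python
# Minimal match-then-fuse sketch over toy per-domain possibility vectors.
import numpy as np

def compatible(a, b, threshold=0.1):
    """Anchors match if every domain they share has a non-empty intersection."""
    shared = set(a) & set(b)
    return all(np.minimum(a[d], b[d]).max() >= threshold for d in shared)

def fuse(a, b):
    """Combine two matching anchors; unshared domains are carried over as-is."""
    fused = {}
    for d in set(a) | set(b):
        if d in a and d in b:
            fused[d] = np.minimum(a[d], b[d])     # intersection of shared domains
        else:
            fused[d] = (a.get(d, b.get(d))).copy()
    return fused

rfid_anchor   = {"position": np.array([0.2, 1.0, 0.3]),   # coarse, near entrance
                 "colour":   np.array([0.0, 1.0, 0.1])}    # "green"
camera_anchor = {"position": np.array([0.0, 1.0, 0.0]),   # sharper position
                 "shape":    np.array([1.0, 0.0, 0.0])}    # e.g. "box"

if compatible(rfid_anchor, camera_anchor):
    global_anchor = fuse(rfid_anchor, camera_anchor)
    print({d: g.tolist() for d, g in global_anchor.items()})
```

In the walkthrough, this is the sense in which an RFID anchor (coarse position, colour, texture) and a camera anchor (sharper position, shape) can contribute different domains to the same global anchor.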

8.4.4 Results

The final position estimate for the only parcel which matched the description had an error of 63mm, which was easily within the 300mm margin specified for the experiment; the quality of the final position estimate was mainly due to the inclusion of the position estimate from C1. Astrid’s estimate had an error of 330mm, before fusion with C1’s estimate. Note that C1 on its own would not have been able to determine which parcel was the correct one, since it needed the colour information available from the robot cameras. The final anchor is also complemented with shape and texture information, thanks to RFID-01 and PeopleBoy.

8.4.5 Discussion

The successful achievement of the experimental task shows that the proposed anchoring framework is able to support multiple robots in matching and fusing various types of information. This means that the experimental goal was achieved. The use of descriptions within the framework was also demonstrated; the sole active positive definite description filtered out uninteresting local anchors received from other robots, and it was also associated with candidate global anchors. It should be noted, in particular, that the described experiment showed how the following key aspects of the anchoring problem were addressed.

• Complementary percepts about objects in the environment were gathered, associated and combined. In particular, this is seen by the fact that the final global anchor for “parcel-22” contains position, colour, shape, and texture information, which could only be achieved by combining information from heterogeneous information sources.

• Redundant percepts about objects in the environment were successfully fused, resulting in improved estimates of object properties. In particular, position estimates from various robots were combined, resulting in an estimate which was more accurate and precise than most of the robots could have obtained on their own.

• The use of descriptions was demonstrated; specifically, a description was used to represent the desired parcel. This description was also updated dynamically during the run. Note that interest might also have been used to exclude percepts deemed uninteresting, in order to reduce the complexity of the data association problem.



Figure 8.9: Initial positions (top) for robots and parcels in the Gazebo simulation used for the first experiment. Astrid is R1, PeopleBoy is R2. The desired parcel, “parcel-22”, is P22, in the bottom right-hand corner of the figure. The images seen by the three cameras are shown in (a) for C1, (b) for Astrid, and (c) for PeopleBoy.


Figure 8.10: Astrid’s local (left) and global (right) anchors at various times during experiment 1. The local anchors are both Astrid’s own local anchors, as well as those received from other robots. Local anchors are available at the time they are received and at all later times; global anchors are discarded and re-created at each time step. The domains for each anchor are position, colour, shape, and texture, as shown in figure 8.8. Darker areas are more possible, lighter areas are less possible. The dashed boxes indicate which anchors refer to the same objects.


8.5 Experiment 2: Find a Parcel (Real Robots)

8.5.1 Goal

The task in this experiment was for PeopleBoy to find and approach the RFID-tagged parcel P1. The parcel was tagged with its name and colour. It was supposed that from a position near the parcel PeopleBoy could then perform some action such as picking up the parcel, although this was not part of the presented experiment. The position estimate for P1 should be accurate to within 400mm, which was roughly the size of the target parcel. In order to accomplish this task, PeopleBoy needed the help of Astrid, since Astrid was equipped with an RFID reader, which allowed a detected parcel’s identity to be established. Astrid’s camera was assumed to be black and white for this experiment.

The goal of the experiment was to verify that the framework allows robots to correctly associate heterogeneous percepts, obtained from real sensors and moving robots, in order to successfully anchor a parcel name to its perceptual representation. The experiment should verify that the framework is applicable despite the uncertainty inherent in real sensor data; in this sense, the experiment complements the previous experiment. The task considered both symbolic and numeric information, and required that information from different robots be combined in order to correctly identify the desired parcel. The goal was considered to be met if the task was successfully completed.

8.5.2 Setup

For this experiment, the real Peis-Home was used. Photos of the experimental environment are shown in figure 8.11. The experiment involved both mobile robots, Astrid and PeopleBoy. Both cameras were configured with a resolution of 320x240, and they provided images at 1Hz. The vision systems were configured to detect only the relevant parcels, which were lying in fixed positions on the ground throughout the Peis-Home. The parcels measured roughly 200x300x400mm. There were four parcels in the environment, two blue and two pink. Two of the four parcels had RFID tags containing the parcel’s name and colour. Astrid’s forward-facing RFID reader could read these tags.

As in the first experiment, the creation of self-anchors and anchor prediction were disabled. Information sources were again calibrated to detect the objects of interest for the given task, and all robots were assumed to have one positive description which allowed all arriving information to be accepted.

Data was logged online, and processing was done offline. During offline processing, the local frame time was set at 100ms – recall that this is the time within which two percepts from the same information source are assumed to refer to different objects. This means that local anchor management steps were performed at 10Hz. The global frame time was set at 1s, so global anchor management steps were performed at 1Hz.


Figure 8.11: Photos of the setup used for experiments 2 and 3, taken using C1 and C2.

Deployment

The deployment of processes and the connections between them are shown in figure 8.7. Robot motion was manually controlled, via the playerjoy application. Note that no fixed cameras were used in this experiment. Recall that the anchoring monitor was used to log and process data in a centralised manner, despite the fact that it implements a decentralised framework and decentralised algorithms.

Domains

Two domains of information were considered in the experiment: position information, in global 2D coordinates, and colour information, in 3D HSV coordinates, as discussed in section 5.3. Recall that in the implementation of the framework used here, global anchor spaces have the same dimensions and coordinate systems as local anchor spaces.

Information Sources

The following information sources provided information about objects to the anchoring framework.

• Astrid’s camera (R1.1): Astrid’s on-board camera, which was treated as a black and white camera, produced only {observed-range-bearing} position information, which was mapped to the local anchor space using the conceptual sensor model described in section 6.2. Although the peiscsvision Peis-Component was calibrated to detect parcels using both shape and colour, the colour information was only used for object detection, and not passed on to the anchoring framework. This was done in order to simulate that Astrid’s camera was black and white, as was done


with the fixed camera C1 in the first experiment. Both robots were moving during the experiment, and self-localisation estimates were obtained from the Player AMCL driver, which implemented the self-localisation algorithm described in section 6.1.3.

• Astrid’s RFID reader (R1.2): Astrid was equipped with a forward-facing RFID reader, which could read RFID tags located within a distance of roughly 200mm. The RFID reader was mounted at a height which allowed it to detect and read the RFID tags which were placed on some of the parcels. The information on the tags consisted of the tagged parcel’s name and colour. Colour information was symbolic {is-colour} information, described in section 5.6.1. Detection of RFID-tagged objects also allowed {near-self} symbolic position information to be inferred, as discussed in section 5.6.1. This information was very imprecise, since it included both Astrid’s self-localisation uncertainty, as well as significant uncertainty regarding the exact position of the detected RFID tags.

• PeopleBoy’s camera (R2.1): PeopleBoy’s on-board camera and vision system produced both {observed-range-bearing} position information and {observed-colour} colour information about observed parcels. The peiscsvision Peis-Component was calibrated to detect parcels using both shape and colour information. The conceptual sensor models described in sections 6.2 and 5.6.2 were used to map information into the local anchor space.

8.5.3 Execution

Initial Configuration

Two runs of the experiment were performed; in the first run, parcel P1 was positioned very close to PeopleBoy, so that it would be the first parcel to be detected. In the second run, another parcel was placed in between PeopleBoy and parcel P1. The initial positions for the robots and parcels for the first run of the experiment are shown in figure 8.12. In the second run, the positions of parcels P1 and P2 were swapped. There were four parcels in the environment, two blue and two pink. Parcels P1 and P4, one of each colour, had RFID tags, on which the parcel’s name and colour were written. At the beginning of the experiment, none of the participants knew the number or properties of the parcels in the environment. Note that figure 8.12 reflects a Gazebo simulation of the initial setup, created for visualisation purposes only; the experiment was executed in the real environment, shown in figure 8.11.



Figure 8.12: The initial layout of robots and parcels used for experiments 2 and 3. Note that the figure reflects a Gazebo simulation of the initial configuration, for visualisation purposes only. The actual experiments were carried out in the real Peis-Home environment, shown in figure 8.11. Parcels P1 and P4 had RFID tags, containing the parcel’s name and colour. Recall that Astrid is R1, and PeopleBoy is R2. C1 was not used in experiment 2, but it was used in experiment 3.

Actions

Both runs began with Astrid being remote controlled around the Peis-Home, examining the parcels in an arbitrary order. When parcel P1 was identified thanks to its RFID tag, Astrid left the area, and PeopleBoy was then remote controlled to the approximate position of parcel P1. Nearby parcels were examined until one which matched Astrid’s anchor for P1 was found. The robots were normally kept still for a few seconds in front of each parcel, although observations were also made while the robots were moving. Astrid and PeopleBoy were continuously exchanging local anchors.

It was assumed that Astrid’s task planner told PeopleBoy’s task planner which of Astrid’s local anchors was P1 (the target parcel). This assumption was needed since percept names are not transferred to corresponding local anchors (recall section 4.6). This should be seen as a high level decision taken by Astrid’s task planner – the user of the anchoring framework. In a sense, Astrid and PeopleBoy are executing a joint plan, the coordination of which is not within the scope of this work; a thesis by Lundh [107] examines several aspects of this type of cooperative plan, which needs to take participant heterogeneity into account.


Observations

Figures 8.13 and 8.14 show the positions of parcel observations made by both robots during the two runs of the experiment. The true positions of the parcels are shown by the large filled points; the blue points correspond to the blue parcels, and the pink points correspond to the pink parcels. The plotted observations reflect the centre of gravity (per equation 5.3) of the position grids created using the conceptual sensor models described in sections 6.2 and 5.6.1. Point shapes indicate the parcel from which each observation originated. Point colours indicate which robot made the observation. Astrid’s RFID observations (R1.2) are indicated by larger points. Robot trajectories are shown as solid lines. The trajectories reflect the self-localisation estimates from the AMCL algorithm, not the ground truth (which was unavailable). The filled points on these lines indicate the positions from where parcels were observed. Note that the positions from which Astrid made RFID observations are not shown, since these roughly correspond to the positions of the observations themselves.

Figures 8.15 and 8.16 show the errors in each observation, versus the time of the run. These figures give a sort of time line of the events in each run. Note that Astrid’s RFID observations (information source R1.2) were generally farther from the parcels than the other observations. This is because these observations consist of {near-self} information, centred on Astrid’s position. These observations are nonetheless consistent with the parcel positions, since the position grids take this imprecision into account.

In the first run, Astrid first observed P4 in the bottom left-hand corner (the Peis-Home bedroom), after which P1 was found, in the living room (upper right). PeopleBoy observed P1 first, ending the search. In the second run, Astrid took roughly the same path as in the first run, however P1 and P2 were in different positions. So after P4, P2 was observed, and finally P1. PeopleBoy also observed P2 before P1, but eventually both parcels were observed.

Anchors

The local and global anchors created during the two runs of the experiment are shown in figures 8.17 and 8.18. Time t1 is after Astrid has found parcel P1; time t2 is after PeopleBoy has located P1. The shown anchors are PeopleBoy’s anchors. The local anchors at t1 are received from Astrid; those at t2 are PeopleBoy’s own local anchors. Local anchors are available at the time they are received and at all later times; global anchors are discarded and re-created at each time step. The domains for each anchor are position and colour.

In the first part (up to t1) of the first run, Astrid observed parcels P4, then P1. Both anchors contained numeric position information from the camera, and symbolic colour information from the RFID tags. Note that the {near-self} information which could be inferred from the RFID reading was less precise than the information from the camera, so the information in the local anchor reflects only the camera’s position estimate.
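Since the observation plots above summarise each position grid by its centre of gravity (per equation 5.3), the following minimal sketch shows that computation on a hypothetical grid; the cell values and resolution are assumed for illustration only.

```python
# Centre-of-gravity (weighted centroid) of a small, assumed position grid.
import numpy as np

xs = np.array([0.0, 100.0, 200.0, 300.0])      # cell centres, mm
ys = np.array([0.0, 100.0, 200.0])
grid = np.array([[0.0, 0.2, 0.8, 0.1],
                 [0.0, 0.5, 1.0, 0.2],
                 [0.0, 0.1, 0.4, 0.0]])         # rows indexed by y, columns by x

X, Y = np.meshgrid(xs, ys)
total = grid.sum()
cog = (float((X * grid).sum() / total), float((Y * grid).sum() / total))
print("centre of gravity (mm):", cog)
```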


Figure 8.13: Parcel observations for run 1 of experiment 2.


Figure 8.14: Parcel observations for run 2 of experiment 2.



Figure 8.15: Error for each parcel observation, versus time, for run 1 of experiment 2.


Figure 8.16: Error for each parcel observation, versus time, for run 2 of experiment 2.


The global anchors from these two observations were identical to the local anchors, as shown at t1 in figure 8.17. In the second part (after t1) of the first run, PeopleBoy observed P1, which matched Astrid’s local anchor ψ12. At t2 in figure 8.17 the global anchor ω212 contains the result of the fusion of local anchors ψ12 and ψ21. This is the anchor which PeopleBoy would then use to complete the task.

In the first part (up to t1) of the second run, Astrid again observed parcel P4 first. The corresponding local anchor ψ12, shown at t1 in figure 8.18, contained position information from the camera, and symbolic colour information from P4’s RFID tag. After this, parcel P2 was observed, but this observation contained only position information, since Astrid’s camera was black and white (and P2 had no RFID tag with which to convey colour information). A separate local anchor was created for P2; this anchor is not shown in the figure. Soon after the creation of the local anchor for P2, P1 was observed; this observation again contained position information from the camera, as well as colour information from the RFID tag. This anchor matched the anchor for P2. This is because the anchor for P2 contained no colour information, and the position uncertainty in both anchors, from both self-localisation and the observations, was larger than the distance between them. So the observation of P1 was fused with the local anchor which was originally created for P2; this anchor is ψ12 at t1 of figure 8.18. Note that this anchor contains the colour information from P1’s RFID tag, even though the position information was erroneously computed from both P1 and P2. Again, the created global anchors were identical to the local anchors at t1.

In the second part (after t1) of the second run, PeopleBoy first observed P2. This anchor contained position and colour information from PeopleBoy’s camera, as shown at t2 of figure 8.18. It did not match any of Astrid’s local anchors, and so the resulting global anchor ω221 was a copy of ψ21. Finally, PeopleBoy observed P1, and this anchor matched both the position and colour of ψ12. The resulting global anchor ω212 was then the final anchor which PeopleBoy would use to complete the task.

The positions of the global anchors which resulted from the matching of local anchors are shown in figures 8.19 and 8.20. The true positions of the parcels are shown by the large filled points; the blue points correspond to the blue parcels, and the pink points correspond to the pink parcels. The other plotted points are the centre of gravity (per equation 5.3) of the position grids of the global anchors. The point shapes and colours indicate the parcel to which each global anchor corresponds. The error in global anchor position estimates is shown in figures 8.21 and 8.22. Recall that global anchors were computed at 1Hz during the offline processing. Global anchors were only computed when there were changes to the local anchors, and local anchors were only computed when there were new observations. In the offline analysis presented here, it is assumed that all information was successfully exchanged between robots, and no approximations were used; this means that global anchors were identical across robots.


Figure 8.17: PeopleBoy’s local (left) and global (right) anchors at two time points for run 1 of experiment 2. Darker areas are more possible, lighter areas are less possible. The dashed boxes indicate which anchors refer to the same objects.

Figure 8.18: PeopleBoy’s local (left) and global (right) anchors at two time points for run 2 of experiment 2. Darker areas are more possible, lighter areas are less possible. The dashed boxes indicate which anchors refer to the same objects.



Figure 8.19: Global anchors created during run 1 of experiment 2.


Figure 8.20: Global anchors created during run 2 of experiment 2.



Figure 8.21: Error for each global anchor, versus time, for run 1 of experiment 2.


Figure 8.22: Error for each global anchor, versus time, for run 2 of experiment 2.


The global anchors reflect the estimates for each parcel which result from the combination of the local anchors from both robots. Note that these estimates are much more consistent than the raw observations.

8.5.4 Results

In both runs of the experiment, parcel P1 was found, and an accurate position estimate was obtained. Despite the fact that in the second run Astrid was not able to detect that P1 and P2 were two distinct parcels, the information exchange between Astrid and PeopleBoy allowed the situation to be disambiguated, and appropriate global anchors for all observed parcels were successfully created. In both runs, the global anchor position errors for all parcels were well under the 400mm margin specified in the task, as can be seen in figures 8.21 and 8.22.

8.5.5 Discussion

The successful achievement of the experimental task demonstrates that the framework is robust enough to handle the uncertainty contained in information produced by real sensors on moving robots. The robots were able to exchange and correctly associate heterogeneous percepts about the parcels in the environment, despite this uncertainty. This means that the experimental goal was achieved.

This experiment was designed to be similar to the previous experiment, which was carried out in the simulated environment. The focus in the simulated experiment was on gathering and combining various types of information from the different information sources. This showed the breadth of the framework, in terms of its ability to consider various domains and various information sources. The focus of this experiment, performed in the real environment, was on the robustness of the framework with respect to uncertainty contained in real sensor data. This experiment was intended to complement the simulated experiment.

Looking at the results from both experiments, it can be concluded that the proposed anchoring framework is both broad enough to be able to consider various domains and information sources, and robust enough to handle the uncertainty which is present in real sensor data.


8.6 Experiment 3: Find Multiple Parcels

8.6.1 Goal

The task in this experiment was for a group of robots to obtain position and colour estimates for a number of parcels which were lying on the ground at various locations in the Peis-Home. Position estimates should be accurate to within 400mm, which was roughly the size of the parcels. To build a consistent world model, the robots needed to exchange, match and combine information.

This experiment had two objectives. First, it aimed to verify that the framework is able to successfully match and fuse heterogeneous and uncertain percepts obtained from more than two robots, without degradation of the quality of the estimates. For some algorithms, the step from two to three robots can be problematic; this experiment aimed to show that the proposed framework does not suffer from this problem. The fusion algorithm itself was studied empirically in chapter 7; this experiment complemented that analysis by verifying that the benefits of the fusion approach are not compromised when the method is used within the overall anchoring framework.

As a secondary objective, the experiment was used to briefly characterise the performance of the full and bounded versions of the data association algorithm. In particular, the experiment compared the correctness and required computation time of the bounded version versus the full version.
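To give a feel for the kind of trade-off examined in this secondary objective, the sketch below shows a generic greedy (best-first) nearest-neighbour association with an optional cap on the number of candidate pairs considered. It is not the specific data association algorithm described in chapter 6; the gate value and the distance function are assumptions made for this illustration.

```python
# Generic greedy best-first association sketch with an optional bound on the
# number of candidate pairs; illustrative only, not the chapter 6 algorithm.
import itertools

def associate(new_anchors, existing_anchors, distance, gate=500.0, max_pairs=None):
    """Greedily pair new anchors with existing ones, closest pairs first."""
    pairs = [(distance(n, e), i, j)
             for (i, n), (j, e) in itertools.product(enumerate(new_anchors),
                                                     enumerate(existing_anchors))
             if distance(n, e) <= gate]
    pairs.sort()                        # best-first: smallest distance first
    if max_pairs is not None:
        pairs = pairs[:max_pairs]       # bounded variant considers fewer pairs
    used_new, used_old, assignment = set(), set(), {}
    for _, i, j in pairs:
        if i not in used_new and j not in used_old:
            assignment[i] = j
            used_new.add(i)
            used_old.add(j)
    return assignment                   # unmatched new anchors start new tracks

euclid = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
print(associate([(0, 0), (2000, 100)], [(50, 40), (2100, 0)], euclid))
```

Capping the candidate pairs bounds the work per update at the risk of missing some associations, which is the correctness versus computation-time trade-off the experiment examines.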

8.6.2 Setup

For this experiment, the real Peis-Home was used, and the setup was identical to that used in the second experiment, except that in addition to the two mobile robots, the fixed camera C1 was also used. The setup is shown in figure 8.11. The fixed camera, like the cameras mounted on the mobile robots, was configured with a resolution of 320x240, and it provided images at 1Hz. Deployment The deployment of processes and the connections between them are the same as in the previous experiment; see figure 8.7. Domains The domains considered were the same as in the previous experiment; position information, in global 2D coordinates, and colour information, in 3D HSV coordinates, were both used, as discussed in section 5.3.


Information Sources

The following information sources provided information about objects to the anchoring framework.

• C1.1: The fixed camera provided {observed-range-bearing} position and {observed-colour} colour information about detected parcels. The peiscsvision Peis-Component was calibrated to filter on shape and colour, so that only the desired parcels were observed. As in the previous experiments, position observations were converted from observed local coordinates to the global coordinates used in the local anchor space via the coordinate transformation described in section 6.2 (a sketch of this kind of transformation is given after this list). Colour observations were mapped into local anchor spaces as described in section 5.6.2.

• Astrid's and PeopleBoy's cameras (R1.1 and R2.1): The robots' on-board cameras produced {observed-range-bearing} position information and {observed-colour} colour information about observed objects. Again, peiscsvision was filtering on shape and colour, and the conceptual sensor models described in sections 6.2 and 5.6.2 were used to map information into the local anchor space. Both robots were moving during the experiment, and self-localisation estimates were obtained from the Player AMCL driver, which implemented the self-localisation algorithm described in section 6.1.3.

• Astrid's RFID reader (R1.2): Astrid was equipped with a forward-facing RFID reader, which could read RFID tags located within about 200mm. The RFID reader was mounted at a height which allowed it to detect and read the RFID tags which were placed on some of the parcels. The information on the tags consisted of the tagged parcel's name and colour. Colour information was symbolic {is-colour} information, described in section 5.6.1. Detection of RFID-tagged objects also allowed {near-self} symbolic position information to be inferred, as discussed in section 5.6.1. This information was very imprecise, since it included both Astrid's self-localisation uncertainty and significant uncertainty regarding the exact position of the detected RFID tags.
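
As an illustration of the kind of coordinate transformation mentioned in the first item above (the actual conceptual sensor model is defined in section 6.2 and maps observations into fuzzy position grids rather than points), a minimal sketch of converting a range-bearing observation into global coordinates could look as follows; the function and variable names are hypothetical.

    import math

    def range_bearing_to_global(range_mm, bearing_rad, robot_x, robot_y, robot_theta):
        # Convert a range-bearing observation made from a robot at pose
        # (robot_x, robot_y, robot_theta) into a global (x, y) point.
        # Uncertainty is ignored here; the actual conceptual sensor model
        # spreads the observation over a fuzzy position grid instead.
        angle = robot_theta + bearing_rad  # bearing assumed relative to heading
        x = robot_x + range_mm * math.cos(angle)
        y = robot_y + range_mm * math.sin(angle)
        return x, y

    # Example: a parcel seen 1500mm away, 0.2rad to the left of a robot at
    # (2000, 1000) facing along the positive y axis.
    print(range_bearing_to_global(1500.0, 0.2, 2000.0, 1000.0, math.pi / 2))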

8.6.3 Execution

Initial Configuration

Two runs of the experiment were performed. For both runs, the initial positions for the robots and parcels were the same as for the first run of the second experiment, shown in figure 8.12. There were four parcels to detect, two blue and two pink. Parcels P1 and P4, one of each colour, had RFID tags, on which the parcel's name and colour were written.


Only two of the parcels, P1 and P2, were within the field of view of the fixed camera C1. At the beginning of the experiment, none of the participants knew the number or properties of the parcels in the environment. Note that, as in the previous experiment, figure 8.12 reflects a Gazebo simulation of the initial setup, created for visualisation purposes only; the experiment was executed in the real environment, shown in figure 8.11.

Actions

At the beginning of each run, the fixed camera C1 quickly located P1 and P2, which were within its field of view. Meanwhile, the robots, manually controlled, navigated around the Peis-Home, visiting the parcels in an arbitrary order. As in the previous experiment, the robots were normally kept still for a few seconds in front of each parcel, although observations were also made while the robots were moving.

Observations

Figures 8.23 and 8.24 show the parcel observations made by each information source during each run. The true positions of the parcels are shown by the large filled points; the blue points correspond to the blue parcels, and the pink points correspond to the pink parcels. The plotted observations reflect the centre of gravity (per equation 5.3) of the position grids which were created using the conceptual sensor models described in sections 6.2 and 5.6.1. The point shapes indicate the parcel from which each observation originated. Point colours indicate which robot made the observation. Astrid's RFID observations (R1.2) are indicated by larger points. Both runs were relatively long, and the robot trajectories were overlapping throughout the runs; for this reason, robot trajectories are omitted from the figures.

Figures 8.25 and 8.26 show the error in each observation versus time. These plots give a sort of time line of the events in each run. These figures show that all robots were making observations sporadically throughout both runs. Note that as in the previous experiment, Astrid's RFID observations (information source R1.2) were generally farther from the parcels than the other observations. This is because these observations consist of {near-self} information, centred around Astrid's position. These observations are nonetheless consistent with the parcel positions, since they take the imprecision of this type of information into account.

Anchors

Figures 8.27 and 8.28 show the positions of the global anchors for each parcel, after each execution of the global anchor management steps. The true positions of the parcels are shown by the large filled points; the blue points correspond to the blue parcels, and the pink points correspond to the pink parcels.
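
Equation 5.3 itself is not reproduced in this chapter; purely as an illustration of the centre-of-gravity computation used above to plot the observations, a standard weighted mean over a 2D membership grid might look as follows (the grid layout, cell size and origin are hypothetical).

    import numpy as np

    def grid_centre_of_gravity(grid, origin_xy=(0.0, 0.0), cell_mm=50.0):
        # grid[i, j] holds the membership value of the cell whose centre is at
        # origin + ((j + 0.5) * cell_mm, (i + 0.5) * cell_mm).
        total = grid.sum()
        if total == 0.0:
            return None  # empty grid: no position estimate
        rows, cols = np.indices(grid.shape)
        x = origin_xy[0] + (cols + 0.5) * cell_mm
        y = origin_xy[1] + (rows + 0.5) * cell_mm
        return float((grid * x).sum() / total), float((grid * y).sum() / total)

    # Example: most of the mass around one cell, a little in a neighbouring cell.
    g = np.zeros((4, 4))
    g[1, 2] = 1.0
    g[1, 1] = 0.5
    print(grid_centre_of_gravity(g))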

8.6. EXPERIMENT 3: FIND MULTIPLE PARCELS

183

Figure 8.23: Parcel observations for run 1 of experiment 3.

Figure 8.24: Parcel observations for run 2 of experiment 3.


Figure 8.25: Error for each parcel observation, versus time, for run 1 of experiment 3.

Figure 8.26: Error for each parcel observation, versus time, for run 2 of experiment 3.


Figure 8.27: Global anchors created during run 1 of experiment 3.

Figure 8.28: Global anchors created during run 2 of experiment 3.


Figure 8.29: Error for each global anchor, versus time, for run 1 of experiment 3.

Figure 8.30: Error for each global anchor, versus time, for run 2 of experiment 3.


The other points correspond to the centre of gravity of the position grids in the global anchors; point shapes and colours indicate the parcel to which each global anchor corresponds. Figures 8.29 and 8.30 show the corresponding anchor position error versus time. Recall that global anchor management was run at 1Hz; however, when no new local anchors existed, global anchors were not updated. In the offline analysis presented here, it is assumed that all information was successfully exchanged between robots, with no approximations, so global anchors are identical across robots.

The global anchors reflect the estimates for each parcel which result from the combination of the local anchors from all robots. Note that these estimates are much more consistent than the raw observations. It is interesting to note that the global anchors corresponding to the two parcels visible to the fixed camera were extremely similar to the observations from the fixed camera, and they do not appear to be significantly affected by the observations from the mobile robots. This is because the estimated uncertainty of the fixed camera's estimates (from both self-localisation and observations) is smaller, so the position estimates from the fixed camera are more precise than those from the mobile robots. Since the estimates are essentially completely overlapping, the agreement between them is essentially identical to the more precise estimates, as discussed in section 5.2.3. Incidentally, the camera estimates were often actually less accurate than the robot observations; this was likely due to slightly incorrect self-localisation information for the camera.

Bounded Data Association Correctness

The data collected for this experiment was processed using both the full and bounded versions of the data association algorithm, discussed in section 6.3.4. When the bounded version was used, it was used for both local and global data association. The observations were the same for both the bounded and non-bounded versions. The global anchor positions obtained using the bounded algorithm are shown in figures 8.31 and 8.32. The errors in the global anchor position estimates are shown in figures 8.33 and 8.34.

For the first run, the results using the bounded version of the data association algorithm differed slightly from those obtained with the full version. The results for the second run were identical to the full version. The differences in the first run can be seen by comparing figures 8.27 and 8.31 with figures 8.29 and 8.33. In particular, notice that in figure 8.33, for numerous time points near the middle of the run, there are multiple global anchors for parcels P1 and P2. This indicates errors in data association, since anchors which should have matched were not treated as matching. Recall that the bounded version of the algorithm will never consider two non-matching entities to be matching; it can, however, erroneously choose not to match entities which should be matched. This is what occurred in the first run of this experiment.


Despite the data association errors made by the bounded algorithm in the first run, by the end of the run the global anchors are identical to those obtained using the full algorithm. This recovery suggests that no errors in local data association were made. This can be inferred because anchors are never deleted, so an erroneously created local anchor would never have disappeared. Global data association errors, on the other hand, can be "undone" during the next global anchor management step, since previous global anchors are destroyed. This is one reason why it was mentioned in section 6.3.4 that the bounded version of the algorithm can normally be safely applied to the global data association step. Using the bounded version for local data association is more of a risk, since erroneous local anchors might be created.

A summary of the performance of the bounded and full versions of the data association algorithm is shown in table 8.1. The percentage of data association errors is computed as the number of times the global data association resulted in invalid anchors, divided by the total number of times that global data association was performed in the run. The average position error is the average error over all parcels, for all executions of the global data association algorithm. The final position error is the average error over the four parcels, in mm, at the end of the run.

Table 8.1: Results using full and bounded data association algorithms.

                                          Run 1               Run 2
                                      full    bounded     full    bounded
    data association errors (%)          0      10.48        0          0
    average position error (mm)        281        276      259        259
    final position error (mm)          244        244      227        227
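
To make the metrics in table 8.1 concrete, they could be computed from a run log roughly as sketched below; the log format shown is hypothetical and is not the one actually used to produce the table.

    def summarise_run(log):
        # log: list of per-execution records of global data association,
        # each of the (hypothetical) form
        #   {'valid': bool, 'errors_mm': {parcel_name: position_error}}
        n_runs = len(log)
        error_rate = 100.0 * sum(1 for rec in log if not rec['valid']) / n_runs
        all_errors = [e for rec in log for e in rec['errors_mm'].values()]
        average_error = sum(all_errors) / len(all_errors)
        final = log[-1]['errors_mm']
        final_error = sum(final.values()) / len(final)
        return error_rate, average_error, final_error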

Bounded Data Association Processing Time

The computation time required for local and global data association is an important aspect of the data association algorithm, given the inherent complexity of the problem, discussed in section 6.3.5. The computation time for both local and global data association was logged for both runs of the experiment, using both the full and bounded versions of the data association algorithm. Note that the implementation was not code-optimised, so the absolute computation time could probably be significantly improved; however, the provided values nonetheless give an indication of the relative complexity of the presented methods. Processing was performed offline, using a Linux workstation with a 3.2GHz processor.

Figures 8.35 and 8.36 show the number of associations and required computation time for local data association, for both the full and bounded data association methods, and for both runs of the experiment.


Figure 8.31: Global anchors created during run 1 of experiment 3, using the bounded data association algorithm described in section 6.3.4.

Figure 8.32: Global anchors created during run 2 of experiment 3, using the bounded data association algorithm described in section 6.3.4.


Figure 8.33: Error for each global anchor, versus time, for run 1 of experiment 3, using the bounded data association algorithm described in section 6.3.4.

Figure 8.34: Error for each global anchor, versus time, for run 2 of experiment 3, using the bounded data association algorithm described in section 6.3.4.


Figure 8.35: Number of associations and computation time versus run time for local data association in run 1 of experiment 3.

Figure 8.36: Number of associations and computation time versus run time for local data association in run 2 of experiment 3.


Figure 8.37: Number of associations and computation time versus run time for global data association in run 1 of experiment 3.

Figure 8.38: Number of associations and computation time versus run time for global data association in run 2 of experiment 3.


Figure 8.39: Computation time versus number of associations for both runs of experiment 3.

The solid green line indicates the number of associations (left y-axis). The plotted red + and blue × symbols indicate the processing time for the full and bounded versions of the data association algorithm, respectively (right y-axis). Note that for local data association, the number of associations, and hence the computation time, varies little throughout the runs. This is because the number of percepts considered during each local data association step does not vary significantly. This is expected, since the number of percepts is limited by the number of information sources each robot is using. The computation time required by the full and bounded methods is essentially the same.

Figures 8.37 and 8.38 show the number of associations and required computation time for global data association, for both the full and bounded data association methods, and for both runs of the experiment. Again, the solid green line indicates the number of associations (left y-axis). The plotted red boxes and blue diamonds indicate the processing time for the full and bounded versions of the data association algorithm, respectively (right y-axis). The global data association step considers an increasing number of associations as the run progresses, since there are more local anchors to consider as each new parcel is observed. This means that more computation time is also required.


As can be seen in the figures, the computation time requirements increase dramatically near the end of the runs, and the performance advantage of the bounded method becomes increasingly apparent.

Figure 8.39 shows the overall trend of computation time versus the number of associations, for both the full and bounded data association methods, and for both runs of the experiment. Both local and global data association are included. Note that the local data association instances only affect the beginning of the curve, since the number of associations for local data association is always relatively low. Trend lines computed using regression are also shown, for both the full and bounded versions of the algorithm.

The fact that the bounded method offers a significant reduction in required computation time when the number of associations gets large is yet another reason why the bounded method is particularly useful for global data association. This is also another reason why it might often be unnecessary to use the bounded version for local data association.
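
The thesis does not state which regression model was used for the trend lines in figure 8.39; as a simple illustration, a least-squares polynomial fit of processing time against the number of associations could be computed as follows (the data values below are made up).

    import numpy as np

    # Hypothetical logged points: number of associations and processing time (s).
    n_assoc = np.array([5, 10, 20, 40, 60, 80])
    proc_time = np.array([0.01, 0.02, 0.06, 0.22, 0.48, 0.85])

    # Degree-2 least-squares fit, giving a quadratic trend line.
    trend = np.poly1d(np.polyfit(n_assoc, proc_time, deg=2))
    print(trend(70))  # predicted processing time for 70 associations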

8.6.4 Results

The task was successfully achieved in both runs, with both the full and bounded versions of the data association algorithm. The robots were able to properly associate observations and anchors both locally and globally, in order to properly detect all four parcels. Despite some data association errors when using the bounded algorithm in one run, by the end of the run the results were the same as for the full version.

Final position estimates were all well within the desired 400mm error margin, as can be seen in figures 8.29 and 8.30. These figures also show that the position estimate errors were, in general, decreasing as new information arrived throughout the runs. The global anchors were consistent with the ground truth positions of the parcels, as can be seen in figures 8.27 and 8.28.

The computation time for local data association was always less than 50ms, which would allow a local update rate of up to 20Hz. The computation time for global data association was always less than 1s, which would allow a global update rate of up to 1Hz. This can be seen in figure 8.39. These numbers could likely be improved with code optimisation.

8.6.5 Discussion

The experimental objectives were achieved, since the specified task was accomplished. The data association algorithm successfully matched information from three real robots, two of which were moving. The errors in global anchor position estimates were within the specified error margin, and, in general, the error decreased as new information arrived. The experiment also verified that the bounded version of the data association algorithm produced results similar to the full version, while using less computation time.


8.7 Experiment 4: Anchoring in a Full Robotic System

8.7.1 Goal

The task for this experiment was inspired by the "Lost and Found" challenge of the RoboCup@Home competition [130], and involved having PeopleBoy detect and navigate to one or more household objects placed within the experimental environment. Two fixed cameras were available to assist in object detection and coarse object localisation.

The goal of this experiment was to verify that the proposed anchoring framework can be included as part of an overall robotic architecture, which can be used to address a complex online task. The task requires that the cooperative anchoring problem be addressed in a distributed robotic system. The experiment complements the previous experiments, which examined the anchoring framework in isolation, using offline processing. Recall that this final experiment was performed by Borissov and Janecek [25, 26], using a simplified implementation of the anchoring framework proposed in this work. An overview of their implementation of the anchoring framework will be presented shortly.

8.7.2 Setup

For this experiment, the real Peis-Home was used. The setup was similar to the setup used in the previous two experiments. The robots used were the mobile robot PeopleBoy, and both fixed cameras, C1 and C2. The fixed cameras were positioned as shown in figure 8.40; note that the cameras had roughly perpendicular perspectives of the Peis-Home. The figure shows the region of the Peis-Home which was within the field of view of both cameras. The three robots each had one camera as their only information source. The fixed cameras produced position and colour information. The robot's camera produced colour, shape, and SURF features (Speeded Up Robust Features [15], discussed below).

The deployment of processes was similar to that shown in figure 8.7; note that the peisrouter and anchoring monitor tool were not used. The peiscam and peiscsvision components were used as before, and the Player drivers were also used. Path planning and motion control were also performed using Player drivers. A number of other Peis-Components were used to perform operations on images, as well as to execute the overall object detection strategy.

The objects which the system had to detect were five everyday objects, which could be detected using colour segmentation and identified using SURF features. The selected objects are shown in figure 8.41.


Figure 8.40: A rough map of the Peis-Home, in which the area viewed by each fixed camera is shown. The darkened area is visible to both fixed cameras. Figure adapted from [25], used with permission.

8.7.3 Approach

Overall Strategy

The approach taken by Borissov and Janecek to address the task involved three phases and a number of steps, which are summarised here.

• Phase 1 – Offline Setup: The following two offline steps were performed.

– Step 1 – Offline Object Inspection: This step involved showing the various objects of interest to PeopleBoy, from various viewing angles, against a neutral background. This allowed PeopleBoy to obtain a description of each object in terms of its colour, shape, and a set of SURF features. Object sizes were also added to the description of each object. The resulting set of positive descriptions was used by all robots.

– Step 2 – Offline Background Image Acquisition: Fixed cameras acquired images of the environment before the target objects were placed.

• Phase 2 – Candidate Detection: Once the offline steps were performed, the system could begin searching for the desired objects using colour segmentation on images obtained from the two fixed cameras. The candidate detection steps can be summarised as follows.

Figure 8.41: Objects to detect for experiment 4. Figures from [25], used with permission.


– Step 3 – Background Subtraction: The fixed cameras each acquired images of the environment, with one or more desired objects positioned somewhere within the field of view of both cameras. Background subtraction and colour segmentation were performed, resulting in a number of coloured blobs, which corresponded to objects which appeared to have been placed in the environment after the background images were taken. If no interesting blobs were found, new images were acquired (it could happen that the desired objects were simply not in place yet, for instance). From the point of view of the anchoring framework, the background subtraction step was performed from within each camera's information source. The output blobs were described by their colour, size in pixels, and position in the image.

– Step 4 – Interest Filtering: Interest filtering was performed on the blobs output from the previous step, based on their colour. Blobs which did not match any of the positive descriptions created in step 1 were discarded.

– Step 5 – Local Anchor Management: The blobs which survived the previous step were converted into local anchors, containing position and colour estimates. Local matching and fusion were not needed, since each camera only considered one information source, and one input image (taken at a single point in time). Given this, no two blobs could be considered to be matching (given the normal data association assumptions). This meant that there was exactly one local anchor for each blob.

– Step 6 – Global Anchor Management: Position and colour estimates from both cameras were matched and fused. The result of the global anchor management step was a set of global anchors, which became the object candidates to be verified in the third and final phase of the approach. Note that the position estimates obtained in this step were the only position estimates used by the system; position estimates were not refined in the verification phase.

• Phase 3 – Verification: Once a set of candidates was available from the fixed cameras, the final phase of the detection task was initiated, in which PeopleBoy verified the candidates using the following steps.

– Step 7 – Navigation: PeopleBoy navigated to the approximate position of the closest unverified candidate object. If no unverified candidates were available, the system indicated that the task was completed, and execution was halted.


– Step 8 – Scanning: When PeopleBoy was in the vicinity of the closest unverified candidate object, a scan was performed by moving the pan-tilt unit. Blobs which had both the same colour as the candidate, and a shape which was consistent with a similarly-coloured positive description, were verified using SURF features, per step 9. If verification succeeded for any of the blobs, the system removed the verified object from the list of positive descriptions, and returned to step 7. If the pan-tilt scan resulted in no found blobs, or if verification of all found blobs failed, PeopleBoy rotated 90 degrees and performed another pan-tilt scan. If all rotations were tried with no successful verification, PeopleBoy returned to step 7.

– Step 9 – Feature Verification: Candidate blobs which matched a description's colour and shape during the scan in step 8 were further verified using SURF features. This step was only used for blobs which already matched in the colour and shape domains, since SURF feature extraction is a computationally demanding operation. Before extracting the features, the camera zoomed in on the candidate blob. If a blob was found to match any of the positive descriptions in all three domains (colour, shape, and SURF features), the system indicated that verification had succeeded; otherwise, verification failure was indicated.

Anchoring Implementation: Representation

The domains used in the experiment included shape, SURF features, colour, and position. The local and global anchor spaces had the same coordinate systems, so space transformation functions were not needed. The domains and conceptual sensor models are briefly described here; for more details see the full description of the work by Borissov and Janecek [25, 26].

A blob's shape was represented using a normalised array of 64 elements, which corresponded to distances from the centre of the blob; the rays were roughly 6 degrees apart. Matching of shapes, needed for the interest filtering part of step 8, was performed by finding the best alignment between the two shape arrays, and computing a normalised difference between the lengths of all elements. The representation can be seen as a sort of circular 1D bin model, per section 5.2.2 (note however that in this case, the representation was not used to represent a fuzzy set, merely an array of values). An interesting difference between this and previously discussed domains is that the bins do not have an inherent alignment, since shape signatures are rotation independent; this is why matching requires that the best alignment be found. Fusion of shape information was not required. A sample shape signature, shown with only 22 distance segments, is shown in figure 8.42(a).

Each object of interest was also described using SURF features, obtained during offline inspection of the object. These features contained descriptions of salient or interesting points in images containing the object. SURF features are intended to be robust with respect to luminance, contrast, size, and orientation.
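
A minimal sketch of the rotation-independent shape-signature matching described above might look as follows; the exact normalisation and difference measure used by Borissov and Janecek are not reproduced here, so the details below are only an assumption.

    import numpy as np

    def shape_match(sig_a, sig_b):
        # sig_a, sig_b: normalised 1D arrays of ray lengths (e.g. 64 elements).
        # The signatures have no inherent alignment, so the best circular shift
        # of one against the other is found; 1.0 means identical shapes.
        sig_a = np.asarray(sig_a, dtype=float)
        sig_b = np.asarray(sig_b, dtype=float)
        best_diff = min(np.abs(sig_a - np.roll(sig_b, s)).mean()
                        for s in range(len(sig_b)))
        return 1.0 - best_diff

    # A signature matches a rotated copy of itself perfectly.
    sig = np.linspace(0.2, 1.0, 64)
    print(shape_match(sig, np.roll(sig, 10)))  # -> 1.0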


Figure 8.42: Shape signature (a) and SURF feature locations (b) for one of the objects used for experiment 4. Figures from [25], used with permission.

The SURF feature space is a multi-dimensional space, in which each point of interest is described using a vector of values. Various transformations are used to extract the salient points in a given image, and to derive the description vectors. The details of these transformations are not within the scope of this work; more details can be found in the literature [15, 106, 105]. The important thing to note is that SURF features allowed signatures to be created for perceived objects, and these signatures could be compared with one another. The matching process involved comparing features from two objects, and the resulting match value reflected how many features matched. Fusion of SURF information was not required. The interesting points found in an image of one of the objects used in the experiment are shown in figure 8.42(b).

The HSV colour space was used for colour. For interest filtering, both in steps 4 and 9, colours were represented using symbolic colour names, each mapped to a non-overlapping hue range. This representation was essentially a binary 1D bin representation similar to those discussed in section 5.2.2; the bin model was used to represent crisp sets, rather than fuzzy sets. A given hue value was mapped to a specific bin, with a membership value of 1; all other bins had membership values of 0. Only matching was performed on this type of representation; the matching process simply involved checking whether, for two colours, the same bin had a membership value of 1. For the global anchor management performed in step 6, colours were represented using fuzzy sets, implemented using two trapezoids: one for hue, and one for saturation. Such 2D parametric trapezoidal representations were discussed in section 5.2.2. The matching and fusion of these were performed according to section 5.2.3. Operations on these parametric representations were fast and easy to compute.
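
The thesis does not give the exact feature-matching procedure; one common way of counting matching descriptors between two sets of feature vectors, sketched here with a nearest-neighbour ratio test, could look as follows.

    import numpy as np

    def count_feature_matches(desc_a, desc_b, ratio=0.7):
        # desc_a, desc_b: (n, d) arrays of descriptor vectors (for SURF, d is
        # typically 64 or 128).  A descriptor in desc_a counts as matched when
        # its nearest neighbour in desc_b is clearly closer than the
        # second-nearest one.
        matches = 0
        for d in desc_a:
            dists = np.linalg.norm(desc_b - d, axis=1)
            if len(dists) < 2:
                continue
            first, second = np.partition(dists, 1)[:2]
            if first < ratio * second:
                matches += 1
        return matches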


Position information was used in steps 5 and 6. For a given blob received from one of the fixed cameras, range and bearing fuzzy sets for the corresponding object were computed using: the minimum and maximum dimensions of all objects of interest which had the same colour; the blob's size and position in the image; and the camera's position, orientation, and field of view. These computations were similar to those performed in the fusion experiments described in chapter 7. Range and bearing estimates were represented using two trapezoids, similar to the inputs to the conceptual sensor model described in section 6.2. Note that the core of the range estimate trapezoids could be quite large, especially for objects with significant asymmetry in their shape, such as the book in figure 8.41. The support of the range estimates and the core and support of the bearing estimates were selected based on empirical data.

The trapezoids for range and bearing were converted into trapezoids for x and y, using a bounding box approach. This step increased the uncertainty in the estimates, but simplified the matching and fusion processes. The increase in uncertainty was made less significant by the fact that the cameras were almost aligned with the x and y axes to start with. A graphical example of the process is shown in figure 8.43. The resulting representation was a 2D parametric trapezoidal representation, as in section 5.2.2. Again, these trapezoids could be easily compared and fused, per section 5.2.3.

Anchoring Implementation: Processing

The local anchor management performed in step 5 was trivial, since no matching or fusion was needed. In the global anchor management performed in step 6, data association and fusion were performed using information from both the colour and position domains. Both colour and position information were represented using fuzzy sets, implemented using 2D trapezoids. In the colour domain, the trapezoids represented hue and saturation; in the position domain, they represented x and y. Matching and fusing such trapezoids was a simple operation, discussed in section 5.2.2.

The matching operator used for data association in both the colour and position domains was the match1 operator, given by equation 5.7. A greedy data association algorithm was used instead of the globally optimal algorithm proposed in section 6.3. In each iteration, it associated the first two local anchors (one from each fixed camera, taken in an arbitrary order) which were found to be fully matching in both the position and colour domains. When no more fully matching local anchors were found, the algorithm halted. The fusion operator used for fusion of both colour and position was the fuse operator, given by equation 5.9. The fusion algorithm only needed to combine estimates from the two fixed cameras.
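
Equations 5.7 and 5.9 are not reproduced here. Purely as an illustration of the kind of operations involved, the sketch below measures the matching of two trapezoidal estimates by the (numerically sampled) height of their fuzzy intersection, and runs a greedy pairing of candidates from the two cameras; the trapezoid encoding and the "fully matching" threshold are assumptions, not the operators actually used.

    import numpy as np

    def trap(x, a, b, c, d):
        # Membership in the trapezoid with support [a, d] and core [b, c].
        rise = (x - a) / max(b - a, 1e-9)
        fall = (d - x) / max(d - c, 1e-9)
        return np.clip(np.minimum(rise, fall), 0.0, 1.0)

    def match_1d(t1, t2, lo, hi, n=1000):
        # Height of the intersection of two 1D trapezoids, sampled on [lo, hi].
        x = np.linspace(lo, hi, n)
        return float(np.max(np.minimum(trap(x, *t1), trap(x, *t2))))

    def match_2d(est1, est2, bounds):
        # Each estimate is a pair of trapezoids (one per dimension); membership
        # is assumed to combine dimensions by min, so the match degree is the
        # minimum of the per-dimension match degrees.
        return min(match_1d(est1[i], est2[i], *bounds[i]) for i in range(2))

    def greedy_associate(anchors_cam1, anchors_cam2, bounds, threshold=0.999):
        # Pair anchors from the two cameras whose estimates fully match.
        pairs, used = [], set()
        for i, a1 in enumerate(anchors_cam1):
            for j, a2 in enumerate(anchors_cam2):
                if j not in used and match_2d(a1, a2, bounds) >= threshold:
                    pairs.append((i, j))
                    used.add(j)
                    break
        return pairs

Fusion of two matched estimates would correspondingly take the point-wise minimum of their membership functions (a fuzzy intersection); the intersection of two trapezoids is in general no longer a trapezoid, so a parametric implementation would need to re-approximate the result.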


Figure 8.43: Range and bearing trapezoids for a given observation are shown in (a) and (b). A bounding box was used to convert these to bounding boxes in (x, y) coordinates, in (c). Finally, another bounding box was used to align the trapezoids with the x and y axes, in (d). Figures adapted from [25], used with permission.
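
A rough sketch of the bounding-box conversion illustrated in figure 8.43 is given below; it converts an interval of range and bearing into an axis-aligned box in global coordinates, and would be applied separately to the core and support intervals of the trapezoids. This is only an approximation of the approach, with hypothetical names.

    import itertools
    import math

    def range_bearing_box_to_xy_box(cam_x, cam_y, cam_theta, r_lo, r_hi, b_lo, b_hi):
        # Evaluate the four corner combinations of the range and bearing
        # intervals and take their axis-aligned bounding box.  If the bearing
        # interval spans an axis direction this slightly underestimates the
        # box, which is acceptable for a rough sketch.
        xs, ys = [], []
        for r, b in itertools.product((r_lo, r_hi), (b_lo, b_hi)):
            angle = cam_theta + b
            xs.append(cam_x + r * math.cos(angle))
            ys.append(cam_y + r * math.sin(angle))
        return (min(xs), max(xs)), (min(ys), max(ys))

Applying the same conversion to the core intervals and to the support intervals of the range and bearing trapezoids yields the core and support of the resulting x and y trapezoids.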

Anchoring Implementation: Comparison

The implementation of the anchoring framework used by Borissov and Janecek differed from the implementation proposed in chapters 5 and 6 in a number of respects. Most notably, a number of new and useful features were implemented. Shape and SURF feature domains were added, in order to allow more fine grained object detection and verification. Interest filtering was implemented for the shape, SURF, and colour domains, to allow the system to focus only on interesting objects.

In addition to these added features, a few simplifications were made, mainly to reduce execution time, and to reduce the complexity of the implementation. One such simplification, with respect to the framework itself, was that description matching in the colour domain was performed using a simpler representation than the representation used in the anchoring steps. This probably reduced the time required for interest filtering significantly. It is possible that using a more detailed representation might have reduced the number of candidate objects which the system needed to verify.


However, given that the hue dimension, which was used in the simplified colour representation, is often the most salient dimension of the HSV colour space, the overall impact of this simplification on the performance of the system was probably small.

Another simplification was that the representations for position and colour were parametric 2D trapezoids, as opposed to the grids used in the implementation proposed in this work. In particular, the position trapezoids were approximated using bounding boxes, which could increase the amount of uncertainty in position estimates significantly. This simplification would definitely reduce the computation time and complexity of the matching and fusion steps. The gain would be even greater in larger environments, since the benefit of parametric models versus bin models grows with the size of the environment. Overall, the increase in uncertainty was likely not a significant problem for the given task. This is mainly due to the fact that the position estimates obtained from the fixed cameras were only used as initial estimates, which guided the mobile robot as it performed a thorough inspection of the area.

Finally, the greedy data association algorithm was much simpler than the globally optimal algorithm proposed in this work. This simplification would again reduce computation time, while simplifying the implementation. For the application in question, the sub-optimality of the algorithm was probably a reasonable price to pay for this reduction in computational cost, for a number of reasons. First, data association only needed to consider the two fixed cameras, which meant that complex matching scenarios could never arise. Second, the background subtraction step significantly improved the robustness of the position and colour estimates obtained from the fixed cameras; this made data association errors less likely. And finally, even if data association were to fail, there would still be a chance for the mobile robot to detect interesting objects using position estimates obtained from only one fixed camera.

8.7.4 Execution

Initial Configuration

Three types of runs were performed. First, runs which considered only the first two phases of the approach were executed, and the accuracy of the candidate detection phase was examined. A second type of run examined the results achieved by the full system, which executed all three phases of the approach. Finally, runs which conformed to the RoboCup@Home Lost and Found scenario were executed.

All runs relied on the same offline phase, consisting of two steps. In step 1, each of the five objects shown in figure 8.41 was shown to PeopleBoy, from multiple viewing angles, against a neutral background. A custom graphical tool allowed colour, shape, and SURF information to be collected. Object sizes were entered using the same tool. In step 2, both cameras captured background images of the Peis-Home, before any of the objects of interest were in place.


Figure 8.44: The possible object positions used in experiment 4. Figure from [25], used with permission.

Five pre-defined positions were selected in the Peis-Home, each with four possible orientations. In all runs, objects were only placed at these positions. The positions were all within the field of view of both cameras, and they covered a range of positions and heights at which similar objects might be found in a typical household environment. The positions used are shown in figure 8.44.

First: Candidate Detection

In the first set of runs, which examined the accuracy of the candidate detection phase, one object was placed in the Peis-Home at a time. A total of 100 runs were performed, 20 for each object. Each object was placed in each of the five pre-defined positions, with each of the four pre-defined orientations. The candidate detection steps were then performed. Only the fixed cameras were involved in these runs.

Second: Candidate Detection and Verification

In the second set of runs, the performance of the overall system was examined. A total of 20 runs were performed. At the beginning of each run, one randomly selected object was placed in one of the five pre-defined positions, at one of the pre-defined orientations. The position and orientation were also randomly selected. The candidate detection and verification steps were then performed. Both the fixed cameras and the mobile robot were involved in these runs.


Third: RoboCup@Home Scenario

Two runs were performed which conformed to the RoboCup@Home Lost and Found scenario. In the scenario description, three objects should be placed in the environment, and all three should be found within five minutes. The objects were placed at pre-defined positions and orientations in the Peis-Home; the objects, positions, and orientations were again selected randomly. Candidate detection and verification steps were then performed; again, both the fixed cameras and the mobile robot were involved in these runs.

8.7.5 Results

First: Candidate Detection

The average position estimate error over the 100 runs was 1.02m, and the standard deviation was 0.8m. The median error was 0.81m. The error ranged from less than 1mm to 4.84m. In 8 out of the 100 runs, the object was not found by both cameras, in which case the detection was considered to have failed. This was likely caused by slight errors in segmentation. Given the range to the objects, and their small size, even errors of a few pixels could result in significant position estimate errors, which could cause matching to fail.

Second: Candidate Detection and Verification

The system successfully located and identified the target object in 14 out of the 20 runs. There were no false positives in any of the runs. The average distance between the robot and the detected object was 1.12m, and the standard deviation was 0.58m. On average, the task took 113 seconds to complete. The standard deviation of the time was 81 seconds.

Some of the failures were caused by problems which would be fairly easy to fix. For instance, two failures were caused by the robot coming too close to the object, which meant that the pan-tilt scan did not see the object. This could likely have been solved by using a different navigation routine, or by changing the range of the pan-tilt scan (the latter modification would cause a noticeable increase in the time required for each pan-tilt scan). One failure was caused by a failure in feature matching for the book; the features were not as rotation-independent as expected. This could potentially have been remedied by collecting features from more object rotations. One failure was caused by a poor initial candidate position estimate. The estimate was from only one camera, since the object estimates from the fixed cameras did not match. The robot stopped too far away from the object to see it during the scanning step. Using a more sophisticated representation for position information might make such failures less likely.


Third: RoboCup@Home Scenario

The system correctly located all three objects in both runs, and the robot was able to position itself within 1m of all objects. The first run completed in 226 seconds; the second run took 400 seconds, which was longer than the five minute time limit stipulated in the Lost and Found scenario. The delay in the second run was caused by improper segmentation of one of the objects in the candidate detection phase. The object was detected anyway, while scanning for one of the other objects.

8.7.6 Discussion

The three types of runs show that the approach to the object detection and localisation problem suggested by Borissov and Janecek is a promising one. The cooperative anchoring problem is successfully addressed by their implementation of the anchoring framework, and the experiments verified that the framework can be used to address a complex online task.

8.8 Summary

The objective of this chapter was to show that the proposed anchoring framework is able to address the anchoring problem, even in the face of uncertainty and heterogeneity. The four presented experiments showed how tasks which contained various aspects of the anchoring problem were successfully addressed using the proposed framework. These successes mean that the objective of the chapter has been met.

Chapter 9

Conclusions

This chapter provides a summary of the contributions of this work. First, the proposed anchoring framework, the presented realisation of it, and the performed experiments are examined. A number of limitations and potential directions for future work are then discussed. Finally, some general conclusions are drawn.

9.1 Summary

9.1.1 Problem Definition

Various definitions of the anchoring problem have been proposed over the years, and many have failed to fully capture the generality of the problem. Simply put, non-perceptual information used to describe and reason about objects of interest needs to be associated with perceptually grounded representations of these objects. This allows physically embedded agents to meaningfully interact with the objects in their environment. Non-perceptual information need not be limited to names or symbols, and perceptual information need not be limited to numeric sensor data. In chapter 3 this thesis provided general definitions for both the single-robot and cooperative anchoring problems.

9.1.2 Framework

A novel anchoring framework was proposed in chapter 4 which addresses a number of limitations of previous approaches. In particular, the proposed framework addresses both the single-robot and cooperative anchoring problems. Descriptions of objects of interest can be formulated in various ways, and these descriptions can be associated with representations of objects based on heterogeneous and uncertain perceptual information.

These perceptual representations, called anchors, can be seen as evolving regions in multi-dimensional anchor spaces. These spaces are inspired by conceptual spaces, and they allow various types of information from multiple domains to be represented.


In particular, both symbolic and numeric information can be represented in the same anchor space. Local anchor spaces are used to simplify interaction between various components in robotic architectures; global anchor spaces are used to simplify interaction between various robots.

The single-robot anchoring problem is addressed by computing local anchors based on information arriving from different sensors on the same robot. Local anchors can be updated frequently, and they can be particularly useful for task execution. The cooperative anchoring problem is addressed by computing global anchors based on local anchors exchanged between robots. Global anchors can be more complete and robust than local anchors, since they contain information obtained from multiple robots; this makes them particularly useful for cognitive tasks and coordination.

An important feature of the proposed anchoring framework is the transparency with which the single-robot and cooperative anchoring problems are addressed. If any local anchors are successfully exchanged, the framework allows the information they contain to be exploited by all recipients. If no local anchors are exchanged, the framework transparently allows global anchors to be computed and used nonetheless. This is possible due to the decentralised nature of the framework, which allows local and global anchor management steps to be performed independently.

9.1.3 Realisation

An implementation of the proposed framework was presented in chapters 5 and 6. The implementation describes one possible realisation of the proposed anchoring framework, and it was used mainly as an evaluation and testing tool. The implementation uses fuzzy sets, and associated operations, to represent, match, and fuse various types of information. The presented implementation was used with two different approaches to self-localisation, one of which also uses fuzzy sets to represent information. Self-localisation information is used as an input to the proposed multi-robot object localisation method, which takes both observation and self-localisation uncertainty into account. A data association algorithm was also proposed, which allows heterogeneous information from multiple domains to be matched and associated. The approach also allows object names to constrain associations, when possible. Associated information is fused using fuzzy intersection in order to obtain updated estimates of object properties which reflect a consensus between sources. The same data association and information fusion algorithms are used for both local and global anchor management.
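
As a small illustration of the fusion principle summarised above, a point-wise fuzzy intersection of two grid-based estimates could be computed as follows; the grid encoding is hypothetical, and whether or not to renormalise the result is an implementation choice.

    import numpy as np

    def fuse_grids(grid_a, grid_b):
        # Point-wise intersection (minimum) of two fuzzy grids, reflecting a
        # consensus between two sources.
        fused = np.minimum(grid_a, grid_b)
        peak = fused.max()
        if peak == 0.0:
            return None  # completely inconsistent sources: no consensus
        return fused / peak  # renormalise so the most plausible cell is 1.0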

9.1.4 Experiments

A number of experiments were used to validate the work presented in this thesis.


In chapter 7, the cooperative object localisation approach was examined using an "error landscape" analysis. This analysis allowed the performance of the approach to be characterised with respect to various types of errors on each of the method's inputs. The results showed that the method is quite adept at dealing with bearing errors in observations and orientation errors in self-localisation, while being somewhat less effective when faced with range errors.

In chapter 8, four experiments were used to validate the proposed anchoring framework. The first experiment was performed in a simulated and static environment, and it included the manual addition of two sensing domains to the implemented framework. This experiment demonstrated the use of descriptions for representing objects of interest, and it also illustrated the proposed local and global anchor management steps being performed across multiple domains by multiple robots. The second experiment demonstrated that the framework is able to handle the uncertainty in real sensor data originating from two mobile robots navigating in a real environment. The third experiment was also performed in a real environment, and it involved two mobile robots and one fixed camera. This experiment verified that the framework could handle more than just two agents, and it was also used to briefly characterise the correctness and time requirements of the full and bounded data association algorithms. Finally, a fourth experiment, performed by Borissov and Janecek [25, 26], demonstrated how a different implementation of the proposed anchoring framework was used within a complete robotic system to perform a complex object detection task in a real world environment.

9.2 Limitations and Future Work

9.2.1 Framework Improvements and Extensions

One limitation of the proposed framework is that it does not include any mechanism for recovering from local data association errors. Data association errors can occur for a number of reasons, including sensor errors and ambiguous situations. As new information is gathered, these errors might be detected; however, recovery is difficult if received percepts are not stored. It would be interesting to investigate the use of multiple hypothesis tracking (MHT) [129, 17] to track several association hypotheses simultaneously, especially when associating entities for which information is only available in separate domains.

A possible extension of the framework could include functionality for exchanging, matching and fusing object descriptions. This could facilitate coordination between robots, by allowing them to have shared representations not only of the objects they perceive, but also of the objects in which they are interested. In this thesis all such coordination was left up to higher layers. However, many of the functionalities present in the framework could be used to improve the consistency of object descriptions across robots.

Another possible extension could involve a more detailed look at self-anchors and their uses.


Self-anchors could be useful for cooperative self-localisation, as well as to improve coordination between robots in domains such as network robot systems, in which "smart objects" have information about themselves. These ideas were only briefly touched upon in this thesis.

Treatment of negative information could be another useful extension of the proposed framework. By considering the range and field of view of available sensors, as well as the estimated position of an object, it is possible to estimate the likelihood of observing that object from a given pose. Most systems simply discard the information that no object is being observed at a position at which an observation could be expected. Such non-observations could be used to increase uncertainty in corresponding object position estimates, which could in turn result in perceptual actions being taken to acquire updated information about the unobserved object.

The anchoring framework could also be extended to be more directly involved in resolving ambiguous situations. For instance, when two anchors exist which match a given definite object description, the anchoring framework could report in which domains the two anchors differ most, in order to help a user of the framework decide which information or perceptual actions might lead to a fast and reliable disambiguation of the situation.

The current framework assumes that many aspects are designed a priori. In particular, the domains and dimensions used in conceptual spaces are fixed and known. It would be interesting to incorporate some means of dynamically modifying, extending, and negotiating the used anchor spaces. In addition, it would be extremely interesting to examine the automatic creation of symbols and symbolic descriptions, based on saliency, in order to facilitate interaction with humans and coordination between multiple agents [144, 85, 72].

Finally, work is already under way which aims to help higher layers better exploit the perceptual information contained in anchors for high-level reasoning and communication [45]. In particular, the extraction of symbolic information from anchors can make it easier for cognitive layers to reason about the properties of observed objects. Such abstractions are particularly useful for facilitating human-robot interaction, and they can also be used as an efficient means of exchanging information between robots. Such abstractions also allow available information about objects to be increased, by taking ontologies of concepts and contextual information into account. For instance, knowledge about object types, and their uses and default properties, can be used to help artificial systems in assessing complex situations and solving tasks.

Although this thesis does not specifically address these issues, the conceptual space approach is well suited for integrating anchoring and high level concepts. In particular, a concept which represents a particular object type (e.g. a cup) corresponds to a region in a given conceptual space. A concept represented in this way can easily be compared with an anchor represented in the same conceptual space, providing a measure of how much a particular anchor corresponds to a particular object type. Also, concepts could easily be used as descriptions in the proposed framework.


A planner wishing to find a cup could use the conceptual space representation of this concept directly, as a positive description. The integration of the proposed framework and ongoing work which deals with conceptual knowledge is an important direction for future work.
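
Purely as an illustration of this idea, comparing a concept region with an anchor represented in the same discretised conceptual space could amount to measuring the overlap of their fuzzy regions; the measure below is just one of many possible choices.

    import numpy as np

    def concept_anchor_overlap(concept_region, anchor_region):
        # Both arguments are membership grids over the same discretised
        # conceptual space; the returned value is the height of their
        # intersection (1.0 = fully compatible, 0.0 = incompatible).
        return float(np.max(np.minimum(concept_region, anchor_region)))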

9.2.2 Implementation Improvements and Extensions

The presented implementation of the anchoring framework is currently packaged as a monitoring tool, which was mainly used for development and testing of the framework. This implementation could be extended and improved in a number of ways. One important improvement would involve a re-organisation of the software to enable more convenient deployment and optimisation of the core components of the framework.

Another improvement would be the incorporation of alternative representation tools; in particular, to improve or replace the grids which are currently used in the framework. Grids are good for visualising information, and they can be used efficiently in some cases; however, they do not scale well. The use of sample-based or parametric representations, for instance, could result in a significant reduction in computational costs, and a corresponding increase in scalability. The search algorithm used for data association might also be improved. A number of sophisticated data association methods exist in the literature which might be adapted to the multi-domain search problem addressed in this work.

An important extension of the implementation would be the inclusion of other domains, as well as corresponding conceptual sensor models for relevant information sources. Currently only position and colour are included in the implementation, but other domains, such as shape and texture, could improve the usefulness of the implementation significantly.

Finally, the approach to prediction in this work is extremely simple, and more complex prediction models could be applied. In particular, prediction in the position domain could take estimated velocity into consideration. Also, in long duration applications, the number of anchors considered will normally grow over time, and this growth has been shown to be computationally problematic. Methods for automatic anchor deletion or archiving could alleviate this problem.
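
The suggested velocity-based prediction could, for instance, be sketched as a simple constant-velocity extrapolation; this only illustrates the proposed extension and is not part of the implementation described in this thesis.

    def predict_position(x_mm, y_mm, vx_mm_s, vy_mm_s, dt_s):
        # Constant-velocity prediction of an object's position after dt_s
        # seconds.  A grid-based implementation would instead shift the whole
        # grid and blur it to reflect the growing uncertainty.
        return x_mm + vx_mm_s * dt_s, y_mm + vy_mm_s * dt_s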

9.3 Conclusions

The anchoring problem is broad, in that it involves a number of different components and layers in robotic architectures. This can make it hard to formalise, since different components often operate on completely different structures. Approaches to the problem can also be hard to evaluate, since the performance of any approach to such a general problem will inherently depend on the performance of several underlying and associated components. This makes it difficult to derive meaningful performance metrics.


This thesis has nonetheless proposed a formalisation of the anchoring and cooperative anchoring problems. This formalisation will certainly not please everyone. Those working with low-level perception and control might be uncomfortable with the lack of detail regarding the information abstraction performed by generic “information sources”. Those working with high-level processes may be frustrated by the superficial treatment of symbols and reasoning. Hopefully these limitations will be forgiven, if only because they allow the anchoring problem, which lies in between these areas, to be the focus of this work.

The proposed formalisation was used to develop a complete and novel anchoring framework, which addresses a number of important limitations of existing approaches. An implementation of the framework has been shown, through experimental validation, to be capable of addressing both the single-robot and cooperative anchoring problems. Nonetheless, there is much work yet to be done. A number of important and interesting directions for future work have been identified, and many others surely exist. Hopefully some of the ideas proposed in this work will be of use as these are explored.

References

[1] ActivMedia Robotics. Web site. http://www.activmedia.com. (Cited on page 153.)
[2] R. Adrian. Sensor management. In Procs of the AIAA/IEEE Digital Avionics Systems Conf, pages 32–37. IEEE Computer Society Press, 1993. (Cited on page 20.)
[3] R. Aragues, E. Montijano, and C. Sagues. Consistent data association in multi-robot systems with limited communications. In Robotics: Science and Systems Conference (RSS), 2010. Accepted for publication. (Cited on page 51.)
[4] T. Arai, E. Pagello, and L.E. Parker. Advances in multirobot systems. IEEE Trans on Robotics and Automation, 18(5):655–661, October 2002. (Cited on page 1.)
[5] F. Baader, D. Calvanese, D.L. McGuinness, P. Patel-Schneider, and D. Nardi. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2003. (Cited on page 26.)
[6] R. Bajcsy. Active perception. Procs of the IEEE, Special issue on Computer Vision, 76(8):996–1005, 1988. (Cited on page 20.)
[7] T. Balch, Z. Khan, and M. Veloso. Automatically tracking and analyzing the behavior of live insect colonies. In Procs of the Int Conf on Autonomous Agents, pages 521–528, 2001. (Cited on page 22.)
[8] Y. Bar-Shalom. Tracking and Data Association. Academic Press, San Diego, CA, USA, 1987. (Cited on pages 4, 20, and 21.)
[9] Y. Bar-Shalom, T. Kirubarajan, and C. Gokberk. Tracking with classification-aided multiframe data association. IEEE Trans on Aerospace and Electronic Systems, 41(3):868–878, 2005. (Cited on page 22.)


[10] Y. Bar-Shalom and X. R. Li. Multitarget Multisensor Tracking: Principles and Techniques. YBS Publishing, 1995. (Cited on pages 4 and 20.)
[11] Y. Bar-Shalom, X.R. Li, and T. Kirubarajan. Estimation with Applications to Tracking and Navigation. Wiley-Interscience, 2001. (Cited on page 20.)
[12] Y. Bar-Shalom and E. Tse. Tracking in a cluttered environment with probabilistic data association. Automatica, 11:451–460, 1975. (Cited on pages 4 and 21.)
[13] A. H. Barr. Superquadrics and angle-preserving transformations. IEEE Computer Graphics and Applications, 1:11–22, 1981. (Cited on page 41.)
[14] K. J. Barwise and J. Perry. Situations and Attitudes. MIT Press, Cambridge, USA, 1983. (Cited on page 11.)
[15] H. Bay, T. Tuytelaars, and L. Van Gool. SURF: Speeded Up Robust Features. In Procs of the European Conf on Computer Vision (ECCV), pages 404–417, 2006. (Cited on pages 195 and 200.)
[16] S. Benferhat and C. Sossai. Reasoning with multiple-source information in a possibilistic logic framework. Information Fusion, 7:80–96, 2006. (Cited on pages 25 and 75.)
[17] S.S. Blackman. Multiple hypothesis tracking for multiple target tracking. IEEE Aerospace and Electronic Systems Magazine, 19(1):5–18, 2004. (Cited on pages 21, 22, 32, and 209.)
[18] S. Blackmore. Consciousness: An Introduction. Hodder & Stoughton, Oxford, UK, 2003. (Cited on pages 4 and 20.)
[19] I. Bloch. Information combination operators for data fusion: A comparative review with classification. IEEE Trans on Systems, Man, and Cybernetics, A-26(1):52–67, 1996. (Cited on pages 25, 65, and 73.)
[20] I. Bloch and A. Hunter. Fusion: general concepts and characteristics. Int Journal of Intelligent Systems, 16(10):1107–1134, 2001. (Cited on page 23.)
[21] I. Bloch and H. Maître. Fuzzy mathematical morphologies: A comparative study. Pattern Recognition, 28(9):1341–1387, 1995. (Cited on pages 83, 91, 99, and 118.)
[22] H.A.P. Blom and E.A. Bloem. Probabilistic data association avoiding track coalescence. IEEE Trans on Automatic Control, 45(2):247–259, 2000. (Cited on page 22.)


[23] A. Bonarini, M. Matteucci, and M. Restelli. Anchoring: do we need new solutions to an old problem or do we have old solutions for a new problem? In Procs of the AAAI Fall Symposium on Anchoring Symbols to Sensor Data in Single and Multiple Robot Systems, pages 79–86, 2001. (Cited on pages 14, 15, 16, and 18.)
[24] A. Bonarini, M. Matteucci, and M. Restelli. Problems and solutions for anchoring in multi-robot applications. Journal of Intelligent and Fuzzy Systems, 18(3):245–254, 2007. (Cited on pages 14, 15, 18, 19, and 61.)
[25] A. Borissov and J. Janecek. A network robot system for object identification and localization. Master’s thesis, Örebro University, 2008. (Cited on pages 150, 195, 196, 197, 199, 200, 202, 204, and 209.)
[26] A. Borissov, J. Janecek, F. Pecora, and A. Saffiotti. Towards a network robot system for object identification and localization in RoboCup@Home. In Procs of the Workshop on Network Robot Systems (NRS) at IROS, Nice, France, 2008. (Cited on pages 150, 195, 199, and 209.)
[27] H. Boström, S.F. Andler, M. Brohede, R. Johansson, A. Karlsson, J. van Laere, L. Niklasson, M. Nilsson, A. Persson, and T. Ziemke. On the definition of information fusion as a field of research. Technical Report HS-IKI-TR-07-006, School of Humanities and Informatics, University of Skövde, Sweden, 2007. (Cited on page 23.)
[28] C. Breazeal. A motivational system for regulating human-robot interaction. In Procs of the National Conf on Artificial Intelligence, pages 54–61, 1998. (Cited on page 26.)
[29] J. E. Bresenham. Algorithm for computer control of a digital plotter. IBM Systems Journal, 4(1):25–30, 1965. (Cited on pages 98 and 100.)
[30] M. Broxvall, S. Coradeschi, L. Karlsson, and A. Saffiotti. Recovery planning for ambiguous cases in perceptual anchoring. In Procs of the AAAI Conf on Artificial Intelligence, pages 1254–1260, Pittsburgh, PA, 2005. AAAI Press. (Cited on pages 20, 55, and 110.)
[31] P. Buschka, A. Saffiotti, and Z. Wasik. Fuzzy landmark-based localization for a legged robot. In Procs of the IEEE Int Conf on Intelligent Robots and Systems (IROS), pages 1205–1210, Takamatsu, Japan, 2000. (Cited on pages 24, 72, 89, and 90.)
[32] J.-P. Cánovas, K. LeBlanc, and A. Saffiotti. Robust multi-robot object localization using fuzzy logic. In D. Nardi, M. Riedmiller, and C. Sammut, editors, RoboCup 2004: Robot Soccer World Cup VIII, LNCS, pages 247–261. Springer, 2005. (Cited on page 25.)


[33] Y.U. Cao, A.S. Fukunaga, and A.B. Kahng. Cooperative mobile robotics: Antecedents and directions. Autonomous Robots, 4(1):7–27, March 1997. (Cited on page 1.)
[34] S. Chawathe, H. Garcia-Molina, J. Hammer, K. Ireland, Y. Papakonstantinou, J. Ullman, and J. Widom. The TSIMMIS project: Integration of heterogeneous information sources. In Procs of the IPSJ Conf, pages 7–18, 1994. (Cited on page 26.)
[35] A. Chella, S. Coradeschi, M. Frixione, and A. Saffiotti. Perceptual anchoring via conceptual spaces. In Procs of the AAAI Workshop on Anchoring Symbols to Sensor Data, San Jose, CA, 2004. (Cited on pages 13, 16, and 19.)
[36] A. Chella, M. Frixione, and S. Gaglio. Conceptual spaces for computer vision representations. Artificial Intelligence Review, 16:137–152, 2001. (Cited on page 41.)
[37] S. Coradeschi and A. Saffiotti. Anchoring symbols to vision data by fuzzy logic. In A. Hunter and S. Parsons, editors, Symbolic and Quantitative Approaches to Reasoning and Uncertainty: Procs of the ECSQARU Conf, number 1638 in LNCS, pages 104–115. Springer-Verlag, 1999. (Cited on page 74.)
[38] S. Coradeschi and A. Saffiotti. Anchoring symbols to sensor data: preliminary report. In Procs of the AAAI Conf on Artificial Intelligence, pages 129–135, Austin, TX, 2000. (Cited on pages 1, 11, 12, 13, 15, 16, 45, and 82.)
[39] S. Coradeschi and A. Saffiotti, editors. Anchoring Symbols to Sensor Data in Single and Multiple Robot Systems: Papers from the AAAI Fall Symposium. AAAI Press, Menlo Park, California, USA, 2001. (Cited on page 11.)
[40] S. Coradeschi and A. Saffiotti. An introduction to the anchoring problem. Robotics and Autonomous Systems, 43(2-3):85–96, 2003. (Cited on page 12.)
[41] S. Coradeschi and A. Saffiotti, editors. Special issue on Perceptual Anchoring, Robotics and Autonomous Systems, volume 43 (2-3). Elsevier Science, 2003. (Cited on page 11.)
[42] S. Coradeschi and A. Saffiotti, editors. Anchoring Symbols to Sensor Data: Papers from the AAAI Workshop, Technical Report WS-04-03. AAAI Press, Menlo Park, California, USA, 2004. (Cited on page 11.)
[43] G.F. Coulouris, J. Dollimore, and T. Kindberg. Distributed Systems: Concepts and Design. Addison Wesley, 3rd edition, 2001. (Cited on page 51.)


[44] J.L. Crowley, P. Stelmaszyk, and C. Discours. Measuring image flow by tracking edge-lines. In Int Conf on Computer Vision, pages 658–664, 1988. (Cited on page 22.)
[45] M. Daoutis, S. Coradeschi, and A. Loutfi. Grounding commonsense knowledge in intelligent systems. Int Journal on Ambient Intelligence and Smart Environments, 1(4):311–321, 2009. (Cited on pages 13, 14, 16, 26, 164, and 210.)
[46] S. Deb, K.R. Pattipati, and Y. Bar-Shalom. A multisensor-multitarget data association algorithm for heterogeneous sensors. IEEE Trans on Aerospace and Electronic Systems, 29(2):560–568, 1993. (Cited on page 22.)
[47] S. Deb, M. Yeddanapudi, K. Pattipati, and Y. Bar-Shalom. A generalized S-D assignment algorithm for multisensor-multitarget state estimation. IEEE Trans on Aerospace and Electronic Systems, 33(2):523–538, 1997. (Cited on page 22.)
[48] K. Demirli and M. Molhim. Fuzzy dynamic localization for mobile robots. Fuzzy Sets and Systems, 144:251–283, 2004. (Cited on page 24.)
[49] R. Deriche and O. Faugeras. Tracking line segments. In European Conf on Computer Vision, pages 259–268, 1990. (Cited on page 22.)
[50] D. Dey, S. Sarkar, and P. De. A probabilistic decision model for entity matching in heterogeneous databases. Management Science, 44(10):1379–1395, 1998. (Cited on page 26.)
[51] M. Dietl, J.-S. Gutmann, and B. Nebel. Cooperative sensing in dynamic environments. In Procs of the IEEE Int Conf on Intelligent Robots and Systems (IROS), pages 1706–1713, 2001. (Cited on pages 22 and 25.)
[52] D. Dubois, J. Lang, and H. Prade. Possibilistic logic. In D. Gabbay, C. J. Hogger, and J. A. Robinson, editors, Handbook of Logic in Artificial Intelligence and Logic Programming, volume 3, pages 439–513. Clarendon Press, 1994. (Cited on page 66.)
[53] H. F. Durrant-Whyte. Integration, Coordination, and Control of Multi-Sensor Robot Systems. Kluwer Academic Publishers, 1988. (Cited on page 25.)
[54] H. F. Durrant-Whyte. A beginners guide to decentralised data fusion. Technical report, Australian Centre for Field Robotics, The University of Sydney, NSW, 2006. (Cited on page 25.)


[55] A. Farinelli, L. Iocchi, and D. Nardi. Multi-robot systems: a classification focused on coordination. IEEE Trans on Systems, Man, and Cybernetics, 34:2015–2028, 2004. (Cited on page 1.)
[56] M. Fernandez and H. F. Durrant-Whyte. A failure detection and isolation algorithm for a decentralised multisensor system. In Procs of the IEEE Int Conf on Multisensor Fusion and Integration for Intelligent Systems, pages 27–33, Las Vegas, USA, 1994. (Cited on page 25.)
[57] A. Ferrein, L. Hermanns, and G. Lakemeyer. Comparing sensor fusion techniques for ball position estimation. In A. Bredenfeld, A. Jacoff, I. Noda, and Y. Takahashi, editors, RoboCup 2005: Robot Soccer World Cup IX, LNCS, pages 154–165. Springer, 2006. (Cited on pages 25 and 26.)
[58] R.J. Fitzgerald. Track biases and coalescence with probabilistic data association. IEEE Trans on Aerospace and Electronic Systems, 21(6):822–825, 1985. (Cited on page 22.)
[59] T. Fong, C. Thorpe, and C. Baur. Collaboration, dialogue, human-robot interaction. Robotics Research, 6:255–266, 2003. (Cited on page 26.)
[60] T. Fortmann, Y. Bar-Shalom, and M. Scheffe. Sonar tracking of multiple targets using joint probabilistic data association. IEEE Journal of Oceanic Engineering, 8(3):173–184, 1983. (Cited on pages 4, 21, and 22.)
[61] D. Fox, W. Burgard, and S. Thrun. Markov localization for mobile robots in dynamic environments. Journal of Artificial Intelligence Research, 11:391–427, 1999. (Cited on pages 24 and 25.)
[62] D. Fox, J. Hightower, L. Liao, D. Schulz, and G. Borriello. Bayesian filtering for location estimation. IEEE Pervasive Computing, 2(3):24–33, 2003. (Cited on page 24.)
[63] G. Frege. Über Sinn und Bedeutung (On sense and meaning). Zeitschrift für Philosophie und Philosophische Kritik, 100:25–50, 1892. (Cited on page 11.)
[64] J. Fritsch, M. Kleinehagenbrock, S. Lang, F. Loemker, G. A. Fink, and G. Sagerer. Multi-modal anchoring for human-robot-interaction. Robotics and Autonomous Systems, 43(2):133–147, 2003. (Cited on page 15.)
[65] E. Garcia, M.A. Jimenez, P.G. De Santos, and M. Armada. The evolution of robotics research. IEEE Robotics and Automation Magazine, 14(1):90–103, March 2007. (Cited on page 1.)


[66] P. Gärdenfors. Conceptual Spaces: The Geometry of Thought. MIT Press, Cambridge, MA, USA, 2000. (Cited on pages 13, 19, 26, and 41.)
[67] A. Gelb. Applied Optimal Estimation. The MIT Press, 1989. (Cited on pages 20 and 24.)
[68] A. Genovesio and J.C. Olivo-Marin. Split and merge data association filter for dense multi-target tracking. In Procs of the Int Conf on Pattern Recognition (ICPR), volume 4, pages 677–680, 2004. (Cited on page 21.)
[69] B. Gerkey, R. T. Vaughan, and A. Howard. The Player/Stage project: Tools for multi-robot and distributed sensor systems. In Procs of the Int Conf on Advanced Robotics, pages 317–323, Coimbra, Portugal, June 2003. (Cited on pages 94, 152, 153, and 155.)
[70] D. Göhring. Cooperative object localization using line-based percept communication. In U. Visser, F. Ribeiro, T. Ohashi, and F. Dellaert, editors, RoboCup 2007: Robot Soccer World Cup XI, LNCS, pages 53–64. Springer, 2008. (Cited on page 24.)
[71] D. Göhring and H.-D. Burkhard. Cooperative world modeling in dynamic multi-robot environments. Fundamenta Informaticae, 75(1–4):281–294, 2007. (Cited on page 24.)
[72] C.V. Goldman, M. Allen, and S. Zilberstein. Learning to communicate in a decentralized environment. Autonomous Agents and Multi-Agent Systems, 15(1):47–90, 2007. (Cited on pages 19, 39, 61, and 210.)
[73] N. J. Gordon, D. J. Salmond, and A. F. M. Smith. A novel approach to non-linear and non-Gaussian Bayesian state estimation. In IEE Procs-F, volume 140, pages 107–133, 1993. (Cited on page 20.)
[74] S. Guirnaldo, K. Watanabe, and K. Izumi. Enhancing the awareness of decentralized cooperative mobile robots through active perceptual anchoring. Int Journal of Control, Automation and Systems, 2:450–462, 2004. (Cited on pages 1, 20, and 23.)
[75] J.-S. Gutmann. Markov-Kalman localization for mobile robots. In Procs of the Int Conf on Pattern Recognition (ICPR), pages 601–604, 2002. (Cited on page 25.)
[76] D. L. Hall and J. Llinas. An introduction to multisensor data fusion. Procs of the IEEE, 85(1):6–23, 1997. (Cited on pages 4 and 23.)
[77] R. Hanek, T. Schmitt, M. Klupsch, and S. Buck. From multiple images to a consistent view. In P. Stone, T. Balch, and G. Kraetzschmar, editors, RoboCup 2000: Robot Soccer World Cup IV, LNCS, pages 169–178. Springer, 2001. (Cited on page 25.)


[78] S. Harnad. The symbol grounding problem. Physica D, 42:335–346, 1990. (Cited on pages 20 and 57.)
[79] F. Heintz, J. Kvarnström, and P. Doherty. Bridging the sense-reasoning gap: DyKnow – stream-based middleware for knowledge processing. Advanced Engineering Informatics, 24(1):14–26, 2010. (Cited on page 15.)
[80] F. Herrera, L. Martinez, and P.J. Sánchez. Managing non-homogeneous information in group decision making. European Journal of Operational Research, 166(1):115–132, 2005. (Cited on page 26.)
[81] D. Herrero-Pérez, H. Martínez-Barberá, K. LeBlanc, and A. Saffiotti. Fuzzy uncertainty modeling for grid based localization of mobile robots. Int Journal of Approximate Reasoning, 51(8):912–932, October 2010. (Cited on pages 24, 89, 90, and 92.)
[82] D. Herrero-Pérez, H. Martínez-Barberá, and A. Saffiotti. Fuzzy self-localization using natural features in the four-legged league. In D. Nardi, M. Riedmiller, and C. Sammut, editors, RoboCup 2004: Robot Soccer World Cup VIII, LNCS. Springer, 2005. (Cited on pages 24, 89, 90, and 92.)
[83] G. P. Huang, N. Trawny, A. I. Mourikis, and S. I. Roumeliotis. On the consistency of multi-robot cooperative localization. In Procs of Robotics: Science and Systems, Seattle, USA, June 2009. (Cited on page 25.)
[84] A. H. Jazwinski. Stochastic Processes and Filtering Theory. Academic Press, 1973. (Cited on page 24.)
[85] P.W. Jordan and M. Walker. Learning content selection rules for generating object descriptions in dialogue. Journal of Artificial Intelligence Research, 24(1):157–194, 2005. (Cited on page 210.)
[86] S. Julier and J. Uhlmann. A new extension of the Kalman filter to nonlinear systems. In Procs of the Int Symp on Aerospace/Defense Sensing, Simulation and Controls, pages 182–193, 1997. (Cited on page 24.)
[87] R. E. Kalman. A new approach to linear filtering and prediction problems. Trans of the ASME, Journal of Basic Engineering, 82(Series D):35–45, 1960. (Cited on page 24.)
[88] L. Karlsson, A. Bouguerra, M. Broxvall, S. Coradeschi, and A. Saffiotti. To secure an anchor – a recovery planning approach to ambiguity in perceptual anchoring. AI Communications, 21(1):1–14, 2008. (Cited on pages 20, 55, and 110.)


[89] G. J. Klir and T. A. Folger. Fuzzy Sets, Uncertainty, and Information. Prentice-Hall, 1988. (Cited on pages 65, 73, and 74.)
[90] S. A. Kripke. Naming and Necessity. Harvard University Press, Blackwell, 1980. (Cited on page 56.)
[91] J. Larsson. PEIS home simulator. Master’s thesis, Örebro University, 2007. (Cited on page 152.)
[92] K. LeBlanc and A. Saffiotti. Issues of perceptual anchoring in ubiquitous robotic systems. In Procs of the ICRA-07 Workshop on Omniscient Space, Rome, Italy, 2007. (Cited on page 16.)
[93] K. LeBlanc and A. Saffiotti. Cooperative anchoring in heterogeneous multi-robot systems. In Procs of the IEEE Int Conf on Robotics and Automation (ICRA), Pasadena, CA, USA, 2008. (Cited on page 16.)
[94] K. LeBlanc and A. Saffiotti. Multirobot object localization: A fuzzy fusion approach. IEEE Trans on Systems, Man and Cybernetics B, 39(5):1259–1276, 2009. (Cited on page 25.)
[95] D.B. Lenat. CYC: A large-scale investment in knowledge infrastructure. Communications of the ACM, 38(11):33–38, 1995. (Cited on pages 13 and 26.)
[96] D.B. Lenat, R.V. Guha, K. Pittman, D. Pratt, and M. Shepherd. CYC: toward programs with common sense. Communications of the ACM, 33(8):30–49, 1990. (Cited on page 13.)
[97] D.B. Lenat, M. Prakash, and M. Shepherd. CYC: Using common sense knowledge to overcome brittleness and knowledge acquisition bottlenecks. AI Magazine, 6(4):65, 1985. (Cited on page 13.)
[98] A. Levy, A. Rajaraman, and J. Ordille. Querying heterogeneous information sources using source descriptions. In Procs of the Int Conf on Very Large Data Bases, pages 251–262, 1996. (Cited on page 26.)
[99] F. Li-Wei. Distributed data fusion algorithms for tracking a maneuvering target. In 10th Int Conf on Information Fusion, pages 1–8, Quebec, Canada, 2007. (Cited on page 25.)
[100] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White. Revisiting the JDL data fusion model II. In Procs of the Int Conf on Information Fusion, volume 2, pages 1218–1230, 2004. (Cited on pages 4 and 23.)


[101] L.S. Lopes. Carl: from situated activity to language level interaction and learning. In Procs of the IEEE Int Conf on Intelligent Robots and Systems, pages 890–896, 2002. (Cited on page 15.)
[102] A. Loutfi and S. Coradeschi. A review of past and future trends in perceptual anchoring. In P. Fritzche, editor, Tools in Artificial Intelligence, chapter 15. I-Tech Education and Publishing, 2008. (Cited on page 11.)
[103] A. Loutfi, S. Coradeschi, M. Daoutis, and J. Melchert. Using knowledge representation for perceptual anchoring in a robotic system. Int Journal on Artificial Intelligence Tools, 17(5):925–944, 2008. (Cited on page 13.)
[104] A. Loutfi, S. Coradeschi, and A. Saffiotti. Maintaining coherent perceptual information using anchoring. In Procs of the Int Joint Conf on Artificial Intelligence (IJCAI), Edinburgh, UK, 2005. (Cited on pages 12, 18, and 47.)
[105] D. Lowe. Object recognition from local scale-invariant features. In Procs of the IEEE Int Conf on Computer Vision (ICCV), September 1999. (Cited on page 200.)
[106] D. Lowe. Distinctive image features from scale-invariant keypoints. Int Journal of Computer Vision, 60(2):91–110, 2004. (Cited on page 200.)
[107] R. Lundh. Robots That Help Each Other: Self-Configuration of Distributed Robot Systems. PhD thesis, Örebro University, Örebro, Sweden, 2009. (Cited on page 171.)
[108] F. Mastrogiovanni, A. Sgorbissa, and R. Zaccaria. A distributed architecture for symbolic data fusion. In Procs of the Int Joint Conf on Artificial Intelligence, pages 2153–2158, Hyderabad, India, 2007. (Cited on pages 16 and 26.)
[109] P. S. Maybeck. Stochastic Models, Estimation, and Control, volume 141 of Mathematics in Science and Engineering. Academic Press, 1979. (Cited on pages 20 and 24.)
[110] E. Mazor, A. Averbuch, Y. Bar-Shalom, and J. Dayan. Interacting multiple model methods in target tracking: a survey. IEEE Trans on Aerospace and Electronic Systems, 34(1):103–123, January 1998. (Cited on page 25.)
[111] J. McCarthy. Programs with common sense. In Procs of the Teddington Conf on the Mechanization of Thought Processes, 1959. (Cited on page 13.)


[112] J. McCarthy. Formalization of Common Sense: Papers by John McCarthy, edited by V. Lifschitz. Ablex, 1990. (Cited on page 13.)
[113] N. Megherbi, S. Ambellouis, O. Colôt, and F. Cabestaing. Multimodal data association based on the use of belief functions for multiple target tracking. In Int Conf on Information Fusion, volume 2, pages 900–906, 2005. (Cited on pages 21 and 23.)
[114] J. Melchert, S. Coradeschi, and A. Loutfi. Knowledge representation and reasoning for perceptual anchoring. In Procs of the IEEE Int Conf on Tools with Artificial Intelligence (ICTAI), Patras, Greece, 2007. (Cited on page 13.)
[115] J. Melchert, S. Coradeschi, and A. Loutfi. Spatial relations for perceptual anchoring. In Procs of the AISB Annual Convention, 2007. (Cited on pages 14 and 26.)
[116] J.S. Mill. A System of Logic: Ratiocinative and Inductive, Being a Connected View of the Principles of Evidence and the Methods of Scientific Investigation. Longman, London, UK, 1st edition, 1843. (Cited on page 56.)
[117] J. Modayil and B. Kuipers. Autonomous development of a grounded object ontology by a learning robot. In Procs of the National Conf on Artificial Intelligence, pages 1095–1101, 2007. (Cited on page 15.)
[118] R. Murphy. Human-robot interaction in rescue robotics. IEEE Trans on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 34(2):138–153, 2004. (Cited on page 26.)
[119] F. Naumann, U. Leser, and J.C. Freytag. Quality-driven integration of heterogeneous information systems. In Procs of the Int Conf on Very Large Data Bases, pages 447–458, 1999. (Cited on page 26.)
[120] Network Robot Forum. Web site. www.scat.or.jp/nrf/English/. (Cited on page 150.)
[121] W. Ng, J. Li, S. Godsill, and J. Vermaak. A hybrid approach for online joint detection and tracking for multiple targets. In IEEE Aerospace Conference, pages 2126–2141, 2005. (Cited on page 21.)
[122] W. Nistico, M. Hebbel, T. Kerkhof, and C. Zarges. Cooperative visual tracking in a team of autonomous mobile robots. In G. Lakemeyer, E. Sklar, D. G. Sorrenti, and T. Takahashi, editors, RoboCup 2006: Robot Soccer World Cup X, LNCS, pages 146–157. Springer, 2007. (Cited on page 25.)


[123] P. Pinheiro and P. Lima. Bayesian sensor fusion for cooperative object localization and world modeling. In Procs of the Conf on Intelligent Autonomous Systems, Amsterdam, The Netherlands, 2004. (Cited on page 25.)
[124] Player/Stage/Gazebo. Web site. http://playerstage.sourceforge.net. (Cited on pages 94, 152, 153, and 155.)
[125] Polhemus tracking systems. Web site. http://www.polhemus.com. (Cited on page 132.)
[126] R.L. Popp, K.R. Pattipati, and Y. Bar-Shalom. M-best S-D assignment algorithm with application to multitarget tracking. IEEE Trans on Aerospace and Electronic Systems, 37(1):22–39, 2001. (Cited on page 22.)
[127] D. Quass, A. Rajaraman, J. Ullman, J. Widom, and Y. Sagiv. Querying semistructured heterogeneous information. Journal of Systems Integration, 7(3):381–407, 1997. (Cited on page 26.)
[128] C. Rasmussen and G.D. Hager. Probabilistic data association methods for tracking complex visual objects. IEEE Trans on Pattern Analysis and Machine Intelligence, 23(6):560–576, 2001. (Cited on page 22.)
[129] D.B. Reid. An algorithm for tracking multiple targets. IEEE Trans on Automatic Control, 24(6):843–854, 1979. (Cited on pages 4, 21, 22, 25, and 209.)
[130] RoboCup@Home. Web site. http://www.robocupathome.org. (Cited on pages 150 and 195.)
[131] J.A. Roecker. Multiple scan joint probabilistic data association. IEEE Trans on Aerospace and Electronic Systems, 31(3):1204–1210, 1995. (Cited on page 22.)
[132] L.R.M. Johansson and N. Xiong. Perception management: an emerging concept for information fusion. Information Fusion, 4(3):231–234, 2003. (Cited on page 20.)
[133] D. Roy. Semiotic schemas: A framework for grounding language in action and perception. Artificial Intelligence, 167(1-2):170–205, 2005. (Cited on page 15.)
[134] B. Russell. On denoting. Mind, 56:479–493, 1905. (Cited on pages 11 and 56.)
[135] A. Saffiotti. Pick-up what? In C. Bäckström and E. Sandewall, editors, Current Trends in AI Planning, pages 266–277. IOS Press, Amsterdam, NL, 1994. (Cited on pages 1, 11, and 14.)


[136] A. Saffiotti. The uses of fuzzy logic in autonomous robot navigation. Soft Computing, 1(4):180–197, 1997. (Cited on pages 65 and 66.)
[137] A. Saffiotti, A. Björklund, S. Johansson, and Z. Wasik. Team Sweden. In RoboCup 2001: Robot Soccer World Cup V, LNCS. Springer, 2002. (Cited on page 133.)
[138] A. Saffiotti and M. Broxvall. PEIS ecologies: Ambient intelligence meets autonomous robotics. In Procs of the Int Conf on Smart Objects and Ambient Intelligence, pages 275–280, Grenoble, France, 2005. (Cited on pages 18, 150, and 151.)
[139] A. Saffiotti, M. Broxvall, M. Gritti, K. LeBlanc, R. Lundh, J. Rashid, B. S. Seo, and Y. J. Cho. The PEIS-Ecology project: vision and results. In Procs of the IEEE Int Conf on Intelligent Robots and Systems (IROS), pages 2329–2335, Nice, France, 2008. (Cited on pages 1, 5, 18, 26, 150, and 151.)
[140] A. Saffiotti and K. LeBlanc. Active perceptual anchoring of robot behavior in a dynamic environment. In Procs of the IEEE Int Conf on Robotics and Automation (ICRA), pages 3796–3802, San Francisco, CA, 2000. (Cited on pages 20, 118, 131, and 133.)
[141] A. Sanfeliu, N. Hagita, and A. Saffiotti. Network robot systems. Robotics and Autonomous Systems, 56(10):793–797, 2008. (Cited on pages 1, 5, 18, 26, and 150.)
[142] T. Schmitt, R. Hanek, M. Beetz, S. Buck, and B. Radig. Cooperative probabilistic state estimation for vision-based autonomous mobile robots. IEEE Trans on Robotics and Automation, 18(5):670–684, 2002. (Cited on page 25.)
[143] S.C. Shapiro and H.O. Ismail. Anchoring in a grounded layered architecture with integrated reasoning. Robotics and Autonomous Systems, 43(2-3):97–108, 2003. (Cited on page 15.)
[144] L. Steels. The Talking Heads Experiment, volume 1: Words and Meanings. Laboratorium, Antwerpen, 1999. (Cited on pages 42, 164, and 210.)
[145] A.N. Steinberg, F.E. White, and C.L. Bowman. Revisions to the JDL data fusion model. In Sensor Fusion: Architectures, Algorithms, and Applications, Procs of the SPIE, volume 3719, 1999. (Cited on pages 4 and 23.)
[146] A. Stroupe, C. Martin, and T. Balch. Merging Gaussian distributions for object localization in multi-robot systems. In Procs of the Int Symp on Experimental Robotics (ISER), 2000. (Cited on pages 25 and 26.)


[147] S. Stubberud and K. Kramer. Data association for multiple sensor types using fuzzy logic. IEEE Trans on Instrumentation and Measurement, 55(6):2292–2303, 2006. (Cited on page 23.)
[148] The RoboCup Federation. Web site. http://www.robocup.org. (Cited on pages 130 and 146.)
[149] S. Thrun. Robotic mapping: A survey. In G. Lakemeyer and B. Nebel, editors, Exploring Artificial Intelligence in the New Millennium. Morgan Kaufmann, 2002. (Cited on page 24.)
[150] S. Thrun, W. Burgard, and D. Fox. Probabilistic Robotics. MIT Press, 2005. (Cited on pages 24, 25, 89, and 94.)
[151] S. Thrun, D. Fox, W. Burgard, and F. Dellaert. Robust Monte Carlo localization for mobile robots. Artificial Intelligence, 128(1–2):99–141, 2000. (Cited on pages 24, 89, and 94.)
[152] C. von der Malsburg. The correlation theory of brain function. Technical Report 81-2, MPI Biophysical Chemistry, Göttingen, Germany, 1981. Reprinted in E. Domany, J. L. van Hemmen, and K. Schulten (Eds.), Models of Neural Networks II, chapter 2 (pp. 95–119), Berlin, Springer (1994). (Cited on pages 4 and 20.)
[153] L. Wald. Some terms of reference in data fusion. IEEE Trans on Geoscience and Remote Sensing, 37(3):1190–1193, 1999. (Cited on page 23.)
[154] E. Waltz and J. Llinas. Multisensor Data Fusion. Artech House, Boston, London, UK, 1990. (Cited on page 4.)
[155] Z. Wasik and A. Saffiotti. Robust color segmentation for the RoboCup domain. In Procs of the Int Conf on Pattern Recognition (ICPR), Quebec City, Canada, 2002. (Cited on pages 133 and 152.)
[156] S. Weber. A general concept of fuzzy connectives, negations and implications based on t-norms and t-conorms. Fuzzy Sets and Systems, 11:115–134, 1983. (Cited on page 73.)
[157] Z. Weihong. A probabilistic approach to tracking moving targets with distributed sensors. IEEE Trans on Systems, Man, and Cybernetics, 37(5):721–731, 2007. (Cited on page 25.)
[158] F.E. White. A model for data fusion. In Procs of the National Symposium on Sensor Fusion, 1988. (Cited on pages 4 and 23.)
[159] Wikipedia. AIBO robot. Web site. http://en.wikipedia.org/wiki/AIBO. (Cited on page 130.)


[160] C. Yu and D.H. Ballard. On the integration of grounding language and learning objects. In Procs of the National Conf on Artificial Intelligence, pages 488–494, 2004. (Cited on page 15.)
[161] L. A. Zadeh. Fuzzy sets. Information and Control, 8:338–353, 1965. (Cited on page 65.)
[162] L. A. Zadeh. Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems, 1:3–28, 1978. (Cited on page 66.)
[163] M.A. Zaveri, S.N. Merchant, and U.B. Desai. Robust neural-network-based data association and multiple model-based tracking of multiple point targets. IEEE Trans on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 37(3):337–351, 2007. (Cited on page 23.)