Vision-based Mobile Robot Navigation: A Qualitative Approach

Gordon Cheng
Bachelor of Computer Science (Wollongong)
Master of Computer Science (Wollongong)
A thesis submitted for the degree of Doctor of Philosophy of The Australian National University
Department of Systems Engineering Research School of Information Sciences and Engineering The Australian National University
© 2000 by Gordon Cheng. All rights reserved.
Statement of Originality These doctoral studies were conducted under the supervision of Dr. Alexander Zelinsky. The work submitted in this thesis is the result of original research carried out by myself, in collaboration with others, while enrolled as a Ph.D. student in the Department of Systems Engineering, The Australian National University. It has not been submitted for any other degree or award in any other university or educational institution.
Gordon Cheng
Abstract

The goal of mobile robotics research has been to establish effective ways for a mobile robot to navigate from one place to another. This thesis documents an experimental investigation into the use of vision for the navigation of mobile robots. The research presented in this thesis was directed at achieving qualitative navigation in an unmodified, unknown and dynamic environment.

This thesis describes the research involved in the development of a number of key vision primitives: a set of high-performance real-time vision processes for the support of navigation, and a suite of visual behaviours which fully utilise those vision processes. From the vision primitives we are able to construct a set of basic visual behaviours: Collision avoidance, Obstacle avoidance and Goal seeking. These behaviours allow a mobile robot system to function in a dynamically changing environment. Building on this foundation, a range of abilities was realised: Wandering, Target tracking, Target pursuit and Foraging. Demonstrating the flexibility and diversity of our approach, a number of applications were realised: Landmark-based navigation, a Soccer playing robot, and a Vacuum cleaning robot.

The ability to resolve conflicting situations for an autonomous mobile robot is also addressed in this thesis. Addressing such problems enables a mobile robot to autonomously achieve goal-oriented navigation. Experimental results and an analytical study of our approach are also presented as part of this thesis.

In the final part of this thesis we present a paradigm for robot control, Supervised Autonomy. Supervised Autonomy is a framework which facilitates the development of human-robot systems. The components of this framework have been devised in a human-oriented manner, to assist users in accomplishing their tasks. Experimental results of applying this framework to a mobile robot teleoperation system are presented. The system we have developed makes extensive use of the vision processing components developed earlier, producing an intuitive human-robot system.
List of publications

Book Chapters

• G. Cheng and A. Zelinsky. Real-Time Vision Processing for a Soccer Playing Mobile Robot, pages 135–146. Volume 1 of RoboCup-97: The First Robot World Cup Soccer Games and Conferences, Springer Verlag. An extended version of [Cheng and Zelinsky, 1997b].
Journal Papers

• A. Zelinsky and G. Cheng. A Behaviour-based Vision-guided Mobile Robot. IEEE Transactions on Robotics and Automation. Submitted.

• G. Cheng and A. Zelinsky. Supervised Autonomy: A Framework for Human-Robot Systems Development. Special Issue of Autonomous Robots. To be published.
Conference Proceedings

• G. Cheng and A. Zelinsky. Supervised Autonomy: A Framework for Human-Robot Systems Development. IEEE International Conference on Systems, Man and Cybernetics, volume VI, pages 971–975, October 12–15 1999. (Invited paper.) An extended version of this paper is to be published in a special issue of Autonomous Robots.

• G. Cheng and A. Zelinsky. Goal-oriented Behaviour-based Visual Navigation. IEEE International Conference on Robotics and Automation, pages 3431–3436, Leuven, Belgium, May 16–21 1998. (Nominated for best paper.)

• D. Jung, G. Cheng, and A. Zelinsky. Robot Cleaning: An Application of Distributed Planning and Real-Time Vision. In Proceedings of the International Conference on Field and Service Robotics (FSR’97), Canberra, Australia, 1997.
• G. Cheng and A. Zelinsky. Supervised Autonomy: A Paradigm for Teleoperating Mobile Robots. In Proceedings of the 1997 Intelligent Robots and Systems Conference (IROS’97), volume 2, pages 1169–1176, Grenoble, France, September 7–11 1997.

• D. Jung, G. Cheng, and A. Zelinsky. An Experiment in Realising Cooperative Behaviours between Autonomous Mobile Robots. In Proceedings of the Fifth International Symposium on Experimental Robotics (ISER’97), Barcelona, Spain, 1997.

• G. Cheng and A. Zelinsky. Real-Time Visual Behaviours for Navigating a Mobile Robot. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’96), pages 973–980, 1996.

• G. Cheng and A. Zelinsky. A Physically Grounded Search in a Behaviour Based Robot. In Proceedings of the Eighth Australian Joint Conference on Artificial Intelligence, pages 547–554, 1995. (Nominated for best paper.)
Workshop presentations

• G. Cheng and A. Zelinsky. A Reactive Vision-based Navigation System for a Mobile Robot. In Proceedings of the 1997 Workshop for Reactive Systems, University of New South Wales, 1997.

• G. Cheng and A. Zelinsky. Real-Time Vision Processing for Autonomous Mobile Robots. In Proceedings of the 1996 Pan-Sydney Workshop for Visual Information Processing, 1996.

• A. Zelinsky and G. Cheng. Incorporating Real-Time Vision into Robotic Tasks. In Proceedings of the Australian Academy of Technological Sciences and Engineering Workshop on Manufacturing Technology, pages 61–74, November 1995.
Contents

1 Overview
  1.1 Principal objectives
    1.1.1 Autonomous Navigation
    1.1.2 Human-Robot Interaction
  1.2 Contributions
  1.3 Outline
    1.3.1 Background and Review
    1.3.2 System Infrastructure
    1.3.3 Vision-based Behaviours for Mobile Robot Navigation
    1.3.4 Autonomous navigation
    1.3.5 Human-robot interaction
    1.3.6 Conclusion

2 Background
  2.1 Robot control architectures
    2.1.1 Planner-based
    2.1.2 Purely reactive
    2.1.3 Hybrid Architectures
    2.1.4 Behaviour-based Architecture
  2.2 Robot Navigation
    2.2.1 Localisation
    2.2.2 Robot Motion
  2.3 Vision-based Navigation
    2.3.1 SHAKEY
    2.3.2 Stanford Cart
    2.3.3 The INRIA mobile robot
    2.3.4 Polly
    2.3.5 Pebbles
    2.3.6 The Hyper Scooter
    2.3.7 Hyper Mouse
    2.3.8 RJ
  2.4 Summary

3 Robot System
  3.1 Yamabico Mobile Robot
    3.1.1 Communication server
  3.2 Vision processor
    3.2.1 Video module
    3.2.2 Tracking module
  3.3 Summary

4 Robot Vision
  4.1 Vision processing framework
  4.2 Template matching
  4.3 Searching for free space
  4.4 Compensating for camera lens distortion
  4.5 Calculating the Intensity of Lighting
  4.6 Lighting Adaptation
    4.6.1 Lighting Adaptation — without accounting for texture
    4.6.2 Lighting Adaptation — accounting for texture
    4.6.3 Implementation
    4.6.4 A Comparison
  4.7 The Interest-Operator
  4.8 Recognition/Reassurance
  4.9 Transforming Image Coordinates to World Coordinates
  4.10 Blind spot
  4.11 Passable space
  4.12 Setting the robot’s velocity
  4.13 Speed of processing and Noise filtering
  4.14 Summary

5 Robot Visual Behaviours
  5.1 System Architecture
    5.1.1 Action selection
    5.1.2 Behaviour connectivity
    5.1.3 Actuation shell
  5.2 Basic Behaviours
    5.2.1 Collision Avoidance
    5.2.2 Free Space Move
    5.2.3 Goal Seeking
  5.3 Combined Behaviours
    5.3.1 Wandering
    5.3.2 Target Pursuit
    5.3.3 Foraging
  5.4 Landmark-based Navigation
  5.5 Robot Vacuum cleaning
  5.6 Soccer playing skills
    5.6.1 Ball Detector
    5.6.2 Got Ball
    5.6.3 Goal Detector
    5.6.4 Goto Ball
    5.6.5 Dribbling the Ball
    5.6.6 Kick Ball
    5.6.7 Toward Goal
    5.6.8 Soccer playing
  5.7 Summary

6 Adaptation
  6.1 Related Work
    6.1.1 Visual servoing
  6.2 Navigation algorithm and Analysis
    6.2.1 The Environment
    6.2.2 Basic Navigation Algorithm
    6.2.3 An improved navigation algorithm
    6.2.4 Testing for goal reachability
  6.3 Implementation
    6.3.1 Behaviour conflict management
    6.3.2 Landmark Signature
    6.3.3 Landmark matching
    6.3.4 Contour Following
  6.4 Experimental Results
  6.5 Summary

7 Supervised Autonomy
  7.1 Related Work
  7.2 The Components
  7.3 Implementation
    7.3.1 Communication – Flow of Commands
    7.3.2 Qualitative instructions
    7.3.3 Making a Suggestion
    7.3.4 Purposive Map Making
    7.3.5 Teleoperation console
  7.4 A Supervised Autonomy experiment
  7.5 Summary

8 Conclusion
  8.1 Further work

A Experimental data
  A.1 Lens distortion
  A.2 Teleoperation Console – User Interface
    A.2.1 Communication settings
    A.2.2 Robot settings
    A.2.3 Console panel settings

Bibliography
List of Figures

2.1 Planner-based system
2.2 Reactive system
2.3 Hybrid system
2.4 Behaviour-based system
3.1 Overview of our system
3.2 Yamabico Mobile Robot
4.1 Vision Processing
4.2 Template Matching
4.3 6 × 8 grid image and the templates used for Free-space-search
4.4 Compensating for lens distortion
4.5 Adapting to lighting changes
4.6 Interest-Operator
4.7 Templates acquisition
4.8 Robot’s field of view from onboard CCD camera
4.9 Camera configuration
4.10 Non passable space
4.11 Robot velocity curve
4.12 Moving around an obstacle
4.13 Filtering level
4.14 Speed of processing
4.15 Target pursuit experiment
5.1 Action selection
5.2 Connection of a behaviour
5.3 General connectivity of behaviours
5.4 Connection: Collision Avoidance
5.5 Collision Avoidance Experiment
5.6 Connection: Free Space Move
5.7 Free Space Move
5.8 Highest free space cell (cell 2-2, starting from top left)
5.9 Largest free space area
5.10 Goal Seeking
5.11 Goal Seeking
5.12 Wandering: behaviour interconnections
5.13 Wandering in a static environment
5.14 Wandering in a dynamically changing environment
5.15 Configuration: Target Pursuit
5.16 Target Pursuit
5.17 Target tracking
5.18 Foraging: Behaviour interconnections
5.19 Foraging
5.20 Foraging – trace
5.21 The robot exhibiting a bias toward a landmark
5.22 PM: Landmark-based navigation
5.23 Landmark-based navigation
5.24 Robot Vacuuming
5.25 Network of soccer behaviours
5.26 Soccer Goal
5.27 Goto Ball
5.28 Dribbling
5.29 Kicking
5.30 Soccer playing
6.1 Types of local minima
6.2 Visual servoing to a goal target
6.3 Behaviour interconnections
6.4 Basic algorithm
6.5 Improved algorithm
6.6 Reachability
6.7 Maze
6.8 Line segments
6.9 Conflict between behaviours
6.10 Landmark signature
6.11 Contour Following
6.12 Contour Following
6.13 Initial PM showing the possible paths; this is used as a resource by the system in performing navigation
6.14 Path trace showing robot’s trajectory: n = 15 sec, starting in a CW direction
6.15 Path trace showing robot’s trajectory: n = 30 sec, starting in a CCW direction
6.16 Visual servoing
7.1 Teleoperation system configuration
7.2 Direct robot command
7.3 Qualitative Instructions
7.4 Operator Notification
7.5 Go Forward – Qualitative Behaviour
7.6 Go Toward – Qualitative Behaviour
7.7 Go Between – Qualitative Behaviour
7.8 Keep To
7.9 Finding unknown landmark targets
7.10 Console and Go Toward
7.11 Selection of landmarks
7.12 Mission
7.13 Snapshots of a Supervised Autonomy mission
7.14 PM of Mission
A.1 Communication settings
A.2 Robot settings
A.3 Panel settings
List of Tables

4.1 Comparison of lighting adaptation schemes for low texture content templates
4.2 Comparison of lighting adaptation schemes for high texture content templates
Chapter 1

Overview

“In the beginning there was no Europe. All there was, for five million years, was a long, sinuous peninsula with no name, . . . ” — [Davies, 1996]
This thesis studies the use of vision for autonomous mobile robot navigation. Emphasis is placed on achieving real-time mobile robot navigation in unmodified, unknown and dynamic indoor environments. Unmodified means the environment is not altered in any way to accommodate the robot. Unknown means the robot is not given any detailed information about the environment or the situations in which it will operate. Dynamic means the robot must function in changing circumstances and must be able to react to unforeseen and rapidly evolving situations.
1.1 Principal objectives

Our research goal can be divided into two distinct objectives: first, to solve the autonomous navigation problem for indoor mobile robots, with the aim of transferring the outcomes to other mobile robotic systems working in other domains; and second, to consider the aspects of human-robot interaction that such systems require. Clearly, we can form two topics of discussion: Autonomous navigation and Human-robot interaction.
1.1.1 Autonomous Navigation
To begin, we should first examine what the term “autonomous navigation” means. Autonomous¹ is often described as meaning self-governing. Historically, the terms navigate² and navigation³ are used in reference to the sailing of ships, the flying of aircraft, or even the driving of cars. The essential element in sailing, flying and driving can be viewed as “to control the course of”, or “to steer”. Drawing from these definitions, our initial goal can be restated as “to investigate self-governing systems that are able to steer from one place to another place”.

In characterising autonomous navigation we can identify three essential attributes: motion, reactivity, and conflict resolution. The robot requires a mechanism that allows it to move freely in the environment. It must be able to detect and react to situations (by way of cues). To elaborate, the ability to detect and resolve conflicting situations is vital in the development of a truly autonomous system. If a robot is to achieve its objective of moving toward a goal while remaining safe (avoiding collisions), conflicting situations that arise need to be resolved. Such attributes are apparent in our everyday lives: biological systems can freely move through their environment without collision, while accomplishing particular tasks such as foraging. If a conflicting situation arises, attempts are made to resolve it. For example, behavioural studies of fiddler crabs [Zeil, 1998] have shown that simple biological systems such as these tiny crabs can successfully return to their burrow even when it is outside their field of visual perception, or is obscured by unexpected obstacles. This is an inspiring example of a biological system that exhibits the attributes of an autonomous navigation system.

To break the problem of autonomous navigation down further, a series of questions were posed: “How should we perceive the world?”, “What to do with what has been perceived?”, “What internal representation should be used?” and, finally, “How should conflicts be resolved?”. These questions provide a broader view of the difficult issues that must be addressed.

¹ autonomous adj. 1. Govt. a. self-governing; independent; subject to its own laws only. b. pertaining to an autonomy. 2. independent; self-contained; self-sufficient; self-governing. 3. Philos. containing its own guiding principles. 4. Biol. existing as an independent organism and not as a mere form or state of development of an organism. 5. Bot. spontaneous. — The Macquarie Dictionary
² navigate v. (-ting) 1. manage or direct the course of (a ship or aircraft) using maps and instruments. 2. a. sail on (a sea, river, etc.). b. fly through (the air). 3. (in a car etc.) assist the driver by map-reading etc. 4. sail a ship; sail in a ship. — Oxford English Dictionary
³ navigation n. the science of getting ships, aircraft, or spacecraft from place to place. — Merriam-Webster Dictionary
How should the robot sense its environment?
The ability to sense the surrounding environment is a fundamental requirement of any autonomous system. Entomological studies have indicated that many insects such as bees and ants navigate quite adequately using vision [Cheng et al., 1987], [Srinivasan, 1993, Srinivasan et al., 1997]. These creatures navigate relying solely on visual sensing, and exhibit the capability of sighting and reacting to goals while detecting and avoiding obstacles. Such systems have provided the inspiration for this research. Computer vision is a rich sensory medium that provides a wealth of information about goals as well as obstacles, without the need to modify the environment. This key characteristic makes visual sensing a highly attractive sensor for robot perception. Therefore, the focus and challenge of our investigation has been the processing of this information. Our contribution in this area includes the development of real-time vision algorithms that support vision-based navigation.
What to do with what has been perceived?
The capability to select an appropriate action based on what is perceived is a crucial element of any system. The success of the Behaviour-based approach to robotics has been attributed to the ability of these systems to react to dynamically changing environments while maintaining self-preservation [Brooks, 1990] and [Steels, 1994]. This supports our position in adopting the behaviour-based methodology for the development of our system. The modularity of the behaviour-based methodology provides us with an elegant way of implementing our system. Our efforts have focused on the development of a rich set of vision-based behaviours for use in robot navigation.
How should the world be represented?

Visual perception and a behaviour-based architecture provide the sensing mode and framework for our system. However, an internal representation is needed for further advancement of our system. The work of [Agre and Chapman, 1990] suggests that plans or maps should be used as resources or guides for action, and not as programs to control action. Agre and Chapman argue that robots should be provided with augmentable resources that guide navigation. A system using this philosophy has been implemented [Zelinsky et al., 1995, Zelinsky and Kuniyoshi, 1996] for a sonar-based mobile robot. The robot was guided by a resource called a Purposive Map (PM). The notion of a map providing a qualitative-like representation is appealing. The PM representation contains a sequence of actions that a robot should perform and specifies the conditions that trigger those actions. This removes the notion that every step of the system needs to be defined. For example, in landmark-based navigation, landmarks can be used as location cues, which can also determine what action should be taken next.
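To make the idea of a map-as-resource concrete, the sketch below shows one possible way of coding a Purposive-Map-like structure as an ordered list of cue-action pairs that is consulted rather than executed blindly. This is an illustrative assumption only; the class and field names (PurposiveMap, PMEntry, cue, action) are hypothetical and are not taken from [Zelinsky et al., 1995] or from the system described later in this thesis.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class PMEntry:
    cue: str                      # e.g. the name of a landmark template
    action: Callable[[], None]    # behaviour to trigger when the cue is sighted

class PurposiveMap:
    """A map used as a resource: it suggests an action for a sighted cue,
    it does not dictate every step the robot must take."""
    def __init__(self, entries: List[PMEntry]):
        self.entries = entries

    def suggest(self, sighted_cue: str) -> Optional[Callable[[], None]]:
        for entry in self.entries:
            if entry.cue == sighted_cue:
                return entry.action
        return None               # no advice: the reactive behaviours carry on

# Usage: landmarks act as location cues that select the next action.
pm = PurposiveMap([
    PMEntry("door_poster", lambda: print("turn left")),
    PMEntry("fire_extinguisher", lambda: print("go forward")),
    PMEntry("goal_sign", lambda: print("servo to goal")),
])
action = pm.suggest("door_poster")
if action:
    action()
```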
How should conflicts be resolved?

As mentioned earlier, behavioural conflicts may occur while trying to achieve an objective. In order to be able to resolve a problem, one must first identify it. We have developed techniques that facilitate the detection of behavioural conflicts. Utilising this approach, we propose a systematic way of resolving such problems. Our approach monitors the system’s internal behaviours, looking for occurrences of conflicts. Once a conflict is detected, the system records the location where it occurred and then begins a systematic search to escape the situation. For example, while a robot is performing a task involving landmark-based navigation, the robot may sight a goal, but at the same time it may also be blocked by an obstacle. This results in a conflict between trying to reach the goal and remaining safe. Once this conflict has been detected, the robot records the location of its occurrence. The robot then starts moving around the obstacle until the goal becomes reachable, or it determines that the goal is unreachable.

In posing the above questions we have established the basis for our investigation into the development of a vision-based autonomous navigating mobile robot. A significant portion of this thesis has been devoted to the experimental investigation of such systems.
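As an illustration of the conflict-resolution strategy described above, the following sketch monitors for the case where the goal is sighted while the path is blocked, records where the conflict occurred, and searches around the obstacle until the goal is reachable or the robot returns to the recorded location. The robot interface used here (goal_seeking, obstacle_avoidance, follow_contour, and so on) is assumed purely for illustration; it is not the implementation described later in this thesis.

```python
def navigate_step(robot):
    """One decision cycle with conflict detection (hypothetical robot interface)."""
    goal_visible = robot.goal_seeking.active()      # wants to drive toward the goal
    blocked = robot.obstacle_avoidance.active()     # wants to steer away from an obstacle

    if goal_visible and blocked:
        # Behavioural conflict: reaching the goal contradicts remaining safe.
        conflict_location = robot.current_position()
        robot.remember_conflict(conflict_location)
        # Systematic escape: follow the obstacle contour until the goal becomes
        # reachable, or give up once we have circled back to where we started.
        steps = 0
        while not robot.goal_reachable():
            robot.follow_contour()
            steps += 1
            if steps > 10 and robot.near(conflict_location):
                return "goal unreachable"
        return "conflict resolved"

    robot.run_behaviours()                          # no conflict: behave as normal
    return "ok"
```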
1.1.2 Human-Robot Interaction
In the opening pages of the book “The Robotics Revolution”, Scott writes, “. . . at the end of the day, however wonderful the robots, it is the humans who mean the most.” [Scott, 1984]. Few people would not share this view. Historically, robotics research has been oriented towards the development of systems that can assist people in performing their tasks. Today, little has changed in orienting robotics towards this goal. This, and the large number of domains in which autonomous mobile robots can be applied, has motivated us to investigate a framework that can facilitate the interaction between people and robots. We consider that any proposed methodology should alleviate the stress placed on human operators while providing suitable levels of instruction and feedback. The term “Alleviating stress” means that a human operator need not be burdened with the complete control of a system. “Suitable instructions” means that a human operator should not need to know the complex instructions required to command a robot to perform a task. “Suitable feedback” means that feedback is given both in appearance and in explanation of what has taken place. With these considerations we propose a methodology of “Supervised Autonomy”, which provides a framework for the construction of human-robot interactive systems. It consists of five interconnecting components: “self-preservation”, “instructive feedback”, “qualitative instructions”, “qualitative explanations”, and “user interfaces”.
“Self-preservation” — How should we alleviate the burden placed on the operator?

The introduction of “self-preservation” shifts the general control of the robot from the operator back to the robot itself. “Self-preservation” controls include safety aspects of the robot, such as collision and obstacle avoidance. It removes the need for a tight command loop controlled by the operator. For example, if the robot has been instructed to visually servo to a selected target and an unexpected obstacle appears, the robot will move around it or stop to avoid a collision, and then attempt to continue servoing.
“Instructive feedback” — Do you see what I see?

The notion of “Instructive feedback” is to provide the same feedback to the operator as to the robot. This is based on the intuition that if the operator shares the same perception medium as the robot, instructions to the robot will be more reasonable and sensible. For example, in the case of a visual target to servo to, the User can simply select a target from the same visual data stream that is also shared by the robot.
“Qualitative instructions” — How should the robot be instructed?

Commanding a robot can be a difficult and troublesome task. To overcome this problem we prescribe the use of qualitative high-level instructions. These instructions offer or suggest information that is easily understood and is relatively natural to use. For example, the command “keep along this feature” is used to direct our robot to follow a wall.
“Qualitative explanations” — How should the representation be provided to a User?

There is a need to describe to a User what is happening during the course of a robot mission. This description should include what events and activities took place in a given period of time. For example, a mission can be described as: “Go forward” until you can’t, “keep along” that wall, and then “go through” this door.
“User interface” — What controls should we provide to the User?

To complete the system, consideration must be given to the development of a user interface. It provides the means to display “Instructive feedback” and the ability for the User to give “Qualitative instructions”. As an experimental test-bed for our approach we have chosen the task of teleoperating a mobile robot. This task is a well-defined problem and an excellent framework for evaluating our ideas. A well-integrated system should provide an operator with the flexibility required to travel in complicated and dynamic environments, without requiring a priori knowledge of the environment.
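As a sketch only, the fragment below shows how a small vocabulary of qualitative instructions could be dispatched to underlying behaviours while a self-preservation check retains the right to override the operator. The command strings follow the examples given in the text; the robot methods (drive, follow_feature, steer_between, collision_imminent, stop) are hypothetical and stand in for whatever interface a real teleoperation system would provide.

```python
# Hypothetical mapping from qualitative instructions to visual behaviours.
def go_forward(robot, target=None):
    robot.drive(speed=0.3)

def keep_along(robot, target):
    robot.follow_feature(target)          # e.g. keep along a wall

def go_through(robot, target):
    robot.steer_between(target)           # e.g. pass through a doorway

QUALITATIVE_COMMANDS = {
    "go forward": go_forward,
    "keep along": keep_along,
    "go through": go_through,
}

def execute(robot, command, target=None):
    # Self-preservation runs first and can veto the operator's instruction.
    if robot.collision_imminent():
        robot.stop()
        return "stopped: obstacle detected"        # instructive feedback to the User
    behaviour = QUALITATIVE_COMMANDS.get(command)
    if behaviour is None:
        return f"unknown instruction: {command}"   # qualitative explanation
    behaviour(robot, target)
    return f"executing '{command}'"
```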
1.2 Contributions
The contributions of this research can be summarised as follows:

• A modularised vision-based control architecture has been developed. The architecture removes the restriction that the vision processor must reside on the same physical platform as the robot.

• Real-time vision processing algorithms have been developed as part of this research. A novel image correlation technique has been derived and implemented.

• A set of flexible, high-performance, real-time vision-based behaviours for mobile robot navigation has been developed. These behaviours have proven to be useful across a wide range of robot applications.

• In the move towards autonomous mobile navigation, we have developed a system that can resolve conflicting situations for obstacle avoidance using vision-based techniques.

• A paradigm called “Supervised Autonomy” has been proposed to facilitate human-robot interaction.
1.3 Outline
First, a discussion of related previous research is presented in Chapter 2. In Chapter 3, we outline our experimental setup. Chapters 4, 5 and 6 present the visual behaviours for mobile navigation and show how these behaviours can be utilised to achieve a robust, high-performance navigation system for a variety of applications. Chapter 7 reports on an extension of the navigation system to support the “Supervised Autonomy” paradigm. A summary of the individual chapters of this thesis is as follows:
1.3.1 Background and Review
A review of mobile robot navigation approaches is presented in Chapter 2. It begins by examining relevant control architectures used in mobile robot navigation. This is followed by a review of navigation schemes. The chapter ends with a discussion of past significant research in vision-based mobile robots.
1.3.2 System Infrastructure
In Chapter 3, a description of our system hardware is presented. Details of our robot, the vision processor and the communication server used in this research are given. In Chapter 4, the robot’s vision processing system that was developed is presented. The vision processing framework is explained, and a component-by-component description of the system is given. Methods for compensating for camera lens distortion, detecting free space, and transforming camera image coordinates to the floor plane are described. Schemes for segmenting the robot’s environment into floor and non-floor space, adapting to lighting changes, detecting goals, controlling the robot’s velocity and filtering noise are presented. The emphasis of these vision processes is on achieving real-time performance.
1.3.3 Vision-based Behaviours for Mobile Robot Navigation
Chapter 5 presents the functionality of the robot’s visual behaviours. The basic behaviours are Collision Avoidance, Free-Space-Move and Goal Seeking. The architecture of the system, together with the action selector, is also covered. We show that by combining such behaviours, purposeful tasks can be realised. A range of robot applications that have been constructed using these purposive behaviours is presented in the remainder of the chapter. These applications include a soccer playing robot and a vacuuming robot.

To show the robustness and effectiveness of our visual behaviours, we conducted wandering experiments in both static and dynamic environments. The robot was able to avoid collisions with both moving and stationary obstacles. Another basic skill we developed was visual servoing. The effectiveness of the visual sensing was demonstrated by the robot chasing a remote-controlled car. To demonstrate the ability of goal-based navigation, we instructed the robot to search for a goal while wandering, thereby producing a foraging-like behaviour.

A logical extension of our basic repertoire of visual navigation skills was to perform purposeful navigation. This led to the development of a landmark-based navigation system. Here we show that successful missions can be achieved without any elaborate and detailed information about the environment. We provided the robot with a set of visual cues, each cue associated with a particular action; as the robot sees a cue it performs the appropriate associated action. This experiment also demonstrated the opportunistic aspects of our system. For example, if by chance the robot sees the goal before visiting all the other landmarks, it will execute the relevant action.

To demonstrate the flexibility and reactiveness of our system, we developed a soccer-based repertoire of robot behaviours. Experiments included servoing, goal shooting, kicking, dribbling, and a combination of kicking and chasing the ball. In our final experiment, the robot was able to locate, servo to and dribble the ball, detect the goal and finally shoot the ball into it. An experiment was also conducted in robot vacuum cleaning, as part of a multi-robot cooperative cleaning project. In this experiment our robot was responsible for the specialised task of vacuuming litter that had been gathered by another robot. Our robot was able to servo on the litter without the use of explicit models.
1.3.4 Autonomous navigation
In the final step towards autonomy, we discuss in Chapter 6 how our robot can resolve internal behavioural conflicts. A systematic search algorithm is introduced, followed by a description of a refined version of the algorithm, which is implemented using visual behaviours. Vision-based techniques for behaviour conflict detection and for contour following of obstacles are presented. To illustrate the capability of our system to resolve conflict situations, a number of goal-oriented experiments were performed. In one experiment, the robot was directed to head toward a given goal that had been deliberately blocked. In another experiment, the robot was sent to wander around searching for a blocked goal. In both cases the robot was able to resolve the conflict between moving to the goal and avoiding collisions.
1.3.5 Human-robot interaction
In Chapter 7, a paradigm called “Supervised Autonomy” for teleoperating a mobile robot is proposed. The basic concept and its objectives are presented. Each of the visual behaviours developed is described, including extensions to the vision processing components. The teleoperation console developed to support this research is also described. To demonstrate the effectiveness of the concept of “Supervised Autonomy”, we allowed an inexperienced user to supervise a mobile robot using an intuitive computer user interface. The robot was commanded by a set of high-level qualitative instructions. In one mission, our robot was able to travel from one room to another while using only a small set of high-level qualitative instructions.
1.3.6 Conclusion
To conclude, Chapter 8 summarises the key points of our research and outlines avenues for further research.
Chapter 2

Background

“Genius is one per cent inspiration and ninety-nine per cent perspiration.”¹ — [Edison, 1932]
Mobile robot navigation research evolved out of the desire to allow robots to travel autonomously between places in order to perform tasks. Over a number of years researchers have studied this problem from many directions, investigating navigation schemes, control systems and sensing. In this chapter a review of past mobile robot navigation research is presented. The review provides the basis and inspiration for the work presented in the remaining chapters. Our discussion takes a retrospective view of past approaches to the problem of navigating a mobile robot from one place to another. In Section 2.1, we begin by reviewing control architectures that have been used in mobile robotics. In Section 2.2, a discussion of different navigation approaches is presented. The discussion outlines various navigation schemes, from model-based planning using geometric maps to techniques that avoid using a detailed description of the environment in performing mobile robot navigation. A discussion of the problem of robot localisation is also given. Section 2.3 presents a review of significant vision-based navigation systems that are of relevance to our research.
¹ For Rochelle — “thanks for the quote”.
Figure 2.1: Planner-based system (diagram: Sensors and Perception feed a Central reasoning system with a World Model, which drives the Motor controller and Actuation)
2.1 Robot control architectures
A control architecture provides the structure for the control of a system; it imposes constraints on the way the system should perform. We begin with an outline of the common control architectures that are used. For our discussion we have divided the architectures into four classes²: Planner-based, Purely reactive, Hybrid and Behaviour-based approaches.
2.1.1 Planner-based
A typical system which employs the Planner-based approach depends on a centralised internal model of the environment, often referred to as the world model. The model is used for the verification of sensory data and for generating actions based on the perceived data. The model is also used by the planner to produce the most appropriate sequence of actions, i.e., a plan. The structure of a typical Planner-based system is given in Figure 2.1. This kind of system is often referred to as the “sense-model-plan-act” architecture. The system is structured so that all processing, sensing, controlling and reasoning is performed centrally. Such systems are often referred to as centralised intelligence. All components are controlled by the central system, and the overall behaviour of the system emerges from this centralised controller. The planner-based approach employs many of the classical Artificial Intelligence (AI) methodologies³. Early systems which followed the planning philosophy included SHAKEY and the Stanford Cart; for a further description of these systems see Sections 2.3.1 and 2.3.2 respectively.

² A similar division has also been made by [Matarić, 1992a, Arkin, 1998].
³ “Planning” was first used by [Feigenbaum and Feldman, 1963, Newell and Simon, 1963], where a plan is defined as a sequence of actions designed to accomplish some task.
During research into the application of classical AI techniques to robotics, a number of problems arose. The main problem faced by researchers at the time was the lack of available computational power. Real-time performance was near impossible, mainly because many of the techniques relied on complex models that required the exploration of large search spaces. The favoured approach was to examine the tree of possibilities as quickly and effectively as possible. Since this examination and searching proved problematic, the earlier planner-based systems showed a lack of responsiveness to real-world situations. A heavy computation cycle was required to process data for the next action. This lack of performance inhibited the development of robot systems that could satisfactorily cater for real-world events. Although this was the case in the past, recent advances in computational technology have allowed the monolithic approach to the development of robotic systems to continue.

Another shortfall of the early systems was their heavy dependency on the correctness and consistency of the information entering and processed within these systems [Nilsson, 1984, Giralt et al., 1984, Chatila and Laumond, 1985]. These assumptions are typically due to the system design methodologies. Typically these systems model or abstract away — ignoring — sensing errors to a level that is unrealistic when applied to the real world. See [Latombe, 1991] for example systems which ignore the use of sensor information in arriving at solutions to robotic problems. Such expectations cannot be applied lightly to real-world sensory data; sensing errors are a typical occurrence in real-world systems.

In summary, planner-based systems:
√ provide a high level of data processing.
× are monolithic in structure — complex in design and implementation.
× lack real-time performance, due to large computation cycles.
× are unrealistic in their design — ignorant of the characteristics of real-world data.
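For illustration, a minimal sketch of the sense-model-plan-act cycle is given below: every sensed datum passes through a central world model and a planner before any actuation occurs, which is where the heavy computation cycle criticised above arises. The object interfaces (read_sensors, update, compute, execute) are placeholders, not code from any particular system.

```python
def sense_model_plan_act(robot, world_model, planner):
    """One cycle of a centralised planner-based controller (illustrative only)."""
    observation = robot.read_sensors()          # sense
    world_model.update(observation)             # model: verify and integrate the data
    plan = planner.compute(world_model,         # plan: search for an action sequence
                           goal=robot.goal)     #       (potentially very expensive)
    for action in plan:                         # act: execute the plan open-loop
        robot.execute(action)
```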
2.1.2 Purely reactive
The idea of allowing stimulus and response to control the actions of a robot led to the development of reactive systems. The development of purely reactive robot systems is often referred to as biologically inspired. Like their biological counterparts, all of the connections between the perception sensors and actuator actions are prewired. The systems are divided into condition-action pairs. Using such an architecture, stimuli from the environment are tied directly to actions performed by the robot. Figure 2.2 presents an example configuration of a reactive system. Robots that have been built using reactive schemes have been shown to give robust performance in the real world [Brooks, 1986, Arkin et al., 1992].

Figure 2.2: Reactive system (diagram of direct sensor-scheme-actuation couplings: left obstacle, right obstacle and collision sensors; left obstacle detected, right obstacle detected, collision detected; turn right wheel forward, turn left wheel forward, stop all wheels)

Historically, the two robotic systems which first possessed reactive-like behaviour were the “Tortoises” of [Walter, 1953] and the “Vehicles” of [Braitenberg, 1965]⁴. The first reactive mobile robots were the “Tortoises” (ELMER and ELSIE) developed by [Walter, 1953]. Walter⁵, a neurophysiologist, designed his robots as a study of reflex actions. His study focused on the importance that neurological interconnections played in the production of complex behaviours, rather than the number of neurons. The tortoises were designed to wander around seeking light sources and were also able to recharge their own batteries.
⁴ The full potential of Braitenberg’s Vehicles was not realised until the publication of [Braitenberg, 1984].
⁵ Walter was also credited with the discovery of theta and beta waves (brain waves associated with light and heavy sleep).
15
Braitenberg’s Vehicles were designed purely as thought experiments in “synthetic psychology” [Braitenberg, 1984].
They were designed to show that complex be-
haviours can be realised through simple connections between sensors and actuators. Braitenberg shows the possibility of behaviours such as cowardice, aggression, love and exploration. Although no physical system was ever built by its creator, the basic ideas of Braitenberg’s Vehicles are still being used today. A number of extensions have been proposed by [Lambrinos and Scheier, 1995] and [Wyeth, 1997]. In the work of Lambrinos and Scheier, they extended the basic idea of Braitenberg’s Vehicles in a behaviour neural network 6 , allowing the robot to possess selfsufficiency while performing a task of peg collecting.
Self-sufficiency is achieved
through the simulation of battery levels, the energy is decrease proportional to the distance, when the battery is low the robot heads toward a charging station indicated by a light source. When the robot has collected a peg it returns to its home base, marked by a black spot on the floor. Using simple IR sensors this system can execute the following actions: move forward, avoid, home, deposit, go to charging station, and recharge. The extension is based on providing action selection capabilities in a weighted fashion between these reactive actions. The work of Wyeth, extended Braitenberg’s ideas into the task of “Hunt and Gather”. Wyeth’s robot wanders around its environment using vision information, hunting for a tennis ball while avoiding obstacles. The basic idea of Braitenberg’s Vehicles was used as the foundation for this robot, directly coupling sensing and motor actuation to form the overall system. The systems of Lambrinos and Wyeth have shown promising results in the way they can handle situations in real world environments. Contrary to Planner-based systems, reactive systems take a minimal system approach. Purely reactive systems have no internal states, or models of their environment, and therefore do not require any complex search to produce a response. The reduction in complexity and the minimal computation used allows systems to achieve high responsiveness to the real world. As can be seen from Figure 2.2, the execution of these reflexes is purely a simple command lookup. Such systems rely on the direct coupling between sensing and action and the fast feedback loop of the environment. The designer of such systems must be able to predict all the relevant world states that 6
a term used by the authors.
16
Background
can arise, and the appropriate corresponding actions. These systems do not depend on the correctness of information, since information describing the world is obtained directly from its sensor(s). The work of [Brooks, 1986] and [Arkin, 1987] revived the interest in reactive like robot systems. Brooks produced what is now known as Behaviour-based architecture for the control of robots. The work of Arkin used the insights gained from cognitive psychology studies of action-oriented perception [Neisser, 1976], to describe “Motor Schema”, where individual motor schema correspond to primitive behaviours [Arkin, 1987]. The systems of Brooks and Arkin provide more than just a simple hookup of sensors and actuations, they also provide, at a low level, internal states. Although reactive systems produce desirable responses with great efficiency, by having little internal representation, these systems lack run-time flexibility. Some attempts have been made to address this shortcoming, these include: Hybrid systems, which attempts to combine planner-like structures with reactive sub-systems; and Behaviour-based systems, where reactive components are embedded as part of the system structure. These approaches are discussed in Sections 2.1.3 and 2.1.4 respectively. In summary, purely reactive systems are: √
easily manageable in structure and organisation.
√
highly responsive to real world situations.
× lack run-time flexibility. × lack the ability to solve problems that were not preconceived by the designers.
2.1.3
Hybrid Architectures
Ideally, these systems advocate the notion of gaining the best of both worlds. The idea of a Hybrid system is to divide the system into two parts, at the lower level of control a reactive approach is used, and at the higher level, a planner is used for decision making. An example of such a system is shown in Figure 2.3. Generally the low-level process takes care of the overall safety of the system. And the high-level planner is used to
2.1 Robot control architectures
Sensors
17
Collision sensor
Collison detected
Stop all wheels
Right obstacle sensor
Right obstacle detected
Turn left wheel forward
Left obstacle sensor
Left obstacle detected
Turn right wheel forward
Planner
Motor controller
Actuations
Figure 2.3: Hybrid system
determine action sequences. This approach has gained significant attention since there is no need to abandon the knowledge and insights gained from past research in plannerbased systems [Arkin, 1989, Arkin, 1990, Payton, 1991, Connell, 1991, Gat, 1992]. One of the first hybrid architectures was reported by Arkin, named Autonomous Robot Architecture (AuRA) [Arkin, 1989, Arkin, 1990]. AuRA has been applied to the development of a number of real world robot systems, integrating world knowledge into a robot navigation system, and reactive components for the management of the survival and safety of the system. Specifically, this architecture has been applied to the domains of indoor and outdoor mobile robot navigation, manufacturing environments, vacuuming, mobile manipulation and multi-robots. Although Hybrid architectures for mobile robot navigation have taken advantage of prior information, which on some occasions has helped the robot to achieve efficient and sophisticated results. These promising results were primarily due to the added flexibility purely reactive approaches did not provide, in allowing the integration of well established deliberative components. However, the integration of such disparate components may prove to be problematic, if the newer reactive component is being treated as a functional unit, rather than concurrently running modules, as intended, in order for a system to keep up with events in the real world. For example, if
18
Background
Collision Avoidance
Obstacle Avoidance Sensors
Actuations Wall Following
Move To Goal Figure 2.4: Behaviour-based system mode switching is used, the reactive component of the system may be left waiting indefinitely for the deliberative part of the system, to determine the next most suitable action. This is not the case in systems using a Behaviour-based architecture, see Section 2.1.4 below. In summary: √
/× allows classical AI components to be integrated with reactive components. √
responsiveness is due to the reactive components of the architecture.
√
provides the flexibility which was lacking in purely reactive system.
× may be problematic if incorrectly integrated.
2.1.4
Behaviour-based Architecture
In some ways, reactive systems can also be classified as Behaviour-based systems. The notion of behaviour by definition is the outcome of a response to a stimulus. In our work, we use the term “behaviour” as a competence module which is responsible for a particular stimulus and response. A behaviour when stimulated produces an
2.1 Robot control architectures
19
output response. To make the distinction clear, a behaviour is considered to be more than just a simple configuration of sense then act. It can contain internal states, and its output actions can be influenced by other behaviour(s). It is through the interaction between behaviours, that emerges the overall output of the system. Although, behaviour-based systems have reactive components, the final response is not directly determined by the reactive components. Typically, a behaviour-based system incorporates an arbitrator for determining the final motor command. The structure of a typical behaviour-based system is shown in Figure 2.4. Behaviours are usually executed concurrently, and the execution time of each behaviour tends to be short. Through the execution of these behaviours the overall behaviour of the system emerges. The best known and most cited behaviour-based approach is the “Subsumption architecture”, proposed by [Brooks, 1986]. Using this architecture, systems are constructed out of layered behaviours, where each layer is responsible for its own level of competency. The interaction of the layered behaviours provide the overall behaviour of the system. The subsumption architecture proved, that a high level of sophistication in robotic systems can be achieved through the use of simple sensing and action modules. One of the key points of Brooks’ work, was that a model-free approach to robotics could realise a level of performance that was not previously attained by traditional AI systems. Particularly in the responsiveness to dynamic situations. One example was the robot, Genghis, a 1kg six legged robot which was able to stand and walk with a simple behaviour-based architecture [Brooks, 1989, Brooks, 1990]. Using only force measurements in a model free context the system was able to lift its legs over obstacles, thus also allowing it to navigate over rough terrain. A Behaviour-based system does not employ a world model to determine its action. It uses a different form of internal representation for determining action, as opposed to model-based planners. The internal models that are typically used are based on exploiting physically grounded information, that can be directly sensed. Thus, providing the added advantage of not needing to perform troublesome operations, associated with model searching and matching during the operation of a system. Again Gengis serves as an example of how a successful robotic system can be built to function in a realistic environment, without the use of an explicit internal model of the environment. Other similar systems which exploit the natural constraints of the environment include, Herbert the can collecting robot [Connell, 1991], and Polly the
vision-guided mobile robot [Horswill, 1994a]. Building systems based on reactive components to create a set of behaviours, and using an arbitrator to select an action from the available behaviour outputs, has resulted in an impressive range of tasks that can be accomplished (see [Maes, 1991, Steels and Brooks, 1995] for descriptions of some of these systems). However, as with all systems, behaviour-based systems are not perfect. The deadlock situation arising from behavioural conflict is well understood and has been studied by a number of researchers [Adams et al., 1990, Hartley and Pipitone, 1991, Arkin et al., 1992, Zelinsky and Kuniyoshi, 1996]. A typical example is the conflict arising between goal-reaching and collision avoidance behaviours in a behaviour-based navigation system. This is an important problem, and it is addressed in this thesis in the context of achieving goal-oriented navigation while resolving deadlock situations. In summary:
√ the overall system behaviour emerges from interactions of behavioural components, to produce behaviour which was not explicitly specified.
√ only the minimum amount of communication is used between modules.
√ as with purely reactive systems, behaviour-based systems are also highly responsive.
× conflicts between behaviours can occur, resulting in deadlock states.
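To make the behaviour-plus-arbitrator structure of Figure 2.4 concrete, the sketch below shows one minimal way a set of concurrently evaluated behaviours and an arbitrator could be wired together. The behaviour names, priorities and the winner-takes-all arbitration rule are assumptions made for this illustration; they do not describe the system developed in this thesis.

    # Minimal behaviour-plus-arbitrator sketch (illustrative only).
    from dataclasses import dataclass
    from typing import Callable, Optional

    @dataclass
    class Command:
        speed: float                                # forward velocity request
        turn: float                                 # steering request

    @dataclass
    class Behaviour:
        name: str
        priority: int                               # higher value wins arbitration
        act: Callable[[dict], Optional[Command]]    # returns None when not triggered

    def collision_avoid(percept: dict) -> Optional[Command]:
        # Stop and turn away if an obstacle is very close (hypothetical threshold).
        if percept["nearest_obstacle"] < 0.3:
            return Command(speed=0.0, turn=0.5)
        return None

    def goal_seek(percept: dict) -> Optional[Command]:
        # Head towards the goal bearing whenever a goal is visible.
        if percept.get("goal_bearing") is not None:
            return Command(speed=0.3, turn=0.2 * percept["goal_bearing"])
        return None

    def wander(percept: dict) -> Optional[Command]:
        return Command(speed=0.2, turn=0.0)         # default exploratory motion

    def arbitrate(behaviours, percept: dict) -> Command:
        # Winner-takes-all: the highest-priority behaviour that produced an
        # output response determines the final motor command.
        outputs = [(b.priority, b.act(percept)) for b in behaviours]
        outputs = [(p, c) for p, c in outputs if c is not None]
        return max(outputs, key=lambda pc: pc[0])[1]

    behaviours = [
        Behaviour("collision avoidance", priority=3, act=collision_avoid),
        Behaviour("goal seeking",        priority=2, act=goal_seek),
        Behaviour("wandering",           priority=1, act=wander),
    ]
    print(arbitrate(behaviours, {"nearest_obstacle": 1.2, "goal_bearing": 0.4}))

Other arbitration schemes (for example, weighted blending of all triggered behaviours rather than a single winner) fit the same structure by changing only the arbitrate function.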
2.2 Robot Navigation
As stated in Chapter 1, the process of navigation is the act of travelling from one place to another. For a physical mobile robot system, approaches to navigation have set about resolving two basic problems: the localisation problem (also referred to as the "Where am I?" problem [Leonard and Durrant-Whyte, 1991]) – determining the whereabouts of the system; and robot motion – determining where to move next given the current situation, which involves embracing a strategy for establishing collision-free paths while allowing a goal to be reached. Addressing these problems in an integrated robot system allows the robot to achieve successful navigation.
In this section we outline some of the different navigation approaches that have been used in the past. Specifically, the presentation of robot motion is divided into two parts: the model-based approach and the sensor-based approach. Model-based approaches, in the context of our discussion, make use of a full description of the environment for navigation. Sensor-based approaches will be presented in the context of techniques that avoid detailed descriptions of the environment. First we introduce some of the techniques used in solving the localisation problem.
2.2.1 Localisation
Localisation provides one of the key elements in navigating a physical robot. It answers the question, "where am I?". For localisation, the solution is directly related to the available sensor(s) and the manner in which the sensor(s) are employed. Methods of localisation can be divided into two categories: accumulative (also known as dead reckoning) and absolute. Accumulative techniques allow a system to autonomously maintain relative information from its available sensor(s); this information gives the robot's location relative to where it began gathering data. Absolute techniques require a system to determine its current location through the use of external means; this information gives the robot's global position. Excellent accounts of the available sensors relating specifically to localisation have been provided by [Everett, 1995, Borenstein et al., 1996, McKerrow, 1990]. The general concept of the important techniques is explained below. Commonly used accumulative techniques include:
• Odometry – This method relies on the ability to measure the wheel rotation and/or steering of the robot. It is a self-contained method that provides an estimate of the robot's relative position by integrating rotational measurements over time (a minimal sketch is given after this list). However, position errors tend to grow without bound unless an absolute reference is used to reset the error. An example of a system which places heavy usage on odometry information is [Kanayama and Yuta, 1988, Iida and Yuta, 1991]. In this system, navigation is performed by determining a straight-line trajectory based on the odometry readings gathered from the wheel encoders. Due to the self-containment and low cost of odometry methods, many commercial robot systems use this technology, for example the Nomad series robots from Nomadic Technologies Corp. and the RWI series robots from Real World Interface Inc. Typically these systems provide information which allows the control of a robot by distance and rotation, based on odometry readings.

• Inertial – A gyro can be used as a turn indicator, allowing the relative orientation of a system to be determined. The information provided by the gyro is in the form of a rate of rotation; the angular orientation of the system can be computed by integrating this information over time. An accelerometer can also be used to measure the acceleration of a system. The measurement is provided as rate-of-velocity information, allowing velocity and linear position to be calculated through integration (once for velocity and twice for position). [Kelly, 1994, Everett, 1995, Borenstein et al., 1996] give an excellent survey of techniques involved in the use of inertial sensors for the purpose of mobile robot navigation. The impressive property of these devices is that they are self-contained, non-radiating and cannot be jammed. However, like odometric systems, inertial sensors accumulate errors without bound.
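As an illustration of the integration step described in the odometry entry above, the sketch below accumulates wheel-encoder increments for a differential-drive robot into a relative pose estimate. The wheel radius, encoder resolution and axle length are hypothetical values, not those of the Yamabico robot used in this thesis.

    import math

    # Hypothetical parameters for a differential-drive robot.
    WHEEL_RADIUS = 0.05      # metres
    TICKS_PER_REV = 1000     # encoder ticks per wheel revolution
    AXLE_LENGTH = 0.30       # distance between the two drive wheels, metres

    def dead_reckon(tick_pairs, x=0.0, y=0.0, theta=0.0):
        """Integrate (left, right) encoder tick increments into a relative pose."""
        metres_per_tick = 2.0 * math.pi * WHEEL_RADIUS / TICKS_PER_REV
        for d_left, d_right in tick_pairs:
            sl = d_left * metres_per_tick         # distance rolled by the left wheel
            sr = d_right * metres_per_tick        # distance rolled by the right wheel
            ds = 0.5 * (sl + sr)                  # translation of the robot centre
            dtheta = (sr - sl) / AXLE_LENGTH      # change in heading
            x += ds * math.cos(theta + 0.5 * dtheta)
            y += ds * math.sin(theta + 0.5 * dtheta)
            theta += dtheta
        return x, y, theta                        # pose relative to the starting point

    # Equal tick counts on both wheels give straight-line motion.
    print(dead_reckon([(100, 100)] * 10))

Because each increment is simply added on, any per-step measurement error is added on as well, which is why the position estimate drifts without bound unless it is reset against an absolute reference.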
Common absolute techniques include:
• Beacon-based – localisation usually relies on three or more known beacons placed in the environment. The beacons are usually highly distinguishable from the rest of the environment, e.g. bright lights or reflectors, allowing detection to be made easily. For the robot to localise itself, triangulation based on the known positions of these beacons is used (a small example follows this list). The advantage of this method is that no elaborate scheme is needed for the recognition of a beacon. However, the environment requires modification in order for this method to function effectively.
• Landmark-based (artificial/natural landmarks) – localisation relies on distinctive, identifiable features at known locations which the robot can detect and use to determine its position. The features are usually individually distinguishable, allowing the robot to acquire positional information without explicitly needing to detect other landmarks. However, elaborate schemes are often needed for the recognition of landmarks. In the context of vision, two basic schemes are commonly used: feature-based and appearance-based recognition. The major difference between these two methods is in the level of complexity and the amount of processing needed. Feature-based methods tend to require more complex and complicated algorithms, and therefore more computational time. The appearance-based approach uses simpler algorithms to determine landmark locations, and most such systems have shown to be fast in their processing. For example, the INRIA system relies fully on feature-based techniques for its navigation (see Section 2.3.3), while the robot Polly exploits the minimal computation of appearance-based techniques (see Section 2.3.4). Polly performs its processing at around 15 Hz, whereas the INRIA system required processing times of well over 5 seconds.

• GPS – One of the latest technologies in absolute positioning, the Global Positioning System (GPS), was first introduced by the United States military to track military vehicles on the surface of the earth, using a number of satellites orbiting the earth. Based on the known orbital locations of the satellites and the line-of-sight distances to them, the position of the system is determined by trilateration, see [Getting, 1993]. Although GPS is now being applied to robot navigation, the real potential of this technology is yet to be realised. Some of the potential error sources associated with this method include satellite position, ionospheric refraction, tropospheric refraction, multipath reflection and the availability of the satellite(s). Researchers such as [Kelly, 1994, Nebot et al., 1997, Sukkarieh et al., 1998] are using inertial systems with GPS and Kalman filters to obtain optimal position estimates.
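The triangulation referred to in the beacon-based entry above can be illustrated with a small least-squares sketch: each bearing to a beacon at a known position defines a line, and the robot position is estimated as the point that best fits all of the lines. The beacon layout and bearings below are made-up values; real systems must also deal with bearing noise and with recovering the robot's own heading.

    import math
    import numpy as np

    def triangulate(beacons, bearings):
        """Estimate (x, y) from absolute bearings (radians) to beacons at known positions.

        Each bearing theta to a beacon (bx, by) gives the line constraint
            sin(theta) * x - cos(theta) * y = sin(theta) * bx - cos(theta) * by,
        and the over-determined system is solved in the least-squares sense.
        """
        A, b = [], []
        for (bx, by), theta in zip(beacons, bearings):
            A.append([math.sin(theta), -math.cos(theta)])
            b.append(math.sin(theta) * bx - math.cos(theta) * by)
        (x, y), *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
        return x, y

    # Made-up beacon layout; the true robot position in this example is (2, 1).
    beacons = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
    true_pos = (2.0, 1.0)
    bearings = [math.atan2(by - true_pos[1], bx - true_pos[0]) for bx, by in beacons]
    print(triangulate(beacons, bearings))   # approximately (2.0, 1.0)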
Although the main thrust of this thesis is centred on vision-based navigation, we advocate navigation systems which use a combination of localisation techniques. In particular, we are interested in the combination of odometry and landmark-based navigation. In summary:
√/× two basic techniques for localisation, accumulative and absolute.
√ accumulative localisation techniques are self-contained, requiring no additional external information. They can provide information such as distance travelled, heading and velocity.
× intrinsically, accumulative localisation techniques tend to accumulate errors over time.
√ absolute localisation techniques can reliably provide an accurate position for the current location of a system.
× absolute localisation techniques typically require access to external information, which can require modification of the environment.
2.2.2 Robot Motion
Determining the motion of a robot is a fundamental requirement that must be fulfilled by a mobile robot navigation system. Computationally, the way in which the motion of a robot can be determined tends to be closely related to the internal representation that the particular system manipulates when establishing how its goal should be accomplished. Attributes to be considered include the amount of data required, how the information is gathered, and how it will be used. In this section we discuss the common techniques used in navigation. There are a number of approaches; to categorise them we have divided our discussion in two: first a discussion of the Model-based approach, followed by a discussion of ways in which models can be avoided using a Sensor-based approach.
Model-based navigation

A model, typically in the form of a map, provides a detailed description of the environment. It is either provided by the system designer or constructed by some mechanism provided by the designer. The model or map is then used to control the manner in which the robot moves; the success of the robot is largely dictated by the accuracy of the model. This section discusses a number of strategies used in model-based navigation. We divide the strategies into two basic categories: the development of planning algorithms, and the use of models for the navigation of mobile robots. In the discussion of planning algorithms, we describe various approaches to solving the navigation problem which have contributed to the early foundations of robotics research. The presentation of the usage of models in navigation starts with a discussion of map
building, followed by a discussion of model matching. Finally, we end with a discussion of the ways in which the use of planning, map building and model matching can be combined in mobile robot navigation.
Motion planning – The foundational work in mobile robot navigation has been oriented towards the research of path finding, where a robot searches for a path in order to reach its goal destination. As stated, the basic assumption taken by this approach is to use a complete description of the environment, and the exact position is assumed to be known at all times, thus removing the localisation problem. The prime objective of the research has been to discover optimal collision-free paths for a robot to reach its current goals. In this work the capability of the robot is usually not considered; typically an abstracted view is taken, for example assuming the robot to be a point, which can then easily be transformed into a geometric map [Koditschek, 1987, Latombe, 1991, Zelinsky, 1994b]. Path planning algorithms normally perform their task with geometric spatial data of the environment; for example the dimensions of walls and obstacles and their locations are specified in a global Cartesian manner. These systems compute various types of graph-like structures, such as visibility graphs [Nilsson, 1969], Voronoi diagrams [Fortune, 1986], freeway nets [Brooks, 1983] and silhouettes [Canny, 1988]. For example, the visibility graph, one of the earliest and most common path finding methods, outputs a series of connected non-intersecting straight line segments to the goal, allowing a robot to travel without collision before reaching its goal. The Voronoi diagram method produces a set of points which can yield a path that has maximum clearance between the robot and the obstacles. The grid or cell decomposition method is another popular technique which has been commonly used for path planning [Brooks and Lozano-Pérez, 1983, Kambhampati and Davis, 1986, Zhu and Latombe, 1989]. These methods decompose the map into a number of free space cells. These cells are connected together to form a connectivity graph, which can be searched for a collision-free path to the goal (a small sketch of searching such a graph is given later in this subsection). Quadtrees are another example of cell decomposition; in this method the map is recursively divided into smaller square cells until a cell is entirely either free space or an obstacle. This reduces the number of cells that
are needed to represent the environment. Only the cells which are free are used to form a connectivity graph, which can be searched for a path to the goal. Other attempts have been made to incorporate sensory information into motion planning systems [Kimoto and Yuta, 1995, Shmoulian and Rimon, 1998]. However, such methods require realistic information about the environment. For example, the work of Kimoto et al. reported on a robot simulator that used accurate physical models. The simulator closely matched the outcome of a sonar-based mobile robot in a real environment. However, the real world is inevitably different to the model of the environment. Sensor data is inaccurate and we cannot always determine or predict the actions of all the agents in the environment. As the real world is unforgiving, the robot will quickly become lost if the accuracy of its model has been compromised. While the robot simulator has limitations it is still a useful debugging tool: if a robot program does not work in the simulator, it has no hope of working in the real world. To compensate for this deficiency, research has been undertaken in sensor-oriented systems. We refer to this as the Sensor-based approach, where the effort is directed towards avoiding the use of a full model. A system which avoids the use of a model tends to compensate for its lack of knowledge by relying increasingly on its sensing ability. Further discussion of Sensor-based systems is given in later sections. Some work in motion planning based systems has been reported on mechanisms that allow re-planning taking into account sensing errors. For example, in the work of [Yuta and Nagatani, 1993], a mobile robot can correct its path as well as replan its path based on collected sensory information.

• Most planning systems determine the complete path a robot takes before any motion is made by the robot. Typically, the plans do not take into
consideration possible dynamic changes to the environment after the plan has been established. Planning cannot consider such information unless future events can be easily determined or are available a priori.

• Planning is usually done in an off-line manner, once all the necessary information is provided to the algorithm. The algorithm typically performs a state search on the information provided, until a motion trajectory can be determined. Another aspect of planning algorithms is that the generated solution usually ignores the sensing capability of the robot [Latombe, 1991].
• Path finding techniques can be summarised with a number of key points. In a model-based system approach, path finding usually occurs in a well-defined environment that is known a priori, and most of the environments are typically 2D configurations, sometimes 3D. These methods do not scale up to multi-degree-of-freedom robots, due to the explosion of the search space.

Model/Map building – Providing an accurate and precise model is a difficult and time-consuming task. Map building is one common technique used to reduce this effort, avoiding the need for the designer of the system to supply a complete model of the environment. The initial belief was that if a robot has the capability to build maps then this will allow easy adaptation to new environments. There are a number of ways in which a map can be constructed by a mobile robot. Most of these systems take the view that map construction should be performed via sensor fusion [Hoppen et al., 1990, Jörg, 1995, Castellanos et al., 1998]. Sensor fusion is usually performed by combining measurements from an arrangement of different sensors to yield the final outcome; some common techniques include averaging, weighted sums, and Kalman filtering. The basic approach to map building is as follows: extract features from the raw sensor data, fuse the data from the sensors (for example sonar, vision, laser and radar), then automatically produce a model of the environment. Research has also been geared toward the maintenance of constructed models. The work of [Elfes, 1987, Crowley, 1989, Engelson and McDermott, 1992] provides such examples; they suggested ways in which a constructed model can be maintained. Further examples, such as the work of [Thrun et al., 1998, Durrant-Whyte et al., 1999, Leonard and Feder, 1999], are given in the following sections. The function of these types of systems is as follows: the most recent data given by the sensor model is used to determine the corresponding content in a local composite world model of the environment. A step is then taken to modify the local composite model within a global structure of the environment, based on the confidence values of its matching.

Model matching – If a model is used for navigating a real mobile robot, one problem which must be resolved is that of model matching. As discussed, this can be largely assisted by the use of localisation techniques. One of the problems this technique faces is the association of the sensor data with the world model
being kept. There is a difference in information content between the sensor data and the model [Kak et al., 1990]. The main problem arises when determining which part of the measurement belongs to which part of the model/map. Once the system has determined a local feature in the environment, the system then needs to place or search for this same feature in the map. As discussed, the use of odometry for localisation is common for navigation. This is usually performed by placing the robot at a known location in the environment; the system then performs navigation based on the distance travelled while following known features provided in the map. For example, using the model of a connectivity graph produced via a Voronoi diagram, a robot can travel by keeping track of its relative distance and orientation while moving between each connected node in the model. The model matching in this case is the relative distance and orientation between each node. Recent techniques employ more sophisticated methods, such as statistical and probabilistic schemes, to reduce the uncertainty of the matching; a discussion of these methods is given in the next section.
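To make the cell-decomposition idea from the motion planning discussion above concrete, the sketch below searches a toy occupancy grid for a collision-free path. Free cells are the nodes of the connectivity graph and 4-connected neighbours are its edges; a breadth-first search stands in for the more elaborate graph searches used in practice.

    from collections import deque

    def grid_path(grid, start, goal):
        """Breadth-first search over free cells (0 = free, 1 = obstacle).

        Returns a list of (row, col) cells from start to goal, or None if no
        collision-free path exists.
        """
        rows, cols = len(grid), len(grid[0])
        parent = {start: None}
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            if cell == goal:
                path = []
                while cell is not None:          # walk back along the parent links
                    path.append(cell)
                    cell = parent[cell]
                return path[::-1]
            r, c = cell
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if 0 <= nr < rows and 0 <= nc < cols \
                        and grid[nr][nc] == 0 and (nr, nc) not in parent:
                    parent[(nr, nc)] = cell
                    queue.append((nr, nc))
        return None

    grid = [[0, 0, 0, 0],
            [1, 1, 0, 1],
            [0, 0, 0, 0],
            [0, 1, 1, 0]]
    print(grid_path(grid, (0, 0), (3, 3)))

Such a plan is, of course, only as good as the map it is computed from, which is why map building and model matching receive so much attention above.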
Motion planning/Map building/Model matching – Recently, robotic systems have begun to emerge which connect all three aspects: motion planning, map building and model matching. The recent work of [Durrant-Whyte et al., 1999, Leonard and Feder, 1999] addresses some of the issues of the problem of large-scale simultaneous localisation and map building (SLAM, a term used by Durrant-Whyte et al. in [Durrant-Whyte et al., 1999]). Durrant-Whyte et al. proposed a scheme in which a map can be built and maintained; the importance of their study is that they were able to show that this problem is theoretically solvable, while still not practically tractable for large-scale environments. The work of Leonard et al. proposed schemes which are computationally efficient for large-scale concurrent mapping, while being able to localise an autonomous underwater vehicle (AUV). The work of [Thrun et al., 1998] proposed a complete system which performs mapping, motion planning and model matching/updating. In addressing these problems they use statistical and probabilistic techniques to estimate the uncertainty within the system (some of these techniques include Bayesian analysis and Hidden Markov modelling). They have chosen to maintain maps in two ways, geometrically and topologically (a discussion of topological maps is given in the next section on Sensor-based navigation). The maps they built are based on
sonar and video information. Topological information allows the system to localise itself, while the geometric map is used to perform path planning. The work of Thrun et al. has successfully demonstrated ways in which a mobile robot navigation system can be built using a model-based approach. Furthermore, their work brings forward the notion that just solving model-based planning or sensor-based navigation is not enough to complete a mobile robot navigation system; all of these aspects must be addressed: planning, mapping and localisation. Although these systems have shown promising results, they have not shown realistically how they can handle the explosion of information gathered from an extremely large number of data points. The ability to handle dynamic environments is still an important issue which requires further consideration. In spite of the fact that large bodies of work have been devoted to this area of research, the problems of constructing and maintaining maps and models are still open issues for further robotic research. In this thesis we propose a system which, in part, addresses the open issues of what is required to provide a complete robot navigation system.
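As a flavour of the uncertainty-weighted estimation these systems rely on, the sketch below fuses a dead-reckoned prediction of a single coordinate with a noisy landmark-based measurement, Kalman-filter style. It is a one-dimensional illustration only, not the multi-dimensional estimators used in the SLAM work cited above.

    def fuse_1d(x_pred, var_pred, z_meas, var_meas):
        """One-dimensional Kalman-style update.

        x_pred, var_pred : predicted position and its variance (e.g. from odometry)
        z_meas, var_meas : measured position and its variance (e.g. from a landmark)
        """
        k = var_pred / (var_pred + var_meas)       # gain: trust the less uncertain source
        x_new = x_pred + k * (z_meas - x_pred)     # blend prediction and measurement
        var_new = (1.0 - k) * var_pred             # fused estimate is more certain than either
        return x_new, var_new

    # Odometry predicts 5.0 m (variance 0.5); a landmark observation says 5.4 m (variance 0.1).
    print(fuse_1d(5.0, 0.5, 5.4, 0.1))             # approximately (5.33, 0.083)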
In summary:
× model-based approaches tend to assume global-level information, where all the information is given.
× model-based approaches require large amounts of data.
× motion planning is typically done off-line, before any motion is made, and hence cannot handle dynamic environments.
× most of the sophisticated planning algorithms have yet to be fully realised in real robot systems.
√ in nearly all cases of model-based approaches to navigation, an optimal solution to the goal is provided.
Sensor-based navigation

In this section we describe methods that try to minimise the usage of models for robot navigation. We refer to these approaches as Sensor-based navigation. An essential and successful aspect of sensor-based navigation is the use of local sensor information, a strategy which removes the reliance on a supplied map or model.
Potential field – One technique that has gained popularity is the potential field method for the navigation of mobile robots. The popularity of this technique is due to a number of factors: it can be applied in both on-line and off-line situations. When used on-line this technique provides a reactive-like approach to navigation [Khatib, 1986, Koditschek, 1987, Adams et al., 1990, Arkin et al., 1992, Zelinsky et al., 1993, Haddad et al., 1998] (a minimal sketch is given after the topological-map discussion below). When used off-line, a motion plan can be generated using this technique. A common problem with the on-line potential field method is that local minima can occur. However, the off-line method avoids this problem [Koditschek, 1987, Tilove, 1990, Latombe, 1991].

Topological Map – A good technique which avoids using a detailed model of the environment for navigation is the topological map, in which places are defined and linked together by paths. Topological-style maps can be used instead of a full geometric map of the environment when performing navigation, allowing the robot to gather local information using its own sensors to determine the best way to accomplish its goal [Kuipers and Levitt, 1988, Matarić, 1992b, Horswill, 1994a]. This approach provides minimal information for the navigation task; it involves obtaining a set of reliable landmarks that are distributed throughout the environment, and the robot's task is to locate itself and navigate using these landmarks. Two basic functions need to be addressed by these systems: the robot must be able to recognise places/landmarks and provide a route to each location. In Section 2.3.4 we describe the robot Polly, which uses a topological map of its environment to perform the task of giving a guided tour of the AI lab at MIT. With only a simple representation the system was able to demonstrate robustness in the execution of its task. In addition, the system showed that a topological map can be used in an opportunistic manner: when the system is lost, it is able to relocalise itself by detecting a single landmark. Although topological maps can be more robust than geometric maps for mobile robot navigation, the shortfall of topological-style map navigation is that such maps do not contain enough information for a system to determine optimal paths to a goal; also, new goals or places cannot be inferred from the topological map.
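The on-line use of the potential field method referred to above can be sketched as a simple gradient step: the goal contributes an attractive force and each sensed obstacle a repulsive one, and the robot moves a small step along the resulting vector. The gains and influence radius below are arbitrary, and the example deliberately places an obstacle directly between the robot and its goal, which is exactly the situation in which the local-minimum problem appears.

    import math

    def potential_step(pos, goal, obstacles,
                       k_att=1.0, k_rep=0.5, influence=1.0, step=0.05):
        """One gradient-descent step over attractive plus repulsive potentials."""
        # Attractive force pulls linearly towards the goal.
        fx = k_att * (goal[0] - pos[0])
        fy = k_att * (goal[1] - pos[1])
        # Each obstacle within its influence radius pushes the robot away.
        for ox, oy in obstacles:
            dx, dy = pos[0] - ox, pos[1] - oy
            d = math.hypot(dx, dy)
            if 0.0 < d < influence:
                mag = k_rep * (1.0 / d - 1.0 / influence) / (d * d)
                fx += mag * dx
                fy += mag * dy
        norm = math.hypot(fx, fy) or 1.0
        return pos[0] + step * fx / norm, pos[1] + step * fy / norm

    pos, goal = (0.0, 0.0), (3.0, 0.0)
    obstacles = [(1.5, 0.0)]             # directly between the robot and the goal
    for _ in range(100):
        pos = potential_step(pos, goal, obstacles)
    print(pos)                           # the robot stalls well short of the goal: a local minimum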
Learning places – The work of [Kuipers and Levitt, 1988] and [Matarić, 1992b] constructed topological maps of the entire environment using sonar. The maps were then used to plan paths. The unique feature of these approaches is that a network of distinct topological places was used. However, since both approaches rely on wall following, they are only suitable for learning corridor-like environments. In the work of [Kortenkamp and Weymouth, 1994], a system was built that performs navigation based on a "Cognitive"-like map structure. Their system looks for distinguishable transitions from one space to another. Using these transitions, a network of paths can be linked to form a topological-like structure, allowing the robot to travel from one place to another. The robot performs place-transition recognition using sonar and vision data, giving rise to better accuracy in place recognition.

Purposive Map – The work of [Zelinsky and Kuniyoshi, 1996] proposed the Purposive Map (PM), a navigation control structure which is topological in nature and allows dynamic changes as well as planning. It encompasses a topological-like representation, removing the geometric spatial detail of an environment. The basic structure is represented by connections of landmarks; with each landmark an action is associated (for example, turn left 45 degrees). Connecting these landmarks are behaviours and utilities which are required to reach the next landmark. For example: "Go Forward"; if you see a door, "Go through it"; then turn left 90°; "Go Forward"; if you see a wall, "Follow the Wall" for 10 seconds; and so on. Navigation is performed via an arbiter coordinating and monitoring behaviours with the representation provided by the PM. Navigation of this system was performed with sonar and odometry. For implementation purposes, a PM is simply represented as a sequential list data structure, connecting each of the behaviours as an ordered sequence in the list. Other emphases of this work included escaping from local minima, which occur in the potential field method for mobile robot navigation. In this thesis we make use of the PM for vision-guided mobile robot navigation. In Chapter 5, a PM is used for the purpose of performing Landmark-based Navigation. In Chapter 6, it is used in performing goal-oriented Navigation, where it is able to handle conflicting situations. Furthermore, we take advantage of the qualitative nature of a PM in producing a more intuitive and natural representation of events. Using a PM in this way, a robot is able to communicate to the user how well the robot is progressing in the execution of a particular task,
and what behaviours within the robot were triggered during task execution. In Chapter 7, we present a teleoperation system based on this approach.
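For illustration, the sequential-list encoding of a PM described above might look like the sketch below. The landmark names, behaviours and termination conditions are invented for this example; they are not taken from [Zelinsky and Kuniyoshi, 1996] or from the system built in this thesis.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class PMStep:
        landmark: Optional[str]   # landmark that triggers this step (None = unconditional)
        behaviour: str            # behaviour to run until the termination condition holds
        until: str                # qualitative termination condition

    # A Purposive-Map-style route as an ordered list of qualitative steps.
    purposive_map = [
        PMStep(landmark=None,   behaviour="go forward",      until="door is seen"),
        PMStep(landmark="door", behaviour="go through door", until="door is passed"),
        PMStep(landmark=None,   behaviour="turn left 90",    until="turn complete"),
        PMStep(landmark="wall", behaviour="follow wall",     until="10 seconds elapsed"),
    ]

    def next_step(current: int, seen_landmark: Optional[str]) -> int:
        """Advance along the list when the next step's landmark (if any) is visible."""
        if current + 1 >= len(purposive_map):
            return current
        step = purposive_map[current + 1]
        if step.landmark is None or step.landmark == seen_landmark:
            return current + 1
        return current

    print(purposive_map[next_step(0, "door")].behaviour)   # -> go through door

Note that nothing in the structure is metric: each entry names a landmark, a behaviour and a qualitative condition, which is what makes it straightforward to report progress back to a user in the same terms.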
In summary:
√ less detailed information is required.
√/× consideration of real-world situations must be incorporated into the system in order to make it work, which in turn makes the system more realistic in its capabilities, and robust.
× learning new places, and inferring geometric knowledge, remain difficult.
2.3 Vision-based Navigation
Since this thesis is dedicated to the research of vision-based navigation, it is only appropriate to end the chapter with an outline of some of the systems which provided the inspiration for our research.
2.3.1 SHAKEY
SHAKEY was one of the first major robotics research efforts undertaken by the Artificial Intelligence community in the late sixties [Nilsson, 1984]. It involved a mobile robot developed at the Stanford Research Institute (SRI). SHAKEY was the first mobile robot to navigate using vision. It used a black and white (B&W) video camera as its primary sensor; the main computation, which included image processing and world model matching, was done off-board. It used a planner program, STRIPS (STanford Research Institute Problem Solver), which determined an action sequence for the robot to execute [Fikes and Nilsson, 1971, Fikes et al., 1972]. The STRIPS program was a high-level planner which manipulated symbols: information which had been preprocessed by the perception unit of the system. The environment SHAKEY operated in consisted of specially arranged rooms, with obstacles (large painted blocks and wedges) customised for SHAKEY's perception ability. SHAKEY was programmed to push obstacles/objects out of the way, or push them to
a particular location. The ability to perceive the environment was overly simplified, allowing the system to translate what had been seen into carefully defined symbols for the planner to generate actions. Although SHAKEY was considered successful at performing tasks in a perfectly arranged world, it was never tested in realistic real-world environments. Its success advocated the use of internal world models for the design of robotic systems, an approach which was followed for quite some time. This type of problem is far removed from the real problem of determining the state of the world. This was primarily due to the ignoring of real-world data, and to the system's translation of data into high-level symbols for decision making. This problem was addressed by later systems which do not make such assumptions, such as reactive and behaviour-based systems [Brooks, 1986], using the Physical Grounding Hypothesis, which states that the symbols that a system uses and manipulates should be grounded in the physical world. In Sections 2.3.4 and 2.3.5 we discuss two robots built on the physical grounding hypothesis, Polly and Pebbles. Both of these robots were able to demonstrate robustness and high performance in real-world environments. We also believe that the real world must be taken into account during the development of a real-world robot system, and in a way that minimises artificial modification to the environment. In this thesis, we have elected not to customise our environment in any special way in order to achieve successful mobile robot navigation. In summary, SHAKEY had the following advantages and disadvantages:
× assumed perfect sensing.
× unrealistic definition of the problem, such as the establishment of an artificial and structured environment.
× dynamic environmental changes were not considered, consequently no obstacle avoidance was performed.
√ successful at higher-level planning at the task level.
2.3.2 Stanford Cart
The “Stanford Cart” is one of the best known vision-based mobile robot projects [Moravec, 1983]. The system utilised a video camera mounted on a mobile platform,
the video signal was broadcast to a remote computer, which processed the images and controlled the motion of the robot via radio signals. The system used a planner for obstacle avoidance and path determination. Once images of the complete scene of the environment were received, the system processed them; the processing time was reported to take 1-2 hours. Once the image processing was completed, a planned path was produced which guided the robot around obstacles. Many lessons were learnt from the "Stanford Cart" experience [Moravec, 1983]. Firstly, and quite importantly, the honest reporting of mishaps. Many of the problems encountered were simply due to the length of time the system spent processing information. For example, shadows moved, causing the robot to make errors in its maps. In this case, the tight-loop control and the lighting adaptation presented in this thesis can cater for such shortfalls (see Chapter 4). Moravec's ideas have also been explored by other researchers [Thorpe, 1983], such as the idea of image segmentation using an Interest-Operator. Moravec appears to be the first to propose the use of a vision-based Interest Operator to detect distinctive features in an environment; this thesis also makes use of this idea. Other contributions accredited to Moravec were stereo matching and ranging algorithms. Moravec's key contribution was to show that vision-guided mobile robot navigation was possible. Although this system provided some of the landmark ideas in mobile robot navigation, the work fell short on the basic functioning of the robot itself. The main lesson learnt from the "Stanford Cart" was that tight-loop control between sensing and action is necessary when a large dependency is placed on sensory inputs. In our work, we have ensured such tight control of the actions of the robot in response to sensor inputs, thereby yielding an effective and responsive robot system. In summary, the contribution and drawbacks of this work are:
√ the originator of the Interest-Operator.
√ stereo vision ranging.
√ placed importance on real-world implementation.
× slow in processing.
2.3.3 The INRIA mobile robot
The INRIA (Institut National de Recherche en Informatique et en Automatique) mobile robot is another example of an early system which made extensive use of world models [Ayache, 1991]. This work was based on the use of stereo vision; the researchers also extended their approach to trinocular vision. Navigation was based on the construction of a three-dimensional model of the environment using the vision system. This work presented techniques for generating a 3D representation of the environment, using a number of simultaneous images taken from various viewpoints. Their approach to navigation was to build feature maps using geometric primitives as features. The vision-based reconstruction of the environment led to the building of local and global geometric representations of the environment. The update of the 3D model was based on new observations of known recognisable features that were previously detected. This project also introduced uncertainty estimates for each feature using probability distributions. The control of the robot was based on avoiding known obstacles and on following known paths in the environment. The project invested a great deal in the reconstruction of a detailed representation of the environment. However, it is unclear how this system could handle dynamic situations. We see that the core idea of fully reconstructing the environment into a representation for performing navigation is overkill for most situations, and places an unnecessary burden on the overall system resources. In this thesis, we present a vision system which moves away from this line of thought: without fully reconstructing the environment, we are able to realise effective robotic applications. In summary, the contribution and drawbacks of this work are:
√ full reconstruction of the environment is performed.
× computationally expensive.
× precise calibration is needed for the extraction of features.
√ considers uncertainty in the modelling of the environment.
2.3.4 Polly
Polly was a vision-guided mobile robot that was built to give tours of the 7th floor of the MIT AI Lab [Horswill, 1994a, Horswill, 1994b]. A task-based approach was used in the construction of Polly. Polly's navigation system utilised prior knowledge of its environment to perform its tour of the laboratory. Polly was one of the first highly responsive vision-based mobile robots. Polly used a topological-like map in its navigation, which comprised a set of landmarks that it used to localise itself. The landmarks were a set of individual snapshots taken from particular locations in the office environment. Landmark or place recognition was done by matching every landmark against the live video data, in order to determine where the robot was currently located. Polly used no explicit general obstacle avoidance routine; its avoidance scheme emerged out of the corridor follower and the wall follower. Horswill suggested that this is in effect a local navigation strategy equivalent to the method of artificial potential fields; therefore, the robot could become trapped in local minima. One of the problems of Polly was that at times it mistook shadows for obstacles. This was due to the simple nature of the robot's vision processing system. The processing was based on a simple extraction of the textureless floor in an image, determining the available free space to travel (a minimal sketch of this kind of free-space scan is given after the summary below). This allowed Polly to roam its office environment while avoiding obstacles. Polly demonstrated practically how a mobile robot equipped with vision sensors could perform a meaningful task reliably without any elaborate schemes. In this thesis, we have adapted a similar strategy in determining the available free space for our system to travel. However, we have chosen to develop a more adaptable method that results in a general obstacle avoidance scheme. In summary, the contribution and drawbacks of this work are:
√ highly responsive and robust.
√ among the first to show that simple, well-defined behaviours are able to perform complex tasks effectively.
√ no 3D model of the environment.
√ low storage and computation requirements.
× robot was purposely built for tasks in its own habitat, and could not function well in other environments.
× suffers from the conflict of behaviours problem.
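The textureless-floor free-space idea described above can be sketched as a per-column scan of a greyscale image: starting from the bottom row (nearest the robot), each column is walked upwards until the image stops looking like smooth floor, and the height reached is taken as the free-space depth in that direction. The edge-strength threshold and the toy image are arbitrary choices for the example; Polly's actual processing differed in its details.

    import numpy as np

    def free_space_depths(image, threshold=20):
        """For each column, count how far up from the bottom the floor stays textureless.

        image     : 2-D uint8 greyscale array, bottom rows closest to the robot
        threshold : vertical intensity change treated as 'no longer floor'
        Returns an array with one free-space depth (in pixels) per column.
        """
        rows, cols = image.shape
        depths = np.zeros(cols, dtype=int)
        for c in range(cols):
            depth = 0
            for r in range(rows - 1, 0, -1):                      # walk up from the bottom
                if abs(int(image[r, c]) - int(image[r - 1, c])) > threshold:
                    break                                         # texture found: treat as obstacle
                depth += 1
            depths[c] = depth
        return depths

    # Toy image: uniform floor with a dark block in the upper-right quadrant.
    img = np.full((60, 80), 120, dtype=np.uint8)
    img[:30, 50:70] = 40
    depths = free_space_depths(img)
    print(depths[10], depths[60])   # deep free space on the left, shallower on the right

Columns with small depths indicate nearby obstructions, and the weakness noted above is visible here: a strong shadow edge on the floor would be reported as an obstacle.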
2.3.5 Pebbles
The mobile robot "Pebbles" was developed as an extension of Horswill's original vision system developed for Polly. The aim of the system was to provide obstacle avoidance for mobile robot navigation. The avoidance system was based on the implementation of three vision modules: brightness gradient, RGB (Red, Green, Blue) colour and HSV (Hue, Saturation, Value) colour [Lorigo, 1996, Lorigo et al., 1997]. The three modules processed the same input source independently; the selection of which module's output was used to command the robot was determined using a weighted depth array of free space. The depth array was then used to directly command the robot. The depth of the array determined the robot's speed: the larger the depth of the array, the faster the robot was commanded. The turns were determined by the amount of free space available to the left and right of the robot; the side with the most free space determined the turning angle of the robot (a sketch of this speed/turn rule is given after the summary below). In this thesis, a similar scheme was used to determine our robot's velocity. The system was designed to avoid obstacles while moving in an unstructured environment; basically, the system wanders around its environment avoiding obstacles. Using approaches similar to Horswill's for the detection of obstacles, the system also showed robust performance. Noticeably, the system did not provide any mechanism for performing high-level navigation, such as place-to-place navigation. Like Polly, Pebbles was another example of how simple techniques can produce a robust scheme for general obstacle avoidance in dynamic environments. In this thesis, we also employ such schemes to provide our robot with a general avoidance scheme. In summary, the contribution and drawbacks of this work are:
√ highly responsive and robust obstacle avoidance.
√/× no consideration was made in regard to goal-level navigation, as in Polly.
× depends on an even distribution of texture.
× no high-level navigation capability.
× cannot handle the conflict-in-behaviours problem.
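As an illustration of the free-space-depth control rule described above, the sketch below maps a per-column depth array (such as the one produced in the Polly sketch) to a forward speed and a turn command. The gains and the left/right split are arbitrary choices for the example, not the values used by Pebbles or by the system developed in this thesis.

    import numpy as np

    def command_from_depths(depths, max_speed=0.5, turn_gain=0.004):
        """Map a free-space depth array to a (speed, turn) pair.

        Speed scales with the depth of free space straight ahead; the turn is
        biased towards whichever half of the array (left or right) has more room.
        """
        depths = np.asarray(depths, dtype=float)
        centre = depths[len(depths) // 3 : 2 * len(depths) // 3]
        speed = max_speed * centre.mean() / (depths.max() + 1e-6)   # more room ahead, go faster
        left = depths[: len(depths) // 2]
        right = depths[len(depths) // 2 :]
        turn = turn_gain * (left.mean() - right.mean())             # steer towards the freer side
        return speed, turn

    print(command_from_depths([55, 58, 59, 40, 20, 12, 10, 9]))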
2.3.6 The Hyper Scooter
This system is a vision-based mobile robot designed for human-robot interaction, where a human operator works with the vehicle. The authors refer to their robot as "a teachable vision-based mobile robot" [Shibata et al., 1996]. The interesting aspects of this project are the use of template correlation in real time as the prime vision processing technique (a discussion of the template correlation processing technique is given in Section 4.2), and the use of qualitative-like instructions for mobile robot navigation. The system provides two qualitative commands, "Go toward" and "Turn until you see the target". In this thesis, we extended a number of these qualitative commands to produce a human-robot interface for our robot. Since the Hyper Scooter used a camera mounted high on the scooter, the system could also handle the image size variation problem, i.e. as the scooter gets closer to an object, the apparent size of the object increases. The Hyper Scooter did not consider the condition of lighting changes, and it also did not consider the problem of obstacle avoidance. Although this work was a demonstration of qualitative navigation through the use of high-level human instructions, no further work was done on developing additional high-level instructions. Although it was not explicitly stated by its researchers, the interesting notion of human-robot sensor sharing appears to be intuitively appealing for qualitative-level navigation: allowing the human and the robot to share the same perceptive medium will ensure that the commands to the robot are more realistic and sensible. In this thesis, we extend these ideas in the development of our human-robot interface system. In summary, the contribution and drawbacks of this work are:
√ showed efficient and effective use of template matching.
√ introduced vision-based qualitative navigation.
√ simple user interface which allows the user and robot to share a common view.
× obstacle avoidance was not handled.
× only a small set of vision-based qualitative instructions was implemented.
2.3.7 Hyper Mouse
This robot performed hallway navigation based on a recorded sequence of omni-view snapshots [Matsumoto et al., 1996, Matsumoto et al., 1997]. In contrast to landmark-based mobile robot navigation, where only distinctive features are used as landmarks, Matsumoto's system used a sequence approach, and navigation was done in two stages. In the first stage, a sequence of images was recorded. The recording was done by driving the robot through the hallway collecting images: each recorded image was correlated with the live video, and once the correlation error increased beyond a preset threshold, a new image was recorded (a sketch of this recording rule is given after the summary below). The navigation stage of the system used the recorded snapshots for two purposes: to correct the robot's position error in the corridor, and to localise the robot in the environment. The matching of images was performed on every frame; this determined whether or not the next snapshot in the sequence should be used. Obstacle avoidance was handled simply by using image subtraction to check if an obstacle was blocking the robot's path, in which case the robot simply waited until the way forward was clear before proceeding; no attempt was made to avoid the obstacle. This navigation method depends on the next image of the sequence appearing: if a disturbance occurs while the robot is navigating and the system misses the next expected image, the system comes to a halt. Another problem with such a navigation scheme is the heavy dependence on placing the robot in the correct initial position. The navigation approach used in this system may prove to be infeasible in large environments, due to the large number of images that are required for each mission. Due to the infeasibility of this scheme, the work presented in this thesis avoids such storage problems by using a landmark-based scheme for navigation. In summary, the contribution and drawbacks of this work are:
√ introduced image collection and reuse.
× corridor navigation only.
× no general avoidance scheme was used.
× can get lost if images are not kept in sequence.
√ can replay a mission.
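The snapshot-recording rule described above (store a new reference image whenever the correlation error against the current one grows too large) can be sketched as follows. The mean-absolute-difference error measure and the threshold value are illustrative choices, not the exact measure used by Matsumoto's system.

    import numpy as np

    def record_sequence(frames, threshold=12.0):
        """Keep a new snapshot whenever the live frame drifts too far from the last one.

        frames    : iterable of 2-D uint8 greyscale images captured while driving
        threshold : mean absolute per-pixel difference that triggers a new snapshot
        """
        snapshots = []
        for frame in frames:
            if not snapshots:
                snapshots.append(frame.copy())
                continue
            error = np.mean(np.abs(frame.astype(int) - snapshots[-1].astype(int)))
            if error > threshold:            # the view has changed enough: store a reference
                snapshots.append(frame.copy())
        return snapshots

    # Toy drive: brightness drifts a little each frame, so only every few frames is kept.
    frames = [np.full((48, 64), 100 + 5 * i, dtype=np.uint8) for i in range(10)]
    print(len(record_sequence(frames)))      # fewer snapshots than frames

The storage cost grows with the number of snapshots per route, which is the scaling concern raised above.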
2.3.8 RJ
The aim of the RJ robot system was to build an internal representation of an environment through self-exploration [Taylor and Kriegman, 1998]. The environment was set up with a number of distinctive landmarks (i.e. bar codes). A vision sensor was used for the detection of landmarks, and general robot navigation was performed using proximity sensors for the detection of collisions and to perform boundary following. The focus of this work was on the exploration and the construction of an internal representation of a static office environment; the system was designed for systematic exploration of an environment. The internal representation produced was a graph-like structure, where the nodes are connected via landmarks. A key aspect of this system is that the researchers were able to provide a theoretical study of the performance of their system, allowing further improvement of the algorithms to be made easily, an important property which is highly desirable for the analysis of a mobile robot navigation system. We believe this is an important part of a navigation system; in this thesis we present a theoretical analysis of our system for performing goal-oriented navigation. The RJ system assumes that the robot is able to detect that it has circumnavigated an obstacle, without the need for sophisticated sensing. Such a sensing capability is useful for a robot to possess. However, it is unclear from the description of the work how the detection of circumnavigation can be made possible, since dead reckoning is error-prone. In summary, the contribution and drawbacks of this work are:
√ provided a theoretical study and results for a physical navigation system.
× general obstacle avoidance was not considered, thus restricting the system to work only in a static environment.
2.4 Summary
In this chapter we have reviewed some of the relevant past research that is related to our own work. We have highlighted their contributions while raising the issues of their shortfalls. We saw that earlier mobile robot systems, such as SHAKEY, attempted to solve navigation problems by providing the robot with its own custom environment, avoiding many of the realistic problems that robots need to face. The landmark work of the Stanford Cart went to the effort of processing a complete scene of its environment to determine the best available path, and in more recent work the INRIA mobile robot attempted to solve the problem by fully reconstructing the 3D environment of the robot. In the Hyper Mouse, the system recorded every step taken by the robot in its environment, in order to allow the robot to rerun the same path again. At the other extreme, the systems Polly and Pebbles solve navigation, or a subset of the problem, by providing only the necessary information required for the task. Polly and Pebbles have been shown to perform reliably and effectively in both static and dynamic environments, without the need for overly sophisticated structures. From our discussion, we believe that the behaviour-based approach has significant potential for use in robotic control architectures; the minimal use of data is a highly desirable feature. At a higher level, the Hyper Scooter system provided an interesting approach to interfacing a robot with a user, by providing simple qualitative instructions. The use of these qualitative instructions can provide a user with a natural and easy way of instructing a robot. We believe this is one particular aspect of the human-robot interface that should be explored further. Many of the systems described in this chapter have provided much of the inspiration for our development. In the chapters to follow, we will demonstrate our efforts in resolving some of the identified problems, and drawing on our insights we will present systems that we have constructed to demonstrate the effectiveness of our key ideas.
Chapter 3

Robot System

"Se non è vero, è molto ben trovato." ("If it is not true, it is very well conceived.") — [Anonymous, 1585]
In this chapter we discuss our robot system hardware and its components. The configuration of our experimental setup is shown in Figure 3.1; it consists of three major components: an off-board vision processing unit, a radio communication unit, and a mobile robot. Each of the system components is explained in the following sections. Looking at Figure 3.1, it is clear that our configuration is a remotely based system: the processing of the system is distributed over a number of physical entities. It is similar to the Remote-Brained project of the Inoue-Inaba Laboratory, University of Tokyo [Inaba, 1993, Kagami et al., 1997] and the vision-based reinforcement learning project of the Asada Laboratory, Osaka University [Asada et al., 1995, Uchibe et al., 1996]. These systems provide platforms in which the host processor (e.g. the vision processor) resides away from the robot, and commands to the robot are sent via some form of remote medium such as radio modems. Remotely based systems provide the advantage of not needing to rebuild a robot whenever new hardware becomes available. For example, currently we are using a greyscale vision processor; if a colour vision processor becomes available, a simple connection is all that is required. Another benefit of the Remote-Brained approach is the notion that computation should not be restricted by the physical limitations of a system. However, our choice of a remotely based system has been primarily driven by resources and physical constraints.
Figure 3.1: Overview of our system: The basic information flow begins with the robot transmitting video signals from the on-board CCD camera. A video receiver connected to the vision system is used to receive the video signals for processing. Once processing has been completed the results are then sent to a SUN workstation. The workstation transmits the results back to the robot via a radio modem.
Physically, the robot cannot support the weight of the vision processor and is not able to provide the energy required to power it. The necessity to decouple our system results in noise problems associated with transmission interference which must be catered for. In light of this situation, a distributed architecture was developed to address these problems. Our architecture is made up of three key components: a vision processing framework, an action selector, and an actuation shell. The components are described in Sections 4.1, 5.1.1 and 5.1.3, respectively. Section 3.1 describes the mobile robot that we are using in this research. A description of the communication server is given in Section 3.1.1. The vision processor is presented in Section 3.2. The chapter ends with a summary.
Figure 3.2: Yamabico Mobile Robot
3.1 Yamabico Mobile Robot
The Yamabico mobile robot shown in Figure 3.2 was developed at Tsukuba University [Yuta et al., 1991]. The robot was designed to be a small, self-contained autonomous robot; it has a multi-processor-based architecture. All of the processor modules communicate through the Yamabico-bus, using a Dual-Port-Memory mechanism. The robot has an MC68000-CPU master module, running the MOSRA OS [Yuta et al., 1991]. MOSRA provides the following features: process management, hardware interrupt handling, exception handling, memory management and interprocess communication. A transputer-based (T-805) locomotion module provides all of the motor feedback and control of the robot [Iida and Yuta, 1991]. This locomotion module is designed to follow a given trajectory, using feedback information from the robot's wheel encoders; it operates as a digital PID controller to govern the motion of the robot. An ultrasonic module is also provided on the robot, but it is not used in our experiments. In addition, a radio modem, a small CCD camera and a video transmitter have been included in our system. The modem has a maximum bandwidth of up to 9600 bps. It is used to obtain the results from the vision processor via the communication server. The video transmitter and camera provide video input to the vision system. The video transmitter has a maximum rated broadcast range of up to 1 km.
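As a flavour of the control loop such a locomotion module runs, the sketch below shows a discrete PID controller tracking a wheel-speed setpoint from encoder feedback. The gains, sampling period and the toy wheel model are illustrative values only, not those of the Yamabico's transputer-based module.

    class PID:
        """Discrete PID controller: u = Kp*e + Ki*integral(e) + Kd*de/dt."""

        def __init__(self, kp, ki, kd, dt):
            self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
            self.integral = 0.0
            self.prev_error = 0.0

        def step(self, setpoint, measurement):
            error = setpoint - measurement
            self.integral += error * self.dt
            derivative = (error - self.prev_error) / self.dt
            self.prev_error = error
            return self.kp * error + self.ki * self.integral + self.kd * derivative

    # Track a wheel-speed setpoint of 0.5 m/s with a crude first-order wheel model.
    pid = PID(kp=2.0, ki=1.0, kd=0.05, dt=0.01)
    speed = 0.0
    for _ in range(2000):
        command = pid.step(0.5, speed)
        speed += 0.01 * (command - speed)    # toy plant: speed relaxes towards the command
    print(round(speed, 3))                   # settles close to 0.5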
3.1.1 Communication server
The communication server is a SUN workstation (running the Solaris OS) with a radio modem attached. The communication server was established for a number of reasons: it allows single-point communication to the robot from anywhere on the Ethernet network without needing to be inside the range of the radio modem; a more powerful machine can be used to provide the management if required; and it removes the restriction of having resource-demanding routines residing on the robot, as these routines can be executed elsewhere. All data communication is done using layer-based communication software developed by [Jung and Stanley, 1997], named Radnet. Radnet provides all data packet routing to the robot from the vision system via the communication server. The server manages all network traffic between the vision system and our mobile robot.
For example, once an image frame is processed, a command from the vision system is sent back to the robot, guiding it on its way.
3.2 Vision processor
The vision processor used for this research was developed by [Inoue et al., 1992], and is manufactured by Fujitsu Co., Japan. The processor is designed for template-based tracking via image correlation of a live video stream. The correlation is performed using an on-board video processing chip, which uses a technique called the "Sum of Absolute Differences" (SAD); a detailed description is presented in Chapter 4. The system comes with two processing modules, a video module and a tracking module. The system is designed to support up to five tracking modules simultaneously; currently, only one tracking module is used. The modules are connected via a special vision-bus (a ribbon cable connected to the front of each module). The modules are powered via a VME-bus backplane, together with a Motorola MVME-162 Embedded Controller running the VxWorks operating system.
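For orientation before Chapter 4, the sketch below shows the Sum of Absolute Differences measure on which this kind of correlation hardware is based: a stored template is slid over the live image and the position with the smallest summed absolute difference is taken as the match. This is a plain software illustration of the measure, not a description of the Fujitsu hardware's internals.

    import numpy as np

    def sad_match(image, template):
        """Exhaustive Sum-of-Absolute-Differences template search.

        Returns ((row, col), score) of the best match of template inside image.
        """
        ih, iw = image.shape
        th, tw = template.shape
        best_pos, best_score = (0, 0), float("inf")
        for r in range(ih - th + 1):
            for c in range(iw - tw + 1):
                window = image[r:r + th, c:c + tw].astype(int)
                score = np.abs(window - template.astype(int)).sum()
                if score < best_score:
                    best_pos, best_score = (r, c), score
        return best_pos, best_score

    # Toy example: embed a 16 x 16 patch in a random image and recover its location.
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
    template = image[20:36, 30:46].copy()
    print(sad_match(image, template))       # -> ((20, 30), 0)

A hardware correlator performs essentially this comparison for many templates on every frame of the video stream, which is what makes tracking at 30 Hz feasible.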
3.2.1 Video module
The video module provides two selectable input channels, one output channel and two external sync sources (horizontal and vertical). An NTSC video receiver is connected to one of the input channels on the video module; the receiver is used to receive video signals from the mobile robot. Each received NTSC image is digitised into a 512 × 512 image containing 256 greyscale values, and is stored in one of the three video memories (VRAMs) on the tracking module. The video module is designed to accept incoming video signals at video frame rate (30 Hz). The output channel of the video module outputs an NTSC video stream, which is used to display either the live input or any of the images that are stored in the VRAMs. In addition, a four-bit 512 × 512 overlay memory is available, which allows programs to display text and/or graphics over the output NTSC video signal.
3.2.2 Tracking module
The tracking module manages all of the image tracking in the system, it uses correlation based template tracking. As mentioned earlier, each tracking module has three VRAM units for storage and processing, generally one is used for storing templates and another is used for the live video inputs. A tracking module can track up to 80 image templates that are 16 × 16 pixels in
size, and up to 160 image templates that are 8 × 8 pixels in size. The tracking of
templates is done for each video frame in the 30 Hz video stream. One restriction of this processor is that only one size can be selected for tracking, i.e. either 8 × 8 or
16 × 16. However, a magnification factor compensates for this restriction by providing a greater range of effective template sizes. The magnification factor can be 1, 2, 3 or 4, which allows the template size to be extended to 8 × 8, 16 × 16, 24 × 24, 32 × 32 or 64 × 64. The magnification factor is used to specify the number of pixels that
need to be skipped or ignored during image correlation. The details of the correlation technique used by the vision processor are presented in Chapter 4.
In order not to penalise system performance, all initialisation of the tracking data structures is completed before the start of each input video frame.
3.3 Summary
This chapter provided a description of the three major components of our experimental setup: a Yamabico mobile robot, a communication server and the vision processor. The presentation also included a description of the general flow of information between these components. It should be noted that in our system configuration the vision processor is off-board. The video input is taken on-board the robot and then sent to the vision processor via a remote video link. Data communication from the vision processor to the robot is done via a pair of radio modems.
Chapter 4
Robot Vision

“People asking questions – lost in confusion. Well, I tell them there’s no problem – only solutions.” — [Lennon, 1980]
Vision sensing provides a considerable amount of information about an environment. Vision can simultaneously provide information about the absence or presence of obstacles; distinguishable features in the environment, such as landmarks; as well as data about dynamic changes in the environment. It is an ideal sensor for robot perception, since it provides a tremendous wealth of information. However, obtaining visual features of the environment has not been an easy task, due to the considerable amount of computation required. With the current advances in vision processing technology, faster general processors and more readily available specialised vision hardware, vision-based robot perception is within reach.

This chapter describes our approach to providing real-time processing methods to support vision guided navigation of a mobile robot. Handling environmental changes is an important part of any navigation system, both in terms of perception as well as action. The focus of this chapter is on perception; the problem of action selection is dealt with in the chapters that follow. A set of components have been designed to facilitate perception of an environment, even if changes occur. The design of these components has been done with the aim of providing support for mobile robot navigation in a dynamically changing environment. In this chapter, template-based techniques for searching for free space to allow safe motion of the robot, and for the detection of goals, are described. Techniques to overcome the
limitations and deficiencies of the available technology are also given considerable attention. A mechanism for compensating for the lens distortion of a standard video camera is presented. Fluctuations in lighting give rise to unreliability problems in a template-based navigation system; to combat this problem, a lighting intensity measure was developed to detect changes in the lighting conditions of the environment. Methods for assisting the motion of the robot have also been devised, such as calculating the 3D position in the real world corresponding to a 2D pixel in a CCD camera. To facilitate smooth robot motion we developed a method for determining safe velocities for the robot in the presence of obstacles. In this chapter, we propose a general vision processing framework to support the overall operation of the system. The framework brings together these components and, together with the action selection scheme and the robot's actuation shell, provides the overall control architecture of our system. The interconnections inside our architecture are described in Chapter 5.

We begin with a description of the vision processing framework in Section 4.1; this is followed by a description of its real-time components. A description of the core vision processing element, Template Matching, is presented in Section 4.2. The segmentation of input images into safe movable space by the Free Space Search process is described in Section 4.3. Camera lens distortion compensation is presented in Section 4.4. Lighting Intensity calculation is described in Section 4.5. A comprehensive description of Lighting Adaptation is presented in Section 4.6. A method for capturing the visual attention of the robot by an Interest-Operator is presented in Section 4.7. A process for target identification is presented in Section 4.8. A technique for transforming image coordinates into real world coordinates is discussed in Section 4.9. Once the robot knows where it can move safely, it must then determine at what speed it can safely operate; the process of determining velocity is discussed in Section 4.12. In Section 4.13 a discussion of the performance of the system operating at various processing speeds is presented. Finally, in Section 4.14, a summary of the contributions of this chapter is given.
4.1 Vision processing framework
Our General Open-Loop Framework for vision processing is shown in Figure 4.1. Real-time components have been constructed to satisfy the overall real-time requirements
Figure 4.1: Vision Processing (processing stages: Setup templates, Template correlations, Adjust templates, Video input, Calculate video input intensity, Normalise for camera lens, Normalise for intensity changes, Thresholding, Free space search, Interest operator, Goal detections)
of the system. The framework is made up of specialised processes, each responsible for a particular operation and executing within a tight processing loop. All processing operations revolve around the central template matching processor. They are categorised into three types of processes: General Processing Supports, Feature Detectors and Auxiliary Functions. These processes are briefly explained here, and their full details are covered in the remainder of the chapter.
General Processing Support Units

The General Processing Support units are: Template matching, Normalisation, and Light Intensity Adaptation. These processing units provide the essential support structure for the overall system. The Template matching process determines how closely two images match. The Normalisation process is used to compensate for camera lens distortion. The Light Intensity Adaptation process ensures that the most suitable image templates are used for the current lighting conditions.
Feature Detectors

Feature Detectors are responsible for specialised processing, and they are directly connected to a particular behaviour or set of behaviours. For example, the Free-space-search described in Section 4.3 is connected with the Collision Avoidance and Free-Space-Move behaviours described in Chapter 5. In the case of the Free-Space-Move behaviour, the Free-space-search detector provides information about the availability of free space for robot motion. In the case of the Collision Avoidance behaviour, the Free-space-search detector also provides information regarding potential collisions. Other Feature Detectors we have developed, such as the Soccer ball detector and the Soccer goal detector for specific robot navigation tasks, are described in Chapters 5, 6 and 7.
Auxiliary Functions

The Auxiliary Functions perform the computation required by the visual behaviours. These functions include: determining a safe velocity for robot motion, and image to world coordinate transformations.
Processing loop

In order to keep the processing of video input data at 30 Hz, we have enforced our process execution to stay within a tight processing loop. In doing so, we maximise the utilisation of the vision processor, which in turn provides us with a highly responsive
system. Part of the requirement for each system component is that it must keep within this tight processing loop. The benefit of this is that any execution time that can be saved can be utilised elsewhere in the system. After the system initialisation has been completed, the following steps are applied:
1. Setup the image templates;
2. Start the image correlation of the templates;
3. Perform all general processing, such as adapting to lighting changes and normalising for the camera lens distortion;
4. Execute the feature detectors: Free-space-search, Interest-operator, etc.;
5. Repeat steps 2-4 until process termination.
An important point to note is that the above processing loop executes as an individual process during the vertical blanking period of the video input, which is of the order of 0.33 ms.
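For clarity, a minimal sketch of this processing loop is given below. The function names are illustrative placeholders only; on the real system these steps drive the Fujitsu tracking hardware rather than Python stubs.

# Illustrative sketch of the per-frame processing loop described above.
# All function names are hypothetical placeholders; the real system drives
# the Fujitsu tracking hardware rather than these Python stubs.

def setup_templates():                  # step 1: load and position the image templates
    pass

def correlate_templates():              # step 2: start correlation for the current frame
    return {}

def general_processing(results):        # step 3: lighting adaptation, lens normalisation
    return results

def run_feature_detectors(results):     # step 4: free-space search, interest operator
    return results

def vision_loop(terminated):
    setup_templates()                   # done once, before correlation starts
    while not terminated():             # step 5: repeat steps 2-4 each frame
        raw = correlate_templates()
        normalised = general_processing(raw)
        run_feature_detectors(normalised)

if __name__ == "__main__":
    frames = iter(range(3))             # pretend we process three frames
    vision_loop(lambda: next(frames, None) is None)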
Initialisation

One drawback of our vision processor is that the loading of templates requires a substantial amount of time in relation to the processing time. To overcome this drawback, templates are predefined and installed before the start of template correlation. Our initialisation process takes up to 35 seconds; this includes generating and loading all of the predefined templates. Normally, around 500 templates are stored, occupying around 125K of video memory out of the available maximum of 512K.
4.2 Template matching
The processing of template matching is performed using two FIFO (first-in-first-out) queues within the tracking module of the Fujitsu MEP processor, one for input, the other for output. At the beginning of each frame, data structures specifying where
Figure 4.2: Template Matching
and what part of an image to correlate are placed onto the input-FIFO. As soon as processing is completed, the results are written to the output-FIFO. The results are stored in a data structure which includes correlation values for the matched templates. To maximise the coverage area of a single template we have chosen a template size of 16 × 16, with a magnification factor of 4. As explained
in Chapter 3, the magnification factor determines the sub-sampling factor of each template. Therefore, a magnification factor of 4 will produce a template of size 64×64.
For each template match, a correlation value D is produced ranging from 0 to 65536 (16 × 16 × 256 = 65536), using the technique “Sum of Absolute Differences” given by
Equation (4.1). The correlation values determine how well matching occurred (the
lower the correlation value the better the match). Figure 4.2 shows a screen dump of matching a set of templates of clear floor with a live video frame containing obstacles. The correlation values are shown in the figure as white squares.¹ The size of the squares represents the magnitude of the correlation values. The better the templates match to clear floor, the smaller the squares that are displayed.
D = \sum_{x}^{m} \sum_{y}^{n} \left| \, w(x-m,\, y-n) - f(x,y) \, \right| \qquad (4.1)
¹ In order to maintain real-time performance, the white squares are not always displayed during navigation.
where
m and n are the horizontal and vertical size of the template,
w(x, y) is the greyscale pixel value of the template at the specified coordinates,
f(x, y) is the greyscale pixel value of the live image,
D is the correlation value between reference image w and live image f.
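To make Equation (4.1) concrete, the following plain-Python sketch computes the Sum of Absolute Differences for a single template placed at a given offset in the live image; the hardware evaluates this for every queued template in parallel, and the simple zero-based indexing used here differs slightly from the offset notation of the equation. The pixel values are made up for illustration.

# Sum of Absolute Differences (Equation 4.1) for a single m x n template
# placed at offset (ox, oy) in the live image. Lower D means a better match.

def sad(template, image, ox, oy):
    m = len(template)          # template height
    n = len(template[0])       # template width
    d = 0
    for x in range(m):
        for y in range(n):
            d += abs(template[x][y] - image[ox + x][oy + y])
    return d

# Tiny usage example with 2x2 "images" of greyscale values (0..255).
floor_patch = [[120, 122], [119, 121]]
live_image = [[120, 122, 40], [119, 121, 42], [118, 120, 41]]
print(sad(floor_patch, live_image, 0, 0))   # 0   -> perfect match to clear floor
print(sad(floor_patch, live_image, 0, 1))   # 165 -> poor match (obstacle pixels)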
Figure 4.3: 6 × 8 grid image and the templates used for Free-space-search (one stored floor template per row, Rows 1–6)
4.3 Searching for free space
To reiterate, the Free-space-search detector processes an input image, and provides information concerning the available amount of space that allows our robot to safely move. The operation of the Free-space-search detector is as follows. A correlation is performed on a 6 × 8 grid where each grid cell is 64×64 pixels in size. Each cell
is correlated against a pre-stored set of template images of clear floor. To minimise memory requirements, only one pre-stored image of the floor is used for correlation for each row of the grid, as illustrated by Figure 4.3. Using the correlation values, a normalisation process, described in Section 4.4, is performed to eliminate the effects of lens distortion. A tolerance threshold value T is used for labelling grid cells as floor space, as shown in Equation (4.2). The percentage error Error% for the calculation of the tolerance threshold was determined empirically; it is usually set to 8%. If the correlated value D of the floor is within the threshold tolerance level, the cell is marked as free space, as shown in Equation (4.3). The 6 × 8 grid was chosen for a number of reasons: the upper rows of the image tend not to contain any information concerning the floor, and the viewed distance from the upper row to the robot is sufficiently far away (approx. 2 m) not to be problematic. By removing such unnecessary calculation, further savings in computation are also made. Alternatively, a technique such as optical flow could also be used to detect obstacles [Nakamura and Asada, 1995, Kröse et al., 1997]. However, we elected not to employ such an approach for a number of reasons. Optical flow techniques are usually computationally expensive: even though Nakamura and Asada used the same vision hardware as used in this thesis, their system still required substantial additional processing for optical-flow-based obstacle avoidance to function properly. Additionally, while computing optical flow the Nakamura and Asada system could not perform other types of processing, such as target tracking; our choice of processing allows us to perform these and other functions simultaneously. Finally, flow information is only available while the robot or obstacles are in motion, so the robot or the obstacle must be moving for the system to avoid it. An undesirable situation occurs when the robot is facing a wall: with no relative motion there is no flow, and the resulting response is a collision course.
T = \mathit{Max}_{correlation} \times (1 - \mathit{Error}\%) \qquad (4.2)

\forall (x, y) : \; \mathrm{cell}(x, y) =
\begin{cases}
\text{Free} & \text{if } D(x, y) - T \le 0 \\
\text{Non-free} & \text{if } D(x, y) - T > 0
\end{cases} \qquad (4.3)
where
∀(x, y) signifies all grid cells,
D(x, y) is the final correlation value of a template match at grid cell (x, y),
Max_correlation is the highest possible correlated value (in the 16 × 16 case the maximum correlation = 65535),
Error% is the maximum correlation error which the system can tolerate,
T is the threshold tolerance level of template correlation matching.
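The sketch below illustrates Equations (4.2) and (4.3) as reconstructed above: the tolerance threshold is derived from the maximum correlation value and the 8% error figure, and each normalised grid correlation is then labelled Free or Non-free. The grid values are invented for illustration.

# Labelling the 6x8 grid of normalised floor correlations as free / non-free
# space, following Equations (4.2) and (4.3) as reconstructed above.

MAX_CORRELATION = 65535   # highest possible 16x16 SAD value (thesis value)
ERROR_PERCENT = 0.08      # empirically chosen tolerance, usually 8%

def tolerance_threshold(max_corr=MAX_CORRELATION, error=ERROR_PERCENT):
    return max_corr * (1.0 - error)                     # Equation (4.2)

def label_free_space(D):
    T = tolerance_threshold()
    # Equation (4.3): a cell is free space when D(x, y) - T <= 0.
    return [["Free" if d - T <= 0 else "Non-free" for d in row] for row in D]

grid = [[1200, 900, 64000], [800, 700, 62000]]          # illustrative values
print(label_free_space(grid))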
4.4 Compensating for camera lens distortion
To reduce the memory requirements of the templates and also to maximise the quality of the templates, all templates are acquired from the centre of a video image. One grid cell of the floor is stored for each row of templates to be matched. Each cell is taken from the centre of the image of a clear floor. We found that the correlation
between the centre and the outer edge of the image can vary significantly due to photometric distortion – clearer images are found toward the centre of the lens. In order to compensate for the camera lens distortion, we determined that a polynomial relationship exists between the correlated values. From this relationship, compensation factors for the distortion in the lens were derived. For implementation purposes, the discrete form shown in Equation (4.4) was used to perform the normalisation.
Figure 4.4: a) Floor, b) Correlation plot of the floor, c) Normalised correlation plot

At each grid location, an associated template match error value was computed. This computation is based on the error in correlation across each cell matched with a pre-stored image of the floor. These error values were then applied to each of the correlated values of each template match, using Equation (4.5). Figure 4.4a shows an image of clear floor taken from the CCD camera mounted on top of the robot. Figure 4.4b shows a plot of the raw correlation values of this image. Figure 4.4c shows a plot after normalisation. The confidence of each template match is indicated by the correlated value at each location of the image: the lower the value, the higher the confidence.
E = \begin{bmatrix}
e_{(0,0)} & e_{(0,1)} & \cdots & e_{(0,j-1)} \\
\vdots & \vdots & \ddots & \vdots \\
e_{(i-1,0)} & \cdots & \cdots & e_{(i-1,j-1)}
\end{bmatrix} \qquad (4.4)

\forall (x, y) : \; D_{new}(x, y) = D(x, y) \times e_{(x,y)} \qquad (4.5)
where
in our case i = 6 and j = 8,
e_(x,y) is the compensation factor at grid location (x, y) of an image (see Appendix A.1 for the exact data values that were used),
D_new(x, y) is the distortion-compensated correlation value at image location (x, y).
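A short sketch of the compensation step of Equations (4.4) and (4.5) follows. The 6 × 8 factor matrix is filled with a constant placeholder here; the measured factors actually used are those listed in Appendix A.1.

# Applying the per-cell lens compensation factors (Equations 4.4 and 4.5).
# E is the 6x8 matrix of compensation factors e(x, y); the real values were
# measured empirically (Appendix A.1). A constant 1.0 is used here only as
# a placeholder.

ROWS, COLS = 6, 8
E = [[1.0 for _ in range(COLS)] for _ in range(ROWS)]   # placeholder factors

def normalise_for_lens(D, E=E):
    """Return D_new(x, y) = D(x, y) * e(x, y) for every grid cell."""
    return [[D[x][y] * E[x][y] for y in range(len(D[x]))] for x in range(len(D))]

grid = [[1000] * COLS for _ in range(ROWS)]             # illustrative raw values
print(normalise_for_lens(grid)[0][:3])                  # first three compensated cells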
4.5 Calculating the Intensity of Lighting
An important measure of the environment for a vision sensor is a measure of the brightness within the environment. We use the vision processor to determine the average intensity over a given region of the video stream, allowing us to perform the calculation of lighting intensity at video frame rate. This gives an indication of the lighting changes, Ī, in the environment, as shown in Equation (4.6). Lighting changes are measured by altering the template matching correlation to match against a zero-valued template, as shown in Equation (4.6). This allows us to calculate the average lighting intensity over a given region. This measure is utilised by our lighting adaptation scheme and an alternate template selection scheme.
\bar{I} = \frac{\sum_{x}^{m} \sum_{y}^{n} | f(x, y) - 0 |}{m \times n} \qquad (4.6)
where
Ī is the calculated intensity value,
f(x, y) is the greyscale pixel value in the live image frame,
m and n are the horizontal and vertical sizes of the template (in our case m = 16, n = 16).
4.6 Lighting Adaptation
A scheme for combating the problem of subtle lighting changes has been incorporated into our system. Other researchers have experienced lighting problems while using vision sensors [Moravec, 1981, Horswill, 1994a, Lorigo, 1996]. Features such as an obstacle's shadow can easily be misinterpreted as being part of the obstacle. We experienced similar problems during the course of our research. In our case we were fortunate — or unfortunate enough, whichever the case may be — to have relocated our robot to three different laboratories. After relocation, various lighting adaptation approaches were trialled. Two schemes are described in this section. Our final approach allows the system to dynamically adapt to various lighting conditions in real-time.
4.6.1 Lighting Adaptation — without accounting for texture
The first scheme we developed was based on using the average intensity of the stored template. An error factor ∆Ī(x, y) is measured between the expected average intensity of the stored template, w̄(x, y), and the current sensed region, Ī(x, y), as described by Equation (4.7). This factor is then used to adjust the correlated values determined by the template matching process. The adjustment is made according to a sensitivity range, which is used to restrict the level of adaptation; from our own experience, the sensitivity range was set to 80. If the error ∆Ī(x, y) is too large the correlated value D(x, y) remains unchanged; otherwise, if the error is within range, the correlated value is adjusted to Da(x, y) using Equation (4.9). The advantage of this scheme is that it is computationally cheap, requires no additional storage, and the effects of the adaptation are instantly available. However, the drawback of this scheme is that texture information is discarded. This is a reasonable assumption for a continuous image such as carpeted floor, but not for images such as goal targets or landmarks. To overcome this deficiency, we derived a scheme which accounts for texture information; the improved scheme is described in the next section. After describing the two schemes, a comparison is presented in Section 4.6.4.
\Delta\bar{I}(x, y) = \bar{w}(x, y) - \bar{I}(x, y) \qquad (4.7)

\eta(x, y) =
\begin{cases}
0 & \text{if } | \Delta\bar{I}(x, y) | > \mathit{Sensitivity} \\
\Delta\bar{I}(x, y) & \text{if } | \Delta\bar{I}(x, y) | \le \mathit{Sensitivity}
\end{cases} \qquad (4.8)

\forall (x, y) : \; D_a(x, y) = D(x, y) - D(x, y) \times \frac{\eta(x, y)}{\bar{w}} \qquad (4.9)
where
∆Ī(x, y) is the intensity error between the expected average image intensity w̄(x, y) and the measured average intensity Ī(x, y),
η(x, y) determines whether an adjustment is needed, and by how much,
Da(x, y) is the newly adjusted correlation value at image location (x, y),
Sensitivity is the lighting intensity tolerance level of the system; it is usually set to 80.
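The first scheme amounts to only a few operations per template match; the sketch below follows Equations (4.7) to (4.9), with the sensitivity of 80 taken from the text and the remaining input values invented for illustration.

# Intensity-only lighting adaptation (Equations 4.7 - 4.9). w_bar is the
# expected average intensity of the stored template, i_bar the measured
# average intensity of the corresponding image region, and d the raw
# correlation value to be adjusted.

SENSITIVITY = 80   # lighting tolerance used in the thesis

def adapt_correlation(d, w_bar, i_bar, sensitivity=SENSITIVITY):
    delta = w_bar - i_bar                                 # Equation (4.7)
    eta = 0 if abs(delta) > sensitivity else delta        # Equation (4.8)
    return d - d * eta / w_bar                            # Equation (4.9)

# Illustrative use: a slightly darker scene lowers the correlation error.
print(adapt_correlation(d=9000, w_bar=170, i_bar=150))    # about 7941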
4.6.2 Lighting Adaptation — accounting for texture
Our second adaptation scheme is based on altering the templates to be matched in the next frame using the intensity of the current matching region in the image, instead of altering the correlated value of a template match, as was done in the last scheme. From our experience we noted that the “Sum of Absolute Differences” (SAD) does not cater for subtle image changes very well. As this is the technique employed by our vision processor, a new technique that could work with our existing hardware was needed. This ensured that the performance of our system was not compromised.
Derivation

Our own experience and other substantial studies of correlation techniques have shown that the “Zero Mean Sum of Absolute Differences” (ZMSAD) has the potential to handle intensity changes between two images, with a substantial reduction in computational effort in comparison to the “Zero Mean Normalised Cross Correlation” (ZMNCC) [Aschwanden and Guggenbühl, 1993]. We propose changes to the SAD correlation function, which involve parameterising the averages of the two images being correlated. By exploiting the difference between the averages of the two images being matched, a parameter δ can be derived, as shown in Equation (4.10). We propose a new correlation function Dδ, given by Equation (4.11). Our new correlation function does not suffer the side-effect of equating a good match from two images of constant texture; i.e. a correlation between a purely black and a purely white image will not produce a good match. Other techniques such as the ZMSAD and ZMNCC methods cannot overcome such situations.

\delta =
\begin{cases}
0 & \text{if } \bar{w} - \bar{f} > \mathit{Sensitivity} \\
\bar{w} - \bar{f} & \text{if } \bar{w} - \bar{f} \le \mathit{Sensitivity}
\end{cases} \qquad (4.10)
D_{\delta} = \sum_{x}^{m} \sum_{y}^{n} \left| \bigl( f(x-m,\, y-n) + \delta \bigr) - w(x, y) \right| \qquad (4.11)
where
δ is the adjustment factor for a given template,
Dδ is our newly derived correlation function,
f is the current image to be matched,
w is the stored image template,
f̄ is the measured intensity of the current image to be matched,
w̄ is the measured intensity of the stored image template.
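A sketch of the parameterised correlation of Equations (4.10) and (4.11): the offset δ computed from the two image averages is added to the live image region before an otherwise ordinary SAD. The small arrays are illustrative only.

# Parameterised SAD (Equations 4.10 and 4.11): the difference between the
# average intensities of the stored template w and the live region f is
# added to f before the absolute differences are summed.

SENSITIVITY = 80

def mean(img):
    return sum(sum(row) for row in img) / (len(img) * len(img[0]))

def delta(w, f, sensitivity=SENSITIVITY):
    diff = mean(w) - mean(f)                      # Equation (4.10)
    return 0 if diff > sensitivity else diff

def d_delta(w, f):
    off = delta(w, f)                             # Equation (4.11)
    return sum(abs((f[x][y] + off) - w[x][y])
               for x in range(len(w)) for y in range(len(w[0])))

# A uniformly darkened patch now correlates well against the stored template.
w = [[100, 110], [120, 130]]
f = [[80, 90], [100, 110]]                        # same texture, 20 levels darker
print(d_delta(w, f))                              # 0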
4.6.3 Implementation
Our implementation takes advantage of the insights gained from this derivation. Using the newly derived correlation function we have implemented a more robust template matching system. The vision processor we are using has strongly influenced the outcome of our final implementation. The method we have developed makes no compromises to the real-time performance of the overall system. The adaptation scheme we developed can be divided into two stages: initialisation and execution.
Initialisation
The initialisation of the new correlation process is performed during the initial stage of the vision processing, i.e. before image correlation begins. Using the sensitivity range, a set of templates is generated. The template generation is done using the formula shown in Equation (4.12). The whole process involves adjusting all pixel values within a template by the difference between the template's average intensity w̄ and each value in the sensitivity range, thus producing a set of additional templates for each template of the floor. All additional templates are associated with a corresponding intensity value.
\forall t : \; [\, \forall (m, n) : \; w(m, n)_t = w(m, n)_{\bar{w}} + (\bar{w} - t) \,] \qquad (4.12)
where
t signifies the new template intensity value (0…255),
m, n is the size of each template,
w̄ is the average intensity of the stored template,
w(m, n)_t is the new template with average intensity value t,
w(m, n)_w̄ is the stored reference template, based on the average intensity w̄.
Figure 4.5: Adapting to lighting changes: a) before adaptation, b) after adaptation

Execution

The execution process involves the selection of the most suitable template to be used in the next video frame. This decision is made as follows: the current image intensity is calculated using Equation (4.6), and this intensity value is used to determine whether the difference from the stored template's intensity is within the sensitivity range. If it is within range, then a new template w is selected as per Equation (4.13).
w =
\begin{cases}
w_{\bar{w}} & \text{if } | \bar{w} - \bar{I} | > \mathit{Sensitivity} \\
w_{\bar{I}} & \text{if } | \bar{w} - \bar{I} | \le \mathit{Sensitivity}
\end{cases} \qquad (4.13)
where
w is the newly selected template,
w̄ is the average intensity of the stored reference template,
Ī is the current measured average intensity,
w_w̄ is the stored reference template, based on its average intensity w̄,
w_Ī is the newly selected template based on Ī,
Sensitivity is the lighting intensity tolerance level of the system; it is usually set to 80.

Figure 4.5 shows an example of our system adapting to subtle light changes in the environment. The size of the black squares indicates the level of correlation; after adaptation the correlation improves markedly.
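To make the initialisation and execution stages concrete, the sketch below pre-generates a bank of intensity-shifted templates in the spirit of Equation (4.12) and selects among them as in Equation (4.13). The shift direction (the template stored under intensity t is given average intensity t), the one-grey-level step and the 8-bit clamping are assumptions made for this illustration; the thesis used 51 associated templates per reference template.

# Initialisation: build a bank of intensity-shifted copies of a stored
# reference template. Execution: pick the template whose recorded intensity
# matches the measured region intensity, falling back to the reference
# template when the difference exceeds the sensitivity range.

SENSITIVITY = 80

def mean(img):
    return sum(sum(r) for r in img) / (len(img) * len(img[0]))

def clamp(v):                                   # greyscale values stay in 0..255
    return max(0, min(255, int(round(v))))

def build_bank(reference):
    w_bar = mean(reference)
    lo, hi = int(w_bar - SENSITIVITY), int(w_bar + SENSITIVITY)
    return {t: [[clamp(p + (t - w_bar)) for p in row] for row in reference]
            for t in range(lo, hi + 1)}

def select_template(bank, reference, measured_intensity):
    w_bar = mean(reference)
    if abs(w_bar - measured_intensity) > SENSITIVITY:     # Equation (4.13)
        return reference                                  # keep stored template
    return bank[clamp(measured_intensity)]                # intensity-matched copy

reference = [[100, 110], [120, 130]]
bank = build_bank(reference)
print(select_template(bank, reference, measured_intensity=95)[0])   # [80, 90]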
Changes                               No change    Darker              Lighter
Average intensity (0..255)            174 (w̄)      149 (Ī)             198 (Ī)
Original correlation function (SAD)   0            740000 (9.80%)      739518 (9.79%)
Texture not accounted for             0            633678 (8.39%)      637516 (8.45%)
Texture accounted for                 0            0 (0%)              482 (6.39 × 10⁻³ %)

Table 4.1: Comparison of lighting adaptation schemes for low texture content templates (image size 148 × 200)
4.6.4 A Comparison
Now that we have presented our two lighting adaptation schemes, we will show how they differ with a comparative study. Tables 4.1 and 4.2 present the results of comparing the two schemes. Table 4.1 shows data associated with matching templates that contain little texture information, e.g. the carpet of our laboratory. Table 4.2 shows a comparison of the schemes applied to templates that contain substantial texture information, images of my daughter, Madeline. These tables show the results of the correlation techniques applied to three identical images which have small changes in lighting. The first row of the tables shows three images: the first is the original image with no changes, the second is a darker version of the original image, and the last is a lighter version of the original image. The last three rows tabulate the correlation results using the three different
Changes                               No change    Darker              Lighter
Average intensity (0..255)            74 (w̄)       51 (Ī)              99 (Ī)
Original correlation function (SAD)   0            677291 (8.97%)      739998 (9.80%)
Texture not accounted for             0            466782 (6.18%)      489999 (6.49%)
Texture accounted for                 0            96515 (1.28%)       2 (2.65 × 10⁻⁵ %)

Table 4.2: Comparison of lighting adaptation schemes for high texture content templates (image size 148 × 200)
correlation functions of our discussion. The first function is the original correlation function, which is implemented in our hardware vision processor; the second-last row shows the correlation results of the correlation function which does not account for texture; and the last row shows the correlation results of the function which does account for texture. Both tables show promising results for the latter scheme, which uses texture information for the adjustment to lighting changes. The results show that, in order to gain a substantial increase in template matching accuracy, texture information must be taken into account. Both studies show that the texture-based scheme resulted in very low correlation errors while matching against lighter and darker images. The non-zero correlation errors are due to the physical boundary of the greyscale image, in which values can only be between 0 and 255. In our implementation, the scheme accounting for texture requires substantial amounts of storage: for each template, 51 associated templates were used. In contrast, the scheme that ignored texture had the advantage of cheap computation, and required
no additional storage. This could be useful when trading off between performance and storage capacity in a system. For example, template matching of free-space carpet areas could be performed with texture information discarded, while landmark/target matching should be done taking texture information into account.
4.7 The Interest-Operator
An Interest-Operator was introduced mainly for two reasons: firstly, to increase the efficiency of any goal-oriented operation, by reducing the search space for goals; secondly, to provide a feature detector which doesn't use an explicit model. A similar idea of an Interest-Operator was first used on the Stanford Cart [Moravec, 1981], and later applied to the CMU Neptune rover [Moravec, 1983]. Our operator is similar to the one proposed by [Thorpe, 1983]; it also uses a Sobel-gradient-like operator [Gonzalez and Woods, 1992]. The major difference is that our operator is not applied to the image pixels directly. We have chosen the zero-crossing method — which is used in standard edge-detecting image processing techniques — as our interest-operator, defined by Equation (4.14). The operator is applied over all of the correlated values D of the floor image, using Equation (4.15). For example, when the image of a soccer ball shown in Figure 4.6a is processed using this method, it produces the correlation values shown in Figure 4.6b. Normalisation is then performed on the correlation values using Equation (4.5), as shown in Figure 4.6c. The final step involves applying our Interest-Operator; its effect is shown in Figure 4.6d. The Interest-Operator provides us with a single peak for a possibly interesting object, which can be easily identified. An example application of such model-free usage is the vacuuming experiment described in Chapter 5. A model-free vacuuming system is necessary since it is infeasible to model all the various possible configurations of litter on the floor. Another utilisation of this operator was for an experimental prototype of a soccer playing robot, which is also presented in Chapter 5.
\begin{bmatrix}
\psi_{(-1,-1)} & \psi_{(0,-1)} & \psi_{(1,-1)} \\
\psi_{(-1,0)} & \psi_{(0,0)} & \psi_{(1,0)} \\
\psi_{(-1,1)} & \psi_{(0,1)} & \psi_{(1,1)}
\end{bmatrix}
=
\begin{bmatrix}
-1 & -1 & -1 \\
-1 & 8 & -1 \\
-1 & -1 & -1
\end{bmatrix} \qquad (4.14)

R(x, y) = \sum_{i=-1}^{1} \sum_{j=-1}^{1} \psi_{(i,j)} \, D(x+i,\, y+j) \qquad (4.15)
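The following sketch applies the kernel of Equation (4.14) to a grid of correlation values as in Equation (4.15). Border cells are skipped for simplicity, and the example grid is invented: a single cell with a much larger correlation error than its neighbours (such as a ball on clear floor) produces a strong positive peak.

# Applying the 3x3 interest-operator kernel of Equation (4.14) to the grid
# of normalised correlation values D (Equation 4.15).

PSI = [[-1, -1, -1],
       [-1,  8, -1],
       [-1, -1, -1]]

def interest_operator(D):
    rows, cols = len(D), len(D[0])
    R = [[0] * cols for _ in range(rows)]
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            R[x][y] = sum(PSI[i + 1][j + 1] * D[x + i][y + j]
                          for i in (-1, 0, 1) for j in (-1, 0, 1))
    return R

D = [[100, 100, 100, 100],
     [100, 100, 9000, 100],
     [100, 100, 100, 100]]
print(interest_operator(D)[1][2])   # 71200: a strong peak at the "ball" cell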
Figure 4.6: Interest-Operator: a) Soccer ball, b) Correlated values, c) Normalised, d) Segmentation using the Interest Operator
4.8 Recognition/Reassurance
For goal-oriented image processing, we must ensure that the current goal can be identified. We have decided not to focus on the image recognition problem. To adhere to the model-free approach, we have chosen to use an appearance-based approach to recognise goals. This approach has been used by [Horswill, 1994b, Matsumoto et al., 1997] to recognise landmarks during navigation experiments. In both of these approaches the entire reference image is stored, and during mission execution the reference image is correlated against the live video images; this can be computationally expensive. In another approach, reported by [Nakamura and Asada, 1995], only the required target image, and additional resolution changes (by scaling) of the target image, are stored. A search routine over the entire image is only applied when the target is lost. Although this approach attempts to reduce the computational load, the search routine degrades the system performance during normal operation – when
the target is not present. Nakamura and Asada showed that storing and matching different resolutions of a target does provide a better and more robust system. However, simply using various scales of the same image will not account for the perspective changes due to the viewing angle of the robot. In our approach we only store the regions of the image that are identified by the Interest-Operator. Initially the robot is taught a goal by placing the robot in a position from which it is likely to approach the goal during a mission. The human operator selects the goal, and then the robot's visual servoing behaviour, based on the Interest-Operator and described in Chapter 5, servos on the target. As the robot approaches the target, it monitors the correlation values of the initially selected template. Once the correlation values exceed a preset threshold, another template is acquired and stored for later use. The same scheme could also be applied to the automatic acquisition of landmarks. Figure 4.7 shows the correlation values and the acquisition of new templates of a goal target. During mission execution, when the robot is searching for a target, it tries to match the stored target templates against the live video. The detection of a search area in which to begin matching is also determined by the Interest-Operator. The advantage of using the Interest-Operator during matching is that constant monitoring or searching for a desired goal is not necessary. That is, the Interest-Operator only draws the robot's attention if a potential target exists; thus, no unnecessary computation is performed, unlike Horswill's Polly system, where matching of landmarks is performed in every cycle of the processing loop [Horswill, 1994b].
4.9 Transforming Image Coordinates to World Coordinates
After determining where in the video image we would like the robot to move, we need to transpose this location in the image to real world coordinates. Due to the fixed configuration of the current camera system, minimal visual coverage is provided to the robot, as shown in Figure 4.8. By exploiting this physically grounded constraint we derived an image pixel to floor coordinate transformation function. Although in most cases it is not necessary – as with most Behaviour-Based systems – to measure the exact real-world location of a particular point, we found it useful in determining the spatial separation (left and right side) of the robot.
Figure 4.7: Template acquisition (correlation value against the initial template, plotted against the distance from the target, farthest to closest): The initial template was selected at interval 0; then, as the robot approached the selected target, a new template was selected (at interval 3). This selection was made when the preset threshold value was exceeded (in this case it was set at 8000).
In addition, it is used as an indicator for determining passable space (Section 4.11). The coordinate transformation is calculated by measuring four distances, shown in Figure 4.9. The first distance is the height of the camera above the floor, height_y. The second distance is the nearest visible point of the camera, blind_y. The third distance is the furthest visible point at the top of the camera view, blind_y plus length_y. The fourth distance is the furthest point to the left or right of the camera view, length_x. With these measurements we can calculate the three transformation angles, using Equations (4.16), (4.17) and (4.18). Using these angles we can determine the angular ratio in both the x and the y directions. The angular ratios are used to calculate the final transformations. Figure 4.9a shows the elevation view of the robot; Figure 4.9b shows the plan view.
Figure 4.8: Robot’s field of view from onboard CCD camera
\alpha = \tan^{-1}\left( \frac{height_y}{blind_y} \right) \qquad (4.16)

\theta = \tan^{-1}\left( \frac{height_y}{blind_y + length_y} \right) \qquad (4.17)

\beta = \tan^{-1}\left( \frac{blind_y + length_y}{length_x} \right) \qquad (4.18)
Equation (4.19) calculates the transformation of the y-th pixel towards the horizon of the floor image.
y = \frac{height_y}{\tan\theta + \frac{pixel_y\,(\alpha - \theta)}{screen_y}} + \frac{robot_y}{2} + blind_y \qquad (4.19)
where
pixel_y is the y-th pixel of the screen,
robot_y is the length of the robot in the y direction.

Equation (4.20) calculates the transformation of the x-th pixel toward the horizon of the floor image.
x = \tan\left( \beta \left( 1 - \frac{2\,pixel_x}{screen_x} \right) \right) \times y \qquad (4.20)
where pixel_x is the x-th pixel of the screen.

Figure 4.9: Camera configuration: a) Elevation view, b) Plan view

From Equations (4.19) and (4.20), we can calculate a trajectory for the robot to travel to, corresponding with a point in the CCD image, as shown in Equation (4.21).
\gamma = \tan^{-1}\left( \frac{x}{y} \right) \qquad (4.21)
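The sketch below strings Equations (4.16) to (4.21) together as they are printed above. The camera measurements and the screen size are placeholder values rather than the calibrated ones used on the robot, so the printed numbers only demonstrate the plumbing of the transform.

# Image-pixel to floor-coordinate transform following Equations (4.16)-(4.21)
# as given above. The camera measurements and screen size are placeholders,
# not the values used on the actual robot.

import math

HEIGHT_Y = 0.80    # camera height above the floor (m)        -- assumed
BLIND_Y = 0.35     # nearest visible ground point (m)         -- from Section 4.10
LENGTH_Y = 1.65    # depth of the visible floor patch (m)     -- assumed
LENGTH_X = 1.00    # lateral extent of the view (m)           -- assumed
ROBOT_Y = 0.50     # robot length (m)                         -- assumed
SCREEN_X, SCREEN_Y = 448, 480

ALPHA = math.atan(HEIGHT_Y / BLIND_Y)                    # Eq. (4.16)
THETA = math.atan(HEIGHT_Y / (BLIND_Y + LENGTH_Y))       # Eq. (4.17)
BETA = math.atan((BLIND_Y + LENGTH_Y) / LENGTH_X)        # Eq. (4.18)

def pixel_to_world(pixel_x, pixel_y):
    y = HEIGHT_Y / (math.tan(THETA) + pixel_y * (ALPHA - THETA) / SCREEN_Y) \
        + ROBOT_Y / 2 + BLIND_Y                          # Eq. (4.19)
    x = math.tan(BETA * (1 - 2 * pixel_x / SCREEN_X)) * y   # Eq. (4.20)
    gamma = math.atan2(x, y)                             # Eq. (4.21)
    return x, y, gamma

print(pixel_to_world(224, 240))   # a point near the image centre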
4.10 Blind spot
With the current system configuration, a 0.35 m dead zone exists below the camera which cannot be observed by the robot. A secondary sensor, such as an ultrasonic, infrared or bumper sensor, could compensate for such a dead zone. However, our robot navigation system is quite robust, and the need for such a secondary sensor is minimal. In this research we rely solely on visual sensing for normal operation of the robot.
4.11 Passable space
Passable free space is defined as the available free space immediately in front of the robot which allows it to move forward without collision. The passable free space can be calculated from the Image Coordinates to World Coordinates transform described in Section 4.9, and the physical width of the robot. Figure 4.10 shows an example of a non-passable situation. In this instance, the inner four cells are used to test for passable free space. This method is used throughout our system for collision detection.

Figure 4.10: Non-passable space
4.12 Setting the robot's velocity
The velocity of the robot is calculated using the amount of free space that is available to the robot. The size of free space is determined by the Free-space-search operation, and is calculated from the ratio of the maximum horizontal and vertical free space available in its field of view. The calculation is shown in Equations (4.22) and (4.23).
topspeed = maxspeed \times \frac{y}{max_y} \qquad (4.22)

velocity = topspeed - \frac{topspeed \times x^2 + x}{max_x \times max_x} \qquad (4.23)
where
maxspeed is the maximum physical velocity of the robot (in our case the maximum velocity is set to 600 mm/sec),
max_x is the maximum image pixel in the x direction (we used 448),
max_y is the maximum image pixel in the y direction (we used 480),
topspeed is the upper velocity at which the robot may travel,
velocity is the calculated velocity for the robot to use.

The overall effect of determining the robot's velocity is shown in Figure 4.11. The curve is a plot of pixel (x, y) against the robot's velocity.
Figure 4.11: Robot velocity curve

The figure shows that as the amount of free space increases, the robot travels faster. When the robot needs to move around an obstacle, it slows down, as seen in Figure 4.12. The robot stops if there is no room to move.
Figure 4.12: Moving around an obstacle (velocity = 208 mm/sec and velocity = 159 mm/sec)
4.13 Speed of processing and Noise filtering
To gain an indication of the processing requirements of our system, we collected a large set of experimental data. The results are shown in Figure 4.14. Our interest has
Figure 4.13: Filtering level (number of commands filtered versus the interval of command transmission, in Hz)
centred on finding a maximum reliable velocity for the general running of the robot, as well as determining the minimum processing requirements to operate the robot, thereby allowing us to choose a set of parameters most suitable for the general operation of our system. An indication of a reliable speed at which to operate the robot is useful, as it prevents us from over-specifying the operation of our system, thus minimising the chance of mishaps such as unnecessary collisions caused by travelling too fast. We found that a reliable velocity is also related to the physical locomotion controller of our robot; in our case the locomotion system of the Yamabico robot had noticeable motion stability problems when it exceeded velocities of 800 mm/s. Noise filtering was used to overcome the problems caused by the noisy transmission of the video signals to the vision system. The filter performs sanity checking on all the commands that are sent to the robot; this checking is done via a "Winner Take All" scheme. This scheme has been incorporated into our overall system architecture, and is discussed further in Chapter 5. The basic objective of this scheme is to ensure that the robot does not get into states such as start-stop motion between consecutive video frames. The level of filtering of commands required for the robot to operate smoothly was studied. Command filtering is the number of commands that
Figure 4.14: Speed of processing (robot velocity versus the interval of command transmission, in Hz)
need to be processed before a decision is made. Different levels of filtering for our system are shown in Figure 4.13. This figure shows that if we were to send commands at an interval of 1 Hz, all 30 output commands are filtered; however, if we were to send commands at the rate of 30 Hz then no commands are filtered. The results shown in Figure 4.14 show that sending commands at the rate of 30 Hz can be problematic, as some of the robot commands were erroneous due to sensing errors arising from the physical nature of our setup. Examples of such sensing errors are illustrated in Figure 4.15. Here we see a sequence of 12 images taken during a robot mission; frames 4, 5, 8, 9, 10 and 11 have been corrupted to various degrees by noise interference during transmission. The original intention of the data collection was to gain an indication of whether off-the-shelf computing hardware can be utilised for a vision guided mobile robot. Our results indicate a number of things: processing at 30 Hz is not essential for real-time operation; filtering of noisy commands is a necessity; and collective decision making improves the overall reliability of the system. We decided to operate our robot at 600 mm/sec and 6 Hz (that is, commands to the robot are sent 6 times a second). This is due to the fact that when operating at less than 6 Hz the system's reaction to events appears to be too slow.
Figure 4.15: An image sequence that contains noisy data taken from a target pursuit experiment.
Operating faster than 10 Hz proved to be unreliable due to noise in the sensor data. Therefore, to return to our original intention: if slower hardware is used, we do not need to process at 30 Hz as long as the sensor data is reliable, for example if the vision processing were performed on-board the mobile robot.
4.14 Summary
In this chapter we have described the details of the vision processing components within our system. These components were designed to facilitate various aspects of navigation, from low-level survival mechanisms to higher-level goal-oriented navigation. The computational demands of these components have been kept to a minimum, thus allowing the system to operate in real-time. For a remotely operated system, the importance of real-time processing has been verified through the study of vision processing speed. We concluded that filtering is important and necessary for unreliable sensory data. We also concluded that if reliable sensor data is available, processing at 30 Hz may not be necessary. We believe this fact can be attributed to the richness of the data that is provided by vision. It was vital for the robot to be able to adapt to environmental changes; therefore, a new mechanism for adapting to varying lighting conditions was developed, leading to a novel correlation technique. We showed that template-based systems can be used in an effective manner and are adaptive to environmental conditions. In this chapter, we also presented a low-cost scheme for goal acquisition and matching using a model-free approach. The development of the vision processing techniques described in this chapter has led to the advancement of other robot vision systems. The parameterised correlation technique has been applied to camera calibration for an active stereo head vision project [Brooks et al., 1997]. Our technique for image to world coordinate transformation has also been used by other mobile robot research groups [Heng et al., 1997]. In the following chapters, the importance of diverse, effective and flexible components will become apparent. This is demonstrated by our robot navigating in differing domains.
Chapter 5
Robot Visual Behaviours

“It’s as if thinking were a little-used backup, too slow and error-prone to be depended on in the normal course of things.” — [Calvin, 1996]
Here, in this quotation, William H. Calvin, the theoretical neurophysiologist, has captured the essence of behaviour-based systems, where biological systems are controlled by basic innate behaviours that are prewired, and are set up for sure-fire responses. Prof. Rodney Brooks of the MIT AI lab has led the way in instigating the development of prewired sensor-based control for robots, and a shift away from the well established Artificial Intelligence (AI) approach to the control of robots [Brooks, 1986]. The difference between the approaches is highlighted by the different robot control architectures. The classic AI approach requires a single monolithic controller to oversee the complete operation of a system, while with the Behaviour-based approach a small set of custom designed, distributed modules are responsible for processing responses to stimuli. The success of using Behaviour-based approaches has been demonstrated in many systems; [Anderson and Donath, 1990, Brooks, 1990, Horswill, 1994a, Steels, 1995] are some typical examples. This success has led to a shift in the way robotic systems are designed. Our motivation in adopting a Behaviour-based approach is that such systems have been shown to be fast and responsive, and easily satisfy our requirement of operating in dynamic environments. Our initial focus has centred on the development of a set of behaviours that perform basic mobile robot navigation. Throughout this research, we have progressively built a collection of vision-based behaviours. The sophistication of our system was
increased by combining the various behaviours. A set of “Basic behaviours” have been used throughout our system; they have been designed to allow easy customisation for other purposes. The “Combined behaviours” demonstrate how higher-level behaviours can be derived from the “Basic behaviours”, without the need for other specialised behaviours. Finally, the usefulness of these behaviours, as “Specialised Behaviours”, is demonstrated through several robot applications. This chapter begins with a description of the overall control architecture of our system, in Section 5.1. This is followed by a discussion of the functionality of its components — its visual behaviours. The component sections are broken up into “Basic Behaviours” and “Combined behaviours”, presented in Sections 5.2 and 5.3 respectively. Using these behaviours we have developed several specific robot applications: Landmark-based Navigation in Section 5.4, Robot Vacuum Cleaning in Section 5.5, and Robot Soccer Playing in Section 5.6. A discussion of the “specialised behaviours” used in these applications is also presented.
5.1 System Architecture
A control architecture dictates the organising structure of how a system should be built and the manner in which a given problem is to be solved. Our system architecture is a behaviour oriented architecture, where each behaviour is implemented as a competence module, responsible for its own response to a stimulus. In addition, higher level navigation behaviours are accomplished through the use of Purposive Maps (PM), as discussed in Section 2.2.2. Keeping our system design modular, and minimising communication between modules, has allowed us to produce a flexible system in which modules can either run on a separate host or on the same host. In our system the vision processing is performed off-board. The set of system behaviours running as separate processes form the complete control architecture of the system. These processes are: a vision processing framework, an action selector, and an actuation shell. While the vision processing framework provides the perception data to the system’s behaviours, the action selector chooses the most suitable behaviour in reaction to a current situation. The actuation shell running on the robot controls the motion of the system. The Action selector is
presented in Section 5.1.1 and the Actuation shell is presented in Section 5.1.3. The system architecture is similar across the different applications that we have developed, differing only in the connectivity of the behaviours.
Figure 5.1: Action selection: The basic function of the Action selector is to collect and tally all occurrences of behaviour activations (from Collision Avoidance, Goal Seeking and Free-Space-Move). Once collected, a final robot actuation command is determined and then issued.
5.1.1 Action selection
The development of the action selector was motivated by the need for noise filtering, and it is primarily used to overcome the problems caused by the noisy transmission of the video signals to the vision system. The overall effect of this action selection is to provide a form of sanity check on all commands that are sent to the robot, filtering out error-prone commands before they are transmitted to the robot for execution. This ensures the robot does not enter transient states such as start-stop motions between consecutive video frames. An action selection scheme is used for the activation of the most suitable responses to various situations. The development of the scheme was motivated by the work presented in Section 4.13, which showed that filtering was a necessity. The scheme we have chosen is assisted by the connectivity of the behaviour network. The selection of a response is based on a “Winner-Take-All” scheme. At each processing iteration, a behaviour can be triggered in response to a particular stimulus, and a vote is registered for each trigger. The triggering of behaviours is determined by the behaviour
connectivities, described in Section 5.1.2. Figure 5.1 shows the relationship between the behaviours and the action selector. At a regular periodic interval the votes are collated, and an action is selected and sent to the robot for execution. The “Winner-Take-All” scheme provides the best and most logical way for an action to be selected, and it imposes no additional delay on processing. Other schemes, such as selection via an average or median value, give rise to delay, because they intrinsically accumulate data in their calculation. Logically, the “Winner-Take-All” scheme also provides a clear-cut decision in selecting the most suitable action, without the need to incorporate any additional information in the process. With other schemes, such as averaging, a definite cut-off must be derived to ensure the success of the system. For example, when averaging five commands consisting of three “Collision Avoidance” and two “Obstacle Avoidance” votes, the average of this situation is unclear, and additional information is needed to determine the most suitable action. In contrast, when another behaviour is introduced into our system it is simply added as another candidate, ready for execution.
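A minimal sketch of this Winner-Take-All tally is given below; the class and behaviour names are illustrative, and the real selector receives its votes from the behaviour network rather than from a hand-written list.

# Winner-Take-All action selection: behaviours register votes as they
# trigger; at each transmission interval the action with the most votes
# is sent to the robot.

from collections import Counter

class ActionSelector:
    def __init__(self):
        self.votes = Counter()

    def register(self, behaviour):          # called whenever a behaviour triggers
        self.votes[behaviour] += 1

    def select(self):                       # called at the command interval (e.g. 6 Hz)
        if not self.votes:
            return None
        winner, _ = self.votes.most_common(1)[0]
        self.votes.clear()                  # start a fresh tally for the next interval
        return winner

selector = ActionSelector()
for b in ["Collision Avoidance", "Collision Avoidance", "Collision Avoidance",
          "Obstacle Avoidance", "Obstacle Avoidance"]:
    selector.register(b)
print(selector.select())   # -> "Collision Avoidance" (3 votes beat 2)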
5.1.2 Behaviour connectivity
As stated earlier, action selection is assisted by the connection of the set of behaviours. This is an important part of the action selection, and ensures that generally the most suitable behaviour is activated. Figure 5.2 shows how connections to a behaviour are configured. A behaviour can have connections coming from feature detectors and/or from any other behaviour(s). A simple connection between a collision sensor and the action of stopping the wheels of the robot specifies the Collision Avoidance behaviour (further details are given in Section 5.2.1). These connections define what event or situation causes a behaviour to be activated or suppressed. The general connectivity of behaviours in our system is illustrated in Figure 5.3. The connections dictate the overall behaviour of the system, and determine how our system responds to the external world. As our discussion progresses, it will become apparent why these connections were established. Detection modules are depicted as diagonal boxes, and behaviour modules by rectangular boxes. The arrows indicate preconditions to a behaviour, and the lines with small circles indicate inhibition. For example, to ensure the safety of the robot, the Collision Avoidance behaviour has the highest priority over all the other behaviours, as shown in Figure 5.3. The Free Space Move behaviour needs to be active when the Goal Seeking behaviour is activated.
Figure 5.2: Connection of a behaviour
In order to prevent the robot from collision, it is necessary for the Collision Avoidance behaviour to inhibit the Free-Space-Move and Goal Seeking behaviours; for this reason, the inhibition connections between these behaviours are necessary. A similar structure was shown to be feasible by [Maes, 1990]. This work demonstrated a conceptual example of a carefully arranged task, and showed that appropriate action selections were yielded by their behaviour network. Our work differs in two ways: first, the network we have designed has been derived for navigation in general, rather than for a particular task; in addition, our network has been designed to achieve a high responsiveness to changing situations. These ideas will be developed later in this chapter, and in the chapters that follow.
Figure 5.3: General connectivity of behaviours (sensing inputs: Collision Detection, Free-Space Detection, Ball Detection; competence modules: Collision Avoidance, Goal Seeking, Free-Space-Move)
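One way to picture the network of Figure 5.3 is as a small data structure of preconditions and inhibition links, as sketched below. The evaluation rule shown is a simplification of the actual architecture, in which these signals are routed through the action selector.

# A minimal data-structure sketch of the behaviour network of Figure 5.3:
# each behaviour lists the inputs that precondition it and the behaviours
# it inhibits. A precondition may be a detection or another behaviour.

BEHAVIOURS = {
    "Collision Avoidance": {"preconditions": ["Collision Detection"],
                            "inhibits": ["Free-Space-Move", "Goal Seeking"]},
    "Goal Seeking":        {"preconditions": ["Ball Detection", "Free-Space-Move"],
                            "inhibits": []},
    "Free-Space-Move":     {"preconditions": ["Free-Space Detection"],
                            "inhibits": []},
}

def active_behaviours(detections):
    candidates = set()
    changed = True
    while changed:                           # propagate behaviour-to-behaviour preconditions
        changed = False
        for name, spec in BEHAVIOURS.items():
            if name not in candidates and all(
                    p in detections or p in candidates
                    for p in spec["preconditions"]):
                candidates.add(name)
                changed = True
    inhibited = {t for b in candidates for t in BEHAVIOURS[b]["inhibits"]}
    return candidates - inhibited

print(active_behaviours({"Collision Detection", "Free-Space Detection"}))
# -> {'Collision Avoidance'}  (Free-Space-Move is inhibited)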
5.1.3 Actuation shell
An actuation shell was developed to control the motion of the robot. This process is automatically executed by MOSRA¹ at boot time. The process manages all motor commands that are sent by the action selection scheme. Once received, they are interpreted, and appropriate motor commands are then passed on to the low-level motor controller for quick execution. This process also continuously monitors the input, allowing any overriding command to be intercepted for execution. This facility was introduced because, when unexpected situations arise, rapid termination of the current command is critical. For example, if a moving obstacle suddenly appears in the view of the robot, the robot must avoid it or stop immediately. Because the shell monitors all incoming events even while the robot is in motion, the shell is able to easily interrupt the currently executing motor command and switch to another, more appropriate command.

¹ MOSRA is the Operating System for Yamabico robots.
5.2 Basic Behaviours
In this section we describe three basic visual behaviours: Collision Avoidance, Obstacle Avoidance, and Goal Seeking. These behaviours have been designed with a common set of basic attributes: safety, opportunism and flexibility. The safety aspect is necessary to guard against any harm to the robot or to others. Opportunism ensures that, if a situation arises in which the robot can achieve its goal or subgoals, the robot does so. Flexibility of design allows the behaviours to be re-applied to other applications or systems. Finally, achieving real-time performance of these behaviours was our prime objective throughout their development.
Figure 5.4: Connection: Collision Avoidance (Collision Detection and the Free-Space-Search detector feed the Collision Avoidance behaviour)
5.2.1 Collision Avoidance
The Collision Avoidance behaviour acts as a safeguard to protect the robot during motion. The modular computational structure of the Collision Avoidance behaviour can be simplified into Detection and Avoidance schemes, as shown in Figure 5.4. The detection phase involves determining the availability of free space ahead of the robot for safe motion. If insufficient free space is available, the robot suppresses its other behaviours and activates the avoidance scheme; in general, the robot is set to spin in place. A common situation in which this behaviour is activated is when the robot is not able to move away from obstacles, both static and dynamic, such as people and other robots. This behaviour also acts as a backup to the Free Space Move behaviour, which is explained in the next section. Figure 5.5 shows an instance of the Collision Avoidance behaviour during a navigation experiment. The figure shows the robot wandering through the free space of the laboratory; after reaching a confined corner, the robot spins out of the corner and then re-enters the free space
Figure 5.5: Collision Avoidance Experiment (video snapshots 1-6 and a plan of the room showing the Labmate robot, a box, a wheelchair and a computer box)
The figure shows a sequence of video frames captured during the experiment². The positions where the frames were recorded are marked in the plan schematic of the room. The basic algorithm for the Collision Avoidance scheme is as follows:

1. Calculate the free space;
2. If no passable free space³ exists directly in front of the robot then stop and spin (either clockwise (CW) or counter-clockwise (CCW)) until passable free space is detected;
3. Goto Step 1.
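A minimal sketch of this loop is given below. The sensing and actuation calls are hypothetical stand-ins for the free-space detector of Chapter 4 and the actuation shell, and the step cap is added purely as a safety net for the example.

```python
import random

def passable_free_space_ahead() -> bool:
    """Placeholder: True if enough free space lies directly in front of the robot."""
    return random.random() > 0.3

def stop_robot():
    print("stop")

def spin_robot(direction: str):
    print(f"spin {direction}")

def collision_avoidance(max_steps: int = 20):
    """Stop and spin in place until passable free space is detected ahead."""
    direction = random.choice(["CW", "CCW"])
    for _ in range(max_steps):
        if passable_free_space_ahead():   # step 1: calculate free space
            return                        # safe to hand control back
        stop_robot()                      # step 2: no passable free space ahead
        spin_robot(direction)             # keep spinning until space appears
    stop_robot()                          # safety net for the example only

if __name__ == "__main__":
    collision_avoidance()
```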
Figure 5.6: Connection: Free Space Move (Free-Space Detection and Free-Space-Search feeding Free-Space-Move)
5.2.2 Free Space Move
Free Space Move provides the robot with the ability to move freely within its environment. In effect, this behaviour provides obstacle avoidance in static and dynamic situations. It utilises the vision processing of the Free Space Search detector to determine the best possible motion for the robot to make. Figure 5.6 shows the simple structure of this behaviour. Part of its function is to provide the best possible motion actions to the left and to the right of the robot; the internal avoidance scheme then determines the motion of the robot. Figure 5.7 shows a snapshot from an experiment in which the robot avoids a dynamic obstacle (a person walking in its path). The robot sees that there is a larger free area on the left of the image, so it steers toward the left, thus avoiding the person's leg.

² Note: the white squares displaying the correlation matches of free space are not always drawn; however, the commanded direction to the robot is always shown.
³ Passable free space is defined in Chapter 4.
Figure 5.7: Free Space Move (video snapshots 1-6 and a plan showing the moving person and the computer desks)
The basic algorithm for the Free Space Move scheme is as follows:
1. If no passable free space exists in front of the robot then perform the Collision Avoidance behaviour; otherwise calculate the speed⁴;
2. Calculate the size of the free-space areas to the left and right of the robot;
3. If all areas are free, then move forward at maximum speed;
4. Otherwise, choose the largest free-space area. If the free-space regions are the same in size then arbitrarily choose the left or the right side;
5. Calculate a motion trajectory to the highest free-space cell of the chosen free-space area.
Figure 5.8: Highest free space cell (cell 2-2, starting from top left)

The method we use to determine the highest cell on the left or right is to simply scan from bottom to top, looking for the tallest free-space column, starting from the centre column and moving out to the outermost columns. Each scan proceeds until a non-free-space cell is encountered, or the top of the image is reached. The highest (tallest) cell out of all the columns scanned is used as the target cell, and target cells closest to the centre are given precedence over cells near the edge of the image. Figure 5.8 shows the exact cell (cell 2-2, starting from the top left) that has been selected as the target cell in an image of live video input. The same scanning technique is used to determine the largest free-space area, left or right. As can be seen in Figure 5.9, the live video input image is divided down the centre.
⁴ Speed calculation is defined in Section 4.12.
Figure 5.9: Largest free space area

Scanning from bottom to top and inner to outer, the free-space cells of the left and right areas are summed; the greater of the two is taken to be the largest free-space area. In Figure 5.9, the largest area lies on the right.
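The two scans can be sketched as follows. The boolean grid standing in for the free-space cells (True meaning the cell matched the clear-floor template) is an assumption made for illustration, not the vision system's actual data structure.

```python
def highest_free_cell(grid):
    """Return (row, col) of the tallest free-space column, preferring columns
    nearer the image centre.  grid[row][col] is True for free space; row 0 is
    the top of the image and the last row is nearest the robot."""
    rows, cols = len(grid), len(grid[0])
    centre = cols // 2
    order = sorted(range(cols), key=lambda c: abs(c - centre))  # centre-outward
    best = None                    # (height, top_row, col)
    for col in order:
        height, top_row = 0, rows - 1
        for row in range(rows - 1, -1, -1):   # scan bottom to top
            if not grid[row][col]:
                break                          # stop at the first non-free cell
            height += 1
            top_row = row
        if best is None or height > best[0]:   # strict '>' keeps central ties
            best = (height, top_row, col)
    return (best[1], best[2])

def largest_free_side(grid):
    """Compare the summed free-space cells to the left and right of the centre."""
    rows, cols = len(grid), len(grid[0])
    centre = cols // 2
    left = sum(grid[r][c] for r in range(rows) for c in range(centre))
    right = sum(grid[r][c] for r in range(rows) for c in range(centre, cols))
    return "left" if left > right else "right"

if __name__ == "__main__":
    # Toy 4x4 grid, not real detector output: True = free space.
    g = [[False, True,  False, False],
         [False, True,  True,  False],
         [True,  True,  True,  True ],
         [True,  True,  True,  True ]]
    print(highest_free_cell(g))    # (0, 1) for this toy grid
    print(largest_free_side(g))    # 'left'
```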
Figure 5.10: Goal Seeking (Goal Detection, the Interest Operator and the Reassurance scheme feeding Goal Seeking)
5.2.3 Goal Seeking
The Goal Seeking behaviour was developed to allow purposeful navigation tasks to be accomplished. The behaviour also serves as the core for other purposeful behaviours, such as target pursuit, foraging, landmark-based navigation and soccer playing, described in Sections 5.3, 5.4 and 5.6. The Goal Seeking behaviour exploits the Free Space Search detector, the Interest Operator and the Target Reassurance scheme described in Chapter 4. The interconnections between the components of this behaviour are shown in Figure 5.10. The Interest Operator provides an efficient way of segmenting the input image that allows the behaviour to focus its attention on searching for a given goal. The Reassurance scheme is used to identify a desired goal. Once a goal has been detected, the robot
executes the associated action for that particular goal, for example visual servoing. Figure 5.11 shows an example of the Goal Seeking behaviour at work. In this experiment our robot visually servos on a target, in this case a radio-controlled toy car.
Figure 5.11: Goal Seeking (snapshots 1-3; target: toy car)

The basic algorithm for the Goal Seeking behaviour is as follows:
1. If a feature has been detected by the Interest Operator then check if the feature is the desired target using the Reassurance scheme;
2. If the desired target has been detected, then calculate the location of the target relative to the robot;
3. If the target is less than the set minimum distance, then stop;
4. Calculate a motion trajectory to the target;
5. Move toward the target;
6. Goto Step 1.
This behaviour continually loops between Steps 1 and 6, looking for features that are potential goal targets, checking if a selected feature matches a desired goal target and then moving toward the target if it is detected. Adopting this approach yields a
robust behaviour that can recover from motion control positioning errors as well as sensing errors in the vision processing such as inexact lens distortion compensation and temporary losses of the target due to noise problems.
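A minimal sketch of this servo loop is shown below. The detector and motion routines are placeholders for the Interest Operator, the Reassurance scheme and the trajectory generator, and the 0.3 m stopping distance is an assumed value, not the one used on the robot.

```python
from dataclasses import dataclass
from typing import Optional

MIN_DISTANCE = 0.3   # metres; an assumed stopping distance for illustration

@dataclass
class Feature:
    bearing: float    # radians relative to the robot heading
    distance: float   # metres, estimated from the fixed camera geometry

def interest_operator() -> Optional[Feature]:
    """Placeholder: return the most distinctive feature in view, if any."""
    return Feature(bearing=0.1, distance=1.2)

def reassure(feature: Feature, goal_template: str) -> bool:
    """Placeholder: correlate the feature region against the stored goal template."""
    return True

def move_toward(feature: Feature) -> None:
    print(f"servo: turn {feature.bearing:+.2f} rad and advance toward the target")

def goal_seeking_step(goal_template: str) -> bool:
    """One pass of the Goal Seeking loop; returns True once the goal is reached."""
    feature = interest_operator()                  # step 1: candidate feature
    if feature is None or not reassure(feature, goal_template):
        return False                               # not the desired target
    if feature.distance < MIN_DISTANCE:            # step 3: close enough, stop
        return True
    move_toward(feature)                           # steps 4-5: servo toward it
    return False

if __name__ == "__main__":
    goal_seeking_step("toy_car_template")
```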
5.3 Combined Behaviours
The Goal Seeking behaviour described above uses an appearance-based approach to goal detection. Reference images that were previously acquired are matched against regions in the image identified by the Interest Operator. A model-free version of the Goal Seeking behaviour can be realised by omitting the Reassurance scheme. This will cause the robot to visually servo on targets identified by the Interest Operator. The basic algorithm is as follows:
1. If one or more features have been detected by the Interest Operator then select the closest feature as the target;
2. Calculate the location of the target relative to the robot;
3. If the target is less than the set minimum distance then stop;
4. Calculate a motion trajectory to the target;
5. Move towards the target;
6. Goto Step 1.
Determining the closest feature to the robot was quite easy due to the fixed camera configuration, as discussed in Section 4.9. There is no difference in performance between the appearance-based and model-free Goal Seeking behaviours; the selection of which to use depends on the robot's purpose. If the purpose of the robot is to reach a specific goal target amongst a set of possible targets identified by the Interest Operator, then the appearance-based behaviour should be used. Otherwise the model-free behaviour will select the target which is closest to the robot. The model-free Goal Seeking behaviour can
be used to acquire appearance-based models of features in the environment, for use in landmark-based map making. The acquired appearance models would then be used by the appearance-based Goal Seeking behaviour to navigate to relevant visual landmarks. In the following section we show that by combining our basic behaviours, a wide range of purposeful navigation tasks can be realised. These include wandering, target tracking, target pursuit and foraging. By extending our basic behaviours, more sophisticated higher-level tasks can also be achieved: landmark-based navigation is presented in Section 5.4, and a vacuum cleaning robot and a soccer playing robot are presented in Sections 5.5 and 5.6 respectively.
Figure 5.12: Wandering: behaviour interconnections (Free-Space Detection and Collision Detection feeding Free-Space-Move and Collision Avoidance; precondition and inhibit links)
5.3.1 Wandering
A natural first experiment for our robot to perform was Wandering. This behaviour was used as a platform for testing our vision processing algorithms. The Wandering behaviour is responsible for moving the robot around its environment while maintaining a safe distance from obstacles (both static and dynamic). The behaviour was realised through the combination of the Collision Avoidance behaviour and the Free Space Move behaviour. The interconnection between these two behaviours is shown in Figure 5.12. In general, the Free Space Move behaviour is always running unless a collision is detected by the Collision Avoidance behaviour, which inhibits the Free
Figure 5.13: Wandering in a static environment (snapshots 1-6; plan showing a wheelchair, a box, chairs and desks)
Figure 5.14: Wandering in a dynamically changing environment (snapshots 1-6; plan showing a wheelchair, a box, a moving person, a moving person with a chair, and desks)
Space Move behaviour. Emerging out of this interaction of behaviours is an overall behaviour which avoids both static and moving obstacles. Figure 5.13 shows snapshots of our robot "wandering" in a static environment; the robot was able to avoid a number of unspecified obstacles, such as chairs, boxes and general office furniture. Using the same configuration of behaviours, another experiment was undertaken, shown in Figure 5.14. The situation was somewhat different: while the robot was "wandering", a person unexpectedly entered its view, and the robot was able to successfully avoid the transient person. Later the robot was faced with another obstacle, a moving chair. The robot was also able to safely avoid the chair and then continue on its way. The basic algorithm for Wandering is as follows:

1. Perform the Free Space Move behaviour;
2. Perform the Collision Avoidance behaviour;
3. Goto Step 1.
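A minimal sketch of this arbitration is given below. The sensing predicate and the action strings are placeholders, and the random trigger merely simulates the collision detector for the example.

```python
import random

def collision_imminent() -> bool:
    """Placeholder for the collision (free-space) detector of Chapter 4."""
    return random.random() < 0.2

def free_space_move() -> str:
    return "steer toward the largest free-space region"

def collision_avoidance() -> str:
    return "stop and spin until passable free space is found"

def wandering_step() -> str:
    """Collision Avoidance inhibits Free Space Move whenever it is triggered;
    otherwise Free Space Move runs continuously."""
    if collision_imminent():
        return collision_avoidance()
    return free_space_move()

if __name__ == "__main__":
    for _ in range(5):
        print(wandering_step())
```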
Figure 5.15: Configuration: Target Pursuit (Goal Detection and Collision Detection feeding Goal Seeking and Collision Avoidance)
5.3.2 Target Pursuit
This behaviour is an example of how easily a pursuing behaviour can be realised using our basic visual behaviours. The Target Pursuit behaviour can be described as moving to a selected target until it is centred at a minimum preset distance. A desirable side effect emerges from this simple specification: when the target moves, the robot will follow, thus producing a moving-target pursuit behaviour. The configuration of the basic behaviours is shown in Figure 5.15. Here only two behaviours
Figure 5.16: Target Pursuit (snapshots 1-6; plan showing a wheelchair, a box and desks)
are used: Collision Avoidance and Goal Seeking. Either the appearance-based or the model-free version of the Goal Seeking behaviour can be used. The only change that was required was to set the action of the Collision Avoidance behaviour to stop rather than spin. Figure 5.16 shows the results of this combination of behaviours: the robot, in model-free mode, chases a remote controlled car. Once the pursued target is within the minimum range the robot stops. However, the robot should ensure that the target remains centred in its field of view, since the target could move laterally and still be within the minimum range. To ensure the target does not suddenly disappear out of view, a target tracking behaviour is also required. Figure 5.17 shows the robot tracking the remote controlled car after target pursuit has ended. The basic algorithm for the complete Target Pursuit behaviour is as follows:
1. Perform Goal Seeking behaviour;
2. Calculate a motion trajectory that centres the target;
3. Turn towards the target;
4. Goto Step 1.
5.3.3 Foraging
For a robot to be useful it must be able to perform purposeful tasks. For example, the robot should be able to autonomously find a goal which is unsighted and whose location is unknown. Such a behaviour is similar to the behaviour animals exhibit when looking for food. A foraging behaviour was realised by combining our basic behaviours with the Interest Operator and the target Reassurance scheme, as shown in Figure 5.18. Before the foraging behaviour is run, a goal is specified. The robot executes the Free Space Move, Collision Avoidance and Goal Seeking behaviours; this results in a robot motion that is similar to the Wandering behaviour described previously. The Goal Seeking behaviour is responsible for identifying potential targets, discarding irrelevant targets and finally guiding the robot to the goal once it is sighted.
Figure 5.17: Target tracking (snapshots 1-3; target: toy car)
Figure 5.18: Foraging: Behaviour interconnections (Free-Space Detection, Collision Detection and Goal Detection feeding Free-Space-Move, Collision Avoidance and Goal Seeking)
Figures 5.19 and 5.20 show the results of an experiment in foraging for a toy car. The snapshots show the robot identifying a set of potential targets during the mission. At snapshots 3 and 4 the Reassurance scheme rejects the targets identified by the Interest Operator as false matches. At snapshot 6 the Goal Seeking behaviour identifies the correct target and the foraging behaviour terminates. The basic algorithm for foraging is as follows.
1. Perform Free Space Move behaviour
2. Perform Collision Avoidance behaviour
3. Perform Goal Seeking behaviour
4. If target has been reached then Stop
5. Goto Step 1
Figure 5.19: Foraging (snapshots 1-9)
Figure 5.20: Foraging - trace (plan showing the toy car, false targets, a moving person sitting on a chair, a wheelchair, a computer box and computer desks)
5.4 Landmark-based Navigation
Landmark-based navigation is fundamental to mobile robot navigation systems. Such systems allow robots to travel from one place to another by the use of landmarks. In our case, each landmark acts as a cue to guide the robot to the next landmark, until the final goal landmark is reached. For example: after seeing the green house on the corner, turn left, and a postal mailbox is located 20 meters from the corner. From our earlier discussion it is clear that the self-preservation of our robot is handled by the Collision Avoidance and Free Space Move behaviours. The Goal Seeking behaviour can also be used for landmark detection, and the Reassurance scheme ensures that each landmark is correctly identified. An additional mechanism is required to activate the appropriate robot motion action associated with the landmark cue, and to select the set of templates for the Goal Seeking behaviour to detect the next landmark. The state information specifying the robot's mission is represented in a qualitative manner using a Purposive Map (PM). For each landmark in the PM, a motion vector guiding the robot to the next landmark is specified. These specifications can be approximate; their purpose is to guide the robot in the general direction of the next landmark. The Goal Seeking behaviour will override any
errors in the specification. The specification need not be complete; for example, the motion vector may omit the magnitude of the distance to travel, and in fact the vector may be omitted entirely, which results in the robot performing a foraging behaviour for the next landmark. Figure 5.22 shows an example of the internal structure of a PM. This PM specifies a mission between three landmarks, and was used to guide the robot in a navigation experiment described later in this section and shown in Figure 5.23.
Figure 5.21: The robot exhibiting a bias toward a landmark (schematic showing a large area of free space, the normal motion of the robot, the landmark and a stationary observer)

The purpose of specifying a landmark motion vector in the PM is that it allows the robot to move efficiently between landmarks. While the robot is moving towards the next target landmark, unexpected obstacles could be encountered, and the Free Space Move and Collision Avoidance behaviours may cause the robot to execute motions that result in it moving away from the target landmark. The Free Space Move behaviour was therefore modified to accept the landmark motion vector as an input. In its original form the Free Space Move behaviour selects the largest region of free space available for robot motion; the modified version causes the robot to select the largest region of free space which is closest to the direction of the next landmark. If a landmark motion vector is not supplied, the robot behaves in the same manner as the original version of the behaviour. The effect of this biasing is illustrated in Figure 5.21. The interconnections between the Free Space Move, Collision Avoidance and Goal Seeking behaviours that form the Landmark Navigation behaviour are the same as for Foraging, shown in Figure 5.18.
The basic algorithm for the Landmark Navigation behaviour is as follows.

1. Select the first landmark templates and motion vector from the PM;
2. Perform the Free Space Move (motion vector) behaviour;
3. Perform the Collision Avoidance behaviour;
4. Perform the Goal Seeking behaviour;
5. If the landmark is reached then select the next landmark templates and motion vector from the PM. If there are no more landmarks then Stop;
6. Goto Step 2.

Figure 5.23 shows our robot executing the mission specified in the PM of Figure 5.22. In this experiment the PM specified the motion vector between each landmark, and the robot performed its mission by travelling from one landmark to the next. The robot was able to successfully reach its final goal. Other landmark navigation experiments were conducted in which transient obstacles such as people blocked the way to the next landmark. By using the landmark motion vector in the Free Space Move behaviour, the robot was able to avoid the obstacles and still reach its target landmark. However, it must be said that this strategy only works when the obstacles are not significant in shape and size; otherwise a learning strategy is required. This issue is addressed in Chapter 6.
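The sequencing role of the PM can be sketched as a simple ordered structure. The PMEntry fields, the template names and the (bearing, distance) encoding of the motion vector are illustrative assumptions, not the thesis's internal representation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class PMEntry:
    """One place in the Purposive Map: the landmark's templates plus an
    approximate, optional motion vector toward that landmark."""
    templates: List[str]
    motion_vector: Optional[Tuple[float, float]]  # (bearing rad, distance m) or None

# Illustrative three-landmark mission, loosely following Figure 5.22.
purposive_map = [
    PMEntry(["ball_template"],   (0.0, 3.0)),
    PMEntry(["bottle_template"], (1.2, 2.5)),
    PMEntry(["goal_template"],   None),        # no vector: forage for the last landmark
]

def landmark_reached(entry: PMEntry) -> bool:
    """Placeholder for the Goal Seeking behaviour confirming the landmark."""
    return True

def free_space_move(vector) -> None:
    bias = "unbiased" if vector is None else f"biased toward {vector}"
    print(f"free-space move, {bias}")

def navigate(pm: List[PMEntry]) -> None:
    # Free Space Move (motion vector) and Collision Avoidance run continuously;
    # only the sequencing of landmarks is shown here.
    for entry in pm:
        while not landmark_reached(entry):
            free_space_move(entry.motion_vector)
        print(f"reached landmark {entry.templates[0]}")
    print("mission complete")

if __name__ == "__main__":
    navigate(purposive_map)
```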
5.5 Robot Vacuum cleaning
This development was produced in support of a multi-robot cooperative cleaning project [Jung et al., 1997a, Jung et al., 1997b, Jung, 1998]. This project investigated an architecture for multi-robot cooperation to perform the task of cleaning a laboratory. The cleaning task used two heterogeneous mobile robots, one designated with the task of "sweeping", and the other, our robot, with the task of "vacuuming". The "vacuuming" robot was internally fitted with a vacuum cleaner, installed in the centre of the robot, that enabled it to collect litter. In order to vacuum the litter, a visually based Vacuuming behaviour was developed. Detection of litter is a difficult problem, since litter lacks structure and has
Figure 5.22: PM: Landmark-based navigation (places Start, Ball, Bottle and At Goal linked by the Free-Space-Move (motion vector) and Goal Seeking behaviours via the conditions Clear, Target sighted and Reach target; key: place, behaviour, condition)
Figure 5.23: Landmark-based navigation (snapshots a-c)
many possible appearances. By using the model-free approach based on the Interest Operator, the litter can be identified. The robot then uses the Goal Seeking behaviour to servo on the litter to within a minimum distance, and then attempts to vacuum it up. False matches such as obstacles are guarded against by checking the bump sensor⁵ of the robot. The robot automatically moves forward over the litter and then reverses until the Interest Operator indicates there is no more litter. The basic algorithm for the Vacuuming behaviour is as follows:
1. Perform the Wandering behaviour;
2. If litter is detected by the Interest Operator then Goto Step 3. Otherwise Goto Step 1;
3. Perform the Goal Seeking behaviour;
4. Start the vacuum cleaner;
5. While litter is detected by the Interest Operator, move toward the litter;
6. Stop the vacuum cleaner;
7. Start reversing the robot;
8. Goto Step 2.
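A compact sketch of one vacuuming episode is given below; the litter detector is a placeholder for the Interest Operator and the pass limit is an assumption added only for the example.

```python
import random

def litter_in_view() -> bool:
    """Placeholder for the Interest Operator detecting litter-like features."""
    return random.random() < 0.5

def vacuuming_episode(max_passes: int = 5) -> None:
    """Servo onto litter, vacuum while litter remains in view, then reverse."""
    if not litter_in_view():
        print("wandering")                    # step 1: no litter, keep wandering
        return
    print("goal seeking: servo onto the litter")
    print("vacuum on")
    passes = 0
    while litter_in_view() and passes < max_passes:
        print("move forward over the litter")
        passes += 1
    print("vacuum off")
    print("reverse")

if __name__ == "__main__":
    vacuuming_episode()
```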
Figure 5.24 shows snapshots of the Vacuuming behaviour at work: the robot servos on a collection of litter and begins vacuuming, moving forward and back over the litter until the Interest Operator fails to detect any more litter.
5.6 Soccer playing skills
This section describes a mobile robot system which performs basic soccer playing skills. The system was developed to evaluate the feasibility of using our visual behaviours to control a robot playing the game of soccer. The "Robot World Cup" initiative was established by [Kitano et al., 1997] as the next great challenge for the AI community. The challenge involves teams of mobile robots competing in
⁵ Note that the bump sensor was only introduced for this particular task; the focus of this research remains the use of vision for the purpose of mobile robot navigation.
Figure 5.24: Robot Vacuuming (snapshots 1-6 of the robot vacuuming the litter)
Figure 5.25: Network of soccer behaviours (Collision Detection, Free-Space Detection, Ball Detection and Goal Detection feeding the Collision Avoidance, Goto Ball, Free-Space-Move, Toward Goal and Kick Ball competence modules, with preconditions Got Ball and Near Goal; key: precondition, inhibit, sensing input, competence module)
a game of soccer. Each robot is responsible for its own sensing and actions. Various configurations are being considered, ranging from simulation to small, medium and large sized robots. In our system, a medium sized robot, we consider the low-level sensing and the basic behaviours required to fulfil the task of soccer playing. The development of such a system demonstrates the flexibility of our visual behaviours in dynamic situations. In order for a robot to play soccer it must be able to react quickly to its changing external environment. It must also be able to navigate freely on the playing field while avoiding any obstacles, such as other robot soccer players. Following on from our earlier work, we have continued to build additional behaviours to realise this task. A network showing the interconnectivity of the specialised behaviours is shown in Figure 5.25. The network has multiple specialised instances of the Goal Seeking behaviour: the Goto Ball behaviour, the Kick Ball behaviour and the Toward Goal behaviour are all based on the Goal Seeking behaviour. They are described in the following sections. The general operation of the system is governed by the emergent interaction between all the behaviours. The Free Space Move and Collision Avoidance behaviours are responsible for keeping the robot away from obstacles other than the ball. The Goto Ball, Kick Ball and Toward Goal behaviours are responsible for guiding the robot to the ball, then dribbling the ball, and finally shooting the ball into the goal.
Additional Detectors

To realise this application, additional specialised visual detectors were needed. Feature detectors for locating the goal and the ball in various situations were developed. These detectors are presented in the following sections.
5.6.1 Ball Detector
An important attribute in a game of soccer is the ability to detect a soccer ball. Ball detection is done at two levels. The first level involves determining if the ball is in the near vicinity of the robot. The next level determines if the robot actually possesses the ball after initial contact has been made, and it is described in the next section.
To detect the soccer ball we take advantage of the functionality of our Interest Operator. The Interest Operator segments out any distinctive features from the rest of the soccer field. We then use our Reassurance scheme to determine if the detected feature is indeed a soccer ball. Note that we pre-store templates of the soccer ball at different ranges, one for each row of the video image grid. The ball detector is described in the following sections.
5.6.2 Got Ball
The Got Ball detector determines if the ball is directly in front of the robot. This is done by placing search templates of the soccer ball on the bottom row of the video image grid. The position of the ball at the bottom of the image determines how the robot should position itself relative to the ball; for example, if the ball is on the right side of the image then the robot should move to the right to keep the ball centred. Essentially, this detector visually determines, once the robot has possession of the soccer ball, whether it still has the ball and in what position the ball sits in front of the robot. Section 5.6.5 shows how the Got Ball detector is used for dribbling.
Figure 5.26: Soccer Goal
5.6.3 Goal Detector
The Goal Detector checks the upper portions of the live video image for the appearance of the left and right pre-stored templates of a goal (in our case the left and right bottom edges of a doorway). Judging the closeness of the goal is done by exploiting the fixed camera configuration of our system: since the camera is fixed, pointing straight down, the closer an object (in our case the goal), the further down the
screen the goal will appear. Figure 5.26 shows a snapshot of the Goal Detector finding the goal. Once the goal has been detected, its middle point is determined, as is the distance away from the goal; the calculation of the distance is done using the technique described in Section 4.9. This is the distance used by the behaviours in performing their actions, for example to gain maximum clearance for a shot at the goal. The action handling of the Goal Detector is discussed in the following sections.
Soccer playing behaviours

The following sections describe each of the specialised soccer playing behaviours of our system.
Figure 5.27: Goto Ball (snapshots a-f)
5.6.4 Goto Ball
This behaviour is activated by Ball Detection: once a "Ball" is detected, the robot moves towards the ball using visual servoing with the Goal Seeking behaviour. This is shown in the snapshots of Figure 5.27. The robot can avoid obstacles on its way using Free Space Move, and its motion is safeguarded by the
Collision Avoidance behaviour. The Goto Ball behaviour is an instance of the Goal Seeking behaviour driven by a visual cue, i.e. "The Ball". This behaviour is similar to the landmark-based navigation described in Section 5.4.
Figure 5.28: Dribbling (snapshots a-f)
5.6.5 Dribbling the Ball
The behaviour of dribbling the ball was achieved through the combination of the Got Ball detector and the Free Space Move behaviour. Once the robot has detected that it has possession of the ball, it continues to move according to the motion determined by the Free Space Move behaviour. Figure 5.28 shows the robot dribbling the ball. The Free Space Move behaviour can be used to guide the robot to the goal by supplying a motion vector toward the goal, as was done for landmark navigation.
5.6.6 Kick Ball
This behaviour waits until the Soccer Goal is sufficiently close for a shot at the goal, as reported by the Goal Detector; once this is detected, the robot performs a kick. The kick is done by stopping and then accelerating rapidly, producing a kick-like action.
Figure 5.29: Kicking (snapshots a-f)

Figure 5.29 shows an internal view of visual servoing and then kicking the ball away from the robot. The snapshots f) to i) of Figure 5.30 show an external view of the robot kicking the soccer ball into a goal.
5.6.7 Toward Goal
This behaviour is triggered by a precondition of Got Ball, and its subsequent motion is determined by the Free Space Move behaviour with a motion vector to move in the direction of the Soccer Goal. The motion vector is altered once the goal is sighted, as given by the Goal Detector. A motion vector can also be introduced, as in Landmark-based navigation, to provide a bias when the goal is unsighted, i.e. given a priori. For example, at the beginning of a game, the robot is given a motion vector to the goal; however, it does not move toward the goal until it has the ball.
5.6.8 Soccer playing
Figure 5.30 shows the result of combining the soccer playing behaviours. In this figure, the robot first detected the ball and then servoed towards it; once the robot
detected that it had possession of the ball it started dribbling, and finally, when the goal was detected, it activated the Kick Ball behaviour.

Figure 5.30: Soccer playing (snapshots a-i)

The importance of this demonstration is to show that, by following the same line of development as in the previous sections, a somewhat difficult problem can still be solved in a simple manner. Although we have demonstrated that our system is capable of performing basic soccer skills, it was not our primary focus to produce a complete multi-robot system that cooperates in playing a game of soccer. This task was only selected to demonstrate the ability of our system to work in an ever-changing environment. Many research issues still need to be addressed in order to allow a robot
to truly play a game of soccer. Some of the issues that were not addressed here include the detection of other players (including members of the robot's own team), rather than simply treating them as more obstacles. Issues relating to team play were never considered; therefore, our robot was not equipped to recognise or cooperate with other robots. These issues deserve much more detailed attention; see [Kitano, 1998] for further details of other research that is primarily focused on solving such problems.
5.7 Summary
This chapter focused on the discussion of our system architecture and its components, the visual behaviours. All of the developments described in this chapter have shown the properties of diversity and flexibility and, most importantly, have shown robust performance in dynamic situations. Most of the achievements that were accomplished can be attributed to the advantages of using a real-time vision processing system. Our research produced a set of behavioural components that act as building blocks for higher-level behaviours. The basic component behaviours we developed were the Collision Avoidance, Free Space Move and Goal Seeking behaviours. Initially we implemented a higher-level Wandering behaviour using the Free Space Move and Collision Avoidance behaviours. This resulted in a robot with survival skills which were shown to be robust in both static and dynamic environments. Other goal-oriented behaviours that used the Goal Seeking behaviour, such as Target Pursuit, were also demonstrated. To further demonstrate the usefulness of our basic behaviours, a Foraging task was realised through the combination of all three basic behaviours. The robot was able to locate a given goal within a room without prior knowledge of its location. We further extended our system to realise a landmark-based mobile robot navigation system through the use of state information, stored in a Purposive Map, to sequence the execution of the robot's behaviours. The robot was able to navigate with minimal information and could deal with dynamic situations. We showed that our system could increase in sophistication, from simple reactive navigation to purposeful landmark-based navigation. Two robot applications were developed using the repertoire of visual behaviours, namely robot vacuum cleaning and mobile robot soccer playing.
In our research to realise our navigation system we specifically avoided using a model-based approach to robot perception. This contributed to the real-time capabilities and robustness of the system. We showed that a wide variety of useful robot behaviours could be realised without the need for explicit models. In closing, we have demonstrated a robust, real-time, high-performance navigational mobile robot system. However, a known problem in such behaviour-based systems is deadlock between behaviours. For example, a situation can arise when the robot has sighted a goal but the path to the goal is blocked by an obstacle: a conflict will result between the robot's desire to achieve its goal and its need to avoid collisions. In Chapter 6 we present a vision-based solution to this problem.
Chapter 6
Adaptation “. . . , but it is like the quick walking of a man who has taken the wrong road: the more quickly he walks the further he goes astray.” — [Yu-Lan, 1948]
In this chapter we describe a mobile robot system that performs goal-oriented visual navigation using a behaviour-based architecture. Our robot is able to efficiently avoid obstacles in dynamic and unknown environments while trying to reach a goal, and can resolve conflicts between its own internal behaviours while achieving its task. Our principal approach has been twofold: a navigation system must be able to react to dynamic situations in real-time; and a map should be used as a resource, not as a mechanism for navigation, and should be expressible in a qualitative manner. To reiterate, in the previous chapter we described a behaviour-based approach to building visually guided mobile robots; the outcome of this development was several highly robust, responsive systems. Resource-based navigation using Purposive Maps (PM) has been researched by other members of our group [Zelinsky et al., 1995, Zelinsky and Kuniyoshi, 1996]. The PM was used as a resource to coordinate the arbitration of behaviours; using a PM, knowledge can be acquired and the navigation system becomes more robust. However, one of the most common problems in behaviour-based systems is that they tend to suffer from competing behaviours. In this chapter we address this shortcoming. While behaviour-based systems for robotics are currently being chosen over the well-established, traditional AI approaches, no well-
defined theoretical foundation has been presented to support such systems. In this chapter we also present a theoretical analysis of the navigation system that we have developed for autonomous mobile robots. The robot navigates in an unknown environment and avoids obstacles. We will show how the navigation problem can be expressed in a mathematical manner; this allows analysis, and thus improvements can be made more easily. An improvement to our basic algorithm is presented, and we present experimental results of a real robot navigating in a complex environment. In this chapter we also describe an architecture that ties together the various components described in the previous chapters. The chapter begins by reviewing some related work in Section 6.1, where an outline of the problems that must be addressed is presented. We present solutions to these problems in Section 6.2, together with an analysis of the underlying algorithms. Section 6.3 describes how our algorithms are implemented, and additional supporting visual components are described. In Section 6.4 experimental results are provided to verify the correctness of our navigation scheme in unknown environments. Finally, in Section 6.5 a summary of the chapter is presented.
6.1 Related Work
The behaviour-based approach to robotics [Brooks, 1986, Arkin, 1998] has become extensively used as an alternative to traditional AI approaches. Behaviours are normally engineered by trial and error, and complete their tasks without explicit models from which to work. For example, consider what happens when a robot comes upon an obstacle while it is navigating: an unresolvable state may arise from competing move-to-goal and obstacle-avoidance behaviours. Behaviour-based systems can easily get into a deadlocked state between competing behaviours; this is an undesirable effect of these systems. This section will discuss some of the techniques that have been used to solve such problems. The works of [Zelinsky et al., 1995, Zelinsky and Kuniyoshi, 1996] showed that a behaviour-based mobile robot is capable of purposive navigation and can resolve behaviour conflict situations. This work involved a Nomad 200 robot that navigated using dead-reckoning and sonar sensing. The initial experiments using sonar and dead-reckoning sensing implemented navigation behaviours that resulted
in a potential-field like system. Potential field methods have long been used for robot path planning [Khatib, 1986]; further details of potential field methods can be found in [Latombe, 1991]. Essentially, the robot is controlled by two forces, an attractive force and a repulsive force; summing these forces results in a path that leads the robot to its goal. The Attractive Potential is defined as a force that pulls the robot toward the target; it varies proportionally with the robot's distance from the target. The Repulsive Potential is defined as a force that pushes the robot away from obstacles; it varies inversely with the distance from an obstacle. The potential field is a real-time navigation method based on local information and can therefore only work in limited situations. For example, deadlock or local minima may occur between the desire to move toward a goal and the need to avoid obstacles.
Figure 6.1: Types of local minima (a and b: start, target and local minimum)

In potential fields, a local minimum is defined as a point where the sum of the attractive and repulsive potentials is equal to zero. There are two possible types of local minima: minima where the target is blocked, and minima where the target is in free space, as shown in Figure 6.1. Techniques previously used to solve this problem include the temporary relocation of the goal [Adams et al., 1990] and the introduction of random motion [Arkin et al., 1992]. Such techniques are ad hoc and assume static environments. Complete solutions to the local-minima problem based on defining a global numerical potential function with no local minima have been proposed [Koditschek, 1987, Tilove, 1990, Latombe, 1991]; however, these methods assume complete knowledge of the environment. An alternative to these methods, using search techniques in which the robot progressively gains knowledge, was reported in [Zelinsky, 1991]. This method relied on geometric modelling of the environment with a discrete grid. Our approach to robot navigation does not rely on geometric modelling.
However, a systematic search of the environment is needed to escape behaviour conflict situations. The investigation of search techniques provides a superior foundation for deriving methods that can be realised on a physical robot. The requirement that our robots must operate in dynamic environments led us to move away from sonar-based systems to vision-based systems. Although sonar can provide range data cheaply, it doesn’t provide the necessary resolution that is needed to discriminate between moving and stationary obstacles. Also, using sonar it is difficult to estimate the shape of obstacles, which is important for efficient obstacle avoidance. The introduction of vision sensing can overcome these difficulties. From our discussion in the previous chapters, we know that the heavy computational burdens traditionally associated with vision processing can be overcome. And within the context of our problem, vision can also overcome the problem of the reliance on exact coordinates of a goal through the ability to visually servo.
Figure 6.2: Visual servoing to a goal target (snapshots a-d)
6.1.1 Visual servoing
The robot is able to overcome the local-minima navigation problem shown in Figure 6.1a since the Goal Seeking behaviour allows the robot to visually servo when the
goal is visible. Figure 6.2 shows the robot avoiding the two walls on either side while visually servoing toward the goal (a toy on the floor). This is an important feature to incorporate into our system: since our system does not rely on dead-reckoning to detect its location, the goal is only specified by an image template and its relative bearing. This section has outlined the main problems associated with conflicting behaviours in a behaviour-based system. In the case of goal-directed navigation, the conflict resides between the attraction of moving to the goal and the resistance to colliding with obstacles. Much of the early work in this area has focused on solving the problem from a global and static perspective, through the production of potential functions which have no or few local minima. Little work has been done on the development of navigation systems that use local sensor information to perform a systematic search of the environment for a solution to a navigation problem. Our approach performs a physically grounded search for a solution to resolve conflict situations.
6.2 Navigation algorithm and Analysis
In this section we present a scheme for resolving behaviour conflicts in a behaviour-based robot. The scheme resolves the problem in a systematic way. We begin with a theoretical study and analysis of the problem; few attempts have been made to analyse how behaviour-based systems will perform and how behaviours can be improved. Robot navigation algorithms have been formally stated and analysed by researchers in the field [Latombe, 1991] and [Lumelsky and Stepanov, 1987]. This work relies on exact position estimation, making implementation on a physical robot difficult. Only recently has a variant of such algorithms been realised in a real robot system [Taylor and Kriegman, 1998]; the theoretical contribution of Taylor's work is that it describes the conditions and characteristic environments in which their algorithm will not fail. Our study was undertaken to provide an understanding of the performance of our algorithm, in order to allow further improvements to be made. Our analysis is formulated in the context of a static environment in order to simplify the analytical study; the algorithms themselves have been designed to work in both static and
dynamic situations. In summary, we will show that our robot can reliably navigate in static unknown environments: the robot can eventually reach its goal or determine that the goal is unreachable. We will show that our approach is theoretically sound, providing a complete search of the solution space as well as an upper bound on the length of the solution path. Defining the behaviour mathematically also makes it easier to enhance; an enhancement of the basic algorithm is presented in Section 6.2.3.
6.2.1 The Environment
The environment contains two locations, a Start and a Target. The task is to move the robot from the start location to the target location, if a path exists. The environment also contains an arbitrary number of obstacles that can be either convex or concave in shape, and the obstacles may be connected. Our autonomous mobile robot only requires a few basic navigational behaviours, Collision Avoidance, Free Space Move, Goal Seeking and Contour Following, coupled with the PM to implement the goal-based navigation scheme. These basic behaviours are:

Collision Avoidance: this behaviour protects the robot from the possibility of making direct contact with any obstacle.

Free Space Move: this behaviour provides the robot with the ability to move freely within its environment. In effect, this behaviour provides obstacle avoidance of static and dynamic obstacles. The behaviour can also be provided with a motion vector that causes the robot to move in the free space that is closest to the direction of the motion vector. The motion vector is used to guide the robot toward a goal.

Goal Seeking: this behaviour causes the robot to continuously search its visual field of view for goal targets. If a goal is sighted, the robot will visually servo toward the detected target.

Contour Following: this behaviour guides our robot in either a clockwise (CW) or counter-clockwise (CCW) direction around the surface contour of an obstacle, regardless of the shape of the obstacle.
Figure 6.3: Behaviour interconnections (Collision Detection, Free-Space Detection and Goal Detection feeding the Contour Following, Collision Avoidance, Free-Space-Move (Goal vector) and Goal Seeking competence modules; key: precondition, inhibit, sensing input, competence module)
6.2.2 Basic Navigation Algorithm
Our analysis of the basic navigation algorithm shows that an upper limit can be put on the length of the path travelled by the robot. The cases of interest in our study are the worst-case situations: the first is when the robot commences avoiding an obstacle beginning in the middle of the obstacle, as shown in Figure 6.4; the second is when the robot commences avoidance of an obstacle starting at the edge of the obstacle and goes the wrong way while the goal is just around the corner.
Figure 6.4: Basic algorithm (schematic of the search: Free-space-move (Goal Vector), Goal Seeking, and contour-following trials of length 2n, 4n and 8n along an obstacle of contour length D)

The basic algorithm is as follows:

1. Load the Goal templates and set the motion vector to the Goal;
2. Set Direction to CW and Distance¹ to N;
3. Perform the Free Space Move (Goal Vector) behaviour;
4. Perform the Goal Seeking behaviour;
5. Perform the Collision Avoidance behaviour;
6. If the Goal is reached then Stop;
7. If a conflict between behaviours has occurred then Goto Step 8. Otherwise Goto Step 3;
8. Perform the Contour Following (Direction, Distance) behaviour;
9. Flip the value of the Direction (CW → CCW, CCW → CW);
10. If the Direction is set to CW then double the Distance;
11. If the Goal is reachable then Goto Step 3. Otherwise Quit.

A schematic of the algorithm is shown in Figure 6.4.

¹ The reference unit used here is distance, which is convenient for descriptive purposes; the algorithms presented do not depend on the unit being distance. It could also be time or energy; it is the magnitude of the unit that is important.
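The alternating search schedule assumed by the analysis below can be sketched as follows; only the sequence of (direction, distance) trials is shown, with the sensing and contour-following details omitted.

```python
def basic_escape_schedule(n: float, max_trials: int = 8):
    """Generate the contour-following trials of the basic algorithm: alternate
    CW/CCW, doubling the trial distance only when the direction returns to CW
    (i.e. after both directions have been tried at the current distance)."""
    direction, distance = "CW", n
    schedule = []
    for _ in range(max_trials):
        schedule.append((direction, distance))
        direction = "CCW" if direction == "CW" else "CW"   # step 9: flip direction
        if direction == "CW":                              # step 10: double on return to CW
            distance *= 2
    return schedule

if __name__ == "__main__":
    # With n = 1 the trial distances follow the (n, n, 2n, 2n, 4n, 4n, ...) pattern
    # used in the worst-case analysis of Section 6.2.2.
    print(basic_escape_schedule(1.0))
```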
Conflict Detection

Behaviour conflicts occur when two or more behaviours are activated simultaneously. For example, in the context of our problem, a conflict can occur between Collision Avoidance and Goal Seeking: the robot has sighted the goal, but its progress is blocked by an obstacle. A more subtle conflict occurs between Free Space Move (Goal-vector) and Collision Avoidance, which causes the robot to keep revisiting the same location.
Goal Reachability

The idea of testing for goal reachability is to prevent the robot from attempting to solve unresolvable situations. A goal is considered unreachable if the robot returns to the same location where the conflict first occurred, while continually following an obstacle's surface contour. A discussion on goal unreachability is presented in Section 6.2.4.
Analysis: From the description of the algorithm it is clear that most of the execution time is spent escaping the obstacle. This is where our analysis begins. The total length of the path ρ produced by our contour following technique should never exceed the limit of
\[
\rho =
\begin{cases}
\dfrac{5D}{2} - 4n, & m > 0 \\[4pt]
3n, & m = 0
\end{cases}
\qquad \text{where } m = \log_2 \frac{D}{2n}
\tag{6.1}
\]
where D is the total length of the contour of the obstacle that faces the robot, n is the trial contour-following distance before trying to use the Free Space Move (Goal-vector) behaviour, and m is defined as the number of turns required to avoid the obstacle.

Derivation: Consider contour following in either direction, CW or CCW. The worst case is to start from the middle of the obstacle, as the largest number of trials would be required. The maximum distance required to escape the obstacle is D/2 in the last avoidance step. The number of trials prior to the last step can be expressed as a sequence of the form (n, n, 2n, 2n, 4n, 4n, 8n, 8n, ...). This sequence can be expressed as 2^i · n, where i is the number of trials. With both clockwise and counter-clockwise directions being explored, we can rewrite the sequence as 2 × (n, n, 2n, 2n, 4n, 4n, 8n, 8n, ...), i.e. as 4 × 2^i · n, thus forming the series 4n + 8n + 16n + 32n + .... This series can be expressed by Equation (6.2).

\[
d_t = 4 \sum_{i=0}^{m-1} 2^i \cdot n \tag{6.2}
\]
Equation (6.2) represents the total distance travelled, d_t, in m steps before avoiding the obstacle. The mth step cannot exceed D/2. From this series, we can derive the term m using Equations (6.3) and (6.4).

\[
2^m \cdot n \ge \frac{D}{2} \tag{6.3}
\]

\[
\therefore\; m = \log_2 \frac{D}{2n} \tag{6.4}
\]
Using Equation (6.4) we can express the total length of the robot's path ρ as shown in Equation (6.5).

\[
\rho = 4n \sum_{i=0}^{m-1} 2^i + \frac{D}{2} \tag{6.5}
\]

Considering all cases of m, we obtain Equation (6.6).

\[
\rho =
\begin{cases}
4n \displaystyle\sum_{i=0}^{m-1} 2^i + \dfrac{D}{2}, & m > 0 \\[6pt]
3n, & m = 0
\end{cases}
\tag{6.6}
\]

Substituting for the summation in Equation (6.6), we can rewrite this as Equation (6.7):

\[
\rho =
\begin{cases}
4n(2^m - 1) + \dfrac{D}{2}, & m > 0 \\[4pt]
3n, & m = 0
\end{cases}
\tag{6.7}
\]
By substituting Equation (6.4) into Equation (6.7), we obtain the result expressed in Equation (6.1). If m > 0 it means that n must be less than D/2. If m = 0 then ρ = 3n, which means n is greater than or equal to D/2. When this condition holds, the best case would be a path of length n; otherwise, if the robot starts contour following in the direction that is longest, the worst case is 3n. Moreover, if the robot started at the edge of an obstacle and proceeded in the wrong direction while the target was just around the corner, this is equivalent to the second case of interest in our study.
6.2.3 An improved navigation algorithm
One possible way of improving the performance of the navigation algorithm would be to double n each time the same behaviour conflict occurs. The change in strategy is shown in Figure 6.5. From our previous analysis, we can see that by altering the search scheme, substantial improvements can be made. The improved algorithm is as follows:
1. Load Goal templates and set the motion vector to the Goal;
2. Set direction to CW and Distance to N;
3. Perform the Free Space Move (Goal Vector) behaviour;
4. Perform Goal Seeking behaviour;
5. If Goal is reached then Stop;
6. If a conflict between behaviours has occurred then Goto Step 7. Otherwise Goto Step 3;
7. Perform Contour Following (Direction, Distance) behaviour;
8. Flip the value of the direction (CW → CCW, CCW → CW);
9. Double the Distance;
10. If Goal reachable then goto Step 3. Otherwise Quit.
The effects of the algorithm are shown in Figure 6.5.
Figure 6.5: Improved algorithm (schematic of the search with contour-following trials of length n, 2n, 4n and 8n along an obstacle of contour length D)
Analysis: The length of the path ρ produced by this avoidance technique will never exceed the limit defined in Equation (6.8).
\[
\rho =
\begin{cases}
\dfrac{3D}{2} - 2n, & m > 0 \\[4pt]
3n, & m = 0
\end{cases}
\qquad \text{where } m = \log_2 \frac{D}{2n}
\tag{6.8}
\]
where D is the total length of the obstacle contour facing the robot, n is the trial distance for each turn, and m is defined as the number of turns required to avoid the obstacle.

Derivation: Consider contour following in either the clockwise or counter-clockwise direction. The worst case is to start from the middle of the obstacle, as the largest number of trials will be required, and the distance required to escape the obstacle is D/2 in the final step. The obstacle avoidance trials prior to the final step result in a sequence of the form (n, 2n, 4n, 8n, 16n, ...). This sequence can be expressed as 2^i · n. With both clockwise and counter-clockwise directions being explored, we can rewrite this as 2n, 4n, 8n, ..., which can be expressed as 2 × (n + 2n + 4n + 8n + ...). This series can be rewritten as Equation (6.9).

\[
2 \sum_{i=0}^{m-1} 2^i \cdot n \tag{6.9}
\]
Substituting for the summation in Equation (6.9), we can rewrite this as Equation (6.10):

\[
\rho = 2n(2^m - 1) + \frac{D}{2} \tag{6.10}
\]
By substituting Equation (6.4) into Equation (6.10), we obtain the result expressed in Equation (6.8).

Comparing: If we were to compare the two algorithms using Equation (6.1) and Equation (6.8), we can see that:
\[
\frac{3D}{2} - 2n < \frac{5D}{2} - 4n \tag{6.11}
\]
Based on this comparison we can see that the improved algorithm will travel D − 2n units less than the previously proposed algorithm. From this analysis, we see no reason why further improvements could not be gained by increasing the multiplier from two to a higher value on each return to the same conflict point, for example growing the trial distances as a Fibonacci sequence.
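A quick numerical check of the two bounds, assuming m > 0 and illustrative values of D and n (the values below are arbitrary):

```python
import math

def basic_bound(D: float, n: float) -> float:
    """Worst-case path length of the basic algorithm (Equation 6.1, m > 0)."""
    return 5 * D / 2 - 4 * n

def improved_bound(D: float, n: float) -> float:
    """Worst-case path length of the improved algorithm (Equation 6.8, m > 0)."""
    return 3 * D / 2 - 2 * n

if __name__ == "__main__":
    D, n = 16.0, 1.0                     # example obstacle contour length and trial distance
    m = math.log2(D / (2 * n))           # number of turns (Equation 6.4)
    assert m > 0
    print(basic_bound(D, n))             # 36.0
    print(improved_bound(D, n))          # 22.0
    print(basic_bound(D, n) - improved_bound(D, n))   # D - 2n = 14.0
```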
6.2.4 Testing for goal reachability
In order to complete the analytical study of our algorithm, a test for reachability is needed. This test provides the last component to our algorithm and is also used as a tool for the refinement of our algorithm. The reachability test provides a way of preventing the robot from over-searching, if no solution exists. By determining that no solution exists, resources such as battery power will not be wasted. Whenever
Figure 6.6: Reachability (a and b: goal and local minimum, inside and outside a closed region)
the robot encounters a conflict situation, a point m_i is defined for each conflict i = 1, 2, 3, 4, .... To construct the test for goal reachability, whenever a behaviour conflict is detected, a line from the conflict point m_i to the goal point is defined. The line segment between the point m_i and the surface of the obstacle is the reachability line segment. When the robot, trying to avoid an obstacle via continuous contour following, crosses this reachability line segment, the goal target is unreachable. This concept is illustrated in the cases shown in Figure 6.6: a target inside a closed room, and a target outside a closed room. If the robot crosses the reachability line segment then the goal is deemed to be unreachable. Figure 6.7 shows the motion path that the robot would take inside a maze, and how the test for reachability is applied; the robot will encounter other conflict points while trying to escape the maze. Equation (6.12) defines λ, the test for goal reachability. The x, y coordinate values are shown in Figure 6.8. The key requirement of this method is that unique information must be maintained for each conflict point in order for the algorithm to succeed. In the context of a sensor-based system, a way of memorising each conflict point is needed; a vision-based solution is proposed in Section 6.3.2.
Figure 6.7: Maze (path through a maze with the conflict points and the goal marked)
Figure 6.8: Line segments (the points (x1, y1), (x2, y2), (x3, y3) and (x4, y4), the goal and the local minimum)
\[
\lambda = \frac{(x_3 - x_1)(y_3 - y_4) - (y_3 - y_1)(x_3 - x_4)}{(x_2 - x_1)(y_3 - y_4) - (y_2 - y_1)(x_3 - x_4)} \tag{6.12}
\]
If the value of λ satisfies 0 ≤ λ < 1, it means the robot has crossed the line segment (m_i, o_i). If λ = 0 the robot has crossed at the local minimum. If λ = 1 the robot has passed through the obstacle, which can never physically happen, and this could be useful for adjusting sensor errors. Other information can also be obtained from λ: for example, if λ becomes ∞, the robot is moving along a contour which is parallel to the line segment (m_i, o_i), i.e. away from the goal. If λ > 1 the robot's motion is now on the other side of the obstacle and a path to the goal may exist.
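The test can be sketched as a standard parametric segment intersection. The assignment of the coordinate pairs to the reachability segment (m_i, o_i) and to the robot's motion follows our reading of Figure 6.8, and a full segment-segment check that also bounds the parameter along the robot's motion is omitted here for brevity.

```python
def reachability_lambda(m_i, o_i, robot_prev, robot_now):
    """Parameter lambda along the reachability segment m_i -> o_i at its
    intersection with the line through the robot's previous and current
    positions (Equation 6.12).  Returns None when the two lines are parallel,
    the 'lambda -> infinity' case in the text."""
    (x1, y1), (x2, y2) = m_i, o_i
    (x3, y3), (x4, y4) = robot_prev, robot_now
    denom = (x2 - x1) * (y3 - y4) - (y2 - y1) * (x3 - x4)
    if denom == 0.0:
        return None
    num = (x3 - x1) * (y3 - y4) - (y3 - y1) * (x3 - x4)
    return num / denom

def goal_unreachable(m_i, o_i, robot_prev, robot_now) -> bool:
    """0 <= lambda < 1 means the robot's motion has crossed the segment (m_i, o_i)."""
    lam = reachability_lambda(m_i, o_i, robot_prev, robot_now)
    return lam is not None and 0.0 <= lam < 1.0

if __name__ == "__main__":
    # Conflict point at (0, 0), obstacle surface point at (2, 0); the robot
    # moves from (1, -1) to (1, 1), crossing the segment at lambda = 0.5.
    print(reachability_lambda((0, 0), (2, 0), (1, -1), (1, 1)))
    print(goal_unreachable((0, 0), (2, 0), (1, -1), (1, 1)))
```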
6.3 Implementation
In this section we describe our visual obstacle avoidance scheme, how we detect conflicts between behaviours, and a visual contour following behaviour. We stated in the previous section that our algorithm is based on a systematic, physically grounded search of the environment. Using the results of the analytical study, we have chosen to implement the improved algorithm described in Section 6.2.3 in a vision-based context. The importance of a vision-based navigation method is that vision is a rich sensor that allows the use of local information, such as landmarks that are within the robot's field of view. Using vision to identify landmarks and targets relaxes the reliance on knowing the precise Cartesian coordinates of a goal, i.e. global information. For example, if only the direction of a desired goal is specified, and not its exact position, the robot only needs to head toward the goal and wait for the visual cue of the goal being sighted. Therefore accurate dead-reckoning data is not required, and errors in the robot's internal environment model can be compensated for. Our system, like all other behaviour-based systems, fails in situations such as the one shown in Figure 6.9. Figure 6.9a shows the problem from the robot's standpoint, while Figure 6.9b shows an external view. The robot is faced with the dilemma of a conflicting situation: moving towards the toy (the goal) while avoiding the books blocking the way to the toy. In the following sections, we discuss how we resolve the situation shown in Figure 6.9.
a)
b)
Figure 6.9: Conflict between behaviours
6.3 Implementation
6.3.1
131
Behaviour conflict management
The detection of internal conflicts between behaviours is achieved through monitoring all of the robot’s behaviours. The Collision Avoidance and Goal Seeking behaviours must be monitored for situations shown in Figure 6.9. If a conflict between behaviours occurs, the conflicting behaviours are suppressed and the Contour Following behaviour is initiated. Another aspect of the behaviour conflict detector is that it must be able to detect if a conflict has reoccurred at the same place, i.e. the robot return to the same location where it first started contour following. A common approach would be to record the dead-reckoning information of the locomotion system. This relies on the locomotion information to be fairly precise. Instead we store only the relative direction of where the goal was first sighted. In addition we store a signature of the goal image and its relative position in the image.
00000011 00000000 00001000 00001000 00000000 11111111 Figure 6.10: Landmark signature
6.3.2
Landmark Signature
We have developed a scheme to record the location of the conflict between behaviours. The scheme relies on storing additional visual information. In the appearance-based approach the entire image is stored. However, we only store a selected portion of the image where a conflict situation occurred. This is in contrast to previous appearance-based landmark recognition techniques [Horswill, 1994a], and later methods [Matsumoto et al., 1996, Matsumoto et al., 1997], which store the entire input image. In our method, we remove all segments of images that have a high correlation match to the image of clear floor, as discussed in Section 4.3. Removing free space provides two advantages: redundant information is not stored and searched, and after the removal of free space, a unique visual pattern is created, which is used as a Landmark Signature. Figure 6.10 shows such a landmark, its segmentation and its
132
Adaptation
signature. When revisiting occurs, the matching of the signature is simply a disparity match between the stored signature and the current signature generated at the current location.
6.3.3
Landmark matching
The landmark matching process is complimented by a number of operations: use of the Interest Operator, the Goal Seeking behaviour, the stored goal image and the Landmark signature. The matching scheme is as follows, after the system has segmented out the floor space in the image, a correlation is performed on the region returned from the Interest Operator. The correlation matched is performed between the stored image of the goal and the current live video image. Furthermore, the other portions of the image are matched with the corresponding portions of the live image indicated by the landmark signature. The importance of the Goal Seeking behaviour is that it helps to minimise positional errors, since it tends to lead the robot back to positions which are in close proximity to the point of conflict. Due to this intrinsic property of the Goal Seeking behaviour, a more elaborate scheme was not necessary to ensure the system’s success.
6.3.4
Contour Following
The Contour Following visual behaviour is central to obstacle avoidance. It determines a trajectory that drives the robot around the contour of the obstacle. Figure 6.11 shows a sequence of snapshots from the robot’s view while it is executing this behaviour. In this example, the robot spins left to begin clock-wise (CW) contour following. The robot steers to the free space near the obstacle, when it loses sight of the obstacle it spins toward the target.
6.3 Implementation
133
1)
2)
3)
4)
5)
6)
5
6
4
Books
3
2 1 Wall
Figure 6.11: Contour Following
134
Adaptation
The basic algorithm is as follow: 1. To start Contour Following select a direction, either CW or CCW; 2. Find the upper most reachable free space that is closest to the obstacle; 3. Calculate a motion trajectory to reachable free space; 4. If the obstacle is no longer visible, Spin the robot towards where the obstacle was last sighted; 5. If the goal is sighted or the robot has travelled Distance then Stop. Otherwise Goto Step 2.
X
Figure 6.12: Determining the upper most reachable free space (marked by outlined ‘X’) An example of an upper most reachable free space is shown in Figure 6.12. The method we used in determining the highest cell is similar to the technique that was described in Section 5.2.2. Here the situation is somewhat different, we wish to hug the contour of an obstacle. Therefore the system only searches for the highest non-space cell, the system will then check the space next to it, either left or right depending on the direction of search (left, if searching inner to outer left, and vice versa for the right). If the neighbouring cell is a free cell, the system will select it as the target cell to steer to.
6.4
Experimental Results
Experimental results of our system have been used throughout the chapter. In this section we will present some of the outcomes. The initial PM used in our experiment is depicted in Figure 6.13, as it will be demonstrated the parameters within the PM
6.4 Experimental Results
135
are changed according to the acquisition of new knowledge of the environment by the system. Figure 6.14 shows a top camera view of our laboratory with a superimposed path showing the robot’s trajectory. The experiment involved the robot trying to navigate to a selected target. The robot was given a pre-defined template image of the goal, and only its direction bearing in the environment. The goal was setup so that a conflict may arise, a wall of books was placed on the floor, a goal (a toy chicken) was placed behind this wall. The robot was able to successfully negotiate the obstacle and reach the goal. The robot was able to find the goal, irrespective of the start location in the environment, this can be seen in Figures 6.14, 6.15 and 6.16. Several runs of this experiment were conducted, each experiment run was between 2 to 4 minutes (depending on the duration of n). In our presentation so far we have presented n in the context of a distance. It was only used as a convenient means of discussion. It is irrelevant what we use as the referencing unit for navigation. For implementation purposes it depends on the availability of suitable sensors. In our case a timer was used, as it is less prone to error. Conflict detected
ContourFollow (CW) Conflict detected
Start
Clear
Free-SpaceMove (Goal vector)
Goal detected n reached
Goal detected
Goal Seeking (Visual servo)
Goal reached
Stop
n reached
Conflict detected
Goal detected
ContourFollow (CCW)
Key to the Diagram Conflict detected denotes Place
denotes Behaviour
denotes Condition
Figure 6.13: Initial PM showing the possible paths, this is used as a resource to the system in performing navigation In the case of Figure 6.14, n was set to 15 seconds. The robot’s first contour
136
Adaptation
4 Goal
5
3 6
2 1 Start
8 7
Key to the diagram
Contour Following Goal Seeking Free-Space-Move
Path trace of experiment Conflict detected 3
ContourFollow (CW)
Goal detected n reached
Conflict detected 4 1
Start
Clear
Free-SpaceMove (Goal vector)
Goal detected
2
5
8
Goal Seeking (Visual servo)
Goal reached
Stop
Goal reached
Stop
7
n reached
Conflict detected
Goal detected
ContourFollow (CCW)
6
Conflict detected
PM path trace of experiment Conflict detected
ContourFollow (CW) n reached
Start
Clear
Free-SpaceMove (Goal vector)
Goal detected
Goal Seeking (Visual servo)
2n reached
Key to the Diagram
ContourFollow (CCW) denotes Place
denotes Behaviour
denotes Condition
Conflict detected
PM representation of the path taken by the robot Figure 6.14: Path trace showing robot’s trajectory: n = 15 sec, starting in a CW direction
6.4 Experimental Results
137
follows for 15 seconds in the CW direction. After the 15 seconds expired the robot started to move towards the goal again, then the same conflict was detected again. The robot verifies this by matching the stored information of the last behaviour conflict. Since the behaviour conflict matches, the robot starts contour following in the CCW direction (for a timeout period of 30 seconds). This is sufficient time for the robot to move around the obstacle. The Free Space Move and Goal Seeking behaviours ensure the robot reaches its goal. The fact that the robot reached the goal by travelling in the CCW while n was set at 30 seconds was recorded. On the next subsequent mission, the robot starts its contour following in the CCW direction with a duration of 30 seconds. Hence, exploiting the information gained from the previous mission, the system was able to reduce the execution time of the mission in successfully reaching its goal. Snapshots of contour following in this mission, were also presented in Section 6.3.4. A path trace of the second experiment is shown in Figure 6.15. To show the effect of visual servoing, the center columns of the books were removed. The robot moves in the same direction as before, and after detecting the goal, the robot immediately servos toward the goal. Figure 6.16 shows a view from the top of our laboratory with a superimposed path of our robot performing this experiment. As discussed in the Section 6.1.1, the use of vision sensor eliminates the type of problem that produces local minimas, which is caused by the potential fields method. To show that our system is able to handle subtle environmental changes, the robot was displaced with a different starting location for each experiment. As it was demonstrated in this section, the robot is able to cope and adapt to such changes. The success of these experiments are due to the qualitative way in which navigation was performed, that is no precise representation was used. It is important to note the various starting locations were selected manually, by returning the robot to a new start location at the beginning of each experiment. i.e. the robot did not autonomously return to its start position.
138
Adaptation
Goal
3
2
Start
1
5 4
Key to the diagram
Contour Following Goal Seeking Free-Space-Move
Path trace of experiment Conflict detected
ContourFollow (CW)
Goal detected n reached
Conflict detected 1
Start
Clear
Free-SpaceMove (Goal vector)
Goal detected
2
5
Goal Seeking (Visual servo)
Goal reached
Stop
Goal reached
Stop
4
n reached
Conflict detected
Goal detected
ContourFollow (CCW)
3
Conflict detected
PM path trace of experiment Start
Clear
Free-SpaceMove (Goal vector)
Goal detected
Goal Seeking (Visual servo)
n reached Key to the Diagram
ContourFollow (CCW) denotes Place
denotes Behaviour
denotes Condition
Conflict detected
PM representation of the path taken by the robot Figure 6.15: Path trace showing robot’s trajectory: n = 30 sec, starting in a CCW direction
6.4 Experimental Results
139
Start
1 Goal
2
Key to the diagram
Contour Following Goal Seeking Free-Space-Move
Path trace of experiment Conflict detected
ContourFollow (CW) Conflict detected 1
Start
Clear
Free-SpaceMove (Goal vector)
Goal detected
Goal detected n reached
2
Goal Seeking (Visual servo)
Goal reached
Stop
n reached
Conflict detected
Goal detected
ContourFollow (CCW)
Conflict detected
PM path trace of experiment Start
Clear
Free-SpaceMove (Goal vector)
Goal detected
Goal Seeking (Visual servo)
Goal reached
Stop
Key to the Diagram
denotes Place
denotes Behaviour
denotes Condition
PM representation of the path taken by the robot Figure 6.16: Visual servoing: Path trace showing robot’s trajectory
140
6.5
Adaptation
Summary
In Chapters 4 and 5 we described how a Behaviour-based mobile robot system could react to environmental changes with high performance and robustness. The problem of behavioural conflicts was not addressed. Thus, provided the motivation for the research work presented in this chapter. This chapter presented an analytical study to our solution of resolving the conflicts between behaviours. The analysis allow us to express mathematically the performance of the system, and thus, provided a means in which improvements to the algorithm could be made. The study provided a sounded idea that was incorporated in our final implementation of the algorithm. This chapter demonstrated a behaviour-based mobile robot which can competently navigated in a goal-oriented manner. The system is an extension of the behaviourbased architecture we presented in Chapters 4 and 5. A mechanism for detecting the conflict between competing behaviours was added to the architecture. To address the purposeful navigation in problem situations a novel search strategy, which involved developing a vision-based contour following behaviour was developed. An efficient and robust way of learning relevant features of the environment, namely Landmark Signature, was presented. The Landmark signature provided at a local record of where a conflict had occurred, and allowed for quick and robust matching when the landmark was revisited. Experimental results showed that our system was able to competently navigate and solve problems of conflict between its internal behaviours. Our robot could resolve the conflict between Collision Avoidance and Goal Seeking, the conflict in the robot’s desire to achieve its goal while remaining safe. Our robot was able to travel to a goal that was blocked without external intervention. It is also able to learn an efficient path which was recorded in the PM. This knowledge could be used as a resource in subsequent robot missions. The development of resource based navigation schemes is a key and essential component for autonomous mobile robots to be realised. In the next chapter we will present work that seeks to extend and maximise a robot’s autonomy through the human-robot interaction.
Chapter 7
Supervised Autonomy “What’s in a name? that which we call a rose By any other name would smell as sweet.” — [Shakespeare, 1595]
In this chapter we propose a paradigm for human-robot interaction called supervised autonomy in the context of robot teleoperation system. Supervised Autonomy is made up of five major components: “self-preservation”; “instructive feedback”; “qualitative instructions”; “qualitative explanations”; and an “user interface”. A qualitative approach has been taken in the design of each of the components. Supervised Autonomy was derived as a logical extension of our earlier work, in the development of autonomous mobile robot. In our extended system we provide a mobile robot with autonomy while allowing it to be teleoperated. Our approach can be utilised to combat the continuous closed-loop problem, latency and user stress. The general concept of our system is to incorporate supervisory control with a qualitative approach for the control of robots. Supervisory control, removes relying on a human operator to perform all the basic functions of perception and action in a system. The approach we have taken shifts these basic functions to the physical robot agent; integrated with a set of qualitative instructions, in combination with a high-level graphical user interface, together with additional visual behaviours. The basic concept and its objectives of supervised autonomy are presented in Section 7.1. A description of the components of our system are presented in Section 7.2. An implementation to supervise a mobile robot is presented in Section 7.3. The results of a supervised autonomy mission are presented in Section 7.4. Section 7.5 ends 141
142
Supervised Autonomy
the chapter with a summary of the research contributions of this chapter.
7.1
Related Work
Teleoperated robot systems have been a major area of interest in robotics research. Much of the research in teleoperation was initiated during the nuclear reactor era of the 1960’s [Corliss, 1972]. The high level of interest were due to the large number of applications that were typically performed in hazardous environments that were too dangerous for humans. A recent example of a robot operating in a hazardous environment was performed by the Dante II robot, in their exploration of the volcanic mountain, Mt. Spurr [Apostolopoulos and Bares, 1995, Wettergreen et al., 1995]. Sheridan’s book describes many other examples, includes toxic waste cleanup, military operations and mining [Sheridan, 1992]. Our interest in telerobotics is focused on providing an autonomous navigation system for mobile robots that can accept high level instructions from human operators. Our work is inspired by the problems of continuous close-loop control that are associated with most teleoperational systems. Typically these systems are of the direct control style that require constant monitoring by an operator of the robot’s actions while the operator is sending a stream of commands to the robot. A side-effect of constant monitoring the robot’s state and controlling the robot by the operator is the resulting burden of not just the overall mission but also the general safety of the robot. This produces an effect that [Arkin and Ali, 1994] refer to as “cognitive overload”. We share the same belief with Arkin and Ali that by removing the burden of constant monitoring and commanding helps reduce the operator’s cognitive load. Some common strategies for relieving this load include providing a set of preprogrammed routines [Corliss, 1972]. Other approaches use “planning by trying it out on a computer simulation first” [Sheridan, 1992]. Most of the reported work in simulation relies strongly on the use of environmental models that must be sufficiently well understood before any pre-set routines can be derived and programmed. By taking the simulation first strategy, the solution to the problem relies completely on the accuracy of the modelling of the real world. Such strategies lack the ability to attend to unforeseen situations that occur in real dynamic worlds. This may be possible for a robot manipulator working in a desktop environment. In the context of mobile robot situations, this is an unrealistic assumption.
7.2 The Components
143
An interesting approach was recently reported in the Rocky 7 Mars Rover research project. The robot Rocky 7 performs its navigation by traversing a path of way-points selected from a 2D image by a human operator [Volpe et al., 1996]. This approach removes the need for a complex path planner and allows the robot to use a simple algorithm to control movement between the way-points. However, while this approach is suitable for the Martian environment it can not work in dynamic worlds. For example, after the operator has selected a way-point for the robot to move towards, a dynamic obstacle could move to block the direct path or the way-point itself could move!
7.2
The Components
Our system consists of five major components; self-preservation, instructive feedback, qualitative instructions, qualitative explanation and an user interface. We now explain each component.
Self-preservation One approach to teleoperation is to shift the general low level survival skills, such as collision avoidance onto the robot [Arkin and Ali, 1994]. We believe this approach is a key characteristic that must be incorporated into any design of a supervised autonomous system. In the previous chapter we presented a vision-based mobile robot system that is capable of self-preservation. Drawing from our previous work, this feature has been incorporated into our teleoperation system. By providing the ability of self-preservation, the latency problem associated with direct teleoperation control is eliminated.
Qualitative Instructions A further attribute that must be considered is the nature of instructions used by the operator to command the robot. [Shibata et al., 1994] reported a system that used high level instructions to command a robot scooter to “Go toward” a given point in an image. In our system we have extended this concept further by providing additional qualitative instructions. We have implemented “Go forward”, “Go toward”, “Go
144
Supervised Autonomy
between” and “Keep to” instructions. These qualitative instructions are described in detail in Section 7.3.2.
Instructive Feedback An important issue in teleoperation is providing suitable feedback to the human operator. In our system we have chosen visual feedback. The exact same visual images are shared by the operator and the robot’s vision processing system. Our decision is based on the assumption that by sharing the same field of view with the robot more reasonable decisions will be taken by an operator. Our robot can also provide the operator with suggestions, such as which landmarks that are visible in its current view are suitable for navigation purpose i.e. tracking. By sharing the same view the operator can also verify if the robot’s choice(s) are sensible. We refer to this as instructive feedback. A suggestive command scheme and a direct feedback mechanism have been incorporated as part of our user interface, and are presented in Sections 7.3.3 and 7.3.5. We show that two levels of feedback should be provided to the user, a high level, through which the robot gives suggestions and a lower level using direct video feedback. At the lowest level of feedback that is provided to an operator, the images that are transmitted to the operator do not necessarily have to be live, as the communications bandwidth available to the human-robot interface could be quite low. Images at the end or the beginning of a command sequence could be sent back to the operator. By catering for latency in the transmission of the video images allows robots to work at distant teleoperation sites.
Qualitative Explanation Removing a system’s reliance on direct feedback, gives rise to a mechanism that can maintain a record of events that occurred during mission execution. We have chosen Purposive Maps (PM) [Zelinsky et al., 1995, Zelinsky and Kuniyoshi, 1996] to maintain a log of events, described in Chapter 2. The PM is used to provide a qualitative explanation of the events that occur during the execution of high level qualitative instructions. For example a person could be given the task of “making a cup of coffee”. This might involve the following actions, picking up a cup, walking down the corridor, avoiding collisions with people, opening a door, approaching the
7.3 Implementation
145
coffee machine, pouring a cup of coffee etc. These actions can be represented as a sequential list of actions within a PM, which can be executed in the same manner that was discussed in Chapter 6. Using a PM the robot is able to communicate to the user, how well the robot is progressing in the execution of a task, and what behaviours within the robot were triggered during task execution e.g. collision avoidance. By maintaining a record of events within a PM, we have the ability to automatically re-execute a user’s task instruction. An implementation of this scheme is presented in Section 7.3.4. The added advantage, as discussed in adopting the PM representation, is that the qualitative information of the mission collected by the robot can be re-executed.
User Interface It is asserted that the design of the user interface in teleoperational systems is important to the overall efficient operation and functioning of the system. A bad design can lead to low productivity and poor operator acceptance of the system [Shneiderman, 1992]. Care must be taken in the overall design of the teleoperational system. A discussion of implementation issues is presented in Section 7.3.5. As the prime focus of our work has been toward the research of mobile robot systems, a user study of our user interface has not been fully explored. A full evaluation of the usability of our user interface is beyond the scope of this thesis.
7.3
Implementation
In this section we present the physical details of the robot system in which we have implemented supervised autonomy. Our system consists of four major hardware components; a vision processing system; a communication server; an user console; and a mobile robot. The logical relationship of these components is shown in Figure 7.1. The vision system, communication server and the mobile robot were previously described in Chapter 3. The user console we are using is implemented on a Apple Power MacintoshTM computer, equipped with an AV frame grabber, capable of accepting NTSC video signals. Additional input devices have been incorporated onto the console, such as a unidirec-
146
Supervised Autonomy
Video Receiver
Video
Touch screen
Video
Video
Video Module Tracking Module
Mobile Robot VME Bus
MVME162 Embedded Controller Ethernet Transceiver
Ethernet
Ethernet
Radio SUN Workstation
Radio Modem
Figure 7.1: Teleoperation system configuration tional microphone for high quality sound input for speech processing, and also a touch screen. The touch screen replaces all the operations that are normally performed by a mouse. Details concerning the user console are presented in Section 7.3.5.
7.3.1
Communication – Flow of Commands
There are four distributed processes, each process is responsible for all command flow and event notification between all the processes. The commands can be categorises into three types: direct robot commands, qualitative instructions and operator notification. An example of a direct robot command is shown in Figure 7.2. Here the operator
7.3 Implementation
147
operator console
STOP!
STOP!
communication server
vision server STOP! robot Figure 7.2: Direct robot command has decided to abandon the current mission. A “STOP!” command is issued, this command is sent to the robot and a notification is sent to the vision system.
operator console START! communication server
vision server command
command robot Figure 7.3: Qualitative Instructions In the case of qualitative instructions, as shown in Figure 7.3, the operator performs the functions of the high level construction of a task, these instructions are then sent to the vision server for execution. At this point the execution of commands sent to the robot is done by the visual navigation behaviours. For example, in the case of moving to a target the operator performs target selection before sending a command to the robot that requires it to perform visual servoing. An example of visual servoing is shown in Section 7.3.2. With operator notification, it is important to gain the operator’s attention once the execution of a task can no longer be continued, for any reason. The command flow of system events is shown in Figure 7.4. A typical situation
148
Supervised Autonomy
operator console "Can't move" communication server
vision server STOP!
STOP! robot Figure 7.4: Operator Notification is shown in Figure 7.5, here the dialogue between the system and the operator is as follows. At a) the robot was instructed to “Go Forward” (towards the wheelchair), b) the robot moves forward until collision avoidance, c) shows the robot in “Stop!” mode. At this point the vision system commands the robot to stop, then notifies the operator that it can not move further and waits for the operator’s next instruction.
a)
b)
c)
Figure 7.5: Go Forward – Qualitative Behaviour
7.3.2
Qualitative instructions
To reduce the level of complexity in our system we provide four qualitative instructions. They are: Go Forward, Go Toward, Go Between and Keep To. Each of these instructions is inspired by basic qualitative notions of “Keep going forward until you can’t...”, “Go toward this place...”, “Go between these two landmarks..” and “Keep to this wall...”. Each of these qualitative instructions and their associative visual behaviours are discussed individually in the reminder of this section.
7.3 Implementation
149
Go Forward This instructs the robot to start moving ahead in its current direction until the Collision Avoidance behaviour described in Chapter 5 halts the robot and notifies the operator of this event. An example is shown in Figure 7.5. Each notification is recorded in a PM. The usage of the PM was discussed in Section 7.3.4.
Figure 7.6: Go Toward – Qualitative Behaviour
Go Toward The “Go Toward” qualitative instruction is used to instruct the robot to move toward a given target, that has been selected by the operator. Various effects can be achieved by using this behaviour, such as visual servoing, following and posing. Visual servoing can be simply achieved through the operator selecting a static object and the robot moves toward the selected target. A side effect of visually servoing to an target, is that if the target were to move the robot will follow it. Posing can be achieved by setting the minimum distance with which a servoed target can be approached. Figure 7.6 shows an example of the “Go Toward” instruction. In this case the user selected the toy car as a landmark. The robot servos on this target. If the car moves as shown in the example, the robot follows it autonomously. If the robot loses sight of the landmark, the robot stops and notifies the operator.
150
Supervised Autonomy
1)
2)
3)
3
Door
2 1
Wall
Figure 7.7: Go Between – Qualitative Behaviour
7.3 Implementation
151
Go Between The “Go Between” instruction allows the operator to select two horizontal template images that are used as reference landmarks. The robot is instructed to travel between the two reference landmarks. The motion trajectory is calculated by determining the average floor position of these two landmarks. Figure 7.7 shows the robot being instructed to pass through a doorway using the “Go Between” command. When travelling between the two reference landmarks the vision system tracks the two image’s positions. If the vision system loses track of any one of these images, or if the positioning error of the images become significantly large, a notification is issued back to the operator and the robot stops. This behaviour is useful for commanding the robot to move through tight spaces such as doorways, or narrow corridors.
x2 − x1 2 y2 − y1 yt = y1 + 2
xt = x1 +
(7.1) (7.2)
where x1 and y1 are the x and y coordinates of the first template x2 and y2 are the x and y coordinates of the second template The robot’s motion trajectory is determined by the steering to the position between the two landmarks. This position is calculated using Equations (7.1) and (7.2).
Keep To The “Keep To” qualitative instruction allows the operator to specify two diagonally adjacent template images. This selection determines the trajectory along which the robot will travel. For example if the skirting of a wall is used, the robot follows the wall until it loses sight of the wall or if tracking of the reference templates fails. The calculation of the robot’s trajectory corresponds to the angular relationship between the two diagonal templates. Figure 7.8 shows an example of the “Keep To” behaviour.
152
Supervised Autonomy
1)
2)
3)
3
2
1
Figure 7.8: Keep To
7.3 Implementation
153
θ = tan−1 (
x2 − x1 ) y2 − y1
(7.3)
where x1 and y1 are the x and y coordinate of the first template in the image x2 and y2 are the x and y coordinate of the second template in the image Using the first template, which is the physically closest to the robot, the robot servos until it is a distance d from the template. This stage of processing is similar to the “Go Toward” behaviour. Once the robot has reach the desired distance from the temporary goal, then the robot’s alignment is calculated using Equation (7.3). The angle θ determines the zero alignment for the robot to track the reference template. The computation of distance to objects is as described in Section 4.9.
7.3.3
Making a Suggestion
An additional feature that we have incorporated into our system is to allow the robot to suggest to the operator a landmark(s) that are interesting. This command instructs the vision system to determine from its current view a distinctive feature that it perceives to be a good candidate as a landmark for tracking purposes. Such a command is important because the perception capabilities of the operator are far more sophisticated than those of the robot. Therefore by allowing the robot’s own sensors to make the appropriate suggestions ensures that the most effective landmarks are selected. In Chapter 4, we described an Interest-Operator that was used to identify potential goal targets. Figure 7.9 shows the effect of the Interest-operator. The unknown toy car is easily identified as an interesting landmark feature. Once the landmark has been located visual servoing can be performed using this distinctive feature, for example, by using the “Go Toward” instruction.
7.3.4
Purposive Map Making
To be useful robots should be able to repeat taught tasks. Therefore when a robot is instructed to perform a particular task, the execution of the task must be recorded and then presented back to the operator as the mission evolves. This ensures that the operator is well informed of current and past events that have occurred during the course of the mission. The visual presentation of the user interface to the operator
154
Supervised Autonomy
70000 60000 50000 40000 30000 20000 10000 0
a)
b)
Figure 7.9: Finding unknown landmark targets should identify which behaviours were activated during the mission and also in what situation the activation occurred. In our implementation we use Purposive Maps (PM) to provide a representation of this information. A key property of PMs is the ability to express events and locations qualitatively. Other advantages of using PMs are that a general representation can be learnt from an execution of a mission. Later, the mission can be replayed. This feature further reduces the “cognitive overload” of the human user. An example of the used PM in our experimental work is presented in Section 7.4.
7.3.5
Teleoperation console
The complexity of the user console has been kept to a minimum, a screen layout of the system is shown in Figure 7.10. A video display from the robot’s video camera forms part of the interface, this is displayed within the main console window. This video display allows the user and robot to share a common perception medium, this promotes the notion of instructive feedback. A message log window is also provided, it displays any errors and messages from any of the servers. The window below the main console is the Qualitative explanation window, which displays a PM as it is being constructed. The main controls within the main console window are shown along on the bottom of the video display.
7.3 Implementation
155
Figure 7.10: Console and Go Toward
156
Supervised Autonomy
We will explain the main controls:
This icon instructs the robot to begin executing a set of qualitative instructions. This icon commands the robot to abort the current actions it is executing.
This icon commands the robot to start spinning right.
This icon commands the robot to start spinning left.
The spin icons are used to give the operator a pan view of the scene when deciding which visual behaviour to select for navigation from the Qualitative instructions menu. Our user interface takes advantage of the Macintosh look and feel. The touch screen provides all of the functionality that can be accessed by the mouse. Thus, an operator may instruct the robot to “STOP” simply by placing his/her finger over the stop button.
“Keep To”
“Go Between”
Figure 7.11: Selection of landmarks Qualitative instructions are constructed by selecting from a list of menu items that are located on the right side of the main console. This ensures only one instruction is executed at a time. Beneath the menu is a status indicator which give a report about the communication status between each server. The user selects a qualitative instruction from the touch-screen menu and then selects the visual landmarks that
7.4 A Supervised Autonomy experiment
157
are to be associated with this instruction, by pointing at the relevant parts in the displayed image. For example after selecting the Go Toward instruction, the user selects a landmark in the image to move towards. In the case of the “Keep To” and “Go Between” instructions, the user selects two landmarks by touch the screen at the appropriate places in the image, as shown in Figure 7.11. Other researchers have also implemented similar user interfaces for various command systems. [Sekimoto et al., 1996], uses a touch panel to control a robot’s motion to a target position. The robot has no autonomy and would drive into a wall if commanded to do so. The system that reported by [Shibata et al., 1994] also uses a touch screen for selecting an image of an object to construct a “Go toward” instruction. However, this system would also collide with obstacles if commanded to do so. These robots do not possess autonomous self-preservation capabilities.
4m
5m 1 2
7 3
6 4
5
Figure 7.12: Mission
7.4
A Supervised Autonomy experiment
Figure 7.12 shows an example of a mission in an indoor office environment(our labo1 the robot was instructed to “Spin” right, at 2 to “Go Forward”, at ratory). At
3 to “Go Between” and out to the corridor. Next at 4 to “Keep To” the skirting
5 “Spin” left, 6 to “Go Between” and through the door to boards of the wall, then
7 to “Go Toward” the target on the floor. Figure 7.13 another room. And finally at
show snapshots of this mission as the robot travels through each of these instruction.
158
Supervised Autonomy
1)
2)
3)
4)
5)
6)
7) Figure 7.13: Snapshots of a Supervised Autonomy mission
7.5 Summary
159
All landmarks for each command in this mission were selected by the operator via using our teleoperation console. As described in Section 7.3.5.
User select
Start
"Spin right"
"Go Forward"
Clear
Two target exists
"Go Between"
Can see target
"Go To"
Target in place
"Keep to"
User select
"Spin left"
Two target exist
"Go Between"
Stop
Key to the Diagram
denotes Place
denotes Behaviour
denotes Condition
Figure 7.14: PM of Mission A PM of this mission is shown in Figure 7.14. This figure shows places and behaviours, without a geometric representation of the world. It shows that at each location there is a distinct user instruction (for example at the beginning between the Start and the Spin commands). The mission was successfully automatically repeated with the same PM, thus showing the added advantage of using a PM data structure. However, the mission could only be repeated through manually returning the robot back to its initial position. This limitation of the system has also been raised in the previous chapter, Chapter 6. Through the latter stages of this thesis, we have elected not to pursuit finding a solution to this problem, rather leaving it as a suggestion for future work. The robot could learn or be taught another PM for the return journey.
7.5
Summary
We proposed the concept of “Supervised Autonomy”, a new paradigm for human-robot interaction. This paradigm provided a general framework for developing interactive
160
Supervised Autonomy
human-robot systems. We presented the system in a teleoperational context. We showed that our approach can combat some of the existing problems in mobile robot teleoperation. Specifically, the “cognitive overload” and the latency problems. The paradigm is aimed at relieving the stress loads of an operator by shifting many of the low and intermediate level controls to the robot. As part of the system we have incorporated instructive behaviours, these behaviours form high level “Qualitative Instructions”. Our approach serves as an example in constructing systems without the need for full operational controls by an operator to accomplish a task. In our experimental setup we demonstrated the overall approach to the development of human-robot interfaces, an operator was able to navigate a remote robot from one location to another with just a handful of simple qualitatively designed instruction. In turn, our approach to development has provided an easier and more useful system. The key attribute to the success of our system was primarily due to the usage of a qualitative representation in which events were recorded, Purposive Map. Purposive Map provided two essential elements to our system, the ability to explain to a user — in a natural manner — what has occurred during a mission. secondly, it provided a possible way in which missions can be reproduced.
Chapter 8
Conclusion “This is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning.” — [Churchill, 1942]
If robots are to play a significant role in our society, the capability of navigation must be provided. Particular attention needs to be given to the ability of the robot to handle changes in the environment while accomplishing its goal. The well-established modelbased approaches to mobile robot have shown poor performances in response to the interaction with the environment — especially to dynamically changing situations. Drawing from these experiences our system was designed to have a high degree of responsiveness, while catering to changing environments. The requirement that the robot be able to react to changes in environmental conditions led to the production of robust and reliable components that can deal with the uncertainty of the world. In this thesis, we developed visual primitives that were designed especially for mobile robot navigation, which were capable of dealing with changes in the environment. These visual primitives served as a foundation for the realisation of behaviours for robot navigation. This thesis showed that a basic set of behaviours can be used as building blocks to progressively solve significant problems in autonomous mobile robot navigation. Specifically, obstacle avoidance in both static and dynamic environments, landmarkbased navigation, foraging and goal-based navigation, while resolving conflicting situations. In terms of systems, the competence of our basic set of behaviours has been demonstrated in a number of systems: a soccer playing robot, in a co-operative robot 161
162
Conclusion
cleaning task and teleoperation of a mobile robot. The success of our system can be attributed to a number of factors. These being, the careful selection of basic well-defined behaviours, in combination with the modularisation of high performance vision processing components, arranged in a careful and manageable manner. The rapid processing of the vision data was the most important element that enriched the performance of our system. This allowed our system to handle significant changes in its environment, in a manner not possible with the other sensors, such as sonar or infrared devices. The importance of the organisation of behaviours, into carefully designed detectors and processes, provided the flexibility required for the tailoring of our system for other purposes. The point we wish to stress here is that, it is more suitable to design the system bottom up. This approach allows more general and flexible behaviours to be developed readily, hence providing reusable components. Our approach has provided an adaptable strategy which allows further arrangements of behaviours, for different tasks we believe this can be accomplished in an uncomplicated manner. This basic principle was appropriately applied throughout the design and development of our system. With regard to solving higher level problems in mobile robot navigation, we focused on resolving the problem of conflicting behaviours, in a behaviour based robot. This problem is caused by the interaction of a robot’s behaviours with the environment. By solving this problem we provided our robot with the ability to achieve autonomously, goal-oriented navigation. Our approach to this problem, in the context of navigation, was done using systematic search patterns in escaping conflicting states once the situations had been detected. By monitoring for conflicts, as well as the capability to detect reoccurring identical conflicts, we were able to complete a goal-based navigation system. Once we built a system with a sufficient level of autonomy, we examined the problem of human-robot interaction in the context of qualitative navigation. While retaining the established framework of behaviours additional components were introduced to produce a teleoperation system for a mobile robot. In the context of teleoperation, a paradigm, “Supervised Autonomy” was proposed to facilitate human-robot
8.1 Further work
163
interaction. This paradigm consists of five basic components: “self-preservation”; “instructive feedback”; “qualitative instructions”; “qualitative explanations”; and a “user interface”. These components together improved the way in which human and robot interaction is done. Importantly, we remove the reliance on a human to provide constant command instructions at every step. This was done by exploiting the autonomy of the robot. Using our paradigm we were able to teleoperate our mobile robot in a challenging mission that only required a handful of qualitative instructions while using only a simple user interface. The cognitive stress on the operator was alleviated, as the user only had to supervise the mission. The onboard intelligence in the robot prevented the robot from damaging itself, in the case of erroneous operator commands. In summary, we have succeeded in accomplishing the following:
• provided a robot with the ability to navigate in unmodified, unknown, dynamic indoor environments relying solely on vision, thus producing an “Autonomous Mobile Robot” system (Chapter 5). • extending this further, we succeeded in providing a mobile robot with sufficient autonomy in achieving goal-oriented navigation, even in conflicting and complex situations (Chapter 6). • equipped with its own autonomy, the robot was able to achieve qualitative navigation using the supervised-autonomy paradigm, while taking into consideration human-robot interaction (Chapter 7). • the outcomes of our effort was shown to be transferable across a number of domains, specifically: landmark-based navigation (Chapter 5), soccer playing robot (Chapter 5), cooperative cleaning robot (Chapter 5), and teleoperation of mobile robot (Chapter 7).
8.1
Further work
With the advancement in vision technology today, looking back, the vision system we started with, now seems quite primitive. Today’s systems are faster and cheaper while also providing added functionality. The potential for the use of vision as an alternative to other sensing devices is far more attractive now than ever before.
164
Conclusion
The current vision processor is large and restricted in its way of processing. Due to the size, the processor remained offboard, thus introducing reliability problems in the system. At times the system come to a halt due to a constant interference of video transmission, even though our system can handle a short duration of interference. Constant noise interference still needs to be addressed, this problem can be completely eradicated by moving the processor onboard. As stated, the current vision processor is restrictive in its way of processing, i.e. Template matching by “Sum of Absolute Differences” correlation. With the current advancement in processing, other more sophisticated vision algorithms can be used without lose in performance, for example, normalised based correlation. An extension to our system would be to increase the level of sophistication in the ability to recognise landmarks. For example, by using colour information, embedding colour could solve some of the problems associated with the reliability of recognition of landmarks, when a larger number of unique possibilities are available, thus producing a more robust and diverse recognition system. In addition, other vision sensors have now been made possible, such as stereo vision, active vision and panoramic vision sensing. These sensors can improve the current system in a number of ways, for example panoramic vision can be used to improve the overall visibility for a user in the task of teleoperation, providing horizontally, a 360◦ view of the environment. By adopting the combination of active and stereo vision, it would allow a better and more precise measurement of depth, in a wider field of view, not just in the ground plane (as it is in this thesis). Furthermore, these sensors could be used for the detection of landmarks above the ground plane. In this case, the depth information could be used for the recognition of landmarks. For example, associating depth with a number of templates of varying sizes, thus further improving the overall landmarks recognition of the system. Further, embedding an additional sensor to address the deficiency of vision sensing such as heading/bearing sensors should be considered. These sensors can provide a global sense to the robot, such as the bearing of the desired goal. Despite its significant power, vision is still only a local sensor. In a fully integrated system, it is desirable to be able to also obtain global information. The additional sensing required will depend on the targeted application domain in which the robot performs its task. The issue of autonomous navigation requires further research consideration, in the
8.1 Further work
165
area of strategies for detecting and resolving more general and complex situations. The problem addressed in Chapter 6, the type of behavioural conflict was explicitly stated by the developer, as to which is the possible conflict. Investigation in providing a system which can detect and resolve conflict, based on learning of temporal patterns may prove to be fruitful. The study of a better graphical user interface would be a welcome improvement to the console used in “Supervised Autonomy”. A full study of the computer-user interface is suggested. Consideration to providing other qualitative instructions to the user, while at the same time providing the user with a more natural manner to which they can achieve their task more effectively is desirable. Lastly, as discussed in Chapter 7 the limitation of our system to return home after performing a mission has been raised. We believe this limitation should be explored as further research, one way could involve human assistance. Initially, additional information needs to be collected in the return direction. In the case of vision, key images of the return view need to be taken. These images could be selected by a user during the original mission, with an additional camera mounted in the opposite direction. The robot is taught a PM of the return journey at the same time as performing the original mission. The robot would then use this PM on its return trip, with assistance from the user when unresolvable problems occur. As discussed, exploring further possibilities of new methods in the deriving new qualitative instructions for the opposite view, need to be investigated carefully in order to maintain the effectiveness of the overall system.
Appendix A
Experimental data A.1
Lens distortion
E(x, y) =
E(x, y) =
0.50 0.51 0.53 0.55 0.57 0.59
e(0,0) e(0,1) . . . e(0,j−1) .. .. .. . . . .. .. .. . . . e(i−1,0) . . . . . . e(i−1,j−1)
0.60 0.61 0.63 0.65 0.67 0.69
0.70 0.71 0.73 0.75 0.77 0.79
0.89 0.89 0.89 0.89 0.89 0.89
0.70 0.71 0.73 0.75 0.77 0.79
0.60 0.61 0.63 0.65 0.67 0.69
0.50 0.51 0.53 0.55 0.57 0.59
(A.1)
0.40 0.41 0.43 0.45 0.47 0.49
(A.2)
The error array shown above is computed using a small size colour camera, Model TCR-50 Colour CCD Board Camera, made by Technoame, JAPAN. The camera has a focal length of f 3.8 and pixel array size of 514 × 490.
167
168
A.2
Experimental data
Teleoperation Console – User Interface
The following provide a description of the settings that are available as part of the user interface which was described in Section 7.3.5.
A.2.1
Communication settings
In this option the user is provided with options of selecting the machines which are to servce. These options are only provided as a convenient, typically these are not changed.
Figure A.1: Communication settings
A.2 Teleoperation Console – User Interface
A.2.2
169
Robot settings
The robot settings dialog provide a means to initialise the robot with default settings.
Figure A.2: Robot settings
170
A.2.3
Experimental data
Console panel settings
This dialog was established to allow the user to set the parameters of the controls provided in the main console panel.
Figure A.3: Panel settings
Bibliography [Adams et al., 1990] Adams, M. D., Hu, H., and Probert, P. J. (1990). Towards a Real-Time Architecture for Obstacle Avoidance and Path Planning in Mobile Robots. In Proceedings of IEEE Robotics and Automation Conference, pages 584– 589. [Agre and Chapman, 1990] Agre, P. E. and Chapman, D. (1990). What are Plans for? Robotics and Autonomous Systems, 6:17–34. [Anderson and Donath, 1990] Anderson, T. L. and Donath, M. (1990). Animal Behaviour as a Paradigm for Developing Robot Autonomy. Robotics and Autonomous Systems, 6:146–168. [Anonymous, 1585] Anonymous (1585). The Oxford Minidictionary of Quotations. Oxford University Press. A common saying in the sixteenth century, used in Giordano Bruno, Italy (1585). [Apostolopoulos and Bares, 1995] Apostolopoulos, D. and Bares, J. (1995). Configuration of a Robust Rappelling Robot. In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems. IROS’95. [Arkin, 1987] Arkin, R. C. (1987). Motor Schema-Based Navigation for a Mobile Robot: An approach to Programming by Behavior. In Proceedings of IEEE Robotics and Automation Conference. [Arkin, 1989] Arkin, R. C. (1989). Towards the Unification of Navigational Planning and Reactive Control. In AAAI Spring Symposium on Robot Navigation, pages 1–5. Workshop Notes. [Arkin, 1990] Arkin, R. C. (1990). Integrating Behavioural, Perceptual, and World Knowledge in Reactive Navigation. In [Maes, 1991], pages 105–122. 171
172
BIBLIOGRAPHY
[Arkin, 1998] Arkin, R. C. (1998). Behaviour-Based Robotics. MIT Press. [Arkin and Ali, 1994] Arkin, R. C. and Ali, K. S. (1994). Integration of Reactive and Telerobotic Control in Multi-agent Robotics Systems. In Proceedings of the Third International Conference on Simulation of Adaptive Behavior(SAB94) [From Animals to Animats], pages 473–478, Brighton, UK. [Arkin et al., 1992] Arkin, R. C., Clark, R. J., and Ram, A. (1992). Learning Momentum: On-line Performance Enhancement for Reactive Systems. In Proceedings of IEEE Robotics and Automation Conference, pages 111–116. [Asada et al., 1995] Asada, M., Noda, S., Tawaratsumida, S., and K.Hosoda (1995). Vision-based reinforcement learning for purposive behaviour acquisition. In Proceedings of IEEE Robotics and Automation Conference, pages 146–153. [Aschwanden and Guggenb¨ uhl, 1993] Aschwanden, P. and Guggenb¨ uhl, W. (1993). Experimental Results from a Comparative Study on Correlation-Type Regristration Algorithms, pages 268–289. Wickmann. [Ayache, 1991] Ayache, N. (1991). Artifical Vision for Mobile Robots. MIT Press, Cambridge, Massachusetts. [Borenstein et al., 1996] Borenstein, J., Everett, H. R., and Feng, L. (1996). Navigating Mobile Robots: Systems and Techniques. A K Peters, Wellesley, Massachusetts. [Braitenberg, 1965] Braitenberg, V. (1965). Taxis, Kinesis, and Decussation. Progress in Brain Research, 17:210–222. The very first description of the Braitenberg’s Vehicles(toys). [Braitenberg, 1984] Braitenberg, V. (1984). Vehicles: Experiments in Synthetic Psychology. Bradford Books: The MIT Press, Cambridge, Massachusetts; London, England. [Brooks et al., 1997] Brooks, A., Dickens, G., Zelinsky, A., Kieffer, J., and Abullah, S. (1997). A High-Performance Camera Platform for Real-time Active Vision. In Proceedings of International Conference on Field and Service Robotics (FSR’97), pages 559–564, Canberra, Australia. [Brooks, 1983] Brooks, R. A. (1983). Solving the Find-Path Problem by Good Representation of Free Space. In Proceedings of the IEEE Transcation on Systems, Man and Cybernetics, volume 3, pages 190–197. SMC-13.
BIBLIOGRAPHY
173
[Brooks, 1986] Brooks, R. A. (1986). A Robust Layered Control System for a Mobile Robot. IEEE Journal of Robotics and Automation, RA-2(1):14–23.
[Brooks, 1990] Brooks, R. A. (1990). Elephants don't play Chess. Robotics and Autonomous Systems, 6:3–16.
[Brooks, 1989] Brooks, R. A. (1989). A robot that walks: Emergent behaviours from a carefully evolved network. Neural Computation, 1:253–262.
[Brooks and Lozano-Pérez, 1983] Brooks, R. A. and Lozano-Pérez, T. (1983). A Subdivision Algorithm in Configuration Space for Findpath with Rotation. In Proceedings of the 8th International Joint Conference on Artificial Intelligence, pages 799–806, Karlsruhe, FRG.
[Calvin, 1996] Calvin, W. H. (1996). How Brains Think: Evolving Intelligence, Then and Now. BasicBooks.
[Canny, 1988] Canny, J. F. (1988). The Complexity of Robot Motion Planning. MIT Press, Cambridge, MA.
[Castellanos et al., 1998] Castellanos, J. A., Martinez, J. M., Neira, J., and Tardós, J. D. (1998). Simultaneous Map Building and Localization for Mobile Robots: A Multisensor Fusion Approach. In Proceedings of IEEE International Conference on Robotics and Automation, pages 1244–1249, Leuven, Belgium. ICRA'98.
[Chatila and Laumond, 1985] Chatila, R. and Laumond, J. (1985). Position Referencing and Consistent World Modeling for Mobile Robots. In IEEE International Conference on Robotics and Automation, pages 138–145.
[Cheng and Zelinsky, 1995] Cheng, G. and Zelinsky, A. (1995). A Physically Grounded Search in a Behaviour Based Robot. In Proceedings of the Eighth Australian Joint Conference on Artificial Intelligence, pages 547–554. (Nominated for best paper.)
[Cheng and Zelinsky, 1996a] Cheng, G. and Zelinsky, A. (1996a). Real-Time Vision Processing for Autonomous Mobile Robots. In Proceedings of the 1996 Pan-Sydney Workshop for Visual Information Processing.
[Cheng and Zelinsky, 1996b] Cheng, G. and Zelinsky, A. (1996b). Real-Time Visual Behaviours for Navigating a Mobile Robot. In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 973–980. IROS'96.
[Cheng and Zelinsky, 1997a] Cheng, G. and Zelinsky, A. (1997a). A Reactive Vision-based Navigation System for a Mobile Robot. In Proceedings of the 1997 Workshop for Reactive Systems.
[Cheng and Zelinsky, 1997b] Cheng, G. and Zelinsky, A. (1997b). Real-Time Vision Processing for a Soccer Playing Mobile Robot. In Proceedings of the 1997 World Robot Cup, pages 57–62. IJCAI'97.
[Cheng and Zelinsky, 1997c] Cheng, G. and Zelinsky, A. (1997c). Supervised Autonomy: A Paradigm for Teleoperating Mobile Robots. In Proceedings of the 1997 Intelligent Robots and Systems Conference, volume 2, pages 1169–1176, Grenoble, France. IROS'97.
[Cheng and Zelinsky, 1998a] Cheng, G. and Zelinsky, A. (1998a). Goal-oriented Behaviour-based Visual Navigation. In Proceedings of IEEE International Conference on Robotics and Automation, pages 3431–3436, Leuven, Belgium. ICRA'98. (Nominated for best paper.)
[Cheng and Zelinsky, 1998b] Cheng, G. and Zelinsky, A. (1998b). Real-Time Vision Processing for a Soccer Playing Mobile Robot, pages 135–146. Volume 1 of [Kitano, 1998]. Extended version of the IJCAI'97 Workshop paper.
[Cheng and Zelinsky, 1999] Cheng, G. and Zelinsky, A. (1999). Supervised Autonomy: A Framework for Human Robot Systems Development. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pages 971–975, Tokyo, Japan. SMC'99. Invited paper. An extended version of this paper is to be published in a special issue of Autonomous Robots.
[Cheng et al., 1987] Cheng, K., Collett, T. S., Pickhard, A., and Wehner, R. (1987). The use of visual landmarks by honeybees: Bees weight landmarks according to their distance from the goal. Journal of Comparative Physiology A:469–475.
[Churchill, 1942] Churchill, W. (1942). Mansion House speech (of the Battle of Egypt).
[Connell, 1991] Connell, J. H. (1991). SSS: A Hybrid Architecture Applied to Robot Navigation. Technical report, IBM.
[Corliss, 1972] Corliss, W. R. (1972). Teleoperators: Man's Machine Partners. United States Atomic Energy Commission, Office of Information Services.
[Crowley, 1989] Crowley, J. (1989). World Modeling and Position Estimation for a Mobile Robot Using Ultrasonic Ranging. In Proceedings of IEEE International Conference on Robotics and Automation, pages 674–680, Scottsdale, AZ. IEEE.
[Davies, 1996] Davies, N. (1996). Europe: A History. Oxford University Press.
[Durrant-Whyte et al., 1999] Durrant-Whyte, H. F., Dissanayake, M. W. M., and Gibbens, P. (1999). Toward deployment of large scale simultaneous localisation and map building (SLAM) systems. In Proceedings of the 9th International Symposium of Robotics Research (ISRR'99), pages 121–127, Snowbird, Utah.
[Edison, 1932] Edison, T. A. (1932). Life, chapter 24.
[Elfes, 1987] Elfes, A. (1987). Sonar-Based Real-World Mapping and Navigation. IEEE Journal of Robotics and Automation, RA-3(3):249–265.
[Engelson and McDermott, 1992] Engelson, S. and McDermott, D. (1992). Error Correction in Mobile Robot Map Learning. In Proceedings of IEEE International Conference on Robotics and Automation, pages 2555–2560, Nice, France. IEEE.
[Everett, 1995] Everett, H. R. (1995). Sensors for Mobile Robots: Theory and Applications. A K Peters Ltd., Wellesley, MA. ISBN 1-56881-048-2.
[Feigenbaum and Feldman, 1963] Feigenbaum, E. A. and Feldman, J., editors (1963). Computers and Thought. McGraw-Hill.
[Fikes et al., 1972] Fikes, R. E., Hart, P., and Nilsson, N. J. (1972). Learning and Executing Generalized Robot Plans. Artificial Intelligence, 3(4):251–288.
[Fikes and Nilsson, 1971] Fikes, R. E. and Nilsson, N. J. (1971). STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving. Artificial Intelligence, 2(3/4):189–208. STanford Research Institute Problem Solver.
[Fortune, 1986] Fortune, S. (1986). A Sweepline Algorithm for Voronoi Diagrams. In Proceedings of the Second ACM Symposium on Computational Geometry, pages 313–322.
[Gat, 1992] Gat, E. (1992). Integrating Planning and Reacting in a Heterogeneous Asynchronous Architecture for Controlling Real-World Mobile Robots. In Proceedings of the 1992 American Association for Artificial Intelligence Conference, pages 809–815.
[Getting, 1993] Getting, I. A. (1993). The Global Positioning System. IEEE Spectrum, pages 24–33.
[Giralt et al., 1984] Giralt, G., Chatila, R., and Vaisset, M. (1984). An Integrated Navigation and Motion Control System for Autonomous Multisensory Mobile Robots, pages 191–214. MIT Press, Cambridge, MA.
[Gonzalez and Woods, 1992] Gonzalez, R. C. and Woods, R. E. (1992). Digital Image Processing. Addison-Wesley Publishing Company.
[Haddad et al., 1998] Haddad, H., Khatib, M., Lacroix, S., and Chatila, R. (1998). Reactive Navigation in Outdoor Environments using Potential Fields. In Proceedings of IEEE International Conference on Robotics and Automation, pages 1232–1237, Leuven, Belgium. ICRA'98.
[Hartley and Pipitone, 1991] Hartley, R. and Pipitone, F. (1991). Experiments with the Subsumption Architecture. In Proceedings of IEEE Robotics and Automation Conference, volume 2, pages 1652–1658.
[Heng et al., 1997] Heng, T. C. H., Kuno, Y., and Shirai, Y. (1997). Active Sensor Fusion for Collision Avoidance. In Proceedings of the 1997 Intelligent Robots and Systems Conference, volume 3, pages 1244–1249, Grenoble, France. IROS'97.
[Hoppen et al., 1990] Hoppen, P., Knieriemen, T., and Puttkamer, E. (1990). Laser-Radar Based Mapping and Navigation for an Autonomous Mobile Robot. In Proceedings of IEEE International Conference on Robotics and Automation, pages 948–953, Cincinnati, OH. IEEE.
[Horswill, 1994a] Horswill, I. (1994a). Visual Collision Avoidance by Segmentation. In Intelligent Robots and Systems, pages 902–909. IROS'94.
[Horswill, 1994b] Horswill, I. D. (1994b). Specialization of Perceptual Processes. PhD thesis, Massachusetts Institute of Technology.
[Iida and Yuta, 1991] Iida, S. and Yuta, S. (1991). Vehicle Command System and Trajectory Control for Autonomous Mobile Robots. In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 212–217, Osaka, Japan. IROS'91.
[Inaba, 1993] Inaba, M. (1993). Remote-Brained Robotics: Interfacing AI with Real World Behaviours. In Robotics Research: The Sixth International Symposium, pages 335–344. International Foundation for Robotics Research.
[Inoue et al., 1992] Inoue, H., Tachikawa, T., and Inaba, M. (1992). Robot Vision System with a Correlation Chip for Real-time Tracking, Optical Flow and Depth Map Generation. In Proceedings of IEEE International Conference on Robotics and Automation, pages 1621–1626.
[Jörg, 1995] Jörg, K. W. (1995). World Modeling for an Autonomous Mobile Robot Using Heterogeneous Sensor Information. Robotics and Autonomous Systems, 14:159–170.
[Jung, 1998] Jung, D. (1998). An Architecture for Cooperation among Autonomous Agents. PhD thesis, University of Wollongong, Wollongong, NSW, Australia. http://pobox.com/~david.jung/thesis.html.
[Jung et al., 1997a] Jung, D., Cheng, G., and Zelinsky, A. (1997a). An Experiment in Realising Cooperative Behaviours between Autonomous Mobile Robots. In Proceedings of the Fifth International Symposium on Experimental Robotics (ISER'97), Barcelona, Spain.
[Jung et al., 1997b] Jung, D., Cheng, G., and Zelinsky, A. (1997b). Robot Cleaning: An Application of Distributed Planning and Real-Time Vision. In Proceedings of International Conference on Field and Service Robotics (FSR'97), pages 200–207, Canberra, Australia.
[Jung and Stanley, 1997] Jung, D. and Stanley, B. (1995–1997). Yamabico Autonomous Mobile Robots — Reference Manual. Technical Report 1.5, Department of Computer Science, School of Information Sciences and Technology, University of Wollongong.
[Kagami et al., 1997] Kagami, S., Kanehiro, F., Nagasaka, K., Tamiya, Y., Inaba, M., and Inoue, H. (1997). Design and Implementation of Brain Real-Time Part for Remote-Brained Robot Approach. In Proceedings of the 1997 Intelligent Robots and Systems Conference, volume 2, pages 828–835, Grenoble, France. IROS'97.
[Kak et al., 1990] Kak, A., Andress, K., Lopez-Abadia, and Carroll, M. (1990). Hierarchical Evidence Accumulation in the PSEIKI System and Experiments in Model-driven Mobile Robot Navigation, volume 5, pages 353–369. Elsevier Science Publishers, B. V., North-Holland. In Uncertainty in Artificial Intelligence.
[Kambhampati and Davis, 1986] Kambhampati, S. and Davis, L. S. (1986). Multiresolution Path Planning for Mobile Robots. IEEE Journal of Robotics and Automation, RA-2(3):135–145.
[Kanayama and Yuta, 1988] Kanayama, Y. and Yuta, S. (1988). Vehicle Path Specification by a Sequence of Straight Lines. IEEE Journal of Robotics and Automation, 4(3).
[Kelly, 1994] Kelly, A. (1994). Modern Inertial and Satellite Navigation Systems. Technical Report CMU-RI-TR-94-15, The Robotics Institute, Carnegie Mellon University.
[Khatib, 1986] Khatib, O. (1986). Real-Time Obstacle Avoidance for Manipulators and Mobile Robots. International Journal of Robotics Research, 5(1):90–98.
[Kimoto and Yuta, 1995] Kimoto, K. and Yuta, S. (1995). Autonomous Mobile Robot Simulator—A Programming Tool for Sensor-Based Behavior. Autonomous Robots, 1(2):131–148.
[Kitano, 1998] Kitano, H., editor (1998). RoboCup-97: The First Robot World Cup Soccer Games and Conferences, volume 1. Springer Verlag.
[Kitano et al., 1997] Kitano, H., Asada, M., Kuniyoshi, Y., Noda, I., and Osawa, E. (1997). RoboCup: The Robot World Cup Initiative. In Proceedings of the First International Conference on Autonomous Agents, pages 340–347, Marina del Rey, CA.
[Koditschek, 1987] Koditschek, D. E. (1987). Exact Robot Navigation by Means of Potential Functions: Some Topological Considerations. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 1–6, Raleigh, NC.
[Kortenkamp et al., 1998] Kortenkamp, D., Bonasso, R. P., and Murphy, R., editors (1998). Artificial Intelligence and Mobile Robots: Case Studies of Successful Robot Systems. AAAI Press/The MIT Press.
[Kortenkamp and Weymouth, 1994] Kortenkamp, D. and Weymouth, T. (1994). Combining Sonar and Vision Sensing in the Construction and Use of Topological Maps for Mobile Robots. IEEE Transactions on Robotics and Automation.
[Kröse et al., 1997] Kröse, B., Dev, A., Benavent, X., and Groen, F. (1997). Vehicle navigation on optic flow. In Proceedings of 1997 Real World Computing Symposium, pages 89–95.
[Kuipers and Levitt, 1988] Kuipers, B. and Levitt, T. (1988). Navigation and mapping in large-scale space. AI Magazine, 9(2):25–43.
[Lambrinos and Scheier, 1995] Lambrinos, D. and Scheier, C. (1995). Extended Braitenberg Architectures. Technical Report AI.95.10, Computer Science Department, University of Zurich.
[Latombe, 1991] Latombe, J.-C. (1991). Robot Motion Planning. Kluwer Academic Publishers, Boston.
[Lennon, 1980] Lennon, J. (1980). Watching the Wheels.
[Leonard and Durrant-Whyte, 1991] Leonard, J. and Durrant-Whyte, H. F. (1991). Mobile Robot Localization by Tracking Geometric Beacons. IEEE Transactions on Robotics and Automation, 7(3):376–382.
[Leonard and Feder, 1999] Leonard, J. J. and Feder, H. J. S. (1999). A computationally efficient method for large-scale concurrent mapping and localization. In Proceedings of the 9th International Symposium of Robotics Research (ISRR'99), pages 128–135, Snowbird, Utah.
[Lorigo, 1996] Lorigo, L. M. (1996). Visually-Guided Obstacle Avoidance in Unstructured Environments. Master's thesis, Massachusetts Institute of Technology.
[Lorigo et al., 1997] Lorigo, L. M., Brooks, R. A., and Grimson, W. E. L. (1997). Visually-Guided Obstacle Avoidance in Unstructured Environments. In Proceedings of the 1997 Intelligent Robots and Systems Conference, volume 2, pages 373–379, Grenoble, France. IROS'97.
[Lumelsky and Stepanov, 1987] Lumelsky, V. J. and Stepanov, A. A. (1987). Path-Planning Strategies for a Point Mobile Automaton Moving Amidst Unknown Obstacles of Arbitrary Shape. Algorithmica, 2:403–430. Special Issue: Robotics.
[Maes, 1990] Maes, P. (1990). Situated Agents Can Have Goals. In [Maes, 1991], pages 49–70.
[Maes, 1991] Maes, P., editor (1991). Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back. The MIT Press.
[Matarić, 1992a] Matarić, M. J. (1992a). Behaviour-Based Control: Main Properties and Implications. In Proceedings of the Workshop on Intelligent Control Systems, International Conference on Robotics and Automation, Nice, France.
[Matarić, 1992b] Matarić, M. J. (1992b). Integration of representation into goal-directed behavior. IEEE Transactions on Robotics and Automation, 8(3):304–312.
[Matsumoto et al., 1996] Matsumoto, Y., Inaba, M., and Inoue, H. (1996). Visual Navigation using View-Sequenced Route Representation. In Proceedings of IEEE International Conference on Robotics and Automation, pages 83–88.
[Matsumoto et al., 1997] Matsumoto, Y., Inaba, M., and Inoue, H. (1997). Memory-Based Navigation using Omni-View Sequence. In Proceedings of International Conference on Field and Service Robotics, pages 184–191, Canberra, Australia.
[McKerrow, 1990] McKerrow, P. J. (1990). Introduction to Robotics. Addison Wesley.
[Miles and Wallman, 1993] Miles, F. A. and Wallman, J., editors (1993). Visual Motion and its Role in the Stabilization of Gaze.
[Moravec, 1981] Moravec, H. P. (1981). Robot Rover Visual Navigation. UMI Research Press.
[Moravec, 1983] Moravec, H. P. (1983). The Stanford Cart and the CMU Rover. Technical report, The Robotics Institute, Pittsburgh, U.S.A.
[Nakamura and Asada, 1995] Nakamura, T. and Asada, M. (1995). Motion Sketch: Acquisition of Visual Motion Guided Behaviours. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 126–132. IJCAI'95.
[Nebot et al., 1997] Nebot, E. M., Sukkarieh, S., and Durrant-Whyte, H. F. (1997). Inertial Navigation Aided with GPS Information. In Proceedings of M2VIP-97, pages 169–174, Toowoomba, Australia.
[Neisser, 1976] Neisser, U. (1976). Cognition and Reality: Principles and Implications of Cognitive Psychology. Freeman, San Francisco, CA.
[Newell and Simon, 1963] Newell, A. and Simon, H. (1963). GPS, a program that simulates human thought. In [Feigenbaum and Feldman, 1963].
[Nilsson, 1969] Nilsson, N. J. (1969). A Mobile Automaton: An Application of Artificial Intelligence Techniques. In Proceedings of the 1st International Joint Conference on Artificial Intelligence, pages 509–520, Washington D.C.
[Nilsson, 1984] Nilsson, N. J. (1984). Shakey the Robot. Technical Report 323, SRI International. Stanford Research Institute.
[Payton, 1991] Payton, D. (1991). Internalized Plans: A Representation for Action Resources, pages 89–103. MIT Press.
[Scott, 1984] Scott, P. B. (1984). The Robotics Revolution. Basil Blackwell, Oxford.
[Sekimoto et al., 1996] Sekimoto, T., Tsubouchi, T., and Yuta, S. (1996). An Implementation of a Human Interface using a Touch Panel and Visual Images for Controlling a Vehicle. In Proceedings IEEE International Workshop on Robot and Human Communication, Tsukuba, Japan. ROMAN'96.
[Shakespeare, 1595] Shakespeare, W. (1594 or 1595). Romeo and Juliet. Act 2, Scene 2.
[Sheridan, 1992] Sheridan, T. B. (1992). Telerobotics, Automation, and Human Supervisory Control. MIT Press. ISBN 0-262-19316-7.
[Shibata et al., 1994] Shibata, T., Inaba, M., and Inoue, H. (1994). Instruction Use by a Vision-based Mobile Robot. In Proceedings of the Workshop on Integration of Natural Language Processing and Vision Processing, pages 115–121.
[Shibata et al., 1996] Shibata, T., Matsumoto, Y., Kuwahara, T., Inaba, M., and Inoue, H. (1996). Development and Integration of Generic Components for a Teachable Vision-based Mobile Robot. IEEE/ASME Transactions on Mechatronics, 1(3):230–236.
[Shmoulian and Rimon, 1998] Shmoulian, L. and Rimon, E. (1998). A*ε-DFS: An Algorithm for Minimizing Search Effort in Sensor Based Mobile Robot Navigation. In Proceedings of IEEE International Conference on Robotics and Automation, pages 356–362, Leuven, Belgium. ICRA'98.
[Shneiderman, 1992] Shneiderman, B. (1992). Designing the User Interface: Strategies for Effective Human-Computer Interaction. Addison-Wesley, Reading (Mass.), second edition. (First edition: 1987.)
[Srinivasan, 1993] Srinivasan, M. V. (1993). How insects infer range from visual motion. In [Miles and Wallman, 1993].
[Srinivasan et al., 1997] Srinivasan, M. V., Zhang, S. W., and Bidwell, N. J. (1997). Visually mediated odometry in honeybees. Journal of Experimental Biology, 200:2513–2522.
[Steels, 1994] Steels, L. (1994). The artificial life roots of artificial intelligence. Artificial Life, 1:75–110.
[Steels, 1995] Steels, L. (1995). Building Agents with Autonomous Behavior Systems, chapter 3, pages 83–121. In [Steels and Brooks, 1995].
[Steels and Brooks, 1995] Steels, L. and Brooks, R., editors (1995). The Artificial Life Route to Artificial Intelligence: Building Embodied, Situated Agents. Lawrence Erlbaum Associates.
[Sukkarieh et al., 1998] Sukkarieh, S., Nebot, E. M., and Durrant-Whyte, H. F. (1998). Achieving integrity in an INS/GPS navigation loop for autonomous land vehicle applications. In Proceedings of the 1998 IEEE International Conference on Robotics and Automation, pages 3437–3442, Leuven, Belgium.
[Taylor and Kriegman, 1998] Taylor, C. J. and Kriegman, D. J. (1998). Vision-Based Motion Planning and Exploration Algorithms for Mobile Robots. IEEE Transactions on Robotics and Automation, 14(3):417–426.
[Thorpe, 1983] Thorpe, C. E. (1983). An Analysis of Interest Operators for FIDO. Technical report, The Robotics Institute, Pittsburgh, U.S.A.
[Thrun et al., 1998] Thrun, S., Bücken, A., Burgard, W., Fox, D., Fröhlinghaus, T., Hennig, D., Hofmann, T., Krell, M., and Schmidt, T. (1998). Map Learning and High-Speed Navigation in RHINO, chapter 1. In [Kortenkamp et al., 1998].
[Tilove, 1990] Tilove, R. B. (1990). Local Obstacle Avoidance for Mobile Robots Based on the Method of Artificial Potentials. In Proceedings of IEEE Intelligent Robots and Systems Conference, pages 566–571, Cincinnati, OH.
[Uchibe et al., 1996] Uchibe, E., Asada, M., and Hosoda, K. (1996). Behavior Coordination for a Mobile Robot Using Modular Reinforcement Learning. In Proceedings of the 1996 Intelligent Robots and Systems Conference, pages 1329–1336. IROS'96.
[Volpe et al., 1996] Volpe, R., Balaram, J., Ohm, T., and Ivlev, R. (1996). The Rocky 7 Mars Rover Prototype. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1558–1564, Osaka, Japan. IROS'96.
[Walter, 1953] Walter, W. G. (1953). The Living Brain. W. W. Norton, New York.
[Wettergreen et al., 1995] Wettergreen, D., Pangels, H., and Bares, J. (1995). Behaviour-based Gait Execution for the Dante II Walking Robot. In IEEE/RSJ International Conference on Intelligent Robots and Systems, Pittsburgh, U.S.A. IROS'95.
[Wyeth, 1997] Wyeth, G. (1997). Hunt and Gather Robotics. In Proceedings of International Conference on Field and Service Robotics (FSR'97), pages 334–341, Canberra, Australia.
[Yu-Lan, 1948] Yu-Lan, F. (1948). A Short History of Chinese Philosophy: A Systematic Account of Chinese Thought From Its Origins to the Present Day. The Free Press.
[Yuta and Nagatani, 1993] Yuta, S. and Nagatani, K. (1993). Path and Sensing Point Planning for Mobile Robot Navigation to Minimise the Risk of Collisions. In Proceedings of the 1993 Intelligent Robots and Systems Conference. IROS'93.
[Yuta et al., 1991] Yuta, S., Suzuki, S., and Iida, S. (1991). Implementation of a Small Size Experimental Self-Contained Autonomous Robot. In Proceedings of the 2nd International Symposium on Experimental Robotics, pages 902–909, Toulouse, France.
[Zeil, 1998] Zeil, J. (1998). Homing in fiddler crabs (Uca lactea annulipes and Uca vomeris: Ocypodidae). Journal of Comparative Physiology A.
[Zelinsky, 1991] Zelinsky, A. (1991). Mobile Robot Map Making Using Sonar. Journal of Robotic Systems.
[Zelinsky, 1994a] Zelinsky, A. (1994a). Monitoring and Co-ordinating Behaviours for Purposive Robot Navigation. In International Conference on Intelligent Robots and Systems (IROS), Munich, Germany.
[Zelinsky, 1994b] Zelinsky, A. (1994b). Using Path Transforms to Guide the Search for Findpath in 2D. The International Journal of Robotics Research, 13(4):315–325.
[Zelinsky and Cheng, 1998] Zelinsky, A. and Cheng, G. (1998). A Behaviour-based Vision Guided Mobile Robot. Autonomous Robots. In preparation.
[Zelinsky and Kuniyoshi, 1996] Zelinsky, A. and Kuniyoshi, Y. (1996). Learning to Coordinate Behaviours in Mobile Robots. Advanced Robotics, 10(2):143–159.
[Zelinsky et al., 1995] Zelinsky, A., Kuniyoshi, Y., Suehiro, T., and Tsukune, H. (1995). Using an Augmentable Resource to Robustly and Purposefully Navigate a Robot. In Proceedings of IEEE Robotics and Automation Conference, volume 3, pages 2586–2592.
[Zelinsky et al., 1993] Zelinsky, A., Kuniyoshi, Y., and Tsukune, H. (1993). A Qualitative Approach to Achieving Robust Performance by a Mobile Agent. In Robotics Society of Japan Annual Conference, Tokyo, Japan.
[Zhu and Latombe, 1989] Zhu, D. J. and Latombe, J. C. (1989). Constraint Reformulation and Graph Searching Techniques in Hierarchical Path Planning. Technical Report STAN-CS-89-1279, Department of Computer Science, Stanford University.