Mobile Robot Navigation by a Distributed Vision System


Takushi Sogo, Department of Social Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501 JAPAN

Hiroshi Ishiguro, Department of Computer and Communication Sciences, Wakayama University, 930 Sakaedani, Wakayama 640-8510 JAPAN

Toru Ishida, Department of Social Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501 JAPAN

Received 31 October 2000

Abstract  This paper proposes a Distributed Vision System (DVS), which is an intelligent infrastructure for mobile robots consisting of many vision agents (VAs) embedded in an environment. The system monitors the environment from various viewing points with the VAs, maintains dynamic environment models, and provides various information to robots. Based on this concept, we have developed a prototype of the DVS which consists of sixteen vision agents and simultaneously navigates two robots in a model town.

Keywords  Distributed Vision System, Vision Agents, Mobile Robot Navigation, Perceptual Information Infrastructure.

§1 Introduction

Realizing autonomous robots that behave in the real world based on visual information has been one of the goals in robotics and artificial intelligence. For limited environments such as offices and factories, several types of autonomous robots which utilize vision sensors have been developed. However, it is still hard to realize autonomous robots behaving in dynamically changing real worlds such as outdoor environments. As discussed in Active Vision 1), the main reason lies in attention control, that is, in selecting viewing points according to the various events relating to the robot. In order to simultaneously execute various vision tasks, such as detecting free regions and obstacles in a complex environment, the robot needs attention control with which it can change its gazing direction and collect information according to various events. If the robot has a single vision sensor, it needs to switch among several vision tasks in a time-slicing manner in order to execute them simultaneously (Temporal Attention Control). Furthermore, in order to execute various vision tasks the robot needs to select the best viewing point for each task (Spatial Attention Control). However, such attention control is difficult to realize with current technologies, since the vision systems of previous mobile robots are fixed on the mobile platforms and it is difficult to acquire visual information from proper viewing points. The robots generally have a single vision sensor and a single body; therefore the robot needs to make complex plans to execute the vision tasks with the single vision sensor. Another reason for the difficulty in building autonomous robots behaving in the real world is that a single robot has to acquire a consistent model of a wide environment with vision sensors fixed on its body. If the environment dynamically changes, it is very hard to maintain the model.

Our idea to solve these problems is to use many vision agents (VAs) embedded in the environment, as shown in Fig. 1. Each VA independently observes events in its local environment and communicates with other VAs through the computer network. Since the VAs do not have the mechanical constraints of autonomous robots, we can install a sufficient number of VAs according to the tasks to be performed, and the robots can acquire necessary visual information from various viewing points. As a new concept to generalize this idea, we have proposed Distributed Vision 2), where multiple vision agents embedded in an environment recognize dynamic events by communicating with each other.


Fig. 1 Distributed Vision System (vision agents, computer network, mobile robots)

In the distributed vision, attention control is dealt with as dynamic organization of communication between the vision agents (the concept of the distributed vision is discussed in detail in 2)). Based on this concept, we have developed a Distributed Vision System (DVS) 2), which solves the above problems and realizes robust navigation of mobile robots in complex and dynamic environments such as outdoor environments. The DVS consists of VAs, solves the attention control problem by selecting proper VAs, and realizes navigation of mobile robots based on visual information in a complex environment. This paper describes the detailed implementation of the prototype of the DVS, as well as the concept of the distributed vision. In developing the DVS, the following are important:
1. a navigation method for multiple robots using multiple VAs,
2. communication among VAs,
3. construction and management of environment models in the VA network.
This paper mainly discusses the first point. In the following sections, the architecture of the prototype system and the navigation method of robots are described. Finally, experimental results of navigation are shown.

§2 Distributed Vision System


2.1 Design policies of the DVS

The vision agents (VAs) in the distributed vision system (DVS) are designed based on the following idea: Tasks of robots are closely related to local environments.

For example, when a mobile robot executes a task of approaching a target, the task is closely related to the local area where the target is located. This idea allows us to give each VA specific knowledge for recognizing its local environment, so that each VA has a simple but robust information processing capability. More concretely, the VAs can easily detect dynamic events since they are fixed in the environment. A vision-guided mobile robot whose camera is fixed on its body has to move in order to explore the environment, and recognizing the environment through a moving camera is a difficult problem. On the other hand, a VA in the DVS can easily analyze the image data and detect moving objects by constructing a background image for its fixed viewing point. All of the VAs basically have the following common visual functions:
- detecting moving objects by constructing the background image and comparing the current image with it,
- tracking detected objects by a template matching method,
- identifying mobile robots based on given models,
- finding relations between moving objects and static objects in the images.
For robustness and flexibility, the DVS autonomously and locally calibrates the camera parameters with local coordinate systems on demand, rather than keeping precise parameters (the details are discussed in Section 3.4). In addition, the DVS does not use a geometrical map in robot navigation. It memorizes robot tasks directly shown by a human operator, and then navigates robots based on the memorized tasks.
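As an illustration of the first of these functions, here is a minimal background-subtraction sketch for a fixed VA camera. It is not the authors' implementation; it assumes grayscale frames given as NumPy arrays, and the update rate and threshold are hypothetical parameters.

```python
import numpy as np

class BackgroundSubtractor:
    """Running-average background model for a fixed camera, as each VA
    is assumed to maintain (parameter values are illustrative only)."""

    def __init__(self, alpha=0.05, threshold=30):
        self.alpha = alpha          # background update rate
        self.threshold = threshold  # intensity difference counted as motion
        self.background = None      # float background image

    def detect(self, frame):
        """Return a boolean mask of pixels that differ from the background."""
        frame = frame.astype(np.float32)
        if self.background is None:
            self.background = frame.copy()
            return np.zeros(frame.shape, dtype=bool)
        mask = np.abs(frame - self.background) > self.threshold
        # Update the background only where no motion was detected,
        # so moving robots are not absorbed into the model.
        self.background[~mask] += self.alpha * (frame[~mask] - self.background[~mask])
        return mask
```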

2.2 The architecture

Fig. 2 shows the architecture of the DVS for robot navigation. The system consists of multiple VAs, robots, and a computer network connecting them. The Image processor detects moving robots and tracks them by referring to the Knowledge database, which stores visual features of the robots. The Estimator receives the results and estimates camera parameters for establishing representation frames for sharing robot motion plans with other VAs.

Fig. 2 The architecture of the DVS (vision agent: camera, image processor, knowledge database, estimator of camera parameters, task manager, planner, organization manager, memory of tasks, memory of organizations, communicator, controller; robot: communicator, command integrator, motion planner, actuators; computer network)

The Task manager memorizes the trajectories of robots as tasks shown by a human operator, and selects proper tasks from the memory in order to navigate robots. The Planner plans robot actions based on the memorized tasks and the estimated camera parameters. The Organization manager communicates with other VAs through the Communicator and selects proper plans. The selected plans are memorized in the Memory of organizations for planning robot tasks more properly. The Controller controls these modules according to requests from the robots and the system state, such as the task teaching phase and the navigation phase. The robot receives the plans through its Communicator. The Command integrator selects and integrates the plans, from which actuator commands are generated.
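As a structural illustration only, the sketch below groups the modules of Fig. 2 into a single VA class and shows what a plan message passed to a robot's Command integrator might contain. All names and fields are hypothetical placeholders, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class NavigationPlan:
    """Instant navigation plan sent from a VA to a robot (cf. Section 3.3)."""
    va_id: int
    theta_star: float   # steering angle relative to the robot's motion direction [rad]
    robot_size: float   # apparent size of the robot in the VA's view
    angle_error: float  # estimated error of theta_star

@dataclass
class VisionAgent:
    """Skeleton of one VA, mirroring the modules shown in Fig. 2."""
    va_id: int
    knowledge_database: Dict[str, object] = field(default_factory=dict)  # visual features of robots
    memory_of_tasks: List[object] = field(default_factory=list)          # memorized trajectories
    memory_of_organizations: List[object] = field(default_factory=list)  # previously selected plans
    camera_parameters: Dict[str, float] = field(default_factory=dict)    # e.g. estimated tilt angle

    def observe(self, frame) -> None:
        """Image processor: detect and track robots against the background model."""
        ...

    def plan(self, robot_id: int) -> NavigationPlan:
        """Planner: build an instant plan from a memorized task and the camera parameters."""
        ...
```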

§3 Development of a prototype system
We have developed a prototype system for robot navigation based on the architecture described in the previous section. In this section, the details of the system are described.

3.1 Fundamental functions for robot navigation

In the DVS, robot navigation is achieved in two steps: a task teaching phase and a navigation phase. In the task teaching phase, each VA observes and memorizes the path of a robot which is controlled by a human operator or autonomously moves in the environment.

Fig. 3 Detecting free regions (background images → detected free regions)

In the navigation phase, the VAs communicate with each other to select the VAs which provide proper visual information for navigation, and then navigate robots based on the paths memorized in each sensor image. The system needs the following functions for robot navigation:
1. navigate robots on free regions where the robots can move,
2. avoid collisions with other robots and obstacles,
3. navigate robots to their destinations.
The system realizes these functions by using visual information acquired by the VAs. Navigation on free regions is realized as follows. Knowledge of the free regions is obtained when the VAs observe moving objects in the environment for a long time, assuming that the regions where the objects move around are free regions. In the developed system, the VAs observe a robot controlled by a human operator in the task teaching phase (described in the next subsection), and recognize free regions as shown in Fig. 3. Collision avoidance is realized by generating a navigation plan so as to avoid a collision: if the VAs find that a robot is on a possible collision course with other robots or obstacles, they temporarily modify the destination of the navigation plan in order to avoid the collision. Navigation to a destination is realized by teaching knowledge of the paths for navigating robots to their destinations. In the following subsection, the task teaching phase is described.
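The free-region function can be read as a simple accumulation over the teaching phase. The sketch below is an illustrative interpretation of that idea (not the published code); the robot mask would come from the background subtraction of Section 2.1, and the hit threshold is a hypothetical parameter.

```python
import numpy as np

class FreeRegionLearner:
    """Accumulates, per VA view, how often each pixel was covered by the
    teaching robot; frequently covered pixels are treated as free region."""

    def __init__(self, image_shape, min_hits=5):
        self.hits = np.zeros(image_shape, dtype=np.int32)
        self.min_hits = min_hits  # hypothetical threshold

    def update(self, robot_mask):
        """robot_mask: boolean image marking pixels occupied by the robot."""
        self.hits[robot_mask] += 1

    def free_region(self):
        """Boolean image of pixels observed as traversable often enough."""
        return self.hits >= self.min_hits
```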


Fig. 4 Navigation paths taught by a human operator (paths 1-4 between the places A, B and C)

3.2 Task teaching

The system switches into the task teaching phase on the instruction of a human operator. In this phase, the VAs memorize tasks shown by the operator. In this prototype, a task consists of several subtasks, each of which is a navigation between intersections; by linking subtasks, the system navigates robots to their destinations. First, the VAs detect robots in each view. Since the VAs are fixed in the environment, they can easily detect objects by background subtraction. Then, the VAs distinguish robots from other objects and identify the robots by their colors. Each VA tracks the robot controlled by the human operator and memorizes its apparent trajectory in its view as a navigation task. When the robot passes over a specific place (e.g., in front of a building), the operator notifies the system of the meaning of the place. The system divides the tasks shown by the operator into several subtasks of navigation between intersections. In this prototype, the subtasks are directly shown by the operator in order to simplify the experimentation, since the experimental environment is small and there are only two intersections, as shown in Fig. 4. The operator shows navigation paths 1 through 4 to the VAs as subtasks, and notifies the places A, B and C. In this prototype, A and C are navigation goals.
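One plausible way to organize this per-VA memory is sketched below: the VA records the robot's apparent positions and cuts the recording into subtasks whenever the operator announces a place. The class and its fields are hypothetical illustrations of the procedure described above, not the authors' code.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # image coordinates in one VA's view

class TaskMemory:
    """Per-VA memory of taught paths, split at operator-announced places."""

    def __init__(self):
        self.current_path: List[Point] = []
        self.last_place: Optional[str] = None
        self.subtasks: Dict[Tuple[str, str], List[Point]] = {}

    def track(self, robot_position: Point) -> None:
        """Append the robot's apparent position while the operator drives it."""
        self.current_path.append(robot_position)

    def notify_place(self, place: str) -> None:
        """Operator marks a place (e.g. an intersection or a goal such as 'A')."""
        if self.last_place is not None and self.current_path:
            # Store the segment between the previous place and this one as a subtask.
            self.subtasks[(self.last_place, place)] = self.current_path
        self.last_place = place
        self.current_path = []
```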


Fig. 5 Overlaps of VAs' visual fields (VA1-VA4, Robots 1 and 2)

Fig. 6 Communication between VAs and a robot in the navigation phase (vision agent: observing the robot motion with the camera, estimating camera parameters, planning a robot motion from the memorized robot path; robot: selecting the plans, integrating the plans, controlling the robot)

The VAs can detect robots more robustly with redundant visual information. In Fig. 5, for example, the visual field of VA4 does not overlap with those of VA1 and VA3. Therefore, when all VAs observe a robot with the same visual features, it is estimated that VA4 observes a different robot from that observed by the other VAs. In the developed system, the overlaps are acquired by simultaneously observing, from all VAs, a robot used in the task teaching phase; they are then used in the navigation phase. In addition, the VAs can robustly detect robots by using knowledge of the local environments around them. In this prototype, for example, a VA recognizes an object as a robot only if its bottom lies in the free region of the VA's view, as shown in Fig. 3.
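The overlap argument can be stated as a simple consistency check. The sketch below assumes the overlaps learned in the teaching phase are available as a lookup table; the table contents are hypothetical except that VA4 is kept non-overlapping with VA1 and VA3, as in Fig. 5.

```python
from typing import Dict, Set

# Hypothetical overlap table: VA4's field does not overlap VA1's or VA3's
# (as in Fig. 5); all other entries are illustrative assumptions.
OVERLAPS: Dict[int, Set[int]] = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2}, 4: {2}}

def may_be_same_robot(va_a: int, va_b: int,
                      overlaps: Dict[int, Set[int]] = OVERLAPS) -> bool:
    """Detections with identical visual features in two VAs can come from the
    same robot only if the VAs' visual fields overlap (or it is the same VA)."""
    return va_a == va_b or va_b in overlaps.get(va_a, set())

print(may_be_same_robot(1, 4))  # False: VA1 and VA4 must be seeing different robots
```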

3.3 Navigation of mobile robots

After the task teaching phase, the DVS navigates robots by iterating the following steps (see Fig. 6):
1. A robot requests the DVS to navigate it to a destination.
2. The VAs communicate with each other to make a global navigation plan to the destination.
3. Each VA sets an instant navigation goal in front of the robot, then generates a navigation plan and sends it to the robot.
4. The robot receives the navigation plans from multiple VAs, selects proper ones, integrates them, and moves based on the integrated plan.

Fig. 7 Generating a navigation plan (view of VA $i$: (1) nearest point on the memorized path, (2) navigation goal at distance $\Delta t$, (3) $G_i$, (4) $V_i$, $w_i$, (5) $\theta_i^*$)

Each VA generates a navigation plan as follows. First, the VA estimates the point on the memorized paths nearest to the robot in its view (see Fig. 7 (1)). Then, the VA sets an instant navigation goal at a certain distance $\Delta t$ from the estimated position (Fig. 7 (2)), and computes the angle $\theta_i^*$ between the direction to the goal $G_i = (g_{xi}, g_{yi})^T$ and the current motion direction of the robot $V_i = (v_{xi}, v_{yi})^T$ as follows (see Fig. 7 (3), (4) and (5)):

\theta_i^* = \arcsin\left( \frac{V_i^* \times G_i^*}{|V_i^*|\,|G_i^*|} \right)    (1)

where $V_i^*$ and $G_i^*$ are

V_i^* = \begin{pmatrix} 1 & 0 \\ 0 & 1/\sin\phi_i \end{pmatrix} V_i, \qquad
G_i^* = \begin{pmatrix} 1 & 0 \\ 0 & 1/\sin\phi_i \end{pmatrix} G_i .

Here $\phi_i$ is the tilt angle of the vision sensor of VA $i$, which is automatically estimated by observation (see Section 3.4), and we assume orthographic projection. Each VA sends $\theta_i^*$ to the robot as a navigation plan.
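A minimal numerical sketch of equation (1) follows (illustrative, not the authors' code). It assumes the goal direction $G_i$ and the robot's motion direction $V_i$ are given as 2-D image vectors and that the tilt angle $\phi_i$ has already been estimated.

```python
import numpy as np

def plan_angle(goal_xy, velocity_xy, tilt_angle):
    """Compute theta_i^* of equation (1): the signed angle between the robot's
    motion direction and the direction to the instant goal, after undoing the
    foreshortening of the ground plane by 1/sin(phi_i) on the image y axis."""
    undo = np.array([[1.0, 0.0],
                     [0.0, 1.0 / np.sin(tilt_angle)]])
    v_star = undo @ np.asarray(velocity_xy, dtype=float)
    g_star = undo @ np.asarray(goal_xy, dtype=float)
    cross = v_star[0] * g_star[1] - v_star[1] * g_star[0]   # 2-D cross product
    return np.arcsin(cross / (np.linalg.norm(v_star) * np.linalg.norm(g_star)))
```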


When a VA has detected an obstacle (including other robots) between the navigation goal and the robot, the VA modifies the goal so as to avoid the obstacle.

After the robot has received the navigation plans $\theta_i^*$ from the VAs, it estimates the error of each $\theta_i^*$ in order to eliminate navigation plans which include large errors. The error of $\theta_i^*$ includes an observation error of the motion direction of the robot and an estimation error of $\phi_i$ (the estimated tilt angle of VA $i$). Here, we assume that the former error is inversely proportional to the apparent size $w_i$ of the robot in the view of VA $i$. The latter error (let this be $\Delta\theta_i^*$) is related to the error of the estimated tilt angle $\phi_i$ and is computed from equation (1) as follows:

\Delta\theta_i^* = \arcsin\left( \frac{V_i^{*\prime} \times G_i^{*\prime}}{|V_i^{*\prime}|\,|G_i^{*\prime}|} \right) - \arcsin\left( \frac{V_i^* \times G_i^*}{|V_i^*|\,|G_i^*|} \right)    (2)

where $V_i^{*\prime}$ and $G_i^{*\prime}$ are

V_i^{*\prime} = \begin{pmatrix} 1 & 0 \\ 0 & 1/\sin(\phi_i + \Delta\phi_i) \end{pmatrix} V_i, \qquad
G_i^{*\prime} = \begin{pmatrix} 1 & 0 \\ 0 & 1/\sin(\phi_i + \Delta\phi_i) \end{pmatrix} G_i ,

and $\Delta\phi_i$ is the estimated error of $\phi_i$ (see Section 3.4). Then, if the size of the robot observed by a VA is less than 2/3 of the largest among all VAs, the robot considers that the navigation plan generated by that VA includes a relatively large error, and eliminates it. Furthermore, the robot also eliminates navigation plans whose estimated errors (i.e., $\Delta\theta_i^*$) are more than twice the smallest of all. Next, the robot integrates the remaining navigation plans. Since they are represented as angles in a common coordinate system aligned with the motion direction of the robot, they can be integrated by computing an average angle. Here, the robot computes an average of $\theta_i^*$ weighted with the estimated error $\Delta\theta_i^*$ as follows:

\theta^* = \frac{\sum_i k_i \theta_i^*}{\sum_i k_i}, \qquad k_i = \frac{w_i}{|\Delta\theta_i^*|}    (3)

where $w_i$ is the apparent size of the robot observed by VA $i$. Finally, the robot generates actuator commands based on $\theta^*$.
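The selection and integration rules above can be summarized in a few lines. The sketch below follows the 2/3-size and twice-the-smallest-error thresholds from the text; the plan container and the fallback for the empty case are hypothetical details, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Plan:
    theta_star: float    # navigation plan angle from one VA [rad]
    size: float          # apparent robot size w_i in that VA's view
    delta_theta: float   # estimated error of theta_star [rad]

def integrate_plans(plans: List[Plan]) -> float:
    """Drop unreliable plans, then return the weighted average angle of eq. (3)."""
    largest_size = max(p.size for p in plans)
    smallest_err = min(abs(p.delta_theta) for p in plans)
    kept = [p for p in plans
            if p.size >= (2.0 / 3.0) * largest_size           # small robot image -> large error
            and abs(p.delta_theta) <= 2.0 * smallest_err]     # at most twice the smallest error
    if not kept:
        kept = plans  # fallback if the two filters disagree (edge case, assumption)
    weights = [p.size / max(abs(p.delta_theta), 1e-9) for p in kept]  # k_i = w_i / |dtheta_i*|
    return sum(w * p.theta_star for w, p in zip(weights, kept)) / sum(weights)
```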

3.4 Estimating camera parameters by observation

Fig. 8 Estimating camera parameters (view of VA $i$: axes $u$, $v$, projected velocity $V_i$; world: axes $x$, $y$, robot velocity $V$, path of the robot)

In general, the position and orientation of a vision sensor are represented with six parameters: three rotation and three translation parameters. Hosoda and Asada proposed a method for estimating the camera parameters by visual feedback 3). In the DVS, each camera observes moving robots in the environment in order to estimate the camera parameters in a similar way. However, in the DVS the bottom of the robot is regarded as its position, so the robot position measured by observation is imprecise, which makes it difficult to estimate all six camera parameters. Therefore, we estimate the three parameters $\phi_i$, $\psi_i$ and $\theta_i$ shown in Fig. 8 in an on-line manner, which are needed for robot navigation if orthographic projection is assumed.

[1] Estimation method

Let $x$ and $y$ be the reference axes of rectangular coordinates, where the direction of the $x$ axis indicates the motion direction of the robot, and let $\phi_i$, $\psi_i$ and $\theta_i$ be the tilt angle of VA $i$, the angle between VA $i$ and the $y$ axis, and the rotation angle around the viewing direction of VA $i$, respectively. Assuming orthographic projection, the velocity of the robot $V$ is projected into the view of VA $i$ as follows:

V_i = S_i T_i R_i V

where the vector $V_i = (u_i, v_i)^T$ is the velocity projected into the view of VA $i$, $R_i$ and $S_i$ represent rotation matrices of $\psi_i$ and $-\theta_i$, respectively, and $T_i$ represents the matrix of orthographic projection:

R_i = \begin{pmatrix} \cos\psi_i & -\sin\psi_i \\ \sin\psi_i & \cos\psi_i \end{pmatrix}, \qquad
S_i = \begin{pmatrix} \cos\theta_i & \sin\theta_i \\ -\sin\theta_i & \cos\theta_i \end{pmatrix}, \qquad
T_i = \begin{pmatrix} 1 & 0 \\ 0 & \sin\phi_i \end{pmatrix}.

Hence, the velocity $V$ is represented as follows using $V_i = (u_i, v_i)^T$:

V = R_i^{-1} T_i^{-1} S_i^{-1} V_i
  = \begin{pmatrix} \cos\psi_i & \sin\psi_i / \sin\phi_i \\ -\sin\psi_i & \cos\psi_i / \sin\phi_i \end{pmatrix}
    \begin{pmatrix} u_i' \\ v_i' \end{pmatrix}    (4)

where $u_i'$ and $v_i'$ are

\begin{pmatrix} u_i' \\ v_i' \end{pmatrix} = S_i^{-1} V_i
  = \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix}
    \begin{pmatrix} u_i \\ v_i \end{pmatrix}.

Therefore,

|V|^2 = u_i'^2 + \left( \frac{v_i'}{\sin\phi_i} \right)^2 .    (5)

If a human operator controls the robot at a constant speed, $|V|$ is a known value. Consequently, $\phi_i$ can be computed with the following equation:

\sin\phi_i = \sqrt{ \frac{v_i'^2}{|V|^2 - u_i'^2} } \qquad (v_i' \neq 0).    (6)

Furthermore, the $y$ component of the velocity $V$ is always zero, so that $\psi_i$ can be computed from equation (4) as follows:

u_i' \sin\psi_i - \frac{v_i'}{\sin\phi_i} \cos\psi_i = 0 .    (7)

By observing two velocities of a robot (i.e., observing two different $V_i$), $\theta_i$, (two different) $\psi_i$ and $\phi_i$ are acquired based on equations (5), (6) and (7). In this experimentation, in order to simplify the estimation we assume $\theta_i = 0$, that is, the cameras are set up parallel to the plane on which the robots move. Under this assumption, $\phi_i$ and $\psi_i$ are computed with equations (6) and (7), respectively, by observing one motion of a robot. Note that, in practice, the velocity of the robot $V_i$ is normalized by $w_i$ (the size of the robot in the view of VA $i$).
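Under the simplifying assumption $\theta_i = 0$, equations (6) and (7) reduce to a short computation. The following sketch is an illustrative implementation of that special case (not the authors' code); it assumes the observed velocity has already been normalized by $w_i$ so that $|V|$ equals the known teaching speed.

```python
import math

def estimate_camera_parameters(u_i, v_i, speed):
    """Estimate (phi_i, psi_i) from one observed robot velocity, assuming
    theta_i = 0 so that (u_i', v_i') = (u_i, v_i).

    u_i, v_i : apparent robot velocity in the VA's view (same units as speed)
    speed    : known ground speed |V| of the robot during teaching
    """
    if v_i == 0:
        raise ValueError("equation (6) requires v_i != 0")
    if abs(u_i) >= speed:
        raise ValueError("observed u_i exceeds the known speed; estimate undefined")
    sin_phi = math.sqrt(v_i ** 2 / (speed ** 2 - u_i ** 2))    # equation (6)
    phi_i = math.asin(min(sin_phi, 1.0))
    psi_i = math.atan2(v_i / sin_phi, u_i)                     # equation (7)
    return phi_i, psi_i
```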

Fig. 9 Estimation error $\Delta\phi_i$ (vertical axis: $\Delta\phi_i$ [degree], 0-45; horizontal axis: $\psi_i$ [degree], 0-90; curves for observation errors of 1%, 5% and 10%)

Table 1 $\phi_i$ of vision agents estimated by observation [degrees]

               VA1     VA2     VA3     VA4
Actual          30      31       9      28
Observation 1   21.7*   35.8    30.4*    6.49*
Observation 2   24.9     8.08*  16.9*   34.4
(values marked with * have relatively large errors; see text)

[2] Estimation error

The relation between an observation error of the robot velocity $(\Delta u_i, \Delta v_i)$ and the estimation error of the tilt angle $\Delta\phi_i$ is obtained from equation (6) as follows:

\Delta\phi_i = \sin^{-1}\sqrt{ \frac{(v_i + \Delta v_i)^2}{|V|^2 - (u_i + \Delta u_i)^2} } - \phi_i

where we assume $\theta_i = 0$. Fig. 9 shows $\Delta\phi_i$ when $\phi_i = 30^\circ$ and $\Delta u_i$ and $\Delta v_i$ are 1%, 5% and 10% of $|V|$. In Fig. 9, the horizontal axis indicates $\psi_i$, since $u_i$ and $v_i$ are determined by equations (6), (7) and $\psi_i$. The estimation error $\Delta\phi_i$ becomes larger as $\psi_i$ approaches zero, that is, as the velocity of the robot approaches the horizontal direction in the view of VA $i$. Note that $\Delta\phi_i$ is used in equations (2) and (3) for integrating the navigation plans generated by multiple VAs. Table 1 shows an example of the tilt angles (in degrees) of four VAs estimated by observing two motions of a robot, as shown in Fig. 10.
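For reference, the sketch below evaluates this error formula. It is illustrative only: it reconstructs $u_i$ and $v_i$ from $\phi_i$ and $\psi_i$ using the projection model of equation (4) with $\theta_i = 0$, and assumes the same relative error on both velocity components.

```python
import math

def tilt_estimation_error(phi_i, psi_i, speed, rel_err):
    """Delta phi_i from the formula above, for an observation error of
    rel_err * |V| on both image velocity components (theta_i assumed 0)."""
    # Apparent velocity implied by the projection model for this phi_i, psi_i.
    u_i = speed * math.cos(psi_i)
    v_i = speed * math.sin(psi_i) * math.sin(phi_i)
    du = dv = rel_err * speed
    denom = speed ** 2 - (u_i + du) ** 2
    if denom <= 0:
        return float("inf")  # error blows up as psi_i approaches zero (cf. Fig. 9)
    sin_est = math.sqrt((v_i + dv) ** 2 / denom)
    return math.asin(min(sin_est, 1.0)) - phi_i
```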


Fig. 10 Images used for estimating $\phi_i$ (VA1-VA4, Observations 1 and 2)

Comparing the estimates with the actual angles (indicated as 'Actual' in Table 1), the estimation error becomes larger (values marked with '*' in Table 1) when the velocity of the robot approaches the horizontal direction, as discussed above. The estimated parameters are not exactly precise; however, the DVS can still navigate robots with the estimated parameters, although the robots move in a zigzag. This is because the navigation plan is represented by a differential angle between the current motion direction of the robot and the direction to an instant navigation goal, so the direction (right or left) in which the robot is steered is not affected by the estimation error $\Delta\phi_i$. Furthermore, the DVS can successfully navigate robots by integrating the navigation plans generated by multiple VAs.

§4 Experimentation
We have developed a prototype system based on the concept of the distributed vision described above. Fig. 11 shows the model town and the mobile robots used in the experimentation. The model town, built at a scale of 1/12, reproduces realistic features of an outdoor environment, such as shadows and the textures of trees, lawns and houses. Sixteen VAs have been installed in the model town and are used for navigating two mobile robots. Fig. 12 shows the hardware configuration. Images taken by the VAs are sent to image encoders, which integrate the sixteen images into one image, and then to a color frame grabber. The size of the whole image is 640×480 pixels, and each VA's image is 160×120 pixels. The main computer, a Sun SPARCstation 10, executes sixteen VA modules, which process data from the color frame grabber at 5 frames per second and communicate with the two robots through serial devices. The robots avoid collisions based on the VAs' commands; however, if a collision is detected with their touch sensors, they move backward and change direction in order to avoid the collision.


Fig. 11 Model town (robots and vision agents)

Fig. 12 Hardware configuration (video camera ×16, encoders, NTSC, Sun SS10, serial, wireless unit ×2, robot ×2)

In this prototype, a human operator first shows the two robots to the VAs, and each VA memorizes their colors (red and black) in order to identify them. Then, the operator shows several paths using one of the robots. Finally, the system simultaneously navigates both robots along the paths shown by the operator. Fig. 13 shows images taken by the VAs in the navigation phase; the vertical axis and the horizontal axis indicate time and the ID numbers of the VAs, respectively.


Fig. 13 Images processed by VAs (vertical axis: time 0-35 [sec]; horizontal axis: VA ID numbers 3, 4, 5, 6, 9, 11, 16)

The solid and the broken rectangles indicate the VAs selected for navigating the red and the black robot, respectively. As shown in Fig. 13, VAs are dynamically selected according to the navigation tasks; in other words, the system navigates robots while observing them from proper viewing points. Fig. 14 shows the robot trajectories navigated by the DVS when it was exhibited at the international conference IJCAI'97. The sixteen cameras were simply set up so as to cover the whole environment. Although their locations were not measured, the system continuously navigated the robots for three days during the exhibition. The concepts behind the DVS, such as simple visual functions, flexible navigation strategies and the use of redundant visual information, realize robust navigation in a complex environment.

Fig. 14 Robots navigated by the DVS

§5 Discussion and Conclusion
We have developed a prototype of the DVS. In this paper, we mainly described the details of the navigation method for mobile robots using multiple VAs. In addition, the prototype system partly deals with communication among VAs and robots, and with the construction and management of environment models. With the experimental results of robot navigation, we have confirmed that the DVS can robustly navigate mobile robots in a complex environment.

In distributed artificial intelligence, several fundamental works dealing with systems using multiple sensors have been reported, such as the Distributed Vehicle Monitoring Testbed (DVMT) 4) and Partial Global Planning (PGP) 5). In these systems, which are based on the blackboard model 6), agents symbolize sensory information in a common representation and gradually proceed with recognition by exchanging the symbols. Thus, these systems deal with recognition based on symbolized information. The purpose of the DVS, on the other hand, is to navigate robots. In the task teaching phase, the VAs independently memorize navigation paths as robot tasks from their own viewing points, without symbolizing them. In the navigation phase, the VAs plan a global path of a robot by communicating with each other and generate instant navigation plans, and the robot generates instant actuator commands based on these plans. Thus, the DVS deals with motion recognition by multiple agents, and with regeneration of the robot tasks by cooperation of the agents.

As future work, more detailed communication among VAs should be considered.

In the experimentation of Section 4, a human operator controls a robot to show the robot's subtasks (i.e., navigation between intersections) while directly indicating specific places such as intersections, and the VAs learn the subtasks by observing the robot's motion. However, the VAs need to communicate with each other to learn the subtasks autonomously. In addition, in the navigation phase the VAs communicate with each other to make plans for navigating robots to their destinations; in the prototype, however, this communication is simplified and specialized for the small experimental environment. For real-world applications, more sophisticated communication will be needed to perform

flexible planning through VAs' communications. Furthermore, the following issues should be considered for extending the scale of the DVS:
- more accurate identification of multiple robots,
- dynamic organization for navigating many robots with a limited number of VAs.
With respect to the first point, the DVS does not utilize a geometrical map, which gives the system robustness and flexibility. Instead, more accurate identification will be achieved by observing relations between robot commands and actual robot motions, and by utilizing a qualitative map which represents rough positional relations of the VAs 7). With respect to the second point, we have to analyze the behavior of the system in such situations and develop more sophisticated communication among the VAs.

Recent developments in multimedia computing environments have placed huge numbers of cameras and computers in offices and towns. The DVS can be considered one of these growing infrastructures for robots, consisting of sensors and networks. They are expected to be integrated into more intelligent systems, Perceptual Information Infrastructures (PII) 2), which provide various information to real-world agents such as robots and humans.

References
1) Ballard, D. H., "Reference frames for animate vision," Proc. IJCAI, pp. 1635-1641, 1989.
2) Ishiguro, H., "Distributed vision system: A perceptual information infrastructure for robot navigation," Proc. IJCAI, pp. 36-41, 1997.
3) Hosoda, K. and Asada, M., "Versatile visual servoing without knowledge of True Jacobian," Proc. IROS, pp. 186-193, 1994.
4) Lesser, V. R. and Corkill, D. D., "The distributed vehicle monitoring testbed: A tool for investigating distributed problem solving networks," AI Magazine, pp. 15-33, 1983.
5) Durfee, E. H. and Lesser, V. R., "Partial global planning: A coordination framework for distributed hypothesis formation," IEEE Trans. SMC, Vol. 21, No. 5, pp. 1167-1183, 1991.
6) Erman, L. D., Hayes-Roth, F., Lesser, V. R. and Reddy, D. R., "The Hearsay-II speech-understanding system: Integrating knowledge to resolve uncertainty," Comput. Surveys, Vol. 12, pp. 213-253, 1980.
7) Sogo, T., Ishiguro, H. and Ishida, T., "Acquisition of qualitative spatial representation by visual observation," Proc. IJCAI, pp. 1054-1060, 1999.