Proceedings of the 42nd IEEE Conference on Decision and Control Maui, Hawaii USA, December 2003
FrP08-4
Modified Newton's Method Applied to Potential Field Navigation

Kenneth A. McIsaac, Jing Ren and Xishi Huang
Dept. of Elec. and Comp. Eng., University of Western Ontario
Elborn College, 1201 Western Rd, London ON Canada N6G 1H1
kmcisaac@engga.uwo.ca, jren2@uwo.ca, xhuang22@uwo.ca

Abstract

In this paper, we propose the use of a modification of Newton's method for potential field navigation. The use of the modified Newton's method, which applies anywhere C² continuous navigation functions are defined, greatly improves system performance in the presence of obstacles when compared to the standard gradient descent approaches. To illustrate the technique, we also propose a hierarchical software architecture for robots that supports multi-robot coordination and can be easily extended for other applications. Simulations show that a robot team can accomplish a cooperative material-handling task in an initially unknown environment while avoiding collision with static obstacles and other team members. We derive a control law based on the modified Newton's method that guarantees team stability for all time.

1 Introduction

The use of multi-robot teams in mobile robotics has a number of potential advantages over single robot systems. A group of simple, general purpose team members working together can accomplish the task of a complex, purpose-built system in a fraction of the time, and the built-in redundancy of having many team members leads to greater robustness and fault tolerance. However, for an important class of problems, tasks cannot be accomplished by individual team members, which means that some mechanism must exist for real-time coordination of activities. In this paper, we will consider one such problem: a search-and-carry-back task that requires coordinated materials handling.

The central issue we consider in the design of our multi-robot team is the motion planning strategy. Potential field approaches [8, 5] are widely used for motion planning in mobile robotics because of their simplicity and elegance. In Koditschek's basic formulation [7], a scalar field comprising artificial "hills" (representing obstacles, or other robots) and "valleys" (attractive positions) in the robot's world map leads naturally to a stable path towards a "low-energy" goal position. Control laws for potential field systems typically generate motion controls along gradient directions (steepest descent) in this scalar field. The steepest descent approach is satisfactory in most situations, but has some inherent flaws. Solutions tend, for example, to exhibit "zig-zagging" (rapid changes in direction), especially in proximity to obstacles. Although stability can be proven mathematically [9], in practice this zigzag phenomenon means that either step sizes (controller sampling rates) must be short or robot speeds must be slow. Also, gradient descent methods tend to complete their tasks slowly, as many "unnecessary" controls are generated. To overcome these shortcomings, in this paper we present a potential-field navigation scheme based on a modified Newton's method that retains all the merits of the potential-field approach, while at the same time eliminating zigzagging, achieving faster task completion and allowing a bigger step size (faster speed).

A second important issue in the design of a coordinating robot team is the choice of coordination strategy and communication type. In this paper, we adopt a type of peer-to-peer negotiation coordination based on explicit communication [1]. There is no leader in the team; each robot communicates and negotiates with other robots using a semantic coordination protocol [6] and makes its own coordination decision solely based on its best knowledge of the world model. To gather information, robots use sensors as well as inter-robot communication performed through a socket-based channel. At periodic intervals, robots broadcast their internal maps to the other robots. Maps received via broadcast are merged with information derived from personal exploration (internal maps) to form a common world model. In the limit of perfect communication, this means that the robots share a "collective consciousness" of their environment. In this paper, we restrict our interest to a homogeneous team, since it simplifies the task of modelling teammates.

By combining the technique of artificial potential fields, optimization and ideas from agent theory, we show that our robot team can successfully perform a coordination task in an initially unknown environment while avoiding collision with static obstacles and other team members. At the same time, we present a control law that guarantees team stability for all time, and has the potential to be local-minimum free.
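The contrast between steepest descent and a damped-Newton step can be seen on a toy "narrow valley". This is our own minimal sketch on an ill-conditioned quadratic, not the paper's navigation function:

```python
import numpy as np

# Toy illustration of the zig-zag problem (our own example, not the paper's
# navigation function): fixed-step gradient descent on the ill-conditioned
# quadratic valley f(q) = 0.5*(x^2 + 100*y^2) oscillates across the valley
# floor, while a modified Newton step B = (cI + H)^{-1} heads straight in.

H = np.diag([1.0, 100.0])           # constant Hessian of the quadratic

def grad(q):
    return H @ q

def run(step_fn, q0, tol=1e-6, max_iter=10000):
    q = np.array(q0, dtype=float)
    for k in range(max_iter):
        if np.linalg.norm(grad(q)) < tol:
            return k                # iterations until (near) convergence
        q = q - step_fn(q)
    return max_iter

# Steepest descent: the step size must stay below 2/100 for stability,
# so progress along the shallow x-direction is very slow.
gd_steps = run(lambda q: 0.015 * grad(q), [1.0, 1.0])

# Modified Newton with a small regularizer c; H is already positive
# definite here, so a tiny c suffices.
c = 1e-4
B = np.linalg.inv(c * np.eye(2) + H)
mn_steps = run(lambda q: B @ grad(q), [1.0, 1.0])

print(gd_steps, mn_steps)           # gradient descent needs far more steps
```

As c grows large, the same update degenerates toward a scaled gradient step, which is the trade-off the method navigates.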
¹ This work was supported by an NSERC grant.
0-7803-7924-1/03/$17.00 ©2003 IEEE
2 Assumptions of Robot Capabilities
For the purposes of algorithm development, we assume a certain set of robot capabilities, all of which can be implemented by off-the-shelf components. The robots are assumed to have knowledge of their position, through a positioning system such as GPS. Robots are assumed to have a sensor (e.g. a camera) capable of determining the relative position of obstacles and targets. Finally, we assume that robots have omnidirectional navigational capabilities. In future work, we will explore modifications to our control structure that allow us to relax some of these assumptions.
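The assumed capabilities can be captured as a small interface. The method names and signatures below are our own illustration of the assumed off-the-shelf components, not the authors' API:

```python
from abc import ABC, abstractmethod

# Interface capturing the capability assumptions above (our own sketch).

class RobotPlatform(ABC):
    @abstractmethod
    def position(self) -> tuple[float, float]:
        """Absolute pose, e.g. from a GPS-like positioning system."""

    @abstractmethod
    def sense(self) -> list[tuple[str, float, float]]:
        """Relative positions of sensed obstacles/targets, e.g. from a camera.
        Each entry is (label, dx, dy)."""

    @abstractmethod
    def move(self, vx: float, vy: float) -> None:
        """Omnidirectional velocity command."""

class SimRobot(RobotPlatform):
    def __init__(self, x: float, y: float):
        self.x, self.y = x, y
    def position(self):
        return (self.x, self.y)
    def sense(self):
        return []                   # stub: nothing in view
    def move(self, vx, vy):
        self.x += vx
        self.y += vy

r = SimRobot(0.0, 0.0)
r.move(1.0, 0.5)
print(r.position())  # (1.0, 0.5)
```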
As mentioned above, we assume that the robots communicate sensor and pose information to create a shared global knowledge base. The design of our agent-based guidance software is presented in detail below in Section 4.
Figure 1: Agent-based navigation software architecture. Local and remote sensor information are integrated into the knowledge base, and no distinction is made between them during motion planning.
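The broadcast-and-merge world model described above can be sketched as a simple set union. The class and field names here are our own illustration, not the authors' implementation:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the broadcast-and-merge world model. Field names
# (searched_cells, obstacles, targets) are illustrative assumptions only.

@dataclass
class WorldModel:
    searched_cells: set = field(default_factory=set)
    obstacles: set = field(default_factory=set)
    targets: set = field(default_factory=set)

    def merge(self, other: "WorldModel") -> None:
        # Merging a teammate's broadcast map is a set union: with perfect
        # communication, every robot converges to the same "collective
        # consciousness" of the environment.
        self.searched_cells |= other.searched_cells
        self.obstacles |= other.obstacles
        self.targets |= other.targets

a = WorldModel({(0, 0), (0, 1)}, {(3, 3)}, set())
b = WorldModel({(0, 1), (1, 1)}, set(), {(5, 5)})
a.merge(b)
print(sorted(a.searched_cells))  # [(0, 0), (0, 1), (1, 1)]
```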
3 Problem Statement

We test our approach in simulations of a simple search-and-carry-back application, similar to those studied in [1]. We assume a team of Q robots is assigned to search an unknown field. The configuration q_i of each robot is given by the vector q_i = (x_i, y_i) of the position of its center of mass. We also define q = (q_1, q_2, ..., q_Q) as the state vector of the robot team. The area of the field is known to be M x N units, and the area is assumed to contain an unknown number of convex obstacles and targets. For the purposes of searching, the area is divided into an appropriate number of known and unknown sectors. The area of one sector corresponds to the area that can be surveyed in one sensor sweep.

The robot team is required to search the unknown work area, find all the target objects and return them to a (known) goal position. Target objects are defined as "big" and "small". By definition, a single robot can transport a small object, but big objects can only be transported by teams of two robots, and will require task coordination. The robot's sensor is assumed to be able to discriminate between big and small objects.

4 Software Architecture

Our control software (Figure 1) is structured in four hierarchical layers, called the DecisionMaking (DM), KnowledgeBase (KBase), Coordination and Communication layers. We define the functionality of each layer as well as a message passing interface between layers. Control software is multi-threaded, with each main object running in its own thread. Devices at the Coordination layer separate coordination tasks from the main periodic motion planning task. Coordination between robots is performed using an unambiguous communication protocol [6]. We define interfaces between the coordination devices and both the KBase and Communication layers. Devices at the Coordination layer perform coordination tasks and store the result in the KBase object, without interfering with regular decision making. As a result, it is a simple matter to add new coordination devices if necessary for new applications. In this application, we use two coordination devices, called ShareInfo and HelpNego. Robots communicate using socket-based Ethernet message exchange. All messages are broadcast using specific sockets and received by all robots. We define two classes of messages: regular update messages and negotiation messages. Regular update messages are periodically broadcast to other team members by the ShareInfo object and contain the robot's current pose information and any new data added to the KBase (searched cells, found objects and found obstacles). Robots receiving this information will update their own KBase objects, to synchronize the team world model. Negotiation messages are exchanged by the HelpNego object only when necessary, and are used to coordinate the carrying of big objects.

4.1 Decision Making

The Decision Making (DM) layer is the main intelligence in the system, and generates motion plans based on the information stored in the KBase. The DM performs motion planning based on a hybrid-systems adaptation of our modified potential field approach. Hybrid system theory [4] is concerned with the time evolution of systems represented with a continuous state vector q and a discrete mode S, such that in mode S the dynamics of q are governed by the differential equation q̇ = f_S(q). In our DM layer, we associate a different navigation function with each mode, and the robots switch between modes based on external events. In this section we define our mode switching protocol. We will discuss the navigation functions below in Section 6.1.

In our simulation of the search-and-carry-back task, robots switch between five generic modes, called Search, AssignmentInitiate, AssignmentReply, TransportSObject and TransportBObject. In the Search mode, the robot is attracted to unexplored territory. In the AssignmentInitiate mode (entered when a robot has collected a big object), the robot's HelpNego device starts an assignment-initiate session and waits for a helper robot to arrive. In the AssignmentReply mode (entered when robots receive assignment-initiate messages from a team member who has found a big object), the robot's HelpNego object enters into an assignment-reply session but continues its other tasks. In the TransportSObject mode (entered when a robot has collected a small object), the robot is attracted to the goal. In the TransportBObject mode (entered when a helper robot arrives at a big target), two robots are bonded together with the target and attracted to the goal. In all five modes, robots are repelled by obstacles and other team members.

Based on these five generic modes, a robot can be in an indefinite (but finite) number of specific modes, based on sensor information and communication information, and specifically, on the numbers of unsearched sectors, found obstacles and found targets. This mode definition allows us to guarantee that the mode-switching graph will be acyclic. A detailed description of our mode-switching strategy can be found in [9].

5 Potential Field Models

We base our potential fields on the two-dimensional Gaussian function [9]. Attractive points, such as targets and the goal position, located at (a_x, a_y), are represented by the negative Gaussian attractor function:

f_A(x, y) = 1 − e^(−[(x − a_x)² + (y − a_y)²] / (2σ²))    (1)

Repulsive points, such as obstacles and other robots, located at (r_x, r_y), are modeled with the circular, two-dimensional Gaussian-like repulsor function:

f_R(x, y) = e^(−[(x − r_x)² + (y − r_y)²]^C / σ²)    (2)

for some positive integer C. The variance σ is a measure of the size of the obstacle. The variable C determines the effective range (steepness) of the obstacle. We represent convex obstacle shapes with a circumscribed circle.

6 Derivation of Modal Control Laws

6.1 Derivation of the Navigation Function

Using the technique of artificial potential fields, we construct a navigation function φ_X(q) for each robot i in mode X (where X can be any of the modes defined above). To construct the navigation function for a given mode, we use the three-part formula:

φ_X(q) = V_A^X(q_i) + V_O^X(q_i) + Σ_{j ≠ i} V_R(q_i, q_j)    (3)

In Equation 3, V_A^X(q_i) represents the sum of the effects on robot i of all the N_A attractors in the system during mode X. During Search modes, the unsearched cells are the attractors, and therefore N_A = N_U. During the transport phase, there is only one attractor (the goal), so N_A = 1. In general, then:

V_A^X(q_i) = Σ_{k=1}^{N_A} (1 − e^(−[(x_i − (a_k)_x)² + (y_i − (a_k)_y)²] / (2σ²)))    (4)

V_O^X(q_i) represents the sum of the effects on robot i of all the known, fixed obstacles in the system during mode X. The number of obstacles is always N_O, so we can state the general case as:

V_O^X(q_i) = Σ_{k=1}^{N_O} e^(−[(x_i − (r_k)_x)² + (y_i − (r_k)_y)²]^C / σ²)    (5)

Finally, the functions V_R(q_i, q_j) represent the repulsor functions between pairs of robots i and j. Note that V_R is not mode dependent, since the number of robots is assumed to be constant. Thus, in general:

V_R(q_i, q_j) = e^(−[(x_i − x_j)² + (y_i − y_j)²]^C / σ²)    (6)

6.2 Control Law

Using the navigation function defined above, the (mode-dependent) dynamics of each robot are then given by the control law:

q̇_i = −α B(q) [∂φ_X(q)/∂q_i] / |∂φ_X(q)/∂q_i|    (7)

In Equation 7, the operator ∂φ_X(q)/∂q_i represents the gradient of φ_X(q) with respect to only q_i, and the parameter α determines robot speed. We use a unit gradient because our Gaussian-like potential functions decay very rapidly and robot speed should not depend on position.

The matrix B(q), introduced in Equation 7, is based on the modified Newton's method, familiar in optimization theory [2]. We define B(q) = (cI + H)^(−1) as a positive definite matrix, for c = c(q) ≥ 0 and H the Hessian matrix of φ_X(q). To prove that B(q) is positive definite, we first recall the following:

Lemma 1: A is a positive definite matrix if and only if all its eigenvalues are positive.

Lemma 2: If matrix A is positive definite, then A^(−1) is also positive definite.

With these lemmas, we can prove the following theorem:

Theorem: Given δ > 0, let c = max{δ − [(h11 + h22) − √((h11 − h22)² + 4h12²)]/2, 0}; then B is a positive definite matrix.

Proof: If the matrix cI + H is positive definite, then B = (cI + H)^(−1) is also positive definite, by Lemma 2. Therefore, it suffices to prove that cI + H is positive definite under the given conditions.
Let μ_1, ..., μ_n be the eigenvalues of H and λ_1, ..., λ_n the eigenvalues of cI + H. Because H is a real symmetric matrix, the eigenvalues μ_1, ..., μ_n are real, but may not be positive.

Given δ > 0, let c ≥ 0 be the smallest scalar that would make λ_i ≥ δ, ∀i. According to Lemma 1, choosing c to satisfy this condition will guarantee cI + H is positive definite and invertible. To complete the proof, we find a construction for c. First, observe:

|λI − (cI + H)| = |(λ − c)I − H| = 0

Therefore, μ_i = λ_i − c, and the problem of finding a c to guarantee B(q) is positive definite reduces to finding the eigenvalues of H(q). For our two-dimensional problem, using h21 = h12:

|μI − H| = | μ − h11    −h12 |
           | −h21    μ − h22 | = μ² − (h11 + h22)μ + h11h22 − h12² = 0

with roots

μ_1 = [(h11 + h22) + √((h11 − h22)² + 4h12²)] / 2
μ_2 = [(h11 + h22) − √((h11 − h22)² + 4h12²)] / 2

Now, we can guarantee λ_i = c + μ_i ≥ δ, ∀i, by choosing:

c = max(δ − μ_min, 0)    (8)

where μ_min = min(μ_1, μ_2) is the smallest eigenvalue of H.

We make the following remarks about this control law:

Remark 1. From the viewpoint of optimization theory [2], the above control law corresponds to the modification of Newton's method. If c = 0, the control law reduces to Newton's method, which enjoys a second-order rate of convergence. If c → ∞, the modified Newton's method approaches a pure gradient method that has only a linear convergence rate. Generally speaking, Newton's method is more efficient than the gradient method.

Remark 2. The positive δ is a threshold to avoid the ill-conditioning problem, so it is important to properly select δ. If δ is chosen to be very small, to ensure the asymptotic quadratic convergence rate through the reduction of the method to Newton's method, then ill-conditioning might occur at points where the Hessian H is (near) singular. On the other hand, if δ is chosen to be very large, which would necessitate using a large value of c and would make B diagonally dominant, then the method would behave similarly to the steepest descent method.

Remark 3. The problem solved in this paper is two-dimensional and we can find the analytical solution, so we can overcome several known difficulties of Newton's method, such as the amount of computation involved and the difficulty of inverting the Hessian matrix.

Remark 4. In motion planning for robots, it is more reasonable to adopt a constant step size (control interval) rather than an optimal step size, in order to let robots move smoothly.

The modified Newton's method has the following three advantages over the steepest descent method:

Speed of task completion: In most optimization problems, the modified Newton's method will converge much faster than the steepest descent method. In the language of robot motion planning, this means that robots with the same control interval complete their tasks much faster using a Newton's method based navigation approach than using a gradient descent approach. In our simulation with 1 target and 3 obstacles (Figure 2), robots using the modified Newton's method reach the goal after 115 control intervals, compared to 335 for the steepest descent method.

Figure 2: Comparison of the speed of task completion. The robot using our modified Newton's method based planner reaches its goal after 115 control intervals. During that time, the robot using gradient descent makes almost no progress.

Overcome the zig-zag phenomenon: Controls using the steepest descent method exhibit a zig-zag phenomenon, in which the trajectory of robots moving in the presence of obstacles becomes oscillatory. This phenomenon can become quite severe when robots work in a long-narrow-valley potential field (i.e. high potential value on both sides and low potential value in the middle), which occurs quite often with a complex configuration of multiple targets and obstacles. The modified Newton's method does not have this flaw and therefore gives steady, good performance.

Bigger step size: Although Newton's method and the steepest descent method both require the optimal step size in optimization theory, we modify them to use a fixed step size (control interval). One reason is that searching for optimal steps increases computation time, which is limited in real-time applications. A second reason is that frequent changes to robot speed are probably not practical, due to constraints on the robot's dynamics. However, fixed step sizes may cause instabilities; we return to this issue in our stability analysis, and show that the dynamics within any given mode are stable.
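Putting Equations 1 through 8 together, one control step of the method can be sketched as follows. This is our own minimal single-robot implementation with analytical gradients and Hessians (in the spirit of Remark 3); the goal, obstacle and parameter values (σ, C = 1, α, δ) are illustrative assumptions, not the values used in the paper:

```python
import numpy as np

# Illustrative sketch (our own, not the authors' code) of the modal control
# law: one robot, one attractor (the goal) and one obstacle, with analytical
# gradient and Hessian. With C = 1 the repulsor is a plain Gaussian. All
# parameter values below are assumptions.

alpha, delta = 0.05, 0.1          # fixed step size and eigenvalue threshold
s_a, s_o = 3.0 ** 2, 0.5 ** 2     # sigma^2 for attractor and obstacle
goal = np.array([5.0, 5.0])
obstacle = np.array([2.0, 3.2])

def grad_phi(q):
    u, v = q - goal, q - obstacle
    A = np.exp(-(u @ u) / (2 * s_a))   # attractor Gaussian (as in Eq. 1)
    R = np.exp(-(v @ v) / s_o)         # repulsor Gaussian (Eq. 2, C = 1)
    return (u / s_a) * A - (2 * v / s_o) * R

def hess_phi(q):
    u, v = q - goal, q - obstacle
    A = np.exp(-(u @ u) / (2 * s_a))
    R = np.exp(-(v @ v) / s_o)
    I = np.eye(2)
    H_a = (A / s_a) * I - (A / s_a ** 2) * np.outer(u, u)
    H_r = (-2 * R / s_o) * I + (4 * R / s_o ** 2) * np.outer(v, v)
    return H_a + H_r

def control_step(q):
    H = hess_phi(q)
    mu_min = np.linalg.eigvalsh(H)[0]       # smallest eigenvalue of H
    c = max(delta - mu_min, 0.0)            # Eq. 8: shift spectrum up to delta
    B = np.linalg.inv(c * np.eye(2) + H)    # modified Newton matrix B(q)
    d = B @ grad_phi(q)
    return q - alpha * d / np.linalg.norm(d)  # unit direction, fixed speed

q = np.array([0.0, 0.0])
for _ in range(400):
    q = control_step(q)
print(np.round(q, 2))   # robot settles near the goal, skirting the obstacle
```

Because B(q) is positive definite, the step is always a descent direction for φ, while the eigenvalue shift of Equation 8 keeps the update well conditioned near inflection points of the Gaussian field.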