Proceedings of the 2002 IEEE International Conference on Robotics & Automation Washington, DC • May 2002

Resistors, Markov Chains & Dynamic Path Planning

John S. Zelek
School of Engineering, Univ. of Guelph
Guelph, ON, N1G 2W1, Canada.
e-mail: [email protected]

0-7803-7272-7/02/$17.00 © 2002 IEEE

Abstract

Dynamic planning involves continuously updating a map by sensing changes in the environment and planning appropriate actions, with all tasks sharing common computational resources. We use harmonic functions for dynamic planning. Analogous representations of harmonic functions as Markov chains and resistor networks are used to develop the notions of escape probability and energy dissipation. These measures indicate convergence (the event that permits resources to be devoted to non-planning tasks) more robustly than monitoring the maximum or average field change between iterations. The convergence time of the harmonic function is related quadratically to the number of grid elements. An example of an irregular sampling strategy, the quad tree, is developed for harmonic functions; it is complete yet imprecise. Quad trees are not a sufficient sampling strategy for addressing the exponential growth of multiple dimensions, and current investigations therefore include other sampling strategies or dimensional parallelization.

1 Introduction

Sensor-based discovery path planning is the guidance of an agent (a robot) without a complete a priori map, by discovering and negotiating the environment so as to reach a goal location while avoiding all encountered obstacles. The dynamic path planning problem extends the basic navigation planning problem [5].

Typically, path planning problems have been posed as AI search problems [1]. One particular path planning technique we have used is based on solving Laplace's equation using iterative finite-difference equations [2]. This technique is equivalent to viewing the problem as random walks on finite networks, which are regarded as finite-state Markov chains [3]. There is also an analogical connection to resistor circuits and their accompanying currents and voltages. This connection allows us to make use of electrical network theory and its tools in analyzing the path planning problem. In particular, we find that escape probability or circuit energy dissipation measures give a better indication of convergence of the field than observing changes between iterations (either average or maximum change).

Other methods [2] have typically looked at changes (average or maximum) in field values between iterations. Theoretically, this is an adequate approach. However, it is flawed in practice. Discrete harmonic function components experience small changes between neighbouring nodes, thus straining the limits of resolution available on a computer (i.e., there is even a constraint of a 7:1 ratio of width:length of open space representable on 32-bit machines [4]). In most applications, notification of convergence is essential for diverting computational resources from iterating the kernel to other tasks of the autonomous agent, such as sensing or reasoning. In addition, most uses of harmonic functions for planning have assumed a regular grid sampling, which may be inefficient depending on the environmental map configuration. Computation time increases in proportion to the number of grid elements, so reducing the number of grid elements is beneficial. Therefore an irregular sampling strategy is desirable (a quad tree representation is shown).

2 Harmonic Function Path Planning

The computation of a potential function from which the path is generated is performed over an occupancy grid representation of a map. It is executed as a separate process from the path executioner. The path executioner requests trajectories, which are computed by performing steepest gradient descent on the potential function. Linear interpolation is used to approximate potential function values when the position of the robot is between mesh points. Space is assumed to be void of obstacles when its actual composition is unknown. This computation has the desirable property that map features can be added or removed dynamically (i.e., as they are sensed), alongside the correction of the robot's position. In addition, these events are independent of the computation and the execution of the path. As a result, localization of the robot can be performed independently of path planning and execution. These properties are available because the potential function is a harmonic function.
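The trajectory computation described above (steepest gradient descent with linear interpolation between mesh points) can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the function names, the fixed step length, the central-difference gradient, and the toy potential `phi = x + 0.5*y` are all assumptions made for the example.

```python
import math

def interpolate(grid, x, y):
    """Bilinear interpolation of grid[row][col] at real-valued (x, y) = (col, row)."""
    col = max(0, min(int(math.floor(x)), len(grid[0]) - 2))
    row = max(0, min(int(math.floor(y)), len(grid) - 2))
    fx, fy = x - col, y - row
    upper = (1 - fx) * grid[row][col] + fx * grid[row][col + 1]
    lower = (1 - fx) * grid[row + 1][col] + fx * grid[row + 1][col + 1]
    return (1 - fy) * upper + fy * lower

def gradient(grid, x, y, h=1e-3):
    """Central-difference estimate of the gradient of the interpolated field."""
    gx = (interpolate(grid, x + h, y) - interpolate(grid, x - h, y)) / (2 * h)
    gy = (interpolate(grid, x, y + h) - interpolate(grid, x, y - h)) / (2 * h)
    return gx, gy

def descend(grid, start, step=0.25, stop_value=0.5, max_steps=1000):
    """Follow the negative gradient until the potential drops to stop_value."""
    x, y = start
    path = [(x, y)]
    for _ in range(max_steps):
        if interpolate(grid, x, y) <= stop_value:
            break
        gx, gy = gradient(grid, x, y)
        norm = math.hypot(gx, gy)
        if norm < 1e-12:  # flat region: no descent direction available
            break
        # Take a fixed-length step downhill and keep the position on the mesh.
        x = min(max(x - step * gx / norm, 0.0), len(grid[0]) - 1.0)
        y = min(max(y - step * gy / norm, 0.0), len(grid) - 1.0)
        path.append((x, y))
    return path

# Toy potential (an assumption for illustration): phi = x + 0.5*y on a 5x5 mesh.
field = [[col + 0.5 * row for col in range(5)] for row in range(5)]
path = descend(field, start=(3.5, 2.0))
potentials = [interpolate(field, px, py) for px, py in path]
```

In a planner, `field` would be the relaxed harmonic potential and `stop_value` a threshold near the goal potential; because a harmonic function has no interior local minima, the descent cannot stall before reaching a fixed-potential cell.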


A harmonic function on a domain $\Omega \subset R^n$ is a function which satisfies Laplace's equation:

$$\nabla^2 \phi = \sum_{i=1}^{n} \frac{\partial^2 \phi}{\partial x_i^2} = 0 \qquad (1)$$

The value of $\phi$ is given on a closed domain $\Omega$ in the configuration space C. For mobile robot navigation, C corresponds to a planar set of coordinates (x, y). A harmonic function satisfies the Maximum Principle and the Uniqueness Principle [3]. The Maximum Principle states that a harmonic function f(x, y) defined on $\Omega$ takes on its maximum value M and its minimum value m on the boundary, which guarantees that there are no local minima in the interior of the harmonic function. The Uniqueness Principle stipulates that if f(x, y) and g(x, y) are harmonic functions on $\Omega$ such that f(x, y) = g(x, y) for all boundary points, then f(x, y) = g(x, y) for all x, y.

Iterative applications of a finite difference operator are used to compute the harmonic function because it is sufficient to only consider the exact solution at a finite number of mesh points. This is in contrast to finite element methods, which compute an approximation of an exact solution by continuous piece-wise polynomial functions. The obstacles and grid boundary form boundary conditions. The boundaries are fixed to a high potential value while the goal location(s) is fixed to a low potential. Points that are not boundary or goal points (i.e., the free space) are allowed to fluctuate and produce the harmonic function when computed using an iteration kernel, which is formed by taking advantage of the harmonic function's inherent averaging property, where a point is the average of its neighbouring points. A five-point iteration kernel for solving Laplace's equation is used:

$$U_{i,j} = \frac{1}{4}\left(U_{i+1,j} + U_{i-1,j} + U_{i,j+1} + U_{i,j-1}\right) \qquad (2)$$

The computation of the harmonic function can be formulated with two different types of boundary conditions. The Dirichlet boundary condition is where u is given at each point of the boundary. This inherently makes all applicable boundary points into sources (i.e., in terms of sources and sinks for modeling liquid flow). The Neumann boundary condition is where $\partial u / \partial n$, the normal component of the gradient of u, is given at each point of the boundary. In order to have flow, there has to be a source and a sink. Thus, the boundary of the mesh is modeled as a source, and the boundary of the goal model is modeled as a sink. The boundaries of the obstacles are modeled according to the type of boundary condition chosen: Dirichlet or Neumann. For our applications, Dirichlet boundary conditions are used because of their inherent property of minimizing the hitting probability. Hitting probability minimization is ideal when uncertainty is present in obstacle and robot location mapping.

Let $U_{0,0}$ define the origin of a Cartesian coordinate space. The space does not necessarily have to be confined to two dimensions. In addition, rather than Cartesian space, we could also use configuration space C, which is a space that is expressed in terms of the degrees of freedom of the robot. If equal sampling intervals are assumed, i.e., $U_{i+\Delta i, j+\Delta j} = U_{i+1,j+1}$, then the sampling interval when $\Delta i = \Delta j$ is determined by the size of the robot and the spacing between obstacles. The robot is modelled as a point, and obstacles are padded accordingly. Nonsymmetrical shapes (e.g., rectangular) can be addressed using the maximum dimension of the shape. However, in tight environmental situations, the minimal inscribing circle can be used and orientation constraints can be introduced into the grid.

3 Other Frameworks

3.1 Markov Chains

Equivalent to iteration kernels, the solution of the harmonic function can be found using the method of Markov chains, using probability functions for a random walk on a Markov chain [3]. A Markov process is a random process whose transition probabilities at the current time do not depend on prior transitions. A finite Markov chain is a special type of chance process that moves around the set of states $S = \{s_1, s_2, ..., s_r\}$ in a Markovian fashion. When the process is in state $s_i$, it moves with probability $P_{ij}$ to state $s_j$. The transition probabilities $P_{ij}$ are represented by the transition matrix P. Each cell in the occupancy grid representation is a potential state for the transition matrix. In an equally spaced grid where each grid element is 4-connected, the probability of transition from an interior node to each of its neighbours is 1/4. If $P^n$ is the matrix P raised to the n'th power, the entries $P^n_{ij}$ represent the probability that the chain, started in state $U_{i,j}$, will be in state $U_{i',j'}$ after n steps. A Markov chain is a regular chain if some power of the transition matrix has no zeros. States that are referred to as traps (i.e., absorbing states) are typically the goal states, the cells that were assigned a low potential.
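As an illustration of the five-point kernel of Equation 2 under Dirichlet boundary conditions, the sketch below relaxes a small grid whose outer boundary is held at a high potential and whose single goal cell is held at a low potential. Each update is also one step of the expected value of a 4-connected random walk with transition probability 1/4. The grid size, the in-place (Gauss-Seidel style) sweep order, the tolerance, and all names are assumptions for the example, not details taken from the paper.

```python
def relax(grid, fixed, tol=1e-10, max_sweeps=10000):
    """Apply the five-point kernel in place until the largest update is below tol.

    Returns the number of sweeps performed.
    """
    rows, cols = len(grid), len(grid[0])
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                if fixed[r][c]:
                    continue  # boundary, obstacle, or goal cell: value is held
                new = 0.25 * (grid[r + 1][c] + grid[r - 1][c]
                              + grid[r][c + 1] + grid[r][c - 1])
                max_change = max(max_change, abs(new - grid[r][c]))
                grid[r][c] = new
        if max_change < tol:
            return sweep
    return max_sweeps

# Hypothetical 9x9 map: outer boundary held high, one goal cell held low.
N = 9
HIGH, LOW = 1.0, 0.0
grid = [[HIGH] * N for _ in range(N)]
fixed = [[r in (0, N - 1) or c in (0, N - 1) for c in range(N)] for r in range(N)]
goal = (4, 4)
grid[goal[0]][goal[1]] = LOW
fixed[goal[0]][goal[1]] = True

sweeps = relax(grid, fixed)
free = [(r, c) for r in range(N) for c in range(N) if not fixed[r][c]]
```

After relaxation every free cell equals the average of its four neighbours and lies strictly between the two fixed potentials, which is the discrete form of the Maximum Principle. Stopping on the maximum change, as done here, is the criterion the paper argues is less robust than escape probability or energy dissipation.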

3.2 Resistor Networks

Yet another alternative method for solving a harmonic function, as opposed to an iteration kernel or Markov chains, is to analyze the grid as a resistor network [3] and to translate this interpretation into the language of a random walker on a collection of grid points. This is relevant when we idealize our path planner as an optimal random walker. The finite collection of points (i.e., vertices, nodes) can be viewed as a graph structure with point pairs connected by edges (also called branches). The graph structure we have been referring to is regular and 4-connected, meaning that all interior points of the graph are connected to four equally-spaced neighbours (to the right, left, up or down). We can also talk about 8-connected networks, but we will confine the discussion to the former. Let the nodes in the graph encode voltages and the edges encode resistor values, which directly correspond to spatial distances. In a 4-connected network, the resistor values are always equal and set to unity.

Let the voltage to the circuit be applied across the obstacles (including the boundary) and goal locations. When a unit voltage [3] is applied between h (i.e., obstacles) and l (i.e., goal(s)), making $V_h = 1$ and $V_l = 0$ (i.e., Dirichlet boundary conditions), the voltage $V_x$ at any point x represents the probability that a walker starting from x will return to h before reaching l. The voltage values can also be referred to as a hitting probability. The current across an edge directly corresponds to the gradient in that particular direction. The current $I_{ij}$ is proportional to the expected net number of movements along the edge from i to j, where movements from j back to i are counted as negative. Steepest gradient descent corresponds to choosing the largest current flowing outwards from a particular node. The gradient in each of the possible (i.e., 4) directions represents a current. It also corresponds to minimizing the hitting probability, which is what the voltage encodes.

The resistance $R_{i,j}$ of an edge (i, j) leads us to define the conductance of the edge as $C_{i,j} = 1/R_{i,j}$. A random walk on the graph G is a Markov chain with a transition matrix P given by $P_{xy} = C_{xy}/C_x$, where $C_x = \sum_y C_{xy}$. A Markov chain encoded as a graph that is connected (the walker can go between any two states) is called an ergodic Markov chain. We will only consider these types of Markov chains.

The high and low potentials referred to earlier can also be cast as an applied voltage between two or more points. By Ohm's law, the voltage across each resistor is related to the current through it, and applying Kirchhoff's Current Law (which requires that the total current flowing into any point, other than those across which the potential is applied, be 0) results in:

$$\sum_{j} I_{i,j} = \sum_{j} (V_i - V_j)\, C_{i,j} = 0 \qquad (3)$$

This in essence is the harmonic property that holds for all points (i.e., nodes) that are not fixed to a high or low potential.

By imposing a voltage between obstacles $V_h$ and goals $V_l$, a current $I_h = \sum_x I_{hx}$ will flow into the circuit from the outside source. The effective resistance $R_{EFF}$ between the h's and l's is defined as $R_{EFF} = V_h / I_h$. The reciprocal quantity $C_{EFF} = 1/R_{EFF} = I_h / V_h$ is called the effective conductance. The effective conductance is used to calculate the escape probability, which is given as $p_{escape} = C_{EFF}/C_h$, where $C_h$ is $1/\sum_x R_{hx}$. Thus the escape probability can be expressed as follows:

$$p_{escape} = \frac{I_h}{V_h}\Big(\sum_x R_{hx}\Big) = \frac{\big(\sum_x I_{hx}\big)\big(\sum_x R_{hx}\big)}{V_h} \qquad (4)$$

where the numerator is just the summation of the voltage differences to the neighbouring nodes:

$$p_{escape} = \frac{\sum_x \left(V_{hx} - V_{(h-1)x}\right)}{V_h} \qquad (5)$$

Applicable to escape probabilities is Rayleigh's Monotonicity Law and its variants. Rayleigh's Monotonicity Law states that if the resistances of a circuit are increased, the effective resistance $R_{EFF}$ between any two points can only increase. If they are decreased, it can only decrease. Similar rules that result from applying the Monotonicity Law are as follows:
• Shorting Law: shorting certain sets of nodes together can only decrease the effective resistance of the network between two given nodes.
• Cutting Law: cutting certain branches can only increase the effective resistance between two given nodes.

Decreasing the effective resistance increases the escape probability, while cutting branches decreases the escape probability. This is applicable for dynamic path planning when new static obstacles are discovered and more immediately applicable for dynamic obstacle detection and updating.

The energy dissipated through a resistor is given by:

$$I_{i,j}^2 R_{i,j} \qquad (6)$$

Therefore, the total energy dissipation in a circuit is:

$$E = \frac{1}{2} \sum_{i,j} I_{i,j}^2 R_{i,j} \qquad (7)$$

and since

$$V_h I_h = \frac{1}{2} \sum_{i,j} I_{i,j}\,(V_i - V_j) = \frac{1}{2} \sum_{i,j} I_{i,j}^2 R_{i,j} \qquad (8)$$

which is equivalent to the last formulation, and recalling that $R_{EFF} = V_h / I_h$, therefore:

$$E = I_h^2 R_{EFF} = \frac{1}{2} \sum_{i,j} I_{i,j}^2 R_{i,j} \qquad (9)$$

The currents in the circuit minimize the energy dissipation. We hypothesize that the escape probability gives us a measure of the environmental complexity. Also, energy dissipation gives a measure of convergence of the solution (as opposed to monitoring error levels between iterations). Energy formulations have been used in the past, but only in the context of including the gradient as an energy term alongside forces for dynamic arm trajectory planning and not as a convergence measure [2].
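The following sketch shows how the source current (the numerator of Equation 5 with unit resistors and $V_h = 1$) and the dissipated energy of Equation 7 might be tracked while a grid relaxes, and how the energy trace can serve as a convergence signal. It is an illustration under stated assumptions: the 9x9 map, the Gauss-Seidel sweep, the relative threshold of 1e-6, and all names are hypothetical, and the current is left unnormalized (no division by $C_h$).

```python
N = 9
GOAL = (4, 4)

def is_source(r, c):
    """Outer boundary cells are held at the high potential V_h = 1."""
    return r in (0, N - 1) or c in (0, N - 1)

def sweep(v):
    """One in-place pass of the five-point kernel over the free nodes."""
    for r in range(1, N - 1):
        for c in range(1, N - 1):
            if (r, c) != GOAL:
                v[r][c] = 0.25 * (v[r + 1][c] + v[r - 1][c]
                                  + v[r][c + 1] + v[r][c - 1])

def edges():
    """Yield each 4-connected edge of the grid exactly once."""
    for r in range(N):
        for c in range(N):
            if c + 1 < N:
                yield (r, c), (r, c + 1)
            if r + 1 < N:
                yield (r, c), (r + 1, c)

def dissipated_energy(v):
    """Sum of I^2 R over all edges; with R = 1 the current is the voltage drop."""
    return sum((v[a][b] - v[c][d]) ** 2 for (a, b), (c, d) in edges())

def source_current(v):
    """Total current leaving the sources: voltage drops on source-to-free edges."""
    total = 0.0
    for (a, b), (c, d) in edges():
        if is_source(a, b) != is_source(c, d):
            total += abs(v[a][b] - v[c][d])
    return total

# Free space starts at the high potential; the goal is held at V_l = 0.
v = [[1.0] * N for _ in range(N)]
v[GOAL[0]][GOAL[1]] = 0.0

history = []
for _ in range(300):
    sweep(v)
    history.append((source_current(v), dissipated_energy(v)))

energies = [e for _, e in history]
# First sweep at which the relative change in energy falls below 1e-6.
converged_at = next(k for k in range(1, len(energies))
                    if abs(energies[k] - energies[k - 1]) < 1e-6 * energies[k])
```

Because each single-node update sets the node to the average of its neighbours, it cannot increase the total energy, so the energy trace decreases steadily toward its minimum. At steady state the energy equals $V_h I_h$ (Equation 8), which with $V_h = 1$ is the source current itself.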

4 Spatial Representations

The circuit analogy lends itself to a discrete implementation of reality. Sampling the environment at regular intervals is the simplest approach. The irregular sampling method is based on a quad-tree formulation and is intended to minimize computation for large homogeneous regions.

4.1 Regular Sampling

The environment is sampled at regular intervals and encoded as an occupancy grid [6]. The interior of obstacles as well as the boundary of the working area is referred to as the boundary condition. All other nodes are where the path is executed. The discrete implementation lends itself to describing discrete spatial representations as well as continuous representations with interpolation. A uniformly sampled grid is natural, but if implemented on a computer, a non-uniform grid may be more efficient. We have experimented with setting the goal to a low potential and setting the boundary of the working area as well as the cells occupied by obstacles to a high potential (see Figure 1).

[Plot: Potential Field Contour Display; X axis and Y axis, each from 0 to 70.]

Figure 1: Paths Generated at 64 by 64. The figure shows the equal-potential contours generated (solid lines) and the various paths computed (dashed lines) from varying starting positions using steepest gradient descent for the above configuration. The resolution of the grid was 64 by 64.

4.1.1 Trajectory: At any point P(x, y) on the potential function $\phi$, there is a vector in the direction in which $\phi$ undergoes the greatest rate of decrease, which is referred to as the steepest descent gradient. In addition to this direction, there is a whole family of directions that are of descending value. Let $\Delta \mathbf{s}$ be a vector that has a magnitude $\Delta s$ and points from P to P'. Since $\Delta \mathbf{s} = \mathbf{i}\,\Delta x + \mathbf{j}\,\Delta y$ and the gradient of $\phi$ is $\nabla\phi = \mathbf{i}\,\frac{\partial\phi}{\partial x} + \mathbf{j}\,\frac{\partial\phi}{\partial y}$, it follows that $\Delta\phi = (\Delta \mathbf{s})\cdot\nabla\phi + \ldots$. Let $\Delta \mathbf{s} = \hat{\mathbf{u}}\,\Delta s$, and $\Delta\phi = (\hat{\mathbf{u}}\cdot\nabla\phi)\,\Delta s + \ldots$, so that $\frac{\Delta\phi}{\Delta s} = \hat{\mathbf{u}}\cdot\nabla\phi + \ldots$. Taking the limit of this equation gives us $\frac{\partial\phi}{\partial s} = \hat{\mathbf{u}}\cdot\nabla\phi$. The quantity $\frac{\partial\phi}{\partial s}$ is called the directional derivative of $\phi$. The direction in which this derivative is most negative is the direction of the steepest descent gradient, $\Delta \mathbf{s}_m$, and is the trajectory we have used when the target approach orientation was not specified. However, there is a family of directions bounded by the equi-potential contour at P. Let the two directions $\Delta \mathbf{s}_1$ and $\Delta \mathbf{s}_2$ define the directions of the equi-potential contour emanating from P. The minimal hitting probability path is defined by the steepest descending gradient, but other paths can be systematically chosen between the equi-potential contours.

In addition, rather than confining plausible directional changes to one of the eight possible neighbours of a node, quadratic interpolation is used to determine an approximation to the continuous direction. Let $D_w$ be the direction with the greatest negative gradient of the eight directions possible when quantizing the directions into the neighbouring cells, i.e., 0 to 360 degrees at 45 degree intervals. Let $D_{w-1}$ be the counter-clockwise neighbour and $D_{w+1}$ be the clockwise neighbour. Let $G_w$, $G_{w-1}$ and $G_{w+1}$ be the associated gradient values. The quadratic function approximated has the form $D(G) = aG^2 + bG + c$. To find the maximum of D(G), the derivative is taken with respect to G and set equal to 0. This results in $D_{max}$ and $G_{max}$, where $D_{max}$ corresponds to the maximum interpolated gradient.

4.2 Irregular Sampling

It was found that harmonic function convergence time is quadratically proportional to the number of grid elements [4]. The time taken for each iteration is linearly proportional to the number of elements, while the number of iterations required for convergence is also linearly proportional to the number of grid elements. Thus reducing the number of grid elements will improve performance.

Pyramids and quad-trees are popular data structures in both graphics and image processing [7] and are best used when the dimensions of the matrix can be recursively and evenly subdivided until one grid point represents the entire grid cell. Approximately 33% more nodes are required to represent the image, but usually the algorithm will be a lot faster, especially if there are large open spaces. Quad-trees have also been previously used for path planning [8] with the A* algorithm.

For any irregular grid sampling (one possible configuration being a quad tree representation), the iteration kernel can be rewritten as shown in Equation 10, where the $U_n$'s are the m neighbouring points to $U_{i,j}$, each being a distance $D_n$ away.

$$\sum_{n=0}^{n=m} \frac{U_{i,j}}{D_n} = \sum_{n=0}^{n=m} \frac{U_n}{D_n} \qquad (10)$$
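Solving Equation 10 for $U_{i,j}$ gives an inverse-distance weighted average of the neighbouring values. A minimal sketch of that update is shown below; the function name and the list-of-pairs input format are assumptions for illustration.

```python
def irregular_update(neighbours):
    """Return U_ij from Equation 10: sum(U_n / D_n) / sum(1 / D_n).

    neighbours is a list of (U_n, D_n) pairs, with every distance D_n > 0.
    """
    if not neighbours:
        raise ValueError("at least one neighbour is required")
    if any(d <= 0 for _, d in neighbours):
        raise ValueError("distances must be positive")
    weighted_sum = sum(u / d for u, d in neighbours)
    weight_total = sum(1.0 / d for _, d in neighbours)
    return weighted_sum / weight_total

# With four neighbours at unit distance the update reduces to the
# plain five-point average of Equation 2.
regular = irregular_update([(1.0, 1.0), (0.0, 1.0), (1.0, 1.0), (0.0, 1.0)])

# A neighbour twice as far away contributes half the weight.
irregular = irregular_update([(1.0, 1.0), (0.0, 2.0)])
```

The result always lies between the smallest and largest neighbouring value, so the averaging property that rules out local minima is preserved on the irregular grid.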

Typically, the grid point corresponds to the center of the grid cell it represents. Trajectories are generated by linearly interpolating to arrive at a dense local neighbourhood (at the finest sampling level in the quad tree).

5 Results

5.1 Escape Probability and Energy

Figure 2 demonstrates that either escape probability or dissipated energy is more reliable than either the maximum or the average change of voltage values between iterations as an indicator of convergence. Note that the rate of logarithmic change in the top two plots in Figure 2 converges to zero, indicating convergence, while the bottom two plots in Figure 2 do not convey this. As indicated in the algorithm used for dynamic path planning, the iteration kernel competes for resources with other tasks such as mapping and robot localization. The reason why there is no fluctuation for the escape probability or energy as compared to the other two measures is that their computations rely on current values from sources, which are always larger than the current values near the goal. Current values near the goal eventually compete with the resolution of the computer storage units.

[Plots: log10(escape probability), log10(Energy), log10(maximum change) and log10(average change), each against iterations from 0 to 400.]

Figure 2: Escape Probability and Dissipated Energy as the circuit converges to its steady state value. The top left diagram shows the escape probability as the circuit shown in Figure 1 converges to a steady state value. The top right figure shows the dissipated energy. The bottom left figure shows how the maximum change between iteration values (of voltage) changes, while the bottom right figure shows the average change of voltage value between iterations. Convergence is evident in the derivatives of the top two plots long before it is shown in the bottom two plots.

Figure 4 shows how the escape probability (and energy) change as obstacles are added to the environment as shown in Figure 3. Note that escape probability originally increases when the first obstacle is added. This is because there is a high probability of the robot escaping in an environment with no obstacles. As obstacles are added, escape probability decreases while energy increases and then does not change much. When an area of the environment is blocked from accessing the goal, as shown in the last figure of the sequence in Figure 3, there is a slight increase in escape probability but minimal change in dissipated energy.

[Plots (a) to (h): Potential Field Contour Display; X axis and Y axis, each from 0 to 18.]

Figure 3: Dynamic progression of discovery. The collection of figures shows the potential function converged after obstacles are added. The progression should be followed from left to right and from top to bottom. See Figure 4 for corresponding escape and energy measures.

[Plots: log10(escape probability), log10(Energy), log10(maximum change) and log10(average change), each against iterations from 0 to 2000.]

Figure 4: Escape Probability and Dissipated Energy are shown in the top left (escape) and top right (energy) figures, while the maximum (bottom left) and average (bottom right) voltage changes are also shown. These plots correspond to the changes between iterations of the sequence shown in Figure 3. The scale is such that the identification of convergence of the escape and energy measures is not as evident as it is in Figure 2. It is still clear that convergence is indicated by the rate of change in the curves of the escape and energy measures.

5.2 Quad tree

Computation time for a regular grid is quadratically proportional to the number of grid points. Therefore, if the quad-tree representation reduces the number of grid points by a factor of n, the time required for convergence is reduced by a factor of $n^2$. Thirty-six times fewer operations were executed to obtain convergence for the quad map when convergence is measured as the maximum change in grid element values on successive iterations being less than .01% (see the graph in Figure 5b).

[Plots: (a) Potential Field Contour Display; X axis and Y axis, each from 0 to 18. (b) max. % error against flops from 1000 to 1000000, for the regular grid and the quad grid.]

Figure 5: Irregular Sampling: quad tree. (a) Equi-potential contours of the harmonic function of the quad map. The solid path was computed with a regular grid while the dotted path used a quad representation. (b) Computations for quad tree vs. regular grid: the graph illustrates the maximum difference between grid elements on successive iterations plotted against the accumulated number of flops (floating point operations) executed for the quad map. Each plotted point on the graph corresponds to successive increments of 10 iterations. All computations were floating point operations and the test was executed using Matlab. Actual performance speeds can be increased if some of the floating point operations are converted to integer operations.

6 Discussions

This paper has discussed two innovations for dynamic robot path planning using harmonic functions: (1) the use of escape probability (or energy dissipation) for indicating circuit convergence (and indirectly for determining blockages); and (2) the use of irregular sampling (i.e., quad tree) for reducing the number of elements processed and thus achieving faster convergence (especially for sparsely populated environments) at the expense of slightly less than ideal paths. We are interested in extending these approaches to dimensions greater than two and incorporating non-holonomic constraints [2]. Irregular sampling (e.g., quad tree) will not be sufficient for dimensional extension and we are currently exploring other sampling strategies or dimensional parallelization. Irregular sampling reduces the precision of the solution but the solution is still complete. This is in contrast to the probabilistic and random techniques [9].

7 Acknowledgments

The authors express thanks for funding from NSERC (Natural Sciences and Engineering Research Council) and MMO (Materials and Manufacturing of Ontario).

References

[1] A. Stentz, "The focussed D* algorithm for real-time replanning," in Proceedings of the International Joint Conference on Artificial Intelligence, (Montreal, PQ), Aug. 1995.
[2] C. I. Connolly and R. A. Grupen, "Nonholonomic path planning using harmonic functions," Tech. Rep. UM-CS-1994-050, University of Massachusetts, Amherst, MA, June 1994.
[3] P. G. Doyle and J. L. Snell, Random Walks and Electric Networks. The Mathematical Association of America, 1984.
[4] J. S. Zelek, SPOTT: A Real-time Distributed and Scalable Architecture for Autonomous Mobile Robot Control. PhD thesis, McGill University, Dept. of Electrical Engineering, 1996.
[5] J.-C. Latombe, Robot Motion Planning. Kluwer Academic Publishers, 1991.
[6] A. Elfes, "Sonar-based real-world mapping and navigation," IEEE Journal of Robotics and Automation, vol. 3, pp. 249–265, 1987.
[7] T. Pavlidis, Algorithms for Graphics and Image Processing. Computer Science Press, 1982.
[8] D. Z. Chen, R. J. Szczerba, and J. J. Uhran, "A framed-quadtree approach for determining Euclidean shortest paths in a 2-D environment," IEEE Transactions on Robotics and Automation, vol. 13, pp. 668–681, October 1997.
[9] L. Kavraki, P. Svestka, J. Latombe, and M. Overmars, "Probabilistic roadmaps for path planning in high-dimensional spaces," IEEE Transactions on Robotics and Automation, vol. 12, no. 4, pp. 566–580, 1996.
