IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 13, NO. ... that the platform is symmetrical, i.e., that the two neck axes intersect and that the ...
IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 13, NO. 3, JUNE 1997
437
Short Papers On the Kinematics of Robot Heads Paul M. Sharkey, David W. Murray, and Jason J. Heuring
Abstract— The following presents a standardized approach to the kinematics of a generalized stereo robot head, providing both forward and inverse kinematic solutions as well as a discussion on the head Jacobian. The paper is intended as a comprehensive tutorial and as a standard notation reference for researchers in the field of active vision. Index Terms—Inverse kinematics, Jacobians, kinematics, robot heads.
I. INTRODUCTION Over the past seven years, interest in the use of robot heads, so called from their resemblance to the primate vision systems, is evidenced by the development of a wide range of devices with different designs, capabilities, and performance. Details of some examples are given in [1]–[7]. What has been lacking, however, is a standardized approach to the notation used in describing the motions of such devices. This paper presents a straightforward logical approach for describing the motion of a generalized robot head from the user’s (i.e., vision researcher’s) point of view rather than the standard robotics notation (Denavit-Hartenburg, see [8]). The generalized robot head, described schematically in Fig. 1, has nine mechanical degrees of freedom (DOF). The neck comprises intersecting roll and pan axes (or roll and yaw in nautical/aviation terms). Both cameras are separated by means of a motorized baseline (a prismatic joint), after which each camera is driven in elevation (also called pitch or tilt), vergence, and cyclotorsion. It is assumed that the platform is symmetrical, i.e., that the two neck axes intersect and that the “eyes” are symmetrical with respect to the neck. The order in which each rotation/translation is achieved will have a considerable effect on the kinematic transformations of the head. The first axis of the generalized robot head is assumed to be the roll axis, followed by the pan axis. The camera axes are configured in the more common Helmholtz configuration (elevate then verge [9]) with cyclotorsion as the final rotation. Examples of the Helmholtz configuration (usually where the elevation axes are constrained to a common elevation) are wide and varied, one such design is Yorick [5]. An extension to the Fick configuration (verge then elevate [9]) is also provided; see for example the KTH head [4]. Kinematics for simple pan/tilt devices and three camera heads (such as Triclops [7]) are easily obtained. The mechanical DOF are coupled to the optical DOF by six offsets (three in translation and three in rotation) representing the offset between the camera mount and the CCD array of the camera. This final transform is fixed and, though not necessarily directly Manuscript received April 26, 1995; revised December 11, 1995. This paper was recommended for publication by Associate Editor R. V. Dubey and Editor A. Goldenberg upon evaluation of the reviewers’ comments. P. M. Sharkey is with the Department of Cybernetics, University of Reading, Whiteknights, Reading RG6 2AY, U.K. D. W. Murray and J. J. Heuring are with the Department of Engineering Science, University of Oxford, Oxford OX1 3PJ, U.K. Publisher Item Identifier S 1042-296X(97)01388-8.
Fig. 1. Schematic of the mechanical degrees of freedom for a generalized robot head.
measurable, may be minimized to within certain tolerances in the design of the head. The three optical DOF of each camera are provided by motorizing the iris, focus, and zoom for each camera. The first DOF controls the light level by opening and closing the aperture of the camera and has no effect on the kinematics of the head. It does, however, have direct relevance to the image processing and lens control (if not automatic). Both focus and zoom affect the kinematics by shifting the optical origin of the camera (on the CCD array) as the focus and zoom levels are changed. Furthermore the zoom DOF is coupled directly to the kinematics of the platform as the extent of an object of interest in the image will depend on the level of zoom as well as how close (or far) the platform can place the camera with respect to the object. Mechanical offsets in terms of unknown twist angles and link lengths will remain within the tolerances of the design but may still be incorporated into the kinematic development. However, with good head design these offsets should be minimized, allowing the optical calibration to identify them if required. These offsets are, therefore, not included in this paper. II. COORDINATE FRAMES The definition of coordinate frames will facilitate the development in subsequent sections of the kinematics of the head via homogenous transformation matrices between each coordinate frame. The standard method for defining frames in robotics is the Denavit-Hartenburg (DH) notation [8] whereby frames are selected in pre-defined ways using four parameters for each degree of freedom of the manipulator: link twist i ; link length ai ; joint distance di ; and joint rotation i : Using the four parameters a standard homogenous transformation matrix from link i 0 1 to link i is given as
01 Ti = Rot(xi01 ; i01 ) Trans(xi01 ; ai01 )
i
1 Trans(z ; d ) Rot(z ; ): i
i
i
i
(1)
For active vision researchers the standard D-H definitions, when applied to a robot head, present coordinate frames that do not coincide with standard image frame definitions. Additionally, D–H requires careful consideration in placing each frames to ensure that
1042–296X/97$10.00 1997 IEEE
438
IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 13, NO. 3, JUNE 1997
Fig. 2. Coordinate frames in the HOME position (subscripted H) for the generalized robot head. Dashed lines denote offsets, solid lines denote axes of rotation or mechanical linkages. For all frames in the above diagram, the z -axis points to the right, the y -axis up, the x -axis completing a right hand set.
the standard D–H notation may indeed be applied. Such placements (for example requiring x i and z i01 to intersect) are not appropriate for binocular systems, requiring further offset transforms to reach a desired kinematic frame. Also, for robot heads, where twist angles between links are generally 6k=2; the standard D–H notation is seen to further complicate the issue. Furthermore, while the D-H notation assigns the z -axes to the axes of rotation (or translation for prismatic joints), vision researchers assign the z -axis to point in the direction in which the camera is looking—the value of the z coordinate represents the distance to the object in the center of the frame. Assigning positive values along the y -axis to features appearing in the top half of the image, and using the standard right hand rule, positive values in the x-axis denote features to the left of center in the image. Positive rotations about the z -axis result in objects rotating in a clockwise direction about the center of the image. This orientation of the optical frame (x; y ; z ! left, up, distance) is notionally extended to preceding coordinate frames of the robot head such that, from a vision point of view, the z -axes of all other frames represents the direction in which that coordinate frame is “looking”. A major advantage with this approach is the reduction of twist angles in the homogenous transforms, and it is the approach we adopt here.
Fig. 3. Axes of rotation and translation. Direction of rotation should be viewed as being “in front of” the axis of rotation.
fRg
fP g
fCSL=R g
A. Home Positions As with any manipulator, a HOME position needs to be defined to establish a reference from which the kinematics may be developed. Mounting the head on a true horizontal surface fS g; a base frame fBg can then be attached to the base of the head which maintains the same orientation as fS g: HOME may then be taken to be where both cameras are level with each other and are looking at a point at infinity on the horizon. HOME for each DOF can then be defined with respect to fB g; irrespective of fS g: B. Definition of Coordinate Frames From a vision researcher’s point of view, therefore, the coordinate frames for the generalized robot head (roll, pan, left/right elevation, left/right vergence, left/right cyclotorsion), represented, respectively, by the rotations (R ; P ; EL ; ER ; V L ; V R ; CL ; and CR ); may be defined as follows (see Figs. 2 and 3): fW g World—a universal, fixed frame of reference. fBg Base—the frame at the base of the head. Usually fixed, fB g is related to fW g by a simple translation, where the y -axis of this frame, referred to as y B ; points “up”. The origin of fB g will be assigned in
fEL=R=C g
1A
conjunction with the roll and pan axes. For heads mounted on manipulators, the relationship between fBg and fW g will be via a dynamically changing homogenous transformation matrix W TB ; being the forward kinematics of the arm itself. For heads mounted on AGV’s or other mobile platforms, W TB may not be directly available and must be estimated. Roll—the axis of rotation is z R ; with z R = z B ; i.e., fRg coincides with fBg when R = 0 (HOME). For telepresence systems this axis can be driven directly from the roll of the operator’s head movements, while for active vision systems this axis is principally used to counter any roll of the base fB g (for example, AGV mounted robot heads) to provide a level view for the two cameras. Pan—as the roll and pan axes must intersect (symmetry) the axis of rotation, y P ; is chosen such that z P represents the “pan direction” and such that y P = y R : fP g coincides with fRg when P = 0: Thus, fRgHOME = fP gHOME = fBg: In other words, fRgHOME is defined where the rotation of the pan axis occurs in a plane parallel to y B = 0: Camera Separation (left/right)—This axis provides a linear mechanism for changing the baseline, defined as 2; between the two cameras (cf. inter-ocular separation). An obvious choice for the frame is such that fCSL=R g and the elevation frames coincide when the EL = ER = 0: Translations occur in the xP (= xEL = xER = xCSL = xCSR ) direction, by 6; with fCSL=R g located at the end of that translation. fCSL=R gHOME is related to fP g by 6nominal in xP ; and two other constant offsets: yCSo in y P and zCSo in z P : Independently Elevated (left/right/center)1—related to fCS g via the rotation of the elevation axes, by angle –E about xE ; such that the z E axes is coincident with the z V axes of the vergence DOF2. Note that the rotation is in the negative direction to ensure that positive values of E indicate “up.” When in the HOME position, z E is parallel to z P : For robot heads
centrally mounted (Cyclopean) camera may also be available, hence
fEC g; see [7] for example.
2 For clarity we use to refer to either E EL or ER depending on context and unless otherwise specified. We will apply a similar notation for the “eye” frames.
IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 13, NO. 3, JUNE 1997
with common elevation configuration the elevation axes of each camera are mechanically coupled (or singly actuated) so that both cameras pitch together, hence a common elevation frame can be defined fE g; usually at the center of the baseline. The above frames allow the definition of two supplementary frames. fGg Gaze—also referred to as the Cyclopean frame, the z -axis of this frame, z G ; provides a gaze direction along which both cameras might verge, such that the object of interest is equidistant from each camera, with the origin being equidistant between and colinear with the two vergence frame origins. Ideally this point should be equidistant between the two optical frame origins (see fOL=R g below) but this is a less flexible definition as it is necessarily dependent on the camera/lens parameters, zoom and focus, as well as build quality and the precision of camera mounting. The fGg frame is related to fE g via a translation (0; yV o ; zV o ) in (x E ; y E ; z E )—see fV g below—and will coincide with fE g only if the elevation and vergence axes intersect. fU g Unelevated Gaze—provides a direction that always points to the horizon, fU g coincides with fGg when in the HOME position (E = 0): As the head is symmetrical about the neck axes of rotation, the direction of z U is parallel with z P ; and fU g frame is related to fP g via constant offsets (0; yCSo + yV o ; zCSo + zV o ) in (xP ; y P ; z P ): fVL=Rg Vergence (left/right)—related to fE g via a translation (0; yV o ; zV o ) in (0; y E ; z E ); and followed by a rotation of the left/right vergence axis through V : The HOME direction, V = 0; is when z V is parallel to z G and z E : fCL=Rg Cyclotorsion (left/right)—rotates the optical frame about the z V = z C axes, by C ; to align both images. Cyclotorsion is only relevant for binocular systems, and then only to aid direct matching between the images, see [10] and [11]. Certain types of image processing for achieving stereo matching, image processing involving the raw image rather than features, benefit if the two images are aligned as well as possible before matching. Mechanical cyclorotation can be used to achieve alignment (KTH has this facility), but rotating sparse image features (not image pixels) in software is simpler and computationally inexpensive. HOME aligns fCL=R g with fVL=R g: fOL=Rg Optical (left/right)—defines the optical center and orientation of the camera—all image processing results are returned with respect to these frames. fOL=R g are related to fCL=R g (or fVL=R g for systems without cyclotorsion) by six offsets, three in position (u; v; w) and three in orientation (; ; ): all offsets are dependent on the precision of the camera and lens, and the zoom and focus settings. These offsets should be quite small (measurable to within a small tolerance) when compared to the baseline. HOME can be defined when focused at infinity and when set to a nominal field of view. fT g Target object—describes the position of the target object of interest with reference to the world coordinate frame fW g: For robot heads without cyclotorsion there is little ability to change the orientation of
439
=
= =0 f g
=
Fig. 4. General attitude of the head with R CL CR 0; with P E ER EL V R 30 C; with V L 30 C; and with 100 mm. The orientation of the target frame, T ; is notionally taken to be the same as the base frame, B :
= =
=
=
=
= f g
the object through movement of the cameras so we concentrate on the position of the target as defined by a center of “gravity” of the object as seen in the image of each camera, i.e., at a distance dT L or dT R along the z -axes of the cameras, z OL or z OR , respectively. For simplicity we further assume that both centers represents a single point in fW g: Fig. 4 shows a general attitude of a Common Elevation robot head, assumed to be pointing at the same target position in space with no roll or cyclotorsion. C. Optical Degrees of Freedom and Calibration In systems employing fixed focal length lenses and constant focus, the offsets between the optical axes and the mechanical DOF will be fixed and may be identified through calibration of the mount [12]. When active focus and zoom are employed these offsets change dynamically. The changes in these offsets, however small, tend to be highly nonlinear and can only be determined for a particular camera/lens combination by undertaking extensive calibration at different focus/zoom settings, and then employing look-up tables and interpolation techniques to incorporate their effect into the kinematics. For systems employing zoom lenses the target description may be augmented with a measure of the extent of the object, again assumed to be independent of orientation of the object. Then a combination of zoom level and camera position will result in a desired cropping of the image around the target. The performance of motorized zoom lenses, however, tends to be poor with respect to most current head designs. Thus most systems should be able to center any object of interest along the gaze direction long before zoom can crop the image. With such separation of dynamics there will be negligible interaction between the zoom and camera positioning from a kinematic point of view. III. FORWARD KINEMATICS Given the frame descriptions of the previous section homogenous transformations may be derived between two successive frames on the head. The platform kinematics developed below are for either camera parameterized by s; where s = +1 for the left camera, and s = 01 the right camera: B TR Rot(z R ; R ); R TP Rot(y P ; P ); PT Trans(xP ; s ) Trans(y P ; yCSo ) Trans(z P ; z CSo ); CS
440
IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 13, NO. 3, JUNE 1997
Rot(xCS ; 0E ); Trans(y E ; yV o ) Trans(z E ; zV o ) Rot(y V ; V ); Rot(z C ; C ); Rot(; ; ); Trans(z 0 ; dT ). The intermediate gaze frames are given by
CS T E E TV VT C C TO O TT
TU = Trans(y P ; yCSo + yV o ) Trans(z P ; z CSo + zV o ) TG = Trans(xE ; 0s) Trans(y E ; yV o ) Trans(z E ; zV o ): (3) Thus the homogenous transform from the world frame fW g to the target frame fT g is given by W TT = W TB B TR R TP P TCS CS TE E TV V TC C TO O TT (4) where, if we assume that C TO = I4 ; the 4 2 4 identity matrix, the kinematics calculation from frames fB g to fT g involves five P
E
rotations and three vector translations. The forward kinematics for the open kinematic chain from the base to the target position is found as shown in (5) at the bottom of the page. As expected, cyclotorsion does not feature in this equation, reflecting the fact that the target is assumed to be a point. Now, if both cameras are directed to look at the same object (point) in fW g; then both left and right open kinematic chains will be closed around the target and the first constraint on the kinematics is found as
D = B Dleft = B Dright = W TT left 2 [0 0 0 1]T = W TT right 2 [0 0 0 1]T (6) i.e., substituting EL ; V L ; dT L and s = 1 on the left-hand side, and ER ; V R ; dT R and s = 01 on the right. B
A. Interchanging Roll and Pan Axes As the pan and roll frames coincide in the HOME positions, and, with the assumption of head symmetry, they may be easily interchanged, with B TCS = B TR R TP P TCS ; as above, or B TCS = B TP P TR R TCS ; where B TP = Rot(y P ; P ); P TR = Rot(z R ; R ); and R TCS = Trans(xR ; s ) Trans(y R ; yCSo ) Trans(z R ; zCSo ):
B. Fick Configuration An extension of the kinematic model (2)–(6) to the Fick or independent gun-turret model is provided with the addition of two independent elevation frames fEFICK:L=R g between the vergence and cyclotorsion axes of each camera, where xFICK = xV ; the origins of fV g; fEFICK g and fC g coincide, and z FICK = z V ; when fEFICK g is in the HOME position. The new homogenous transforms are
TC = V TEF EF TC TEF = Rot(xEF ; 0EF )
V V
EF
TC = Rot(z C ; C )
(7)
the negative sign conforming to image coordinates notation. Substituting (7) into (4) yields a new 11 DOF kinematics transform which includes independent neck and eye elevations. The complexity may be reduced by considering zero neck elevations, E = 0; or equivalently
CS
TE
= I4 :
C. Constrained Kinematics As the use of the first DOF, roll, is not prevalent in current head designs, for this section we consider R to be zero at all times. Let us further assume that each camera is looking at the same target so that the elevation axes work as a common elevation, i.e. E = EL = ER : This configuration constrains the z V L and z V R to a single plane, whereby simple triangulation confirms the following constraints on V L ; V R ; dT L ; and dT R :
dT R CV R = dT L CV L ; 2 = dT R SV R 0 dT L SV L dT R = 2CV L = sin(V R 0 V L ) = 2CV L =SV R0V L dT L = 2CV R =SV R0V L :
(8)
Substituting into the general kinematics (5) yields (9) as shown at the bottom of the page. By constraining the head further such that the target is along the gaze direction, z G ; i.e. the cameras verge along the gaze direction only (V R = 0V L ; whereby SV R0V L = 2SV R CV R ; dT R = dT L ;
D = W TT 2 [0 0 0 1]T = [f1 (2; s ) f2 (2; s) f3 (2; s) 1]T f1 (2; s ) CR CP 0CR SP SE 0 SR CE CR SP CE 0 SR SE sCR CP 0 yCSo SR + zCSo CR SP f2 (2; s ) = SR CP 0SR SP SE + CR CE SR SP CE + CR SE sSR CP + yCSo CR + zCSo SR SP 0SP 0CP SE CP CE 0sSP + zCSo CP f3 (2; s ) B
0
1
=
dT CR CP SV dT SR CP SV
dT SV yV o dT CV + zV o
0 0 1 1 0 yV (CRSP SE + SR CE ) + (dT CV + zV )(CR SP CE 0 SR SE ) + sCR CP 0 yCS SR + zCS CR SP 0 yV (SR SP SE 0 CR CE ) + (dT CV + zV )(SR SP CE + CR SE ) + sSR CP + yCS CR + zCS SR SP : 0dT SP SV 0 yV CP SE + (dT CV + zV )CP CE 0 sSP + zCS CP 1 (5)
B
CP (dT R SV R 0 ) + SP (CE (dT R CV R + zV o ) 0 yV o SE + zCSo ) SE (dT R CV R + zV o ) + yV o CE + yCSo 0SP (dT R SV R 0 ) + CP (CE (dT R CV R + zV o ) 0 yV o SE + zCSo ) CP SV R+V L + SP CE 2 CV L CV R + zV o 0 yV o SE + zCSo SV R0V L SV R0V L C V L CV R SE 2 + zV o + yV o CE + yCSo : SV R0V L 0SP SSVV RR0+VV LL + CP CE 2 CSVV LRC0VV LR + zV o 0 yV o SE + zCSo
x D= y z
=
=
(9)
IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 13, NO. 3, JUNE 1997
= ) the forward kinematics further reduce to S (C ( cot +z ) 0 y S + z ) D= +z ) + y C + y S ( cot C (C ( cot +z ) 0 y S + z ) where cot = cotan( ):
dT R SV R
P
B
E
VR
E
P
Vo
VR
E
Vo
VR
VR
Vo
Vo
Vo
E
E
E
B. Vergence off Gaze (Version)
CSo
CSo
Vo
441
(10)
CSo
VR
C. Other Configurations The kinematic solutions for single eye devices can be trivially obtained by considering the transform, W TT = CS TE E TV V TC C TO O TT ; from (4), while solutions for three eye robot heads such as Triclops [7] may be found by constraining the vergence axes to the gaze direction using coordinate frames fGg or fU g (where appropriate).
By assuming that the baseline will only be adjusted from one fixed position to another, depending on application, we again reduce the problem to 4 DOF. A simple solution arises from noting that the pan axis will have to drive a significantly higher load inertia than either of the vergence axes. Thus the dynamic performance of the pan axis will generally be lower than that of vergence. By slaving the pan axis from V R + V L ; where V R + V L = 0 indicates that the target is along the gaze direction, the inverse kinematics reduces again to a 3 DOF problem. The current value of P is then used along with the position of the target to derive the required positions for (E ; V R ; V L ): Premultiplying both sides of (9) with Rot(zP ; 0P ) we obtain
( (
IV. INVERSE KINEMATICS The inverse kinematics describe a method for generating the required joint angles of the head given the target point B D: The inverse kinematics algorithm, therefore, maps the 3 DOF point into the DOF joint space of the head. For the general head, there are 7 relevant DOF (the 2 cyclotorsion axes have no effect). To develop an efficient inverse kinematics solution we must look at ways of reducing this redundancy. Assuming that the roll axis is used to level the head (for example being driven independently of the vision processing using gyros etc.), then both cameras will elevate, whether mechanically coupled or independently, to the same angle in order to view the same target, thereby reducing the DOF to 5 (pan, baseline, elevation, 2 vergence). For robot heads with independent pan and vergence axes, two configurations are available: 1) verging the cameras in front of the head or 2) behind the head (where the vergence angles are greater than 6=2): Most systems avoid the latter by either hardware limiting the range of the vergence axes to less than 6=2; or limiting the vergence range through the control system, and using the pan axis to bring the target with range of the vergence axes. Given a target at (x; y; z ) relative to the fB g frame, two simple inverse kinematic solutions can be found with the following constraints: A. Vergence on Gaze In this case, V R
= 0
VL
; whereby
( cot +z )C 0 y VR
Vo
E
Vo
P SE
( cot +z )S + y C 0AS + BC VR
Vo
E
Vo
E
cot
E
VR
= cos01(y = (A + y
Vo
Vo
=
SE
E
E
= tan01 (x=z) = x=S 0 z = z=C 0 z = A =y 0 y = B =y P
CSo
P
CSo
CSo
Vo
(A2 + B2 ))0cos01(B= (A2 + B2 )) )=C 0 z : (11) E
V
The last equation indicates that the baseline is directly coupled to the vergence angle V R : A dynamic solution to this coupling can be found by slaving the dynamically slower axes off the faster one, allowing the DOF to be effectively reduced to 3. Note, when evaluating the inverse of trigonometric functions great care must be exercised to ensure that the sign of the angle is preserved. This solution becomes ill-conditioned as z tends to zero, or when the resulting vergence or elevation angles approach 6=2 (see Section VI).
0 + )+ + )0 0 zS
dT R SV R SE dT R CV R zV o yV o CE CE dT R CV R zV o yV o SE
=
xCP
P
xSP
+ zC
P
dT R SV R dT R CV R zV o dT R CV R zV o
+ )+y + )0y xC 0 zS + = y0y xS + zC 0 z =y P
Vo
CE SE
Vo
P
=
CSo
yo CE
E V R
P
0z S o
E
CSo
CSo
y
( (
SE CE
+y +z
P
CSo
xo yo zo
Vo
= cos01 (y = (z2 + y2 )) 0 cos01 (y = (z2 + y2 )) = cot01 ([(y 0 y C )=S 0 z ]=x ) (12) Vo
o
o
Vo
o
o
E
E
Vo
o
o
o
and we can solve for dT R (i.e., V L ); in a similar way to before. V. JACOBIANS For robot manipulators, the head Jacobian matrix describes the relationship between the velocity of the end effector and the velocities of the joints, the size of the Jacobian being n 2 m; where n is the number of DOF of the manipulator and m is the number of DOF of the output space. The Jacobian varies with joint positions and can be found by differentiating the forward kinematics solution with respect to time. For a point in space there are just three DOF so the Jacobian for the generalized head may be written as a 3 2 7 matrix such that
[x_ y_ z_ ] = J (2; )[_ _ J (2; ) = f (2; )=@x x = ; ; ; T
i;j
R
i
j
R
P
j
P
EL
ER
_
_ _ i = 1; 2; 3
EL
ER
; V L ; V R ; :
VL
_
_]
V R
T
(13)
The importance of the Jacobian arises when, given a target position and velocity, one wishes to calculate the joint positions and velocities required to track the target. This requires the matrix J to be inverted. Problems arise when J is nonsquare or the determinant of J tends to zero—at “singularities.” When this occurs, it implies that infinite joint velocities will be required to track even small target velocities. Thus any trajectory generator that uses the inverse Jacobian should avoid known singularities of the robot. For the nonsquare Jacobian of the robot head, the singularities can be found when the rank of the Jacobian is less than min (n; m) = 3, reflecting the 3 DOF of the output space. The Jacobian matrix for the constrained robot head is a 3 2 5 matrix, mapping the velocities of [P ; E ; V L ; V R ; ]T to the
442
IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 13, NO. 3, JUNE 1997
[
velocities of x; y; z J11
]
T
SV R+V L SV R0V L
= 0S
P
0y
V o SE
J12
= 0S
J13
=C
J14
= 0C
where
+z
CSo
2 + 2S
+z
VL
+y
Vo
Vo
+z
=0
Vo
jJ
E;V L;V R
CE
VL
P
VR
E
With zV o
6=2 or
VL
VR
VL
Vo
SE
2
VL
VR
E
VL
VL VR
SV R0V L
J31
SV R+V L SV R0V L
= 0C
P
0y
Vo
J32 J33
= 0C
SE
SE
P
= 0S
P
0S
P
+z
CSo
2 CS
VR
CE
2 + 2C
+z
VL
+y
Vo
Vo
2 P CE C
S2V R SV2 R0V L
CV L
0
VR
= 0x
CV R CV L SV R0V L
+z
Vo
CE
VR
VL
+2
SV R+V L CP CE CV R CV L : SV R0V L
(14)
)
=0
3 3
A. Vergence on Gaze The determinant is given by
jJ
P;E;V
j = 2S2 1(1C 0 y E
VR
1 = C +z S VR
VR
Vo
Vo
SE
+z )
:
CSo
(15)
1=0
=
; i.e., at V R The most obvious singularity occurs at Thus represents when the cameras are Vo pointing directly at the origin of the gaze frame fGg: If the vergence ; this singularity occurs at and elevation axes intersect zV o 6= ; as expected. V R
tan01 (0=z ): =
2
1=0 ( = 0)
B. Vergence off Gaze (Version) In this case, we assume that the pan axis is driven, as before, from the “off-gaze” angle V R V L : The pan axis velocity will, therefore,
+
VR
Vo
=
edu/grasp/head/headpage/headpage.html
Finding the singularities from the 5 2 3 matrix above is an involved and lengthy process. However, the common factor in columns 3 and : 4, =SV2 R0V L ; immediately yields a trivial singularity at By looking at the constrained options of Section IV, and assuming that the baseline is fixed, the Jacobians reduce to 2 matrices and the singularities for each configuration can be deduced.
(
(16)
2
[1] C. M. Brown, “The Rochester robot,” Tech. Rep. 257, Dept. Comp. Sci., Univ. Rochester, NY, 1988. [2] U. Cahn von Seelen and B. Madden, http://www.cis.upenn.
0
VR
P
zV o :
REFERENCES
SP S2V L 0 2CP CE CV2 L J34 = S2
0S J35 =
VL
2
= 02 SS2 C 0 2 S C C J25 = E
CV L
0
between vergence and elevation. Physically, when close to this singularity movement in the elevation axis will cause cyclotorsion. V L 6= : Another singularity occurs on the locus V R As the cameras are pointing at a single point target, the physical interpretation of this is the situation where an object is very close to one camera while being approximately from the other—an unlikely situation. The regions in which the inverse kinematic solutions become illconditioned and the regions enclosing the singularities of the Jacobian are the same. However, it is noted that such regions are at the extremes of operation in terms of vision processing, in which the “ideal” operating region is along the unelevated gaze direction, z U ; and where the target lies within a minimum and maximum range limit that is determined by the baseline between the two cameras.
=2
J24
+V L ;
+
0y
Vo
VL
= 0; then the singularities occur when either = = 6=2; with z 6= 0 accounting for the offset
VL
CV R CV L
1C
VR
0
VR
VR
VL
VR
SE CV2 R 2 SV R0V L
E
2
VR
02
VR
j = S 3 4
1 =2 CS
S2V L SP CE CV2 L 2 SV R0V L
P
P
CV L
0
VR
=z
= C S + S+ 20S C J21 =0 J22 = C 2 CS 0C + z J23
VR
2 S2V R P CE CV R SV2 R0V L
P
J15
2 CS
CE
P
CV R CV L SV R0V L
SE
P
+C
have no impact on the trajectory generation and the Jacobian will be concerned with the mapping of velocities from the elevation and independent vergence axes to Cartesian space. The pan axis can be : The determinant in this case is assumed to be stationary, say P
[3] J. J. Clark and N. J. Ferrier, “Modal control of an attentive vision system,” in Proc. 2nd Int. Comp. Comput. Vision, Tampa, FL, 1988, pp. 514–523. [4] K. Pahlavan, “Active robot vision and primary ocular processes,” Ph.D. degree, Dept. Num. Anal. and Comp. Sci., Royal Inst. Technol., Stockholm, Sweden, 1993. [5] P. M. Sharkey, D. W. Murray, S. Vandevelde, I. D. Reid, and P. F. McLauchlan, “A modular head/eye platform for real-time reactive vision,” Mechatron., no. 4, vol. 3, pp. 517–535, 1993. [6] P. M. Sharkey and D. W. Murray, “Delays vs. performance of visually guided systems,” Inst. Elec. Eng. Proc., Pt. D, vol. 143, 1996. [7] A. J. Wavering, J. C. Fiala, K. J. Roberts, and R. Lumia, “TRICLOPS: A high-performance trinocular active vision system,” in Proc. IEEE Int. Conf. Robot. Automat., Atlanta, GA, 1993, pp. 410–417. [8] J. Denavit and R. S. Hartenberg, “A kinematic notation for lower-pair mechanisms based on matrices,” J. Appl. Mech., vol. 77, pp. 215–221, 1955. [9] R. H. S. Carpenter, Movements of the Eyes. London, U.K.: Pion, 1988, second ed. [10] J. L. Crowley, M. Mesrabi, and F. Chaumette, “Comparison of kinematic and visual servoing for fixation,” in Proc. IROS ’95, Pittsburgh, PA, Aug. 1995, pp. 335–341. [11] G. Westheimer, “Kinematics of the eye,” J. Opt. Soc. Amer., no. 10, vol. 47, pp. 967–974, 1957. [12] P. F. McLauchlan and D. W. Murray, “Active camera calibration for a head-eye platform,” IEEE Trans. Pattern Anal. Machine Intell., vol. 18, pp. 15–22, 1996.