Position correction of a mobile robot using predictive vision

P. Weckesser, F. Wallner and R. Dillmann
Institute for Real-Time Computer Systems & Robotics, Department for Computer Science, University of Karlsruhe, D-76128 Karlsruhe, Germany.

Abstract. The path of a mobile robot in indoor environments can be described as a sequence of low-level, so-called reflexive navigation tasks. One such task is collision avoidance: a behavior which enables the robot to follow a designated path in environments with large free-space areas while simultaneously avoiding obstacles. Maintaining a continuous speed is sometimes more important than always having precise position knowledge. This behavior is not adequate in every situation, though. Especially for difficult navigation tasks like passing a door or entering an elevator, the robot has to drive closer to obstacles than permitted by collision-avoidance safety requirements. Vision systems help to overcome the problem that range sensors do not find structures to correct the robot's position if the robot operates close to obstacles or if the environment is too cluttered. Using an active vision system, the robot is able to choose any visible object whose position is known and navigate relative to it. In contrast to approaches that use image sequences, the method described performs a correction using just one pair of stereo images. Because of that, the robot does not have to slow down significantly in order to perform a position correction.

Key Words. mobile service robot, active vision, structured light, camera calibration, scene reconstruction, predictive vision

1. Introduction

Several approaches using mono camera systems have been presented in the literature (Betke and Gurvits, 1994). The general problem with these systems is that observations of an object from different positions need to be available. This procedure is time-consuming and sensitive to errors caused by imprecise robot motions. In this paper an approach to real-time position correction based on odometry, ultrasonic sensing and active stereo vision (bin- and trinocular) is presented. Knowledge-based image processing allows landmarks in the stereo images to be detected and classified uniquely. With only one observation the robot's position and orientation relative to the observed landmark are found with high precision.

2. The mobile system PRIAMOS

The need for reliable sensors for position correction and object recognition has led to the development of a multiple sensor system for the mobile robot PRIAMOS¹. This was necessary in order to operate PRIAMOS safely and accurately in laboratory or industrial environments. The sensor equipment of PRIAMOS consists of odometric sensors, 24 ultrasonic sensors and the camera head KASTOR.

¹ PRI = permutation of the German name of our institute, AMOS = Autonomous MObile System

PRIAMOS (Dillmann et al., 1993) is equipped with four Mecanum wheels, which provide three degrees of freedom: PRIAMOS is able to move back and forth, left and right, and to turn on the spot. All these degrees of freedom can be combined. PRIAMOS carries two independent VME-bus systems that exchange information via a local Ethernet. The first VME-bus system contains the robot control unit (odometry) and the ultrasonic sensor system; the other one contains the image-processing and camera-head control hardware. In addition to these sensor systems, PRIAMOS is equipped with a passive camera that is used for multiple purposes. This third camera can be fully included in a trinocular stereo reconstruction process, providing an additional epipolar line to simplify the stereo matching algorithm. Furthermore, laser lines are projected onto the ground and observed by the third camera; in this way a very fast and reliable obstacle detection is achieved. The image of this camera is also generally transmitted over a video link to support teleoperation of the robot.

2.1. System architecture and communication structure

The connection to the outside world (Sun workstations) is realized by a data-radio system with 9600 baud. The limited data rate requires sensor data processing and low-level control to be performed on board. Only parametric information about the environment and commands to the robot are transmitted via radio. As three different types of communication (PVM (Oak, 1993), sockets and the serial radio link) are used for PRIAMOS, it was necessary to develop a general communication protocol called RobMessages (Appenzeller, 1994). With this protocol it is possible to exchange information between any system unit and any other without having to take care of the physical communication link. The RobMessage system also supports the teleoperation of PRIAMOS and the camera head KASTOR. An operator can control PRIAMOS as well as KASTOR with a six-degrees-of-freedom space-mouse (Lin et al., 1995).

3. The active vision system KASTOR

KASTOR (Weckesser and Wallner, 1994) consists of two cameras mounted on a platform with motor-controlled tilt and turn. Zoom and focus as well as the vergence of each camera unit are motorized. Later a third, passive camera was added to the system in order to improve the reliability of the vision system (Dold and Mass, 1994). The cameras are connected to a real-time vision system which has been modified for stereo image-processing purposes. The image-processing system is able to produce a symbolic contour description from edges extracted from the stereo images in real time. With a calibrated stereo camera system it is possible to compute a 3D reconstruction of the scene.

3.1. Camera calibration with sub-pixel accuracy

Precise image interpretation is not possible until the perspective transformations describing the relation between a scene point and its image points are known. The classical way is to extract the external and internal parameters of the cameras. In general it is very difficult to find these parameters, especially when zoom lenses are used.

3.1.1 The camera model

In the following, the camera model that is used for the calibration is introduced. The camera is characterized by its image plane (ccd chip) and the position of the optical center C.

Figure 2 Camera model (optical center C(X0, Y0, Z0), principal point H, image point I(u, v), scene point P(X, Y, Z))

Altogether there are 11 camera parameters in this camera model (Tsai, 1986):

- external camera parameters:
  - position of the optical center C(x0, y0, z0)
  - orientation of the ccd chip (three rotation angles)
- internal camera parameters:
  - center of the image plane H(uz, vz)
  - focal length c and pixel size (du, dv)
  - angle between the image coordinate axes

Figure 1 Front view of KASTOR

Instead of extracting the camera parameters directly from the camera geometry, it is possible, with the knowledge of at least 6 corresponding scene (S) and image (I) points, to compute the dlt-matrices (M) (direct linear transformation) for the left and right camera (Abdel-Aziz and Karara, 1971). With the dlt-matrix a homogeneous formulation (equation 1) of the perspective transformation is possible. The 12 parameters m_{i,j} of the dlt-matrix contain all 11 camera parameters (Foehr, 1990). The perspective transformation can be formulated as follows:

$$ I = M \cdot S \qquad (1) $$

with the homogeneous image coordinates

$$ I = \begin{pmatrix} wu \\ wv \\ w \end{pmatrix}, \qquad (2) $$

the dlt-matrix

$$ M = \begin{pmatrix} m_{1,1} & m_{1,2} & m_{1,3} & m_{1,4} \\ m_{2,1} & m_{2,2} & m_{2,3} & m_{2,4} \\ m_{3,1} & m_{3,2} & m_{3,3} & m_{3,4} \end{pmatrix} \qquad (3) $$

and the homogeneous scene coordinates

$$ S = \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}. \qquad (4) $$

From equation 1 follows:

$$ u = \frac{m_{1,1}X + m_{1,2}Y + m_{1,3}Z + m_{1,4}}{m_{3,1}X + m_{3,2}Y + m_{3,3}Z + m_{3,4}} \qquad (5) $$

$$ v = \frac{m_{2,1}X + m_{2,2}Y + m_{2,3}Z + m_{2,4}}{m_{3,1}X + m_{3,2}Y + m_{3,3}Z + m_{3,4}} \qquad (6) $$
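As an illustration of how equations 5 and 6 are used in practice, the following sketch (not part of the original paper; it assumes numpy and a dlt-matrix M as defined in equation 3) projects a scene point into the image plane:

```python
import numpy as np

def project(M, X):
    """Project a scene point X = (X, Y, Z) into the image plane with a
    3x4 dlt-matrix M (equations 1, 5 and 6)."""
    S = np.append(np.asarray(X, dtype=float), 1.0)  # homogeneous scene point, eq. 4
    wu, wv, w = M @ S                               # I = M * S, eq. 1
    return np.array([wu / w, wv / w])               # (u, v), eqs. 5 and 6
```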

The practical problem with this approach is the exact determination of the positions of the reference points in the images. In (Schmid, 1992) a cube is used as reference object. The corners of the cube can be located with a corner operator or by intersecting the edges of the cube. Experimental results have shown that the accuracy that can be reached by this method is not very high (about 2 pixels). The accuracy of corner detection determines the quality of the calibration.

3.1.2 Detection of reference points

In order to improve this poor accuracy in the detection of reference points, photogrammetric methods are applied for the calibration (Beyer, 1991). An automatic calibration technique that detects the reference points in the images with sub-pixel accuracy was developed (Hetzel, 1994). In order to reach an accuracy of 0.1 pixels in the point detection, a new reference object is used. Figure 3 shows a CAD model and figure 4 a photo of this new reference object: a black box with 16 circles at different distances from the back side.

Figure 3 CAD-model of the reference object for calibration (front and side view with dimensions)

It is of great advantage for the calibration if a large number of reference points is used, equally distributed over the images. The number of 16 circles is a compromise between a good distribution of points and the possibility of automatically finding corresponding image and scene points.

Figure 4 Reference object for calibration

A retro-reflective material is used for the circles. Retro-reflective materials reflect light only back in the direction it comes from and therefore appear very bright in the images. It is thus possible to obtain high-quality images of the calibration object with just a small light source directly behind the cameras. The middle points of the circles are used as reference points for the calibration. A coordinate measuring system was used to determine the positions of these reference points in the scene coordinate system with a precision of 1 µm. In the images the circles are generally seen as ellipses. The center of gravity of an ellipse corresponds to the middle point of the circle. In order to compute the centers of gravity of the ellipses a weighted centroiding method is used: the location of each pixel is weighted with its grey-value. With this method the locations of the middle points of the circles can be determined with an accuracy of 0.1 pixels. With this technique for the selection of reference points the accuracy of calibration was improved enormously.
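A minimal sketch of such a grey-value weighted centroid, assuming numpy and a rectangular image patch containing exactly one bright circle (function and parameter names are illustrative, not from the paper):

```python
import numpy as np

def weighted_centroid(patch, u0, v0):
    """Sub-pixel center of a bright circle (seen as an ellipse) in a
    grey-value image patch: each pixel location is weighted by its
    grey-value. (u0, v0) is the image position of the patch's
    upper-left pixel."""
    patch = np.asarray(patch, dtype=float)
    v_idx, u_idx = np.indices(patch.shape)     # row (v) and column (u) coordinates
    total = patch.sum()
    u_c = (u_idx * patch).sum() / total + u0   # grey-value weighted mean column
    v_c = (v_idx * patch).sum() / total + v0   # grey-value weighted mean row
    return u_c, v_c
```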

3.1.3 Computation of the dlt-matrices

For the computation of the dlt-matrices at least 6 corresponding scene and image points are necessary. If 16 points are used, this results in an overconstrained system and a least-squares solution has to be applied. The dlt-matrix is computed such that the origin of the coordinate system is the intersection of the turn and tilt axes. In order to do this, the locations of the optical centers of both cameras have to be extracted from the dlt-matrices. From the coordinates of the optical center the translation vector from the coordinate system of the reference object to the system of the camera head can be computed. With this algorithm it is possible to compute the dlt-matrices for a specific setup of the camera head. The new idea now is to compute dlt-matrices for a number of setups in advance and to use them later during autonomous operation of the system. This is possible due to the mechanical accuracy of the camera head and its high-precision stepper motors.
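The paper does not spell out the solver; a common way to obtain such a least-squares dlt estimate is the homogeneous system below, solved with an SVD (a sketch under that assumption, in numpy; the scale of M is arbitrary):

```python
import numpy as np

def estimate_dlt(scene_pts, image_pts):
    """Least-squares estimate of the 3x4 dlt-matrix from N >= 6
    corresponding scene points (X, Y, Z) and image points (u, v).
    The homogeneous system A m = 0 is solved via SVD; m is the right
    singular vector belonging to the smallest singular value."""
    A = []
    for (X, Y, Z), (u, v) in zip(scene_pts, image_pts):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 4)  # dlt-matrix, defined up to scale
```

With 16 reference points this gives 32 equations for the 12 parameters, which is the overconstrained case mentioned above.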

3.1.4 Automatic calibration for different camera setups

The whole calibration procedure is fully automated (Hetzel, 1994). The only thing the user has to do is to specify the interval of zoom and focus settings the calibration is supposed to be performed for, and to focus both cameras on the reference object at the beginning with the maximum focal length (upper limit of the interval). Then the calibration is performed for this camera setup. As described before, the corresponding reference points in the images and in the scene are found automatically. After this, a new camera setup with a reduced focal length is adjusted and the calibration is performed again. This procedure is repeated until the lower limit of the interval is reached. The same procedure is repeated for the focus adjustment in the given interval. In this way it is possible to calibrate the camera system with very high accuracy without any user interaction.

3.1.5 Experimental results for the calibration of an active camera

In order to test the quality of the calibration the following experiment was performed: the reference object is placed in front of the camera head and the system is calibrated with the procedure described above. Once the camera head is calibrated, the reference object is used as a test object for reconstruction. The coordinates of the reference points in the scene are multiplied by the dlt-matrix, and the resulting image point is compared to the image point that is currently detected in the image. The distance between the two image points is a way of evaluating the accuracy of point detection and the quality of the calibration. In equation 7 the average error in point detection is calculated. As the points are equally distributed over the images, this is the best way to check the quality of the calibration. The smaller the value F gets, the better the calibration:

$$ F = \frac{1}{N} \sum_{i=1}^{N} \| M S_i - I_i \| \qquad (7) $$
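Reusing the hypothetical project() helper from the sketch in section 3.1.1, equation 7 amounts to a few lines (again an illustration, not the authors' code):

```python
import numpy as np

def calibration_error(M, scene_pts, image_pts):
    """Average reprojection error F of equation 7: scene points are
    projected with the dlt-matrix M and compared with the detected
    image points."""
    residuals = [np.linalg.norm(project(M, S) - np.asarray(I, dtype=float))
                 for S, I in zip(scene_pts, image_pts)]
    return sum(residuals) / len(residuals)
```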

In any case sub-pixel accuracy is achieved for the detection of the reference points (Weckesser and Dillmann, 1994). As a result one can say that it is possible to change the optical parameters and come back to a pre-calibrated position, due to the high precision of the mechanical construction of the system. Wilson (1993) has done further numerical investigations on the calibration of zoom lenses. Because of tolerances in the lenses, the best results are obtained if the positions are always adjusted from the same side. The accuracy of the calibration remains within the sub-pixel range.

4. High level image-processing

As the image-processing system is able to perform real-time edge detection, high-level image processing makes extensive use of the extracted edges. The edge extraction procedure is a standard Sobel-filtering algorithm suffering from the known problems of randomly broken edges and of edges not being extracted in edge intersection areas.

4.1. Fusion of broken edge-segments

The first step in high-level image processing is therefore a post-processing of the extracted line segments in order to solve the following problems:

- randomly broken edges because of noisy images
- problems of the extraction algorithm at edge intersections

The problem of broken edges is solved by iterative grouping of collinear edge segments. First, edges are tested for collinearity and then grouped into a new edge. To do this, reasonable thresholds for the distance between the edges and for the difference in orientation have to be employed. Second, the grouped edges are tested for common edge segments; if they share one, they can be fused as well.

This algorithm is continued until no more broken edges can be fused (see the sketch below).

Figure 5 Fusion of collinear edges

Figure 8 Edge skeleton of an object
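A possible reading of this fusion loop is sketched below (numpy; the collinearity thresholds max_angle and max_dist are illustrative assumptions, the paper only speaks of "reasonable thresholds"):

```python
import numpy as np

def _angle(e):
    """Orientation of a segment e = ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = e
    return np.arctan2(y2 - y1, x2 - x1)

def _line_dist(e, p):
    """Perpendicular distance of point p from the infinite line through e."""
    a = np.asarray(e[0], dtype=float)
    d = np.asarray(e[1], dtype=float) - a
    d /= np.linalg.norm(d)
    v = np.asarray(p, dtype=float) - a
    return abs(v[0] * d[1] - v[1] * d[0])

def fuse_collinear(edges, max_angle=0.05, max_dist=2.0):
    """Iteratively fuse collinear segments: two segments with similar
    orientation and small mutual line distance are replaced by the
    segment spanning their two extreme endpoints; this is repeated
    until no fusion is possible any more."""
    edges = [tuple(map(tuple, e)) for e in edges]
    fused = True
    while fused:
        fused = False
        for i in range(len(edges)):
            for j in range(i + 1, len(edges)):
                e1, e2 = edges[i], edges[j]
                da = abs(_angle(e1) - _angle(e2)) % np.pi
                if min(da, np.pi - da) > max_angle:
                    continue  # orientations differ too much
                if max(_line_dist(e1, p) for p in e2) > max_dist:
                    continue  # not on (nearly) the same line
                a = np.asarray(e1[0], dtype=float)
                d = np.asarray(e1[1], dtype=float) - a
                d /= np.linalg.norm(d)
                pts = list(e1) + list(e2)
                t = [float(np.dot(np.asarray(p, dtype=float) - a, d)) for p in pts]
                edges[i] = (pts[int(np.argmin(t))], pts[int(np.argmax(t))])
                del edges[j]
                fused = True
                break
            if fused:
                break
    return edges
```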

4.2. Grouping of straight lines to contours

After the repair of the linear edge segments, these straight lines are grouped into contours. First, neighboring edges are analyzed for geometrical relations (parallel, perpendicular, intersecting, ...). According to these geometrical relations the edges can be grouped into more complex contours. These contours can be:

- parallelograms: two neighboring pairs of parallel edges of the same length are grouped.
- intersections: two intersecting edges are grouped. In a natural environment, edges that run up to each other generally intersect in a corner. With this relation the weakness of the edge operator in corner areas can be compensated (see the relation test sketched below).
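A compact relation test in this spirit might look as follows (reusing the hypothetical _angle() helper from the previous sketch; the angular tolerance is an assumption):

```python
import numpy as np

def relation(e1, e2, ang_tol=0.1):
    """Classify the geometrical relation between two segments as
    'parallel', 'perpendicular' or merely 'intersecting'."""
    da = abs(_angle(e1) - _angle(e2)) % np.pi
    da = min(da, np.pi - da)             # undirected angle difference
    if da < ang_tol:
        return 'parallel'
    if abs(da - np.pi / 2.0) < ang_tol:
        return 'perpendicular'
    return 'intersecting'
```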

Figure 6 Parallel and perpendicular edges

Figure 7 Parallelograms

Simple objects like cubes can generally be described by contour lines made up of the structures described above.

5. Reconstruction of an indoor scene

The final goal of image-processing is the reconstruction of the scene. In order to reconstruct the environment, the perceived images first have to be matched. With KASTOR being able to perform real-time edge detection, an edge-based matching algorithm is applied. The high quality of calibration enables a highly precise computation of corresponding endpoints of edges with the use of the epipolar constraint. In (Dold and Mass, 1994) it was shown that corresponding image points can be computed unambiguously if three or more calibrated cameras are used. With a binocular stereo system that is not provided with any knowledge of the environment, the matching is very difficult. For every point an epipolar line can be computed in the other image, on which the corresponding point must lie. This very often results in ambiguous matches. The problem can be solved with a third camera: with its help a second epipolar line can be computed for every point, and the corresponding point is the intersection of the two epipolar lines. By applying this technique a very high reliability of the stereo matching algorithm was achieved.
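The intersection of the two epipolar lines can be written compactly in homogeneous coordinates. The sketch below assumes the epipolar geometry is given as fundamental matrices F31 and F32 (the paper instead derives the lines from the calibrated dlt-matrices, so this is a simplification):

```python
import numpy as np

def third_view_match(F31, F32, p1, p2):
    """Predict the corresponding point in the third image as the
    intersection of the epipolar lines induced by p1 (image 1) and
    p2 (image 2). F31 and F32 map image points to epipolar lines in
    image 3; points are (u, v) pixel coordinates."""
    l1 = F31 @ np.append(np.asarray(p1, dtype=float), 1.0)  # line of p1 in image 3
    l2 = F32 @ np.append(np.asarray(p2, dtype=float), 1.0)  # line of p2 in image 3
    x = np.cross(l1, l2)          # homogeneous intersection of the two lines
    return x[:2] / x[2]           # (u, v) in image 3
```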

Figure 9 Indoor scene

6. Landmark-Recognition

Figure 9 shows an indoor scene that will be reconstructed in an example. The task for the robot will be to drive through the door, which is known in the robot's internal map. It is assumed that the maximum position error of the robot is 20 cm. In order to drive through the door, however, an error of only 3 cm is allowed. So the robot will use the door itself as a landmark and correct its position with the help of the stereo vision system.

6.1. Edge-model representation

In order to recognize landmarks, an adequate model of the landmark has to exist in the robot's internal map. Again a basically edge-based description of the landmarks is used: the landmarks are modeled as edge-skeletons. Additionally, the edges can be attributed. Such attributes can be the visibility of an edge, a pointer to neighboring edges, the grey-value of neighboring planes, or anything else that supports the recognition process.
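Such an attributed edge-skeleton could be represented as follows (a minimal sketch; all field names are illustrative, the paper only lists possible attributes):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ModelEdge:
    start: tuple                        # 3D start point (X, Y, Z) in the map
    end: tuple                          # 3D end point (X, Y, Z)
    visible: bool = True                # visibility attribute
    neighbors: list = field(default_factory=list)   # pointers to adjacent edges
    grey_value: Optional[float] = None  # grey-value of a neighboring plane

@dataclass
class Landmark:
    name: str                           # e.g. "door"
    edges: list = field(default_factory=list)       # the edge-skeleton
```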

Figure 10 Landmark model

6.2 Predictive vision

In order to navigate a robot in a partially known environment, observations of natural landmarks are used to update and correct the robot's position from time to time. Especially when high precision is necessary for the performance of a navigation task, the robot needs to know its position very accurately. In the following, such a task is discussed. The odometry sensors of the robot provide a position estimate whose error depends on the distance driven. For safe navigation a position correction is necessary if this error exceeds an interval of 20 cm, or for special navigation tasks that require high precision. The robot's width is 78 cm and it is commanded to drive through the door (figure 9) with a width of 90 cm. This task cannot be executed with a position error of about 20 cm and an orientation error of about 5 degrees. Because of that, the door that is known in the internal map is used as a natural landmark. The robot's estimated position relative to the door allows the position of the door in the two camera images to be predicted with equations 5 and 6. Then a snapshot is taken with both cameras and the prediction of the door is matched with the extracted edges in the images. In this way it is possible to recognize the object 'door' and localize it in both images. As soon as the door is localized in both images, the two images can be matched and the position and orientation of the door relative to the robot can be computed. From this results a position correction that enables the robot to navigate through a narrow door (a matching sketch follows below).

Figure 11 Position estimation
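How such a prediction-driven match might look in code is sketched below (reusing the hypothetical project() and ModelEdge/Landmark from the earlier sketches; the pixel threshold max_dist is an illustrative assumption reflecting the bounded position error):

```python
import numpy as np

def predict_and_match(M, landmark, image_edges, max_dist=15.0):
    """For one camera with dlt-matrix M: project each model edge of the
    landmark into the image (equations 5 and 6) and match it to the
    closest extracted edge, endpoint order being irrelevant."""
    matches = {}
    for k, edge in enumerate(landmark.edges):
        pu, pv = project(M, edge.start), project(M, edge.end)
        best, best_d = None, max_dist
        for a, b in image_edges:                    # extracted edge endpoints
            a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
            d = min(np.linalg.norm(pu - a) + np.linalg.norm(pv - b),
                    np.linalg.norm(pu - b) + np.linalg.norm(pv - a)) / 2.0
            if d < best_d:
                best, best_d = (tuple(a), tuple(b)), d
        if best is not None:
            matches[k] = best                       # model edge k -> image edge
    return matches
```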

6.2.1 Vision-based passing of narrow doors

In the following, the task of passing a door is illustrated with an example. The robot has a position estimate that enables it to make a planned perception of the area where the door is expected (figure 11). The prediction of the door is matched to the perception, and so it is possible to identify certain structures of the door. In this example the left and the right side of the door are observed separately to improve the accuracy of the reconstruction. In figures 12 and 13 the grey-value images and the extracted edges of the left and right side of the door are shown.

Figure 12 Left side of door

Figure 13 Right side of door

Between the perception of the two image pairs, the head was turned by 10 degrees. With the help of the prediction it is possible to detect the relevant structures (vertical edges) of the door. Then the position of the door is reconstructed. In figure 14 the reconstruction of the door is shown as a ground plan.

Figure 15 Computation of position and orientation-correction

Figure 14 Scene-reconstruction

The distance of the robot to the door was measured to be 2.4 m. The accuracy of the reconstruction in this example is $\Delta x = 2$ cm and $\Delta y = 1$ cm. The position and orientation correction $(\Delta\vec{X}, \Delta\Phi)$ is computed from the displacement between the position of the landmark in the world model and the computed scene reconstruction (figure 15):

$$ \cos(\Delta\Phi) = \frac{\vec{X}_d^{mod} \cdot \vec{X}_d^{rec}}{|\vec{X}_d^{mod}| \cdot |\vec{X}_d^{rec}|} \qquad (8) $$

$$ \Delta\vec{X} = R(\Delta\Phi) \cdot \vec{X}^{rec} - \vec{X}^{mod} \qquad (9) $$

Here $\vec{X}_d^{mod}$ is the position of the landmark in the world model, $\vec{X}_d^{rec}$ the reconstructed position of the landmark, and $R(\Delta\Phi)$ a rotation matrix. In the experiment performed, the robot ended up with a position uncertainty of less than 2 cm, so it was possible to plan a path through the door (figure 16).
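Equations 8 and 9 translate directly into code for the 2D ground plan (a sketch in numpy; equation 8 only yields the magnitude of the orientation correction, so the sign is recovered here from the 2D cross product, which is an assumption about the convention):

```python
import numpy as np

def pose_correction(Xd_mod, Xd_rec):
    """Orientation correction dphi (equation 8) and position
    correction dX (equation 9) from the landmark vector in the world
    model (Xd_mod) and its reconstruction (Xd_rec), both 2D."""
    Xd_mod = np.asarray(Xd_mod, dtype=float)
    Xd_rec = np.asarray(Xd_rec, dtype=float)
    c = Xd_mod @ Xd_rec / (np.linalg.norm(Xd_mod) * np.linalg.norm(Xd_rec))
    dphi = np.arccos(np.clip(c, -1.0, 1.0))         # equation 8
    if Xd_rec[0] * Xd_mod[1] - Xd_rec[1] * Xd_mod[0] < 0.0:
        dphi = -dphi                                # sign convention (assumed)
    R = np.array([[np.cos(dphi), -np.sin(dphi)],
                  [np.sin(dphi),  np.cos(dphi)]])
    dX = R @ Xd_rec - Xd_mod                        # equation 9
    return dphi, dX
```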

Figure 16 Position and orientation-correction

7. Conclusion and future work

In this paper the use of an active vision system for position correction was presented, together with a highly precise and fast calibration technique for active cameras. Based on this, strategies for high-level image-processing techniques like landmark recognition were described, and it was illustrated in an example how knowledge-based image processing is used for landmark recognition. The same techniques that are used for position correction will also be applied to an automatic mapping of the environment. In a first experiment the robot started off with a very rough map of the environment: it was just given the ground plan of a hallway and asked to drive through this hallway using only ultrasonic sensing. On the way through the corridor its task was to recognize all doors (closed or open) automatically and to update the internal map with the found positions of the doors. In a next step it is planned to start without any environmental knowledge and to map the whole accessible floor of the building automatically. Future work will focus on a more advanced fusion of the different sensor information. It is planned to process this information making use of extended Kalman filtering, especially for range and visual sensor data. Furthermore, there will be a focus on the application of a mobile service robot that can supply an operator with all the processed sensor measurements in order to achieve a very reliable teleoperation system with a high degree of autonomy.

8. Acknowledgment

The authors would like to thank G. Hetzel for his contributions to the development of the calibration technique, S. Hampel for his engagement in landmark recognition and G. Appenzeller for his contributions to the robot's communication structure. This work has been performed at the Institute for Real-Time Computer Control Systems & Robotics, Prof. Dr.-Ing. U. Rembold and Prof. Dr.-Ing. R. Dillmann, Department of Computer Science, University of Karlsruhe, 76128 Karlsruhe, Germany.

REFERENCES

Abdel-Aziz, Y.I. and H.M. Karara (1971). Direct linear transformation into object space coordinates in close-range photogrammetry. In: Symposium on Close-Range Photogrammetry. University of Illinois at Urbana-Champaign.

Appenzeller, G. (1994). PRIAMOS: Dokumentation des Message-Systems RobMsg.

Ayache, N. (1991). Stereo Vision and Multisensor Perception. MIT Press.

Betke, M. and L. Gurvits (1994). Mobile robot localization using landmarks. In: Intelligent Robots and Systems. pp. 135-142.

Beyer, H. (1991). An introduction to photogrammetric camera calibration. Technical report. Institute of Geodesy and Photogrammetry, ETH Zurich, Switzerland.

Calvary, J., J. Crowley and B. Zopis (1992). Perceptual grouping for scene interpretation in an active vision system. LIFIA (IMAG).

Cappa, P. (1994). Outfitting your hospital for the new wave of robots. Journal of Healthcare Materiel Management.

Crowley, J. and C. Schmid (1993). Maintaining stereo calibration by tracking image points. In: CVPR. New York.

Dillmann, R., J. Kreuziger and F. Wallner (1993). The control architecture of the mobile system PRIAMOS. In: Proc. of the 1st IFAC International Workshop on Intelligent Autonomous Vehicles. Southampton.

Dold, J. and H.-G. Mass (1994). An application of epipolar line intersection in a hybrid close range photogrammetric system. In: Close Range Techniques and Machine Vision (J.F. Fryer, Ed.). ISPRS Commission V. pp. 65-70.

Elbs, M. (1994). Optische Hinderniserkennung mit strukturiertem Licht. Master's thesis. Institute for Real-Time Control Systems and Robotics, University of Karlsruhe.

Foehr, R. (1990). Photogrammetrische Erfassung räumlicher Informationen aus Videobildern. Vol. 7 of Fortschritte der Robotik. W. Ameling and M. Weck.

Gengenbach, V. (1994). Einsatz von Rückkopplungen in der Bildverarbeitung bei einem Hand-Auge-System zur automatischen Demontage. PhD thesis. University of Karlsruhe.

Hampel, St. (1995). Positionskorrektur eines mobilen Roboters mit einem aktiven Stereo-Sichtsystem durch Detektion natürlicher Landmarken. Master's thesis. Institute for Real-Time Control Systems and Robotics, University of Karlsruhe.

Hetzel, G. (1994). Kalibrierung eines Stereo-Kamerasystems. Master's thesis. Institute for Real-Time Control Systems and Robotics, University of Karlsruhe.

Knieriemen, T. (1991). Sensordateninterpretation und Weltmodellierung zur Navigation in unbekannter Umgebung. Autonome mobile Roboter. BI-Wissenschaftsverlag.

Laengle, T. (1992). 3-D Maschinen-Sehen. Master's thesis. Institute for Real-Time Control Systems and Robotics, University of Karlsruhe.

Lin, I.S., R. Dillmann and F. Wallner (1995). An advanced telerobotic control system for mobile robots. In: IAS (U. Rembold, Ed.). To appear.

Melen, T. (1993). Extracting physical camera parameters from the 3x4 direct linear transformation matrix. In: Second Conference on Optical 3-D Measurement Techniques (H. Grun and A. Kahmen, Eds.). ETH Zurich. pp. 355-365.

Oak (1993). PVM 3.0. Technical report ORNL/TM-12187.

Rehfeld, N. (1991). Auswertung von Stereobildfolgen mit Kantenmerkmalen. VDI-Verlag. Düsseldorf.

Schmid, C. (1992). Auto-calibration of cameras by direct observation of objects. Master's thesis. University of Karlsruhe, Institute for Real-Time Control Systems and Robotics / University of Grenoble, LIFIA.

Strutz, T., W. Riechmann and T. Stahs (1994). Tiefendatengewinnung mit dem codierten Lichtansatz - Einsatzmöglichkeiten in der Automobilindustrie. Technical report. Volkswagen Konzernforschung Wolfsburg / Institut für Robotik und Prozessinformatik, TU Braunschweig.

Sugihara, K. (1986). Machine Interpretation of Line Drawings. MIT Press.

Tsai, R.Y. (1986). An efficient and accurate camera calibration technique for 3D machine vision. In: IEEE Computer Vision and Pattern Recognition. Miami Beach, Florida. pp. 364-374.

Vuylsteke, P. and A. Oosterlinck (1990). Range image acquisition with a single binary-encoded light pattern. IEEE Trans. on Pattern Analysis and Machine Intelligence 12(2), 148-164.

Weckesser, P. and F. Wallner (1994). Calibrating the active vision system KASTOR for real-time robot navigation. In: Close Range Techniques and Machine Vision (J.F. Fryer, Ed.). ISPRS Commission V. pp. 430-436.

Weckesser, P. and G. Hetzel (1994). Photogrammetric calibration methods for an active stereo vision system. In: Intelligent Robotic Systems (IRS) (A. Borkowsky and J. Crowley, Eds.). pp. 430-436.

Weckesser, P. and R. Dillmann (1994). Accuracy of scene reconstruction with an active stereo vision system using motorized zoom lenses. In: Intelligent Robots and Computer Vision XIII: 3D Vision, Product Inspection, and Active Vision (D.P. Casasent, Ed.). SPIE. pp. 470-481.

Wilson, R. (1993). Modeling and calibration of automated zoom lenses. PhD thesis. CMU, Pittsburgh, USA.
