An Integrated Obstacle Detection Framework for Intelligent Cruise Control on Motorways

Stefan Bohrer, Thomas Zielke and Volker Freiburg
C-VIS Computer Vision und Automation GmbH, Universitätsstr. 142, D-44799 Bochum, Germany
Phone +49/234-9706610, FAX +49/234-9706630
representing the CLEOPATRA* consortium
June 23, 1995
Abstract. This paper deals with the development and implementation of a purely visual obstacle detection framework for autonomous driving on motorways. Our activities are embedded in the SMART VEHICLE subproject of the ESPRIT project CLEOPATRA. The aim of SMART VEHICLE is the development of a visually controlled intelligent cruise control (ICC) for a prototype passenger car, the Mercedes-Benz research car VITA II. The vision modules operate concurrently on a net of digital signal processors with multiple video inputs. Our obstacle detection framework is based on the application of highly adapted machine-vision elements such as robust symmetry measuring, neural net-based adaptive object recognition, real-time tracking of multiple vehicles, and inverse-perspective stereo image matching (IPM). We show detailed results from extensive closed-loop autonomous driving on public motorways and present the final HPC hardware system which is part of the application computer of VITA II. The results presented in this paper build upon the work presented at the IV'94 in Paris [1].
1 Introduction
The main objective of the CLEOPATRA sub-project SMART VEHICLE is the development of a visually guided intelligent cruise control (ICC) system for the Daimler-Benz research passenger car VITA II, a Mercedes-Benz 500 SEL sedan [8]. The SMART VEHICLE work package is a highly challenging HPC (High Performance Computing) task which aims several steps beyond the recently finished European research programme PROMETHEUS. PROMETHEUS had the main goal of demonstrating the feasibility of certain functions such as obstacle detection, fully autonomous driving or lane tracking. SMART VEHICLE builds upon the PROMETHEUS deliverables and focuses on increased reliability, better performance and smaller size in order to get in line with product and market requirements [1]. The aim is to concentrate the research activities in order to develop exploitable products which can be integrated into normal passenger cars in the near future. This paper presents the results achieved in CLEOPATRA with respect to these requirements. Cost-effectiveness is an important issue for car-following applications and intelligent vehicle highway systems in order to be able to realize exploitable products within this field [4, 2].

The basic organisation of the ICC functionality stems from a purely visual information analysis using 4 CCD cameras, a lane-tracking module, the obstacle detection system, a traffic sign recognition module and the overall planning and decision module [8, 5]. The obstacle detection functionality is realized by two competing image processing schemes: on the one side the multi-focal multiple object recognition system of the group of Prof. Dickmanns [6] at the UBM in Munich, and on the other side the visual obstacle detection framework (ODF) described in this paper. Both systems rely on structurally different image processing principles and serve as redundant obstacle detection modules. The C-VIS approach uses two-dimensional adaptive object recognition algorithms based on neural information processing principles, whereas the UBM approach uses the 4D paradigm of dynamic machine vision with an explicit road model. The visual information is provided by two sets of CCD cameras, one consisting of a bifocal module (wide-angle and tele), the second being a stereo camera system. The ICC keeps the car on the correct lane and automatically monitors the traffic ahead. The monitoring task is realized by a two-stage obstacle detection scheme which tracks and classifies multiple vehicles in parallel at a long distance range and concurrently checks the road for general obstacles at a short distance range. A detailed overview of the different obstacle detection applications for car-following tasks can be found in the proceedings of the IV'94 in Paris.

* The work in the CLEOPATRA (Clusters of Embedded Time-Critical Applications) project is performed within the framework of the ESPRIT programme and partly funded by the Commission of the European Communities. The following companies form the consortium: AEG (D), C-VIS (D), DASA (D), DBAG (D), DLR (D), ECSA (F), GRAPHIKON (D), MCS (F), OSS (DK), PAC (UK), PERIMOS (D), THO-TBS (F), UBM (D).
2 ICC Scenario and Task Description

The SMART VEHICLE has to be able to drive on regular public motorways. The ICC system does not have to monitor approaching traffic because one of the key features of motorways is unidirectional traffic. The visual information is provided by two fixed cameras (wide-angle and tele) behind the middle of the windshield and a stereo pair of cameras on the left and right side of the windshield within the car. The ICC
functionality includes the detection of possible failures, e.g. due to bad weather or illumination conditions. In case of poor visibility the ICC has to hand control back to the driver. The ICC does not provide all-around vision, and thus no overtaking recommendations are given. These requirements were defined by Daimler-Benz for strictly product-driven reasons; therefore the lateral obstacle detection (SideView [1, 8]) is not taken into account within the ICC. The main objective of the ICC is not fully autonomous driving including overtaking, but improved driver safety and comfort. Nevertheless the SideView system is part of VITA II's full functionality and will be used in subsequent projects for city traffic applications.
We use two visual obstacle detection approaches for the two different application scenarios:

CarTrack: a real-time bifocal image processing module for the detection and tracking of up to 3 vehicles concurrently from behind, including classification of different vehicle types (flowing traffic scenario).

VisionBumper: real-time short-range frontal stereo detection of elevated objects, used for warning functions in city traffic or during stop-and-go on freeways (stop-and-go scenario).

CarTrack is a sophisticated car-following and vehicle classification scheme. It is operated within the so-called flowing traffic scenario up to a speed of 130 km/h. The detection distance range lies between 5 and 80 m; a detected object ahead can be tracked up to a distance of 100 m. CarTrack is able to track 3 vehicles ahead in parallel, so the concurrent inspection of three lanes on the motorway is possible. The vehicle objects detected and tracked by CarTrack are modelled as boxes in a bird's-eye view representation as shown in figure 1. CarTrack is based on active bifocal vision analysis, using a wide-angle camera for detection and short-distance tracking and a tele camera for long-distance tracking tasks. A detailed technical description of a previous version of CarTrack has been published in [10]. VisionBumper, a stereo-based inverse-perspective obstacle detection scheme, operates within the stop-and-go scenario, ranging from 0 to 50 km/h and within a distance range of 5 to 35 m in front of the vehicle. This functionality does not include any kind of classification. For a complete description of the underlying inverse-perspective paradigm and its applications to autonomous mobile robots we refer the reader to [3, 9]. A simple sketch of how the two scenario ranges might be arbitrated is given after figure 1.

Figure 1: Abstract model of the ICC environment. The perceived area is divided into two distance ranges: a) a short-range area for the stop-and-go scenario (VisionBumper, range 5-35 m, up to 50 km/h), where two obstacles can be monitored in parallel; b) a long-distance range (CarTrack, range up to 100 m, up to 130 km/h) for the monitoring of up to 3 vehicles concurrently. The bifocal CarTrack cameras and the stereo VisionBumper cameras are mounted in VITA II.
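To make the two operating regimes concrete, the following minimal sketch arbitrates between the modules by the own vehicle speed, using the ranges stated above. In the real system both schemes run concurrently; this only illustrates which scenario's output would dominate at a given speed. Function and constant names are our own, not taken from the original system.

#include <stdio.h>

/* Scenario limits as stated in the text (illustrative constants). */
#define VB_MAX_SPEED_KMH  50.0   /* VisionBumper: stop-and-go up to 50 km/h */
#define CT_MAX_SPEED_KMH 130.0   /* CarTrack: flowing traffic up to 130 km/h */

enum scenario { STOP_AND_GO, FLOWING_TRAFFIC, OUT_OF_RANGE };

/* Select the primary obstacle detection scenario from the own speed. */
static enum scenario select_scenario(double speed_kmh)
{
    if (speed_kmh <= VB_MAX_SPEED_KMH) return STOP_AND_GO;
    if (speed_kmh <= CT_MAX_SPEED_KMH) return FLOWING_TRAFFIC;
    return OUT_OF_RANGE;             /* hand control back to the driver */
}

int main(void)
{
    double v = 35.0;                 /* own speed in km/h */
    switch (select_scenario(v)) {
    case STOP_AND_GO:     printf("VisionBumper dominant (5-35 m)\n"); break;
    case FLOWING_TRAFFIC: printf("CarTrack dominant (5-100 m)\n");    break;
    default:              printf("ICC inactive\n");                   break;
    }
    return 0;
}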
Figure 2: CarTrack results. The images show the on-line output of the C40 display unit at arbitrarily chosen time-slots. Figures a)-c) are views of the tele camera; d) is a view of the wide-angle frontal camera. The grey-level images show different system states in which both parallel trackers are in action. In all examples, the system tracks two vehicles at a time, including an inherent classification of the vehicle type.
The specialized image processing hardware consists of a net of digital signal processors (TMS320C4X) with several video input nodes and a graphic display node dedicated to the visualization of the perceived environment. Both schemes run concurrently in real time on the HPC DSP network. They do not require a continuously moving car or any kind of specific road model which has to be temporally tracked. Due to their structure both schemes work on static as well as on dynamic scenes. VisionBumper and CarTrack are both coupled to the DDB (Dynamic Data Base) of VITA II. All sensor modules communicate with the planning and decision module via the dynamic data base; the DDB serves as the general interface between all modules. The planning and decision module realizes the overall control and thus plans the manoeuvres of the car during closed-loop autonomous driving [5]. Both obstacle detection schemes deliver measurements of distance, offset and partially (CarTrack) relative velocity of the detected obstacles (1-2 objects for VisionBumper, up to 3 for CarTrack) to the DDB. They both provide on-line quality measurements estimating the recognition reliability.
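The paper states which quantities are transmitted to the DDB every 80 ms, but not how the records are laid out; the following struct is therefore only a plausible sketch, with field names and types of our own choosing.

/* Sketch of an obstacle record as delivered to the Dynamic Data Base.
 * The field set follows the text; names and types are assumptions. */
typedef struct {
    double distance_m;    /* longitudinal distance to the obstacle [m] */
    double offset_m;      /* lateral offset relative to the own lane [m] */
    double rel_speed_ms;  /* relative velocity [m/s]; CarTrack only */
    double quality;       /* on-line recognition reliability estimate */
    int    vehicle_type;  /* classification result; CarTrack only */
} ObstacleRecord;

/* Per 80 ms cycle: CarTrack reports up to 3 records, VisionBumper 1-2. */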
3 Closed-Loop ICC on Motorways

The following section gives ICC results which were recorded during real-time control of VITA II on public motorways around Bochum. The algorithms have up to now been extensively tested on more than 2000 km of fully autonomous driving on normal motorways and highways. These test drives included all kinds of real weather and illumination conditions and changes.

3.1 CarTrack Results

Figure 2 shows the CarTrack system in operation. The leading cars are being tracked and their sizes are dynamically measured. The images are on-line recordings of the output of the HETVIO display module of the C40 network. These images are displayed on-line during operation on the HETVIO, showing the original images of either the wide-angle or the tele camera. Detected cars are visualized by a surrounding black box which is marked with the type of the detected vehicle. The upper row of text shows the current system status for the two parallel trackers in operation. Each tracker is either in search mode or, after having detected a leading vehicle, in tracking mode. An arbitrarily chosen excerpt of a DDB recording for a time sequence of 100 s during autonomous driving is shown in Figure 3. All data sets are delivered to the DDB every 80 ms. The figure gives the results of the estimated distance, relative speed and offset of a leading car currently being tracked. The quality numbers shown in d) evaluate the current estimation values. The relative velocity is predicted by a recursive Kalman-filtering step (a minimal sketch of such a filter is given after the figure caption). The quality value is a function of the illumination conditions and internal system parameters.

Figure 3: CarTrack results. Excerpt of a 100 s on-line recording during driving (distance in m, speed in m/s, offset in m and quality in % plotted over time in seconds). All the data are directly sent to the DDB. a) Distance, b) relative speed, c) vehicle offset on the lane, d) quality value.
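The paper only names a recursive Kalman-filtering step for the relative velocity; the following is a minimal sketch of such a filter under a constant-velocity model, fed with the distance measurements that arrive every 80 ms. All noise parameters and initial covariances are illustrative assumptions, not values from the original system.

#include <stdio.h>

#define T 0.080                  /* DDB cycle time [s] */

typedef struct {
    double d, v;                 /* state: distance [m], rel. speed [m/s] */
    double P[2][2];              /* state covariance */
} KF;

static void kf_init(KF *f, double d0)
{
    f->d = d0;  f->v = 0.0;
    f->P[0][0] = 4.0;  f->P[0][1] = 0.0;   /* assumed initial uncertainty */
    f->P[1][0] = 0.0;  f->P[1][1] = 9.0;
}

/* One predict/update cycle with distance measurement z;
 * r = measurement std. deviation [m], q = process noise (assumed). */
static void kf_step(KF *f, double z, double r, double q)
{
    /* Predict with the constant-velocity model d' = d + T*v, v' = v. */
    double d_pred = f->d + T * f->v;
    double P00 = f->P[0][0] + T * (f->P[0][1] + f->P[1][0])
               + T * T * f->P[1][1] + q;
    double P01 = f->P[0][1] + T * f->P[1][1];
    double P10 = f->P[1][0] + T * f->P[1][1];
    double P11 = f->P[1][1] + q;

    /* Correct with the measured distance (measurement matrix H = [1 0]). */
    double S  = P00 + r * r;
    double K0 = P00 / S, K1 = P10 / S;
    double y  = z - d_pred;                /* innovation */
    f->d = d_pred + K0 * y;
    f->v = f->v + K1 * y;
    f->P[0][0] = (1.0 - K0) * P00;  f->P[0][1] = (1.0 - K0) * P01;
    f->P[1][0] = P10 - K1 * P00;    f->P[1][1] = P11 - K1 * P01;
}

int main(void)
{
    KF f;
    int k;
    kf_init(&f, 40.0);
    for (k = 1; k <= 5; k++)               /* leading car slowly closing in */
        kf_step(&f, 40.0 - 0.4 * k, 0.5, 0.01);
    printf("distance %.2f m, rel. speed %.2f m/s\n", f.d, f.v);
    return 0;
}

The filter state carries distance and relative speed; each cycle predicts the state over one 80 ms interval and corrects it with the new distance measurement, so the speed estimate emerges from the distance track without ever being measured directly.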
3.2 VisionBumper Results
Figure 4: Visualization of the VisionBumper paradigm. The upper two images show the original incoming stereo images of a typical road scene. The lower two images are the resulting inverse-perspectively mapped images which are fed to the correlation procedure. Both IPM images are mapped onto a common region on the road within the binocular area. They only differ in parts of the images not belonging to the flat road.
The VisionBumper geometry is shown in Figure 4. It consists of the original stereo view of a typical scene and the resulting inverse-perspective images. The left and the right original images are both inverse-perspectively mapped onto a common view within the binocular area on the road. The depth resolution of the VisionBumper system can easily be adapted to a specific application by dynamically changing the current inspection area on the road. Table 1 shows different optimized binocular inspection areas within the road which can be activated with respect to the current depth range. The 1/2"-CCD stereo cameras have tilt angles of about 9° and pan angles of 1.5°, together with a base width of 1.16 m and a mounting height of 1.36 m. The focal length of both cameras is 15 mm.
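To make the mapping concrete, here is a minimal sketch of the flat-road inverse-perspective construction of [3] for one camera: every cell of the 64 × 64 road-plane target image is projected into the source image through an ideal pinhole model with the camera's height and tilt, and sampled there. The pan angle is neglected and all constants are illustrative; the original system performs a two-pass mapping on the DSP nodes.

#include <math.h>

#define W   256                  /* source image width (illustrative) */
#define HGT 256                  /* source image height (illustrative) */
#define N   64                   /* road-plane target image size */

/* Project the road point (x forward, y lateral, in metres) into the source
 * image of a pinhole camera at height h with tilt angle tilt (radians) and
 * focal length f in pixels; returns 0 if the point falls outside the image. */
static int road_to_pixel(double x, double y, double h, double tilt,
                         double f, double cx, double cy, int *u, int *v)
{
    double zc = x * cos(tilt) + h * sin(tilt);   /* depth along optical axis */
    if (zc < 0.1) return 0;
    *u = (int)(cx - f * y / zc);
    *v = (int)(cy + f * (h * cos(tilt) - x * sin(tilt)) / zc);
    return (*u >= 0 && *u < W && *v >= 0 && *v < HGT);
}

/* Fill the N x N road-plane image covering [x0,x1] x [y0,y1] on the road. */
static void ipm(unsigned char src[HGT][W], unsigned char dst[N][N],
                double x0, double x1, double y0, double y1,
                double h, double tilt, double f, double cx, double cy)
{
    int i, j, u, v;
    for (i = 0; i < N; i++) {
        double x = x0 + (x1 - x0) * i / (N - 1);     /* depth coordinate */
        for (j = 0; j < N; j++) {
            double y = y0 + (y1 - y0) * j / (N - 1); /* lateral coordinate */
            dst[i][j] = road_to_pixel(x, y, h, tilt, f, cx, cy, &u, &v)
                        ? src[v][u] : 0;
        }
    }
}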
Figure 5: VisionBumper results 1. The images show the on-line output of the C40 display unit at arbitrarily chosen time-slots. In each image, the original image of the left stereo camera is shown together with the overlaid small images of the inverse-perspectively mapped (IPM) target images. The small image in the middle is the difference of both mapped target images in the binocular area. The minimal detected distance is projected into the original perspective view and is visualized by marking this distance with a white horizontal bar.
Figure 5 shows four different on-line outputs of the C40 display unit at arbitrary moments. In each image the big image in the background is the original on-line image of the left VisionBumper camera. The two small images on the left side of the pictures are the inverse-perspective images of the stereo cameras. These images have a common inverse-perspective view of the stereo geometry within the binocular area (upper image: left camera, lower image: right camera, each having a resolution of 64 × 64 pixels). The correlation process for segmenting obstacles in the scene is done on these resulting mapped images. The black bars in the lower inverse-perspective image (right camera) show the regions where significant changes in both images have been detected. The minimal detected distance is projected into the original perspective view and is visualized by marking this distance with a white horizontal bar. This bar is drawn with respect to a fixed object width of 1.8 m and is located in the middle of the perspective view in front of the car at the detected distance. The detected distance is printed on the lower part of the images. The small image in the middle-left which is found in some of the result images is the difference image of the two inverse-perspective mapped images shown above and below it. The chosen pictures show different typical situations found in normal traffic. It is obvious that strong texture signals on the street (such as arrows or other types of road markings) do not affect the obstacle detection functionality. The algorithm is also robust against variations of the outer camera parameters. During driving, the car typically shows tilt angle variations of about 5 degrees. This leads to a decalibration of the system during operation. Nevertheless the algorithm is robust enough to compensate for these distortions.
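A minimal version of the obstacle test on the remapped images might look as follows: on a flat road the two 64 × 64 IPM images agree, so per depth row we count pixels where the gray values disagree strongly and report the nearest row with enough disagreement; the row index then maps linearly back to a road distance. Threshold and vote count are illustrative assumptions, and the original system uses a correlation procedure rather than this plain difference test.

#define N 64                       /* IPM target image size */

/* Returns the nearest depth row flagged as obstacle, or -1 if the road
 * ahead is free.  Row 0 is assumed to be the nearest depth bin. */
static int nearest_obstacle_row(unsigned char left[N][N],
                                unsigned char right[N][N])
{
    int i, j;
    for (i = 0; i < N; i++) {
        int votes = 0;
        for (j = 0; j < N; j++) {
            int d = (int)left[i][j] - (int)right[i][j];
            if (d < 0) d = -d;
            if (d > 40) votes++;   /* strong gray-level disagreement */
        }
        if (votes > N / 4)         /* enough of the row disagrees */
            return i;
    }
    return -1;
}

/* For the inspection area [5 m, 35 m] the flagged row maps back to a
 * distance of roughly 5.0 + 30.0 * i / (N - 1) metres. */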
Monitored Road Area      1         2         3
v_max                   12.0 m    35.0 m    50.0 m
v_min                    5.0 m     3.8 m    10.0 m
h_max                    1.6 m     2.3 m     2.8 m
h_min                   -0.4 m    -1.5 m    -4.0 m
Table 1: Different possible rectangular inspection regions on the road within the binocular area. The different geometries allow varying depth resolutions and thus the handling of varying minimum object sizes. The origin of the reference coordinate system is situated at a central position within VITA II. v denotes the depth direction and h the lateral direction with respect to the VITA II reference point.
Figure 6: VisionBumper results. Excerpt of the data flow which is transmitted to the DDB; time slice of 35 s of data recording during driving (values plotted over time in seconds). a) Distance [m], b) vehicle width [m], c) quality value.
Again an arbitrarily chosen excerpt of a DDB recording is shown, this time for a longer time sequence. Figure 6 shows a fraction of the data delivered to the DDB. The most important measurement in this case is the distance estimation. The width takes fixed values for different vehicle types, and the offset is considered relative to the lane in which VITA II is driving. The quality value again is a function of the external illumination conditions and internal system parameters.
System Parameter         CarTrack        VisionBumper
Scan Rate                80/60 ms        80 ms
Latency Time             160/80 ms       160 ms
Vehicle Speed            0-130 km/h      0-50 km/h
Operat. Distance         5-100 m         5-35 m
Number of Objects        3               1-2
Classification           car, truck      --
Bifocal Functionality    yes             possible

Table 2: Performance data of both obstacle detection subsystems.
4 Embedded HPC-System Hardware

For the implementation of the obstacle analysis system we use a hardware platform based on the TMS320C40 and TMS320C44 signal processors ('C4x for short) from Texas Instruments. This type of DSP follows a concept similar to the INMOS transputer with its incorporated communication facilities, but is much better suited for HPC tasks such as iconic image processing. The 'C44 is, like its predecessor the 'C40, a floating-point DSP rated at 275 MOPS and 50 MFLOPS, but with only four communication ports and reduced power and space consumption (power management functions) compared to the 'C40. The parallel communication links are capable of bidirectional data transfers at rates of up to 20 MBytes/sec each. In order to use an industry-standard system we chose the TIM-40 module format. One of the most crucial features of this standard is its inherent scalability: networks of 'C4x nodes can easily be plugged together. Concerning the software structure, the requirement has been to stay within a standard programming paradigm for parallel processing. This is a necessary and desirable objective as it greatly facilitates software reusability for subsequent implementations of new applications. For software development we use Parallel C, which makes the inherent communication features of the 'C4x DSP available in a high-level language, thus simplifying programming using the well-established paradigm of communicating sequential processes. These prerequisites on the hardware and the software side put us in the position to build up a coarse-grain parallel system for obstacle analysis. Figure 7 shows a sketch of the hardware platform embedded in the VITA II test car and its interconnections to video sources and data output interfaces. At the coarse level there are four major processing branches which all perform their functional tasks in parallel. Three of them are dedicated to CarTrack and one to the VisionBumper module.
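The communicating-sequential-processes style of Parallel C on the 'C4x links can be illustrated with ordinary POSIX threads and a one-slot channel; this is of course not Parallel C itself, only a sketch of the paradigm in which each pipeline stage blocks on its input channel, computes, and sends on its output channel.

#include <pthread.h>
#include <stdio.h>

typedef struct {                   /* one-slot synchronous channel */
    pthread_mutex_t m;
    pthread_cond_t  cv;
    int full, data;
} chan_t;

static void chan_init(chan_t *c)
{
    pthread_mutex_init(&c->m, NULL);
    pthread_cond_init(&c->cv, NULL);
    c->full = 0;
}

static void chan_send(chan_t *c, int v)
{
    pthread_mutex_lock(&c->m);
    while (c->full) pthread_cond_wait(&c->cv, &c->m);
    c->data = v;  c->full = 1;
    pthread_cond_broadcast(&c->cv);
    pthread_mutex_unlock(&c->m);
}

static int chan_recv(chan_t *c)
{
    int v;
    pthread_mutex_lock(&c->m);
    while (!c->full) pthread_cond_wait(&c->cv, &c->m);
    v = c->data;  c->full = 0;
    pthread_cond_broadcast(&c->cv);
    pthread_mutex_unlock(&c->m);
    return v;
}

static chan_t grab_to_map, map_to_match;

static void *map_stage(void *arg)    /* e.g. inverse-perspective mapping */
{
    int v;
    while ((v = chan_recv(&grab_to_map)) >= 0)
        chan_send(&map_to_match, 2 * v);   /* stand-in for real work */
    chan_send(&map_to_match, -1);          /* propagate end marker */
    return arg;
}

static void *match_stage(void *arg)  /* e.g. matching the mapped images */
{
    int v;
    while ((v = chan_recv(&map_to_match)) >= 0)
        printf("result for frame %d\n", v / 2);
    return arg;
}

int main(void)
{
    pthread_t t1, t2;
    int frame;
    chan_init(&grab_to_map);  chan_init(&map_to_match);
    pthread_create(&t1, NULL, map_stage, NULL);
    pthread_create(&t2, NULL, match_stage, NULL);
    for (frame = 0; frame < 3; frame++)    /* frame-grabber stage */
        chan_send(&grab_to_map, frame);
    chan_send(&grab_to_map, -1);           /* end of input */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}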
Figure 7: Sketch of the embedded processing hardware used in VITA II (VB: VisionBumper, CT-1..3: instances of CarTrack). Video input from the cameras is distributed over a video crossbar to the processing branches; a display node and the ROOT node with its host and DDB interfaces complete the network.
Figure 8: Detail of a processing pipeline dedicated to a CarTrack or a VisionBumper process: an RGB video input feeds a chain of three 'C44 nodes which delivers its results to the root node.
The branches depicted as shaded boxes in figure 7 each consist of a pipeline of three DSPs. The components of this pipeline are detailed in figure 8. The video signals are received by special TIM-40 processor nodes which incorporate digitizer chips and frame buffers. Stereo and bifocal image acquisition is done by RGB color digitizers providing simultaneous capture of up to three channels. A multi-channel video multiplexer is used for the distribution of the video signals in the computer vision system. One of these pipelines is used by a VisionBumper process while the other two are dedicated to one CarTrack tracking process each. In the case of VisionBumper the first two nodes are employed for the two-pass inverse-perspective mapping of the stereo images while the third is used for matching the resulting mapped images. For CarTrack, the subfunctions feature extraction, neural net-based object detection and robust symmetry measuring are distributed over the three available nodes. An occasionally occurring third object will be tracked by a CarTrack process running on the display unit (right side in fig. 7) concurrently to the display functions. Although this process is a cut-down version in comparison to the other two, this is compensated by a kind of load balancing which assures that the third process always tracks the most distant object of the three, i.e. the one expected to be of least importance. Finally, the root node in the center of figure 7 has
several tasks. First, it has to schedule the tracking processes for the detected objects, thus providing a loose coupling of the trackers. Second, it has to distribute the tracking results to the display unit and the vehicle's dynamic data base. The interface to the DDB consists of a single bidirectional communication channel which allows the transfer of results to the application computer. During development the root node also takes on additional tasks like system profiling and logging of tracking results. In total, the obstacle analysis system consists of eleven 'C4x DSP nodes including four frame grabber devices and an RGB color display unit. The system as a whole occupies 3 VME slots in the rack located in the trunk of the test car. Currently the system is hosted by a SPARC VME card for booting and development. By fitting a special TIM-40 module containing non-volatile memory for the application code, the network can be brought to stand-alone operation, so that a host system is no longer needed. Of the 12 additional lateral cameras installed in the vehicle, not all can be connected directly to the frame grabbers in the network. Therefore, the video multiplexer used here is in effect a highly integrated 16 × 16 video crossbar which is controllable by one of the DSP nodes. This feature is mainly exploited by the SideView application, enabling scanning of all cameras and thus providing all-around vision. Since the space occupied by the computer vision system is a crucial point when moving towards a product, efforts have been made to reduce the space consumption further. In comparison with the previous system demonstrated in the scope of PROMETHEUS [1], our current hardware is now capable of running CarTrack and VisionBumper concurrently without increasing the mechanical size of the system, due to the recent release of the 'C44 processor and the availability of more highly integrated TIM-40 modules on the market. We expect to decrease the space needed even further, given the perspective of future successors of the well-established 'C40 DSP. Concerning the obstacle analysis modules CarTrack and VisionBumper on their own, we have already achieved a higher performance-to-space ratio. Note that it is not necessary to have all modules running together; the system may as well be split up in order to have only the VisionBumper functionality or a single instance of CarTrack in use.
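As an impression of the symmetry subfunction named above, the following is a strongly simplified intensity-symmetry score in the spirit of [10]: the gray values in a window around a candidate vertical axis are split into their even (mirror-symmetric) and odd parts, and the score is the fraction of signal energy in the even part. The actual CarTrack measure is considerably more elaborate; window handling and normalisation here are our own.

/* img: one image row of width w; xc: candidate axis; half: window half-width. */
static double symmetry_score(const unsigned char *img, int w, int xc, int half)
{
    double mean = 0.0, even_e = 0.0, odd_e = 0.0;
    int d, n = 0;

    for (d = -half; d <= half; d++)        /* local mean removes the DC part */
        if (xc + d >= 0 && xc + d < w) { mean += img[xc + d]; n++; }
    if (n == 0) return 0.0;
    mean /= n;

    for (d = 1; d <= half; d++) {
        double l, r, even, odd;
        if (xc - d < 0 || xc + d >= w) break;
        l = img[xc - d] - mean;  r = img[xc + d] - mean;
        even = 0.5 * (l + r);              /* mirror-symmetric part */
        odd  = 0.5 * (l - r);              /* antisymmetric part */
        even_e += even * even;
        odd_e  += odd  * odd;
    }
    return (even_e + odd_e > 0.0) ? even_e / (even_e + odd_e) : 0.0;
}

Scanning the candidate axis across the image and keeping the maxima of such a score, accumulated over several rows, yields hypotheses for the vertical symmetry axes of vehicle rears.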
5 Conclusions and Future Work
We have presented a complete obstacle detection framework (ODF) for intelligent cruise control of a prototype passenger car on public motorways. The system's visual inputs are a bifocal and a stereo camera system, both looking ahead of the car. The two distinct obstacle detection schemes operate concurrently on a dedicated 'C40-based network in the two defined application scenarios of a) flowing and b) stop-and-go traffic over a wide distance range. The
design of the HPC platform strictly follows product-driven requirements of cost effectiveness, task separability, low space consumption and high performance, together with easy integration into the transputer network within the application computer of VITA II. Future work will focus on the extension of the functionality to night driving and the integration of additional adaptive object recognition functions. An example is the approximation of model-based CAD contour forms by B-spline analysis for the task of adaptive visual object measurement. The SideView system will be used for automated overtaking control with an optical flow analysis comparable to the approach of Tistarelli et al. [7]. Our future research activities will focus on the special requirements of city traffic scenarios.
References
[1] M. Brauckmann, C. Goerick, J. Groß, and T. Zielke. Towards All Around Automatic Visual Obstacle Sensing for Cars. In Intelligent Vehicles '94, pages 79-84, Paris, October 24-26, 1994. IEEE.
[2] D. Koller, J. Weber, T. Huang, J. Malik, G. Ogasawara, B. Rao, and S. Russell. Towards Robust Automatic Traffic Scene Analysis in Real-time. In 12th IAPR International Conference on Pattern Recognition, pages 126-131, Jerusalem, October 9-13, 1994. IEEE.
[3] H.A. Mallot, H.H. Bülthoff, J.J. Little, and S. Bohrer. Inverse Perspective Mapping Simplifies Optical Flow Computation and Obstacle Detection. Biological Cybernetics, 64:177-185, 1991. Springer-Verlag.
[4] I. Masaki, S. Dicker, A. Gupta, and B.K.P. Horn et al. Cost-Effective Vision Systems for Intelligent Vehicles. In Intelligent Vehicles '94, pages 39-43, Paris, October 24-26, 1994. IEEE.
[5] D. Reichardt and J. Schick. Collision Avoidance in Dynamic Environments Applied to Autonomous Vehicle Guidance on the Motorway. In Intelligent Vehicles '94, pages 74-78, Paris, October 24-26, 1994. IEEE.
[6] F. Thomanek, E.D. Dickmanns, and D. Dickmanns. Multiple Object Recognition and Scene Interpretation for Autonomous Road Vehicle Guidance. In Intelligent Vehicles '94, pages 231-236, Paris, October 24-26, 1994. IEEE.
[7] M. Tistarelli, F. Guarnotta, D. Rizzieri, and F. Tarocchi. Application of Optical Flow for Automated Overtaking Control. In Proceedings of the 2nd Workshop on Applications of Computer Vision, pages 105-113, Sarasota, December 5-7, 1994. IEEE.
[8] B. Ulmer. VITA II - Active Collision Avoidance in Real Traffic. In Intelligent Vehicles '94, pages 1-6, Paris, October 24-26, 1994. IEEE.
[9] W. von Seelen, S. Bohrer, J. Kopecz, and W. Theimer. A Neural Architecture for Visual Information Processing. International Journal of Computer Vision, 1995. In press.
[10] T. Zielke, M. Brauckmann, and W. von Seelen. Intensity and Edge-Based Symmetry Detection with an Application to Car-Following. CVGIP: Image Understanding, 58(1), July 1993.