Extending two non-parametric transforms for FPGA based stereo matching using bayer filtered cameras*

Kristian Ambrosch, Austrian Research Centers GmbH - ARC, A-1220 Vienna, Austria
Martin Humenberger, Austrian Research Centers GmbH - ARC, A-1220 Vienna, Austria
Wilfried Kubinger, Austrian Research Centers GmbH - ARC, A-1220 Vienna, Austria
Andreas Steininger, Vienna University of Technology, A-1040 Vienna, Austria

Abstract

Stereo vision has become a very interesting sensing technology for robotic platforms. It offers various advantages, but its drawback is a very high algorithmic effort. Due to the aptitude of certain non-parametric techniques for Field Programmable Gate Array (FPGA) based stereo matching, these algorithms can be implemented in a highly parallel design while offering adequate real-time behavior. To enable the stereo sensor to provide color images for object classification tasks, we propose a technique for extending the rank and the census transform for increased robustness on gray scaled bayer patterned images. Furthermore, we analyze the behavior of the extended and the original algorithms on image sets created in controlled environments as well as on real world images, and compare their resource usage when implemented on our FPGA based stereo matching architecture.

* The research leading to these results has received funding from the European Community's Sixth Framework Programme (FP6/2003-2006) under grant agreement no. FP6-2006-IST-6-045350 (robots@home).

978-1-4244-2340-8/08/$25.00 ©2008 IEEE

1. Introduction

In recent years, stereo vision has become a very interesting sensing technique for service robotic platforms. Compared to competing techniques like LIDAR [16], ultrasonic sensing [20], monocular vision [19], or omnidirectional vision [1], stereo vision [11] offers several advantages: it is passive, it does not affect its surroundings, it is small in size, and it is very flexible. Stereo vision is therefore considered an interesting approach, as it offers the most flexibility. The drawback, however, is the very high algorithmic complexity and the associated cost of realizing such a sensor. The key point of stereo vision is to find the correspondence between the two images of a stereo pair. The challenging task is to identify corresponding pixel pairs and to calculate depth information from their horizontal displacement, called disparity, via triangulation. To obtain 3D information, this correspondence problem has to be solved. Computing a three dimensional depth map using stereo matching algorithms is computationally extremely expensive. Fortunately, area based matching algorithms have proved very suitable for highly parallel implementations on Field Programmable Gate Arrays (FPGAs) [13] and are therefore highly attractive for use in embedded real-time stereo vision sensors. For an efficient FPGA implementation, these algorithms must avoid resource intensive operations like multiplications or divisions. Under this constraint, non-parametric transforms, in particular the rank and the census transform [21], have proved very resource efficient when implemented on FPGAs [4], while producing dense and accurate disparity maps.

Area based algorithms match blocks of pixels to solve the correspondence problem between the two stereo images, so their robustness depends strongly on the size of these blocks. Then again, large block sizes lead to a smoothed disparity map and, furthermore, to deformed object edges [6].

Stereo vision typically generates depth maps centered on one of the camera images. It is therefore a good fit for object classification algorithms that require a three dimensional depth map as well as an intensity image. On the other hand, object classification algorithms often require color images, while most stereo vision algorithms enforce the use of gray scale images due to their lower computational complexity. One way to deal with this issue is to use color cameras for the stereo vision sensor and convert the images to gray scale for the stereo matching algorithm. When bayer filtered color cameras are used for this task, the image noise in the gray scaled images is significantly increased. This is because in a bayer filtered image, each pixel contains the intensity value for one specific color only, i.e., red, green, or blue. Simply using these raw intensity values as gray values causes a specific noise pattern. Even though color interpolation algorithms can reduce this issue, most of them are not suitable for FPGA based implementations due to their high logic consumption when implemented in hardware. And even though the rank and the census transform are capable of handling such noisy images, they require greatly increased block sizes to compute accurate disparity maps, leading to the aforementioned deformations of the object edges. FPGA technology has made a big leap forward since the rank and the census transform were proposed. Today's devices offer more than enough logic resources for improved, and therefore more complex, algorithms with higher robustness at equal block sizes. We therefore analyzed the rank and the census transform for their potential for improvement, keeping our focus on real world images produced by bayer filtered cameras rather than image sets taken under perfect illumination in a controlled environment with high end cameras. We propose a novel technique to extend the rank and the census transform, resulting in highly improved robustness on bayer patterned real world images, enabling the retrieval of color images for object classification in service robots, while avoiding resource intensive operations like multiplications or divisions.
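The noise pattern described above can be sketched in a few lines. The following toy example (not from the paper; an RGGB layout and the intensity values are assumptions for illustration) shows how a uniformly colored surface, seen through a bayer filter and used directly as a gray image, produces a strong 2 × 2 intensity pattern that a block matcher perceives as high-frequency noise:

```python
import numpy as np

def bayer_as_gray(raw):
    """Treat a raw bayer-patterned frame directly as a gray image.

    'raw' is an HxW array where each pixel holds only one color channel
    (RGGB layout assumed here). No interpolation is performed, so a
    uniformly colored surface shows a repeating 2x2 intensity pattern.
    """
    return raw.astype(np.float32)

# Simulate a uniform reddish surface captured through an RGGB filter.
h, w = 4, 4
r, g, b = 200.0, 120.0, 60.0  # per-channel response of the uniform surface
raw = np.empty((h, w), dtype=np.float32)
raw[0::2, 0::2] = r  # R sites
raw[0::2, 1::2] = g  # G sites
raw[1::2, 0::2] = g  # G sites
raw[1::2, 1::2] = b  # B sites

gray = bayer_as_gray(raw)
# Although the scene is uniform, neighboring "gray" pixels differ strongly.
print(gray[0:2, 0:2])
```

Demosaicing would smooth this pattern away, but, as noted above, at a logic cost that is hard to justify on an FPGA.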

2. Algorithms

2.1. Non-Parametric Transforms

Area based algorithms usually transform the stereo images before their correlation. In contrast to parametric measures, like the mean or variance, non-parametric transforms use the relative ordering of intensity values rather than the intensity values themselves. After the transformation of the stereo images, the resulting matching costs have to be calculated and the most accurate match has to be searched for. In our work, we always use the absolute minimum of the matching costs, also called the Winner Takes All (WTA) algorithm, rather than statistical approaches for the selection of the resulting disparity.

2.2. The Rank Transform

The rank transform [21] is defined as the number of pixels in a neighborhood region that are smaller (i.e., have a lower intensity) than its center pixel. Zabih and Woodfill therefore defined a comparison function $\xi(P, P')$ evaluating the sign of the pixel difference, which is one if the neighborhood region's center pixel $P$ is bigger than the pixel $P'$:

$$\xi(P, P') = \begin{cases} 1 & P > P' \\ 0 & P \le P' \end{cases} \qquad (1)$$

Using this comparison function, the rank transform $R(P)$ can be defined as

$$R(P) = \sum_{P' \in N(P)} \xi(P, P') \qquad (2)$$

where $N(P)$ is the neighborhood region without its center pixel $P$. The matching cost $MC_{rank}$ for two rank transformed pixels $P_{r1}$, $P_{r2}$ is calculated as the absolute difference:

$$MC_{rank}(P_{r1}, P_{r2}) = |P_{r1} - P_{r2}| \qquad (3)$$

Even though the rank transform is very suitable for FPGA implementations due to its simple computation, it is rarely used. One FPGA based implementation is [13].
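Equations 1 to 3 can be sketched in software as follows; this is a minimal reference implementation (NumPy, names chosen for illustration), not the paper's hardware design:

```python
import numpy as np

def rank_transform(img, window=5):
    """Rank transform (Zabih & Woodfill, Eq. 2): for each pixel, count
    the neighbors in a square window that are strictly smaller than
    the center pixel."""
    r = window // 2
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint16)
    for y in range(r, h - r):
        for x in range(r, w - r):
            patch = img[y - r:y + r + 1, x - r:x + r + 1]
            # xi(P, P') = 1 where center > neighbor; the center compares
            # equal to itself, so it contributes nothing to the count.
            out[y, x] = np.count_nonzero(patch < img[y, x])
    return out

def mc_rank(p1, p2):
    """Matching cost (Eq. 3): absolute difference of rank values."""
    return abs(int(p1) - int(p2))
```

Note that the transformed value is a small integer (at most the window area minus one), which is why the transform is so cheap in hardware.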

2.3. The Census Transform

The census transform [21] makes use of the same comparison function as the rank transform (Eq. 1). It concatenates ($\otimes$) the comparison function's results over the neighborhood region $N(P)$ into a bit vector, where $N(P)$ again does not contain the center pixel. The census transform $C(P)$ can therefore be defined as

$$C(P) = \bigotimes_{P' \in N(P)} \xi(P, P'). \qquad (4)$$

Here, the matching costs for two transformed pixels $P_{c1}$, $P_{c2}$ are calculated using the Hamming distance between the bit vectors, which can be computed by counting the set bits of the XORed ($\oplus$) vectors. Thus, the census transform's matching cost $MC_{census}$ for a neighborhood region containing $k$ pixels (without the center pixel) is

$$MC_{census}(P_{c1}, P_{c2}) = \sum_{n=1}^{k} P_{c1}(n) \oplus P_{c2}(n). \qquad (5)$$

Due to its high accuracy, the census transform is a very popular algorithm for hardware implementations, even though it requires more logic resources than the rank transform. There are various FPGA based implementations [3, 12, 14, 18] as well as implementations using Application Specific Integrated Circuits (ASICs) [10, 17].
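A software sketch of equations 4 and 5 (again an illustrative NumPy version, not the hardware design; the bit vector is packed into a 64-bit word here, which covers windows up to 7 × 7):

```python
import numpy as np

def census_transform(img, window=5):
    """Census transform (Eq. 4): concatenate xi(P, P') over the
    neighborhood into a bit vector, center pixel excluded."""
    r = window // 2
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint64)
    for y in range(r, h - r):
        for x in range(r, w - r):
            bits = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    if dy == 0 and dx == 0:
                        continue  # skip the center pixel
                    bits = (bits << 1) | int(img[y, x] > img[y + dy, x + dx])
            out[y, x] = bits
    return out

def mc_census(c1, c2):
    """Matching cost (Eq. 5): Hamming distance, i.e. popcount of XOR."""
    return bin(int(c1) ^ int(c2)).count("1")
```

On an FPGA the XOR and the popcount map to trivial combinational logic; the cost lies in storing and routing the wide bit vectors.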

2.4. Extending the Transforms

To gain better robustness on noisy images, the rank and the census transform must become able to distinguish between high and low deviations of the neighborhood pixels from their center pixel. Therefore, we introduce the absolute difference $AD$ in place of the transforms' original comparison function $\xi$:

$$AD(P, P') = |P - P'| \qquad (6)$$

By using the absolute differences instead of only the sign of the differences, the transforms gain a higher resolution of the pixel deviations and can therefore better differentiate between slight variations caused by noise and strong variations caused by texture. The absolute difference requires more logic resources than the original function $\xi$, but it consists only of a subtraction and a comparison and is therefore still quite resource efficient when implemented in hardware. Furthermore, it requires fewer logic resources than the signed difference would, since it avoids the handling of negative values, and the discrepancy in the resulting robustness is negligible. In contrast to the well known sum of absolute differences (SAD) algorithm [9], which uses the absolute differences between two neighborhood regions, this extension still uses the relative pixel values within the neighborhood, so the transforms remain more illumination invariant. In [2] we analyzed the behavior of an FPGA based implementation of the SAD algorithm, and our results showed that the SAD algorithm is not able to cope with low textured surfaces even under perfect illumination. Compared to the Modified Census Transform (MCT) proposed by Froeba and Ernst [5], our extension does without the computation of the mean over the neighborhood region and is therefore more resource efficient. Furthermore, the MCT is targeted at high illumination variations in the application of face detection. Both stereo images are taken at the same time and under the same illumination conditions, so such high illumination variations are unlikely to occur and do not justify the high resource usage caused by the mean operation when a non-parametric transform is already in use.

2.5. Extended Rank Transform

By using the absolute difference instead of the original comparison function $\xi$, the extended rank transform is defined as

$$R_{ext}(P) = \sum_{P' \in N(P)} AD(P, P'). \qquad (7)$$

The computation of the matching costs is performed analogously to the original rank transform's (Eq. 3).

2.6. Extended Census Transform

The extended census transform computes the vector of the absolute differences and is therefore defined as

$$C_{ext}(P) = \bigotimes_{P' \in N(P)} AD(P, P'). \qquad (8)$$

Here, the matching cost $MC_{extcensus}$ has to be computed over a byte vector rather than a bit vector. Therefore, we use the sum of absolute differences over the generated vectors:

$$MC_{extcensus}(P_{c1}, P_{c2}) = \sum_{n=1}^{k} |P_{c1}(n) - P_{c2}(n)| \qquad (9)$$
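The two extended transforms (Eqs. 7 to 9) differ from the originals only in replacing $\xi$ by $AD$; the following illustrative sketch (NumPy, names assumed, a plain list standing in for the hardware byte vector) makes the change explicit:

```python
import numpy as np

def ext_rank_transform(img, window=5):
    """Extended rank transform (Eq. 7): sum of absolute differences
    between the center pixel and its neighbors, instead of a 0/1 count."""
    r = window // 2
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint32)
    for y in range(r, h - r):
        for x in range(r, w - r):
            patch = img[y - r:y + r + 1, x - r:x + r + 1].astype(np.int32)
            out[y, x] = np.abs(patch - int(img[y, x])).sum()  # center adds 0
    return out

def ext_census_transform(img, window=5):
    """Extended census transform (Eq. 8): a vector of absolute
    differences, one byte per neighbor instead of one bit."""
    r = window // 2
    h, w = img.shape
    vecs = {}
    for y in range(r, h - r):
        for x in range(r, w - r):
            c = int(img[y, x])
            vecs[(y, x)] = [abs(c - int(img[y + dy, x + dx]))
                            for dy in range(-r, r + 1)
                            for dx in range(-r, r + 1)
                            if (dy, dx) != (0, 0)]
    return vecs

def mc_ext_census(v1, v2):
    """Matching cost (Eq. 9): SAD over the two difference vectors."""
    return sum(abs(a - b) for a, b in zip(v1, v2))
```

The byte-per-neighbor vector is what drives the extended census transform's register cost in hardware, as discussed in the evaluation.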

3. Experimental Evaluation

To contrast the performance and quality of the rank and the census transform with our proposed extended versions, we realized an FPGA based stereo matching architecture targeted at non-parametric techniques. We tested the algorithms on six different stereo image pairs, ranging from arranged datasets taken in a controlled environment, over datasets with artificial gaussian noise, to bayer patterned office images. Furthermore, we analyzed the resource usage of the different algorithms when implemented on the FPGA.

3.1. FPGA based Stereo Matching Architecture

Our FPGA based stereo matching architecture consists of three major pipeline stages: the input, the calculation, and the evaluation stage. The input stage reads the image data from the input port, storing and providing it for the calculation stage. The calculation stage performs the stereo matching iteratively as soon as the input stage has stored enough lines to start the computation. Here, the disparity range d is divided into r partitions and each partition is computed in a separate calculation round. First, the image blocks are transformed and stored in registers. After the first d/r blocks are transformed, the first pixel's matching costs are computed. Then the matching costs are aggregated. Afterwards, the WTA algorithm selects the smallest matching costs and their positions, storing them in the internal memory as the results for the current partition. The evaluation stage reads the partitions' results from the internal memory as soon as all calculation rounds for the previous line have been completed. On these results it again uses the WTA algorithm to find the smallest matching costs and selects their positions as the resulting disparity values.

Figure 1. FPGA based stereo matching architecture (block diagram: input stage with input management and per-image line buffers; calculation stage with transformation, matching costs, aggregation, and per-partition WTA; evaluation stage with final WTA and output port).

The advantage of using a round based computation is that the architecture's frame rate can be adjusted to the algorithm's requirements. For a resource intensive algorithm, for example, the number of calculation rounds can be increased, resulting in fewer transformations per round and therefore in lower logic consumption. Conversely, the number of calculation rounds can be decreased for a small and resource efficient algorithm, resulting in a shorter processing time. Figure 1 presents the block diagram of our stereo matching architecture.
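The round based scheme described above can be sketched as follows; this is a behavioral model under assumptions (an abstract per-disparity cost function stands in for the block matching; function names are illustrative), not the RTL:

```python
def wta(costs):
    """Winner Takes All: index and value of the minimum matching cost."""
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]

def round_based_match(cost_fn, d_range=100, rounds=10):
    """The disparity range is split into 'rounds' partitions; each round
    keeps its partial WTA winner, and a final WTA over the rounds'
    winners yields the disparity. 'cost_fn(d)' stands in for the
    per-disparity matching cost of one pixel."""
    per_round = d_range // rounds
    partial = []
    for p in range(rounds):
        costs = [cost_fn(p * per_round + d) for d in range(per_round)]
        idx, val = wta(costs)
        partial.append((p * per_round + idx, val))
    # Final WTA over the rounds' winners (the evaluation stage's job).
    return min(partial, key=lambda t: t[1])[0]

# Example: a cost curve with its minimum at disparity 37.
disparity = round_based_match(lambda d: (d - 37) ** 2)
print(disparity)  # -> 37
```

Trading `rounds` against logic is exactly the frame-rate/resource knob the paragraph above describes: more rounds mean fewer parallel transform units per round.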

3.2. Test Configuration

For testing the algorithms' robustness, we measured the correct matches of the resulting disparity maps, centered on the left camera, for three different stereo datasets, after performing a left/right consistency check for the removal of occluded or mismatched areas. Here, we allowed a maximum deviation of three disparity levels for the left/right consistency check and one disparity level for the correct matches, compared to the ground truth images. To reduce the stereo matching procedure to one dimension, all considered stereo datasets are rectified to fulfill the epipolar geometry [22]; our architecture therefore only needs to search for correspondences along the image rows. First, we tested the algorithms using the teddy and the first wood images from the Middlebury dataset [7, 15], which were converted to 8 bit gray scale as depicted in figures 2 and 3. Then we added gaussian noise with zero mean and a variance of 0.002 to the datasets to analyze the algorithms' behavior on noisy images. Finally, we used the couch and fax images proposed by Humenberger et al. [8], which were taken using bayer filtered color cameras and converted to 8 bit gray scale as depicted in figures 4 and 5.
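The left/right consistency check used above can be sketched in one dimension; this illustrative version (names and the toy disparity rows are assumptions) keeps a left-image disparity only when the right image's disparity at the matched column agrees within the allowed deviation:

```python
def lr_consistency(disp_left, disp_right, max_dev=3):
    """1-D left/right consistency check. A left-image disparity d at
    column x maps to column x - d in the right image; the pixel is kept
    only if the right disparity there agrees within 'max_dev' levels,
    otherwise it is marked invalid (-1). This removes occluded and
    mismatched areas."""
    out = []
    for x, d in enumerate(disp_left):
        xr = x - d
        if 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= max_dev:
            out.append(d)
        else:
            out.append(-1)
    return out

# A consistent pixel survives; a mismatched one is invalidated.
print(lr_consistency([0, 0, 5, 0], [0, 0, 0, 0]))  # -> [0, 0, -1, 0]
```

The paper's configuration corresponds to `max_dev=3`, with correct matches then counted against ground truth within one disparity level.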

These two image sets are especially interesting for our purpose because they contain two typical situations service robots have to cope with: the illumination is moderate, coming from multiple sources on the ceiling, and there is only sparse texture in the images, a typical situation in an office building. The images from the Middlebury dataset are thus not very suitable for testing stereo matching algorithms targeted at service robots, but they are part of the analysis for the completeness of our evaluation. For all these stereo images, we tested different block sizes for the transforms, ranging from 7 × 7 up to 19 × 19. After the computation of the matching costs we used a 3 × 3 aggregation to improve the results. For the evaluation of the resource usage, we synthesized all algorithms for an Altera Stratix EP2S180, a high end FPGA offering 180,000 logic elements. We selected an image size of 450 × 375, a disparity range of 100 pixels, and again block sizes ranging from 7 × 7 up to 19 × 19. Our FPGA based stereo matching architecture was configured to compute 10 disparity levels per calculation round, which results in 16 ms processing time per image, or a frame rate of 62.5 fps, at a system frequency of 100 MHz. To avoid border effects, the disparity was not calculated for the first 100 pixels, since there would not be sufficient image information available in the secondary image to perform the matching over the whole disparity range. Hence these image areas are black and disregarded in the further evaluation.

3.3. Results

The measured correct matches for the teddy and wood datasets are depicted in figures 6 and 7, the results using gaussian noise in figures 8 and 9. The results for the couch and the fax images are presented in figures 10 and 11.

Figure 2. Teddy images from the Middlebury dataset. Top: Camera images; Bottom: Ground truth image.

Figure 3. Wood images from the Middlebury dataset. Top: Camera images; Bottom: Ground truth image.

Figure 4. Couch images from the dataset by Humenberger et al. Top: Camera images; Bottom: Ground truth image.

Figure 5. Fax images from the dataset by Humenberger et al. Top: Camera images; Bottom: Ground truth image.

Figure 6. Correct matches (%) for the teddy dataset:

Block Size | Rank  | Ext. Rank | Census | Ext. Census
7x7        | 39.16 | 43.67     | 73.22  | 74.01
9x9        | 42.17 | 44.73     | 76.15  | 76.13
11x11      | 43.88 | 44.86     | 76.98  | 76.06
13x13      | 44.82 | 44.76     | 76.91  | 75.34
15x15      | 45.27 | 44.27     | 76.24  | 74.34
17x17      | 45.37 | 43.57     | 75.35  | 73.17
19x19      | 45.11 | 42.50     | 74.49  | 72.00

Figure 12 illustrates the disparity maps calculated for the couch and fax datasets after the left/right consistency check, using a block size of 19 × 19 for the transforms. The algorithms' resource usage in terms of logic elements is depicted in figure 13 and the memory consumption in figure 14. For the extended census transform, block sizes larger than 9 × 9 did not fit into the EP2S180; we therefore present only the logic consumption estimate produced before the design was fit into a specific device. For block size 19 × 19 the synthesis tool was not able to estimate the logic consumption. The stereo matching architecture's system frequency corresponded linearly to the algorithms' block size rather than to the algorithms themselves, being about 100 MHz for 7 × 7 and 50 MHz for 19 × 19, respectively. Hence, the achievable frame rates using the stated architecture configuration varied between 62.5 and 31.25 fps.

Figure 7. Correct matches (%) for the wood dataset.

Figure 8. Correct matches (%) for the gaussian noised teddy dataset.

Figure 9. Correct matches (%) for the gaussian noised wood dataset.

Figure 10. Correct matches (%) for the couch dataset.

Figure 11. Correct matches (%) for the fax dataset.

Figure 12. Generated disparity maps using block size 19 × 19 for the transform and 3 × 3 for the aggregation (panels: Rank Transform, Extended Rank Transform, Census Transform, Extended Census Transform). Left: Couch dataset; Right: Fax dataset.

Figure 13. Required logic elements of the algorithms.

Figure 14. Required memory bits of the algorithms.

3.4. Discussion

The results from the teddy and wood datasets do not show improved results for the extended versions of the rank and the census transform; the number of correct matches actually decreases slightly. When gaussian noise is added to both datasets, the advantages of the extended rank transform over the original become apparent. For the extended census transform this is true only for the teddy dataset at block sizes of 11 × 11 and above, while the results for the wood dataset show an advantage for the original census transform at all block sizes.

However, on the couch and fax datasets from Humenberger et al., which exhibit a different kind of noise caused by the bayer filter and suboptimal lighting conditions as in real indoor scenes, the advantages of the extended transforms are striking. For the extended rank transform, the number of correct matches increases by between 40% and 140% compared to the original, while the extended census transform shows a lower increase of between 8% and 24%. The disparity maps depicted in figure 12 illustrate the remarkable increase in the disparity maps' density for the extended transforms.

The increase in required logic elements of the extended rank transform is on the same scale as the increase in quality for the bayer patterned images. The memory consumption behaves likewise. Therefore, the extended rank transform can narrow the gap between the rank and the census transform, both in robustness and in resource requirements. The extended census transform shows an extensive increase in logic elements compared to the original census transform. This is caused by the increased size of the transformed vectors, which have to be stored in the FPGA's registers for the calculation of the next pixel's matching costs. Taking this into consideration, it might be more advisable to use an architecture that recalculates the transformed image pixels for each matching procedure rather than storing the transformed values, when the higher robustness of the extended census transform is demanded.

The results also show that the increased robustness of the extended transforms can also be achieved by using bigger block sizes, but as mentioned before these increased block sizes lead to deformations of the object edges. Extracting the objects' shapes from the generated disparity maps can thus be a tough task for the object classification algorithms when an increased block size is used.

4. Conclusions

Using real world images rather than images taken under perfect conditions in a controlled environment significantly affects the robustness of the rank and the census transform. Extending these two transforms by using the absolute differences instead of the original comparison function greatly improves their robustness on gray scaled images taken with bayer filtered cameras. Our results show that, for this kind of images, the extended rank transform narrows the gap between the census and the rank transform, both in robustness and in resource usage, when implemented on FPGAs. The extended census transform offers the opportunity to improve the census transform's quality when logic consumption is a subordinate issue.

References

[1] G. Adorni, M. Mordonini, S. Cagnoni, and A. Sgorbissa. Omnidirectional Stereo Systems for Robot Navigation. In Proc. of the IEEE Workshop on Omnidirectional Vision and Camera Networks, pages 79–89, 2003.
[2] K. Ambrosch, M. Humenberger, W. Kubinger, and A. Steininger. SAD Based Stereo Matching Using FPGAs. Chapter of Embedded Computer Vision, Springer Verlag, to appear in 2008.
[3] P. Corke and P. Dunn. Real-Time Stereopsis Using FPGAs. In Proc. of the IEEE Conference on Speech and Image Technologies for Computing and Telecommunications, 1997.
[4] P. Corke, P. Dunn, and J. E. Banks. Frame-rate stereopsis using non-parametric transforms and programmable logic. In Proc. of the IEEE Conference on Speech and Image Technologies for Computing and Telecommunications, 1997.
[5] B. Froeba and A. Ernst. Face detection with the modified census transform. In Proc. of the Sixth IEEE Conference on Automatic Face and Gesture Recognition, 2004.
[6] H. Hirschmueller, P. R. Innocent, and J. Garibaldi. Real-time correlation-based stereo vision with reduced border errors. International Journal of Computer Vision, 47(1).
[7] H. Hirschmueller and D. Scharstein. Evaluation of Cost Functions for Stereo Matching. In Proc. of the 2007 Conference on Computer Vision and Pattern Recognition, 2007.
[8] M. Humenberger, D. Hartermann, and W. Kubinger. Evaluation of Stereo Matching Systems for Real World Applications Using Structured Light for Ground Truth Estimation. In Proc. of the IAPR Conference on Machine Vision and Applications, 2007.
[9] T. Kanade, A. Yoshida, K. Oda, H. Kano, and M. Tanaka. A Stereo Machine for Video-rate Dense Depth Mapping and Its New Applications. In Proc. of the 15th Conference on Computer Vision and Pattern Recognition, 1996.
[10] M. Kuhn, S. Moser, O. Isler, F. K. Gurkaynak, A. Burg, N. Felber, H. Kaeslin, and W. Fichtner. Efficient ASIC implementation of a real-time depth mapping stereo vision system. In Proc. of the 46th IEEE International Midwest Symposium on Circuits and Systems, 2004.
[11] D. Murray and J. Little. Using real-time stereo vision for mobile robot navigation. Autonomous Robots, 8:161–171, 2000.
[12] A. Naoulou, J.-L. Boizard, J. Y. Fourniols, and M. Devy. A 3D real-time vision system based on passive stereo vision algorithms: Application to laparoscopic surgical manipulations. In Proc. of the International Conference on Information and Communication Technologies, 2006.
[13] R. B. Porter and N. W. Bergmann. A generic implementation framework for FPGA based stereo matching. In Proc. of the IEEE Region 10 Annual Conference on Speech and Image Technologies for Computing and Telecommunications, 1997.
[14] P. J. Rajda. Optimization of Logic Use on Stereo Vision Algorithm Example. In Proc. of the 9th IEEE Symposium on FPGAs for Custom Computing Machines, 2001.
[15] D. Scharstein and R. Szeliski. High-Accuracy Stereo Depth Maps Using Structured Light. In Proc. of the 2003 Conference on Computer Vision and Pattern Recognition, 2003.
[16] R. Sheh, N. Jamali, M. W. Kadous, and C. Sammut. A Low-Cost, Compact, Lightweight 3D Range Sensor. In Proc. of the Australian Conference on Robotics and Automation, 2006.
[17] J. I. Woodfill, G. Gordon, and R. Buck. The Tyzx DeepSea High Speed Stereo Vision System. In Proc. of the 2004 Conference on Computer Vision and Pattern Recognition Workshops, 2004.
[18] J. I. Woodfill and B. Von Herzen. Real-time stereo vision on the PARTS reconfigurable computer. In Proc. of the 5th IEEE Symposium on FPGAs for Custom Computing Machines, 1997.
[19] Z. Yan, X. Xiaodong, P. Xuejun, and W. Wei. Mobile Robot Indoor Navigation Using Laser Range Finder and Monocular Vision. In Proc. of the IEEE International Conference on Robotics, Intelligent Systems and Signal Processing, pages 77–82, 2003.
[20] S.-Y. Yi and B.-W. Choi. Autonomous navigation of indoor mobile robots using a global ultrasonic system. Robotica, 22(4):369–374, 2004.
[21] R. Zabih and J. I. Woodfill. Non-parametric local transforms for computing visual correspondence. In Proc. of the 3rd European Conference on Computer Vision, 1994.
[22] Z. Zhang. Determining the epipolar geometry and its uncertainty: A review. International Journal of Computer Vision, 27(2):161–195, 1998.
