Real-time geometric lens distortion correction using a graphics processing unit

Sam Van der Jeught, Jan A. N. Buytaert, and Joris J. J. Dirckx
University of Antwerp, Laboratory of Biomedical Physics, Groenenborgerlaan 171, B-2020 Antwerp, Belgium
E-mail: [email protected]

Optical Engineering 51(2), 027002 (February 2012)
Abstract. Optical imaging systems often suffer from distortion artifacts which impose important limitations on the direct interpretation of the images. It is possible to correct for these aberrations through image processing, but due to their calculation-intensive nature, the required corrections are typically performed offline. However, with image-based applications that operate interactively, real-time correction of geometric distortion artifacts can be vital. We propose a new method to generate undistorted images by implementing the required distortion correction algorithm on a commercial graphics processing unit (GPU), distributing the necessary calculations to many stream processors that operate in parallel. The proposed technique is not limited to affine lens distortions but allows for the correction of arbitrary geometric image distortion artifacts through individual pixel resampling at display rates of more than 30 frames per second for fully processed images (1024 × 768 pixels). Our method enables real-time GPU-based geometric lens distortion correction without the need for additional digital image processing hardware. © 2012 Society of Photo-Optical Instrumentation Engineers (SPIE). [DOI: 10.1117/1.OE.51.2.027002]
Subject terms: graphics processing unit; distortion correction. Paper 111207 received Sep. 29, 2011; revised manuscript received Dec. 16, 2011; accepted for publication Dec. 22, 2011; published online Feb. 29, 2012.
1 Introduction

Optical lenses can introduce a wide variety of geometric distortion artifacts to an imaging system, which reduce the perceived quality of the resulting images. Especially in imaging systems with a wide viewing angle, such as endoscopes, image distortion can be very pronounced. Local variations in the index of refraction of the medium through which an object is observed will also cause image distortions. Such distortions cannot be described by a simple affine transformation. To correct for these visual aberrations, numerous recalibration algorithms have been described1,2 and software that remaps the produced image pixels accordingly is widely available.3 The recalibration process typically requires more processing power than modern central processing units (CPUs) can deliver when applied in real-time, and the lens distortion correction process is therefore often performed offline. However, in many applications in medical imaging, computer vision, and industrial monitoring, interaction with the user or environment requires real-time production of undistorted images to enable the accurate perception of distance and shape.4 To this end, a distortion correction method was described5 that effectively exploits the capability of standard graphics processing units (GPUs) to reposition triangle vertex coordinates at high speeds. This method can only correct for the dominant radial distortion effects (also known as barrel or pincushion distortion), as it relies on a general affine transformation formula and does not permit pixel-by-pixel resampling. To enable pixel-by-pixel resampling at video rates, special-purpose hardware solutions based on field programmable gate arrays (FPGAs)
and frame grabbers have been reported,6,7 but this kind of hardware is typically expensive and inflexible. In this paper, we propose a GPU-based individual pixel resampling mechanism that enables fully undistorted images to be displayed in real-time. Such a solution is superior to the previously mentioned methods, as it combines their advantages in a simple distortion correction scheme that is able to deal with arbitrary pixel transformations without the need for additional processing hardware.

Modern GPUs are specialized coprocessors that relieve the CPU of most graphics rendering tasks. The highly parallel nature of most visual imaging tasks, such as geometric manipulations, lighting, and shading, is effectively exploited by the GPU architecture, as the GPU chip generally contains a large number of stream processors that are optimized to perform certain primitive operations in parallel. In addition, current GPUs are provided with texture memory, which is usually implemented as specialized RAM. This type of digital storage is designed to accelerate frequently performed graphical operations, such as the mapping of 2-D skins onto 3-D polygonal models,8 by equipping the texture elements (texels) with a hard-wired linear and nearest-neighbor interpolation mechanism. By loading incoming images into this texture memory region, floating-point interpolations on these images are effectively reduced to single memory read-outs. After calculating the coordinate locations at which incoming images need to be interpolated to eliminate certain distortion artifacts, the hard-wired interpolation mechanism of the GPU allows undistorted images to be generated in real-time. In the following sections, the full GPU-based recalibration process is described in detail for arbitrary distortions.
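To make the texture mechanism concrete, the following minimal sketch (our illustration, not the authors' code) uploads a single-channel image into CUDA texture memory with hardware bilinear filtering enabled. It is written against the current texture-object API rather than the texture references of the 2011-era toolkit, and all names are ours:

```cuda
#include <cuda_runtime.h>

// Upload a single-channel float image into a cudaArray and wrap it in a
// texture object with hard-wired bilinear filtering enabled.
cudaTextureObject_t makeBilinearTexture(const float* hostImage, int w, int h)
{
    cudaArray_t arr;
    cudaChannelFormatDesc fmt = cudaCreateChannelDesc<float>();
    cudaMallocArray(&arr, &fmt, w, h);
    cudaMemcpy2DToArray(arr, 0, 0, hostImage, w * sizeof(float),
                        w * sizeof(float), h, cudaMemcpyHostToDevice);

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc tex = {};
    tex.filterMode = cudaFilterModeLinear;     // the hard-wired interpolation
    tex.addressMode[0] = cudaAddressModeClamp;
    tex.addressMode[1] = cudaAddressModeClamp;
    tex.readMode = cudaReadModeElementType;

    cudaTextureObject_t texObj = 0;
    cudaCreateTextureObject(&texObj, &res, &tex, nullptr);
    return texObj;
}

// Any floating-point read now costs a single fetch: the texture unit blends
// the four surrounding texels itself (the 0.5f offsets address texel centers).
__global__ void fetchDemo(cudaTextureObject_t tex, float* out, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h)
        out[y * w + x] = tex2D<float>(tex, x + 0.5f, y + 0.5f);
}
```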
2 Theory
2.1 Classical Approach

Geometric distortion artifacts can be detected through a variety of widely available recalibration algorithms. To correct for arbitrary distortion artifacts that may be present in the optical system, a recalibration transformation is applied to the integer pixel locations $(i, j) \in \mathbb{N} \times \mathbb{N}$ of an incoming distorted image $I_d$,

$$I_d(i, j) = I_{i,j}, \qquad (1)$$

as illustrated in step 1 of Fig. 1, where $I_{i,j}$ represents the grayscale intensity value of the image at pixel location $(i, j)$. In this way, the distorted integer pixel coordinates $(i, j)$ are effectively relocated to their corresponding undistorted floating-point locations $(x_i, y_j) \in \mathbb{R} \times \mathbb{R}$, yielding the undistorted intensity distribution

$$I_u(x_i, y_j) = I_{i,j}, \qquad (2)$$

in which the intensity values $I_{i,j}$ are now bound to a coordinate system of floating-point coordinates $(x_i, y_j)$. Obviously, these floating-point coordinates do not correspond fully with the integer matrix coordinates of the display device. To display the undistorted image on-screen, the image pixels need to be evaluated at equally spaced pixel locations. The most straightforward approach to obtain the undistorted and evenly sampled intensity matrix $I'_{k,l}$ would be to interpolate the scattered, undistorted intensity distribution $I_u$ at the integer locations $(k, l)$ that correspond to the pixel locations of the displayed image,

$$I'_{k,l} = I_u(k, l), \qquad (3)$$

as shown in step 2*. However, scattered data interpolation is very time consuming and would limit the video rate of displayed images dramatically.
2.2 Novel Alternative Approach

We propose a different approach in which the Cartesian coordinate values $i$ and $j$ of the original image grid are first plotted onto the scattered floating-point data grids $X$ and $Y$ in both image dimensions, with

$$X(x_i, y_j) = i, \quad Y(x_i, y_j) = j, \qquad (4)$$

after which a 2-D scattered data interpolation is performed at equally spaced intervals $k$ and $l$ in each image direction:

$$x'_{k,l} = X(k, l), \quad y'_{k,l} = Y(k, l). \qquad (5)$$

This 2-D scattered data interpolation creates an interpolant that fits a surface of the form $V = F(X)$ to the scattered data by a Delaunay triangulation of $X$. In this way, step 2 or Eqs. (4) and (5) effectively create a recalibration map of floating-point coordinates $(x'_{k,l}, y'_{k,l})$ at which the image needs to be evaluated to compensate for the present distortion. By positioning the original integer pixel values $i$ and $j$ onto the grid of distortion-corrected floating-point coordinates $(x_i, y_j)$, we obtain two 2-D discrete functions, one for each image dimension. By evaluating these functions at the predefined locations $(k, l)$, we obtain a floating-point value $x'_{k,l}$ for the first image dimension and a floating-point value $y'_{k,l}$ for the second image dimension. The floating-point coordinate pair $(x'_{k,l}, y'_{k,l})$ effectively represents the position in the incoming distorted image $I_d$ where we should interpolate to obtain the undistorted intensity value for pixel $(k, l)$. After making the one-time calculational investment of interpolating the original grid coordinates at integer locations that lie between the scattered data points $(x_i, y_j)$, we can replace the scattered data interpolation on the intensity values, which was required in step 2* for each processing cycle, by a much faster regular 2-D interpolation on a Cartesian pixel grid. Note that by changing the sampling intervals at which $X$ and $Y$ are interpolated in Eq. (5), one can easily alter the final resolution of the displayed image. After the interpolation coordinates have been determined and stored, the problem of removing the present distortion on an incoming image is reduced to step 3,

$$I'_{k,l} = I_d(x'_{k,l}, y'_{k,l}), \qquad (6)$$
Fig. 1 Digital signal processing schematic in which two possible ways of correcting for geometric lens distortion are illustrated. After applying an arbitrary correction map to the pixel coordinates (step 1), the direct approach to display the undistorted image is to perform scattered data interpolations at a Cartesian grid of integer locations (step 2*). However, this method is highly calculation-intensive and cannot be executed in real-time. We propose to first calculate the coordinate values at which interpolation of the original image results in an undistorted image. This is achieved by plotting the Cartesian grid onto the scattered data grid in both image dimensions (step 2) and performing a single offline scattered data interpolation. After this, a regular 2-D interpolation on a Cartesian grid (step 3) suffices to undistort an incoming image.
in which the incoming distorted image $I_d$ is interpolated in 2-D at the predefined floating-point locations $(x'_{k,l}, y'_{k,l})$, and in which the distortion-corrected intensity values $I'_{k,l}$ are now stored on an integer grid of size $k \times l$.
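Expressed as a CUDA kernel, step 3 might look as follows. This is an illustrative sketch, not the published implementation; the names mapX and mapY are our own and are assumed to hold the precomputed coordinates of Eq. (5), already resident in GPU memory:

```cuda
#include <cuda_runtime.h>

// One thread per output pixel (k, l): each thread performs a single
// hardware-interpolated texture fetch at its precomputed coordinate.
__global__ void undistortKernel(cudaTextureObject_t distorted,
                                const float* mapX, const float* mapY,
                                float* corrected, int width, int height)
{
    int k = blockIdx.x * blockDim.x + threadIdx.x;   // output column
    int l = blockIdx.y * blockDim.y + threadIdx.y;   // output row
    if (k >= width || l >= height) return;

    int idx = l * width + k;
    // Eq. (6): I'(k,l) = I_d(x'(k,l), y'(k,l)). The texture unit performs
    // the 2-D interpolation itself; 0.5f offsets address texel centers.
    corrected[idx] = tex2D<float>(distorted,
                                  mapX[idx] + 0.5f, mapY[idx] + 0.5f);
}
```

Launched with, for example, 16 × 16 thread blocks on a 64 × 48 block grid for a 1024 × 768 image, the entire correction then reduces to one hardware-interpolated memory read-out per output pixel.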
3 Results and Discussion

3.1 Radial Lens Distortion

Radial distortions are a common problem in imaging systems, such as webcams or endoscopes using wide-angle lenses. Real-time image correction can be important in such systems, especially if one needs to judge distance and object sizes in the obtained images. Our method can be applied to any kind of distortion, but we will demonstrate it using radial distortions, so that the results can be compared with analytical correction. We first demonstrate our method using an image of a straight-line grid taken through an endoscope. Our full undistortion procedure will be used to correct live endoscope images that contain radial barrel distortion. The customized GPU program was written in NVIDIA's dedicated software environment, the compute unified device architecture (CUDA),9 compiled using Microsoft Visual Studio 2008, and tested with a commercial GeForce GTX 570 on a PC with four Intel i5 processing cores. The graphical user interface was created in LabVIEW 2010, into which the GPU-based algorithm was imported through dynamic link libraries (DLLs). The incoming images of 1024 × 768 pixels were fully processed and displayed at 30.2 frames per second. The effect of the processing scheme is shown in Fig. 2, in which a distorted and an undistorted still frame from a live endoscopic video feed are shown in Figs. 2(a) and 2(b), respectively. By applying the correction algorithm to the incoming distorted images, the barrel distortion that is present in the unprocessed image [Fig. 2(a)] is clearly removed in the processed image [Fig. 2(b)]. Maintaining the maximum resolution and display rate achievable by our digital camera (Foculus FO323TC, NET GmbH, Finning, Germany), we were able to process and display distortion-corrected images in real-time.

The interpolation algorithm used in the distortion correction is crucial for both the performance and the image quality of the optical imaging setup, as it determines the level of accuracy that is achieved in calculating the correct pixel values. Three interpolation methods of varying accuracy were implemented on the GPU: zero-order (nearest-neighbor), first-order (linear), and third-order (cubic) B-spline interpolation. Using CUDA, the first two methods were directly hard-wired through texture memory. The third is most commonly achieved by calculating the set of basis functions $\beta^n$, represented in 1-D by

$$\beta^n(x) = \sum_{k=0}^{n+1} \frac{(-1)^k (n+1)}{(n+1-k)!\,k!} \left( \frac{n+1}{2} + x - k \right)_+^n, \qquad \forall n \in \mathbb{N}, \; \forall x \in \mathbb{R}, \qquad (7)$$

with $(x)_+ \equiv \max(0, x)$,
from which an array of look-up table values can be created. However, to exploit the hard-wired linear interpolation capabilities of the GPU, the use of look-up tables can be replaced by a set of linear texture interpolation calculations that compute the required cubic interpolation coefficients on the fly.10

Fig. 2 Endoscopic images of a tilted sample grid in which the effect of the distortion correction scheme can be seen. (a) An unprocessed, barrel-distorted image. (b) The corresponding distortion-corrected image. The tilted orientation of the grid can be perceived much more clearly when the image is distortion-corrected. The line-to-line distance is 1 mm.

The resulting interpolation algorithms were benchmarked at several resolutions and compared with native LabVIEW 2010 code. A comparison of image quality is presented in Fig. 3, where an enlarged detail is shown of the corrected images of the test grid, processed with different interpolation algorithms. Figure 3(a) represents standard CPU-processed linear interpolation; Figs. 3(b)-3(d) illustrate the result of increasingly accurate GPU-based interpolation: Fig. 3(b) was processed using nearest-neighbor interpolation, Fig. 3(c) using linear interpolation, and Fig. 3(d) using cubic spline interpolation. There is a clear difference between Fig. 3(b) and the other methods, but although it is slightly smoother, no real improvement can be noticed for the cubic spline interpolation method when compared with its linear counterparts. In addition, no difference in image quality can be noticed between the corresponding CPU-based [Fig. 3(a)] and GPU-based [Fig. 3(c)] linearly interpolated images.

Next, we benchmarked these recalibration algorithms by measuring the total processing times needed to complete a single distortion correction cycle. Although the undistortion scheme that we propose requires the offline recalibration of interpolation coordinates, these calculations need to be made only once for a given setup. The scattered data interpolation that is needed to find these coordinates can take several minutes, but it is not included in the benchmarked processing times, as it can easily be done offline before processing. In addition, note that with the presented distortion correction algorithm the type of distortion (barrel, pincushion, or arbitrary) does not influence the respective processing times of the interpolation methods. Figure 4 shows the theoretically maximal display rates for fully processed images, based on these processing times. For the GPU-based algorithms, these include the time required to transfer the data to the GPU memory and back. Note that these are the theoretical frames per second achievable with our setup, but that the effective display rate of processed images is limited to 30.2 frames per second by the employed camera. Whereas the CPU-based algorithm can only deliver acceptable display rates at low image resolutions, all three GPU-based algorithms can complete a full cycle in less than 33 ms for images of 1024 × 768 pixels, making the camera frame rate the limiting factor.
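For reference, the decomposition of Ref. 10 can be sketched in 1-D as follows. This is our hedged reconstruction of the technique, not the paper's code; the 2-D case used in this work combines four bilinear fetches in the same manner:

```cuda
#include <cuda_runtime.h>

// Cubic B-spline interpolation at position x via two hardware linear
// fetches (Ref. 10), instead of four reads plus a look-up table.
// x follows the tex1D convention: texel i is centered at i + 0.5.
__device__ float cubicTex1D(cudaTextureObject_t tex, float x)
{
    const float coord = x - 0.5f;
    const float index = floorf(coord);
    const float a = coord - index;       // fractional offset in [0, 1)

    // Cubic B-spline weights: Eq. (7) with n = 3, evaluated at the four
    // integer neighbors of the sampling position.
    const float w0 = (1.0f - a) * (1.0f - a) * (1.0f - a) / 6.0f;
    const float w1 = (3.0f * a * a * a - 6.0f * a * a + 4.0f) / 6.0f;
    const float w2 = (-3.0f * a * a * a + 3.0f * a * a + 3.0f * a + 1.0f) / 6.0f;
    const float w3 = a * a * a / 6.0f;

    // Fold the four weights into two linear fetches: h0 blends texels
    // (index-1, index), h1 blends (index+1, index+2), with the texture
    // unit applying exactly the required blending ratios.
    const float g0 = w0 + w1;
    const float g1 = w2 + w3;
    const float h0 = index - 0.5f + w1 / g0;
    const float h1 = index + 1.5f + w3 / g1;

    return g0 * tex1D<float>(tex, h0) + g1 * tex1D<float>(tex, h1);
}
```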
Fig. 3 Detail of corrected images of sample grid lines that were processed with different interpolation mechanisms. (a) was processed on the CPU using linear interpolation; (b), (c), and (d) were processed on the GPU using nearest-neighbor, linear, and cubic spline interpolation, respectively.
Fig. 4 Comparison of the maximally achievable display rates of fully processed, distortion-corrected images for our setup, for different recalibration algorithms benchmarked at several pixel resolutions.
Even for 1024 × 1024 pixel images, full real-time display of corrected images was obtained. We also note that the processing times for nearest-neighbor and linear interpolation on the GPU coincide completely, as both are hard-wired through texture memory and both require the processing time of a single memory read-out.

3.2 Arbitrary Distortion

Given a correction map that was obtained with general calibration software and applying the previously described recalibration procedure, we were able to correct for arbitrary geometric lens distortion through regular 2-D interpolation. By implementing this interpolation mechanism on the GPU, we performed this correction in real-time. To highlight the capability of the proposed real-time distortion correction algorithm to correct not solely for affine pixel transformations, as was the case in the previous section, but also for arbitrary distortion artifacts, we fabricated a randomly curved projection plane onto which a movie scene was projected. To demonstrate the real-time character of our method, the camera observed this movie and corrected for the present geometric distortion on the fly. First, the resulting image was calibrated using a pattern of equidistant dots, which were relocated to various positions according to the camera's point of view, depending on the local curvature of the projection plane. Second, image feature extraction software, based on pixel thresholding and center-of-mass
Fig. 5 The distortion correction algorithm applied to arbitrarily distorted images. The incoming, distorted images (a) and (c) are distortion-corrected in images (b) and (d), respectively. The full distorted and distortion-corrected videos can be viewed at the website of the Laboratory of Biomedical Physics, University of Antwerp (http://www.ua.ac.be/main.aspx?c=.BIMEF&n=97071).
calculations, was used to determine the new positions of the deformed dots. By calculating the respective displacements between the centers of the deformed dots and the centers of the original Cartesian dots, a partial recalibration map was effectively created. A full-field recalibration map of the desired image size was then constructed by interpolating between the dots (see the sketch after this paragraph). After calculating the recalibration map, the movie was distortion-corrected in real-time using the previously described cubic interpolation procedure. The effect of recalibrating the distorted grid [Fig. 5(a)] to a uniform grid of equidistant dots that covers the entire image window [Fig. 5(b)] can be seen in Fig. 5. Corresponding still frames from the distorted and the undistorted video files are included in Figs. 5(c) and 5(d), respectively. Both have a resolution of 1024 × 768 pixels. The incoming stream of distorted images was processed and displayed at a rate of more than 30 frames per second.
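The paper does not specify how the sparse dot map is densified; the following host-side sketch assumes simple bilinear interpolation between the four surrounding dot centers, with all names ours. dotX/dotY hold the measured source coordinates of each dot on a gridW × gridH lattice (15 × 9 in our measurements); mapX/mapY receive one coordinate per output pixel:

```cuda
#include <algorithm>

// Expand a sparse recalibration map, known only at the calibration-dot
// centers, to a full-field per-pixel map by bilinear interpolation.
void buildFullFieldMap(const float* dotX, const float* dotY,
                       int gridW, int gridH,
                       float* mapX, float* mapY, int imgW, int imgH)
{
    for (int l = 0; l < imgH; ++l) {
        for (int k = 0; k < imgW; ++k) {
            // Position of output pixel (k, l) in calibration-grid units.
            float gx = k * (gridW - 1) / float(imgW - 1);
            float gy = l * (gridH - 1) / float(imgH - 1);
            int i0 = std::min(int(gx), gridW - 2);
            int j0 = std::min(int(gy), gridH - 2);
            float fx = gx - i0, fy = gy - j0;

            // Bilinear blend of the four surrounding dot coordinates.
            auto blend = [&](const float* d) {
                float top = (1 - fx) * d[j0 * gridW + i0]
                          + fx * d[j0 * gridW + i0 + 1];
                float bot = (1 - fx) * d[(j0 + 1) * gridW + i0]
                          + fx * d[(j0 + 1) * gridW + i0 + 1];
                return (1 - fy) * top + fy * bot;
            };
            mapX[l * imgW + k] = blend(dotX);
            mapY[l * imgW + k] = blend(dotY);
        }
    }
}
```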
3.3 Quantitative Results

To determine quantitatively the level of accuracy provided by the presented GPU-based algorithms, we compared the pixel brightness values of the distortion-corrected images with those of the undistorted image. The difference in curvature of the distorted and the undistorted image planes causes them to reflect the incoming light in a different way, resulting in local brightness differences. To compensate for these different intensity modulations, we projected a uniformly white image onto both the distorted and the undistorted plane, and then recorded the respective intensity modulation maps. Based on these maps, an intensity calibration was applied to both the distorted and the undistorted recordings of projected source images. The 2-D root mean square (RMS) of the matrix of numeric pixel brightness differences $d$ between the distortion-corrected source $s$ and target $t$ image is a quantitative indicator of the numeric accuracy of the deformation correction algorithm, and is defined as

$$\mathrm{RMS}(s, t) = \frac{1}{\sqrt{n}} \sqrt{\sum_{n} d(s, t)^2}, \qquad (8)$$
with $n$ the number of pixels in the image. First, a standard test image (Lena, 1024 × 768 pixels) with a high level of detail was used as the source image. The RMS of the difference between the target and the distortion-corrected source images was calculated for the CPU-based linear and the GPU-based nearest-neighbor, linear, and cubic interpolation algorithms. Second, identical measurements were performed on the collection of still frames from the previously mentioned projected scene, with the RMS of the differences calculated and averaged over 120 frames. A grid of 15 × 9 calibration dots was employed in both measurements. The results of these experiments, which were obtained using the aforementioned optical setup, are given in Table 1. The RMS results show that a grid of 15 × 9 calibration dots suffices to correct accurately for the present geometric distortion. They also confirm the visually apparent conclusion that higher-order interpolation yields higher correction accuracy. Finally, they underline the similar accuracy of the CPU-based and GPU-based interpolation methods.
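Eq. (8) amounts to a single accumulation over all pixels. A host-side helper might read as follows (illustrative only, with hypothetical names):

```cuda
#include <cmath>
#include <cstddef>

// RMS of the per-pixel brightness differences between the distortion-
// corrected source image s and the target image t, per Eq. (8).
double rmsDifference(const float* s, const float* t, std::size_t n)
{
    double sum = 0.0;
    for (std::size_t p = 0; p < n; ++p) {
        double d = double(s[p]) - double(t[p]);  // d(s, t) at pixel p
        sum += d * d;
    }
    return std::sqrt(sum / double(n));           // RMS over all n pixels
}
```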
Table 1 RMS values of the matrices that contain the difference in pixel brightness values between the target image and the distortion-corrected source images, for both a standard test image (Lena) and a projected scene of 120 still frames for which the RMS results were averaged.

RMS difference between source and target image | Lena (12-bit gray values) | Lena (%) | Projected scene (12-bit gray values) | Projected scene (%)
CPU linear  | 131 | 3.3 | 103 | 2.6
GPU nearest | 172 | 4.3 | 121 | 3.0
GPU linear  | 131 | 3.3 | 103 | 2.6
GPU cubic   | 124 | 3.1 |  99 | 2.5
Noise level |  35 | 0.9 |  32 | 0.8
4 Conclusion

A geometric image distortion correction method was proposed in which predefined recalibration coordinates are offloaded to the texture memory of a commercial GPU. By using this hard-wired GPU functionality, arbitrary distortion artifacts can be eliminated in real-time. The proposed distortion correction method enables real-time video-rate (30.2 Hz) processing and display of geometric lens distortion corrected images at sizes up to 1024 × 768 pixels using a low-cost graphics card. An accuracy analysis of the employed distortion correction algorithms was performed.

Acknowledgments

We acknowledge the financial support that was given in the form of a PhD fellowship of the Research Foundation, Flanders.

References

1. W. E. Smith, N. Vakil, and S. A. Maislin, "Correction of distortion in endoscope images," IEEE Trans. Med. Imag. 11(1), 117–122 (1992).
2. J. Weng, P. Cohen, and M. Herniou, "Camera calibration with distortion models and accuracy evaluation," IEEE Trans. Pattern Anal. Mach. Intell. 14(10), 965–980 (1992).
3. Camera Calibration Toolbox for Matlab, http://www.vision.caltech.edu/bouguetj/calib_doc/ (2005).
4. A. J. Moore et al., "Closed-loop phase stepping in a calibrated fiber-optic fringe projector for shape measurement," Appl. Opt. 41(16), 3348–3354 (2002).
5. M. R. Bax, Real-Time Lens Distortion Correction: 3D Video Graphics Cards Are Good for More than Games, pp. 9–13, Image, Rochester, NY (2002).
6. K. T. Gribbon, C. T. Johnston, and D. G. Bailey, "A real-time FPGA implementation of a barrel distortion correction algorithm with bilinear interpolation," in Proc. Image and Vision Computing, pp. 408–413, New Zealand (2003).
7. J. P. Helferty et al., "Video endoscopic distortion correction and its application to virtual guidance of endoscopy," IEEE Trans. Med. Imag. 20(7), 605–617 (2001).
8. D. Debry et al., "Painting and rendering textures on unparameterized models," ACM Trans. Graph. 21(3), 763–768 (2002).
9. CUDA Zone, http://www.nvidia.com/object/cuda_home_new.html (2006).
10. D. Ruijters, B. M. ter Haar Romeny, and P. Suetens, "Accuracy of GPU-based B-spline evaluation," in Proc. 10th IASTED Int. Conf. Computer Graphics and Imaging, pp. 117–122 (2008).

Sam Van der Jeught received his master's degree in physics in 2010 from the University of Antwerp, Belgium, where he researched new ways of accelerating the digital signal processing algorithms involved in optical coherence tomography in a joint collaboration between the Laboratory of Biomedical Physics (BIMEF) and the University of Kent, UK. As a Marie Curie fellow, he worked at the Applied Optics Group (AOG) at the University of Kent for a period of ten months. The research that led to his thesis was awarded the 2010 Barco High Tech Award. Currently, he is working as a PhD student at the University of Antwerp, investigating new optical profilometric techniques that allow the accurate and real-time imaging of human eardrums.

Jan Buytaert obtained his PhD in physics in April 2010, with the felicitations of the jury. Currently he is doing a post-doc at the Laboratory of Biomedical Physics of the University of Antwerp, Belgium. He is a research fellow at the Light and Lighting Laboratory in Ghent, Belgium, and a visiting research associate at the University of Kent, UK. He has won several awards: the Business-2-Student award (2005), Barco award (2005), Rosa Blanckaert award (2007), Vocation award (2009), MEMRO Stanford best poster award (2009), and 2010 BSM thesis award (2011). His main interests and expertise lie in topography, tomography, 3-D imaging, OPFOS and LSFM methods, phase-shifting, and optical metrology.

Joris J. J. Dirckx was born in Antwerp, Belgium, in 1960. He graduated in physics and in didactics at the University of Antwerp, taught physics and mathematics in high school, and then became assistant at the University of Antwerp. In 1991 he obtained the PhD in physics with the dissertation "Automated moiré topography and its use for shape and deformation measurements of the eardrum." From 1992 to 1993 he worked as scientific advisor for the government (IWT), where he assessed and audited major industrial research projects with a total project portfolio of over 10 million euros. In 1994 he returned to research and worked at the ENT department of St. Augustinus hospital as clinical audiologist, and performed research on oto-acoustic emissions and cochlear implants. In 1996 he joined the University of Antwerp as a post-doc researcher, and was appointed lecturer in physics in 2000. He became professor in 2003, and is now director of the Laboratory of Biomedical Physics and full professor in the Department of Physics. He teaches courses in general physics for pharmacy, biology, and biochemistry students, physics of optical microscopy, and courses in practical holography and biomedical imaging for physics students.