Fast Color Correction for Rapid Scanning in Uncontrolled Environment

Arnaud Schenkel, Nadine Warzée, Olivier Debeir
Laboratories of Image, Signal processing and Acoustics
Université libre de Bruxelles (U.L.B.)
Email: [email protected]

Abstract—This work addresses the problem of correctly colorizing time-of-flight 3D scans performed during rapid scanning of a site under natural, uncontrolled lighting conditions. Contrary to the usual approach of correcting the high-resolution pictures, the proposed method combines all available radiometric information to colorize the medium-resolution depth map per vertex. The resulting rendering exhibits no visible artifacts for models composed of a series of scans. Our solution produces good visual results within a reasonable time, while leaving room for various other methods to further improve colors and rendering.

I. INTRODUCTION

Accurate appearance capture of outdoor scenes, such as archaeological or architectural sites, is a complex process. Ideally, the geometry and texture information of the whole site are combined to produce a colored three-dimensional model reflecting reality. However, this involves two independent data sources: colors are captured with digital cameras, whereas geometry is captured with a 3D laser scanner. In the field, scanning an entire site requires several acquisitions from different viewpoints and is therefore an incremental process interlacing time-of-flight range image and photograph acquisitions. Color variation is thus not only due to the model facet orientation (i.e., scans from various directions) but also to temporal changes such as natural illumination variations (i.e., effects of weather or time of day). Because of these light and camera setting variations, acquired color maps of the same scene element can be very different. Without further processing, rendering a multiview model leads to a poor appearance, including visible discontinuities in the colorization. Our work aims to produce complete site scans with consistent colors and without visual artifacts (aberrant colorization, errors due to noise in the geometry, etc.). The first objective is to quickly obtain a homogenized visual rendering of large sites composed of several scans, while keeping the division of the model into scans. Our real-time method allows results to be previewed during the scanning procedure itself, in the field, and therefore allows the acquisition parameters to be fine-tuned.

II. PREVIOUS WORK

A. Images Blending

The most frequent problems identified in the literature when blending real images are blur due to registration errors, ghosting due to moving objects, and visible seams. A simple average does not solve these problems. Feathering [1], [2] (a distance-map weighted average) partially corrects color transitions, but not ghosting. To preserve details, Burt and Adelson [3] base their method on a Laplacian pyramid decomposition.

B. Multiview Colorization

To texture a mesh, one approach consists of selecting, for each part of the model, the most appropriate source. Callieri et al. [4] use redundant information to correct textures. Lensch et al. [5] merge overlapping border parts. Others, like Baumberg [6], suggest approaches considering all acquired images. To remove transitions and inconsistencies between images, [5] use a weighted average to smooth the boundary. To preserve details and avoid registration errors, Baumberg [6] uses weight maps and a multi-band 3D splining approach. In contrast to recomposition approaches, Yu et al. [7] reconstruct textures, texel by texel, with an inverse rendering process. Bernardini et al. [8] suggest a pixel-based blending method, taking into account each image and its normal map. Different weighting criteria can be considered [6], [8], [9]: orientation relative to the camera, depth, albedo, presence of obstacles, and quality degradation at edges. Callieri et al. [9] provide a solution that computes, for each image, a set of weighting masks by considering the whole geometry, in order to keep the photographed details. However, because each pixel is calculated independently of its neighbors, these approaches cannot avoid ghosting effects, possibly caused by registration errors.

C. Uncompensated Effects Correction

Consistent colorization from multiple sources often requires additional treatment to compensate for exposure or color differences. Indeed, changes in shooting conditions, lighting or settings can cause visible transitions to emerge, even when images are blended in overlapping areas. A first approach is to obtain textures free of lighting. Most such approaches compute lighting properties and light/matter interactions, such as a reflectance map or BRDF, by reversing the image formation process [10]–[12]. Currently, such a process can only be carried out in a controlled laboratory environment [11]–[13]. Love [14] bases his solution on a sky and sun model, while Yu and Malik [15] extend this idea by using photographs of the sky and surroundings. Debevec et al. [16] introduce a new light measurement device to improve results.

Fig. 1. Color problem: the two left pictures are views of independent scans of the same geometry, and the right image illustrates the exposure discrepancy between the two acquisitions when a simple fusion is used.

Alternative forms of color adjustment are necessary when the lighting conditions vary considerably. To adjust the exposure and eliminate visible color differences across several images, Uyttendaele et al. [2] attempt to obtain an exposure that is continuous in space, by performing block-by-block calculations and parameter interpolation. Agathos and Fisher [17] propose a global correction method based on the computation of an RGB transformation matrix from the overlapping areas of two images. Bannai et al. [18] extend this approach to any number of images. Xu and Mulligan [19] compare several parametric and non-parametric approaches based on similar ideas. Marschner and Greenberg [20] and Beauchesne and Roy [21] compute illumination ratios between textures to re-light a collection of images. Troccoli and Allen [22] extend this idea to handle shadows in the case of outdoor scenes. These approaches are generally not suitable for our problem, because they either require specific data or equipment, or need significant computation time to render with higher quality than necessary for our purposes. Indeed, the target is generally quality rather than processing speed.
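As a concrete illustration of the matrix-based corrections of [17], [18], the following minimal sketch fits a global 3x3 RGB transform from corresponding pixel pairs sampled in the overlap of two registered images. It is a simplified sketch, not the exact procedure of those papers; the function and variable names are ours.

import numpy as np

def fit_rgb_transform(colors_a, colors_b):
    # colors_a, colors_b: (N, 3) arrays of corresponding RGB values taken
    # from the overlap of two registered images. Returns the 3x3 matrix T
    # such that colors_b @ T.T approximates colors_a in the least-squares sense.
    X, _, _, _ = np.linalg.lstsq(colors_b, colors_a, rcond=None)  # solves B @ X ~ A
    return X.T

def apply_rgb_transform(image_b, T):
    # Apply the transform to a full (H, W, 3) image and clip to the valid range.
    corrected = image_b.reshape(-1, 3) @ T.T
    return np.clip(corrected, 0.0, 1.0).reshape(image_b.shape)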

III. DEPTH MAP COLORIZATION

The photographic image resolution is significantly higher than that of the scans. It is therefore unnecessary to consider all pixels in the point cloud colorization process. The promoted approach is based on the scan geometry and on a per-vertex colorization, instead of treating the entire high-resolution images. The proposed procedure is a stitching of the resampled and re-weighted original pictures. Fig. 1 illustrates the problem of using different color sources between two scans. One defines a weight for every range image / RGB picture combination, giving a high weight to pixels having a good view of the geometry and a low weight to pixels where the geometry is very far or, worse, hidden.

A. Preprocessing

For a consistent colorization, it is necessary to have redundant color information. A small overlap between images and between scans leads to an increased risk of visible transitions, non-uniformity in the colorization and discrepancies in color due to changes in brightness or hue between photographic acquisitions. The proposed method typically eliminates most of the visual artifacts when there is an overlap of 10% within each set of images.

Fig. 2. Principle overview (and data resolution): a. Captured depth map (1501 x 750), b. Survey’s pictures (2000 x 3008), c. Computed masks (865 x 750), d. Final colorization (1501 x 750).

However, visual artifacts (overexposure, shading) present across several sets of pictures cannot be corrected solely by this approach. To solve this problem, we use the method proposed by Uyttendaele et al. [2] to reduce differences in exposure and visible variations in color across a series of images.

B. Algorithm

Algorithm 1 describes the approach; each part is detailed next. Each scan colorization is computed independently using all available pictures (i.e., including pictures taken from other viewpoints). Maps and masks are resampled in the i, j coordinates of the depth map St. Fig. 2 gives a schematic overview of the whole process.

    Starting with a depth map St
    foreach survey picture Ik do
        Generate the 2D RGB map Pk
        Compute the quality masks Vk, Dk and Qk
        Compute the weighting mask Mk
    end
    Stitch the generated maps according to the masks
Algorithm 1: The Colorization Algorithm

C. 2D RGB Map Generation

Each picture Ik is projected on the depth map St using its intrinsic and relative extrinsic parameters, then resampled as Pk on the depth map grid. For each pixel, several values are computed and stored as the three quality masks defined hereunder.

D. Weighting Mask Computation

When several radiometric values are available for the same pixel in the depth map (e.g., due to the overlap of consecutive pictures of the same scan, or to pictures taken from another viewpoint), a weighted fusion is applied, based on the fusion algorithm described by Mertens et al. [23].

For a scan, a new mask is computed to weight the contribution of each resampled picture Pk. One defines a set of weighting masks Mk, where k = 1, ..., n indexes the RGB maps and n is the number of survey pictures. Each weighting mask Mk is constructed pixel by pixel as a combination of different quality masks:

M_k = V_k \cdot D_k \cdot Q_k \qquad (1)
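To make the weighting step concrete, the following minimal sketch (Python/NumPy, with names of our own choosing) combines the quality masks according to Eq. (1) and blends the resampled pictures with a simple normalized weighted average. The actual stitching uses a Laplacian pyramid (Section III-F); the plain average used here is closer to the feathering alternative mentioned there and only serves to keep the sketch short.

import numpy as np

def blend_scan(rgb_maps, validity, distance, visibility, eps=1e-8):
    # rgb_maps: list of (H, W, 3) resampled pictures P_k.
    # validity, distance, visibility: lists of (H, W) masks V_k, D_k, Q_k.
    num = np.zeros_like(rgb_maps[0], dtype=np.float64)
    den = np.zeros(rgb_maps[0].shape[:2], dtype=np.float64)
    for P_k, V_k, D_k, Q_k in zip(rgb_maps, validity, distance, visibility):
        M_k = V_k * D_k * Q_k              # Eq. (1): combined weighting mask
        num += P_k * M_k[..., None]        # accumulate weighted colors
        den += M_k                         # accumulate weights
    return num / (den[..., None] + eps)    # normalized per-vertex color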

E. Quality Masks Definitions

To get a good visual result, it is essential to remove visual artifacts that depend on the model geometry. It is not necessary to compute quality measures that depend directly on the image content (e.g., contrast). The method can easily be extended to take into account the weighting factors proposed by [24] or [9] to obtain a better visual result, but this requires additional computing time that is unnecessary for our application. Our measures depend only on the site geometry.

1) Validity Mask: Some points are invalid due to scanner limitations (points out of reach, totally reflective surfaces) and should not be taken into account during the blending process. A binary value retains this information:

v_{ijk} =
\begin{cases}
1 & \text{if } s_{ij} \text{ is valid as stated by the scan data} \\
0 & \text{otherwise}
\end{cases}
\qquad (2)

where s_ij is a depth map point.

2) Distance Mask: The second criterion is the distance between a surface point and the acquisition center c_k of the considered image:

d_{ijk} =
\begin{cases}
1 - \dfrac{\lVert s_{ij} - c_k \rVert}{d_{max}} & \text{if } \lVert s_{ij} - c_k \rVert \le d_{max} \\
0 & \text{otherwise}
\end{cases}
\qquad (3)

It is essential to remove pixels corresponding to distant geometry: the color range and resolution of a given area decrease as the distance increases. The ideal value of d_max depends on several factors: the characteristics of the captured site (e.g., complex geometry), the acquisition conditions (weather, light dynamics) and the technical conditions (distance between scans, overlap). When there are large brightness variations between different shots, the value should be kept low; in good conditions, however, all images can be considered using a larger value. An experimental value of 10 meters is consistent with the expected visual quality.

3) Visibility Mask: The last criterion is the visibility of a surface point from the acquisition center c_k. This is crucial even when performing a single scan coupled with a conventional image capture: since the camera and the range scanner are usually a few centimeters apart, some points are invisible due to parallax and therefore cannot be colored correctly. For the sake of simplicity and computing efficiency, we use a z-buffer approach, simulating a low-resolution rendering (typically in the order of one percent of the image resolution) of the surfaces to determine the nearest points. Since using a low resolution can cause a large loss of points, we keep elements within a defined interval behind the nearest point. Let d_ij be the minimum distance over all points projected onto the same pixel of the low-resolution picture as the projection of s_ij; then

q_{ijk} =
\begin{cases}
1 & \text{if } \lVert s_{ij} - c_k \rVert - d_{ij} \le d_{min} \\
0 & \text{otherwise}
\end{cases}
\qquad (4)

where the ideal value of the d_min parameter depends mainly on the scan resolution. A low value avoids classification errors, but greatly reduces the density of visible points, and vice versa. Experimentally, a value of 0.5 m is a good compromise between error and point density.

F. Generated Images Stitching

The blending of these data sets provides the final colorization. For this, we use the Laplacian pyramid decomposition [3], but other methods could also be used, such as feathering.
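Because the quality masks depend only on geometry, they are straightforward to compute. The sketch below (Python/NumPy) illustrates Eqs. (2)-(4); the precomputed mapping from vertices to low-resolution pixels stands in for the low-resolution z-buffer rendering and is an assumption of this sketch, not the authors' code.

import numpy as np

def validity_mask(valid):
    # Eq. (2): 1 for points the scanner marks as valid, 0 otherwise.
    return valid.astype(np.float64)

def distance_mask(points, center, d_max=10.0):
    # Eq. (3): weight decreasing linearly with the distance to the
    # acquisition center c_k, zero beyond d_max.
    dist = np.linalg.norm(points - center, axis=-1)
    return np.where(dist <= d_max, 1.0 - dist / d_max, 0.0)

def visibility_mask(points, center, lowres_pixel_id, d_min=0.5):
    # Eq. (4): keep points lying at most d_min behind the nearest point
    # projecting onto the same low-resolution pixel (z-buffer principle).
    # lowres_pixel_id: (H, W) integer index of the low-resolution pixel each
    # vertex projects to, assumed precomputed from the camera model.
    dist = np.linalg.norm(points - center, axis=-1)
    nearest = np.full(lowres_pixel_id.max() + 1, np.inf)
    np.minimum.at(nearest, lowres_pixel_id.ravel(), dist.ravel())
    return (dist - nearest[lowres_pixel_id] <= d_min).astype(np.float64)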

IV. DISCUSSION

A. Input Data and Material Configuration

We evaluate our methodology on several sets of scans performed during different acquisition campaigns. These data share several characteristics: scans have a base resolution of 1501 x 750 vertices; images have a resolution of 6 Mpixels; each set consists of nine images; and no light control device was used. All calculations were performed on a PC Core 2 Quad at 2.66 GHz with 8 GB RAM, in standard operating conditions and with a non-optimized implementation (without any parallelism).

B. Computation Times

Computing the different metrics requires little time. The total time T_fus to process a scan can be estimated, assuming constant scan and image resolutions, by

T_{fus} = T_{scn} + S \times (T_{scn} + T_{dst} + S_{siz} \times T_{map}) \qquad (5)
T_{map} = T_{img} + T_{prj} + T_{col} + T_{msk} + T_{bld}

where T_scn is the loading time of one geometry dataset, T_dst the time to compute the distances, T_img the loading time of an image, T_prj the computing time of the surface-to-pixel mappings, T_col the time for a geometry colorization, T_msk the computation time of the masks, and T_bld the blending time of a depth map with the others. S is the number of considered image sets and S_siz the number of images per set. All these durations, except the image loading time, depend mainly on the scan resolution, more specifically on the number of points. Equation (5) expresses the dependence of the total time on the number of images. The method complexity is linear in the number of points and in the number of images considered for the colorization. The graph shown in Fig. 3 clearly shows the growth of computation time with respect to the resolution, as well as the average time to perform a scan colorization. The slower steps could greatly benefit from massively parallel architectures: parallelizing the calculations on the CPU would divide the slope of the graph lines by the number of cores used. An advantageous characteristic is that a scan colorization requires relatively little data simultaneously: a colorization with its masks only requires loading into main memory at most two geometries and one color source. The pipeline organized to colorize a set of scans, in order to obtain a complete and colorized site model, exploits this property to optimize the calculations, minimizing the memory resources used while keeping the pertinent data.
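As a quick illustration of how Eq. (5) can be used to budget a survey before processing, the following sketch evaluates the cost model for a hypothetical configuration; the per-step timings are placeholders, not measurements from this work.

def total_fusion_time(T_scn, T_dst, T_img, T_prj, T_col, T_msk, T_bld,
                      n_sets, images_per_set):
    # Eq. (5): per-image cost T_map, repeated over all sets and images.
    T_map = T_img + T_prj + T_col + T_msk + T_bld
    return T_scn + n_sets * (T_scn + T_dst + images_per_set * T_map)

# Example: 4 image sets of 9 pictures each (the configuration of Section IV-A),
# with made-up per-step timings in seconds.
print(total_fusion_time(T_scn=2.0, T_dst=1.0, T_img=0.5, T_prj=1.5,
                        T_col=0.8, T_msk=0.7, T_bld=1.2,
                        n_sets=4, images_per_set=9))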

Fig. 3. Computation time depending on the resolution of the scan.

Fig. 4. Our method applied to the same data and viewpoint as in Fig. 1 (top), and to the entire survey, composed of 4 scans and 40 pictures (bottom).

V. CONCLUSION

We have presented a method for quickly obtaining a visually pleasing rendering, by ensuring, for each geometry, that the same color sources are used in each overlapping area of the colorization process. Fig. 4 shows the results. Our method keeps the division of the model into scans, suitable for different studies, while still allowing all other types of corrections and improvements proposed in the literature to be considered. It can easily be integrated into the acquisition process in the field, so that gaps in a fast scanning process can be readily observed without requiring techniques to control the brightness. Computation time could be further improved, especially by parallelizing the slowest steps.

REFERENCES

[1] R. Szeliski and H.-Y. Shum, "Creating full view panoramic image mosaics and environment maps," in 24th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 97), 1997, pp. 251–258.

[2] M. Uyttendaele, A. Eden, and R. Szeliski, "Eliminating ghosting and exposure artifacts in image mosaics," in 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001, pp. 509–516.
[3] P. J. Burt and E. H. Adelson, "A multiresolution spline with application to image mosaics," ACM Transactions on Graphics, vol. 2, no. 4, pp. 217–236, 1983.
[4] M. Callieri, P. Cignoni, and R. Scopigno, "Reconstructing textured meshes from multiple range + rgb maps," in Proceedings of the Vision, Modeling, and Visualization Conference 2002, 2002, pp. 419–426.
[5] H. Lensch, W. Heidrich, and H.-P. Seidel, "Automated texture registration and stitching for real world models," in 8th Pacific Conference on Computer Graphics and Applications, 2000, pp. 317–452.
[6] A. Baumberg, "Blending images for texturing 3D models," in Proceedings of the British Machine Vision Conference 2002, 2002, pp. 404–413.
[7] Y. Yu, P. Debevec, J. Malik, and T. Hawkins, "Inverse global illumination: Recovering reflectance models of real scenes from photographs," in 26th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 99), 1999, pp. 215–224.
[8] F. Bernardini, I. M. Martin, and H. Rushmeier, "High-quality texture reconstruction from multiple scans," IEEE Transactions on Visualization and Computer Graphics, vol. 7, no. 4, pp. 318–332, 2001.
[9] M. Callieri, P. Cignoni, M. Corsini, and R. Scopigno, "Masked photo blending: Mapping dense photographic data set on high-resolution sampled 3D models," Computers & Graphics, vol. 32, no. 4, pp. 464–473, 2008.
[10] T. Mitsunaga and S. K. Nayar, "Radiometric self calibration," in 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999, pp. 374–380.
[11] H. Lensch, J. Kautz, M. Goesele, W. Heidrich, and H.-P. Seidel, "Image-based reconstruction of spatial appearance and geometric detail," ACM Transactions on Graphics, vol. 22, no. 2, pp. 234–257, 2003.
[12] C. Rocchini, P. Cignoni, C. Montani, and R. Scopigno, "Acquiring, stitching and blending diffuse appearance attributes on 3D models," The Visual Computer, vol. 18, no. 3, pp. 186–204, 2002.
[13] P. Debevec, T. Hawkins, C. Tchou, H.-P. Duiker, W. Sarokin, and M. Sagar, "Acquiring the reflectance field of a human face," in Proceedings of the 27th Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 00), 2000, pp. 145–156.
[14] R. C. Love, "Surface reflection model estimation from naturally illuminated image sequences," Ph.D. dissertation, 1997.
[15] Y. Yu and J. Malik, "Recovering photometric properties of architectural scenes from photographs," in 25th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 98), 1998, pp. 207–217.
[16] P. Debevec and T. Lundgren, "Estimating surface reflectance properties of a complex scene under captured natural illumination," USC ICT Technical Report ICT-TR-06.2004, 2004.
[17] A. Agathos and R. B. Fisher, "Colour texture fusion of multiple range images," in Fourth International Conference on 3-D Digital Imaging and Modeling (3DIM 2003), 2003, pp. 139–146.
[18] N. Bannai, A. Agathos, and R. Fisher, "Fusing multiple color images for texturing models," in International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT 2004), 2004, pp. 558–565.
[19] W. Xu and J. Mulligan, "Performance evaluation of color correction approaches for automatic multi-view image and video stitching," in 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010, pp. 263–270.
[20] S. R. Marschner and D. P. Greenberg, "Inverse lighting for photography," in IS&T/SID 5th Color Imaging Conference, 1997, pp. 262–265.
[21] E. Beauchesne and S. Roy, "Automatic relighting of overlapping textures of a 3D model," in 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, 2003, pp. II-166–173.
[22] A. Troccoli and P. Allen, "Building illumination coherent 3D models of large-scale outdoor scenes," International Journal of Computer Vision, vol. 78, no. 2-3, pp. 261–280, 2007.
[23] T. Mertens, J. Kautz, and F. Van Reeth, "Exposure fusion," in 15th Pacific Conference on Computer Graphics and Applications, 2007, p. 382.
[24] U. Hahne and M. Alexa, "Exposure fusion for time-of-flight imaging," Computer Graphics Forum, vol. 30, no. 7, pp. 1887–1894, 2011.