Generating Absolute-Scale 3D Point Cloud Data of Built Infrastructure Scenes Using a Monocular Camera Setting

A. Rashidi (1), I. Brilakis (2) and P. Vela (3)

(1) PhD Candidate, School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332; email: [email protected]
(2) Laing O'Rourke Lecturer of Construction Engineering, Department of Engineering, University of Cambridge, UK; email: [email protected]
(3) Associate Professor, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332; email: [email protected]

ABSTRACT

Existing methods for automatically determining the absolute scale of point cloud data (PCD) acquired from monocular video/photogrammetry are suited to a limited set of applications under specific settings and are not general enough to serve as practical solutions for reconstructing both indoor and outdoor built infrastructure scenes. To address this issue, this paper proposes a novel method for automatically calculating the absolute scale of built infrastructure PCD. The absolute scale estimation method uses a pre-measured cube for outdoor scenes and a letter-size sheet of paper for indoor environments as the calibration patterns. Assuming that the dimensions of these objects are known, the proposed method extracts the objects' corner points in 2D video frames using a novel algorithm. The extracted corner points are then matched between consecutive frames. Finally, the corresponding corner points are reconstructed along with other features of the scene to determine the real-world scale. Three indoor and three outdoor cases were selected to evaluate the performance of the proposed method, and the absolute-scale PCD for each case was computed. The obtained results illustrate the capacity of the proposed method to accurately compute the absolute scale of PCD.

Keywords: Absolute scale; 3D reconstruction; point cloud data; monocular setting

INTRODUCTION

Reconstruction of the 3D structure of built infrastructure is mainly presented in the form of point cloud data (PCD). In civil engineering, PCD are typically utilized in 3D as-built documentation of infrastructure, quality control of construction-related products, effective progress monitoring of projects, and identification of deviations of constructed facilities from as-planned conditions (Brilakis et al. 2011). 3D reconstruction is possible with both active (e.g. laser) and passive (e.g. camera) sensors (Dai et al. 2013).
Within the last two decades, advances in high-resolution digital photography and increased computing capacity have made it possible for image-based 3D reconstruction to produce highly accurate results. Video-based 3D reconstruction algorithms are divided into three categories based on the number of cameras used (Rashidi et al. 2013):
1- Monocular setting, or using a single camera as the sensor
2- Binocular setting, or using a stereo set of cameras as the sensor
3- Camera rigs, or using a multiple-camera setup as the sensor

For regular applications in the fields of construction engineering and facility management, using a single camera is the easiest and most practical way to capture image/video data. However, it is well known in computer vision that a monocular camera only allows for the creation of PCD with unknown global scale (Scaramuzza et al. 2009). In order to compute the absolute scale, the operator needs to know the baseline of the camera motion or at least one dimension of the scene. Manually measuring dimensions on job sites requires extra work, and the results might be inaccurate. Further processing is also required to identify the measured dimensions on the PCD and scale the entire scene proportionally, which is a labor-intensive task.

In this paper, we propose a novel method for automatically computing the absolute scale of PCD. The proposed method is based on pre-measured simple objects: a letter-size sheet of paper for indoor settings and a simple colored cube made of plywood for outdoor environments. The vertices of these predefined objects are detected in video frames using a novel algorithm. The detected vertices in the 2D frames are then reconstructed along with the other feature points extracted from the scene. Knowing the distances between the vertices, the entire PCD is then scaled using an existing method.

The paper is organized as follows: the Background section summarizes existing research on absolute scale calculation for monocular photo/videogrammetry. Our method for automating the absolute scale calculation is presented in the next section. In the Experiments section, experiments are conducted to test the validity of the proposed algorithms and the entire pipeline. Finally, conclusions are drawn in the last section.
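As a minimal sketch of this final scaling step (assuming the reference vertices have already been reconstructed; the function and variable names below are illustrative, not the authors' implementation), the entire up-to-scale PCD can be brought to absolute scale with a single ratio:

```python
import numpy as np

def scale_point_cloud(points, vertex_a, vertex_b, known_distance):
    """Scale an up-to-scale point cloud to absolute (metric) units.

    points:         (N, 3) reconstructed 3D points (arbitrary scale)
    vertex_a/b:     reconstructed 3D positions of two reference vertices
    known_distance: true metric distance between those two vertices
    """
    reconstructed = np.linalg.norm(np.asarray(vertex_b) - np.asarray(vertex_a))
    s = known_distance / reconstructed   # single global scale factor
    return np.asarray(points) * s, s
```

For example, if the two vertices are opposite ends of a cube edge known to be 0.8 m long, every point in the cloud is multiplied by the same factor, so all distances in the scaled PCD become metric.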
BACKGROUND

The standard approach for acquiring the absolute scale in 3D reconstruction is implementing a binocular setup using stereo cameras with a known baseline (Brilakis et al. 2011). Binocular photo/videogrammetry requires specific setup and adjustments, e.g. aligning the lenses, which prevents its general application in routine AEC/FM practices. In comparison with stereo cameras, using a single camera (monocular setup) is more practical since there is no need for a specific setup, and almost everyone on a construction job site has access to an off-the-shelf digital camera or smartphone for videotaping the scene or taking pictures. With a single camera, however, a scene can only be reconstructed up to an unknown scale factor. This limitation is of great significance since, in civil engineering and facility management applications, all measurements take place in Euclidean space with real values. Besides manually measuring one dimension of the scene, two major approaches might be used for automatically recovering the absolute scale: 1) obtaining prior knowledge about the scene and existing objects (Kuhl et al. 2006; Tribou 2009), and 2) using additional sensors, e.g. accelerometers and GPS, for getting
extra information about the scene or the motion of the camera (Kneip et al. 2011; Nützi et al. 2011; Eudes et al. 2010). In the area of AEC/FM, a number of researchers have applied specific settings to solve particular problems. Golparvar-Fard et al. (2012) used the 3D coordinates of predominant benchmarks, e.g. corners of walls and columns, together with the building information model (BIM) of the built infrastructure to solve the absolute scale and registration problems. Jahanshahi et al. (2011), on the other hand, applied the working distance (camera-object distance), the camera sensor size and the camera sensor resolution to estimate the dimensions of cracks while reconstructing the surfaces of structural elements. A major obstacle is that the existing solutions are limited to specific categories of scenes and are not general enough to cover the vast range of applications in AEC/FM practice. Moreover, the need for extra sensors and the associated hardware constraints prevent some of the solutions from gaining popularity on construction and facility job sites. As a result, there is significant demand for a simple, practical solution applicable to both indoor and outdoor built infrastructure scenes. The research objective of this paper is to test whether a novel method proposed by the authors can successfully and accurately compute the absolute scale of various built infrastructure scenes in both indoor and outdoor environments. The key research question addressed is: how can we automatically compute the absolute scale of built infrastructure scenes using simple objects easily placed in almost all job sites, without requiring any extra sensors or hardware setup? The presented solution relies on using predefined objects with known dimensions, one for each of the indoor and outdoor scenarios, to extract the necessary prior knowledge about the scene.
The proposed method is described in detail in the following sections.

METHODOLOGY: AUTOMATED ABSOLUTE SCALE COMPUTATION FOR OUTDOOR SETTINGS

Many AEC/FM practices take place in outdoor settings, so it is necessary to choose a simple, consistent object that is easily detectable and easy to build at most job sites. Among geometrical objects, a cube is the simplest: its dimensions are equal, and it is typically possible to view three of its surfaces simultaneously from various perspectives. We chose a cube made of plywood, which is solid and lightweight, noting that it can be easily built at nearly any job site. The cube should be big enough to use in large-scale infrastructure scenes yet small enough to be carried and handled by a single person. Considering these factors, we chose 0.8 m as the standard edge length of the cube. In order to better detect the object in the scene, we chose three different colors for the cube's surfaces. Two criteria should be considered when choosing the colors for the cube surfaces: one, the colors should be distinct from the colors of existing features in the scene, and two, there should be a maximum difference between the RGB (HSV) values of the selected colors so they can easily be identified using color detection algorithms. Considering these constraints, we remove colors close to blue and green
since those colors frequently appear in outdoor settings. Examining what remains, and distributing the color values as evenly as possible across the remaining spectrum, leads to the three distinct colors whose HSV values are depicted in Figure 1. Given the selected colors, the overall method for calculating absolute scale mainly relies on detecting the cube in video key frames; identifying, matching and reconstructing the cube vertices along with other feature points of the scene; and scaling the obtained PCD given the known dimensions of the cube (distances between the vertices). Figure 2 depicts the proposed framework for absolute scale estimation.
Figure 1: Selected colors for surfaces of the cube.

[Figure 2: flowchart — Video Clip → Key Frames Selection → Cube's Vertices Detection → Cube's Vertices Matching → 3D Reconstruction Pipeline → Computing the Absolute Scale]
Figure 2: Overall workflow of the proposed algorithm for computing the absolute scale of PCD.

More details about the stages of the proposed algorithm are presented in the following three steps.

Step 1: Detection of the cube's vertices. Figure 3 describes the steps necessary for detecting the vertices of the cube in the 2D video frames captured from the scene. The procedure starts with detecting the surfaces of the cube by filtering on HSV values. For each detected surface, the connected components are analyzed, and an opening morphology operator is applied to remove small areas with the same color values that do not belong to the cube's surface. To ensure that detected areas belong to the cube surfaces, the following constraints should be met:

1- The area of the surface should be larger than 0.005 of the area of the entire image. This criterion removes false detections of small matching areas, and also ignores detected cubes that are too far from the camera, which often introduce estimation error. The threshold value, 0.005, was obtained experimentally.

2- Each surface of the cube should look neither too elongated nor too circular in the image. Accordingly, the roundness of the surface, calculated by:
Roundness = (4π × Area) / (Perimeter)²    (1)
should lie between a lower and an upper threshold.

3- Due to the perspective projection equations describing image formation, the imaged surfaces of a cube are trapezoidal, and therefore convex. To isolate potential cubes by removing non-convex objects, the actual area of the surface should be approximately equal to the area of its convex hull, computed using a convex hull algorithm.
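Constraints 2 and 3 can be sketched in a few lines of Python, here for a candidate region represented by its boundary polygon (a simplification: the method operates on pixel regions, and the threshold values below are illustrative assumptions, not the paper's calibrated values):

```python
import numpy as np

def polygon_area(pts):
    """Shoelace formula for a simple polygon given as (N, 2) vertices."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def polygon_perimeter(pts):
    return float(np.sum(np.linalg.norm(pts - np.roll(pts, -1, axis=0), axis=1)))

def roundness(area, perimeter):
    """Eq. (1): 4*pi*Area / Perimeter^2 -- 1.0 for a circle, lower for
    elongated shapes."""
    return 4.0 * np.pi * area / perimeter ** 2

def convex_hull(pts):
    """Andrew's monotone-chain convex hull, returned in CCW order."""
    pts = sorted(set(map(tuple, np.asarray(pts).tolist())))
    if len(pts) <= 2:
        return np.array(pts, float)
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return np.array(lower[:-1] + upper[:-1], float)

def is_cube_surface_candidate(pts, r_lo=0.15, r_hi=0.9, convex_tol=0.95):
    """Constraint 2 (roundness within thresholds) and constraint 3
    (region area close to its convex-hull area, i.e. near-convex)."""
    pts = np.asarray(pts, float)
    r = roundness(polygon_area(pts), polygon_perimeter(pts))
    convex = polygon_area(pts) / polygon_area(convex_hull(pts)) >= convex_tol
    return (r_lo < r < r_hi) and convex
```

A square region passes both tests (roundness π/4 ≈ 0.785, fully convex), while a long thin strip fails the roundness bound and an L-shaped region fails the convexity ratio.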
[Figure 3: flowchart for detecting the cube's surfaces — Key Frames → Color Thresholding → Morphology → Forming Connected Components → area check (S_component > 0.005 S_image) → roundness check (lower threshold 0.15 …)]
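The thresholding and component-filtering stages of Figure 3 could be sketched as follows (a pure-NumPy illustration; the morphological opening step is omitted, and all names and bounds are illustrative, not taken from the paper's implementation):

```python
import numpy as np
from collections import deque

def surface_candidates(hsv, lo, hi, min_area_ratio=0.005):
    """Return pixel lists of color-matched regions large enough to be
    cube surfaces (area > 0.005 of the image, per the paper's threshold).

    hsv:    (H, W, 3) array of HSV pixel values
    lo, hi: per-channel lower/upper bounds for the target surface color
    """
    mask = np.all((hsv >= lo) & (hsv <= hi), axis=2)   # color thresholding
    h, w = mask.shape
    min_area = min_area_ratio * h * w
    seen = np.zeros_like(mask, dtype=bool)
    regions = []
    for y in range(h):
        for x in range(w):
            if mask[y, x] and not seen[y, x]:
                # BFS flood fill collects one 4-connected component
                comp, q = [], deque([(y, x)])
                seen[y, x] = True
                while q:
                    cy, cx = q.popleft()
                    comp.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if len(comp) > min_area:   # constraint 1: area threshold
                    regions.append(comp)
    return regions
```

In practice a production pipeline would likely use library routines for the same steps (e.g. OpenCV's `cv2.inRange`, `cv2.morphologyEx` and `cv2.connectedComponentsWithStats`); the surviving components would then be passed to the roundness and convexity checks of constraints 2 and 3.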