IMAGE-BASED 3D RECONSTRUCTION TECHNIQUES AND THEIR APPLICATION IN CIVIL ENGINEERING

Zhiliang Ma 1), Shilong Liu 2)

1) Ph.D., Prof., Department of Civil Engineering, Tsinghua University, Beijing, China. Email: [email protected]
2) Ph.D. Candidate, Department of Civil Engineering, Tsinghua University, Beijing, China. Email: [email protected]

Abstract: Image-based three-dimensional (3D) reconstruction techniques generate 3D models of scenes from 2D images. To enhance their use in civil engineering, the status quo of their research and application needs to be clarified. Major domestic and foreign literature databases were searched, and the status quo of the research and application of the techniques was summarized by analyzing the abstracts, or the full papers where required. Finally, by analyzing the tendencies in the research and application of the techniques, their future directions in civil engineering are predicted. This paper thus provides a sound foundation for furthering the research and application of the techniques in civil engineering.

Keywords: 3D reconstruction, civil engineering, 3D reconstruction application, image processing.

1. INTRODUCTION

Three-dimensional (3D) reconstruction techniques are becoming important tools in civil engineering. Their main advantage is that they can generate 3D models of scenes efficiently. The generated models can then be used, for example, to obtain the dimensions of real-world objects or to monitor the progress of construction projects. The input of 3D reconstruction includes digital images, point clouds from laser scanners, Computed Tomography (CT) images, and so on. Among these, digital images are an attractive alternative because of their cost efficiency. Owing to these merits, image-based 3D reconstruction techniques are finding ever wider application in civil engineering.

Since the application of image-based 3D reconstruction techniques in civil engineering is still at an early stage, their potential is anticipated to be large. To enhance their use in civil engineering, it is essential to clarify the status quo of their research and application through a literature review.

This paper aims to summarize the research and application of image-based 3D reconstruction techniques and to propose future directions for their research and application in civil engineering. First, the basics of the techniques are introduced. Second, their application and research in civil engineering are summarized by analyzing the abstracts, or the full papers where required; as typical examples, the two most active fields are introduced to give an insight into the achievements and challenges. Finally, the future directions of research and application of the techniques in civil engineering are predicted.

2. IMAGE-BASED 3D RECONSTRUCTION TECHNIQUES

According to the literature that the authors retrieved and analyzed, the process for using the techniques differs slightly from one study to another.
Generally, the process consists of six steps, i.e., feature extraction, feature matching, camera motion estimation, sparse 3D reconstruction, model parameters correction and dense 3D reconstruction, as shown in Figure 1. The main steps and the related algorithms are introduced as follows.

[Figure 1 is a flowchart in which the data items (images, sparse point cloud, dense point cloud) flow through the six processing steps listed above.]

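The six steps above can be sketched as a pipeline skeleton. Every function name, data shape and placeholder body below is an illustrative assumption made for this sketch, not an implementation taken from the cited literature:

```python
import random

# Minimal skeleton of the six-step pipeline; each function is a placeholder
# standing in for the algorithms named in Sections 2.1-2.6.

def extract_features(images):
    # Step 1: detect feature points (e.g. SIFT/SURF) in every image.
    return {img: [(random.random(), random.random()) for _ in range(100)]
            for img in images}

def match_features(features):
    # Step 2: match feature points of each image pair (e.g. ANN),
    # then discard false matches (e.g. RANSAC on the epipolar geometry).
    images = list(features)
    return [(a, b) for a, b in zip(images, images[1:])]

def estimate_camera_motion(matches):
    # Step 3: recover camera parameters per image (e.g. five-point algorithm).
    return {pair: "camera_parameters" for pair in matches}

def sparse_reconstruction(matches, cameras):
    # Step 4: triangulate matched feature points into a sparse point cloud.
    return [(0.0, 0.0, float(i)) for i in range(50)]

def correct_parameters(cloud, cameras):
    # Step 5: bundle adjustment refines points and cameras jointly.
    return cloud, cameras

def dense_reconstruction(cloud, cameras, images):
    # Step 6: multi-view stereo (e.g. CMVS + PMVS) densifies the cloud.
    return cloud * 100  # placeholder: many more points

images = ["img_%d.jpg" % i for i in range(4)]
features = extract_features(images)
matches = match_features(features)
cameras = estimate_camera_motion(matches)
sparse, cameras = correct_parameters(sparse_reconstruction(matches, cameras), cameras)
dense = dense_reconstruction(sparse, cameras, images)
print(len(sparse), len(dense))  # → 50 5000
```

The point of the skeleton is only the data flow: images in, sparse cloud out of steps 1 to 5, dense cloud out of step 6, matching Figure 1.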
Figure 1. Process of image-based 3D reconstruction techniques

2.1 Feature Extraction

The aim of this step is to obtain the feature points of the images of a given scene. The feature points are the points used to estimate the initial structure of the scene (Golparvar-Fard et al., 2009). The algorithms used to extract feature points include the Scale Invariant Feature Transform (SIFT) detector (Bhadrakom & Chaiyasarn, 2016; Rodriguezgonzalvez et al., 2014) and the Speeded Up Robust Features (SURF) detector (Bae et al., 2013). The SURF detector is derived from the SIFT detector for the sake of efficiency: it uses a feature point descriptor of 64 dimensions, whereas SIFT uses one of 128 dimensions. This is one reason why the SURF detector obtains feature points more quickly than the SIFT detector, although it yields fewer points of lower quality (Rashidi et al., 2011; Jog et al., 2011).

2.2 Feature Matching

The aim of this step is to match the feature points of each image pair and to remove false matches. In general, the algorithms used for this step include the Approximate Nearest Neighbors (ANN) algorithm (Bae et al., 2013; Yang et al., 2013) and the Random Sample Consensus (RANSAC) algorithm (Rashidi et al., 2011). The ANN algorithm matches the feature points of each image pair by computing the Euclidean distance between feature point descriptors across the two images (Golparvar-Fard et al., 2009). Its output is a set of matched feature points, but false matches may remain, because the matching criterion depends only on the Euclidean distance between descriptors and ignores the epipolar geometry of the image pair, which constrains where a matched feature point can appear in each image. The RANSAC algorithm (Rashidi et al., 2011) removes false matches by exploiting this epipolar geometry: the geometry is estimated iteratively from subsets of the matched feature points until it is satisfied by most of them.

2.3 Camera Motion Estimation

The aim of this step is to recover the camera parameters for each image. The algorithms used for this step include the eight-point, seven-point and five-point algorithms. Although each of them can be used to compute the camera parameters, the five-point algorithm is the most accurate (Rashidi et al., 2011).

2.4 Sparse 3D Reconstruction

The aim of this step is to generate a point cloud of the scene. The algorithm used for this step is the triangulation algorithm (Yang et al., 2013).
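To illustrate the triangulation step, the sketch below recovers one 3D point from two views by the linear (direct linear transformation, DLT) method. The intrinsic matrix, camera poses and test point are synthetic assumptions chosen for this sketch, not values from any cited work:

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views.

    P1, P2 : 3x4 camera projection matrices
    x1, x2 : matched pixel coordinates (u, v) in each image
    Returns the 3D point in Euclidean coordinates.
    """
    # Each view contributes two rows of the homogeneous system A @ X = 0.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    # Project a 3D point through a camera and dehomogenize to pixels.
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Synthetic check: project a known point through two cameras, then recover it.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])    # assumed intrinsics
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])              # camera at the origin
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])  # translated along x
X_true = np.array([0.5, -0.2, 4.0])

X_rec = triangulate_point(P1, P2, project(P1, X_true), project(P2, X_true))
print(np.allclose(X_rec, X_true, atol=1e-6))  # → True
```

In a real pipeline this runs over every matched feature point, with the projection matrices supplied by the camera motion estimation step.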
With the matched feature points of each image pair from the feature matching step and the camera parameters of each image from the camera motion estimation step, the 3D locations of the points corresponding to the matched feature points are obtained by the algorithm; a point cloud of the scene is thus produced (Bhadrakom & Chaiyasarn, 2016).

2.5 Model Parameters Correction

The aim of this step is to correct the camera parameters for each image and the 3D locations of the points in the point cloud from the sparse 3D reconstruction step. The algorithm used for this step is the Bundle Adjustment (BA) algorithm. It is based on a nonlinear least squares method and corrects the camera parameters and the 3D point locations jointly (Bhadrakom & Chaiyasarn, 2016; Liu, 2015). After BA, a high-accuracy point cloud, called the sparse point cloud, is obtained.

2.6 Dense 3D Reconstruction

The aim of this step is to recover the details of the scene. The algorithms used for this step are multi-view stereo algorithms (Yang et al., 2013), including Clustering Views for Multi-View Stereo (CMVS) and Patch-based Multi-View Stereo (PMVS). CMVS removes redundant images of the scene and clusters the remaining ones (Liu, 2015; Yang et al., 2013) such that every point in the sparse point cloud can still be reconstructed (Zhang et al., 2015). PMVS takes the clustered images from CMVS and generates a dense 3D point cloud through three steps, namely matching, expansion and filtering (Koch et al., 2014; Rodriguezgonzalvez et al., 2014; Yang et al., 2013).

At present, software applications that implement some of the aforementioned algorithms already exist and can be used to reconstruct scenes. For example, VisualSFM is a freely available application that provides feature point extraction, feature point matching, camera motion estimation and sparse 3D reconstruction, among other functions (VisualSFM, 2016). However, if such applications do not meet the required accuracy or computation time, the aforementioned algorithms, or more advanced ones, should be implemented to meet the requirements.

3. RESEARCH AND APPLICATION OF IMAGE-BASED 3D RECONSTRUCTION TECHNIQUES

The research and application of image-based 3D reconstruction techniques have been explored in such fields of civil engineering as buildings, roads and bridges. Classified by application level, the number of papers in each application field is shown in Table 1. The most active field is clearly buildings, which includes single building reconstruction and construction site reconstruction, among others. As typical examples, this section introduces the achievements and challenges of the research and application of the techniques in single building reconstruction and construction site reconstruction.

3.1 Single Building Reconstruction

Research and application of image-based 3D reconstruction techniques in single building reconstruction can be divided into three categories, i.e., reconstructing single buildings, improving the application methods of the techniques, and combined application with new technologies. Bhadrakom & Chaiyasarn (2016) used the techniques to assess the deformation of historical buildings. VisualSFM was used to process images of the buildings and generate their point clouds. The output of

VisualSFM, i.e., the point cloud, was compared with the model from photogrammetry to obtain the incline angles of the buildings, which were used for the deformation assessment (Bhadrakom & Chaiyasarn, 2016). They concluded that the accuracy of the deformation assessed by the techniques is acceptable when the incline angle of the buildings is small, but not when it is large. Zhou et al. (2015) used the techniques to assess the damage of post-hurricane residential buildings. Images of the buildings were processed with VisualSFM, and the resulting point cloud was compared with that from a Light Detection and Ranging (LIDAR) system to assess the damage. Zhou et al. concluded that the accuracy of the damage assessment by the techniques is acceptable when the required accuracy is not high, but not when the required accuracy is high (generally