Automatic Room Segmentation of 3D Laser Data ... - Semantic Scholar

International Journal of

Geo-Information Article

Automatic Room Segmentation of 3D Laser Data Using Morphological Processing Jaehoon Jung 1 , Cyrill Stachniss 2 1 2 3

*

and Changjae Kim 3, *

School of Civil and Construction Engineering, Oregon State University, 1691 SW Campus Way, Corvallis, OR 97331, USA; [email protected] Department of Photogrammetry, University of Bonn, Nussallee 15, 53115 Bonn, Germany; [email protected] Department of Civil and Environmental Engineering, Myongji University, 116 Myongji-ro, Cheoin-gu, Yongin, Gyeonggi-do 449-728, Korea Correspondence: [email protected]; Tel.: +82-31-330-6474

Academic Editor: Wolfgang Kainz Received: 11 May 2017; Accepted: 4 July 2017; Published: 7 July 2017

Abstract: In this paper, we introduce an automatic room segmentation approach based on morphological processing. The inputs are registered point-clouds obtained from either a static laser scanner or a mobile scanning system, without any required prior information or initial labeling satisfying specific conditions. The proposed segmentation method’s main concept, based on the assumption that each room is bound by vertical walls, is to project the 3D point cloud onto a 2D binary map and to close all openings (e.g., doorways) to other rooms. This is achieved by creating an initial segment map, skeletonizing the surrounding walls of each segment, and iteratively connecting the closest pixels between the skeletonized walls. By iterating this procedure for all initial segments, the algorithm produces a “watertight” floor map, on which each room can be segmented by a labeling process. Finally, the original 3D points are segmented according to their 2D locations as projected on the segment map. The novel features of our approach are: (1) its robustness against occlusions and clutter in point-cloud input; (2) high segmentation performance regardless of the number of rooms or architectural complexity; and (3) straight segmentation boundary generation, all of which were proved in experiments with various sets of real-world, synthetic, and publicly available data. Additionally, comparisons with the five popular existing methods through both qualitative and quantitative evaluations demonstrated the feasibility of the proposed approach. Keywords: point cloud; room segmentation; morphological processing; as-built BIM

1. Introduction Building Information Modeling (BIM) is a process entailing the gathering and structuring of all data and information relevant to building assets [1]. BIM can be used to evaluate design, reduce construction and maintenance costs, support building management, and enhance collaboration and information sharing. However, the majority of existing buildings lack BIM [2–4]. Even where existing buildings have BIM, often it does not represent the current situation, particularly in cases of continual remodeling or renovation [5]. For these reasons, generating accurate as-built BIM for the purposes of verifying the as-built condition of existing structures is an emerging challenge in the architecture, engineering and construction (AEC) industries. Many techniques for recording of existing building data are available. As-built BIM with laser scanning, for example, is more and more commonly utilized within the AEC industry for its provision of explicit 3D information in the form of high-density point clouds suitable for detailed 3D modeling [6,7]. Obtainment of BIM outputs from 3D point-cloud data entails a sequence of processes: (1) geometric modeling of architectural ISPRS Int. J. Geo-Inf. 2017, 6, 206; doi:10.3390/ijgi6070206

www.mdpi.com/journal/ijgi

ISPRS Int. J. Geo-Inf. 2017, 6, 206

2 of 26

objects; (2) recognition and attribution of those objects; and (3) establishment of their topological relationship(s) [3]. Since detail-rich BIM is time consuming and labor intensive, much effort has been devoted to the development of automated methods. Until just recently, automatic as-built BIM with point-cloud data has been limited to single rooms [7–14]. With the aid of recent advances in laser scanning, however, as-built BIM has been extended to large and complex indoor environments that include many rooms and corridors. A higher degree of automation requires, as an essential initial step, room segmentation that allows the addition of semantic information from scanning data, which in turn enables simplified-geometric modeling for each room [15]. Automated and efficient room segmentation approaches recently have been the foci of keen attention and interest within the AEC community [15–18]. Room segmentation also is an important topic within the robotics community, as indeed, detailed and accurate topological maps yield significant savings in computational effort necessary for obtainment of navigation trajectories [19]. If multiple mobile robots perform high-level tasks in large and complex indoor environments, appropriate assignment and path planning become more crucial. Significantly in this respect, room segmentation is an effective means of reducing redundant work and avoiding interference between robots [20]. In such scenarios, identification and categorization of environments based on onboard sensors is essential [19–27]. Existing approaches take as input the 2D or 3D point-cloud data acquired by either the onboard laser sensors of a mobile robot or a static laser scanner. In the robotics community, most room segmentation techniques for mobile robots have been implemented on what is known as the occupancy grid map, which represents the surrounding environments by means of a 2D evenly spaced grid [28,29]. The grid map can be obtained through a separate pre-processing step using the 2D laser sensor readings. Due to its analogousness to an image, occupancy-grid-map room segmentation often involves morphological image processing [22,23] or graph-partitioning techniques [20,25], although these methods often lead to over-segmentation with arbitrarily segmented boundaries. Meanwhile, in the AEC community, it is common practice to directly process 3D point-cloud data for room segmentation. Many existing methods, however, rely on prior knowledge to provide geometric features for supervised learning algorithms [21] or initial labeling with respect to the known scanner locations [15,17,30]. Without such information, satisfactory room segmentation is achieved only under heavy constraints, for example, with no links between rooms at certain height levels [18], or the Manhattan assumption [31]. Therefore, the purpose of the present study was to devise a reliable segmentation approach by which the above-noted limitations can be overcome and segmentation, thereby, can be made reliable for further as-built BIM utilization. Note that our input is just a registered 3D point-cloud data with a known upper direction, and the output is a 3D point cloud segmented for each room. In the proposed approach, first, the point-cloud data is projected onto a 2D binary map representing occupied pixels as 1 and the others as 0. Then, the binary map is partitioned into several isolated regions, using the detection window to search for occupied pixels from nearby walls that are beyond its range. On the basis of the binary map’s representation of the initial segments, the surrounding walls of each initial segment are skeletonized, and all of their openings to other rooms are closed by iteratively connecting the two end pixels in each opening. This provides a watertight floor map by which each room can be separated using a labeling process. Subsequently, the original 3D points are segmented according to their 2D locations as projected on the final segment map. Notably, the proposed room segmentation also is applicable to 2D grid map segmentations, and thus can be utilized for operation of mobile robots. The remainder of this paper is organized as follows. Section 2 reviews the latest developments in automated room segmentation for point-cloud data. We take into account the research advances in both the robotics and AEC communities, and discuss the limitations. Section 3 outlines the procedure of our room segmentation method in detail. Section 4 presents the results of experimentation with real-world and synthetic data sets and discusses the further validation of the proposed method relative to the five existing room segmentation methods using publicly available data sets. Finally, Section 5 identifies the study limitations and looks ahead to upcoming work.


3 of 26

2. Related Work In this section, the recent work on two-different-domain room segmentation within the robotics and AEC communities is presented, and the strengths and limitations of the respective approaches are discussed. 2.1. Room Segmentation in Robotics Community The following methods assume that a complete grid map of the indoor environment is given as the input for room segmentation. Fabrizi and Saffiotti [22] apply morphological operators to room segmentation. First, a fuzzy grid map representing the degree of cell occupancy by obstacles is generated. Then, the morphological opening operator (i.e., erosion followed by dilation) is applied to partition the fuzzy grid map into regions separated by narrow passages such as doors. Since the transformed grid map is still dispersed, the watershed algorithm is used to refine the transformed grid map into a more separable form. Diosi et al. [23] propose segmentation of the occupancy grid map into labeled rooms based on the watershed algorithm, which performs a distance transformation of the grid map, thus creating a distance map representing the distance from each grid cell to the closest obstacle. Each cell is then traced uphill to its local maximum, typically near the center of the room; all cells leading to the same local maximum are labeled as belonging to the same segment. However, the watershed algorithm can lead to over-segmentation [32]; in such cases, post-processing for subsequent merging must be manually implemented, for example, by using virtual markers. Mozos et al. [33] implement feature-based segmentation. They compute a set of geometric features, such as the average distance between two consecutive beams or the perimeter of the area covered by a scan, from simulated laser beams reaching to all accessible pixels within the grid map. They next classify the features into specific room labels, such as office or corridor, using the AdaBoost classifier, and merge all neighboring points with the same label into one. In order to obtain good classification results, however, the classifier requires a sufficient amount of manually labeled training data. Further, this is problematic if new environments that do not resemble the training data are to be segmented. Brunskill et al. [25] propose an algorithm for automatic decomposition of a map into sub-segments using a graph-partitioning technique known as spectral clustering; the algorithm, by assuming each grid map as a graph, divides graph nodes (each represented by map position coordinates x and y) into groups so that connectivity is maximized between nodes in the same cluster and minimized between nodes in different clusters. Wurmet al. [20] apply distance transformation to the grid map, and construct a Voronoi Graph by skeletonization of the generated distance map. To partition the nodes of the graph into disjointed groups, the critical nodes, which is to say those where the distance to the closest obstacle on the map is a local minimum, are investigated. Typically, the critical nodes, which can be found in narrow passages (such as doorways), are used for grouping of similar nodes. However, graph-based segmentation often over-segments long corridors into smaller regions [19]. Friedman et al. [34] suggest Voronoi random field segmentation. They extract a Voronoi graph from the occupancy grid map and generate the Voronoi random field with nodes and edges corresponding to the Voronoi graph. Then, they extract spatial and connectivity features from the map and generate the learned AdaBoost decision stumps to be used as observations in the Voronoi random field. Finally, they apply loopy belief propagation inference to compute the map labeling in the Voronoi random field. This method, however, requires manually labeled maps for training of the AdaBoost classifier on each place type. 2.2. Room Segmentation in AEC Community In the context of AEC applications, several studies on room segmentation have been conducted for 3D point-cloud data. Ochmann et al. [30] propose a probabilistic point-to-room segmentation method based on Bayes’ theory. In the beginning, each measured point is initially segmented with an index of the scan from which it originates. Then, an iterative re-segment process is performed to


4 of 26

update assignments of points to rooms based on the estimated visibilities between any two locations within the point cloud. However, this method requires initial prior room segmentation with respect to known scanner locations. In addition, the computational loads are proportional to the number of points scanned. Mura et al. [15] propose an iterative binary subdivision algorithm. It starts by projecting the entire point cloud onto the x-y plane. The projected candidate walls are clustered in order to obtain a smaller number of good representative lines for walls. Then, from the intersections of the representative lines, a cell complex is built, the edges of which are weighted according to the likelihood of being genuine walls. Finally, diffusion distances on the cell-graph of the complex are computed, which are used to drive iterative cell clustering that enables extraction of the separate rooms. However, this approach assumes that during the segmentation, the scanner locations are known. Turner [17] defines a room as a connected subset of interior triangles created by Delaunay triangulation. A seed triangle is representative of a room, in that every room to be modeled has one. These seeds are used to partition the rest of the interior triangles into rooms. Before segmentation though, the triangles have to be initially segmented into “inside” or “outside” according to the known scanner path. In the approach of Macher et al. [18], only 20 cm slices of point clouds at the ceiling level, as projected onto a 2D binary image, are used. A pixel on the binary image is randomly selected, and then region growing is performed to segment the selected region into a single room. This process is repeated to create a 2D segmented image that is subsequently used to segment the original 3D point cloud. Unfortunately, this method is valid only if the ceiling points are obtained without occlusions and there are no links between rooms at the ceiling level. Ikehata et al. [31] initialize the domain of point clouds on the x-y plane, apply the k-medoids algorithm, and cluster sub-sampled pixels. Starting from 200 clusters, they iterate the k-medoids algorithm and perform cluster merging, whereby two adjacent clusters are merged if the distance between their centers is less than the pre-defined value. Finally, the segmentation results are propagated to all of the pixels in the domain using the nearest neighbor. The main downside of this method is the strict Manhattan assumption, according to which, for refinement of segmented rooms, the walls must be parallel to each other. 2.3. Summary In the robotics community, the above-discussed five different methods, namely morphological processing, distance transform, feature-based segmentation, Voronoi graph segmentation, and Voronoi random field, are still widely used for room segmentation. In a recent paper, Bormann et al. [19] thoroughly compare the segmentation performances of these methods for a variety of test data sets. According to their results, the three commonly occurring limitations of those methods are as follows: (1) they tend to provide severe over- or under-segmentation; (2) the segment boundaries are irregular and arbitrarily positioned, which render use in as-built BIM creation problematic; and (3) some of the methods include manual steps, such as creation of labeled maps [33,34] or placement of virtual markers [23], that prevent their integration into automatic flows. According to our overall review of the recent AEC literature, the chief limitation of the existing room segmentation methods lies in the initial segmentation or refinement phases, because they require known scanner locations. This prevents their effective application for existing or publicly available point-cloud data lacking such information, and limits them, moreover, to specific scanning instruments such as a static laser scanner or a mobile scanning system. Some methods, for example Macher et al. [18] and Ikehata et al. [31], do not require known scanner locations or prior knowledge, but operate only under specific conditions. Further, some methods are limited to segmentation of rooms with planar walls [15,30,31], which fact narrows the range of their practical applications. Table 1 provides a summary of the room segmentation methods used in the robotics and AEC communities as well as a comparison with our approach. In contrast to the existing methods, our


5 of 26

approach is available for different data types (i.e., 2D grid maps or 3D point clouds), for different scanning instruments (i.e., static laser scanners or mobile scanning systems), and does not require supplementary data (i.e., training data or known scanner locations). Further, our approach provides straight segmentation boundary and high segmentation performance, regardless of architectural complexity in terms of curved walls. The following sections provide detailed overview of the respective aspects of our approach. Table 1. Summary of room segmentation methods. Properties

Domain

Input Data

Main Assumptions or Limitations

Supplementary Data

References

Morphological

Robotics

Grid map

Narrow passages/2D data only

-

Fabrizi et al. [22]

Distance transform

Robotics

Grid map


Virtual markers

Diosi et al. [23]

Voronoi graph

Robotics

Grid map


-

Wurm et al. [20]

Feature based

Robotics

Grid map

Resemblance between data/2D data only

Labeled map

Mozos et al. [33]

Voronoi random

Robotics

Grid map


Labeled map

Friedman et al. [34]

Probabilistic model

AEC

Point cloud

Planar walls

Initial labeling

Ochmann et al. [30]

Iterative binary subdivision

AEC

Point cloud

Planar walls

Scanner locations

Mura et al. [15]

Graph clustering

AEC

Point cloud

Vertical walls/Mobile data only

Scanner path

Turner [17]

Height constraint

AEC

Point cloud

Vertical walls/No link at certain level

-

Macher et al. [18]

K-medoids

AEC

RGBD images

Narrow passages/Manhattan world/Planar walls

-

Ikehata et al. [31]

Grid map/Point cloud

Narrow passages/Vertical walls

-

-

Methods

Proposed Approach

Robotics/AEC

3. Our Approach to Room Segmentation 3.1. Overview We propose a novel method for segmentation of a point-cloud acquisition of an indoor environment into separate rooms based on morphological processing. Our room segmentation approach entails the consecutive steps schematized in Figure 1: (1) obtainment of the input data as a registered point cloud; (2) computation of a histogram along the z-axis of the point cloud for height estimation; (3) projection of only the points within the height offset onto a 2D binary map (representing occupied pixels as 1 and the others as 0) and filtration of those points by a labeling process; (4) creation of the initial segment map from the binary map using the detection window, which classifies only the occupied pixels from nearby walls that are beyond its range; (5) skeletonization of the surrounding walls of the initial segments, all of the openings of which are closed to create a watertight floor map; (6) segmentation of each room on the watertight floor map by a labeling process and its refinement to create the final segment map; and (7) segmentation of the original 3D points according to their 2D locations as projected onto the final segment map. The following subsections each contain a detailed description of our room segmentation approach for phases (2) to (6), respectively. Note that the proposed room segmentation is also applicable to 2D grid map inputs. In this case, the algorithm performs only the phases from (4) to (6) and provides a 2D segment map as the final product. Except for parameter selection, our approach is automatic and involves no manual intervention. The main assumptions, which hold valid for typical indoor environments, are: (1) ceilings and floors are planar and parallel structures bound by vertical walls; and (2) different rooms and corridors are connected by narrow passages such as doorways [7–9,15].

following each contain detailed description of segmentation our room segmentation approach for phases (2) subsections to (6), respectively. Note athat the proposed room is also applicable to 2D phases (2)inputs. to (6), In respectively. Note that theperforms proposed room is also to 2D grid map this case, the algorithm only the segmentation phases from (4) to (6)applicable and provides a grid map inputs. In this case, the algorithm performs only the phases from (4) to (6) and provides 2D segment map as the final product. Except for parameter selection, our approach is automatic anda 2D segment as the final product. Except parameter selection, our approach automatic and involves no map manual intervention. The mainfor assumptions, which hold valid foristypical indoor involves manual The main assumptions, which hold valid ISPRS Int. J. no Geo-Inf. 2017, 206 6 of 26 environments, are: (1)6,intervention. ceilings and floors are planar and parallel structures boundfor by typical vertical indoor walls; environments, (1) ceilings and floors are planarby and parallel structures by vertical walls; and (2) differentare: rooms and corridors are connected narrow passages suchbound as doorways [7–9,15]. and (2) different rooms and corridors are connected by narrow passages such as doorways [7–9,15].

Figure 1. Schematization of proposed room segmentation. Figure 1. Schematization of proposed room segmentation. Figure 1. Schematization of proposed room segmentation.

3.2. Height Estimation 3.2. 3.2. Height Height Estimation Estimation The ceiling-to-floor height is estimated to classify the points that we want to retain for the next The height is classify the points that to for next The ceiling-to-floor ceiling-to-floor height is estimated estimated to towe classify thethat points that we we want to retain retain for the the next rasterization phase. In the height estimation, assume ceilings andwant floors are mostly planar rasterization phase. In the height estimation, we assume that ceilings and floors are mostly planar rasterization phase. In including the heightthe estimation, we assume that ceilings and floors are mostly planar and parallel structures largest number of points, with which the z-coordinates in the and parallel structures including the largest number of points, with which the z-coordinates in the and parallel structures including the largest number of points, with which the z-coordinates in list the list of original points represent the height variations [7–9]. A histogram along the z-axis is computed of original points represent the height variations [7–9]. A histogram along the z-axis is computed to listfind of original points represent the height variations [7–9]. A histogram thean z-axis is computed to the height with the largest number of points [17,18]. Figure 2along shows example of the find the height with the largest number of points [17,18]. Figure 2 shows an example of the number to find the height with theranging largest from number points [17,18]. Figure 2 height shows (at an 0.1 example of the number of histogram points the of minimum to the maximum m intervals), of histogram points ranging from the minimum to the maximum height (at 0.1 m intervals), the two number of histogram points ranging from the minimum to the maximum height (at 0.1 m intervals), the two peaks indicating the presence of large horizontal surfaces, the ceiling and floor respectively. peaks indicating the presence of largeof horizontal surfaces, the ceiling andand floor respectively. The the two peaks indicating the presence large horizontal the ceiling and floor respectively. The two largest-number point groups are separated fromsurfaces, the point cloud, their mean z-values two largest-number point groups are separated from the point cloud, and their mean z-values are Thecomputed. two largest-number groups separated from theto point cloud, andheight, their mean z-values are Finally, thepoint higher meanare z-value is determined be the ceiling and the lower computed. Finally, the higher mean z-value is determined to betothe height, and the one are computed. Finally, the higher mean z-value is determined beceiling the ceiling height, andlower the lower one the floor height. the floor height. one the floor height.

Figure 2. Histogram representing number of scan points according to variation of height. Figure 2. 2. Histogram Histogram representing representing number number of of scan scan points points according according to to variation variationof ofheight. height. Figure

3.3. Rasterization and Noise Filtering 3.3. Rasterization Rasterization and and Noise Noise Filtering Filtering 3.3. In this phase, we filter out the noisy points using the height estimation, and we rasterize the Incloud thisphase, phase, we filter out the noisy points using the height estimation, andrasterize we rasterize the pointIn into awe binary map for later morphological processing. In theand rasterization, the this filter out the noisy points using the height estimation, we the point point point cloud into a binary map for later morphological processing. In the rasterization, the point cloud into a binary map for later morphological processing. In the rasterization, the point cloud is projected onto the x-y plane to create a 2D binary map representing occupied pixels, which contains at least one projected point, represented as 1, and the others as 0 [35]. It is useful for the pixel size of the binary map to be small enough to preserve the architectural shape. We recommend the use of half the minimum wall thickness to effectively identify the different rooms; note though that it should not be too small to prevent hollow parts in a low-point-density area.

cloud is projected onto the x-y plane to create a 2D binary map representing occupied pixels, which contains at least one projected point, represented as 1, and the others as 0 [35]. It is useful for the pixel size of the binary map to be small enough to preserve the architectural shape. We recommend ISPRS Int.of J. Geo-Inf. 2017, 6, 206 7 of 26 the use half the minimum wall thickness to effectively identify the different rooms; note though that it should not be too small to prevent hollow parts in a low-point-density area. Figure 3a provides an example of raw point-cloud input, and Figure 3b demonstrates its Figure 3a provides an example of raw point-cloud input, and Figure 3b demonstrates its projection projection onto the 2D binary map, wherein the occupied pixels in white represent the indoor onto the 2D binary map, wherein the occupied pixels in white represent the indoor environments environments of the building or noise, and the unoccupied pixels in gray represent the outside of the building or noise, and the unoccupied pixels in gray represent the outside environments of environments of building or walls. Figure 3b also shows the problem presented by noisy pixels building or walls. Figure 3b also shows the problem presented by noisy pixels when projecting the when projecting the entire point-cloud acquisition. Noisy pixels occur when projecting the scan entire point-cloud acquisition. Noisy pixels occur when projecting the scan points that pass through points that pass through windows. To prevent any possible problems with the later morphological windows. To prevent any possible problems with the later morphological processing, it is desirable processing, it is desirable to remove these points rather than keeping them as occupied pixels in the to remove these points rather than keeping them as occupied pixels in the binary map. To that end, binary map. To that end, only the points within the ceiling and floor height intervals (the two peaks only the points within the ceiling and floor height intervals (the two peaks in Figure 2) are kept for in Figure 2) are kept for rasterization. Then, in order not to miss any structural details, an offset rasterization. Then, in order not to miss any structural details, an offset space from the ceiling height space from the ceiling height interval is defined, as illustrated in Figure 4, in order to add points interval is defined, as illustrated in Figure 4, in order to add points within it into the rasterization. This within it into the rasterization. This is valid, because vertical walls are as high as the ceiling height, is valid, because vertical walls are as high as the ceiling height, while most clutter and windows are while most clutter and windows are rather less than the ceiling height [10]. The offset space from the rather less than the ceiling height [10]. The offset space from the floor is not considered, because clutter floor is not considered, because clutter near the floor is more hindrance than help. In this way, the near the floor is more hindrance than help. In this way, the unwanted scan points outside the building unwanted scan points outside the building can be greatly reduced and cut off from the main can be greatly reduced and cut off from the main segment, as shown in Figure 3c. After rasterization, segment, as shown in Figure 3c. After rasterization, we perform a labeling process to remove any we perform a labeling process to remove any pixel segments that do not belong to the main segment. pixel segments that do not belong to the main segment. In this process, the algorithm defines which In this process, the algorithm defines which pixels are connected to other pixels based on the number of pixels are connected to other pixels based on the number of connected neighborhoods connected neighborhoods (4-connectivity was assumed in the present study). Each connected segment (4-connectivity was assumed in the present study). Each connected segment is uniquely labeled with is uniquely labeled with different integer values, and its pixels are counted [12,36]. Among the labeled different integer values, and its pixels are counted [12,36]. Among the labeled segments, the largest segments, the largest one is retained; the others, considered to be noise, are removed. Figure 3d shows one is retained; the others, considered to be noise, are removed. Figure 3d shows the rasterized point the rasterized point cloud after noise filtering. cloud after noise filtering.

(a)

(b)

(c)

(d)

Figure 3. Rasterization of point cloud: (a) raw point-cloud input; (b) rasterization of raw point-cloud Figure 3. Rasterization of point cloud: (a) raw point-cloud input; (b) rasterization of raw point-cloud input; (c) rasterization after reducing noisy points passing through windows; and (d) main segment input; (c) rasterization after reducing noisy points passing through windows; and (d) main segment filtered by labeling process. filtered by labeling process.


8 of 26


8 of 26


8 of 26

Figure 4. Offset space for noise filtering process. Figure 4. Offset space for noise filtering process.

3.4. Initial Segmentation Figure 4. Offset space for noise filtering process. 3.4. Initial Segmentation Given theSegmentation noise-filtered binary map, an initial segment map is created to roughly locate single 3.4. Initial Given the noise-filtered binary map, an initial segment map is created to roughly locate single rooms. The initial segment map plays an important role in that it determines the number of rooms to rooms. The initial mapbinary plays map, an important in that the number rooms to Given thesegment noise-filtered an initial role segment mapit isdetermines created to roughly locateofsingle be segmented and also provides a starting point for detailed segmentation of each room. Assuming rooms. The initial segment map plays an important role in that it determines the number of rooms to be segmented and also provides a starting point for detailed segmentation of each room. Assuming that different rooms and corridors are connected by narrow passages such as doorways, the user be segmented and also provides a starting point for detailed segmentation of each room. Assuming that different rooms and corridors are connected by narrow passages such as doorways, the user needs needsthat to define a detection greater than the doorway. Thisasdetection window rooms andwindow corridors connected by largest narrow passages such doorways, uservisits to define different a detection window greaterare than the largest doorway. This detection windowthe visits all of all ofneeds the occupied on the binarygreater map and thedoorway. number This of unoccupied pixels visits within its to define pixels a detection window than counts the largest detection window the occupied pixels on the binary map and counts the number of unoccupied pixels within its range. range. is zero, which thatand thecounts centerthe of number the detection windowpixels is farther allIfofthe thecount occupied pixels on theindicates binary map of unoccupied withinfrom its the If the count is zero, which indicates that the center of the detection window is farther from the nearby nearby walls than its range, the algorithm classifies the center pixel as a part of the initial segments; range. If the count is zero, which indicates that the center of the detection window is farther from the wallsnearby than itswalls range, the algorithm classifies the centerthe pixel as a partasofa the initial segments; otherwise, than pixel its range, the algorithm center pixel areas. part of the initial segments; otherwise, the center is classified as aclassifies part of unsegmented Note that a circular-shaped the center pixelthe is center classified part of unsegmented areas. Note that a circular-shaped detection otherwise, pixelas is aclassified a part of unsegmented that a circular-shaped detection window is desirable, becauseasthe sharp corners of areas. other,Note polygonal-shaped detection window is desirable, because the sharp corners of other, polygonal-shaped detection windows easily detection window is desirable, because the sharp corners of other, polygonal-shaped detection windows easily bump into the boundaries of a narrow hallway, thus often failing to create the bumpwindows into the easily boundaries a narrow hallway, often hallway, failing tothus create thefailing correct bump of into the boundaries ofthus a narrow often to initial create segment the correct initial segment map. By moving the detection window onto all of the occupied pixels one by initial the segment map. window By moving the all detection onto all ofone theby occupied pixels one by map.correct By moving detection onto of the window occupied pixels one, the original binary one, the original binary map is partitioned into several isolated regions, as shown in Figure 5. Each the originalinto binary map isolated is partitioned into several isolated regions, shown in Figure 5. Each map one, is partitioned several regions, as shown in Figure 5. as Each isolated region in white isolated region in white represents a asingle ofhomogeneous, homogeneous, occupied pixels. isolated region in white represents singleroom room consisting consisting occupied pixels. The The represents a single room consisting of homogeneous, occupiedofpixels. The black areas are the occupied, blackblack areasareas are the occupied, but unsegmented pixels to be segmented in the later phases, and the are the occupied, but unsegmented pixels to be segmented in the later phases, and the but unsegmented pixels to be segmented in the later phases, and the gray areas are the unoccupied gray gray areasareas are the unoccupied pixels, such as walls or outside building environments, that do not are the unoccupied pixels, such as walls or outside building environments, that do not pixels, such as walls or outside building environments, that do not need to be segmented. need need to betosegmented. be segmented.

Figure 5. Initial segment map creation.

3.5. Opening Closure

Figure 5. Initial segment map creation. Figure 5. Initial segment map creation.

3.5. Opening Closure ThisClosure section describes in detail the main contribution of our paper: a method for creation of the 3.5. Opening watertight floor map from initial using theofmorphological illustrated in of This section describes in the detail thesegmentation main contribution our paper: aprocess method for creation This section describes in detail the main contribution of our paper: a method for creation of the Figure 6. In this study, we define a room as a region bound by vertical walls and linked to other the watertight floor map from the initial segmentation using the morphological process illustrated rooms through doorway openings. The key operation is the separation of a room from the others by watertight floor map from the initial segmentation using the morphological process illustrated in in Figure 6. In this study, we define a room as a region bound by vertical walls and linked to other closing all of the openings to them. The algorithm begins by selecting one segment (Figure 7a) from Figure 6. In thisdoorway study, we define a The roomkey as operation a region bound by vertical of walls and from linked other rooms through openings. is the separation a room thetoothers thethrough initial segment map, and findsThe the key surrounding walls (unoccupied pixels) to be closed. Forothers that by rooms doorway openings. operation is the separation of a room from the closing all of the openings to them. The algorithm begins by selecting one segment (Figure 7a) from the initial segment map, and finds the surrounding walls (unoccupied pixels) to be closed. For that


9 of 26


9 of 26

by closing all of the openings to them. The algorithm begins by selecting one segment (Figure 7a) from the initial segment map, and finds surrounding (unoccupied pixels) to be purpose, a 1-pixel-thick boundary of thethe selected initial walls segment is traced as shown in closed. Figure For 7b. that purpose, a 1-pixel-thick boundary of the selected initial segment is traced as shown in Figure 7b. Then, a new detection window is defined to find the surrounding walls: it visits all of the occupied Then, a new detection window is defined to find the surrounding walls: it visits all of the occupied pixels composing the boundary and finds any unoccupied pixels within its range. This time, the size pixels composing the boundary any unoccupied within its range. time, of the detection window is doubleand thatfinds specified in the initial pixels segmentation phase so asThis to find allthe of size of the detection window is double that specified in the initial segmentation phase so as to find the surrounding walls without exception. Additionally, a square-shaped detection window is all of the surrounding walls without exception. Additionally, square-shaped detection window is desirable, because it is advantageous to include a sharp acorner. The surrounding walls are desirable, because it is advantageous to include a sharp corner. The are represented represented in yellow in Figure 7c, which includes a doorway andsurrounding other small walls openings. Note that in yellow in Figure 7c, which includes a doorway and other small openings. Note that small openings small openings are found when the pixel size is not sufficiently smaller than the minimum wall are found when the pixel size is not sufficiently smaller than the minimum wall thickness; order thickness; in order to prevent hollow parts in a low-point-density area, sometimes inthis is to prevent hollow parts in a low-point-density area, sometimes this is unavoidable. These openings unavoidable. These openings are closed by iteratively connecting the closest pixels of two different are closed by iteratively connecting the closest pixels two different wall pixels, segments. To reduce the wall segments. To reduce the computation loads foroffinding the closest a skeletonization computation loads for finding the closest pixels, a skeletonization algorithm is adopted. The algorithm algorithm is adopted. The algorithm iteratively reduces pixels composing an object in a binary map iteratively reduces pixels composing an object in a binary map to retain a skeletal remnant that has the to retain a skeletal remnant that has the same distance from two closest pixels on the object same distance from two closest pixels on the object boundary [37]. The skeletonized image yields boundary [37]. The skeletonized image yields a 1-pixel-thick center line that preserves the topologya 1-pixel-thick center thatreducing preservesthe thesearch topology of for the finding original the walls whilepixels. reducing search of the original wallsline while space closest Thethe resulting space for finding the closest pixels. The resulting skeleton, due to the openings, is often divided into skeleton, due to the openings, is often divided into several small skeletons, one of which is labeled several small skeletons, one of which is labeled yellow and the others green, as shown in Figure 7d. yellow and the others green, as shown in Figure 7d.

Figure 6. Procedure for opening closure. Figure 6. Procedure for opening closure.

The two closest pixels are determined by computing the distances of all pixel-to-pixel The two between closest pixels are determined by computing the skeleton distancesand of all combinations two groups; see, for example, the yellow the pixel-to-pixel other, green combinations between two groups; see, for example, the yellow skeleton and the other, green skeletons skeletons in Figure 7d. The point pair with the shortest distance is determined by Equation (1), in Figure 7d. The point pair ∗with the shortest distance is determined by Equation (1), , ∗ = argmin − , = 1, ⋯ , , = 1, ⋯ , (1) ∗ ∗ ∗ Pi , Pjpair= with argmin || Pshortest = 1, · · · , n, jand = 1, · · ·are , m the pixels from two (1) i − Pj ||, i distance; , ∗ is the pixel the where different and are the numbers of pixels composing the pixelgroups, respectively; and ∗ ∗ where Pi ,skeletons. Pj is theApixel pair with line the shortest and Pj 8a), are the pixels from two different respective 1-pixel-thick (the reddistance; pixels inPiFigure which is intersected by the straight line connecting theand closest two pixels, is found of bypixels Bresenham’s algorithm [38] andskeletons. used to pixel groups, respectively; n and m are the numbers composing the respective close the opening. Then, of in theFigure connected greenis skeletons thestraight 1-pixel-thick red line are A 1-pixel-thick line (the redboth pixels 8a), which intersectedand by the line connecting the combined the isyellow This process operates ofopening. the greenThen, skeletons closest twowith pixels, foundskeleton. by Bresenham’s algorithm [38] recursively and used tountil closeall the both are combined with the yellow skeleton 8b,c). Note thatare thiscombined iterative with process skipped if the of the connected green skeletons and the(Figure 1-pixel-thick red line theisyellow skeleton. skeletonized wall initially has only one opening or none. At the final iteration, all of the skeletonized This process operates recursively until all of the green skeletons are combined with the yellow skeleton walls are8b,c). combined to form a single skeleton, inevitably leaves one wall last opening to be closed (Figure Note that this iterative process iswhich skipped if the skeletonized initially has only one (Figure opening8d). or none. At the final iteration, all of the skeletonized walls are combined to form a single skeleton, which inevitably leaves one last opening to be closed (Figure 8d).


10 of 26


10 of 26


10 of 26

(a)

(b)

(a)

(b)

(c)

(d)

(c) (d) Figure 7. Process of wall detection: (a) selection of one initial segment; (b) boundary tracing; (c) Figure 7. Process of wall detection: (a) selection of one initial segment; (b) boundary tracing; detection; and (d) skeletonization of surrounding Figure 7. Process of wall detection: (a) selectionwalls. of one initial segment; (b) boundary tracing; (c)

(c) detection; and (d) skeletonization of surrounding walls. detection; and (d) skeletonization of surrounding walls.

(a)

(b)

(a)

(b)

(c)

(d)

(c) (d) Figure 8. Example of opening closure: (a–c) openings on skeletonized walls are sequentially found and closed; and (d) of allopening skeletonized walls are openings combinedon to skeletonized form a single walls skeleton. Figure 8. Example closure: (a–c) are sequentially found

Figure 8. Example of opening closure: (a–c) openings on skeletonized walls are sequentially found and and closed; and (d) all skeletonized walls are combined to form a single skeleton. closed; and (d) all skeletonized walls are combined to form a single skeleton.


11 of 26


11 of 26

This problem is resolved by temporarily creating a new larger opening than the previous one, This problem is resolved by temporarily creating a new larger opening than the previous one, which enables finding the closest pixels in the last opening. Note that the new opening should be which enables finding the closest pixels in the last opening. Note that the new opening should be located on the main branch of the skeleton tree; otherwise, the closest two pixels will be found in located on the main branch of the skeleton tree; otherwise, the closest two pixels will be found in the the new opening, not in the last opening. The above-noted conditions can be achieved by removing new opening, not in the last opening. The above-noted conditions can be achieved by removing all of all of the small branches. To that end, all of the end pixels and junctions of the last skeleton are the small branches. To that end, all of the end pixels and junctions of the last skeleton are found and found and excluded together with their nearby skeleton pixels within the range of a new detection excluded together with their nearby skeleton pixels within the range of a new detection window window whose size and shape are the same as those of the window used in the rasterization phase. whose size and shape are the same as those of the window used in the rasterization phase. This is This is valid, because the small branches of the last skeleton cannot be greater than half the detection valid, because the small branches of the last skeleton cannot be greater than half the detection window size. As a result, in the example of Figure 9a, the single skeleton is temporarily broken window size. As a result, in the example of Figure 9a, the single skeleton is temporarily broken into into five smaller skeletons (yellow lines), the longest of which is selected for the temporary opening. five smaller skeletons (yellow lines), the longest of which is selected for the temporary opening. In In Figure 9b, the original skeleton is broken again into two smaller skeletons (green and yellow) by the Figure 9b, the original skeleton is broken again into two smaller skeletons (green and yellow) by the temporary opening, which allows for finding of the two closest pixels in the last opening by computing temporary opening, which allows for finding of the two closest pixels in the last opening by the shortest distance, as shown in Figure 9c. Finally, the temporary opening is restored back to the computing the shortest distance, as shown in Figure 9c. Finally, the temporary opening is restored wall, resulting in the closed room shown in Figure 9d. The entire process is repeated until all of the back to the wall, resulting in the closed room shown in Figure 9d. The entire process is repeated until surrounding walls of each initial segment are completely closed to create a watertight floor map, on all of the surrounding walls of each initial segment are completely closed to create a watertight floor which, in the next phase, each room can be segmented by a labeling process. map, on which, in the next phase, each room can be segmented by a labeling process.

(a)

(b)

(c)

(d)

Figure 9. 9. Closure last opening: opening: (a) skeleton tree; tree; (b) (b) creation creation of of Figure Closure of of last (a) removal removal of of small small branches branches on on skeleton temporary opening; opening;(c)(c) finding of two closest last opening and their and closure; and (d) temporary finding of two closest pixelspixels in lastin opening and their closure; (d) formation formation of watertight room. of watertight room.

3.6. Room Labeling and Refinement A labeling process is performed on the generated watertight floor map to separate rooms from each other. In the labeling process, each connected segment is uniquely labeled and sorted according to the occupied pixel count. The number of top largest segments, which are supposed to be classified as rooms, is determined by the initial segmentation phase. Typically, the other remaining segments


12 of 26

3.6. Room Labeling and Refinement A labeling process is performed on the generated watertight floor map to separate rooms from each other. In the labeling process, each connected segment is uniquely labeled and sorted according ISPRS J. Geo-Inf.pixel 2017, 6, 206 The number of top largest segments, which are supposed to be classified 12 of 26 to theInt. occupied count. as rooms, is determined by the initial segmentation phase. Typically, the other remaining segments do not include a sufficient number of occupied pixels to be classified as individual rooms, and thus do not include a sufficient number of occupied pixels to be classified as individual rooms, and thus should be reclassified among the above-classified rooms. Likewise, the occupied pixels used for should be reclassified among the above-classified rooms. Likewise, the occupied pixels used for closing closing the openings (red pixels in Figures 8 and 9) should also be classified. In the refinement phase, the openings (red pixels in Figures 8 and 9) should also be classified. In the refinement phase, we we investigate all of the eight-connectivity neighboring pixels of those occupied but unclassified investigate all of the eight-connectivity neighboring pixels of those occupied but unclassified segments, segments, and group the neighboring pixels according to their labels. The dominant label is and group the neighboring pixels according to their labels. The dominant label is determined by determined by counting the number of occupied pixels of each group, and is then used to label the counting the number of occupied pixels of each group, and is then used to label the unclassified unclassified segments. The refinement process operates recursively until no other unclassified segments. The refinement process operates recursively until no other unclassified segments are found. segments are found. Examples before and after application of refinement are illustrated in Figure 10, Examples before and after application of refinement are illustrated in Figure 10, where the unclassified where the unclassified segments in black are classified as green according to the dominant label of segments in black are classified as green according to the dominant label of the neighboring pixels. the neighboring pixels.

(a)

(b)

Figure 10. Example of refinement for unclassified segments colored in black: (a) before; and (b) after Figure 10. Example of refinement for unclassified segments colored in black: (a) before; and (b) refinement. after refinement.

Over-segmentation can occur when the initial segmentation incorrectly finds single rooms due Over-segmentation can occur when initial segmentation single rooms to hollow parts on the rasterized point the cloud. The main factorsincorrectly leading tofinds hollow parts are due the to hollow parts on the rasterized point cloud. The main factors leading to hollow parts are the presence of low-density areas or unfiltered occlusions. Typically, low-density areas can be prevented presence low-density areas or unfiltered Typically, low-density areas can be prevented either by of selecting the proper pixel size for occlusions. rasterization or by performing multiple scans. However, either by selecting the proper pixel size for rasterization or by performing multiple scans. However, significant clutter passing through the ceiling and floor, such as stairs, columns, or pipelines, significant clutter passing through the ceiling and floor, such as stairs, columns, or pipelines, inevitably inevitably produces hollow parts on the rasterized binary map. Since hollow parts sometimes are produces on the rasterized binary map. Since hollow parts are identified as identifiedhollow as wallsparts (unoccupied pixels), as shown in Figure 11a, they cansometimes lead to over-segmentation, walls (unoccupied pixels), shown in Figure 11a, theywe caninvestigate lead to over-segmentation, as shown in as shown in Figure 11b,c.asTo address this problem, all of the eight-connectivity Figure 11b,c. To address this problem, we investigate all of the eight-connectivity neighboring pixels neighboring pixels of individual rooms, and classify them as occupied or unoccupied. If the number of individual rooms, and classify them as occupied or unoccupied. If the number of occupied pixels of occupied pixels is greater than the number of unoccupied pixels, indicating that more than half of is than the number unoccupied pixels, that more half of the thegreater neighboring pixels are of occupied by the otherindicating rooms, which wouldthan be against ourneighboring assumption pixels are occupied by the other rooms, which would be against our assumption that room is a that a room is a region bounded by walls (i.e., unoccupied pixels), the current room is areclassified region walls (i.e., unoccupied the current room is 11d). reclassified among the other amongbounded the otherbyrooms according to thepixels), dominant label (Figure This process operates rooms according to the dominant label (Figure 11d). This process operates recursively until all of the recursively until all of the individual rooms are tested, thus creating the final segment map. individual rooms are tested, thus creating the final segment map. Subsequently, the original 3D scan Subsequently, the original 3D scan points are segmented according to their projected 2D locations on points aresegment segmented according to their projected 2D locations on the final segment map. the final map.


13 of 26


13 of 26

(a)

(b)

(c)

(d)

Figure 11. Example of refinement process for over-segmentation: (a–c) incorrect segmentation due to

Figure 11.hollow Example process for over-segmentation: incorrect segmentation due parts;of andrefinement (d) merging of over-segmented room (colored in red) (a–c) into nearby room (colored in to hollow blue). parts; and (d) merging of over-segmented room (colored in red) into nearby room (colored in blue). 4. Experiments and Results

4. Experiments and Results 4.1. Evaluation with Real-World Data Sets We evaluated the proposed approach using two real-world data sets acquired by a phase-based

4.1. Evaluation with Real-World Data laser scanner, FARO Focus 3D. Sets The first data set contained approximately 66 million scan points, including a long corridor that connects seven rooms. The second data set contained approximately We evaluated the proposed approach using two real-world data sets acquired by a phase-based 31 million scan points, including five rooms and stairs ascending into the upper floor. The scanning laser scanner, FARO Focus 3D. The first data setocclusions contained 66 in million scan points, results for these sites, as representative of severe andapproximately clutter, are illustrated Figure 12. including We a long corridorthe that connects seven rooms. approach The second data set implemented proposed room segmentation in MATLAB forcontained prototyping,approximately and the experiment onfive a laptop computer with Intel Core i7 (2.3into GHz,the 16.0upper GB of RAM). 31 millionperformed scan points, including rooms and stairs ascending floor. The scanning The test parameters are listed in Table 2. Except for parameter selection, our approach is results for these sites, as representative of severe occlusions and clutter, are illustrated in Figure 12. automatic and does not require any user intervention. In the height estimation phase, considering We implemented the proposed room segmentation approach in MATLAB for interval prototyping, and the distance accuracy of the laser scanner (±2 mm for Focus 3D), a sufficiently large height of performed0.1the experiment on a laptop with Intel Core constituting i7 (2.3 GHz, GB offloor RAM). m was adopted for both data setscomputer to effectively detect the points the 16.0 ceiling and in the histogram. In noise in filtering to properly remove theselection, noisy points while preserving The test parameters arethelisted Tablephase, 2. Except for parameter our approach is automatic the structural detail, the offset value should be slightly less than the distance between the ceiling and and does not require any user intervention. In the height estimation phase, considering the distance the highest points of openings or furniture. In this study, the distance was measured to 0.56 m for the accuracy of the laser scanner mm Focus a sufficiently largevalues height interval first data set and to 0.45(± m2for the for second data3D), set; accordingly, the offset of 0.5 and 0.4of m0.1 m was adopted for both dataforsets to effectively detect the respectively. points constituting theoffset ceiling and floor in the were adopted the first and the second data sets, Note that these spaces were choices; certainly, one to could alter or skip this process to extend the while segmentation histogram.ourInsubjective the noise filtering phase, properly remove the noisy points preserving the rasterization, a large pixel size can reduce processing time, but a too-large pixel size structural capabilities. detail, theFor offset value should be slightly less than the distance between the ceiling and the incurs large distortions in wall boundaries. By contrast, a small pixel size preserves the precise highest points of openings or furniture. In this study,processing the distance was to parts 0.56 m for the boundaries, but a too-small pixel size greatly increases time and canmeasured lead to hollow first data set to 0.45areas. m for the second data set; accordingly, the offset of 0.5 and for and low-density Therefore, the pixel size should be smaller than the sizevalues of the detectable wall0.4 m were thickness, but and should notsecond be too small prevent hollow parts on the binary map. Inoffset this study, the were our adopted for the first the datatosets, respectively. Note that these spaces subjective choices; certainly, one could alter or skip this process to extend the segmentation capabilities. For rasterization, a large pixel size can reduce processing time, but a too-large pixel size incurs large distortions in wall boundaries. By contrast, a small pixel size preserves the precise boundaries, but a too-small pixel size greatly increases processing time and can lead to hollow parts for low-density areas. Therefore, the pixel size should be smaller than the size of the detectable wall thickness, but should not be too small to prevent hollow parts on the binary map. In this study, the pixel sizes of 0.05 and 0.03 m were adopted for the first and the second data sets, respectively, resulting in binary maps of 459 by 446 pixels for the first data set and 203 by 249 pixels for the second.


14 of 26

ISPRS Int.sizes J. Geo-Inf. 2017, 6, 206 14 of 26 pixel of 0.05 and 0.03 m were adopted for the first and the second data sets, respectively, resulting

in binary maps of 459 by 446 pixels for the first data set and 203 by 249 pixels for the second.

(a)

(b) Figure 12. Point-cloud Input: first (a); and second (b) data sets, representing severe occlusion and Figure 12. Point-cloud Input: first (a); and second (b) data sets, representing severe occlusion and clutter. Ceiling points are omitted for clarity. clutter. Ceiling points are omitted for clarity. Table 2. Parameter selections for two real-world data sets (unit: meters). Table 2. Parameter selections for two real-world data sets (unit: meters). Process Phase Height estimation Process Phase Rasterization HeightNoise estimation filtering Rasterization Initial segmentation

Noise filtering Opening closure Initial segmentation Opening closure

Parameter (s) Interval(s) Parameter Pixel size Interval offset Noise-filtering Pixel size Detection window size for finding initial segments Noise-filtering offset Detection window size for finding surrounding walls Detection window size for finding initial segments Detection window size for pruning small branches

Detection window size for finding surrounding walls Detection window size for pruning small branches

Data 1 Data 2 Remark 0.10 0.10 User-specified Data 1 Data 2 Remark 0.05 0.03 User-specified 0.10 0.10 User-specified 0.50 0.40 User-specified 0.05 0.03 User-specified User-specified 1.55 0.99 0.50 0.40 User-specified 3.10 1.98 Automatic 1.55 0.99 User-specified 1.55 0.99 Automatic

3.10 1.55

1.98 0.99

Automatic Automatic

The rasterization phase requires, for initial segmentation, one additional parameter: the detection window size. A large detection window size increases the number of pixels to be The rasterization phase requires, for initial segmentation, one additional parameter: the detection investigated within its range, thus requiring excessive processing time. In addition, a large detection window size. largenot detection window size increases the number of pixels to be investigated within window sizeAdoes necessarily equate to better performance; in fact, the detection window size is its recommended range, thus requiring processing In addition, a large detection size does to be justexcessive slightly greater than time. the existing openings. In practice, thewindow user is required nottonecessarily equateofto better performance; fact, detection window size is recommended define the radius the detection window; asinfor thethe diameter of the detection window for initial to be just slightly(Table greater existing openings. In practice, the and user adding is required define segmentation 2),than it isthe determined by doubling the radius one topixel sizethe radius of the detection window; as for the diameter of the forwindow initial segmentation representing the detection window center. By contrast, thedetection other twowindow detection sizes in the (Table 2), it is determined by automatically doubling the radius and adding onetopixel sizedetection representing the detection opening-closure phase are determined according the first window size in window center. Byphase. contrast, the other two detection window sizes in the opening-closure phase are the rasterization Figure 13a,b shows according the generated segment maps forwindow the first size and in second real-world data sets, automatically determined to the first detection the rasterization phase. respectively. Theshows total time consumption was 22.27 seconds forfirst the and first second data setreal-world and 16.78 seconds Figure 13a,b the generated segment maps for the data sets, for the second. the original 3D points were segmented, as shown in Figure 13c,d, according to respectively. The Then, total time consumption was 22.27 s for the first data set and 16.78 s for the second. their projected on the segment maps. As isinapparent, the proposed approach Then, the original2D 3Dlocations points were segmented, as shown Figure 13c,d, according to their performed projected 2D well with the first real-world data set. Note that some holes in the blue point cloud were locations on the segment maps. As is apparent, the proposed approach performed well withdue the to first occlusions by furniture at the time of scanning. In the second data-set result, however, we at real-world data set. Note that some holes in the blue point cloud were due to occlusions by furniture a problem, which is that the algorithm, as shown in Figure 14a, failed to decompose the theencountered time of scanning. In the second data-set result, however, we encountered a problem, which is that first segment into two different rooms. The main factor leading to this incorrect segmentation was the algorithm, as shown in Figure 14a, failed to decompose the first segment into two different rooms. The main factor leading to this incorrect segmentation was the noisy scan points that filled the empty space between two wall surfaces as shown in Figure 14b, and which were projected onto the binary map and segmented together with the other (correctly) occupied pixels. The diffused noisy points

ISPRS Int. J. Geo-Inf. 2017, 6, 206 ISPRS Int. J. Geo-Inf. 2017, 6, 206

15 of 26 15 of 26

theInt. noisy scan 2017, points that filled the empty space between two wall surfaces as shown in Figure 14b, ISPRS J. Geo-Inf. 6, 206 15 of 26 the points that filled two wall surfaceswith as shown in Figure 14b, andnoisy whichscan were projected onto the theempty binaryspace map between and segmented together the other (correctly) and which were projected onto the binary map and segmented together with the other (correctly) occupied pixels. The diffused noisy points were mainly due to a mirror in the scan area [39]. To were mainly due toThe a mirror in the scan area [39]. To enhance thea current noise-filtering phase, one occupied pixels. diffused noisy points were mainly due would to mirror in the scan area [39]. To enhance the current noise-filtering phase, one possible means be planar detection [40], which possible means would be planar detection [40], which will be the focus of one of our future studies. enhance the current noise-filtering phase, one possible means would be planar detection [40], which will be the focus of one of our future studies. will be the focus of one of our future studies.

(a) (a)

(b) (b)

(c) (d) (c) (d) Figure 13. Segmentation results for two real-world data sets: (a,b) 2D segment maps; and (c,d) 3D Figure 13. Segmentation results for two real-world data sets: (a,b) 2D segment maps; and (c,d) 3D Figure 13. Segmentation segmented point clouds. results for two real-world data sets: (a,b) 2D segment maps; and (c,d) 3D segmented point clouds. segmented point clouds.

(a) (b) (a) (b) Figure 14. Original point cloud of second real-world data set: (a) entire point cloud in plan view; and Figure 14.points Original point cloud of second real-world data set: (a) entire point cloud in plan view; and (b) noisy between two wall surfaces. Figure 14. Original point cloud of second real-world data set: (a) entire point cloud in plan view; and (b) noisy points between two wall surfaces. (b) noisy points between two wall surfaces.

Additionally, we evaluated our room segmentation approach, as shown in Figure 15, using we evaluated our room approach, as shown in 15, using four Additionally, publicly available point-cloud data setssegmentation [41]. The details of the experiments areFigure summarized in four publicly available point-cloud data sets [41]. The details of the experiments are summarized in Additionally, we evaluated our room segmentation approach, as shown in Figure 15, using Table 3. For each data set, the pixel sizes were selected so as to be smaller than the size of the Table 3. For each datapoint-cloud set, sizes were selected so parts asoftothe bethe smaller the size the four publicly available data sets [41]. The details experiments are detectable wall thickness, butthe notpixel too small to prevent hollow on binarythan map. Forsummarized the of initial detectable wall thickness, but not too small to prevent hollow parts on the binary map. For the initial in segmentation, Table 3. For each data set, pixel sizes weresizes selected as to be smaller of the meanwhile, thethe detection window were so selected to be greaterthan thanthe thesize existing segmentation, meanwhile, window sizes were selected tothe be binary greatermap. than the existing detectable thickness, butthe notdetection too small to prevent hollow parts Forwere the initial openingswall measured in the point cloud. The height estimation andonnoise-filtering offset not openings measured in the point cloud. The height estimation and noise-filtering offset were not segmentation, meanwhile, the detection window sizes were selected to be greater than the existing openings measured in the point cloud. The height estimation and noise-filtering offset were not considered in the experiments with the publicly available data sets, due to the different ceiling levels


16 of 26


16 of 26

of the rooms. Compared with the above-mentioned two real-world data sets, the publicly available considered in the experiments with the publicly available data sets, due to the different ceiling levels dataofsets more challenging, due to the complicated shapes andpublicly incomplete scans. the were rooms. Compared with the above-mentioned two architectural real-world data sets, the available Nevertheless, the more segment maps yielded good results for separation of rooms with straight data sets were challenging, due togenerally the complicated architectural shapes and incomplete scans. segment boundaries. In Figure 15a, the first publicly available data set is a relatively simple Nevertheless, the segment maps yielded generally good results for separation of rooms with structure, straight andsegment its two rooms were In correctly bypublicly the proposed approach, in Figure 15b. boundaries. Figure segmented 15a, the first available data setasisshown a relatively simple With the second setrooms in Figure the algorithm successfully separated the small segments structure, and data its two were15c, correctly segmented by the proposed approach, as shown in (room numbers 1, 3, and 4 in Figure from the 15c, mainthe hall (room number 2 in Figure 15d),the butsmall failed Figure 15b. With the second data15d) set in Figure algorithm successfully separated to decompose the main hall 1, into a room a corridor. the fact that 2the segments (room numbers 3, and 4 in and Figure 15d) fromThis the was maindue hallto (room number in corridor Figure 15d), but decompose the main hall into a room andare a corridor. Thiswith was narrower due to thepassages): fact that violated onefailed of ourtoassumptions (i.e., two different regions connected the corridor oneto of the our other assumptions two different regions connected with narrower it was directly violated connected rooms (i.e., without doorways, and are thus the algorithm failed to passages): it was directly connected to the other rooms without doorways, and thus the algorithm create proper initial segments. On the other hand, with the third data set in Figure 15e, the algorithm failed to create initial segments. On1,the other4;hand, the8third data set Figure 15e, the over-produced theproper segments (room numbers 2, and and 6,with 7, and in Figure 15f)indue to the hollow algorithm over-produced the segments (room numbers 1, 2, andmeticulous 4; and 6, 7, scanning and 8 in Figure 15f)spaces due parts stemming from incomplete scans. To avoid this problem, without to the hollow parts stemming from incomplete scans. To problem, meticulous scanning is required so as to minimize occluded areas. However, foravoid cases this in which sufficient scanning is not without spaces is required so as to minimize occluded areas. However, for cases in which sufficient available, an effective refinement strategy will be taken into consideration in future work. Finally, as scanning is not available, an effective refinement strategy will be taken into consideration in future indicated in Figure 15g,h, the algorithm yielded overall good agreements with the experimental data. work. Finally, as indicated in Figure 15g,h, the algorithm yielded overall good agreements with the Note though that due to the small pixel size and the greater number of rooms, the processing time was experimental data. Note though that due to the small pixel size and the greater number of rooms, longer than for the other data sets. The several factors affecting processing time will be discussed in the processing time was longer than for the other data sets. The several factors affecting processing detail in the following section. time will be discussed in detail in the following section.

(a)

(b)

(c)

(d)

(e)

(f)

Figure 15. Cont.


17 of 26


17 of 26

(g)

(h)

Figure 15. Segmentation results for publicly available point-cloud data sets: (a,c,e,g) input point Figure 15. Segmentation results for publicly available point-cloud data sets: (a,c,e,g) input point clouds; clouds; and (b,d,f,h) resultant segment maps. and (b,d,f,h) resultant segment maps. Table 3. Summary of experiments with four publicly available point-cloud data sets. Table 3. Summary of experiments with four publicly available point-cloud data sets.

Data No.

Data No.1 1 2 2 3 3 4 4

Data Size (million Data Sizepoints) (million 37.69 points) 40.71 37.69 40.71 33.93 33.93 25.17 25.17

Pixel Size (meters) Pixel Size (meters) 0.15 0.20 0.15 0.20 0.10 0.10 0.05 0.05

Detection Window Size (meters) Detection Window Size (meters) 4.65 4.60 4.65 4.60 1.90 1.90 1.85 1.85

Processing Time (sec)Time Processing (sec) 25.93 26.74 25.93 26.74 23.57 23.57 52.67 52.67

4.2. Evaluation with Synthetic Data Sets 4.2. Evaluation with Synthetic Data Sets In order to check the feasibility of the proposed approach for the data sets with nonlinear architectural numbers of rooms, we generated five floor that In order toshapes checkand thelarger feasibility of the proposed approach forsynthetic the data setsplans. withNote nonlinear all of the tests were directly the 2Dwe binary map. In Figure 16a, floor the first binary map architectural shapes and larger performed numbers ofonrooms, generated five synthetic plans. Note that includes the greatest number of rooms connected by a long corridor that makes a large loop, but its all of the tests were directly performed on the 2D binary map. In Figure 16a, the first binary map overallthe architectural shape is simple, consisting rectangular-shaped In Figure includes greatest number ofrelatively rooms connected by a long of corridor that makes arooms. large loop, but its 16c, the second binary map is still limited to straight boundaries, but the architectural shape is more overall architectural shape is relatively simple, consisting of rectangular-shaped rooms. In Figure 16c, thanmap the first one.limited In Figure the third binary map same architectural shape thecomplicated second binary is still to 16e, straight boundaries, buthas thethe architectural shape is more as the second one, but includes many hollow parts for consideration of scan holes due to low-density complicated than the first one. In Figure 16e, the third binary map has the same architectural shape as areas or tall furniture reaching to the ceiling level. In Figure 16g, the fourth binary map includes the second one, but includes many hollow parts for consideration of scan holes due to low-density some nonlinear architectural shapes and small openings. Finally, the fifth binary map in Figure 16i areas or tall furniture reaching to the ceiling level. In Figure 16g, the fourth binary map includes some shows the most complicated case among the synthetic data sets: the architectural shape is described nonlinear architectural shapes and small openings. Finally, the fifth binary map in Figure 16i shows by a set of highly nonlinear boundaries. Notably, each room on all of the synthetic binary maps the most complicated case among the synthetic data sets: the architectural shape is described by a set includes one or more openings to other rooms. of highly nonlinear Notably, on all of thethat synthetic binary maps includes one The detailed boundaries. test specifications are each listedroom in Table 4. Note the time consumption does not or more openings to other rooms. Table 4 indicates that the time consumption is influenced by four include point-cloud rasterization. The detailed test specifications are listed inmap Table 4. detection Note thatwindow the time consumption does not factors: number of rooms, hollow parts, and the and sizes in the rasterization include point-cloud rasterization. Table 4 indicates that the time consumption is influenced four phase. A larger number of rooms (data set 1) and hollow parts (data set 3) undoubtedly incur by more factors: of rooms, hollow and the the number map andofdetection sizes in the Likewise, rasterization time number consumption, because theyparts, increase openingswindow to be investigated. a phase. larger number of increases rooms (data 1)consumption. and hollow parts set 3)size undoubtedly incur largeAmap size certainly the set time Since(data the map is determined by more the time consumption, because theyphase, increase theneed number of openings topixel be investigated. Likewise,ofa the large pixel size in the rasterization users to select the proper size for minimization computational loads but preservation of the precise architectural shapes; wedetermined recommendby thethe usepixel of map size certainly increases the time consumption. Since the map size is wallphase, thickness. this test, the pixel size of pixel 5 cm size was for assumed for all ofofthe sizehalf in the minimum rasterization usersInneed to select the proper minimization the synthetic binary maps. The detection window size greatly influences the processing time; for computational loads but preservation of the precise architectural shapes; we recommend the use of the segmentation of data set 4test, became much more consuming even halfexample, the minimum wall thickness. In this the pixel size of 5time cm was assumed for with all ofthe thesmallest synthetic number of rooms. As we discussed a large detection window size does not necessarily equate binary maps. The detection windowabove, size greatly influences the processing time; for example, the to better performance. Therefore, a smaller detection window size is to be preferred, as long as it segmentation of data set 4 became much more time consuming even with the smallest number of provides fordiscussed proper initial segmentation. rooms. As we above, a large detection window size does not necessarily equate to better performance. Therefore, a smaller detection window size is to be preferred, as long as it provides for proper initial segmentation.

ISPRS Int. J. Geo-Inf. 2017, 6, 206 ISPRS Int. J. Geo-Inf. 2017, 6, 206

18 of 26 18 of 26

Table 4. Test specifications for synthetic data sets. Table sets.

Data Architectural No. of Room Data Architectural No. of Room No. Shape (segment/true) No. Shape (segment/true) 1 Linear 28/28 1 Linear 28/28 2 Linear 14/14 2 Linear 14/14 3 Linear 16/14 3 Linear 16/14 4 Nonlinear 9/9 4 Nonlinear 9/9 5 Nonlinear 17/17 5 Nonlinear 17/17

Map Size Map Size (pixels) (pixels) 421 by 704 421 by 704 359 by 606 359 by 606 359 by 606 359 by 606 359 by 606 606 359 by 359 by 606 359 by 606

Detection Window Detection Window Size Size(meters) (meters) 1.05 1.05 1.05 1.05 1.05 1.05 1.55 1.55 1.05 1.05

Processing Processing Time (sec) Time (sec) 46.62 46.62 7.98 7.98 10.75 10.75 27.69 27.69 8.66 8.66

The resultant segment maps are compared with the input binary maps in Figure 16. Overall, the The resultant segment maps are compared with the input binary maps in Figure 16. Overall, proposed approach was proved to be not susceptible to either the number of rooms or complex the proposed approach was proved to be not susceptible to either the number of rooms or complex architectural shapes. However, in Figure 16f, compared with Figure 16d, two regions (room architectural shapes. However, in Figure 16f, compared with Figure 16d, two regions (room numbers numbers 2 and 13) were found to be over-segmented. This problem was due to the fact that several 2 and 13) were found to be over-segmented. This problem was due to the fact that several hollow hollow parts prevented the detection window from entering those rooms, thus leading to incorrect parts prevented the detection window from entering those rooms, thus leading to incorrect initial initial segmentation. In the refinement phase, if the number of neighboring, unoccupied pixels of a segmentation. In the refinement phase, if the number of neighboring, unoccupied pixels of a segment is segment is greater than the number of occupied ones, the algorithm is not able to distinguish the greater than the number of occupied ones, the algorithm is not able to distinguish the over-segmented over-segmented room from the other, real rooms. To prevent hollow parts on the binary map, users room from the other, real rooms. To prevent hollow parts on the binary map, users need to either select need to either select a greater pixel size for rasterization or perform multiple scans to avoid a greater pixel size for rasterization or perform multiple scans to avoid low-density areas. low-density areas.

(a)

(b)

(c)

(d)

(e)

(f) Figure 16. Cont.


19 of 26


19 of 26

(g)

(h)

(i)

(j)

Figure 16. Segmentation results for synthetic data sets: (a,c,e,g,i) input binary maps; and (b,d,f,h,j)

Figure 16. Segmentation results for synthetic data sets: (a,c,e,g,i) input binary maps; and (b,d,f,h,j) resultant segment maps. resultant segment maps.

4.3. Comparison with Existing Methods Using Publicly Available Data Sets

4.3. Comparison with Existing Methods Using Publicly Available Data Sets

Additionally, we compared our room segmentation with those of five popular existing methods

Additionally, we compared our room segmentation withUse those fivedata popular using 20 publicly available non-furnished data sets [19]. of of these sets existing reflected methods our occlusions and clutterdata are properly by means of point-cloud projection. usingassumption 20 publicly that available non-furnished sets [19].reduced Use of these data sets reflected our assumption Figure 17 provides a visual comparison of the by segment created by each of the algorithms. Due that occlusions and clutter are properly reduced meansmaps of point-cloud projection. Figure 17 provides to the length limit of this paper, only four segmentation results, those that best describe the strengths a visual comparison of the segment maps created by each of the algorithms. Due to the length limit limitations of the proposed approach, are listed in the figure. Figure 17a–g depicts the ground of thisand paper, only four segmentation results, those that best describe the strengths and limitations truths by human segmentation, the morphological segmentation [22], the distance transform-based of the proposed approach, are listed in the figure. Figure 17a–g depicts the ground truths by human segmentation [23], the Voronoi graph-based segmentation [20], the feature-based segmentation [33], segmentation, the morphological segmentation [22], distance transform-based segmentation [23], the Voronoi random field segmentation [34], and ourthe room segmentation, respectively. the Voronoi [20], segmentation the Voronoi random At graph-based this stage, onesegmentation should note that we the usedfeature-based the existing ground truths and[33], segmentation results field segmentation [34], room respectively. available online [42]and to our avoid any segmentation, subjectivities and ambiguities arising from reproducing the At this stage, note we used and the existing truths and segmentation results segment mapsone withshould different testthat specifications parameterground adjustments. Overall, the existing methods offer irregular or ragged segments, arereproducing susceptible tothe severe available online [42] to avoid any subjectivities and ambiguities arisingand from segment overor under-segmentation. As indicated in Figure 17b, the morphological approach resulted in maps with different test specifications and parameter adjustments. highly ragged segmentations first, third, and fourth data and sets,are andsusceptible showed severe Overall, the existing methodswith offerthe irregular or ragged segments, to severe under-segmentation with the second data set (the brown-colored regions). Likewise, the distance over- or under-segmentation. As indicated in Figure 17b, the morphological approach resulted in highly transform-based approach (Figure 17c) suffered from arbitrarily segmented boundaries over the test ragged segmentations with the first, third, and fourth data sets, and showed severe under-segmentation data sets. The Voronoi graph-based approach (Figure 17d) enabled straight segment boundaries, but with the second data set (the brown-colored regions). Likewise, the second distance transform-based approach severe over-segmentation was found for the long corridor in the data set and on the main (Figure 17c) suffered from arbitrarily segmented boundaries over the test data sets. The Voronoi hall in the third data set. The feature-based approach (Figure 17e) showed the worst results: highly graph-based approach (Figure 17d) enabled straight segment boundaries, but severe over-segmentation ragged boundaries and severe under-segmentation over the test data sets. Among the existing was found forthe theVoronoi long corridor theapproach second (Figure data set and on the the main in the third data set. methods, random in field 17f) generated besthall overall segmentation results, thoughapproach over-segmentation boundaries still anhighly issue, as shown boundaries for the third and The feature-based (Figure with 17e) ragged showed the worstwas results: ragged data set. With regardover to Figure 17g, visual suggested that our approach has distinct severetest under-segmentation the test data sets.inspection Among the existing methods, the Voronoi random advantages over the existing methods: it is resistant to overor under-segmentation, and provides field approach (Figure 17f) generated the best overall segmentation results, though over-segmentation straight segment boundaries. More specifically, compared with the ground truths, our approach with ragged boundaries was still an issue, as shown for the third test data set. With regard to Figure 17g, yielded quite similar segmented rooms and corridors for the second and third data sets. For the first visual inspection suggested that our approach has distinct advantages over the existing methods: it is and fourth data sets, our approach also afforded satisfactory results in the separation of rooms, but resistant to overor under-segmentation, and provides straight More tended to some under- or over-segmentation of corridors. Forsegment example,boundaries. with the first dataspecifically, set in compared with the ground truths, our approach yielded quite similar segmented rooms corridors Figure 17g, the two narrow corridors in the upper-left and lower-right corners were notand properly for the second from and third datahall sets.because, For theasfirst and fourth data sets, our approach also afforded separated the main is the case in Figure 15e, they violated our second satisfactory results in the separation of rooms, but tended to some under- or over-segmentation of corridors. For example, with the first data set in Figure 17g, the two narrow corridors in the upper-left and lower-right corners were not properly separated from the main hall because, as is the


20 of 26

ISPRS J. Geo-Inf. 206 20 of 26 case in Int. Figure 15e,2017, they6, violated our second assumption (i.e., two different regions are connected with narrower passages). The fourth data set, as shown in Figure 17g, contained several different-sized assumption (i.e., two different regions are connected with narrower passages). The fourth data set, passages; thus, in the initial segmentation phase, some corridors could not be properly separated by as shown in Figure 17g, contained several different-sized passages; thus, in the initial segmentation means of a single detection window size. phase, some corridors could not be properly separated by means of a single detection window size.

(a)

(b)

(c)

(d)

(e)

(f)

Figure 17. Cont.


21 of 26


(g)

21 of 26


21 of 26

(g)

Figure 17. Comparison of room segmentation approaches with non-furnished data sets: (a) ground Figure 17. Comparison of room segmentation approaches with non-furnished data sets: (a) ground truth by manual segmentation; (b) morphological segmentation; (c) distance transform-based truth by manual segmentation; (b) morphological segmentation; (c) distance transform-based segmentation; (d) Voronoi graph-based segmentation; (e) feature-based segmentation; (f) Voronoi segmentation; (d) Voronoi graph-based segmentation; (e) feature-based segmentation; (f) Voronoi random field segmentation; and (g) our room segmentation. randomFigure field 17. segmentation; and (g)segmentation our room segmentation. Comparison of room approaches with non-furnished data sets: (a) ground truth by manual segmentation; (b) morphological segmentation; (c) distance transform-based Meanwhile, occlusions and clutter on occupancy grid maps cannot be avoided, as they are often segmentation; (d) Voronoi graph-based segmentation; (e) feature-based segmentation; (f) Voronoi generated from the 2D horizontal sensor at the height abovebethe floor. Given thisare Meanwhile, occlusions and clutter onreadings occupancy grid mapsjust cannot avoided, as they random field segmentation; and (g) our room segmentation. reality, experiments 20 horizontal publicly available furnished at data werejust additionally often generated from with the 2D sensor readings thesets height above theperformed. floor. Given Figure 18Meanwhile, provides aocclusions visual comparison ofoccupancy just four of those sets sets (due to the paper-length limit) and clutter on grid mapsdata cannot be avoided, as they are often this reality, experiments with 20 publicly available furnished were additionally performed. generated from the 2D horizontal sensor readings at the height just above the floor. Given this using manual segmentation (Figure 18a), morphological segmentation (Figure 18b), distance Figure 18 provides a visual comparison of just four of those sets (due to the paper-length limit) using reality, experiments with 20 publicly available furnishedgraph-based data sets were additionally performed. transform-based segmentation 18c), Voronoi (Figure 18d), manual segmentation (Figure 18a),(Figure morphological segmentation (Figure segmentation 18b), distance transform-based Figure 18 segmentation provides a visual comparison of just four of those setssegmentation (due to the paper-length limit) feature-based (Figure 18e), Voronoi random field (Figure 18f), and our segmentation (Figure segmentation 18c), Voronoi(Figure graph-based segmentationsegmentation (Figure 18d),(Figure feature-based segmentation using manual 18a), morphological 18b), distance room segmentation (Figure 18g) method. The ground truths and segmentation results with the (Figure 18e), Voronoi random field segmentation (Figure 18f), and oursegmentation room segmentation (Figure 18g) transform-based segmentation (Figure 18c), Voronoi graph-based (Figure 18d), furnished data are available online [42]. In the segmentation results, the occlusions due to furniture 18e), Voronoi random segmentation (Figure 18f), and our method.feature-based The groundsegmentation truths and (Figure segmentation results with field the furnished data are available online [42]. yielded nosegmentation sensor readings for18g) grid mapping, and thus remained as unsegmented black areas. room (Figure method. The ground truths and segmentation results with In the segmentation results, the occlusions due to furniture yielded no sensor readings for gridthe mapping, Overall, compared with the experimental results with the non-furnished data, the existing methods furnished data are available online [42]. In the segmentation results, the occlusions due to furniture and thus remained as unsegmented black areas. Overall, compared with the experimental results with resulted in more severereadings over-segmentation even and withthus veryremained simple test data. Our approach yielded yielded no sensor for grid mapping, as unsegmented black areas. the non-furnished data, the existing methods resulted in more severe over-segmentation even with very similar results for the first third data sets, but tended to over-segmentation for themethods second and Overall, compared with and the experimental results with the non-furnished data, the existing simple test data. Our yielded results for the first and third data sets, but tended to in more severe over-segmentation even with simple test data. Our approach yielded fourthresulted sets. This was approach attributable to the similar inability of thevery detection window in the initial segmentation over-segmentation for the second and fourth sets. This was attributable to the inability of the detection similar results for the first and third data sets, but tended to over-segmentation for the second and phase to pass through the areas where there were lots of occlusions, thus resulting in fourth sets. This was attributable to the inability of the detection window in the initial segmentation window in the initial segmentation phase to pass through the areas where there were lots of occlusions, over-production of the initial segments. phase to through the of areas where segments. there were lots of occlusions, thus resulting in thus resulting in pass over-production the initial

over-production of the initial segments.

(a) (a)

(b)

(c)

(b)

(c)

Figure 18. Cont.


22 of 26


22 of 26

(d)

(e)

(f)

(g)

Figure 18. Comparison of room segmentation approaches with furnished data sets: (a) ground truth

Figure by 18. Comparison of room segmentation approaches with furnished datatransform-based sets: (a) ground truth manual segmentation; (b) morphological segmentation; (c) distance segmentation; (d) Voronoi graph-based segmentation; (e) feature-based segmentation; (f) Voronoi by manual segmentation; (b) morphological segmentation; (c) distance transform-based segmentation; random field segmentation; and (g) our room segmentation. (d) Voronoi graph-based segmentation; (e) feature-based segmentation; (f) Voronoi random field segmentation; and (g) our room segmentation.

Finally, quantitative evaluations are carried out using three different measures: correctness, completeness, and absolute deviation [40,43]. The correctness and completeness measures were calculated pixel-by-pixel in comparing thecarried automatically segmented with the measures: ground truths. In Finally, quantitative evaluations are out using threemap different correctness, Equation (2), the correctness measure evaluates the percentage of the matched pixels among the completeness, and absolute deviation [40,43]. The correctness and completeness measures were automatically detected ones. On the other hand, the completeness measure in Equation (3) provides calculated pixel-by-pixel in comparing the automatically segmented map with the ground truths. the percentage of the matched pixels among the manually detected ones (i.e., the ground truths). In Equation (2), the correctness measure evaluates the percentage of the matched pixels among the Lastly, the absolute deviation is the absolute difference between the number of rooms detected automatically On theofother theautomatically completeness in Equation manuallydetected ( ) andones. the number roomshand, detected ( measure ). This measure indicates(3) theprovides the percentage of the matched pixels among the manually detected ones (i.e., the ground truths). robustness against over- or under-segmentations. Table 5 lists the evaluation metrics computed by Lastly, the fivedeviation existing methods and our approach using the 20 publicly available sets detected without and the absolute is the absolute difference between the number ofdata rooms manually with furniture. (N ) and the number of rooms detected automatically (N ). This measure indicates the robustness m

a

of pixels matched against over- or under-segmentations. Table 5total listsnumber the evaluation metrics computed by the five existing = (2) total number of pixels detected automatically methods and our approach using the 20 publicly available data sets without and with furniture.

Correctness =

=

total number of pixels matched total number of pixels matched total number of pixels detected manually

(3)

total number of pixels detected automatically

Completeness =

= | of pixels − | matched total number total number of pixels detected manually

Absolute Deviation = | Nm − Na |

(2)

(4)

(3) (4)

In the table, the average values indicate the average measures of the five existing methods. Each bolded value is a best or second-best result. Overall, it was found that the proposed approach outperformed all of the average values. In the case of the non-furnished data sets, the correctness


23 of 26

of the proposed approach (89.6 ± 9.8%) was comparable with the second-best value (90.0 ± 8.4%) achieved by the Voronoi-random method. Moreover, the proposed approach achieved the best results for the measures of completeness (91.7 ± 9.0%) and absolute-deviation (2.5). Meanwhile, in the case of the furnished data sets, the proposed approach achieved two second-best results, for the measures of correctness and completeness (84.8 ± 13.9% and 67.8 ± 14.5%), respectively, which proved that the proposed room segmentation is an effective approach for furnished environments. Table 5. Evaluation metrics (mean ± standard deviation) as calculated over 20 publicly available data sets without and with furniture. The first and the second best results are highlighted in bold type. Data Type Properties Methods Morphological Distance transform Voronoi graph Feature based Voronoi random Average Proposed

Non-Furnished

Furnished

Correctness (%)

Completeness (%)

Absolute Deviation

Correctness (%)

Completeness (%)

Absolute Deviation

81.9 ± 13.0 83.2 ± 14.5 95.0± 6.1 78.0 ± 8.4 90.0 ± 8.4 85.6 ± 12.0 89.6 ± 9.8

81.7 ± 13.5 82.1 ± 15.1 80.7 ± 7.6 72.9 ± 10.7 88.2 ± 10.0 81.1 ± 12.5 91.7 ± 9.0

5.2 3.5 10.1 4.7 2.9 5.3 2.5

78.3 ± 16.3 76.5 ± 17.6 94.0 ± 6.7 79.8 ± 16.0 81.0 ± 14.4 81.9 ± 15.7 84.8 ± 13.9

56.8 ± 15.0 51.4 ± 15.7 68.6 ± 8.3 60.0 ± 15.3 64.6 ± 14.6 60.3 ± 14.7 67.8 ± 14.5

6.1 11.8 11.7 8.5 5.4 8.7 8.2

5. Conclusions In the AEC industry, recently, the demand for as-built BIM creation of large and complex building interiors has increased with the advent of high-performance laser scanning devices. However, automated and detail-rich as-built modeling remains a challenge, as the available commercial and academic tools require excessive processing time and manual work. One way to facilitate the Scan-to-BIM process is to obtain a separate point cloud for each room, which allows for effective operation of the modeling phases for each room while reducing the computational complexity as compared with processing the entire data set. Such room segmentation is necessary for creating topological relationships to enrich 3D models with semantic information of the sort that is required for BIM. Unfortunately, most existing methods utilized in the AEC community rely on strict constraints or known scanner locations, a fact that limits their applications to specific data sets. Meanwhile, within the robotics community, room segmentation usually has been implemented on occupancy grid maps in a 2D context, which often leads to over-segmentation with arbitrarily segmented boundaries. In contrast to the conventional methods, our input is just a registered point cloud acquired by either a static laser scanner or a mobile scanning system; supplementary data, such as known scanner locations or training data, are not needed for the segmentation, thus allowing for more general applications of the proposed approach, even to existing or publicly available data. Based on the assumption that each room is bound by vertical walls, we propose to project the 3D point-cloud input onto a 2D binary map and to segment rooms using morphological processing. We tested our approach under a variety of data conditions, including real-world, synthetic, and publicly available 2D and 3D data sets. The experiments with the real-world data sets demonstrated the robustness against occlusions and clutter in 3D point clouds, while the experiments with the five synthetic data sets proved our method’s high segmentation performance regardless of the number of rooms or the complexity of architectural shapes. Additionally, comparisons with the five popular existing methods were conducted using 20 publicly available non-furnished and furnished data sets. The qualitative evaluations proved that the proposed approach is resistant to over- or under-segmentation while preserving straight segment boundaries. In the quantitative evaluations, the proposed approach exceeded the existing methods in terms of the measures of completeness and absolute deviation with non-furnished data, and the measures of correctness and completeness with furnished data. Overall, the proposed approach achieved high performance with the non-furnished data sets. It also had relatively good results when dealing with the furnished data sets, although improvement in this area is necessary. It can also be a problem when only parts of a building are scanned, as shown


24 of 26

in Figure 15. Thus, in future work, we will further investigate occlusion problems to improve the robustness of our method against such furnished or incomplete data input. In addition, we will aim at automatic detection window size control allowing for segmentation of complex indoor environments containing several different-sized passages. Although the proposed noise filtering helps to reduce the number of unwanted scan points outside the building, noise inside the scan area remains a problem that leads to the type of segmentation failure indicated in Figure 14. Therefore, other means of resolving this problem will be considered, for example, the removal of noisy points that are beyond a certain range from nearby walls extracted by planar detection. Based on the segmented point cloud, the next phase of our research will involve segmentation and modeling of detail-rich architectural components, such as columns, windows, and doors, for as-built BIM creation. Acknowledgments: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2015R1A6A3A03019594). Author Contributions: Jaehoon Jung developed the main program, and wrote most of the paper. Cyrill Stachniss contributed to the data collection, and provided comments. Corresponding author Changjae Kim provided guidance throughout the course of the research. Conflicts of Interest: The authors declare no conflict of interest.

References 1. 2. 3.

4.

5. 6. 7.

8. 9. 10.

11. 12. 13.

14.

Bosche, F.N.; O’Keeffe, S. The Need for Convergence of BIM and 3D Imaging in the Open World. In Proceedings of the CitA BIM Gathering Conference, Dublin, Ireland, 12–13 November 2015; pp. 109–116. Xiong, X.; Adan, A.; Akinci, B.; Huber, D. Automatic creation of semantically rich 3D building models from laser scanner data. Autom. Constr. 2013, 31, 325–337. [CrossRef] Tang, P.; Huber, D.; Akinci, B.; Lipman, R.; Lytle, A. Automatic reconstruction of as-built building information models from laser-scanned point clouds: A review of related techniques. Autom. Constr. 2010, 19, 829–843. [CrossRef] Carbonari, G.; Stravoravdis, S.; Gausden, C. Building information model implementation for existing buildings for facilities management: A framework and two case studies. In Building Information Modelling (BIM) in Design, Construction and Operations; WIT Press: Southampton, UK, 2015; Volume 149, pp. 395–406. Wang, C.; Cho, Y.K. Application of As-built Data in Building Retrofit Decision Making Process. Procedia Eng. 2015, 118, 902–908. [CrossRef] Randall, T. Construction Engineering Requirements for Integrating Laser Scanning Technology and Building Information Modeling. J. Constr. Eng. Manag. 2011, 137, 797–805. [CrossRef] Jung, J.; Hong, S.; Yoon, S.; Kim, J.; Heo, J. Automated 3D Wireframe Modeling of Indoor Structures from Point Clouds Using Constrained Least-Squares Adjustment for As-Built BIM. J. Comput. Civ. Eng. 2016, 30, 04015074. [CrossRef] Valero, E.; Adan, A.; Cerrada, C. Automatic Method for Building Indoor Boundary Models from Dense Point Clouds Collected by Laser Scanners. Sensors 2012, 12, 16099–16115. [CrossRef] [PubMed] Hong, S.; Jung, J.; Kim, S.; Cho, H.; Lee, J.; Heo, J. Semi-automated approach to indoor mapping for 3D as-built building information modeling. Comput. Environ. Urban Syst. 2015, 51, 34–46. [CrossRef] Previtali, M.; Barazzetti, L.; Brumana, R.; Scaioni, M. Towards automatic indoor reconstruction of cluttered building rooms from point clouds. In Proceedings of the ISPRS Technical Commission V Symposium, Riva del Garda, Italy, 23–25 June 2014; Volume 2, pp. 281–288. Thomson, C.; Boehm, J. Automatic Geometry Generation from Point Clouds for BIM. Remote Sens. 2015, 7, 11753–11775. [CrossRef] Jung, J.; Hong, S.; Jeong, S.; Kim, S.; Cho, H.; Hong, S.; Heo, J. Productive modeling for development of as-built BIM of existing indoor structures. Autom. Constr. 2014, 42, 68–77. [CrossRef] Budroni, A.; Böhm, J. Automatic 3D modelling of indoor Manhattan-world scenes from laser data. In Proceedings of the ISPRS Commission V Mid-Term Symposium ‘Close Range Image Measurement Techniques’, Newcastle upon Tyne, UK, 21–24 June 2010; Volume 38, pp. 115–120. Adán, A.; Huber, D. Reconstruction of Wall Surfaces under Occlusion and Clutter in 3D Indoor Environments; Robotics Institute, Carnegie Mellon University: Pittsburgh, PA, USA, 2010.


15.

16. 17. 18.

19.

20.

21. 22. 23.

24.

25.

26.

27. 28.

29.

30.

31. 32. 33.

25 of 26

Mura, C.; Mattausch, O.; Villanueva, A.J.; Gobbetti, E.; Pajarola, R. Automatic room detection and reconstruction in cluttered indoor environments with complex room layouts. Comput. Graph. 2014, 44, 20–32. [CrossRef] Ochmann, S.; Vock, R.; Wessel, R.; Klein, R. Automatic reconstruction of parametric building models from indoor point clouds. Comput. Graph. 2016, 54, 94–103. [CrossRef] Turner, E.L. 3D Modeling of Interior Building Environments and Objects from Noisy Sensor Suites. Ph.D. Thesis, The University of California, Berkeley, CA, USA, 14 May 2015. Macher, H.; Landes, T.; Grussenmeyer, P. Point clouds segmentation as base for as-built BIM creation. In Proceedings of the 25th International CIPA Symposium, Taipei, Taiwan, 31 August–5 September 2015; Volume 2, pp. 191–197. Bormann, R.; Jordan, F.; Li, W.; Hampp, J.; Hagele, M. Room Segmentation: Survey, Implementation, and Analysis. In Proceedings of the IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 16–21 May 2016; pp. 1019–1026. Wurm, K.M.; Stachniss, C.; Burgard, W. Coordinated multi-robot exploration using a segmentation of the environment. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Nice, France, 22–26 September 2008; pp. 1160–1165. Mozos, Ó.M.; Stachniss, C.; Rottmann, A.; Burgard, W. Using adaboost for place labeling and topological map building. In Robotics Research; Springer: Berlin, Germany, 2007; Volume 28, pp. 453–472. Fabrizi, E.; Saffiotti, A. Augmenting topology-based maps with geometric information. Robot. Auton. Syst. 2002, 40, 91–97. [CrossRef] Diosi, A.; Taylor, G.; Kleeman, L. Interactive SLAM using laser and advanced sonar. In Proceedings of the IEEE International Conference on Robotics and Automation, Barcelona, Spain, 18–22 April 2005; pp. 1103–1108. Jebari, I.; Bazeille, S.; Battesti, E.; Tekaya, H.; Klein, M.; Tapus, A.; Filliat, D.; Meyer, C.; Ieng, S.-H.; Benosman, R.; et al. Multi-sensor semantic mapping and exploration of indoor environments. In Proceedings of the 3rd IEEE Incernational Conference on Technologies for Practical Robot Applications, Woburn, MA, USA, 11–12 April 2011; pp. 151–156. Brunskill, E.; Kollar, T.; Roy, N. Topological mapping using spectral clustering and classification. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, San Diego, CA, USA, 29 October–2 Novermber 2007; pp. 3491–3496. Shi, L.; Kodagoda, S.; Ranasinghe, R. Fast indoor scene classification using 3D point clouds. In Proceedings of the Australasian Conference on Robotics and Automation, Melbourne, Australia, 7–9 December 2011; pp. 1–7. Siemiatkowska, B.; Harasymowicz-Boggio, B.; Chechlinski, Ł. Semantic Place Labeling Method. J. Autom. Mob. Robot. Intell. Syst. 2015, 9, 28–33. [CrossRef] Burgard, W.; Fox, D.; Hennig, D.; Schmidt, T. Estimating the absolute position of a mobile robot using position probability grids. In Proceedings of the 13th National Conference on Artificial Intelligence, Portland, OR, USA, 4–8 August 1996; pp. 896–901. Lu, Z.; Hu, Z.; Uchimura, K. SLAM estimation in dynamic outdoor environments: A review. In Proceedings of the 2nd International Conference on Intelligent Robotics and Applications, Singapore, 16–18 December 2009; pp. 255–267. Ochmann, S.; Vock, R.; Wessel, R.; Tamke, M.; Klein, R. Automatic generation of structural building descriptions from 3D point cloud scans. In Proceedings of the 9th IEEE International Conference on Computer Graphics Theory and Applications, Lisbon, Portugal, 5–8 January 2014; pp. 1–8. Ikehata, S.; Yang, H.; Furukawa, Y. Structured indoor modeling. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 13–16 December 2015; pp. 1323–1331. Meinel, G.; Neubert, M. A comparison of segmentation programs for high resolution remote sensing data. Int. Arch. Photogramm. Remote Sens. 2004, 35 Pt B, 1097–1105. Mozos, O.M.; Rottmann, A.; Triebel, R.; Jensfelt, P.; Burgard, W. Semantic labeling of places using information extracted from laser and vision sensor data. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Beijing, China, 9–15 October 2006.


34.

35. 36. 37. 38. 39. 40.

41. 42. 43.

26 of 26

Friedman, S.; Pasula, H.; Fox, D. Voronoi Random Fields: Extracting Topological Structure of Indoor Environments via Place Labeling. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, 9–12 January 2007; Volume 7, pp. 2109–2114. Li, Y.; Hu, Q.; Wu, M.; Liu, J.; Wu, X. Extraction and Simplification of Building Façade Pieces from Mobile Laser Scanner Point Clouds for 3D Street View Services. ISPRS Int. J. Geo-Inf. 2016, 5, 231. [CrossRef] Motameni, H.; Norouzi, M.; Jahandar, M.; Hatami, A. Labeling method in Steganography. Int. Sch. Sci. Res. Innov. 2007, 1, 1600–1605. Han, S.; Cho, H.; Kim, S.; Jung, J.; Heo, J. Automated and efficient method for extraction of tunnel cross sections using terrestrial laser scanned data. J. Comput. Civ. Eng. 2012, 27, 274–281. [CrossRef] Bresenham, J.E. Algorithm for computer control of a digital plotter. IBM Syst. J. 1965, 4, 25–30. [CrossRef] Käshammer, P.; Nüchter, A. Mirror identification and correction of 3D point clouds. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2015, 40, 109. [CrossRef] Kim, C.; Habib, A.; Pyeon, M.; Kwon, G.-R.; Jung, J.; Heo, J. Segmentation of Planar Surfaces from Laser Scanning Data Using the Magnitude of Normal Position Vector for Adaptive Neighborhoods. Sensors 2016, 16, 140. [CrossRef] [PubMed] DURAARK Datasets. Available online: http://duraark.eu/data-repository/ (accessed on 22 June 2017). IPA Room Segmentation. Available online: http://wiki.ros.org/ipa_room_segmentation (accessed on 12 February 2017). Rodríguez-Cuenca, B.; García-Cortés, S.; Ordóñez, C.; Alonso, M.C. Morphological operations to extract urban curbs in 3D MLS point clouds. ISPRS Int. J. Geo-Inf. 2016, 5, 93. [CrossRef] © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).