Computer-Aided Design 58 (2015) 2–12


Progressive 3D shape segmentation using online learning✩ Feiqian Zhang, Zhengxing Sun ∗ , Mofei Song, Xufeng Lang State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, PR China

highlights

• A progressive interactive 3D shape segmentation method is proposed.
• Online learning is adopted to train the segmentation model accumulatively.
• The segmentation model can be updated incrementally when new shapes are added.
• Segmentation becomes more accurate as the segmented shapes accumulate.

article info

Keywords: 3D shape; Progressive segmentation; Online learning; Shape set

abstract

In this article, we propose a progressive 3D shape segmentation method, which allows users to guide the segmentation with their interactions and performs the segmentation gradually, driven by their intent. More precisely, we establish an online framework for interactive 3D shape segmentation, without any tedious collection-preparation or offline training stages. That is, users can collect 3D shapes while segmenting them, and the segmentation becomes more and more precise as the shapes accumulate. Our framework uses Online Multi-Class LPBoost (OMCLP) to train/update a segmentation model progressively, with several Online Random Forests (ORFs) as the weak learners. It then performs graph cuts optimization to segment the 3D shape, using the trained/updated segmentation model as the data term. Our framework has three features. Firstly, the segmentation model can be trained gradually during the collection of the shapes. Secondly, the segmentation results can be refined progressively until users' requirements are met. Thirdly, the segmentation model can be updated incrementally, without retraining on all shapes, when users add new shapes. Experimental results demonstrate the effectiveness of our approach. © 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Segmentation of 3D shapes into meaningful parts is a fundamental problem in shape analysis and processing. Numerous tasks in shape processing, 3D modeling, animation and texturing of 3D shapes benefit from consistent segmentation of shapes into parts [1,2]. A large number of segmentation methods have been proposed, and most of them focus on segmenting an individual shape based on geometric features such as convexity and curvature [1,3,4]. Although a variety of geometric features have been investigated, no single feature or collection of features is known to produce high-quality results for all classes of shapes [1], and consistently segmenting a set of shapes remains challenging [5].

✩ This paper has been recommended for acceptance by Dr. Vadim Shapiro.



Corresponding author. E-mail address: [email protected] (Z.X. Sun).

http://dx.doi.org/10.1016/j.cad.2014.08.008 0010-4485/© 2014 Elsevier Ltd. All rights reserved.

Accordingly, more and more researchers have been focusing on shape-set-based segmentation methods that partition 3D shapes consistently [6–8]. The shapes in such a set belong to a common family, sharing the same functionality and general form [7]. By exploiting this characteristic, shape-set-based segmentation methods can generate segmentations that carry consistent semantics across shapes. They can be classified into three categories: supervised, unsupervised and semi-supervised. The supervised methods employ information learned from a labeled training set to segment a given shape [6,9], and obtain significant improvement over single-shape segmentation methods. However, they require a large number of manually segmented and labeled training shapes to learn from, and they have to retrain on the whole training set even when only a small number of new shapes are added to it. Moreover, the segmentation mode, such as the specific scope, number and categories of consistent parts to be segmented, is entirely determined by the training set and can hardly be changed by users.


The unsupervised methods process the shapes from an input set in a batch, and simultaneously segment these shapes into consistent parts [5,7,10–15]. They do not require any labeling of the shape set, but they still have to collect a number of shapes before running the algorithm, and they must rerun the whole pipeline whenever the shape set is updated. In addition, although the number of consistent parts can be specified by users, their scope and categories are still determined by the algorithm.

The semi-supervised methods use information from both labeled and unlabeled shapes; they improve on unsupervised methods with the assistance of external input, and avoid the supervised methods' need for a large number of labeled shapes [8,16,17]. Users' intention regarding the segmentation mode can be reflected in the results. However, like the supervised and unsupervised methods, they require preparing a number of shapes at the very beginning. Although some semi-supervised methods have considered the segmentation of new shapes [16,17], the segmentation or labeling of a new shape does not benefit the other shapes: they still need to run the algorithm on all the shapes, including the new one, to update the set.

Overall, there are three challenges in the existing methods: firstly, they have to collect a number of shapes before running the segmentation algorithm in order to achieve satisfactory results; secondly, they lack an efficient updating strategy, so they must rerun the algorithms on all the shapes even when only a small number of new shapes are added to the set; thirdly, the segmentation does not necessarily convey the users' intention.

In this paper, we propose a novel method, which can progressively segment a 3D shape using online learning.
A segmentation model is trained/updated progressively by Online Multi-Class LPBoost (OMCLP) [18], in which several Online Random Forests (ORFs) [19] are trained as the weak learners. Then, graph cuts optimization [20] is performed to segment the 3D shape, with the trained/updated segmentation model used as the data term. This method has three distinguishing features. The first is that the segmentation model can be learned online. In the initial stage, users can submit a 3D shape and label parts on it according to their intention. The segmentation model is learned initially from these labels and used to segment the submitted shape. Users then correct the initial segmentation results, and the segmentation model is further learned iteratively from the corrections. As new shapes are submitted occasionally, the segmentation model is formed gradually during the online learning process. The second is that users can refine the segmentation results in an online way. During the whole pipeline, users can correct the false-labeled regions in the segmentation, and a refined segmentation is obtained from these regions. Users can continue correcting the refined result until it meets their requirements. The third is that the segmentation model can be updated online: when new shapes are submitted, the model is updated efficiently without retraining on all the shapes.

Our approach has four advantages. 1. The segmentation model can be learned gradually as the shapes are segmented, so our method needs no explicit shape-collection or training stages. 2. It provides users with an easy and intuitive way to interact throughout the segmentation process: they can correct the segmentation by simply clicking on the false-labeled parts of the shape, and as user interactions and shapes accumulate, the segmentation of the submitted shapes becomes more and more precise.


3. The segmentation model can be updated easily when new shapes are added to the shape set. Updating is an important task in learning-based methods, whether supervised or unsupervised; existing methods lack an updating scheme, so they usually have to rerun their algorithms on the whole shape set repeatedly. 4. Users can segment the 3D shapes according to their intention: they can direct the segmentation mode, including the scope, number and categories of consistent parts, during the segmentation. We evaluate the presented approach on several 3D shape sets, and the experimental results demonstrate the above advantages of our method.

2. Related work

We review the previous work in several related categories: segmentation of individual shapes, segmentation based on shape sets, interactive segmentation, and online learning.

2.1. Segmentation

Shape segmentation is a fundamental problem in shape analysis, which refers to decomposing an individual 3D shape into meaningful parts. Recent surveys can be found in [3,4], and comparisons of several methods appear in [1]. These methods use techniques such as K-means [21], graph cuts [22], hierarchical clustering [23], primitive fitting [24], random walks [25], and spectral clustering [26], and they aim to partition an individual shape into segments based on simple geometric criteria such as convexity [27] and curvature [28]. Although a variety of geometric features have been investigated, no single feature or collection of features is known to produce high-quality results for all classes of shapes [1], and consistently segmenting a set of shapes remains challenging [5]. Therefore, more and more researchers have been focusing on shape-set-based segmentation methods that partition 3D shapes consistently.

2.2. Segmentation based on shape set

These works can be classified into three categories: supervised, unsupervised and semi-supervised.
In the supervised case, information learned from a labeled training set is employed to segment a given shape. Kalogerakis et al. [6] use a Conditional Random Field (CRF) model trained on a collection of labeled shapes to segment and label a given 3D shape. van Kaick et al. [9] further use this knowledge to establish a part correspondence between a pair of shapes. Benhabiles et al. [29] learn a boundary edge function from a large database of manually segmented 3D shapes to segment any input 3D shape. A significant improvement in correctness over single-shape segmentation methods is demonstrated by these supervised methods. However, they require a large number of manually segmented and labeled training shapes to learn from, and they have to retrain on the whole training set even when only a small number of new shapes are added to it. Moreover, the segmentation mode, such as the specific scope, number and categories of consistent parts to be segmented, is entirely determined by the training set and can hardly be changed by users.

The unsupervised methods process the shapes from an input set in a batch, and simultaneously segment these shapes into consistent parts. Kreavoy et al. [30] first segment each shape in the set individually into parts with similar segmentations, and then create a consistent segmentation by matching the parts and finding their correspondences. Huang et al. [11] present a novel linear programming approach to jointly segment the


shapes in a heterogeneous shape library, which significantly outperforms single-shape segmentation techniques: they generate a set of initial segments for each shape and then perform a global optimization over these segments together with correspondences between similar ones. However, these two methods only generate pairwise-consistent segmentations and therefore cannot guarantee consistency across the set. Golovinskiy and Funkhouser [5] consider co-segmentation as a graph clustering problem: they combine intra-shape segmentation cues and inter-shape proximity in a graph after aligning all the shapes in the set, thus simultaneously segmenting shapes and creating correspondences between segments. To deal with non-homogeneous part scales, Xu et al. [10] classify the shapes based on their styles before applying a similar co-segmentation scheme within each style group. Since both methods use a global alignment for all shapes and compute correspondences between different shapes with the Iterative Closest Point (ICP) algorithm, they can only handle limited geometric variability of the parts. Several unsupervised approaches treat co-segmentation as a clustering problem in descriptor space, and can thus achieve proper co-segmentation of shapes with large variability. Sidi et al. [7] first group the shape facets by mean-shift clustering to obtain the initial segments of each shape in the set, then use spectral clustering on segment-level shape descriptors to get the consistent parts, and finally refine these parts by applying graph cuts optimization; dissimilar consistent parts can be linked through third parties during the spectral clustering. Meng et al. [13] perform over-segmentation of each shape by employing normalized cuts (NCuts) to get patches as initial segments, and cluster these initial segments to generate the consistent parts.
They iteratively build a statistical model describing each class of these parts from the previous estimation and employ multi-label optimization to improve the co-segmentation results; in contrast to the existing ''one-shot'' algorithms, their method is able to improve the initial rough co-segmentation results. Zhang et al. [31] propose a co-segmentation method combining Fuzzy C-Means (FCM) with Random Walks to eliminate the dependency on per-object segmentation: they apply FCM to cluster all the facets in the set directly, and incorporate Random Walks into the FCM iterations to keep the geometric structure of the consistent parts reasonable throughout the clustering. Since different shape descriptors characterize a shape in different aspects, several researchers have focused on the multi-feature fusion problem. Instead of concatenating the different features into one feature descriptor, Hu et al. [12] formulate co-segmentation as a subspace clustering problem in multiple feature spaces of the initial segments, and adopt fuzzy cuts to smooth the cutting boundaries of each part. Luo et al. [14] formulate the problem as a multi-view spectral clustering task by co-training a set of affinity matrices derived from different shape descriptors of the over-segmented initial segments. Wu et al. [15] propose a method based on affinity aggregation spectral clustering that fuses multiple descriptors of the over-segmented initial segments to obtain the co-segmentation of the set. Recently, to derive structure from large, unorganized, diverse collections of 3D models, Kim et al. [32] propose an automatic algorithm that starts with an initial template model and then jointly optimizes part segmentation, point-to-point surface correspondence, and a compact deformation model to best explain the input model collection, simultaneously recovering segmentation, point-level correspondence, and a probabilistic part-based deformable model for shape collections.
However, all these unsupervised methods share some common limitations: firstly, they still have to collect a number of shapes before running the algorithm; secondly, they must rerun the algorithm over the whole pipeline to update the shape set; thirdly, although the number of consistent parts can be specified by users, their scope and categories are still determined by the algorithm.

The semi-supervised methods use information from both labeled and unlabeled shapes; they improve on unsupervised methods with the assistance of external input, and avoid the supervised methods' need for a large number of labeled shapes. Lv et al. [16] propose to segment and label a given shape by a semi-supervised segmentation model using Conditional Random Fields (CRF), which can be trained from unlabeled and labeled shapes. Wang et al. [8] propose a semi-supervised method where the user can actively assist the learning process by interactively providing inputs. Wu et al. [17] propose an interactive approach for shape co-segmentation via label propagation, which is able to produce error-free results and is very effective at handling out-of-sample data. Users' intention regarding the segmentation mode can be reflected in the results. However, like the supervised and unsupervised methods, the existing semi-supervised methods require preparing a number of shapes at the very beginning of the algorithm. Although some semi-supervised methods have considered the segmentation of new shapes [16,17], the segmentation or labeling of a new shape does not benefit the other shapes: they still need to run the algorithm on all the shapes, including the new one, to update the set.

2.3. Interactive segmentation

Interactive shape segmentation approaches help users express their intentions simply and intuitively. Consequently, they have received significant attention [4]. They can be classified into two categories: boundary-based approaches and region-based approaches.
The boundary-based approaches require users to draw strokes or click along the desired cut boundary, which demands great care when specifying boundary points or boundary areas [33–35]. Region-based approaches take regional information as input, which requires much less user effort; in particular, many approaches allow users to draw sketches in the foreground and/or background regions [36–39]. In this paper, we allow users to modify the segmentation by simply clicking on the false-labeled regions of the shape. Rather than considering one individual shape in isolation, we benefit from other shapes in the same family: the segmentation of new shapes becomes more and more accurate as shapes are accumulated, so the interactions required during segmentation become easier and easier.

2.4. Online learning

Online learning is an essential tool for learning from dynamic environments, from very large-scale data sets, or from streaming data sources. It has been used in many applications such as object recognition [40], object detection [41], object tracking [42], and object segmentation from 3D scenes [43]. To the best of our knowledge, we are the first to introduce this technique into shape segmentation.

3. Algorithm

In our method, users submit a shape to be segmented and choose the category of this shape. The shape is segmented according to the segmentation model of this category. Users then correct the segmentation result progressively until it meets their requirements. Next, we introduce our progressive 3D shape segmentation method using online learning.


Fig. 1. The pipeline of the proposed method.

3.1. Overview

Fig. 1 illustrates the pipeline of our method. At the initial stage, users submit a 3D shape and add some part labels on it according to their intention. A segmentation model is initialized by training on these labeled regions through online learning. Then, the initial result for this shape is obtained by performing the segmentation according to this model. Next, users correct this initial result, and the segmentation model is further learned from the corrected regions; it is then used in the segmentation to produce a refined result for the submitted shape. Users can continue to correct the refined result until they accept it, which yields the final result for this shape. Then, to learn a more accurate segmentation model, the model is retrained on the final segmentation of the whole shape, and can subsequently be used to segment newly submitted shapes. As new shapes are submitted occasionally, the segmentation model is formed gradually during the online learning process. In the following sections, we describe these steps in detail.
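The interactive loop of this pipeline can be sketched in code. This is a minimal illustration, not the paper's implementation: the "model" below is a toy per-feature label-count table standing in for the OMCLP/ORF model, and `oracle` stands in for the user's corrections; all names are hypothetical.

```python
# Toy sketch of the progressive pipeline: segment with the current model,
# learn online from corrections, and retrain on the accepted result.

def train_update(model, marked):
    for feature, label in marked:            # online update: one sample at a time
        model.setdefault(feature, {}).setdefault(label, 0)
        model[feature][label] += 1
    return model

def segment(model, shape):
    # label each facet by the majority label seen for its feature, "?" if unseen
    return {f: max(model.get(x, {"?": 1}), key=model.get(x, {"?": 1}).get)
            for f, x in shape.items()}

def progressive_segmentation(shapes, oracle):
    model, results = {}, []
    for shape in shapes:                     # shapes are submitted one by one
        labels = segment(model, shape)
        while True:
            wrong = [(shape[f], oracle[shape[f]])
                     for f, l in labels.items() if l != oracle[shape[f]]]
            if not wrong:                    # the user accepts the result
                break
            model = train_update(model, wrong)    # learn from corrections only
            labels = segment(model, shape)        # refined result
        # retrain on the final segmentation of the whole shape
        model = train_update(model, [(shape[f], l) for f, l in labels.items()])
        results.append(labels)
    return results

oracle = {"round": "head", "long": "leg"}    # stands in for user feedback
shapes = [{0: "round", 1: "long"}, {0: "round", 1: "long"}]
results = progressive_segmentation(shapes, oracle)
```

Note how the second shape needs no correction at all: the model learned from the first shape already labels it correctly, mirroring how interaction effort decreases as shapes accumulate.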

Fig. 2. Several examples of our over-segmentation results.

3.2. Feature descriptors

The method operates on feature descriptors of the shape: we compute several shape descriptors for each facet. Following the study on feature selection in the supervised approach [6], we choose five shape descriptors: Gaussian curvature (GC) [44], shape diameter function (SDF) [45], average geodesic distance (AGD) [46], shape contexts (SC) [47], and the geodesic distance to the base of the shape (GB) [7]. From these we obtain the feature vector $x_i \in X$ for each facet $i$ of the shape.
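The per-facet feature vector $x_i$ collects the descriptor values computed for that facet. A hedged sketch, with toy stand-ins for the descriptors (the real GC, SDF, AGD, SC and GB come from the cited works):

```python
def facet_feature(facet, descriptors):
    """Concatenate all descriptor outputs for one facet into x_i."""
    x = []
    for d in descriptors:
        v = d(facet)                 # a descriptor may return a scalar or a vector
        x.extend(v if isinstance(v, (list, tuple)) else [v])
    return x

# toy stand-ins: a curvature-like scalar and a 2-bin context histogram
gc = lambda f: f["curvature"]
sc = lambda f: (f["ctx0"], f["ctx1"])

x = facet_feature({"curvature": 0.5, "ctx0": 0.1, "ctx1": 0.9}, [gc, sc])
```

Scalar descriptors (such as GC or SDF) contribute one entry each, while histogram-style descriptors (such as SC) contribute several, so all facets end up with feature vectors of equal length.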

Fig. 3. An example of labeling and correcting: (a) labeling; (b) labeled regions; (c) correcting; (d) corrected regions.

3.3. Labeling/correcting

At the very beginning, users add some part labels on the submitted shape according to their intention, and as the segmentation proceeds, they correct the initial/refined segmentation results through interactions. They simply click on the regions of the corresponding parts and input the labels to complete a labeling/correcting interaction. To make labeling/correcting more convenient for users, we preprocess the submitted shape to obtain an over-segmentation. Similar to previous works [12,13], we employ normalized cuts [48] to decompose the shape into primitive patches, and further align the patch boundaries by fuzzy cuts [22]. The number of patches is set to 200. Fig. 2 shows several examples of our over-segmentation results. After a user clicks on the shape surface, the over-segmentation patch containing the click is taken as the labeled/corrected region. The facets in this patch form the marked set $S_M = \{(x_t, y_t)\}$, where $x_t$ is the feature vector of facet $t$ in this patch, $y_t \in C$ is the label the user has input, and $C$ is the set of labels

($K = |C|$). Fig. 3 shows an example of this step. In Fig. 3(a), the user labels the first submitted shape at the initial stage; the labels are represented by points of different colors, where each color corresponds to a different part class (4 part classes in this example). Fig. 3(b) shows the labeled regions extracted by our method. In Fig. 3(c), the user clicks on some false-labeled regions to correct the segmentation result, and the corrected regions are shown in Fig. 3(d).
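The click-to-patch lookup of this step can be sketched as follows; the patch partition and feature table below are illustrative stand-ins for the NCuts over-segmentation and the descriptor vectors of Section 3.2.

```python
def mark_from_click(clicked_facet, label, patches, features):
    """Build the marked set S_M = {(x_t, y_t)} for the patch containing a click."""
    for patch in patches:                 # patches partition the shape's facets
        if clicked_facet in patch:
            return [(features[t], label) for t in sorted(patch)]
    return []

patches = [{0, 1}, {2, 3}]                # e.g. the 200 NCuts patches per shape
features = {0: [0.1], 1: [0.2], 2: [0.7], 3: [0.8]}
marked = mark_from_click(2, "handle", patches, features)
```

One click thus yields many training samples at once: every facet of the clicked patch enters the marked set with the user's label, which is what makes a single correction informative enough to retrain the model.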


3.4. Online learning/training

In this step, we train the segmentation model in an online mode, to make it capable of segmenting newly submitted shapes correctly, as guided by the user. Each facet $t$, $(x_t, y_t)$, in the marked set is used sequentially as a new observed sample to retrain the segmentation model. Next, we discuss this process in detail.

The segmentation model. The segmentation model is trained/updated progressively by Online Multi-Class LPBoost (OMCLP) [18], in which several Online Random Forests (ORFs) [19] are trained as the weak learners. Combining ORFs and OMCLP speeds up the convergence rate [18], which is very important in our progressive shape segmentation: the segmentation results must be corrected rapidly using a model learned from a small amount of user interaction. The segmentation model yields the confidence that facet $i$ is labeled as the $k$th part class, and can be retrained on the facets of the marked set. After the $t$th facet $(x_t, y_t)$ of the marked set has been observed, the segmentation model is defined as

$$f_t(x_i, k) = \sum_{m=1}^{M} w_{t,m}\, g_{t,m}(k \mid x_i) \qquad (1)$$

where $g_{t,m}(k \mid x_i)$ is the probability given by the $m$th ORF, $M$ is the number of ORFs, and $w_{t,m}$ is the current weight of the $m$th ORF. We set $M = 10$ in our approach.

Online random forests. Random Forests are ensembles of randomized decision trees combined using bagging. To achieve online learning, Online Random Forests (ORF) [19] combine online bagging with online decision trees using random feature selection. For the bagging part, the arrival of sequential data is modeled by a Poisson distribution: each tree is retrained $k$ times on each facet $t$, $(x_t, y_t)$, in the marked set, where $k$ is a random number generated by Poisson($\lambda$) and $\lambda$ is usually set to a constant (typically $\lambda = 0.1$ in our approach). In this way, each node gathers its statistics online. For the online growing of the decision trees, each decision node of a tree contains a test of the form $g(x) > \theta$, where $g(x)$ usually returns the value of a selected feature and $\theta$ is a threshold deciding left/right propagation. When a node splits, the test is determined by picking the best from a set of randomly generated tests $S = \{(g_1(x), \theta_1), \ldots, (g_N(x), \theta_N)\}$ according to a quality measure. The node also maintains the label density of each part class, denoted $p_t = [p_{t,1}, \ldots, p_{t,K}]$; this density information is updated by every facet of the marked set that falls into the node. Since the statistics are gathered gradually in online mode, the decision of when to split depends on: 1. whether enough facets have fallen into the node to provide robust statistics; 2. whether the splits are good enough for the classification purpose according to the quality measure. If both conditions are satisfied, we choose the test $s^\star$ with the highest quality measure as the main decision test of node $R$, and create the two corresponding child nodes $R^{l}_{s^\star}$ and $R^{r}_{s^\star}$.
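The Poisson-based online bagging step can be sketched as below. Only the $k \sim \mathrm{Poisson}(\lambda)$ retraining schedule follows the text above; the `Tree` class is a toy stand-in for an online decision tree, which would actually refine node statistics and split nodes.

```python
import math, random

class Tree:
    """Toy stand-in for an online decision tree: it just counts its updates."""
    def __init__(self):
        self.n = 0
    def update(self, sample):
        self.n += 1          # a real ORF tree would update node statistics here

def poisson(lam, rng):
    # Knuth's method for sampling k ~ Poisson(lam)
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def online_bagging_update(trees, sample, lam=0.1, rng=random):
    # each tree is retrained k times on the new sample, with k ~ Poisson(lam)
    for tree in trees:
        for _ in range(poisson(lam, rng)):
            tree.update(sample)

forest = [Tree() for _ in range(10)]       # T = 10 trees, as in the paper
rng = random.Random(42)
for facet in range(1000):                  # a stream of marked facets
    online_bagging_update(forest, facet, lam=0.1, rng=rng)
```

With $\lambda = 0.1$ most trees skip most samples entirely, which is what keeps each per-facet update cheap while the ensemble as a whole still sees the data stream.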
After the Online Random Forest has been retrained with the $t$ facets of the marked set, the estimated probability of the $k$th part class for any other facet $i$ is calculated by

$$g_t(k \mid x_i) = \frac{1}{T} \sum_{j=1}^{T} p_{t,j}(k \mid x_i) \qquad (2)$$

where $p_{t,j}(k \mid x_i)$ is the estimated density of the $k$th part class in the leaf of the $j$th tree into which $x_i$ falls, and $T$ is the number of trees. We set $T = 10$ in our approach.

Online multi-class LPBoost. Given the current sample $(x_t, y_t)$, the weights over the ORFs are obtained from the multi-class LPBoost problem

$$\min_{w_t,\,\xi}\;\; C \sum_{k \neq y_t} \xi_k + \lVert w_t \rVert_1 \qquad (3)$$
$$\text{s.t.}\;\; \forall m: w_{t,m} \geq 0; \quad \forall k \neq y_t: \xi_k \geq 0; \quad \forall k \neq y_t: \big(G_t(y_t, \cdot) - G_t(k, \cdot)\big) w_t + \xi_k \geq 1$$

which in the online setting is solved approximately through

$$\max_{w_t,\, d_{y'_t}}\;\; d_{y'_t} + \sum_{m=1}^{M} w_{t,m}\big(1 - d_{y'_t} \Delta G_{t,y'_t}(m) - \zeta_m\big) - \frac{1}{2\theta} \sum_{m=1}^{M} \big(1 - d_{y'_t} \Delta G_{t,y'_t}(m) - \zeta_m\big)^2 \qquad (4)$$
$$\text{s.t.}\;\; \forall m: \zeta_m \geq 0,\; w_{t,m} \geq 0; \quad 0 \leq d_{y'_t} \leq C$$

where $\Delta G_{t,k}(m) = G_t(y_t, m) - G_t(k, m)$, and $d_{y'_t}$ is the dual variable of the weights on the ORFs, corresponding to the closest non-target class $y'_t$:

$$y'_t = \operatorname*{arg\,max}_{k \in C,\, k \neq y_t} f_t(x_t, k). \qquad (5)$$

$\theta > 0$ is a designed constant and $\zeta$ is a new set of slack variables, which can easily be found by setting the derivatives of the objective with respect to them to zero. This leads to

$$\zeta_m = \max(0, q_m) \qquad (6)$$

where $q_m = 1 - d_{y'_t} \Delta G_{t,y'_t}(m) - \theta w_{t,m}$. For more details of the optimization we refer the reader to [18]. The optimization is solved by performing primal–dual gradient descent–ascent iteratively among the $M$ ORFs. We set the dual variable $d_{y'_t,0} = C$ initially. Then, for the $m$th ORF, we compute the variable $d_{y'_t,m}$ by the dual gradient ascent update

$$\tilde{d}_m = d_{y'_t,m-1} + \nu_d \Big( 1 + \frac{1}{\theta} \sum_{j=1,\, q_j < 0}^{m-1} \Delta G_{t,y'_t}(j)\, q_j \Big) \qquad (7)$$

which is then projected onto the feasible interval $[0, C]$, with $\nu_d$ the step size of the ascent.

Segmentation. The shape is segmented by minimizing, via graph cuts, an energy $E$ that combines a data term derived from the segmentation model with a smoothness term. $\gamma > 0$ (we set $\gamma = 0.1$ in our approach) is a constant that regulates the influence of the data term in the total energy, and $\epsilon$ ($\epsilon = 10^{-6}$) is a small threshold to avoid a zero value in the logarithm function. According to the minima rule of Hoffman [51], which states that the human visual system usually perceives region boundaries at negative minima of principal curvature, or concave creases, we define the smoothness term as

$$E_S(u, v, l_u, l_v) = \begin{cases} 0 & \text{if } l_u = l_v \\ -\log\!\big(1 - \min(\theta_{uv}/\pi,\, 1) + \epsilon\big)\, l_{uv} & \text{otherwise} \end{cases} \qquad (11)$$

to penalize boundaries between facets with exterior dihedral angle $\theta_{uv}$; $l_{uv}$ denotes the length of the edge between adjacent facets $u$ and $v$, and $\epsilon$ is again the threshold avoiding a zero value in the logarithm. Finally, we employ graph cuts optimization [20] to minimize the energy $E$ and complete the segmentation. Fig. 4 shows the effect of the smoothness term: Fig. 4(a) shows the segmentation result using the data term only, and Fig. 4(b) shows the result after combining both the data term and the smoothness term.

4. Results

In this section, we describe the experimental results and demonstrate the performance of our segmentation approach.

Data set. We evaluate our progressive segmentation method on 7 sets of shapes from several typical object categories, where 5 sets (Human, FourLeg, Ant, Cup and Chair) come from the Princeton Segmentation Benchmark (PSB) [1], and the rest (Vase and Candelabra)
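The smoothness term of Eq. (11) can be evaluated directly from the labels, the exterior dihedral angle and the edge length. A small sketch with toy values, under the paper's convention that cutting is cheap where $\theta_{uv}$ is small:

```python
from math import log, pi

def smoothness(l_u, l_v, theta_uv, edge_len, eps=1e-6):
    """E_S of Eq. (11): zero for equal labels, else an angle-based cut penalty."""
    if l_u == l_v:
        return 0.0
    return -log(1.0 - min(theta_uv / pi, 1.0) + eps) * edge_len

same  = smoothness("leg", "leg", 0.5 * pi, 1.0)     # no penalty within a part
cheap = smoothness("leg", "torso", 0.1 * pi, 1.0)   # small angle: cheap to cut
steep = smoothness("leg", "torso", 0.9 * pi, 1.0)   # large angle: costly to cut
```

Because the penalty grows as $\theta_{uv}$ approaches $\pi$, graph cuts prefers to place part boundaries along edges with small exterior dihedral angles, in line with the minima rule cited above.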

Progressive segmentation. We perform our progressive segmentation method on each small set: Human, FourLeg, Ant, Cup, Chair, Vase and Candelabra. Each set consists of 20–28 shapes. We submit each shape of the set one by one and correct the segmentations continuously. For the first shape, we add the part class labels on the shape as shown in Fig. 3(a); the segmentation result is then refined through interactions. Fig. 5 shows a segmentation example. Two shapes have been submitted, as shown in Fig. 5(a); Fig. 5(b) and (c) show the segmentation results of some other shapes. Notice that the base parts of these four shapes are unlike the trained base parts and cannot be segmented correctly. So the user submits the shape in Fig. 5(b), corrects the false-labeled base part as shown in Fig. 5(e), where the bright yellow point marks the interactive input, and adds this shape to the shape set as shown in Fig. 5(d). We retrain the segmentation model by running our algorithm on this interactive input, and use the retrained model to segment the shapes in Fig. 5(c); the results are shown in Fig. 5(f). We can see that the bases of these shapes are now labeled correctly.

Accuracy and time. To assess the performance of our algorithm quantitatively, we examine the accuracy of the segmentation. The accuracy for one shape is defined following [6,7] as

$$\mathrm{Accuracy}(l, t) = \frac{\sum_i a_i\, \delta(l_i = t_i)}{\sum_i a_i} \qquad (12)$$

where ai is the area of face i, l is the labeling of our algorithm, t is the ground-truth labeling, and δ(x = y) is 1 only if x = y. When a new shape is submitted, users should label and correct it according to the ground-truth labeling as much as possible, and the segmentation model can be retrained after this shape is segmented. We use the segmentation model to segment the remaining shapes and examine the accuracies for these shapes. Then we average these accuracies as the accuracy of submitting this new shape. Fig. 6 illustrates the accuracies for gradually submitting new shapes. Moreover, when the final segmentation result of each submitted shape is accepted by the user, we retrain the segmentation model according to the final segmentation of the whole shape, and examine the training time to demonstrate the efficiency of our method on shape set updating. The training time is shown in Fig. 7.
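Eq. (12) is a straightforward area-weighted agreement between the predicted and ground-truth labelings; a minimal sketch:

```python
def accuracy(labels, truth, areas):
    """Area-weighted accuracy: sum_i a_i * [l_i = t_i] / sum_i a_i."""
    hit = sum(a for l, t, a in zip(labels, truth, areas) if l == t)
    return hit / sum(areas)

# three faces: the two correctly labeled ones hold 3 of the 4 units of area
acc = accuracy(["head", "leg", "leg"], ["head", "leg", "torso"], [2.0, 1.0, 1.0])
```

Weighting by face area makes the metric robust to uneven tessellation: a mislabeled sliver counts far less than a mislabeled large face.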


Fig. 6. Accuracies for gradually adding new shapes.

Fig. 7. Training time for gradually adding new shapes.

Fig. 8. The performance on the Large Chairs set: (a) accuracies and (b) training time for gradually adding new shapes.

Fig. 9. Segmentation results of a certain Human shape after submitting 2, 5, 12 and 17 shapes ((a)–(d)).

Fig. 10. Segmentation results of a certain Chair shape after submitting 1, 3, 5 and 11 shapes ((a)–(d)).

Fig. 11. Segmentation results of a certain FourLeg shape after submitting 2, 6, 9 and 17 shapes ((a)–(d)).
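The progressive protocol used throughout these experiments — predict with the current model, let the user correct, then update the model with only the corrections — can be sketched as follows. The class and function names here are illustrative stand-ins, not the paper's actual OMCLP/ORF implementation; a nearest-neighbor rule over 1D "features" suffices to show the data flow:

```python
class SegModel:
    """Stand-in for the OMCLP segmentation model with online random forests."""
    def __init__(self):
        self.samples = []  # accumulated (feature, label) pairs

    def update(self, features, labels):
        # Online update: only the newly corrected faces are absorbed;
        # previously submitted shapes are never re-processed.
        self.samples.extend(zip(features, labels))

    def predict(self, feature):
        if not self.samples:
            return None  # nothing learned yet
        return min(self.samples, key=lambda s: abs(s[0] - feature))[1]

def progressive_session(shapes, corrections):
    """shapes: list of per-face feature lists.
    corrections: {shape_index: {face_index: label}} from user clicks."""
    model = SegModel()
    results = []
    for i, shape in enumerate(shapes):
        labeling = [model.predict(f) for f in shape]  # current guess
        fixes = corrections.get(i, {})
        for face, label in fixes.items():
            labeling[face] = label                    # apply user fixes
        model.update([shape[f] for f in fixes], list(fixes.values()))
        results.append(labeling)
    return results

# The first shape is fully corrected by hand; the second is then
# labeled by the updated model with no interaction.
out = progressive_session([[0.1, 0.9], [0.2, 0.8]],
                          {0: {0: "leg", 1: "head"}})
print(out[1])  # ['leg', 'head']
```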

For the large set, Large Chairs, which contains 400 shapes, we use the ground-truth segmentation as the corrected final segmentation, and measure the accuracies and training time as in the experiments above. Fig. 8 illustrates the results. All experiments were performed on an Intel(R) Core(TM) 3.10 GHz CPU with 8 GB RAM. Overall, the accuracies increase as shapes accumulate. The training is efficient: the training time does not grow rapidly with the number of submitted shapes, and stays below 1 min throughout the procedure.

Segmentation as shapes accumulate. In Figs. 9–15, we show the segmentation of some typical shapes as other shapes from the same family are accumulated, to give the reader a visual impression. Fig. 9 shows the segmentation results of a certain Human shape after submitting different numbers of shapes. In the early stage, the segmentation of this shape is poor: after submitting 2 shapes, some regions of the lower leg are segmented as the head, and part of the upper arm is segmented as the upper leg (Fig. 9(a)). After submitting 5 shapes, the lower leg is segmented correctly (Fig. 9(b)). After submitting 12 shapes, the head is segmented correctly and the upper leg becomes more accurate (Fig. 9(c)), and after submitting 17 shapes, the whole shape is segmented nearly perfectly (Fig. 9(d)). Similarly, the segmentation of a certain Chair shape is poor when only a few shapes have been submitted, for example 1 or 3 shapes, as shown in Fig. 10(a) and (b). As the number of shapes increases, the parts become correctly partitioned (Fig. 10(c)), and after 11 shapes are submitted, the segmentation of this shape is close to perfect (Fig. 10(d)). The segmentation results of some FourLeg, Ant, Cup, Vase and Candelabra shapes are shown in Figs. 11–15. We can see that the segmentation becomes more and more accurate as shapes accumulate.

Intention expression. Since shape segmentation is generally a basic step in many other applications, it is necessary to segment shapes in different ways according to the specific application requirements. Our method allows users to segment the shapes according to their intention. They can modify the segmentation results in detail. In Fig. 16(a), the user clicks on the falsely labeled regions of the segmentation result in Fig. 9(d), and the segmentation can

Fig. 12. Segmentation results of a certain Ant shape after submitting 1, 3, 4 and 9 shapes ((a)–(d)).

Fig. 13. Segmentation results of a certain Cup shape after submitting 1, 3, 8 and 12 shapes ((a)–(d)).

Fig. 14. Segmentation results of a certain Vase shape after submitting 1, 7, 10 and 16 shapes ((a)–(d)).

Fig. 15. Segmentation results of a certain Candelabra shape after submitting 1, 2, 3 and 13 shapes ((a)–(d)).

Fig. 16. Users can correct the segmentation results in detail: (a), (c) interactions; (b), (d) final results.

Fig. 17. Users can control the scope of some segmentation part: (a), (c) small leg; (b), (d) big leg.

be refined more precisely, as shown in Fig. 16(b). In Fig. 16(c), the user corrects the segmentation result of Fig. 13(d), and the refined segmentation is shown in Fig. 16(d).

Fig. 18. The labeling and correcting for obtaining different numbers and categories of segmentation parts: (a), (d) initial labelings; (b), (e) correcting; (c) 4 parts; (f) 5 parts.

Users can also control the scope of a certain part in a similar way. Fig. 17 shows an example: the leg parts of these two FourLeg shapes can be segmented differently, into either the small legs or the big legs. Users can also control the number and categories of the segmentation parts across the whole set. Fig. 18 shows an example of two segmentation modes for the Candelabra shapes: one segments the shapes into 4 parts (fire, body, base and handle), and the other into 5 parts (fire, body, support, base and handle). During the whole segmentation process, users should keep their labeling and correcting consistent. At the initial stage, they label the 4 part classes (Fig. 18(a)) or the 5 part classes (Fig. 18(d)) respectively on a shape, and during the progressive segmentation they correct the results according to the categories of parts they want to segment (Fig. 18(b) and (e)). Finally, two types of segmentation results are obtained; Fig. 18(c) and (f) show the results for one of the shapes. Similarly, the FourLeg shapes can be segmented into 5 parts: torso, head, earhorn (ear and horn), leg and tail (Fig. 19(a) and (c)), or into 6 parts: torso, head, ear, horn, leg and tail (Fig. 19(b) and (d)).
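One simple way to realize such independent segmentation modes (an assumption for illustration, not spelled out in the paper) is to key the stored model by the chosen label vocabulary, so that, e.g., the 4-part and 5-part Candelabra labelings never mix:

```python
# Hypothetical registry: one independent model per segmentation mode.
# The stored "model" is reduced to its label vocabulary for brevity;
# the real system would hold a trained segmentation model per entry.

class ModeRegistry:
    def __init__(self):
        self._models = {}

    def get(self, mode):
        if mode not in self._models:
            self._models[mode] = {"labels": set(), "shapes_seen": 0}
        return self._models[mode]

reg = ModeRegistry()
reg.get("candelabra-4")["labels"].update(
    {"fire", "body", "base", "handle"})
reg.get("candelabra-5")["labels"].update(
    {"fire", "body", "support", "base", "handle"})
print(sorted(reg.get("candelabra-5")["labels"] -
             reg.get("candelabra-4")["labels"]))  # ['support']
```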


Fig. 19. Users can control the number and categories of the consistent segmentation parts: (a), (c) 5 parts; (b), (d) 6 parts.

Fig. 20. Two segmentation modes of Candelabra shapes: (a) 4 segmentation parts: fire, body, base and handle; (b) 5 segmentation parts: fire, body, support, base and handle.

Fig. 21. Three segmentation modes of FourLeg shapes: (a) 6 segmentation parts: torso, neck, head, earhorn, leg and tail; (b) 5 segmentation parts: torso, head, earhorn, leg and tail; (c) 6 segmentation parts: torso, head, ear, horn, leg and tail.

The two types of segmentation results of all the Candelabra shapes are shown in Fig. 20, and the three types of segmentation results of all the FourLeg shapes are shown in Fig. 21. The above experimental results demonstrate that users can choose not only the number of parts to be segmented, but also the scope and categories of the segmented parts. Users can therefore freely express their segmentation intention in our method.

Comparison to the state-of-the-art. The state-of-the-art methods are usually confined to segmentation based on a stationary shape set. Kalogerakis et al. [6] need a large number of labeled

shapes to complete the learning process, and use the learned model to segment new shapes. However, if the segmentation of some new shape is different, such as the base part in Fig. 5, it is difficult to update the learned model to capture this difference: the algorithm must be run again from scratch. Hu et al. [12] obtain the co-segmentation of a prepared shape set using subspace clustering. Like the supervised methods, they also need to rerun the algorithm when the shape set changes. In contrast, our approach introduces an online learning algorithm into the segmentation process. We do not need any explicit shape collecting or training stage; the segmentation model is learned and updated gradually as new shapes are segmented. Since the objective of our method differs from that of the aforementioned methods, it is difficult to show these advantages in quantitative comparison experiments. However, to examine the segmentation quality of our method, we compare with the aforementioned methods roughly using the accuracy measurement. We use the segmentation accuracy after submitting 19 shapes as the accuracy of our method. For the method in


Fig. 22. Comparison with the state-of-the-art method.


Fig. 24. Number of interactions for progressively segmenting new shapes.

Meanwhile, this information can be further learned during the segmentation of these new shapes. From Fig. 24, we can see that the number of interactions becomes lower and lower as shapes are submitted, and almost no interactions are needed for the last few shapes in our experiment. This demonstrates that, with the accumulation of shapes, segmenting the submitted shapes becomes more and more precise and easy. As the participants said, the usage is intuitive and the system gets smarter through the segmentation.

Fig. 23. Two different segmentation results of users.

Kalogerakis et al. [6], we use the accuracy obtained when the training set size is 19. For the method in Hu et al. [12], the accuracy of the co-segmentation is used directly. Fig. 22 shows the comparison between our approach and these two methods on the shapes from the PSB. The average accuracies are 93.5%, 95.8% and 87.8%, respectively. Our method achieves performance close to the offline supervised method [6], and outperforms the unsupervised method [12].

User study. Since different users often have different intentions for segmentation, we assume that there is only one user in the whole segmentation process. In multi-user scenarios, there are two possibilities. Firstly, the shape parts to be partitioned are generally consistent, and the users' intention may only indicate the required number or categories of segmentation parts; in this case the learned segmentation model can be shared by different users. Secondly, users may want to obtain personalized segmentation results; in this case we can preserve the corresponding segmentation models for different users in a database.

We performed a user study in which we asked 15 participants to progressively segment the shapes from the Candelabra set (28 shapes). The participants had not used our system before, and started segmenting the shapes directly after we briefly described the function and usage of the system. They were asked to segment each shape until they were satisfied with the result. We found that for most shapes, participants achieved segmentation accuracies higher than 99%, but some shapes were segmented in different ways, as Fig. 23 shows. Most participants segmented these two shapes nearly the same as the ground truth (Fig. 23(a)), while 4 participants segmented them as shown in Fig. 23(b). These examples show that our method can capture users' intention for the segmentation. Fig. 24 shows the average number of interactions for progressively segmenting new shapes. The total number of interactions is about 55, while Wang et al. [8] need 26 link constraints (one link constraint connects two interaction points on the shapes, and so is roughly equal to two interactions in our system) to reach an accuracy of about 98%. Our method is thus comparable with theirs in terms of interactive complexity. Moreover, the method in [8] only considers segmentation and interaction within a specific shape set. Unlike them, we progressively segment shapes with users' interactions, so both the set information and the user intention can be learned to segment many other new shapes.
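The cost argument behind this comparison can be seen in a toy form: an online model touches only the newly submitted faces, while a batch method must revisit every shape. The running-mean class statistics below are an illustrative stand-in for the actual OMCLP update, not the paper's algorithm:

```python
# Toy incremental update: per-class running means of a 1D feature.
# update() costs O(new faces) no matter how many shapes came before,
# which is why training time stays roughly flat during updating.

class OnlineClassStats:
    def __init__(self):
        self.count = {}
        self.mean = {}

    def update(self, features, labels):
        for x, y in zip(features, labels):
            n = self.count.get(y, 0) + 1
            m = self.mean.get(y, 0.0)
            self.count[y] = n
            self.mean[y] = m + (x - m) / n  # Welford-style running mean

stats = OnlineClassStats()
stats.update([1.0, 3.0], ["leg", "leg"])  # first shape's faces
stats.update([5.0], ["leg"])              # later shape: only 1 face touched
print(stats.mean["leg"])  # 3.0, the mean of all three values
```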

5. Conclusion

We presented a novel method that achieves 3D shape segmentation progressively using an online learning algorithm. It has four advantages. Firstly, the segmentation model is learned gradually as the shapes are segmented, so our method does not need any explicit shape collecting or training stage. Secondly, users can correct the segmentation in an easy and intuitive way, and as user interactions and shapes accumulate, the segmentation of the submitted shapes becomes more and more precise. Thirdly, we use an online learning algorithm to update the segmentation model efficiently as new shapes are submitted. Finally, users can segment the 3D shapes according to their intention: they can direct the segmentation mode, including the scope, number and categories of consistent parts, during the segmentation. Experiments demonstrated these advantages. However, for some organic models the boundary is difficult to determine; we plan to investigate whether boundary information can be learned from users' interactions. Moreover, we may consider using sketch-based interactive shape segmentation approaches to improve our method and make the interaction more natural.

Acknowledgments

This work is supported by the National Natural Science Foundation of China No. 61272219, 61100110, 61321491; the National High Technology Research and Development Program of China No. 2007AA01Z334; the Key Projects Innovation Fund of State Key Laboratory No. ZZKT2013A12; the Program for New Century Excellent Talents in University of China No. NCET-04-04605; the Graduate Training Innovative Projects Foundation of Jiangsu Province No. CXLX13 050; the Science and Technology Program of Jiangsu Province No. BE2010072, BE2011058, BY2012190.

References

[1] Chen X, Golovinskiy A, Funkhouser T. A benchmark for 3D mesh segmentation. ACM Trans Graph 2009;28(3):73:1–73:12.
[2] Chaudhuri S, Kalogerakis E, Guibas L, Koltun V. Probabilistic reasoning for assembly-based 3D modeling. ACM Trans Graph 2011;30(4).
[3] Agathos A, Pratikakis I, Perantonis S, Sapidis N, Azariadis P. 3D mesh segmentation methodologies for CAD applications. Comput Aided Des Appl 2007;4(6):827–41.


[4] Shamir A. A survey on mesh segmentation techniques. Comput Graph Forum 2008;27(6):1539–56.
[5] Golovinskiy A, Funkhouser T. Consistent segmentation of 3D models. Comput Graph 2009;33(3):262–9.
[6] Kalogerakis E, Hertzmann A, Singh K. Learning 3D mesh segmentation and labeling. ACM Trans Graph 2010;29(4):102:1–102:12.
[7] Sidi O, van Kaick O, Kleiman Y, Zhang H, Cohen-Or D. Unsupervised co-segmentation of a set of shapes via descriptor-space spectral clustering. ACM Trans Graph 2011;30(6):126:1–126:10.
[8] Wang Y, Asafi S, van Kaick O, Zhang H, Cohen-Or D, Chen B. Active co-analysis of a set of shapes. ACM Trans Graph 2012;31(6):165.
[9] van Kaick O, Tagliasacchi A, Sidi O, Zhang H, Cohen-Or D, Wolf L, Hamarneh G. Prior knowledge for part correspondence. Comput Graph Forum 2011;30(2):553–62.
[10] Xu K, Li H, Zhang H, Cohen-Or D, Xiong Y, Cheng Z-Q. Style-content separation by anisotropic part scales. ACM Trans Graph 2010;29(6):184:1–184:10.
[11] Huang Q, Koltun V, Guibas L. Joint shape segmentation with linear programming. ACM Trans Graph 2011;30(6):125:1–125:12.
[12] Hu R, Fan L, Liu L. Co-segmentation of 3D shapes via subspace clustering. Comput Graph Forum 2012;31(5):1703–13.
[13] Meng M, Xia J, Luo J, He Y. Unsupervised co-segmentation for 3D shapes using iterative multi-label optimization. Comput Aided Des 2013;45(2):312–20.
[14] Luo P, Wu Z, Xia C, Feng L, Ma T. Co-segmentation of 3D shapes via multi-view spectral clustering. Vis Comput 2013;29(6–8):587–97.
[15] Wu Z, Wang Y, Shou R, Chen B, Liu X. Unsupervised co-segmentation of 3D shapes via affinity aggregation spectral clustering. Comput Graph 2013;37(6):628–37.
[16] Lv J, Chen X, Huang J, Bao H. Semi-supervised mesh segmentation and labeling. Comput Graph Forum 2012;31(7).
[17] Wu Z, Shou R, Wang Y, Liu X. Interactive shape co-segmentation via label propagation. Comput Graph 2014;38:248–54.
[18] Saffari A, Godec M, Pock T, Leistner C, Bischof H. Online multi-class LPBoost. In: 2010 IEEE conference on computer vision and pattern recognition. IEEE; 2010. p. 3570–7.
[19] Saffari A, Leistner C, Santner J, Godec M, Bischof H. On-line random forests. In: 2009 IEEE 12th international conference on computer vision workshops. IEEE; 2009. p. 1393–400.
[20] Boykov Y, Veksler O, Zabih R. Fast approximate energy minimization via graph cuts. IEEE Trans Pattern Anal Mach Intell 2001;23(11):1222–39.
[21] Shlafman S, Tal A, Katz S. Metamorphosis of polyhedral surfaces using decomposition. In: Computer graphics forum. 2002. p. 219–28.
[22] Katz S, Tal A. Hierarchical mesh decomposition using fuzzy clustering and cuts. ACM Trans Graph 2003;22(3):954–61.
[23] Lai Y-K, Zhou Q-Y, Hu S-M, Martin RR. Feature sensitive mesh segmentation. In: Proceedings of the 2006 ACM symposium on solid and physical modeling. 2006. p. 17–25.
[24] Attene M, Falcidieno B, Spagnuolo M. Hierarchical mesh segmentation based on fitting primitives. Vis Comput 2006;22(3):181–93.
[25] Lai Y-K, Hu S-M, Martin RR, Rosin PL. Rapid and effective segmentation of 3D models using random walks. Comput Aided Geom Design 2009;26(6):665–79.
[26] Liu R, Zhang H. Segmentation of 3D meshes through spectral clustering. In: Proceedings of the 12th Pacific conference on computer graphics and applications. 2004. p. 298–305.
[27] Chazelle BM. Convex decompositions of polyhedra. In: Proceedings of the thirteenth annual ACM symposium on theory of computing. 1981. p. 70–9.
[28] Lavoué G, Dupont F, Baskurt A. A new CAD mesh segmentation method, based on curvature tensor analysis. Comput Aided Des 2005;37(10):975–87.

[29] Benhabiles H, Lavoué G, Vandeborre J-P, Daoudi M. Learning boundary edges for 3D-mesh segmentation. Comput Graph Forum 2011;30(8).
[30] Kreavoy V, Julius D, Sheffer A. Model composition from interchangeable components. In: 15th Pacific conference on computer graphics and applications. 2007. p. 129–38.
[31] Zhang F, Sun Z, Song M, Lang X, Yan H. 3D shapes co-segmentation by combining fuzzy c-means with random walks. In: Proceedings of 13th international conference on computer-aided design and computer graphics. IEEE; 2013. p. 16–23.
[32] Kim VG, Li W, Mitra NJ, Chaudhuri S, DiVerdi S, Funkhouser T. Learning part-based templates from large collections of 3D shapes. ACM Trans Graph 2013;32(4):70:1–70:12.
[33] Sharf A, Blumenkrants M, Shamir A, Cohen-Or D. SnapPaste: an interactive technique for easy mesh composition. Vis Comput 2006;22(9–11):835–44.
[34] Lee Y, Lee S, Shamir A, Cohen-Or D, Seidel H-P. Mesh scissoring with minima rule and part salience. Comput Aided Geom Design 2005;22(5):444–65.
[35] Meng M, Fan L, Liu L. iCutter: a direct cut-out tool for 3D shapes. Comput Animat Virtual Worlds 2011;22(4):335–42.
[36] Ji Z, Liu L, Chen Z, Wang G. Easy mesh cutting. In: Computer graphics forum, vol. 25. Wiley Online Library; 2006. p. 283–91.
[37] Wu H-Y, Pan C, Pan J, Yang Q, Ma S. A sketch-based interactive framework for real-time mesh segmentation. In: Computer graphics international. 2007.
[38] Zhang J, Wu C, Cai J, Zheng J, Tai X-C. Mesh snapping: robust interactive mesh cutting using fast geodesic curvature flow. In: Computer graphics forum, vol. 29. Wiley Online Library; 2010. p. 517–26.
[39] Fan L, Liu K, et al. Paint mesh cutting. In: Computer graphics forum, vol. 30. Wiley Online Library; 2011. p. 603–12.
[40] Granger E, Savaria Y, Lavoie P. A pattern reordering approach based on ambiguity detection for online category learning. IEEE Trans Pattern Anal Mach Intell 2003;25(4):524–8.
[41] Pham M-T, Cham T-J. Online learning asymmetric boosted classifiers for object detection. In: IEEE conference on computer vision and pattern recognition. 2007. p. 1–8.
[42] Collins R, Liu Y, Leordeanu M. Online selection of discriminative tracking features. IEEE Trans Pattern Anal Mach Intell 2005;27(10):1631–43.
[43] Tombari F, Di Stefano L, Giardino S. Online learning for automatic segmentation of 3D data. In: 2011 IEEE/RSJ international conference on intelligent robots and systems (IROS). 2011. p. 4857–64.
[44] Gal R, Cohen-Or D. Salient geometric features for partial shape matching and similarity. ACM Trans Graph 2006;25(1):130–50.
[45] Shapira L, Shalom S, Shamir A, Cohen-Or D, Zhang H. Contextual part analogies in 3D objects. Int J Comput Vis 2010;89(2–3):309–26.
[46] Hilaga M, Shinagawa Y, Kohmura T, Kunii TL. Topology matching for fully automatic similarity estimation of 3D shapes. In: Proceedings of the 28th annual conference on computer graphics and interactive techniques. New York (NY, USA): ACM; 2001. p. 203–12.
[47] Belongie S, Malik J, Puzicha J. Shape matching and object recognition using shape contexts. IEEE Trans Pattern Anal Mach Intell 2002;24(4):509–22.
[48] Golovinskiy A, Funkhouser T. Randomized cuts for 3D mesh analysis. ACM Trans Graph 2008;27:145.
[49] Demiriz A, Bennett KP, Shawe-Taylor J. Linear programming boosting via column generation. Mach Learn 2002;46(1–3):225–54.
[50] Nocedal J, Wright S. Numerical optimization. Springer series in operations research and financial engineering. 1999.
[51] Hoffman DD, Richards W, Pentl A, Rubin J, Scheuhammer J. Parts of recognition. Cognition 1984;18:65–96.
