ECML-PKDD, 5-9 September 2011
PerTurbo: a new classification algorithm based on the spectrum perturbations of the Laplace-Beltrami operator
Nicolas Courty, Thomas Burger and Johann Laurent
Université de Bretagne Sud
Problem and motivations
1. Supervised classification problem
2. A geometric point of view on manifold processing
3. In 3-dimensional problems, a strong parallel with computer graphics
Outline
1. Sampling problems in computer graphics
2. Application to machine learning and classification
3. Experimental results
4. Future works
5. Conclusion
[email protected]
3/26
Sampling problems
Application to ML
Experimental results
Future works
Conclusion
Outline
1
Sampling problems in computer graphics
2
Application to machine learning and classification
3
Experimental results
4
Future works
5
Conclusion
[email protected]
4/26
Sampling problems
Application to ML
Experimental results
Future works
Conclusion
Sampling point clouds in computer graphics
1. Sampling real-life objects requires a very large number of points
2. Reducing the number of samples allows compact storage
3. Only sufficiently informative samples are kept
Spectral sampling
A metric is needed to quantify how a new sample modifies the definition of the object surface [Öztireli, Alexa & Gross, 2010].
Riemannian manifolds and the Laplace-Beltrami operator
1. The surface is assumed to be a Riemannian manifold M (i.e. differentiable everywhere).
2. The Laplace-Beltrami operator Δ(·) is the generalization of the Laplace operator to Riemannian manifolds.
3. The spectrum of Δ(M) completely defines M up to an isometry.
Approximating the Laplace-Beltrami operator
1. The surface being unknown, the computation of Δ(M) is impossible.
2. It can be approximated with the Gram matrix K of the samples {x_1, ..., x_i, ..., x_N}, fitted with a Gaussian dissimilarity measure:

    K_{ij} = k(x_i, x_j) = \exp\left( -\frac{\| x_i - x_j \|^2}{2\sigma^2} \right)    (1)

3. The spectrum of K is used to characterize the real object surface [Coifman & Lafon, 2006], and its perturbation is used to evaluate the interest of a sample [Öztireli, Alexa & Gross, 2010].
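As an illustration, a minimal NumPy sketch of Eq. (1); the function name `gaussian_gram` and the array layout are our assumptions, not code from the paper:

```python
import numpy as np

def gaussian_gram(X, sigma):
    """Gram matrix of Eq. (1): K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).

    X     : (N, d) array holding the samples x_1, ..., x_N as rows
    sigma : bandwidth of the Gaussian kernel
    """
    # Pairwise squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2<a, b>
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))
```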
Idea #1: Comparing a class to a surface
The perturbation of K:
1. In computer graphics, it quantifies the interest of a sample.
2. In statistics?
   1. Only an outlier modifies the distribution of its class.
   2. An interesting clue to whether a sample belongs to a class!
Idea #2: Class-wise manifold learning
1. For each class ℓ, a dedicated manifold M_ℓ is considered.
2. All the manifolds are learned independently.
3. Each class ℓ is represented by a particular matrix K_ℓ.

Figure: Synthetic example of a 2-class problem where each class is embedded in a dedicated manifold.
Remarks and notations
Remarks:
1. K_ℓ is also the classical Gram matrix in the Gaussian RKHS.
2. The proxy for the Laplace-Beltrami operator ≡ a kernel trick.

Notations:
- X: the input space (spanned by the problem variables)
- Z: the feature space associated with the Gaussian kernel
- φ(·): the mapping from X into Z
- T_ℓ: the set of training examples for M_ℓ
- K_ℓ: the Gram matrix of φ(T_ℓ) in Z
- x̃: a test sample in X
The perturbation measure in the Gaussian RKHS (1)
Let us project φ(x̃) onto the manifold M_ℓ in Z, i.e. onto ⟨φ(T_ℓ)⟩:

    \phi(\tilde{x}) = \underbrace{r_\ell(\tilde{x})}_{\text{coplanar to } \langle \phi(T_\ell) \rangle} + \underbrace{o_\ell(\tilde{x})}_{\text{orthogonal to } \langle \phi(T_\ell) \rangle}    (2)

From [Öztireli, Alexa & Gross, 2010], the perturbation of K_ℓ by x̃ comes from o_ℓ(x̃). Thus, the perturbation measure reads:

    \tau(\tilde{x}, \mathcal{M}_\ell) = \frac{\| o_\ell(\tilde{x}) \|^2}{\| \phi(\tilde{x}) \|^2} = 1 - \frac{\| r_\ell(\tilde{x}) \|^2}{\| \phi(\tilde{x}) \|^2} = 1 - \| r_\ell(\tilde{x}) \|^2    (3)

where the last equality holds because ‖φ(x̃)‖² = k(x̃, x̃) = 1 for the Gaussian kernel.

If Φ_ℓ is the matrix whose columns are the elements of φ(T_ℓ), then the projector onto ⟨φ(T_ℓ)⟩ can be written as Φ_ℓ (Φ_ℓᵀ Φ_ℓ)⁻¹ Φ_ℓᵀ.
The perturbation measure in the Gaussian RKHS (2)

    \begin{aligned}
    \| r_\ell(\tilde{x}) \|^2 &= \| \Phi_\ell (\Phi_\ell^T \Phi_\ell)^{-1} \Phi_\ell^T \phi(\tilde{x}) \|^2 \\
    &= \phi(\tilde{x})^T \Phi_\ell (\Phi_\ell^T \Phi_\ell)^{-1} (\Phi_\ell^T \Phi_\ell) (\Phi_\ell^T \Phi_\ell)^{-1} \Phi_\ell^T \phi(\tilde{x}) \\
    &= \underbrace{(\phi(\tilde{x})^T \Phi_\ell)}_{k_\ell(\tilde{x})^T} \; \underbrace{((\Phi_\ell^T \Phi_\ell)^T)^{-1}}_{K_\ell^{-1}} \; \underbrace{(\Phi_\ell^T \phi(\tilde{x}))}_{k_\ell(\tilde{x})}
    \end{aligned}    (4)

where the middle factors cancel since (Φ_ℓᵀ Φ_ℓ)⁻¹ (Φ_ℓᵀ Φ_ℓ) = I. Finally:

    \tau(\tilde{x}, \mathcal{M}_\ell) = 1 - k_\ell(\tilde{x})^T K_\ell^{-1} k_\ell(\tilde{x})    (5)
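A direct NumPy transcription of Eq. (5) for a single class, assuming K_ℓ is invertible (the naming is ours; regularization options are discussed later):

```python
import numpy as np

def perturbation(x_new, T, sigma):
    """Perturbation tau(x, M_l) = 1 - k_l(x)^T K_l^{-1} k_l(x) of Eq. (5).

    x_new : (d,) test sample
    T     : (N, d) training samples of one class (the set T_l)
    sigma : bandwidth of the Gaussian kernel
    """
    sq = np.sum(T ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (T @ T.T)
    K = np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))                # K_l
    k = np.exp(-np.sum((T - x_new) ** 2, axis=1) / (2.0 * sigma ** 2))  # k_l(x)
    # Solving K alpha = k avoids forming K^{-1} explicitly
    alpha = np.linalg.solve(K, k)
    return 1.0 - k @ alpha
```

Since ‖φ(x̃)‖ = 1, the returned value lies between 0 (x̃ fully explained by the class manifold) and 1 (x̃ orthogonal to it).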
PerTurbo: a new classification algorithm
Training step: for every class ℓ, K_ℓ⁻¹ is computed.

Testing step:
1. The dissimilarity of a new test sample x̃ to each class ℓ is derived from the perturbation of M_ℓ induced by x̃:

    \tau(\tilde{x}, \mathcal{M}_\ell) = 1 - k_\ell(\tilde{x})^T K_\ell^{-1} k_\ell(\tilde{x})    (6)

2. The sample x̃ is associated with the class inducing the least perturbation, which reads:

    \arg\min_\ell \, \tau(\tilde{x}, \mathcal{M}_\ell)    (7)
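Putting both steps together, a hedged sketch of the whole classifier; the class/attribute names and the small ridge term `reg` (our addition, only so that K_ℓ is always invertible) are assumptions, not the authors' code:

```python
import numpy as np

class PerTurbo:
    """Minimal PerTurbo classifier: one Gram matrix per class (Eqs. 6-7)."""

    def __init__(self, sigma=1.0, reg=1e-8):
        self.sigma, self.reg = sigma, reg

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.T_, self.Kinv_ = {}, {}
        for c in self.classes_:
            T = X[y == c]
            sq = np.sum(T ** 2, axis=1)
            d2 = sq[:, None] + sq[None, :] - 2.0 * (T @ T.T)
            K = np.exp(-np.maximum(d2, 0.0) / (2.0 * self.sigma ** 2))
            # Training step: store T_l and K_l^{-1} (ridge term for stability)
            self.T_[c] = T
            self.Kinv_[c] = np.linalg.inv(K + self.reg * np.eye(len(T)))
        return self

    def predict(self, X):
        taus = np.empty((len(X), len(self.classes_)))
        for j, c in enumerate(self.classes_):
            d2 = np.sum((X[:, None, :] - self.T_[c][None, :, :]) ** 2, axis=2)
            k = np.exp(-d2 / (2.0 * self.sigma ** 2))  # row i holds k_l(x_i)^T
            # Eq. (6): tau(x, M_l) = 1 - k^T K^{-1} k, one row at a time
            taus[:, j] = 1.0 - np.einsum('ij,jk,ik->i', k, self.Kinv_[c], k)
        # Eq. (7): the class with the least induced perturbation wins
        return self.classes_[np.argmin(taus, axis=1)]
```

Usage: `PerTurbo(sigma=0.5).fit(X_train, y_train).predict(X_test)`.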
Experimental setting and analysis of the results
Experimental setting:
1. Several tests conducted on simulated and real datasets [UCI Machine Learning Repository]
2. Comparison to several algorithms (among which SVMs)
3. SVMs are fully optimized with cross-validation
4. Several versions of PerTurbo are tested (see the article)

Qualitative results of PerTurbo:
1. Performance similar to SVMs
2. Less efficient with missing values or binary variables
3. The best-performing version depends on the problem
Datasets
Table: Description of the simulated/UCI datasets

Dataset            #Training  #Tests  #Classes  #Variables  Comments
SimData-1              200       800     10         19      64 components
SimData-2              200       800     10         26      64 components
SimData-3              200       800     10         31      75 components
Ionosphere              71       280      2         34
Diabetes               154       614      2          8      missing values
Blood-transfusion      150       598      2          4
Ecoli                   67       269      8          7
Glass                   43       171      6          9
Wines                   36       142      3         13      too small for CV
Parkinsons              39       156      2         22
Letter-reco           4000     16000     26         16
Hill-valley1           606       606      2        100      50% unlabeled
Hill-valley2           606       606      2        100      50% unlabeled
Accuracy rates
Table: Comparison of the accuracy rates (mean and standard deviation, in percentages) with SVM (with optimized parameters and hyper-parameters)

Dataset            PerTurbo     SVM
SimData-1          79.0 (1.8)   76.7 (1.3)
SimData-2          54.7 (2.5)   45.0 (1.8)
SimData-3          19.7 (1.2)   16.2 (1.3)
Ionosphere         92.1 (1.6)   92.5 (1.8)
Diabetes           72.6 (2.2)   74.0 (1.7)
Blood-transfusion  76.9 (1.0)   77.6 (1.5)
Ecoli              83.7 (2.5)   83.2 (2.2)
Glass              65.4 (2.9)   60.6 (4.4)
Wines              72.6 (1.3)   96.1 (1.7)
Parkinsons         83.8 (1.3)   85.0 (3.3)
Letter-reco        92.7 (0.2)   91.9 (0.3)
Hill-valley1       60.7         56.4
Hill-valley2       59.9         55.3
Active Learning (1)
It is rather simple to find the borders between the classes:

    B = \{ x \in X \ \text{s.t.} \ |\tau_{r(1)}(x) - \tau_{r(2)}(x)| = 0 \}

where r(i) denotes the i-th least perturbed class for x. Then,

    B' = \{ x \in X \ \text{s.t.} \ |\tau_{r(1)}(x) - \tau_{r(2)}(x)| < \gamma \}

with γ < 1, corresponds to the region to query with an active learning policy.
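A sketch of how B′ could drive the queries on an unlabeled pool, assuming the perturbations τ(x, M_ℓ) have already been computed for every pool sample (the function name and the handling of γ are ours):

```python
import numpy as np

def query_region(taus, gamma):
    """Indices of pool samples in B' = {x : tau_r(2)(x) - tau_r(1)(x) < gamma}.

    taus  : (n_pool, n_classes) matrix of perturbations tau(x, M_l)
    gamma : width of the band around the class borders (gamma < 1)
    """
    part = np.partition(taus, 1, axis=1)  # columns 0 and 1: two smallest taus
    margin = part[:, 1] - part[:, 0]      # tau_r(2) - tau_r(1), always >= 0
    return np.where(margin < gamma)[0]    # samples worth sending to the oracle
```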
Active Learning (2)
[email protected]
22/26
Others
Regularization:
1. What if K_ℓ is not invertible? ⇒ pseudo-inverse
2. Matrix regularization

Graph Laplacian Eigenmaps, dimensionality reduction:
1. K_ℓ has the same spectrum as the Graph Laplacian
2. We provide a new "projection method" for GLE [Belkin, 2003]

Kernel Mahalanobis distance:
1. Mahalanobis distance in kernel space [Haasdonk & Pekalska, 2008]
2. The perturbation measure is similar to some of their distances

Exhaustive study for parameter tuning:
1. The standard deviation σ associated with the Gaussian kernel
2. Additional parameters: regularization, spectrum truncation, ...
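A sketch of the three regularization options mentioned above, under our naming and default-value assumptions:

```python
import numpy as np

def regularized_inverse(K, mode="ridge", eps=1e-8, n_keep=None):
    """Three options for inverting a possibly singular Gram matrix K_l.

    "pinv"     : Moore-Penrose pseudo-inverse
    "ridge"    : invert K + eps * I (matrix regularization)
    "truncate" : keep only the n_keep leading eigenpairs of the spectrum
    """
    if mode == "pinv":
        return np.linalg.pinv(K)
    if mode == "ridge":
        return np.linalg.inv(K + eps * np.eye(K.shape[0]))
    # K is symmetric positive semi-definite: use its eigendecomposition
    w, V = np.linalg.eigh(K)            # eigenvalues in ascending order
    w, V = w[-n_keep:], V[:, -n_keep:]  # leading part of the spectrum
    return V @ np.diag(1.0 / w) @ V.T
```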
PerTurbo: Main results
1. A classification algorithm inspired by computer graphics results
2. Very few parameters to tune
3. Performance similar to SVM
4. Perspectives:
   1. Active learning
   2. Matrix regularization
   3. Manifold learning and dimensionality reduction
   4. Kernel Mahalanobis distance
Questions
Thank you!
[email protected]
26/26