complexity constant by merging a pair of components for each new ker- nel added. ... incremental one-class learning algorithm (Incremental SVDD [5]) on a va- ... entirely absent during training, but their subsequent identification is of crucial ..... rithm has bounded computational complexity, we recorded the time taken to.
! !
" ! ! #
! ! " $ "
% &'' ()*+ " ,
!
!"
#
$ $
% $ $
$ % % & % % % ' % $
% ( )
*
#
+,-
" . % $ ) ( $ % ! / $ % )
) % $ ' 0 % !""
# $%&'(& )$% *$ +& !""
- . " / - - !
)0
$ ( '
$ / %
1
% & $ ' ) %
$ ' ( % $
) 2 % " 0 "% $ $ $ 0 $ 3 $ $
% 2 0 +4- 5 % ( $ ' +6- 7
! 0 $ 89 8$ 9$ : " * ! # $ $ 8 $ $
/ % %
) $ % *
+;-% $ > = > > " $ > +?- =>> ) % $ ) "
+?- ' % =>>
$ =
$
% /
$ > +@-" % % ) $ : $ 9 +A-" =>> $ % :
$ : )
12
33 3/ 4
2 4%
,% =>> +;- : +A-" 3
( % % =>> ( :
( %
% / $ %
)
' B > 8 ( $ 0 Σ " % ) z % N X = {x1 , . . . , xN }% / N 1 − 1 (z−xn )T Σ −1 (z−xn ) · p(z) = e 2 d 1 · (2π) 2 |Σ| 2 N n=1
1
C"
(2π) 1|Σ| d " C / $ xN EW % X / X → X ∪ {xN EW } 4" $ Σ % $ Σ(σ) = I d · σ2 % σ σ ! > +C-
D +CE-" ( )
$ : % ) σ → 0 σ % ! d 2
1 2
- . " / - - !
15
$ xn ∀x = xn % / LL(σ) =
N n=1
log
1 d
(2πσ) 2
1 − 12 (xn −x)T (xn −x) · · ∀x=xn e 2σ N −1
,"
8 " ," σ% σ = arg maxσ (LL(σ))
C" ( %
F $ $ 0 $ % 0 $ 0 $ +4-/ $ $
%
3 $ $
N = : 0 $ "% ) "% " /
Nmax
w1...N = N1 Σ1...N = I d · σf2 inal μ1...N = x1...N
@"
Gi = {wi , μi, Σi} Gj = {wj , μj , Σj }
?" $ $ N × N $ C #, N (N2 −1) # Nmax
C" max
max
' $ 0 . $ )$ ) %
$ % " /
wmerge(i,j) = wi + wj wj i μmerge(i,j) = wiw+w μi + wi +w · μj j j wi Σmerge(i,j) = wi +wj Σi + (μi − μmerge(i,j) )(μi − μmerge(i,j) )T wj Σj + (μj − μmerge(i,j) )(μj − μmerge(i,j) )T + wi +w j
;"
, ! , ! Nmax + 1 " !
16
33 3/ 4
2 : B < ! 8 % ∞ p(x) log p(x) B< KL(P ||Q) = −∞ q(x) dx ( ) $ $ Q P 0 Gp = {μp , Σp } Gq = {μq , Σq }% +4-/ KL(Gp ||Gq ) =
1 2
|Σq | + (Σq−1 Σp ) + (μp − μq )Σq−1 (μp − μq )T − d log |Σp |
G" ( Gi Gj i = j " Gmerge(i,j) 0 +4-" B < Gmerge(i,j) / cost(Gi , Gj ) = wi KL(Gi ||Gmerge(i,j) ) + wj KL(Gj ||Gmerge(i,j) ) ?" 2 $ xN EW % GN +1 = { N 1+1 , xN EW , I d · σf2inal } % $ NN +1 % Nex $ $ % {Gi , Gj } = arg minG ,G (cost(Gi , Gj )) ' {Gi, Gj } $ % Gi Gmerge(i,j) Gj 1 Gj Gi Gmerge(i,j) GN +1 % $ C ( )$ Nmax + 1 ?" $ % $
max
ex
ex ex
i
j
max
0 % ) * " $ % . % ( $
(
%
% )
' $ 10%% ) ) 90%
- . " / - - !
17
' )
% ) $ Test Data (2500 points inside spiral, 2500 outliers)
Training Data (2500 points inside spiral)
inside spiral outliers
Kernels after 100 Training Examples (pre−merging)
Mixture Model after 2500 Training Examples
= 1 s.d.
Synthetic Spiral Dataset (known class=’inside spiral’)
Comparison with incremental SVDD (mean ROC from 10 trials)
1
1
0.9
0.9
0.8
0.8 0.7
0.6
True positive rate
Performance
0.7 True positives (standard deviation) True positives (mean from 10 trials) False positives (standard deviation) False positives (mean from 10 trials)
0.5 0.4
0.6
0.4
0.3
0.3
0.2
0.2
0.1
0.1
0
0
500
1000 1500 Observations
2000
2500
Our model (operating point) Incremental SVDD (operating point)
0.5
0
0
3
0.2
0.4 0.6 False positive rate
0.8
!
1
18
33 3/ 4 Comparison with incremental SVDD (mean ROC from 10 trials) 1
0.9
0.95
0.8
0.9
0.7
0.85
True positive rate
Performance
Wisconsin Breast Cancer Database (known class=’benign’) 1
True positives (standard devation) True positives (mean from 10 trials) False positives (standard deviation) False positives (mean from 10 trials)
0.6 0.5 0.4 0.3
Incremental SVDD (operating point) Our model (operating point)
0.8 0.75 0.7 0.65
0.2
0.6 merging starts
0.1 0 0
50
100
150 200 250 Observations
0.55 300
350
0.5 0
400
0.02
0.04 0.06 False positive rate
0.08
0.1
0
0.1
0.2 0.3 False positive rate
0.4
0.5
0
0.02
0.04 0.06 False positive rate
0.08
0.1
1
0.9
0.9
0.8
0.8
0.7
0.7 True positive rate
Performance
Letter Image Recognition Data (known class=’A’) 1
0.6 0.5 0.4
0.6 0.5 0.4
0.3
0.3
0.2
0.2
0.1
0.1
0
100
200
300 400 Observations
500
600
0
700
1
0.9
0.95
0.8
0.9
0.7
0.85 True positive rate
Performance
Statlog Databases: Landsat Satellite Images (known class=’red soil’) 1
0.6 0.5 0.4 0.3
0.8 0.75 0.7 0.65
0.2
0.6
0.1
0.55
0
0
200
400
600 800 Observations
1000
1200
3
0.5
!
$
=>> +;-%
$ 2
=>> % % 9*> ) 79.44% T P = 89.69% F P = 0.308%"%
!
11
33 3/ 4
35 C # T P F P & "
) % =>> C CE # *
( $
# K5' 9 <
/
C % G66 6 " % @;A $ 4@C 4 ! 4E%EEE CG " : $ % 4G 2 ?A6 $ L*L %
, "#" $% % G@,; ,G " G # / C;,, $ L L 6EH $ % CEH * ( $ CE # * % $ $ CEE % : =>> ) 35 4%
=>> % 5 & &' ) $ % ,G $ $
) : " ,% )$ $
3 565.56 ± 0.32 C,?6 $ % =>> ) 6.25 ± 0.34 % : 482.53 ± 60.21 8 / 2.88±0.02 5056 $ % =>> 2.01 ± 0.102 $
3 - :;;""" ;< ;9.3
- . " / - - !
1=
2 )
%
: > % =>> % : : +A- ) =>> / % %
5 ' 3>?: > @ $ - -6) 55=)A55=0 %50=1+ 6 # B 3" : C
!
D > 5= )2)A)56 %622)+ 7 C &B B: 3 " 66 E)A561 %6228+ 8 $! '9B ' 3>?: F
@ B 9 . 3 6 5))A5=7 %6225+ ) $! '9B . >: &9 : : > 57 DD > ? - > .
%6227+ 1 $! '9B: '' ' ! 9 5)) = $! '9B ' 3>?: > 3 . 62 5505A5500 %5000+ E $! '9B 9 3: : > ->3 6228 7 717A711 %6228+ 0 G $ B ? # 9 >:
! " ' 9 " ' E 6=)A722 %6228+ 52 H I 9 C 3B: / " - J ' )2 7220A7275 %6221+