Reliable computations of knee point for a curve and introduction of a unit invariant estimation

Demetris T. Christopoulos (a, b)

(a) National and Kapodistrian University of Athens, Department of Economics
(b) [email protected]

Abstract

We investigate the problem of finding the knee of a curve when only a set of points from it is available. We use the Menger curvature and the Extremum Distance Estimator (EDE) in order to compute the knee point. The problems of noisy data and of rescaled data are discussed with respect to the invariance, or not, of knee points and of their estimations. A unit invariant knee point estimator is proposed.

Keywords: knee point estimation, discrete curvature, EDE, unit invariance

1. Introduction

The concept of knee point is used in many fields, like fatigue damage theories [1], [2], [3], detecting the number of clusters [7], [8], the botnet detection problem [9] and system behavior [10]. Although it is broadly used, simple codes for computing it are not easily found, while many problems, like the non invariance under scaling, have not been mentioned or investigated until now. A main result of the paper is that after changing the units in the x or y–axis (rescaling) the knee point is essentially lost, since the new curve has a totally different one and it is not possible to get the initial knee by applying the inverse transformation. Another result is that we can define a measure of the 'knee property', based on the Extremum Distance Estimator (EDE method), which is invariant under the commonly used unit transformations.

The structure of the paper: initial definitions are given in Section 2, the two methods and numerical examples are presented in Section 3, the noisy data problem is solved in Section 4, the rescaling problems in Section 5, the definition of the unit invariant knee is in Section 6, the discussion is in Section 7 and R codes are given in Appendix 9.

Corresponding author: Tel.: +306979210251

December 1, 2014

2. Knee point

First of all, the knee point of a curve that is plotted using y = f(x) is defined as follows.

Definition 2.1. Let f be at least $C^{(4)}([a, b])$. The knee point of f in [a, b] is the unique extreme point of the curvature

$$k(f) = \frac{f''(x)}{\left(1 + (f'(x))^2\right)^{3/2}}$$

in [a, b].

Since we can always drop the non-zero and positive denominator $\left(1 + f'(x)^2\right)^{3/2}$ and all of its powers when computing roots and signs, we obtain by elementary Calculus the next

Corollary 2.1. The knee point χ is the algebraic solution in [a, b] of the characteristic equation

$$E_1(f, \chi) = f'''(\chi)\left(1 + f'(\chi)^2\right) - 3 f'(\chi) f''(\chi)^2 = 0 \tag{1}$$

provided that for χ it also holds that the second derivative

$$E_2(f, \chi) = 3\left(4 f'(\chi)^2 - 1\right) f''(\chi)^3 - 9 f'(\chi)\left(1 + f'(\chi)^2\right) f'''(\chi) f''(\chi) + \left(1 + f'(\chi)^2\right)^2 f''''(\chi) \tag{2}$$

is negative or positive, for a maximum or minimum of the curvature respectively.
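The characteristic equation $E_1(f,\chi) = f'''(\chi)(1 + f'(\chi)^2) - 3 f'(\chi) f''(\chi)^2 = 0$ and the second derivative test $E_2$ can be checked numerically. The following is a minimal sketch in Python (not the paper's R code), assuming the test function f(x) = −1/x + 5 on [0.2, 2] with hand-coded derivatives; the helper names `derivs`, `E1`, `E2` and `bisect` are illustrative only.

```python
def derivs(x):
    """First to fourth derivatives of the assumed test function f(x) = -1/x + 5."""
    return 1 / x**2, -2 / x**3, 6 / x**4, -24 / x**5


def E1(x):
    """Characteristic equation: numerator of the derivative of the curvature."""
    f1, f2, f3, _ = derivs(x)
    return f3 * (1 + f1**2) - 3 * f1 * f2**2


def E2(x):
    """Second derivative test: same sign as the second derivative of the curvature."""
    f1, f2, f3, f4 = derivs(x)
    return (3 * (4 * f1**2 - 1) * f2**3
            - 9 * f1 * (1 + f1**2) * f3 * f2
            + (1 + f1**2) ** 2 * f4)


def bisect(g, lo, hi, iters=200):
    """Plain bisection; assumes g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_lo < 0) == (g_mid < 0):
            lo, g_lo = mid, g_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


chi = bisect(E1, 0.2, 2.0)
e2_at_chi = E2(chi)
print(chi, e2_at_chi)  # root of E1 and the sign-test value at that root
```

For this function the root is at χ = 1 and the test value is positive, so the signed curvature has a minimum there.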

Example 2.1. For f(x) = −1/x + 5, x ∈ [0.2, 2] we compute

$$E_1(f, \chi) = \frac{6}{\chi^4}\left(1 + \frac{1}{\chi^4}\right) - \frac{12}{\chi^8} = 0 \;\rightarrow\; \chi = 1$$

while it also holds

$$E_2(f, 1) = 48 > 0$$

and we indeed have minimum curvature, since the function is strictly concave, see Figure 1 where we have plotted the relevant circles with radii of curvature and the curvature k(f).

Figure 1: True knee point and estimation, f(x) = −1/x + 5, x ∈ [0.2, 2.1]

Example 2.2. For $f(x) = a\,e^{b x}$ we compute

$$E_1(f, \chi) = a b^3 e^{b\chi}\left(1 + a^2 b^2 \left(e^{b\chi}\right)^2\right) - 3 a^3 b^5 \left(e^{b\chi}\right)^3 = 0$$

$$\chi = -\frac{1}{2b}\ln\left(2 a^2 b^2\right) \tag{3}$$

while it also holds

$$E_2\left(f, -\frac{1}{2b}\ln\left(2 a^2 b^2\right)\right) = -\frac{3\,a\,b^4}{\sqrt{2 a^2 b^2}}$$

and we have the maximum curvature at x = χ if a > 0.

3. Methods to compute the knee point

A set of methods have been proposed for this task, using the Menger curvature [4], [5], [6], an angle-based method for the Bayesian information criterion [7], the Kneedle algorithm [10] and dynamically determining the knee [11].

3.1. Using the Menger curvature

Let's first define some useful concepts that we will use.

Definition 3.1. Discrete curvature at point $x_i$ of a curve is the Menger curvature of the points $\{(x_{i-1}, y_{i-1}), (x_i, y_i), (x_{i+1}, y_{i+1})\}$ computed by use of Heron's formula for the area of the corresponding triangle.

Lemma 3.1. Discrete curvature at point $(x_i, y_i)$ of a curve is the Menger curvature

$$DC(x_i) = \frac{\sqrt{A - B^2}}{\|pq\|\,\|qr\|\,\|rp\|} \tag{4}$$

where

$$A = 4\,\|pq\|^2\,\|qr\|^2, \qquad B = \|pq\|^2 + \|qr\|^2 - \|rp\|^2$$

$$\|pq\| = \sqrt{(x_{i-1} - x_i)^2 + (y_{i-1} - y_i)^2}$$

$$\|qr\| = \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2}$$

$$\|rp\| = \sqrt{(x_{i+1} - x_{i-1})^2 + (y_{i+1} - y_{i-1})^2}$$

Proof. The radius of the uniquely defined circle that passes through three non collinear points $p = (x_{i-1}, y_{i-1})$, $q = (x_i, y_i)$, $r = (x_{i+1}, y_{i+1})$ is given in [4], while its inverse value gives the relevant curvature

$$C(p, q, r) = \frac{1}{R(p, q, r)} = \frac{4\,A(p, q, r)}{|p - q|\,|q - r|\,|r - p|}$$

If we also use Heron's formula, well known from ancient times, for computing the area of a triangle

$$A(p, q, r) = \frac{1}{4}\sqrt{4\,|p - q|^2\,|q - r|^2 - \left(|p - q|^2 + |q - r|^2 - |r - p|^2\right)^2}$$

then we can directly find Eq. 4.

Definition 3.2. Discrete knee point for the set $\{(x_i, y_i),\ i = 1, 2, \ldots, n\}$ of curve points is

$$DK_{convex} = \max\{DC(x_i),\ i = 2, \ldots, n - 1\}, \qquad DK_{concave} = \min\{DC(x_i),\ i = 2, \ldots, n - 1\} \tag{5}$$

for a convex or concave curve, respectively. Small R codes for computing Eq. 4 and Eq. 5 are given in Appendix 9.

3.2. Using the Extremum Distance Estimator method

We can find an approximation of knee points by using the Extremum Distance Estimator (EDE) method, as it is defined in [12]. For a convex/concave curve we can find two knee points, approximately given by χF1, χF2 respectively. If we have a strictly convex or concave curve, then the knee point is close to χF1. The above points can be easily computed numerically using [13], by calling the function findiplist(x,y,index) with index = 0 (for a convex/concave or strictly convex curve) and index = 1 (for a concave/convex or strictly concave curve). A small R code illustrating the use can be found in Appendix 9.

The differences between our method and the Kneedle method of [10] are: (i) in EDE there is not any kind of data smoothing, like the smooth splines used in Kneedle, (ii) in EDE there is no kind of conversion to the simplex [0, 1] × [0, 1] that is used in Kneedle and (iii) in EDE there is no threshold value as is defined and used in Kneedle.

3.3. Numerical examples

A sigmoid curve. Let's take for example a convex/concave sigmoid curve which is the plot of the function

$$f(x) = 5 + 5\tanh(x - 5), \quad x \in [0, 10] \tag{6}$$

Then it is easy to find that there exist two knee points that are positioned symmetrically around the inflection point p = 5

$$\chi_1 = 5 - \operatorname{arctanh}\left(\tfrac{1}{15}\sqrt{195}\right) = 3.334536724 \tag{7}$$

$$\chi_2 = 5 + \operatorname{arctanh}\left(\tfrac{1}{15}\sqrt{195}\right) = 6.665463276 \tag{8}$$

with the confirmation for curvature extrema

$$E_2(f, \chi_1) = -\tfrac{16016}{18225}\sqrt{195} = -12.27167454 < 0 \tag{9}$$

$$E_2(f, \chi_2) = +\tfrac{16016}{18225}\sqrt{195} = +12.27167454 > 0 \tag{10}$$

The relevant EDE approximations can theoretically be found using Lemma 1.4 of [12] to be close to the knee points, with an error of about ±0.22

$$x_{F_1} = 5 - \operatorname{arctanh}\left(\frac{\sqrt{10}\,\sqrt{3 + 2 e^{20} + 5 e^{10}}}{5 e^{10} + 5}\right) = 3.556313768 \tag{11}$$

$$x_{F_2} = 5 + \operatorname{arctanh}\left(\frac{\sqrt{10}\,\sqrt{3 + 2 e^{20} + 5 e^{10}}}{5 e^{10} + 5}\right) = 6.443686232 \tag{12}$$

The curve and the points of interest are plotted in Figure 2, where we have also put the relevant circles of curvature with radius being the inverse of the computed discrete curvature.

Figure 2: Knee points for f(x) = 5 + 5 tanh(x − 5)

A strictly concave curve. We are studying the function of Example 2.1 with x ∈ [0.2, 2.1] and knee χ = 1, which is taken from [10], in order to show that the maximum of a rotated graph does not give us the true knee of the initial curve. We convert the x and y data of an equidistant partition with N = 101 points for [0.2, 2.1] to the [0, 1] × [0, 1] range by using the transformation $T_1(a, b, x) = \frac{x - a}{b - a}$, with a, b the min, max values respectively for x and y. When we apply our methods to the simplex we find $DK = \chi_{F_1} = x^{(s)}_{25} = 0.24 \rightarrow 0.656$, while after rotating with θ = −90° we find again $DK = \chi_{F_1} = x^{(s,rot)}_{25} = 0.7129705936 \rightarrow 0.24 \rightarrow 0.656$, see Figure 3. So by applying the conversion to the simplex and then the rotation of −90° we just compute in a complicated way the relevant point of the EDE method and not the knee point, thus the Kneedle method of [10] cannot find the true knee of the curve, which here was χ = 1, see again Figure 1.

Figure 3: EDE estimations for the knee point of f(x) = −1/x + 5, x ∈ [0.2, 2.1] after partially applying the procedure of the Kneedle method [10]

4. The problem of noisy data

It is worth mentioning that when we have a noisy data set the above EDE estimations are close to the true knee points, while the direct Menger curvature computation totally fails, since it is highly sensitive to errors. As an example we add a uniformly distributed error term ∼ U(−0.1, +0.1) to the function of Eq. 6, using an equidistant partition of [a, b] = [0, 10] with 100 intervals, and we find using Eq. 5 the values DK = {x22, x91} = {2.1, 9.0} that are far away from the true knee points, while the EDE estimations χF1,2 = {x36, x67} = {3.5, 6.6} are still close to the true knees, see Figure 4.

Figure 4: Knee points for noisy data of f(x) = 5 + 5 tanh(x − 5)

Thus we can use the EDE method in order to find the knee points when we have noisy data with an acceptable error, compared with the total divergence of the Discrete Curvature approach.

5. The problem of rescaling data

Many times, especially in Physics and Engineering, it is a common practice to rescale data by changing units or by dividing with maximum values, in order to have the so called arbitrary units. Then the knee point in most of the cases is not invariant. We will study the subclass of affine transformations, those with the form u → λ u + µ, λ > 0, because they do not change the order of our data. We will also study the common logarithmic axis rescaling.
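The effect of an affine change of units on the analytic knee can be illustrated with the exponential family $f(x) = a\,e^{bx}$, whose knee is $\chi = -\frac{1}{2b}\ln(2a^2b^2)$ (Eq. 3). The following is a minimal sketch in Python (not the paper's R code); the helper name `knee_exp` is hypothetical, and the numerical values are those used in the paper's exponential examples: a y-axis rescaling of $e^x$ by $\lambda = e^{-3}$, and an x-axis rescaling of $\frac{1}{2}e^{2x/3}$ from [−1.5, 3.5] to [−1, 1].

```python
import math


def knee_exp(a, b):
    """Knee of f(x) = a*exp(b*x): chi = -ln(2 a^2 b^2) / (2 b)."""
    return -math.log(2 * a * a * b * b) / (2 * b)


# y-axis rescaling y -> lam*y of f(x) = exp(x): only the amplitude a changes
chi = knee_exp(1.0, 1.0)
chi_y = knee_exp(math.exp(-3), 1.0)

# x-axis rescaling t = lam*x + mu of f(x) = 0.5*exp(2x/3):
# g(t) = f(Lam*t + M) = a*exp(b*M) * exp(b*Lam*t), with Lam = 1/lam, M = -mu/lam
a, b = 0.5, 2 / 3
lam, mu = 2 / 5, -2 / 5
Lam, M = 1 / lam, -mu / lam
chi_x = knee_exp(a, b)                           # knee in the original x units
t_knee = knee_exp(a * math.exp(b * M), b * Lam)  # knee in the rescaled t units
x_back = Lam * t_knee + M                        # rescaled knee mapped back to x

print(chi, chi_y)
print(chi_x, t_knee, x_back)
```

In both cases the knee computed after rescaling does not coincide with the original one: scaling y by λ shifts the knee by −ln(λ)/b, and mapping the knee found in the t units back to x does not return `chi_x`.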

5.1. Affine axis transformations

5.1.1. Rescaling y–axis

If we study, instead of the data $D = \{(x_i, y_i = f(x_i)),\ i = 1, 2, \ldots, n\}$, the transformed

$$D^{(\lambda y + \mu)} = \{(x_i, y'_i = \lambda f(x_i) + \mu),\ i = 1, 2, \ldots, n\}, \quad \lambda > 0$$

then the curvature of our observed curve is

$$k(\lambda f + \mu) = \frac{\lambda f''(x)}{\left(1 + \lambda^2 (f'(x))^2\right)^{3/2}} \tag{13}$$

The characteristic equation now is

$$E_1(\lambda f + \mu, \chi) = f'''(\chi)\left(1 + \lambda^2 f'(\chi)^2\right) - 3\lambda^2 f'(\chi) f''(\chi)^2 = 0 \tag{14}$$

The second derivative is

$$E_2(\lambda f + \mu, \chi) = 3\left(4\lambda^2 f'(\chi)^2 - 1\right)\lambda^3 f''(\chi)^3 - 9\lambda^3 f'(\chi)\left(1 + \lambda^2 f'(\chi)^2\right) f'''(\chi) f''(\chi) + \lambda\left(1 + \lambda^2 f'(\chi)^2\right)^2 f''''(\chi) \tag{15}$$

It is obvious that we cannot find the same values for the knee point, unless λ = 1, so we have proven the next

Lemma 5.1. The knee point of a curve is not invariant under the y–axis transformation y → λ y + µ, λ > 0, λ ≠ 1.

Example 5.1. For $f(x) = e^x$, x ∈ [a, b] = [−2, 3], using Eq. 3 we find the knee point $\chi = -\frac{1}{2}\ln(2) = -0.34657359$. We want to convert the y–axis to an axis with units of the maximum value $e^3$, so $\lambda = e^{-3}$, µ = 0. Then we find

$$E_1(e^{-3} f, \chi) = e^{\chi}\left(1 + \left(e^{-3}\right)^2 \left(e^{\chi}\right)^2\right) - 3\left(e^{-3}\right)^2 \left(e^{\chi}\right)^3 = 0$$

$$\chi = -\frac{1}{2}\ln(2) + 3 = 2.65342641$$

while it also holds

$$E_2\left(e^{-3} f, -\frac{1}{2}\ln(2) + 3\right) = -\frac{3}{2}\sqrt{2} < 0$$

so the knee point is now different. By taking an equidistant partition of [−2, 3] with 100 subintervals we can compute using Eq. 5 that $DK_{\lambda f + \mu} = x_{94} = 2.65$ correctly, but it has changed. When we apply the EDE estimation we find the same position for the knee, x = χF1 = x69 = 1.4, which, although it is not close to the knee, remains the same after rescaling.

Example 5.2. For the sigmoid function of Section 3.3 (Eq. 6) we want to resize the y–axis, converting it to the [0, 1] range, so we need $\lambda = \frac{1}{10\tanh(5)}$ and $\mu = \frac{-1 + \tanh(5)}{2\tanh(5)}$. Then we can find χ1 = 4.25526516, χ2 = 5.74473484, again symmetrical with respect to the inflection point p = 5, which is unaltered, and we observe that they are different from the initial ones. By taking an equidistant partition of [0, 10] with 100 subintervals we can compute using Eq. 5, for the initial data and for the transformed one, that $DK_{f,1} = x_{34} = 3.3$ and $DK_{\lambda f + \mu,1} = x_{44} = 4.3$, correctly for both cases, but different. So after rescaling the y–axis we actually compute a totally different value for the knee point, i.e. the knee point is not y–scaling invariant.

But, if we apply the EDE estimation, we can compute for the two cases the same results χF1 = x37 = 3.6, χF2 = x65 = 6.4, so they have not changed. Now we can prove the next remarkable

Lemma 5.2. The estimation of a knee point by using the EDE method is invariant under rescaling the y–axis after applying an affine transformation.

Proof. By using Lemma 1.4 of [12] for the transformed function g(x) = λ f(x) + µ we have that

$$
\begin{aligned}
x_{F_{1,2}}(g) &= \underset{x \in [a - \delta_1,\, b + \delta_2]}{\arg}\left\{ g'(x) = \frac{g(b) - g(a)}{b - a} \right\} \\
&= \underset{x \in [a - \delta_1,\, b + \delta_2]}{\arg}\left\{ \lambda f'(x) = \frac{\lambda f(b) + \mu - \lambda f(a) - \mu}{b - a} \right\} \\
&= \underset{x \in [a - \delta_1,\, b + \delta_2]}{\arg}\left\{ f'(x) = \frac{f(b) - f(a)}{b - a} \right\} = x_{F_{1,2}}(f)
\end{aligned}
$$

5.1.2. Rescaling x–axis

If we study, instead of D, the data with the transformed abscissas $t_i = \lambda x_i + \mu$, λ > 0, and we now inverse the transformation, we find $x = \frac{t}{\lambda} - \frac{\mu}{\lambda} = \Lambda t + M$ with $\Lambda = \frac{1}{\lambda}$ and $M = -\frac{\mu}{\lambda}$. So by using the inverse transform we can find that the curvature as a function of t is

$$k(f, t) = \frac{\Lambda^2 f''(\Lambda t + M)}{\left(1 + \Lambda^2 (f'(\Lambda t + M))^2\right)^{3/2}} = \frac{\Lambda^2 f''(x)}{\left(1 + \Lambda^2 (f'(x))^2\right)^{3/2}} \tag{16}$$

The characteristic equation is

$$E_1(f, t) = f'''(x) + \Lambda^2 f'(x)^2 f'''(x) - 3\Lambda^2 f'(x) f''(x)^2 = 0 \tag{17}$$

The second derivative is

$$
\begin{aligned}
E_2(f, t) ={}& f^{(4)}(x) + 2\Lambda^2 f^{(4)}(x) f'(x)^2 + \Lambda^4 f^{(4)}(x) f'(x)^4 - 9\Lambda^2 f'(x) f''(x) f'''(x) \\
& - 9\Lambda^4 f'(x)^3 f''(x) f'''(x) + 12\Lambda^4 f'(x)^2 f''(x)^3 - 3\Lambda^2 f''(x)^3
\end{aligned} \tag{18}
$$

using x = Λ t + M. Of course we directly observe that the new values for knee points are different from the initial ones, provided that λ ≠ 1.

Example 5.3. For $f(x) = \frac{1}{2} e^{\frac{2}{3} x}$, x ∈ [a, b] = [−1.5, 3.5] there exists a knee point that is found using Eq. 3 to be $\chi = \frac{3}{4}\ln\left(\frac{9}{2}\right) = 1.12805805$. We want to convert the x–axis to [−1, 1], so we need $\lambda = \frac{2}{5}$, $\mu = -\frac{2}{5}$, thus [$\Lambda = \frac{5}{2}$, M = 1]. Then it is

$$E_1(f, t) = \frac{4}{27}\, e^{\frac{5}{3} t + \frac{2}{3}} - \frac{50}{243}\left(e^{\frac{5}{3} t + \frac{2}{3}}\right)^3 = 0$$

$$t = -\frac{2}{5} + \frac{3}{10}\ln\left(\frac{18}{25}\right) = -0.49855122$$

while it also holds

$$E_2\left(f, -\frac{2}{5} + \frac{3}{10}\ln\left(\frac{18}{25}\right)\right) = -\frac{8}{45}\sqrt{2} < 0$$

For the EDE estimation under rescaling of the x–axis, by using Lemma 1.4 of [12] for the transformed function f(x) = f(Λ t + M) = g(t), with $t_a = \lambda a + \mu$, $t_b = \lambda b + \mu$, we have that

$$
\begin{aligned}
t_{F_{1,2}}(g) &= \underset{t \in [t_a - \Delta_1,\, t_b + \Delta_2]}{\arg}\left\{ g'(t) = \frac{g(t_b) - g(t_a)}{t_b - t_a} \right\} \\
&= \underset{x \in [a - \delta_1,\, b + \delta_2]}{\arg}\left\{ \Lambda f'(x) = \frac{f(\Lambda t_b + M) - f(\Lambda t_a + M)}{\lambda b + \mu - \lambda a - \mu} \right\} \\
&= \underset{x \in [a - \delta_1,\, b + \delta_2]}{\arg}\left\{ \Lambda f'(x) = \frac{1}{\lambda}\,\frac{f(\Lambda \lambda b + \Lambda \mu + M) - f(\Lambda \lambda a + \Lambda \mu + M)}{b - a} \right\} \\
&= \underset{x \in [a - \delta_1,\, b + \delta_2]}{\arg}\left\{ \Lambda f'(x) = \Lambda\,\frac{f(b) - f(a)}{b - a} \right\} \\
&= \underset{x \in [a - \delta_1,\, b + \delta_2]}{\arg}\left\{ f'(x) = \frac{f(b) - f(a)}{b - a} \right\} = x_{F_{1,2}}(f)
\end{aligned}
$$

since it holds $\Lambda = \frac{1}{\lambda}$ and $M = -\frac{\mu}{\lambda}$.

5.2. Logarithmic transformations

5.2.1. Log-Y rescaling

We study, instead of the data $D = \{(x_i, y_i = f(x_i)),\ i = 1, 2, \ldots, n\}$, the transformed

$$D^{(\ln y)} = \{(x_i, y'_i = \ln(f(x_i))),\ i = 1, 2, \ldots, n\}, \quad f(x_i) > 0$$

The curvature of our observed curve is

$$k(\ln f) = \frac{f(x) f''(x) - (f'(x))^2}{f(x)^2 \left(1 + \frac{(f'(x))^2}{f(x)^2}\right)^{3/2}}$$

For the Log-X rescaling, if we inverse the transformation we find $x = e^t$. So by

using the inverse transform we can find that the curvature as a function of t is

$$k(f, t) = \frac{e^{2t} f''(e^t) + e^t f'(e^t)}{\left(1 + e^{2t} (f'(e^t))^2\right)^{3/2}} = \frac{x^2 f''(x) + x f'(x)}{\left(1 + x^2 (f'(x))^2\right)^{3/2}} \tag{20}$$

The characteristic equation is

$$E_1(f, t) = \left(f'^2 f''' - 3 f' f''^2\right) x^4 - 3 f'^2 f''\, x^3 + \left(-2 f'^3 + f'''\right) x^2 + 3 f''\, x + f' = 0 \tag{21}$$

with $x = e^t$, and we omit the second derivative test. Again we find different values for knee points, so we have proven the next

Lemma 5.5. The knee point of a curve is not invariant under the Log-X axis transformation.

Example 5.4. We choose an equidistant partition of 100 subintervals of [a, b] = [5, 10] for the function of Eq. 6, thus we have a strictly concave curve.

Log-Y. By taking logarithms for the y–axis we find χ = x11 = 5.5 ≈ 5.518384, which is the correct value from Eq. 19, but different from the initial value of 6.665463 found in Eq. 8. The EDE estimation is now χF1 = x27 = 6.3, close to the initial value and slightly different from x30 = 6.45, which was the result without taking log y. So, even if EDE is no longer invariant, the new disturbed value is close to the initial one, thus it is approximately invariant.

Log-X. By taking logarithms for the x–axis we find t55 = 2.041220 ≈ 2.043988, which is the correct value from Eq. 21, but after inversion it gives χ = x55 = 7.7, different from the initial value of 6.665463 found in Eq. 8. The EDE estimation is now t28 = 1.848454, i.e. χF1 = x28 = 6.35, close to the initial value and slightly different from x30 = 6.45, which was the result without taking log x. So, again EDE is not log x invariant, but it is approximately invariant.

6. A unit invariant estimation of knee point

We found that, although the concept of knee point is directly connected to the curvature, which is invariant under re-parametrizations, it is not by itself invariant under the most commonly used rescalings of the x and y–axis, thus it proves to be a not so useful measure. But we have also found that the two χF1,2 points that are defined when we apply the EDE method are invariant under affine axis rescaling and are approximately invariant under logarithmic axis rescaling. So, even if we are not sure about the validity of our EDE knee point estimation (since that method depends on the interval $[X_{(1)}, X_{(\nu)}] = [a, b]$ of data), we can always be sure that our estimation is unit invariant, or approximately invariant for logarithmic plots. This is a remarkable result for data handling in natural sciences. For example, in symmetrical sigmoid curves we could define the χF1,2 points to be the scale invariant approximations of knee points, giving the next

Definition 6.1. The Unit Invariant Knee (UIK) of a curve is the proper point given by the Extremum Distance Estimator method according to the convexity, concavity or sigmoidicity classification.

The UIK is invariant under unit transformations of the x and y–axis and it is approximately invariant under logarithmic axis rescaling. For a strictly convex or concave curve UIK = χF1, while for a convex/concave sigmoid curve UIK1,2 = χF1,2.

A visual interpretation of the χF1,2 points is that they are actually slant extrema, i.e. local minimum or maximum relative to the (slant) total chord which connects the initial and ending points of the curve. If we compute the angle

$$\theta = -\arctan\left(\frac{f(b) - f(a)}{b - a}\right) \tag{22}$$

and do a rotation of all our points by using the rotation matrix

$$R(\theta) = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \tag{23}$$

then we can convert our graph to another one which has as extrema the EDE points, as can be easily verified from Figure 5, which presents the data of Figure 2 after such a rotation. Note that the EDE estimations for the rotated data are again at the same positions as before, $\chi^{rot}_{F_{1,2}} = \{x^{rot}_{36}, x^{rot}_{67}\} = \{2.7549, 11.500\} \rightarrow \{3.5, 6.6\} = \{x_{36}, x_{67}\}$.

7. Discussion

We studied extensively the concept of knee point, giving definitions for functions and for discrete data. We proved that it is neither unit nor logarithmic scale invariant.
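The slant extremum interpretation of the χF1,2 points, and their invariance under affine changes of units, can be illustrated with a short sketch. It is written in Python and is only an assumption-laden illustration of the chord idea, not the EDE algorithm of [12] nor the findiplist function of [13]: the helper name `slant_extrema` and the scale factors are hypothetical, and vertical offsets from the chord are used in place of perpendicular distances (the two differ by a constant factor, so their extrema coincide).

```python
import math


def slant_extrema(x, y):
    """Indices of the minimum and maximum offset from the total chord.

    The chord joins the first and the last data point; offsets are vertical.
    """
    slope = (y[-1] - y[0]) / (x[-1] - x[0])
    d = [yi - (y[0] + slope * (xi - x[0])) for xi, yi in zip(x, y)]
    i_min = min(range(len(d)), key=d.__getitem__)
    i_max = max(range(len(d)), key=d.__getitem__)
    return i_min, i_max


# sigmoid f(x) = 5 + 5 tanh(x - 5) on [0, 10], 100 subintervals
n = 100
x = [10 * i / n for i in range(n + 1)]
y = [5 + 5 * math.tanh(xi - 5) for xi in x]

base = slant_extrema(x, y)

# affine changes of units (illustrative values of lambda and mu)
y_scaled = [0.1 * yi - 0.3 for yi in y]   # y -> lambda*y + mu
x_scaled = [0.4 * xi - 2.0 for xi in x]   # x -> lambda*x + mu

print(base, [x[i] for i in base])
print(slant_extrema(x, y_scaled), slant_extrema(x_scaled, y))
```

With zero-based indexing the extrema fall at indices 36 and 64, i.e. x = 3.6 and x = 6.4, which are the positions reported as x37 and x65 in Example 5.2, and the same indices are returned after either rescaling.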

The R code for finding knee points using Eq. 5. findknee
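As a companion to the discrete curvature of Eq. 4, the following Python sketch (not the paper's R listing; the names `menger_curvature` and `discrete_knee` are hypothetical) computes the Menger curvature of consecutive triples and returns the interior point where it is largest. Since Eq. 4 as written is non-negative, the sketch only locates the point of largest discrete curvature; how a sign is attached for the concave case of Eq. 5 is not reproduced here.

```python
import math


def menger_curvature(p, q, r):
    """Unsigned Menger curvature of three points, following Eq. 4."""
    pq = math.dist(p, q)
    qr = math.dist(q, r)
    rp = math.dist(r, p)
    A = 4 * pq**2 * qr**2
    B = pq**2 + qr**2 - rp**2
    # clamp tiny negative values caused by rounding for nearly collinear points
    return math.sqrt(max(A - B**2, 0.0)) / (pq * qr * rp)


def discrete_knee(x, y):
    """Zero-based index of the interior point with the largest discrete curvature."""
    pts = list(zip(x, y))
    dc = [menger_curvature(pts[i - 1], pts[i], pts[i + 1])
          for i in range(1, len(pts) - 1)]
    return 1 + max(range(len(dc)), key=dc.__getitem__)


# f(x) = -1/x + 5 on [0.2, 2.1], N = 101 equidistant points
n = 100
x = [0.2 + 1.9 * i / n for i in range(n + 1)]
y = [-1 / xi + 5 for xi in x]

k = discrete_knee(x, y)
print(k, x[k])  # the grid point nearest to the analytic knee chi = 1 is expected
```

On noise-free data the returned abscissa lies within about one grid step of the analytic knee χ = 1; as discussed in Section 4, this estimate degrades quickly once noise is added.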
