On the Radial Basis Function Interpolation II: Tikhonov and L2-Norm Total Variation regularizations Jianping Xiao*
[email protected] University of Michigan, Ann Arbor March 8, 2016
Abstract In this paper, we provide detailed numerical analysis of the Tikhonov regularization and L2-Norm Total Variation regularization in the context of RBF interpolations. We show that applying Tikhonov regularization and L2-Norm Total Variation regularization is equivalent to applying a low-pass filter. The filtering factors for each of the harmonic components are analytically computed.
1 Introduction
This paper is organized as follows. In the first part, we review the traditional Tikhonov regularization and give a detailed proof of the related conclusions. In the second part, we introduce the total variation L2 regularization in the context of interpolating a given function using RBFs. For the total variation L2 regularization, the resulting problem is equivalent to a general Tikhonov regularized least squares problem, so the procedure used to analyze the traditional Tikhonov regularization problem is applied to the total variation L2 regularization problem as well. The discussion could be extended to regularization on an arbitrarily high dimensional space. In the last section, we discuss the filtering effect of Tikhonov regularization and total variation L2 regularization in a wavenumber domain hypersphere, by properly manipulating the traditional total variation L2 regularization function for high dimensional space.
2 Tikhonov Regularization
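In this section, we will review the principle of Tikhonov regularization as a filter. Further discussion can be seen in [2, 3]. The Tikhonov solution $\vec{c}_{tk}$ solves the problem

$$\min_{\vec{c}} \left( \|\tilde{\tilde{K}}\vec{c} - \vec{f}\|_2^2 + \lambda \|\vec{c}\|_2^2 \right) \tag{1}$$

and the solution is formally given by

$$\vec{c}_{tk} = (\tilde{\tilde{K}}^T \tilde{\tilde{K}} + \lambda \tilde{\tilde{I}})^{-1} \tilde{\tilde{K}}^T \vec{f} = \tilde{\tilde{K}}_{tk}^{-1} \vec{f} \tag{2}$$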
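To understand the minimization problem, we need to look at the following overdetermined linear system

$$\begin{bmatrix} \tilde{\tilde{K}} \\ \sqrt{\lambda}\,\tilde{\tilde{I}} \end{bmatrix} \vec{c} = \begin{bmatrix} \vec{f} \\ \vec{0} \end{bmatrix} \tag{3}$$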
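Usually this system has no exact solution unless the right-hand side $\begin{bmatrix} \vec{f} \\ \vec{0} \end{bmatrix}$ lies in the range of the matrix $\begin{bmatrix} \tilde{\tilde{K}} \\ \sqrt{\lambda}\,\tilde{\tilde{I}} \end{bmatrix}$. Based on Theorem 11.1 in Trefethen's book [6], the projection of the residual $\begin{bmatrix} \tilde{\tilde{K}}\vec{c} - \vec{f} \\ \sqrt{\lambda}\,\tilde{\tilde{I}}\vec{c} \end{bmatrix}$ onto the column space of the matrix $\begin{bmatrix} \tilde{\tilde{K}} \\ \sqrt{\lambda}\,\tilde{\tilde{I}} \end{bmatrix}$ must be zero, that is

$$\begin{bmatrix} \tilde{\tilde{K}} \\ \sqrt{\lambda}\,\tilde{\tilde{I}} \end{bmatrix}^T \begin{bmatrix} \tilde{\tilde{K}}\vec{c} - \vec{f} \\ \sqrt{\lambda}\,\tilde{\tilde{I}}\vec{c} \end{bmatrix} = 0 \tag{4}$$

Rearranging Eqn. 4, we have

$$[\tilde{\tilde{K}}^T \tilde{\tilde{K}} + \lambda \tilde{\tilde{I}}]\,\vec{c} = \tilde{\tilde{K}}^T \vec{f} \tag{5}$$

the solution of which is Eqn. 2. For the matrix $\tilde{\tilde{K}}$, the SVD is

$$\tilde{\tilde{K}} = \tilde{\tilde{U}} \tilde{\tilde{S}} \tilde{\tilde{V}}^T \tag{6}$$

where $\tilde{\tilde{U}} = [\vec{u}_1, ..., \vec{u}_n]$, $\tilde{\tilde{S}} = \mathrm{diag}(s_1, ..., s_n)$, and $\tilde{\tilde{V}} = [\vec{v}_1, ..., \vec{v}_n]$, with $s_1 \geq s_2 \geq s_3 \geq ... \geq s_n \geq 0$. The matrices $\tilde{\tilde{U}}$ and $\tilde{\tilde{V}}$ are both unitary, i.e., $\tilde{\tilde{U}}^{-1} = \tilde{\tilde{U}}^T$ and $\tilde{\tilde{V}}^{-1} = \tilde{\tilde{V}}^T$.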
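The equivalence between the stacked system of Eqn. 3 and the normal equations of Eqn. 5 can be checked numerically. The following is a minimal sketch, assuming NumPy and a small random test matrix; the names `K`, `f`, `lam` and the problem sizes are illustrative choices, not values from this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 5
K = rng.standard_normal((m, n))   # hypothetical test matrix
f = rng.standard_normal(m)        # hypothetical data vector
lam = 0.1                         # Tikhonov parameter (assumed value)

# Stacked overdetermined system of Eqn. 3: [K; sqrt(lam) I] c = [f; 0]
A = np.vstack([K, np.sqrt(lam) * np.eye(n)])
b = np.concatenate([f, np.zeros(n)])
c_stacked, *_ = np.linalg.lstsq(A, b, rcond=None)

# Normal equations of Eqn. 5: (K^T K + lam I) c = K^T f
c_normal = np.linalg.solve(K.T @ K + lam * np.eye(n), K.T @ f)

# Eqn. 4: the residual of the stacked system is orthogonal
# to the columns of the stacked matrix.
orthogonality = A.T @ (A @ c_stacked - b)
```

Both routes should give the same coefficient vector up to rounding, and `orthogonality` should vanish to machine precision.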
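$$\begin{aligned}
\tilde{\tilde{K}}_{tk}^{-1} &= (\tilde{\tilde{K}}^T \tilde{\tilde{K}} + \lambda \tilde{\tilde{I}})^{-1} \tilde{\tilde{K}}^T = (\tilde{\tilde{V}} \tilde{\tilde{S}}^T \tilde{\tilde{U}}^T \tilde{\tilde{U}} \tilde{\tilde{S}} \tilde{\tilde{V}}^T + \lambda \tilde{\tilde{V}} \tilde{\tilde{I}} \tilde{\tilde{V}}^T)^{-1} \tilde{\tilde{V}} \tilde{\tilde{S}}^T \tilde{\tilde{U}}^T \\
&= \tilde{\tilde{V}} (\tilde{\tilde{S}}^T \tilde{\tilde{S}} + \lambda \tilde{\tilde{I}})^{-1} \tilde{\tilde{S}}^T \tilde{\tilde{U}}^T = \tilde{\tilde{V}}\, \mathrm{diag}\!\left( \frac{s_i}{s_i^2 + \lambda} \right) \tilde{\tilde{U}}^T = \sum_i \left( \frac{s_i^2}{s_i^2 + \lambda} \right) \frac{1}{s_i}\, \vec{v}_i \vec{u}_i^T \\
&= \sum_i w_i \frac{1}{s_i}\, \vec{v}_i \vec{u}_i^T
\end{aligned} \tag{7}$$

and,

$$\vec{c}_{tk} = \sum_i w_i \frac{(\vec{u}_i^T \vec{f})}{s_i}\, \vec{v}_i \tag{8}$$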
with the weighting function $w_i = s_i^2/(s_i^2 + \lambda)$. We see from the weighting function that $w_i \to 0$ as $s_i^2 \to 0$, while $w_i \to 1$ for $s_i^2 \gg \lambda$. This shows that components with small singular values have smaller effects on $\vec{c}_{tk}$. In the limit $\lambda \to 0$, Eqn. (8) recovers the non-regularized case with $w_i = 1$,

$$\vec{c} = \sum_i \frac{(\vec{u}_i^T \vec{f})}{s_i}\, \vec{v}_i \tag{9}$$
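The filtered SVD expansion of Eqn. 8 can be sketched in a few lines. This is an illustrative sketch, assuming NumPy and a random nonsingular test matrix; the helper name `tikhonov_svd` is hypothetical and not part of any library.

```python
import numpy as np

def tikhonov_svd(K, f, lam):
    """Tikhonov solution via Eqn. 8: c = sum_i w_i (u_i^T f / s_i) v_i."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    w = s**2 / (s**2 + lam)          # filter factors w_i
    return Vt.T @ (w * (U.T @ f) / s), w

rng = np.random.default_rng(1)
K = rng.standard_normal((6, 6))      # hypothetical test matrix
f = rng.standard_normal(6)
lam = 1e-2

c_svd, w = tikhonov_svd(K, f, lam)
# Direct evaluation of Eqn. 2 for comparison
c_direct = np.linalg.solve(K.T @ K + lam * np.eye(6), K.T @ f)

# lam = 0 gives w_i = 1 and recovers the unregularized solution, Eqn. 9
c_unreg, _ = tikhonov_svd(K, f, 0.0)
c_plain = np.linalg.solve(K, f)
```

The two regularized solutions should agree, and setting `lam = 0.0` should reproduce the plain solve.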
Tikhonov regularization has been a very powerful method for filtering out the influence of noise. The choice of the Tikhonov parameter $\lambda$ is based on the L-curve method. This method was used by Lawson and Hanson [5] and further studied by Hansen [2, 3]. The basic idea is that the squared norm of the regularized solution, $\|\vec{c}_\lambda\|_2^2$, is a monotonically decreasing function of $\lambda$, while the residual $\|\tilde{\tilde{K}}\vec{c} - \vec{f}\|_2^2$ is a monotonically increasing function of $\lambda$. To understand this argument, we need to take a look at the regularized solution, Eqn. 8. The 2-norm of the solution is

$$\|\vec{c}\|_2^2 = \left[ \sum_i w_i \frac{(\vec{u}_i^T \vec{f})}{s_i}\, \vec{v}_i \right]^T \left[ \sum_j w_j \frac{(\vec{u}_j^T \vec{f})}{s_j}\, \vec{v}_j \right] \tag{10}$$
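Since $\tilde{\tilde{V}}$ is unitary and $\vec{v}_i^T \vec{v}_j = \delta_{i,j}$, therefore

$$\|\vec{c}\|_2^2 = \sum_i w_i^2 \frac{(\vec{u}_i^T \vec{f})^2}{s_i^2} = \sum_i \left[ \frac{s_i^2}{s_i^2 + \lambda} \right]^2 \frac{(\vec{u}_i^T \vec{f})^2}{s_i^2} \tag{11}$$

Since $w_i^2 = \left[ \frac{s_i^2}{s_i^2 + \lambda} \right]^2$ are monotonically decreasing functions of $\lambda$ and $\frac{(\vec{u}_i^T \vec{f})^2}{s_i^2}$ are all positive, therefore, $\|\vec{c}\|_2^2$ is a monotonically decreasing function of $\lambda$. Next, we have a look at the residual $\|\tilde{\tilde{K}}\vec{c} - \vec{f}\|_2^2$, where we will expand $\vec{f} = \sum_i (\vec{u}_i^T \vec{f})\, \vec{u}_i$ and

$$\tilde{\tilde{K}}\vec{c} = \sum_i s_i \vec{u}_i \vec{v}_i^T \sum_j w_j \frac{(\vec{u}_j^T \vec{f})}{s_j}\, \vec{v}_j \tag{12}$$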
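Again, due to the unitarity of matrices $\tilde{\tilde{U}}$ and $\tilde{\tilde{V}}$, the residual equation becomes

$$\begin{aligned}
\|\tilde{\tilde{K}}\vec{c} - \vec{f}\|_2^2
&= \left[ \sum_i \sum_j s_i w_j \frac{(\vec{u}_j^T \vec{f})}{s_j}\, \vec{u}_i \delta_{i,j} - \sum_i (\vec{u}_i^T \vec{f})\, \vec{u}_i \right]^T \left[ \sum_k \sum_l s_k w_l \frac{(\vec{u}_l^T \vec{f})}{s_l}\, \vec{u}_k \delta_{k,l} - \sum_j (\vec{u}_j^T \vec{f})\, \vec{u}_j \right] \\
&= \left[ \sum_i s_i w_i \frac{(\vec{u}_i^T \vec{f})}{s_i}\, \vec{u}_i - \sum_i (\vec{u}_i^T \vec{f})\, \vec{u}_i \right]^T \left[ \sum_j s_j w_j \frac{(\vec{u}_j^T \vec{f})}{s_j}\, \vec{u}_j - \sum_j (\vec{u}_j^T \vec{f})\, \vec{u}_j \right] \\
&= \left[ \sum_i (1 - w_i)(\vec{u}_i^T \vec{f})\, \vec{u}_i \right]^T \left[ \sum_j (1 - w_j)(\vec{u}_j^T \vec{f})\, \vec{u}_j \right] \\
&= \sum_i \left[ (1 - w_i)(\vec{u}_i^T \vec{f}) \right]^2
\end{aligned} \tag{13}$$

Since $0 < w_i < 1$ are monotonically decreasing functions of $\lambda$, $0 < 1 - w_i < 1$ are monotonically increasing functions of $\lambda$. Therefore, the residual $\|\tilde{\tilde{K}}\vec{c} - \vec{f}\|_2^2$ is a monotonically increasing function of $\lambda$. The optimum Tikhonov parameter for the minimization problem Eqn. 1 is a $\lambda$ that balances the two competing terms: the increasing $\|\tilde{\tilde{K}}\vec{c} - \vec{f}\|_2^2$ and the decreasing $\|\vec{c}\|_2^2$. A plot of $\|\vec{c}\|_2^2$ against $\|\tilde{\tilde{K}}\vec{c} - \vec{f}\|_2^2$ is an L-shaped curve, which is called the L-curve [1]. The L-curve has been a guide to selecting the optimum Tikhonov parameter for the minimization problem Eqn. 1.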
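The two monotonic trends behind the L-curve can be tabulated directly from Eqns. 11 and 13. The sketch below assumes NumPy, a square random test matrix (so that $\vec{f}$ is fully spanned by the $\vec{u}_i$, as in the derivation), and an arbitrary range of $\lambda$ values; none of these choices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
K = rng.standard_normal((6, 6))      # hypothetical square test matrix
f = rng.standard_normal(6)
U, s, Vt = np.linalg.svd(K)
beta = U.T @ f                       # components u_i^T f

lambdas = np.logspace(-4, 2, 25)     # assumed sweep of the Tikhonov parameter
sol_norm2, res_norm2, res_eq13 = [], [], []
for lam in lambdas:
    w = s**2 / (s**2 + lam)
    c = Vt.T @ (w * beta / s)                        # Eqn. 8
    sol_norm2.append(c @ c)                          # ||c||_2^2, Eqn. 11
    res_norm2.append(np.sum((K @ c - f) ** 2))       # ||K c - f||_2^2
    res_eq13.append(np.sum(((1.0 - w) * beta) ** 2)) # closed form, Eqn. 13

sol_norm2 = np.array(sol_norm2)
res_norm2 = np.array(res_norm2)
res_eq13 = np.array(res_eq13)
```

Plotting `sol_norm2` against `res_norm2` on log-log axes is one way to draw the L-curve for this toy problem.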
2.1 Tikhonov regularization for global RBF interpolation
In [4], Radial Basis Function interpolation with Tikhonov regularization is interpreted as roughness-minimizing splines. For global RBF interpolation, the interpolation matrix is symmetric and positive definite. The interpolation matrix can then be written as

$$\tilde{\tilde{K}} = \tilde{\tilde{U}} \tilde{\tilde{S}} \tilde{\tilde{U}}^T \tag{14}$$

The Tikhonov regularized solution becomes

$$\vec{c}_{tk} = \sum_i w_i \frac{(\vec{u}_i^T \vec{f})}{s_i}\, \vec{u}_i \tag{15}$$

where, as shown before, $w_i = \frac{s_i^2}{s_i^2 + \lambda}$, $\lambda > 0$. Fig. 1 shows the weighting function of Tikhonov regularization with three different parameters.
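As a quick numerical illustration of the weighting function, the sketch below evaluates $w = s^2/(s^2 + \lambda)$ with $\lambda = \alpha^2$ over a range of singular values. It assumes NumPy; the helper name `tikhonov_weight` and the sample grid of `s` are illustrative.

```python
import numpy as np

def tikhonov_weight(s, alpha):
    """Filter factor w = s^2 / (s^2 + lambda) with lambda = alpha^2."""
    return s**2 / (s**2 + alpha**2)

s = np.logspace(-10, 0, 11)          # assumed sample of singular values
weights = {alpha: tikhonov_weight(s, alpha) for alpha in (1e-2, 1e-3, 1e-4)}

# At s = alpha the weight is one half, which marks where the decay sets in.
half_point = tikhonov_weight(1e-3, 1e-3)
```

For $s \ll \alpha$ the weight behaves like $(s/\alpha)^2$, while for $s \gg \alpha$ it stays close to one.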
2.2 Filter high frequency waves using Tikhonov regularization
In this section, we show how Tikhonov regularization damps high frequency components in an interpolation process, followed by numerical experiments using Tikhonov regularization. As we have seen from the eigen analysis of 1D and 2D PGARBF interpolation matrices, the eigenmodes are simply Fourier series, and the eigenvalues of the PGARBF interpolation matrices decay exponentially.
Figure 1: Weighting function $\frac{s_i^2}{s_i^2 + \lambda}$ of Tikhonov regularization using different $\lambda = \alpha^2$ (curves for $\alpha = 10^{-2}$, $10^{-3}$ and $10^{-4}$; horizontal axis $s$, vertical axis $w$). The figure is plotted on a log-log scale. The curves start to decay at $s_i \approx \alpha$. When $s_i$ is greater than $\alpha$, the weighting function is close to unity. As $s_i$ approaches zero, the weighting function decays rapidly, and effectively filters out components of small singular values.

Theorem 2.1. Applying Tikhonov regularization with regularization parameter $\lambda$ to an interpolation problem is effectively applying a filter to the problem. In the space of eigenvectors, an eigenvector with eigenvalue $\lambda_j$ is damped by a factor of $w_j$, where $w_j$ is the weighting function of Tikhonov regularization.

Proof. Suppose we have a grid $\vec{x}_1, \vec{x}_2, ..., \vec{x}_N$. The interpolation of a function $\psi(\vec{x})$ can be written as $\psi^N(\vec{x}) = \sum_{j=1}^{N} c_j \phi_j(\vec{x})$, where in our case, $\phi_j(\vec{x})$ is the PRBF basis corresponding to center $\vec{x}_j$, i.e., $\phi_j(\vec{x}) = \theta_{GA}(\vec{x} - \vec{x}_j;\ )$ for Periodic Gaussian RBF. This interpolation equation is forced to satisfy the interpolation condition on all the grid points, that is,

$$\psi^N(\vec{x}_j) = \psi(\vec{x}_j), \quad j = 1, ..., N \tag{16}$$

The above interpolation is written in a matrix-vector product form

$$\tilde{\tilde{K}}\vec{c} = \vec{\psi} \tag{17}$$
where the matrix $\tilde{\tilde{K}}$ is the PRBF interpolation matrix with elements $\tilde{\tilde{K}}_{i,j} = \phi_j(\vec{x}_i)$, and $\vec{c}$ is a vector containing the expansion coefficients. Suppose the matrix $\tilde{\tilde{K}}$ has $N$ eigenvectors $\vec{u}_j$ forming a complete basis in an $N$ dimensional space. We can rewrite $\tilde{\tilde{K}}$ as

$$\tilde{\tilde{K}} = \sum_j^N \lambda_j \vec{u}_j \vec{u}_j^T \tag{18}$$
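In a Tikhonov regularization interpolation,

$$\vec{c}_{new} = \sum_{j=1}^{N} w_j \frac{\vec{u}_j^T \vec{\psi}_{old}}{\lambda_j}\, \vec{u}_j \tag{19}$$

$$\begin{aligned}
\vec{\psi}_{new} &= \sum_i^N \lambda_i \vec{u}_i \vec{u}_i^T\, \vec{c}_{new} = \left( \sum_i^N \lambda_i \vec{u}_i \vec{u}_i^T \right) \left( \sum_{j=1}^{N} w_j \frac{\vec{u}_j^T \vec{\psi}_{old}}{\lambda_j}\, \vec{u}_j \right) \\
&= \sum_i^N \lambda_i \vec{u}_i\, w_i \frac{\vec{u}_i^T \vec{\psi}_{old}}{\lambda_i} = \sum_i^N w_i (\vec{u}_i^T \vec{\psi}_{old})\, \vec{u}_i
\end{aligned} \tag{20}$$

So the original $i$th component of the initial function $\vec{\psi}$ has been damped by a factor of $w_i$, with components of small eigenvalues damped more strongly (the weighting function is plotted in Fig. 1). Therefore, Tikhonov regularization can be used as a filter for high frequency modes. The damping factor for each Fourier mode is plotted in Fig. 2.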
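The damping predicted by Eqn. 20 can be checked on a small periodic example. The sketch below is an illustration under stated assumptions: it uses NumPy, a Gaussian kernel periodized by summing a few images (a stand-in that is not necessarily the PGARBF basis of this paper), and hypothetical values for the grid size `N`, the shape parameter `eps` and the Tikhonov parameter `reg`.

```python
import numpy as np

N = 32
x = 2.0 * np.pi * np.arange(N) / N   # equispaced periodic grid (assumed)
eps = 2.0                            # hypothetical shape parameter
reg = 1e-2                           # Tikhonov parameter (lambda in Eqn. 1)

def periodic_gaussian(d, eps, images=3):
    # Gaussian summed over a few periodic images; an assumed stand-in kernel
    return sum(np.exp(-(eps * (d + 2.0 * np.pi * k)) ** 2)
               for k in range(-images, images + 1))

K = periodic_gaussian(x[:, None] - x[None, :], eps)   # symmetric matrix
eigvals, U = np.linalg.eigh(K)                        # K = sum_j lambda_j u_j u_j^T

psi_old = np.sin(x) + 0.5 * np.sin(12 * x)            # low mode plus high mode
c_new = np.linalg.solve(K.T @ K + reg * np.eye(N), K.T @ psi_old)
psi_new = K @ c_new

# Eqn. 20: each eigen-component of psi_old is multiplied by w_i
w = eigvals**2 / (eigvals**2 + reg)
numerical = U.T @ psi_new
analytical = w * (U.T @ psi_old)

# Amplitudes of the two sine modes after regularization
amp_low = 2.0 / N * (np.sin(x) @ psi_new)
amp_high = 2.0 / N * (np.sin(12 * x) @ psi_new)
```

In this toy setup the low mode should pass almost unchanged while the high mode is strongly damped, in line with the theorem.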
Figure 2: Damping factor for different trigonometric modes using standard Tikhonov regularization, $\lambda = 1$ (horizontal axis: $n$th mode; vertical axis: damping factor on a log scale; curves: numerical and analytical). The numerical and analytical damping factors are the same.
2.3 Tikhonov regularization for RBFs interpolation on a sphere
In this section, we superpose two Legendre polynomial modes, $P_1(\mu)$ and $P_{10}(\mu)$, which gives

$$P_{sum}(\mu) = P_1(\mu) + P_{10}(\mu) \tag{21}$$

Here, $\mu = \cos(\theta)\sin(\theta_p) + \sin(\theta)\sin(\theta_p)\cos(\lambda - \lambda_p)$, and $(\theta_p, \lambda_p)$ is the rotated angle. In this part, we let $\theta_p = \pi/2$, $\lambda_p = 0$. To see the filtering effect of Tikhonov regularization in RBF interpolation, we interpolated the function $P_{sum}(\mu)$ using Gaussian Radial Basis Functions (GRBF) on a sphere, and then applied the Tikhonov regularization. Fig. 3 and Fig. 4 show that, as the Tikhonov regularization parameter $\lambda$ increases, the high mode $P_{10}(\mu)$ diminishes, and the resulting pattern becomes simply the lower mode $P_1(\mu)$. Numerically, we have successfully filtered out the higher mode $P_{10}(\mu)$.
3 L2-norm Total Variation Regularization
The limitation of Tikhonov regularization is that it constrains only the coefficients $\vec{c}$, rather than acting on the interpolated function $\tilde{\tilde{K}}\vec{c}$ directly. In this section, we discuss the RBF interpolation of a given function $f(x)$ regularized by the L2-norm total variation, which is a measure of the smoothness of a function. Here the total variation is minimized directly. The idea is that we interpolate the function $f(x)$ using RBFs, but require the interpolated function $f^N(x)$ to be smooth, that is, its total variation remains small. The function $f(x)$ could be represented as a summation of RBFs, that is,

$$f(x) \approx f^N(x) = \sum_{i=1}^{N} c_i \phi(x - x_i) \tag{22}$$

The total variation of the function $f(x)$ on a set of discrete locations $x_1, x_2, x_3, ..., x_N$ is defined as

$$TVL2(f(x)) = \|\vec{f}_x^N\|_2 \tag{23}$$

where $\vec{f}_x^N$ is a vector with $j$th entry equal to $\left. \frac{\partial f^N(x)}{\partial x} \right|_{x = x_j}$.
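The minimization problem is,

$$\min_{\vec{c}} \left( \|\vec{f}^N - \vec{f}\|_2^2 + \lambda \|\vec{f}_x^N\|_2^2 \right) \tag{24}$$

where $\vec{f}^N$ and $\vec{f}$ are vectors with $j$th entries equal to $f^N(x_j)$ and $f(x_j)$, respectively. By substituting $f^N(x) = \sum_{i=1}^{N} c_i \phi(x - x_i)$, we have

$$\min_{\vec{c}} \left( \|\tilde{\tilde{K}}\vec{c} - \vec{f}\|_2^2 + \lambda \|\tilde{\tilde{P}}_x \vec{c}\|_2^2 \right) \tag{25}$$

where $\tilde{\tilde{K}}$ is the interpolation matrix with entries $\tilde{\tilde{K}}(i, j) = \phi(x_i - x_j)$ and $\tilde{\tilde{P}}_x$ is the derivative matrix with entries $\tilde{\tilde{P}}_x(i, j) = \left. \frac{\partial \phi(x - x_j)}{\partial x} \right|_{x = x_i}$. The L2 minimization problem is equivalent to solving the following overdetermined system,

$$\begin{bmatrix} \tilde{\tilde{K}} \\ \sqrt{\lambda}\,\tilde{\tilde{P}}_x \end{bmatrix} \vec{c} = \begin{bmatrix} \vec{f} \\ \vec{0} \end{bmatrix} \tag{26}$$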
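The solution of the overdetermined system is equivalent to solving the following equation

$$(\tilde{\tilde{K}}^T \tilde{\tilde{K}} + \lambda \tilde{\tilde{P}}_x^T \tilde{\tilde{P}}_x)\,\vec{c} = \tilde{\tilde{K}}^T \vec{f} \tag{27}$$

Since $\tilde{\tilde{K}}$ and $\tilde{\tilde{P}}_x$ have the same set of eigenvectors, the eigen decompositions of the two matrices are $\tilde{\tilde{K}} = \tilde{\tilde{U}} \tilde{\tilde{S}} \tilde{\tilde{U}}^T$ and $\tilde{\tilde{P}}_x = \tilde{\tilde{U}} \tilde{\tilde{R}} \tilde{\tilde{U}}^T$. The solution of the linear equation could be expressed as

$$\vec{c} = \sum_{j=1}^{N} \frac{s_j}{s_j^2 + \lambda r_j^2}\, (\vec{u}_j^T \vec{f})\, \vec{u}_j \tag{28}$$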
Since the eigenvalues $r_j$ are imaginary, $r_j^2$ are negative. Based on the theorem in our previous paper, we see that $r_j^2 = -j^2 s_j^2$. Substituting this into the above equation, we have

$$\vec{c} = \sum_{j=1}^{N} w_j \frac{1}{s_j}\, (\vec{u}_j^T \vec{f})\, \vec{u}_j \tag{29}$$

where $w_j = \frac{1}{1 - \lambda j^2}$. Therefore, to smooth the solution (equivalently, to filter high frequency components), we need to set $\lambda < 0$ so that the weighting functions are smaller than unity. However, numerically, this is trivial, since $-1 = i^2$ and the factor $i$ can be absorbed into the eigenvectors. This is equivalent to a $\pi/2$ phase shift of the eigenvectors, since $i = e^{i\pi/2} = \cos(\frac{\pi}{2}) + i \sin(\frac{\pi}{2})$. We have seen that the weighting function is a monotonic function of the wavenumber $j$: for a higher mode, the weighting function gets smaller. Similar to the traditional Tikhonov regularization, the total variation L2 regularization is a filter for high frequency components.
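The behavior of the weighting function $w_j = 1/(1 - \lambda j^2)$ under the two sign choices of $\lambda$ can be tabulated as follows. This is a small sketch assuming NumPy; the helper name `tv_weight` and the sample values of $\lambda$ are illustrative only.

```python
import numpy as np

def tv_weight(j, lam):
    """Weighting function w_j = 1 / (1 - lam * j^2) from Eqn. 29."""
    return 1.0 / (1.0 - lam * j**2)

j = np.arange(0, 11)

# lam < 0: weights start at one and decrease with wavenumber (low-pass)
w_negative = tv_weight(j, lam=-0.05)

# lam > 0 (with lam * j^2 < 1): weights exceed one, so modes are amplified
w_positive = tv_weight(np.arange(1, 6), lam=0.01)
```

With a negative parameter the weights fall off roughly like $1/(|\lambda| j^2)$ at large $j$.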
4 Tikhonov filter in a wavenumber hyperspace
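A function $f(x, y)$ in a 2D space domain could be expanded as

$$f^{N^2}(x, y) = \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij}\, \theta(x - x_i, y - y_j) \tag{30}$$

The corresponding total variation is

$$TVL2(f) = \sqrt{ \|\vec{f}_x^{N^2}\|_2^2 + \|\vec{f}_y^{N^2}\|_2^2 } \tag{31}$$

Again, the minimization problem is,

$$\min_{\vec{c}} \left( \|\vec{f}^{N^2} - \vec{f}\|_2^2 + \lambda \left( \|\vec{f}_x^{N^2}\|_2^2 + \|\vec{f}_y^{N^2}\|_2^2 \right) \right) \tag{32}$$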
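Similarly, we could formulate the overdetermined system

$$\begin{bmatrix} \tilde{\tilde{K}} \\ \sqrt{\lambda}\,\tilde{\tilde{P}}_x \\ \sqrt{\lambda}\,\tilde{\tilde{P}}_y \end{bmatrix} \vec{c} = \begin{bmatrix} \vec{f} \\ \vec{0} \\ \vec{0} \end{bmatrix} \tag{33}$$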
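The least squares solution is equivalent to solving

$$(\tilde{\tilde{K}}^T \tilde{\tilde{K}} + \lambda \tilde{\tilde{P}}_x^T \tilde{\tilde{P}}_x + \lambda \tilde{\tilde{P}}_y^T \tilde{\tilde{P}}_y)\,\vec{c} = \tilde{\tilde{K}}^T \vec{f} \tag{34}$$

Again, the matrices $\tilde{\tilde{K}}$, $\tilde{\tilde{P}}_x$ and $\tilde{\tilde{P}}_y$ have the same set of eigenvectors $\vec{v}_{i,j} = \vec{u}_i \otimes \vec{u}_j$. The solution could be written as

$$\begin{aligned}
\vec{c} &= \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{s_i s_j}{s_i^2 s_j^2 + \lambda (r_i^2 s_j^2 + s_i^2 r_j^2)}\, (\vec{v}_{i,j}^T \vec{f})\, \vec{v}_{i,j} \\
&= \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{s_i^2 s_j^2}{s_i^2 s_j^2 + \lambda (r_i^2 s_j^2 + s_i^2 r_j^2)}\, \frac{1}{s_i s_j}\, (\vec{v}_{i,j}^T \vec{f})\, \vec{v}_{i,j}
\end{aligned} \tag{35}$$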
Substituting $r_i^2 = -i^2 s_i^2$ and $r_j^2 = -j^2 s_j^2$ into the above equation, we have

$$\vec{c} = \sum_{i=1}^{N} \sum_{j=1}^{N} w_{i,j}\, \frac{1}{s_i s_j}\, (\vec{v}_{i,j}^T \vec{f})\, \vec{v}_{i,j} \tag{36}$$

where $w_{i,j} = \frac{1}{1 - \lambda (i^2 + j^2)}$. This suggests that, in the wavenumber space, harmonic components with wavenumbers on the same spherical surface are damped by the same factor.
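The statement that the damping depends only on $i^2 + j^2$ can be illustrated by tabulating $w_{i,j}$ on a small wavenumber grid. The sketch assumes NumPy, and the value of $\lambda$ and the grid extent are arbitrary illustrative choices.

```python
import numpy as np

lam = -0.05                          # assumed negative parameter (low-pass case)
i = np.arange(0, 6)
j = np.arange(0, 6)
I, J = np.meshgrid(i, j, indexing="ij")

# w_{i,j} = 1 / (1 - lam * (i^2 + j^2)) from Eqn. 36
W = 1.0 / (1.0 - lam * (I**2 + J**2))

# Wavenumber pairs with equal i^2 + j^2 (here 25) share one damping factor
same_radius = [W[3, 4], W[4, 3], W[5, 0], W[0, 5]]
```

Contours of `W` in the $(i, j)$ plane are circles, which become spherical surfaces in higher dimensions.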
5 Conclusion
In this paper, we provided two methods for smoothing the RBF interpolations. This procedure could be applied to other collocation spectral methods.
References
[1] D. Calvetti, L. Morigi, L. Reichel, and F. Sgallari. Tikhonov regularization and the L-curve for large discrete ill-posed problems. Journal of Computational and Applied Mathematics, 123:423–446, 2000.
[2] Per Christian Hansen. Analysis of discrete ill-posed problems by means of the L-curve. SIAM Review, 34(4):561–580, December 1992.
[3] Per Christian Hansen and Dianne Prost O'Leary. The use of the L-curve in the regularization of discrete ill-posed problems. SIAM J. Sci. Comput., 14(6):1487–1503, November 1993.
[4] F.J. Hickernell and Y.C. Hon. Radial basis function approximation as smoothing splines. Appl. Math. Comput., 102(1):1–24, 1999.
[5] Charles L. Lawson and Richard J. Hanson. Solving Least Squares Problems. Prentice-Hall, Englewood Cliffs, NJ, 1974.
[6] Lloyd N. Trefethen and David Bau. Numerical Linear Algebra. SIAM, Philadelphia, 1997.
Figure 3: Interpolation of $P_1(\mu) + P_{10}(\mu)$ using Gaussian RBF on a sphere with $\lambda = 10^{-5}$, $10^{-2}$ and $10$ (three panels titled "RBF vorticity field, P1+P10" for each $\lambda$; horizontal axis $\lambda$, vertical axis $\theta$). The number of icosahedral grid points used is $N = 1442$, and the Gaussian RBF shape parameter is $\alpha = 1/3$.
Figure 4: Exact $P_1(\mu)$, $P_{10}(\mu)$ and $P_1(\mu) + P_{10}(\mu)$ for comparison (three panels titled "exact vorticity field" for P1, P10 and P1+P10; horizontal axis $\lambda$, vertical axis $\theta$). The figures from RBF interpolations with different Tikhonov regularization parameters are plotted in Fig. 3.