An optimization approach for dynamical Tucker tensor approximation

Lukas Exl 1,2

1 Faculty of Mathematics, University of Vienna, 1090 Vienna, Austria.
2 Physics of Functional Materials, University of Vienna, 1090 Vienna, Austria.

January 17, 2018

Abstract. An optimization-based approach for Tucker tensor approximation of parameter-dependent data tensors and solutions of tensor differential equations with low Tucker rank is presented. The problem of updating the tensor decomposition is reformulated as a fitting problem subject to the tangent space without relying on an orthogonality gauge condition. A discrete Euler scheme is established in an alternating least squares framework, where the quadratic subproblems reduce to trace optimization problems that are shown to be explicitly solvable via SVDs of small size. In the presence of small singular values, instability for larger ranks is reduced, since the method does not need the (pseudo) inverse of matricizations of the core tensor. Regularization of Tikhonov type can be used to compensate for the lack of uniqueness in the tangent space. The method is validated numerically and shown to be stable also for larger ranks in the case of small singular values of the core unfoldings. Higher order explicit integrators of Runge-Kutta type can be composed.

Keywords. Dynamical tensor approximation, low-rank approximation, orthogonality gauge condition, small singular values, tensor differential equations, model order reduction

Mathematics Subject Classification. 15A69, 90C30, 65Z05

1 Introduction

Low-rank matrix and tensor approximation occurs in a variety of applications as a model reduction approach (Grasedyck et al., 2013). This paper concerns the problem of fitting parameter- or time-dependent matrices and tensors with small (Tucker) rank within the Tucker tensor format (Kolda and Bader, 2009), that is, updating the factors and core tensor of the model instead of working with the tensor itself. This topic was recently discussed in (Koch and Lubich, 2007, 2010) in a continuous setting, where differential equations for the factor matrices and core tensor of the Tucker approximation were derived. For a time-varying family of tensors $\mathcal{A}(t) \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d}$ and the approximation manifold $\mathcal{M}_r$ (Tucker tensors with rank $r$) an alternative to the best-approximation problem

$$\min_{\mathcal{X}(t) \in \mathcal{M}_r} \|\mathcal{X}(t) - \mathcal{A}(t)\| \qquad (1)$$

is to consider a dynamical tensor approximation problem based on derivatives $\dot{\mathcal{Y}}$ which lie in the tangent space $T_{\mathcal{Y}}\mathcal{M}_r$, i.e.,

$$\min_{\dot{\mathcal{Y}}(t) \in T_{\mathcal{Y}}\mathcal{M}_r} \|\dot{\mathcal{Y}}(t) - \dot{\mathcal{A}}(t)\| \qquad (2)$$

with an initial condition $\mathcal{Y}(0)$, typically a best-approximation in $\mathcal{M}_r$ of $\mathcal{A}(0)$. Note that for the choice $\dot{\mathcal{A}}(t) = F(t, \mathcal{Y}(t))$ this becomes the defect approximation of a tensor differential equation $\dot{\mathcal{Y}}(t) = F(t, \mathcal{Y}(t))$. A prominent example of a tensor differential equation on $\mathcal{M}_r$ is the multiconfiguration time-dependent Hartree approach (MCTDH) of molecular quantum dynamics (Beck et al., 2000, Lubich, 2015), where a Schrödinger equation subject to $\mathcal{M}_r$ is solved. More precisely, the MCTDH ansatz takes the form

$$i\dot{\psi} = H\psi \quad \text{s.t.} \quad \psi(x, t) = \sum_{j_1=1}^{r_1} \cdots \sum_{j_d=1}^{r_d} a_{j_1 \cdots j_d}(t) \prod_{\kappa=1}^{d} \varphi_{j_\kappa}^{(\kappa)}(x_\kappa, t),$$

where the discretized constraint on the many-particle wave function $\psi$ leads to a Tucker tensor. A time-dependent variational principle is used to extract differential equations for the coefficient tensor $a_{j_1 \cdots j_d}$ and the factors $\varphi_{j_\kappa}^{(\kappa)}$. This procedure is equivalent to solving problem (2) for $\dot{\mathcal{A}}(t) = H\mathcal{Y}$, with $\mathcal{Y}$ denoting the discretized wave function in Tucker format, by applying the Galerkin condition

$$\langle i\dot{\mathcal{Y}} - H\mathcal{Y},\, \mathcal{V} \rangle = 0 \quad \text{for all} \quad \mathcal{V} \in \widetilde{T}_{\mathcal{Y}}\mathcal{M}_r,$$

for a gauged tangent space $\widetilde{T}_{\mathcal{Y}}\mathcal{M}_r$. In this sense, problem (2) yields an initial value problem on $\mathcal{M}_r$ in form of a system of nonlinear differential equations for the factors and core tensor in the Tucker decomposition. This system is explicit if an orthogonality gauge condition in $T_{\mathcal{Y}}\mathcal{M}_r$ is imposed. However, the arising equations involve the (pseudo) inverse of the core unfoldings, which might be close to singular. Consider for instance the two-dimensional (low-rank matrix) case, where $\mathcal{M}_r$ consists of matrices $Y = UCV^T \in \mathbb{R}^{I_1 \times I_2}$ with $U \in \mathbb{R}^{I_1 \times r}$, $V \in \mathbb{R}^{I_2 \times r}$ and $C \in \mathbb{R}^{r \times r}$ nonsingular. The mentioned equations for the factors of the model take the form (omitting the argument $t$) (Koch and Lubich, 2007)

$$\dot{U} = (I - UU^T)\,\dot{A}\,V C^{-1}, \qquad \dot{V} = (I - VV^T)\,\dot{A}^T U C^{-T}, \qquad \dot{C} = U^T \dot{A}\, V. \qquad (3)$$
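To make the stability issue concrete, here is a minimal NumPy sketch of the right-hand side (3) in the matrix case. The function and its name are ours for illustration, not part of any library; note the explicit inverse of $C$, which is precisely the quantity that degrades for small singular values.

```python
import numpy as np

def koch_lubich_rhs(U, C, V, Adot):
    """Right-hand side (3) of the gauged matrix ODEs (a sketch).

    U: (I1, r) and V: (I2, r) orthonormal factors, C: (r, r) core,
    Adot: (I1, I2) time derivative of the data matrix.
    """
    Cinv = np.linalg.inv(C)              # problematic if C has small singular values
    PU = np.eye(U.shape[0]) - U @ U.T    # projector onto range(U)^perp
    PV = np.eye(V.shape[0]) - V @ V.T    # projector onto range(V)^perp
    Udot = PU @ Adot @ V @ Cinv
    Vdot = PV @ Adot.T @ U @ Cinv.T      # C^{-T}
    Cdot = U.T @ Adot @ V
    return Udot, Cdot, Vdot
```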

Nonzero singular values of $Y$ are those of the core $C$, which has to be inverted in the first two equations of (3). Hence, very small singular values lead to severe problems concerning the stability of explicit or implicit numerical integrators (Kieri et al., 2016). Unfortunately, this is a very realistic case, since in practice the prescribed initial ranks should be chosen larger than the unknown effective ranks (overestimation), because underestimating necessarily leads to large errors. Therefore, a numerical scheme for (2) has to manage the rank-deficient case. Recently, Lubich and Oseledets (2014) introduced an operator splitting approach for low-rank matrix approximation, which behaves robustly in the mentioned problematic case. In the meantime the method has been extended to the tensor train format (Lubich et al., 2015) and applied to MCTDH (Lubich, 2015). In this paper we break out in a different direction. We will develop an integration scheme using the optimization framework (2) itself rather than a system of nonlinear differential equations. For that purpose, it turns out to be beneficial to give up the gauge condition in the tangent space.

Remarkably, the proposed scheme does not need any (pseudo) inverse of the core unfoldings (density matrices in MCTDH). The approach is motivated by static fitting problems for data tensors, where quasi-best approximations in the Tucker format can be derived very efficiently from alternating least squares (ALS) or general nonlinear minimization of the norm of the residual (Kolda and Bader, 2009, Hackbusch, 2012). For instance, the key of the higher order orthogonal iteration (HOOI) (De Lathauwer et al., 2000) is the reduction of the quadratic subproblems in the ALS scheme to the task of computing an orthonormal basis of a dominant eigenspace (Kokiopoulou et al., 2011). Here, we establish a discrete Euler scheme by alternating least squares, where we impose orthonormal columns for the updated factor matrices as a constraint. As a consequence, the originally quadratic subproblems become linear with explicit solutions via SVD. The method does not require any pseudo inverse of unfoldings of the core. Hence, the approach is far less affected by small singular values, although the stability of the SVD solutions of the involved trace optimization problems can still be negatively affected in severe cases. It turns out that Tikhonov regularization reduces this problem while also compensating for the lack of uniqueness of the tangent space representation. The regularization only acts on the column space of the factor matrices and would be inactive if the conventional orthogonality gauge condition had been imposed. In the following section we introduce tensor notation and give some details on Tucker decompositions and the approximation manifold. In Section 3 we review the approach of Koch and Lubich (2010). Section 4 formulates the optimization framework and establishes the Euler scheme. Finally, we validate our approach by means of numerical examples.

2 Prerequisites

In the following a brief description of tensors, the Tucker decomposition and the manifold of rank-$(r_1, \ldots, r_d)$ tensors is given. The reader is referred to the review article (Kolda and Bader, 2009) or the book (Hackbusch, 2012) for further details and references. We treat real-valued matrices and tensors, but emphasize that the complex case (as in quantum mechanics applications) does not pose any problems. Specifically, the derivation of the Euler scheme of Sec. 4 with its Theorem 3 generalizes to this case.

2.1 Notation

We denote matrices by boldface capital letters, e.g., $\mathbf{A}$, and vectors by boldface lowercase letters, e.g., $\mathbf{a}$. A tensor is denoted by boldface Euler script letters, e.g., $\mathcal{X}$. Scalars are denoted by lowercase letters, e.g., $\alpha$. We use the Frobenius inner product and norm for matrices, denoted by $\langle \cdot, \cdot \rangle$ and $\|\cdot\|$, respectively. The norm of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d}$ is the square root of the sum of the squares of all elements, or equivalently, the Euclidean norm of the vectorized tensor

$$\|\mathcal{X}\| = \sqrt{\mathrm{vec}(\mathcal{X})^T \mathrm{vec}(\mathcal{X})}. \qquad (4)$$

The inner product of two tensors $\mathcal{X}, \mathcal{Y} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d}$ is the sum of the products of their entries, i.e.,

$$\langle \mathcal{X}, \mathcal{Y} \rangle = \mathrm{vec}(\mathcal{X})^T \mathrm{vec}(\mathcal{Y}). \qquad (5)$$

There holds $\|\mathcal{X}\|^2 = \langle \mathcal{X}, \mathcal{X} \rangle$. Fibers of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d}$ are the generalizations of matrix rows and columns. A mode-$n$ fiber of $\mathcal{X}$ consists of the elements with all indices fixed except the $n$th index.

The $n$th unfolding of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d}$ is the matrix $X_{(n)} \in \mathbb{R}^{I_n \times \prod_{j \neq n} I_j}$ that arranges the mode-$n$ fibers to be the columns of the resulting matrix. We denote the corresponding matricization as $[\mathcal{X}]_{(n)} = X_{(n)}$. The $n$-rank of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d}$ is

$$r_n = \mathrm{rank}(X_{(n)}), \qquad (6)$$

and the vector $r = (r_1, \ldots, r_d)^T$ is called the Tucker rank of $\mathcal{X}$. The $n$-mode product of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d}$ with a matrix $U \in \mathbb{R}^{J_n \times I_n}$ results in a tensor

$$\mathcal{X} \times_n U \in \mathbb{R}^{I_1 \times \cdots \times I_{n-1} \times J_n \times I_{n+1} \times \cdots \times I_d}, \qquad (7)$$

which is defined as

$$[\mathcal{X} \times_n U]_{(n)} = U X_{(n)}. \qquad (8)$$

There holds (with appropriate dimensions)

$$(\mathcal{X} \times_n U) \times_n V = \mathcal{X} \times_n VU \qquad (9)$$

and for $n \neq m$

$$\mathcal{X} \times_n U \times_m W = \mathcal{X} \times_m W \times_n U. \qquad (10)$$

We will further denote the range (column space) of a matrix $U$ with $\mathcal{R}(U)$ and the orthogonal complement with $\mathcal{R}(U)^\perp$.
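For illustration, a minimal NumPy sketch of the mode-$n$ unfolding and the $n$-mode product (8), following the Kolda-Bader column ordering assumed in (14) below; the helper names (`unfold`, `fold`, `nmode_product`) are ours and are reused in later sketches. The final assertion checks the commutation rule (10) numerically.

```python
import numpy as np

def unfold(X, n):
    """Mode-n unfolding X_(n): mode-n fibers become columns
    (earlier remaining indices vary fastest, Kolda-Bader style)."""
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1, order='F')

def fold(M, n, shape):
    """Inverse of unfold for a tensor of the given shape."""
    full = [shape[n]] + [s for i, s in enumerate(shape) if i != n]
    return np.moveaxis(M.reshape(full, order='F'), 0, n)

def nmode_product(X, U, n):
    """X x_n U, defined via [X x_n U]_(n) = U X_(n), cf. (8)."""
    shape = list(X.shape)
    shape[n] = U.shape[0]
    return fold(U @ unfold(X, n), n, shape)

# check (10): for n != m the mode products commute
X = np.random.rand(3, 4, 5)
U = np.random.rand(6, 3)   # acts on mode 0
W = np.random.rand(7, 4)   # acts on mode 1
lhs = nmode_product(nmode_product(X, U, 0), W, 1)
rhs = nmode_product(nmode_product(X, W, 1), U, 0)
assert np.allclose(lhs, rhs)
```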

2.2 Tucker decomposition and manifold of rank-$(r_1, \ldots, r_d)$ tensors

A Tucker decomposition (Tucker, 1966) of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d}$ has the form

$$\mathcal{X} = \mathcal{C} \times_1 U_1 \cdots \times_d U_d =: \mathcal{C} \bigtimes_{k=1}^{d} U_k, \qquad (11)$$

where the core tensor $\mathcal{C} \in \mathbb{R}^{r_1 \times r_2 \times \cdots \times r_d}$ and the factor matrices $U_k \in \mathbb{R}^{I_k \times r_k}$. The representation (11) is not unique: for $V_k \in \mathbb{R}^{r_k \times r_k}$ nonsingular, modifying the core $\widetilde{\mathcal{C}} = \mathcal{C} \bigtimes_{k=1}^{d} V_k$ together with modified factor matrices $\widetilde{U}_k = U_k V_k^{-1}$ gives the same tensor $\mathcal{X} = \widetilde{\mathcal{C}} \bigtimes_{k=1}^{d} \widetilde{U}_k$.

In the following we further assume the factor matrices to have orthonormal columns, that is,

$$U_k^T U_k = I. \qquad (12)$$

Note that this does not lead to a unique Tucker decomposition, since $\widetilde{U}_k = U_k Q_k$ with orthogonal matrices $Q_k$ and $\widetilde{\mathcal{C}} = \mathcal{C} \bigtimes_{k=1}^{d} Q_k^T$ gives the same tensor $\mathcal{X} = \widetilde{\mathcal{C}} \bigtimes_{k=1}^{d} \widetilde{U}_k$.

A Tucker decomposition can be written as a linear combination of rank-1 tensors that are formed as the outer products of the columns of the factor matrices, that is,

$$\mathcal{X} = \sum_{j_1, \ldots, j_d} c_{j_1, \ldots, j_d}\, u_{j_1}^{(1)} \circ \cdots \circ u_{j_d}^{(d)}, \qquad (13)$$

where $\mathcal{C} = (c_{j_1, \ldots, j_d})$. The $n$th unfolding of a Tucker tensor is given by

$$X_{(n)} = U_n C_{(n)} (U_d \otimes \cdots \otimes U_{n+1} \otimes U_{n-1} \otimes \cdots \otimes U_1)^T =: U_n C_{(n)} \bigotimes_{k \neq n} U_k^T, \qquad (14)$$

where $\otimes$ denotes the Kronecker product.
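A quick numerical check of the unfolding formula (14), reusing the helpers `unfold` and `nmode_product` from the sketch above (our assumed names):

```python
import numpy as np
from functools import reduce

r, I = (2, 3, 4), (5, 6, 7)
C = np.random.rand(*r)
U = [np.linalg.qr(np.random.rand(I[k], r[k]))[0] for k in range(3)]  # orthonormal factors
X = reduce(lambda T, kU: nmode_product(T, kU[1], kU[0]), enumerate(U), C)

n = 1                                   # check (14) for the second mode
others = [U[k] for k in reversed(range(3)) if k != n]   # U_3 (x) U_1, skipping U_2
K = reduce(np.kron, others)
assert np.allclose(unfold(X, n), U[n] @ unfold(C, n) @ K.T)
```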

For given Tucker rank $r = (r_1, \ldots, r_d)^T$ the set

$$\mathcal{M}_r = \{\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d} : r_n = \mathrm{rank}(X_{(n)}),\ n = 1, \ldots, d\} \qquad (15)$$

is the manifold of rank-$(r_1, \ldots, r_d)$ tensors. Every element $\mathcal{X} \in \mathcal{M}_r$ has a representation as a Tucker decomposition, where w.l.o.g. we assume the factor matrices to have orthonormal columns, that is, $U_k^T U_k = I$. We will use $\mathcal{M}_r$ as an approximation manifold with (typically) $r_n \ll I_n$. In the special case $d = 2$, elements of $\mathcal{M}_r$ are usually referred to as low-rank matrices, where $r = (r, r)^T$. In this case, the conventional notation is $X = UCV^T \in \mathbb{R}^{I_1 \times I_2}$ with $U \in \mathbb{R}^{I_1 \times r}$, $V \in \mathbb{R}^{I_2 \times r}$ and $C \in \mathbb{R}^{r \times r}$ nonsingular, but not necessarily diagonal as in the case of a (truncated) singular value decomposition (SVD).

For $\mathcal{Y} \in \mathcal{M}_r$ the corresponding tangent space is denoted with $T_{\mathcal{Y}}\mathcal{M}_r$, where $\dot{\mathcal{Y}} \in T_{\mathcal{Y}}\mathcal{M}_r$ has the form

$$\dot{\mathcal{Y}} = \dot{\mathcal{C}} \bigtimes_{n=1}^{d} U_n + \sum_{n=1}^{d} \mathcal{C} \times_n \dot{U}_n \bigtimes_{k \neq n} U_k, \qquad (16)$$

with $\dot{\mathcal{C}} \in \mathbb{R}^{r_1 \times r_2 \times \cdots \times r_d}$ and $\dot{U}_n \in T_{U_n}\mathcal{V}_{I_n, r_n}$, where

$$T_U \mathcal{V}_{I,r} := \{\dot{U} \in \mathbb{R}^{I \times r} : \dot{U}^T U + U^T \dot{U} = 0\} \qquad (17)$$

is the tangent space of $U \in \mathcal{V}_{I,r}$, the Stiefel manifold of real $I \times r$ matrices with orthonormal columns. If the gauge condition (orthogonality constraint)

$$U_n^T \dot{U}_n = 0, \qquad n = 1, \ldots, d, \qquad (18)$$

is imposed, we denote the tangent space with $\widetilde{T}_{\mathcal{Y}}\mathcal{M}_r$.
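To make (16) concrete, a short sketch that assembles a tangent tensor from a core derivative and factor derivatives; `multi_ttm` is our shorthand for the multilinear product $\mathcal{C} \bigtimes_k U_k$ and is reused in later sketches.

```python
import numpy as np
from functools import reduce

def multi_ttm(C, mats):
    """Multilinear product C x_1 mats[0] x_2 ... x_d mats[d-1]."""
    return reduce(lambda T, kM: nmode_product(T, kM[1], kM[0]), enumerate(mats), C)

def tangent_tensor(C, U, Cdot, Udot):
    """Assemble a tangent tensor according to (16)."""
    Ydot = multi_ttm(Cdot, U)
    for n in range(len(U)):
        mats = [Udot[n] if k == n else U[k] for k in range(len(U))]
        Ydot = Ydot + multi_ttm(C, mats)
    return Ydot
```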

3 Related approach of Koch and Lubich

Dynamical approximation of time-dependent tensors $\mathcal{A}(t)$ was previously considered in (Koch and Lubich, 2010) with the help of an orthogonal projection onto the gauged tangent space $\widetilde{T}_{\mathcal{Y}}\mathcal{M}_r$. This can be accomplished by a Galerkin condition, also known as the Dirac-Frenkel variational principle (Beck et al., 2000), that is (omitting the argument $t$)

$$\langle \dot{\mathcal{Y}} - \dot{\mathcal{A}},\, \mathcal{V} \rangle = 0 \quad \text{for all} \quad \mathcal{V} \in \widetilde{T}_{\mathcal{Y}}\mathcal{M}_r. \qquad (19)$$

The core and the factors of the decomposition of $\mathcal{Y}$ fulfilling (19) are explicitly given by a system of differential equations. We state here the associated theorem from (Koch and Lubich, 2010).

Theorem 1 (Koch and Lubich (2010)). For a tensor $\mathcal{Y} \in \mathcal{M}_r$ with $n$-mode factors having orthonormal columns the condition (19) is equivalent to

$$\dot{\mathcal{Y}} = \dot{\mathcal{C}} \bigtimes_{n=1}^{d} U_n + \sum_{n=1}^{d} \mathcal{C} \times_n \dot{U}_n \bigtimes_{k \neq n} U_k \qquad (20)$$

with the core and the factors satisfying

$$\dot{\mathcal{C}} = \dot{\mathcal{A}} \bigtimes_{k=1}^{d} U_k^T, \qquad \dot{U}_n = P_n^{\perp} \Big[\dot{\mathcal{A}} \bigtimes_{k \neq n} U_k^T\Big]_{(n)} C_{(n)}^T \big(C_{(n)} C_{(n)}^T\big)^{-1}, \qquad (21)$$

where $P_n^{\perp} = I - U_n U_n^T$ is the orthogonal projection onto $\mathcal{R}(U_n)^{\perp}$.

The equations (21) can be integrated for a given initial tensor $\mathcal{Y}(0) \in \mathcal{M}_r$ to obtain Tucker decompositions $\mathcal{Y}(t) \in \mathcal{M}_r$, $t > 0$, which are approximations to $\mathcal{A}(t) \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d}$ in the sense of the Galerkin condition (19), or equivalently, the optimization problem

$$\min_{\dot{\mathcal{Y}}(t) \in \widetilde{T}_{\mathcal{Y}}\mathcal{M}_r} \|\dot{\mathcal{Y}}(t) - \dot{\mathcal{A}}(t)\|. \qquad (22)$$

The error of (19) can be bounded in terms of the best-approximation error (1), which can be extended to the case of tensor differential equations if (one-sided) Lipschitz conditions for $F$ are assumed (Koch and Lubich, 2010).

4 Optimization approach and Euler scheme via ALS

4.1 Dynamical approximation without orthogonality constraint in the tangent space

We can derive generalizations of (21) if we give up the orthogonality constraint $U_n^T \dot{U}_n = 0$. We then tackle these implicit equations numerically by an optimization scheme, which allows us to overcome numerical problems in the factor equations coming from the orthogonal projections and the necessary inversion of core unfoldings.

Theorem 2. For a tensor $\mathcal{Y} \in \mathcal{M}_r$ with $n$-mode factors having orthonormal columns the optimization problem

$$\min_{\dot{\mathcal{Y}}(t) \in T_{\mathcal{Y}}\mathcal{M}_r} \|\dot{\mathcal{Y}}(t) - \dot{\mathcal{A}}(t)\| \qquad (23)$$

is equivalent to

$$\dot{\mathcal{Y}} = \dot{\mathcal{C}} \bigtimes_{n=1}^{d} U_n + \sum_{n=1}^{d} \mathcal{C} \times_n \dot{U}_n \bigtimes_{k \neq n} U_k \qquad (24)$$

with the core and the factors satisfying

$$\dot{\mathcal{C}} = \dot{\mathcal{A}} \bigtimes_{k=1}^{d} U_k^T - \sum_{k=1}^{d} \mathcal{C} \times_k U_k^T \dot{U}_k,$$

$$\dot{U}_n = \Big(\Big[\dot{\mathcal{A}} \bigtimes_{k \neq n} U_k^T\Big]_{(n)} - U_n \dot{C}_{(n)} - U_n \Big[\sum_{k \neq n} \mathcal{C} \times_k U_k^T \dot{U}_k\Big]_{(n)}\Big) C_{(n)}^T \big(C_{(n)} C_{(n)}^T\big)^{-1}, \qquad (25)$$

which reduces to (21) if $U_n^T \dot{U}_n = 0$, $n = 1, \ldots, d$.

Proof. We look at the optimization problem

$$\min_{\dot{\mathcal{Y}}(t) \in T_{\mathcal{Y}}\mathcal{M}_r} \|\dot{\mathcal{Y}}(t) - \dot{\mathcal{A}}(t)\|.$$

Since a tangent tensor in $T_{\mathcal{Y}}\mathcal{M}_r$ has the form (24) we minimize for given $\mathcal{Y}(t) \in \mathcal{M}_r$ the objective function

$$\Big\|\dot{\mathcal{C}} \bigtimes_{k=1}^{d} U_k + \sum_{k=1}^{d} \mathcal{C} \times_k \dot{U}_k \bigtimes_{\ell \neq k} U_\ell - \dot{\mathcal{A}}\Big\|^2$$

with respect to $\dot{\mathcal{C}}$ and $\dot{U}_k$. Rewriting yields

$$\Big\|\dot{\mathcal{C}} \bigtimes_{k=1}^{d} U_k + \sum_{k=1}^{d} \mathcal{C} \times_k \dot{U}_k \bigtimes_{\ell \neq k} U_\ell - \dot{\mathcal{A}}\Big\|^2 = \|\dot{\mathcal{A}}\|^2 + \|\dot{\mathcal{C}}\|^2 - 2\Big\langle \dot{\mathcal{A}} \bigtimes_{k=1}^{d} U_k^T,\, \dot{\mathcal{C}} \Big\rangle + 2\sum_{k=1}^{d} \big\langle \dot{\mathcal{C}},\, \mathcal{C} \times_k U_k^T \dot{U}_k \big\rangle - 2\sum_{k=1}^{d} \Big\langle \dot{\mathcal{A}} \bigtimes_{\ell \neq k} U_\ell^T,\, \mathcal{C} \times_k \dot{U}_k \Big\rangle + \sum_{k=1}^{d} \|\mathcal{C} \times_k \dot{U}_k\|^2 + \sum_{k=1}^{d} \sum_{\ell \neq k} \big\langle \mathcal{C} \times_k U_k^T \dot{U}_k,\, \mathcal{C} \times_\ell U_\ell^T \dot{U}_\ell \big\rangle.$$

This is a convex quadratic function in $\dot{\mathcal{C}}$; hence, setting the gradient with respect to $\dot{\mathcal{C}}$ equal to zero gives the first equation in (25). We now define the part of the objective function involving only terms in $\dot{U}_n$ with fixed index $n$ (utilizing the $n$th unfolding) as the convex quadratic function

$$f_n(\dot{U}_n) := \|\dot{U}_n C_{(n)}\|^2 - 2\Big\langle \dot{U}_n,\, \Big[\dot{\mathcal{A}} \bigtimes_{\ell \neq n} U_\ell^T\Big]_{(n)} C_{(n)}^T \Big\rangle + 2\big\langle \dot{U}_n,\, U_n \dot{C}_{(n)} C_{(n)}^T \big\rangle + 2\sum_{\ell \neq n} \Big\langle \dot{U}_n,\, U_n \big[\mathcal{C} \times_\ell U_\ell^T \dot{U}_\ell\big]_{(n)} C_{(n)}^T \Big\rangle. \qquad (26)$$

The derivative of $f_n$ set to zero gives the second equation of (25). □

Note that a discretization scheme using the derivatives (21) or (25) requires the solution of a least squares problem for the factor matrices by calculating the pseudo inverse $C_{(n)}^T (C_{(n)} C_{(n)}^T)^{-1}$ of the $n$th unfolding of the core tensor. Small singular values in $C_{(n)}$ will affect the solution and make the method unstable (Kieri et al., 2016). However, the advantage of (25) lies in the fact that the derivative $\dot{U}_n$ is not orthogonal to the range of $U_n$. This becomes apparent in the discretization scheme that we establish next.

4.2 Euler method under orthogonality constraint

We will consider Euler's method for integration via an alternating least squares (ALS) scheme. For that purpose, all but one factor matrix derivative are kept fixed consecutively, where the core derivative is assumed to be already computed according to the first equation in (25). Consider an Euler step at $t$ for the second equation of (25) with step length $h > 0$, that is,

$$U_n(t+h) := U_n(t) + h \dot{U}_n(t). \qquad (27)$$

Note that the orthogonality gauge $U_n^T \dot{U}_n = 0$ prevents the updated factor matrices from having orthonormal columns, which would be $U_n(t+h)^T U_n(t+h) = I$. On the other hand, as will become apparent from Theorem 3 below, the function $f_n$ in (26) with respect to the variable $U_n^h := U_n(t+h)$ becomes linear if $(U_n^h)^T U_n^h = I$ is imposed. We will therefore seek an Euler step under the orthogonality constraint for the updated factor matrices. More precisely, for fixed $t$ we define for $V_n \in \mathbb{R}^{I_n \times r_n}$ (omitting the argument $t$)

$$\tilde{f}_n(V_n) := f_n\big(\tfrac{1}{h}(V_n - U_n)\big), \qquad (28)$$

and define $U_n^h$ as the solution of the constrained optimization problem

$$\min_{V_n^T V_n = I} \tilde{f}_n(V_n). \qquad (29)$$

The following theorem characterizes the updated factor matrices $U_n^h$ from one Euler step according to (29) as the solution of a trace optimization problem, which is computable from an SVD of a matrix of size $I_n \times r_n$.

Theorem 3. The optimization problem

$$\min_{V_n^T V_n = I} \tilde{f}_n(V_n) \qquad (30)$$

is equivalent to the trace optimization problem

$$\max_{V_n^T V_n = I} \mathrm{Tr}\big(B_n^T V_n\big), \qquad (31)$$

with

$$B_n = \Big(\Big[\dot{\mathcal{A}} \bigtimes_{k \neq n} U_k^T\Big]_{(n)} - U_n \dot{C}_{(n)} - U_n \Big[\sum_{k \neq n} \mathcal{C} \times_k U_k^T \dot{U}_k\Big]_{(n)}\Big) C_{(n)}^T + \frac{1}{h}\, U_n C_{(n)} C_{(n)}^T. \qquad (32)$$

Moreover, given the (economy size) SVD $B_n = R_n \Sigma_n T_n^T$ with $R_n \in \mathbb{R}^{I_n \times r_n}$, $T_n \in \mathbb{R}^{r_n \times r_n}$ and the diagonal matrix $\Sigma_n \in \mathbb{R}^{r_n \times r_n}$, the solution $U_n^h$ of (31) is given by

$$V_n = U_n^h = R_n T_n^T. \qquad (33)$$

Proof. Let $B_n^1 = \big(\big[\dot{\mathcal{A}} \bigtimes_{k \neq n} U_k^T\big]_{(n)} - U_n \dot{C}_{(n)} - U_n \big[\sum_{k \neq n} \mathcal{C} \times_k U_k^T \dot{U}_k\big]_{(n)}\big) C_{(n)}^T$. Then we have $f_n(\dot{U}_n) = \|\dot{U}_n C_{(n)}\|^2 - 2\langle \dot{U}_n, B_n^1 \rangle$ and hence

$$\tilde{f}_n(V_n) = \frac{2}{h^2}\Big(\|C_{(n)}\|^2 - \big\langle V_n,\, U_n C_{(n)} C_{(n)}^T \big\rangle\Big) - \frac{2}{h}\Big(\big\langle V_n, B_n^1 \big\rangle - \big\langle U_n, B_n^1 \big\rangle\Big),$$

where we used $V_n^T V_n = U_n^T U_n = I$. Thus, the optimization problem (30) is equivalent to

$$\max_{V_n^T V_n = I} \Big\langle V_n,\, B_n^1 + \frac{1}{h}\, U_n C_{(n)} C_{(n)}^T \Big\rangle,$$

which is the trace optimization problem (31). Consider now the singular value decomposition $B_n = \widetilde{R}_n \widetilde{\Sigma}_n T_n^T$ with $\widetilde{R}_n \in \mathbb{R}^{I_n \times I_n}$, $\widetilde{\Sigma}_n = \begin{pmatrix} \mathrm{diag}(\sigma_1, \ldots, \sigma_{r_n}) \\ 0 \end{pmatrix} \in \mathbb{R}^{I_n \times r_n}$ and $T_n \in \mathbb{R}^{r_n \times r_n}$. Note that we assume $r_n \le I_n$. Cyclic permutation of the arguments in the trace yields

$$\mathrm{Tr}\big(B_n^T V_n\big) = \mathrm{Tr}\big(T_n \widetilde{\Sigma}_n^T \widetilde{R}_n^T V_n\big) = \mathrm{Tr}\big(\widetilde{\Sigma}_n^T \widetilde{R}_n^T V_n T_n\big) = \sum_{j=1}^{r_n} \sigma_j w_{jj},$$

where we define the matrix $W_n := \widetilde{R}_n^T V_n T_n \in \mathbb{R}^{I_n \times r_n}$. Owing to $W_n^T W_n = I$, we have $|w_{jj}| \le 1$ for the diagonal elements. Hence, the maximum of the objective in (31) is attained for $w_{jj} = 1$, which implies $I = R_n^T V_n T_n$ and consequently $V_n = R_n T_n^T$. □
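A quick numerical illustration of Theorem 3 with arbitrary test data: the SVD construction (33) is feasible and attains the upper bound $\sum_j \sigma_j$ of the trace objective (31).

```python
import numpy as np

B = np.random.rand(50, 5)                      # a B_n with I_n = 50, r_n = 5
R, s, Tt = np.linalg.svd(B, full_matrices=False)
V = R @ Tt                                     # the maximizer (33)
assert np.allclose(V.T @ V, np.eye(5))         # feasibility: V^T V = I
assert np.isclose(np.trace(B.T @ V), s.sum())  # attains the bound sum(sigma_j)
```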

Note that the matrices $B_n$ do not require inversion of the potentially problematic matrices $C_{(n)} C_{(n)}^T$. Theorem 3 readily generalizes to the case of complex-valued dynamical tensor approximation. More precisely, in the subproblems of the ALS scheme the functions $f_n$ boil down to the real part of the inner product, that is, $\mathrm{Re}[\langle \dot{U}_n, B_n^1 \rangle]$. As a consequence, the equivalent trace optimization problem takes the form

$$\max_{A^* A = I} \mathrm{Re}[\langle A, B \rangle], \qquad (34)$$

with the solution

$$A_0 = R T^*, \qquad (35)$$

where $B = R \Sigma T^*$ is the economy size SVD of $B$.

4.3 Algorithm for Euler step via ALS

The computation of an Euler step for (25) via alternating least squares for (23) is summarized in Algorithm 1.

Algorithm 1: Euler step from $t \to t+h$ via ALS for (23)
1. Initialize $\Delta U_n$, e.g., $\Delta U_n = 0$, $n = 1, \ldots, d$.
2. Initialize $\Delta\mathcal{C} = \dot{\mathcal{A}} \bigtimes_{k=1}^{d} U_k^T - \sum_{k=1}^{d} \mathcal{C} \times_k U_k^T \Delta U_k$.
3. repeat
4.   for $n = 1, \ldots, d$ do
5.     Compute $B_n$ from (32) with $\dot{U}_n \leftarrow \Delta U_n$ and $\dot{\mathcal{C}} \leftarrow \Delta\mathcal{C}$.
6.     Compute the economy SVD $B_n = R_n \Sigma_n T_n^T$.
7.     Set $U_n^h = R_n T_n^T$.
8.     Set $\Delta U_n = (U_n^h - U_n)/h$.
9.   end for
10.  Update the core $\Delta\mathcal{C} = \dot{\mathcal{A}} \bigtimes_{k=1}^{d} U_k^T - \sum_{k=1}^{d} \mathcal{C} \times_k U_k^T \Delta U_k$.
11. until the fit (23) ceases to improve or the maximum number of iterations is exhausted.
12. return the approximation $\mathcal{Y}(t+h) = (\mathcal{C} + h\Delta\mathcal{C}) \bigtimes_{n=1}^{d} U_n^h$.

An Euler method for the dynamical approximation (23) using Algorithm 1 starts from a rank-$r$ decomposition or (best-)approximation of $\mathcal{A}(0)$, e.g., via higher order orthogonal iteration (HOOI), see (Kolda and Bader, 2009). Higher order schemes of Runge-Kutta type can be composed based on the above Euler scheme.
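The following NumPy sketch condenses Algorithm 1 into code. It is our illustrative reconstruction, not the MATLAB/Tensor Toolbox implementation used in the paper; it reuses `unfold`, `nmode_product` and `multi_ttm` from the earlier sketches, the fixed `sweeps=2` default reflects the remark in Section 5 that the fit tolerance was typically reached after two loops, and the `alpha` parameter anticipates the regularization of Section 4.4.

```python
import numpy as np

def max_trace_orthonormal(B):
    """Maximizer of Tr(B^T V) over V^T V = I via the economy SVD
    B = R Sigma T^T; the solution is V = R T^T, cf. (33).
    (For complex B, np.linalg.svd returns T^* directly, cf. (35).)"""
    R, _, Tt = np.linalg.svd(B, full_matrices=False)
    return R @ Tt

def delta_core(Adot, C, U, dU):
    """Core increment: first equation of (25) with dU in place of U-dot."""
    dC = multi_ttm(Adot, [Uk.T for Uk in U])
    for k in range(len(U)):
        dC = dC - nmode_product(C, U[k].T @ dU[k], k)
    return dC

def euler_step_als(Adot, C, U, h, sweeps=2, alpha=0.0):
    """One Euler step t -> t+h for (23) in the spirit of Algorithm 1.
    alpha > 0 switches on the Tikhonov term of Section 4.4."""
    d = len(U)
    dU = [np.zeros_like(Un) for Un in U]
    dC = delta_core(Adot, C, U, dU)
    Uh = list(U)
    for _ in range(sweeps):
        for n in range(d):
            # G = [Adot x_{k != n} U_k^T]_(n)
            mats = [np.eye(U[n].shape[0]) if k == n else U[k].T for k in range(d)]
            G = unfold(multi_ttm(Adot, mats), n)
            # S = [sum_{k != n} C x_k U_k^T dU_k]_(n)
            S = np.zeros_like(unfold(C, n))
            for k in range(d):
                if k != n:
                    S = S + unfold(nmode_product(C, U[k].T @ dU[k], k), n)
            Cn = unfold(C, n)
            Bn = (G - U[n] @ unfold(dC, n) - U[n] @ S) @ Cn.T + (U[n] @ Cn @ Cn.T) / h
            Bn = Bn + (alpha / h) * U[n]        # regularized variant, cf. (38)
            Uh[n] = max_trace_orthonormal(Bn)   # solve (31) via SVD, cf. (33)
            dU[n] = (Uh[n] - U[n]) / h
        dC = delta_core(Adot, C, U, dU)         # update the core increment
    return C + h * dC, Uh                       # Y(t+h) = (C + h*dC) x_n Uh_n
```

Each returned factor $U_n^h$ has orthonormal columns by construction, so the orthonormality assumption of Theorem 2 is preserved from step to step.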

4.4 Regularization

In the course of the derivation of Algorithm 1 the orthogonality gauge $U_n^T \dot{U}_n = 0$ of (Koch and Lubich, 2010) was given up. As a consequence, elements of the tangent space $T_{\mathcal{Y}}\mathcal{M}_r$ no longer have a unique representation. Although we have not observed any related problems in the numerical tests, we found that including a regularization of Tikhonov type helps to stabilize the SVD in cases where $B_n$ is effectively rank deficient and where $C_{(n)} C_{(n)}^T$ is numerically close to singular, respectively. More precisely, we can treat the problem ($\alpha > 0$)

$$\min_{\dot{U}_n} f_n(\dot{U}_n) + \alpha \|\dot{U}_n\|^2, \qquad (36)$$

compare with (26). Note that this positively affects the definiteness of the quadratic term in (26), which becomes $\langle \dot{U}_n,\, \dot{U}_n (C_{(n)} C_{(n)}^T + \alpha I) \rangle$. The discrete scheme changes as follows. We treat the modified optimization problem

$$\min_{V_n^T V_n = I} \tilde{f}_n(V_n) - \frac{2\alpha}{h^2} \langle V_n, U_n \rangle, \qquad (37)$$

which is equivalent to

$$\max_{V_n^T V_n = I} \mathrm{Tr}\Big[\big(B_n + \tfrac{\alpha}{h} U_n\big)^T V_n\Big] \qquad (38)$$

according to Theorem 3. Observe that the regularization term is only effective because $U_n^T \dot{U}_n \neq 0$. In fact, if $\dot{U}_n = \tfrac{1}{h}(V_n - U_n) \in \mathcal{R}(U_n)^{\perp}$ we get

$$P_n V_n = U_n,$$

where $P_n$ is the orthogonal projection onto $\mathcal{R}(U_n)$. Hence, the regularization term would degenerate to $\mathrm{Tr}[U_n^T V_n] = \mathrm{Tr}[U_n^T (P_n V_n + P_n^{\perp} V_n)] = \mathrm{Tr}[U_n^T P_n V_n] = \|U_n\|^2$. The choice $\alpha \propto h^2$ makes the regularization term in (37) independent of the step size $h$. If regularization was used in the numerical examples below, we always chose $\alpha = h^2$.
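In the Euler-step sketch of Section 4.3 the regularized subproblem (38) amounts to a one-line shift of $B_n$ before the SVD (already wired in there through the `alpha` parameter):

```python
# regularized trace problem (38): shift B_n along the current factor U_n;
# alpha = h**2 makes the contribution independent of the step size h
Bn = Bn + (alpha / h) * U[n]
Uh[n] = max_trace_orthonormal(Bn)
```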

5 Numerical validation

We use the Tensor Toolbox Version 2.6 (Bader et al., 2015), which implements efficient tensor arithmetic and fitting algorithms for different tensor classes in MATLAB (Bader and Kolda, 2007). Results were obtained on the Vienna Scientific Cluster 3 (VSC3).

Example 1. (Koch/Lubich) We consider the numerical example from (Koch and Lubich, 2010). A time-dependent data tensor $\mathcal{A} \in \mathbb{R}^{15 \times 15 \times 15 \times 15}$ is constructed as

$$\mathcal{A}(t) = e^t\, \mathcal{B} \bigtimes_{k=1}^{4} U_k + \varepsilon\, (1+t) \sin(3t)\, \mathcal{C}, \qquad t \in [0, 1], \qquad (39)$$

where $U_k \in \mathbb{R}^{15 \times 10}$ are random matrices with orthonormal columns, $\mathcal{B} \in \mathbb{R}^{10 \times 10 \times 10 \times 10}$ is a random core tensor, and $\mathcal{C} \in \mathbb{R}^{15 \times 15 \times 15 \times 15}$ is a random perturbation. The dynamical Tucker tensor approximations are computed with $r = (10, 10, 10, 10)^T$, fixed $h = 10^{-4}$ and different $\varepsilon \in \{10^{-5}, 10^{-4}, 10^{-3}\}$. We use Euler steps as described in Algorithm 1 without regularization. For initialization we use $\Delta U_n = 0$, $n = 1, \ldots, d$. Moreover, a quasi best-approximation for $\mathcal{A}(0) = \mathcal{A}_0$ is used, which is derived from HOOI. Note that this example solves the tensor initial value problem

$$\dot{\mathcal{A}} = \mathcal{A} + \varepsilon \big(3(1+t)\cos(3t) - t \sin(3t)\big)\, \mathcal{C}, \qquad \mathcal{A}(0) = \mathcal{A}_0.$$
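A sketch of how such a test tensor can be set up with the helpers above (a NumPy stand-in for the MATLAB Tensor Toolbox used in the paper; the seed and names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
I_dim, r_dim, d = 15, 10, 4
U = [np.linalg.qr(rng.standard_normal((I_dim, r_dim)))[0] for _ in range(d)]
B = rng.standard_normal((r_dim,) * d)
C_pert = rng.standard_normal((I_dim,) * d)

def A(t, eps):
    """Data tensor (39)."""
    return np.exp(t) * multi_ttm(B, U) + eps * (1 + t) * np.sin(3 * t) * C_pert

def A_dot(t, eps):
    """Time derivative of (39), used as input to Algorithm 1."""
    return A(t, eps) + eps * (3 * (1 + t) * np.cos(3 * t) - t * np.sin(3 * t)) * C_pert
```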

Results are shown in Figure 1, where relative errors are given, i.e., $\|\mathcal{A}(t) - \mathcal{Y}(t)\| / \|\mathcal{A}(t)\|$. Alg. 1 was used with a tolerance of $10^{-5}$ for the change in the relative fit $\|\dot{\mathcal{A}}(t) - \dot{\mathcal{Y}}(t)\| / \|\dot{\mathcal{A}}(t)\|$, which was typically reached after two loops.

Figure 1: Errors of the dynamical approximation of (39) with $r = (10, 10, 10, 10)^T$ and $\varepsilon \in \{10^{-5}, 10^{-4}, 10^{-3}\}$. Interval $[0, 1]$, step size $h = 10^{-4}$.

Example 2. (Second Order) We repeat Example 1 with $r = (8, 9, 10, 11)^T$ and accordingly adjusted sizes in (39) for $U_k$ and $\mathcal{B}$. The second order improved Euler method (extrapolated Euler) is now used, i.e., for an initial value problem

$$y' = f(t, y), \qquad y(0) = \eta_0,$$

the new iterate $\eta_\nu$ at time $t_\nu = t_{\nu-1} + h$ is calculated from the previous iterate $\eta_{\nu-1}$ at time $t_{\nu-1}$ according to

$$\eta_\nu = \eta_{\nu-1} + h f\big(t_{\nu-1} + \tfrac{h}{2},\, \eta_{\nu-1} + \tfrac{h}{2} f(t_{\nu-1}, \eta_{\nu-1})\big),$$

which is a 2-stage explicit Runge-Kutta formula. The half and full Euler steps, respectively, are calculated using Alg. 1. Figure 2 shows relative errors for different step sizes $h$ and $\varepsilon = 10^{-10}$ on the interval $[0, 1]$ (left) and compares these errors among each other (right). We can observe second order convergence in $h$.
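The paper does not spell out how the two stages are composed on the manifold, so the following sketch is only one plausible realization: the half Euler step (Alg. 1) provides the midpoint decomposition, a full Euler step is evaluated there, the resulting increment is applied to the original state, and the sum is retracted to $\mathcal{M}_r$ by a truncated HOSVD (our choice of retraction; `euler_step_als`, `unfold`, `multi_ttm` as defined above).

```python
import numpy as np

def hosvd_retract(Y, ranks):
    """Quasi-best rank-r retraction via truncated HOSVD (our choice here)."""
    U = []
    for n in range(Y.ndim):
        Rn, _, _ = np.linalg.svd(unfold(Y, n), full_matrices=False)
        U.append(Rn[:, :ranks[n]])
    return multi_ttm(Y, [Un.T for Un in U]), U

def improved_euler_step(Adot_fun, C, U, t, h, ranks):
    """2-stage step of Example 2 (one plausible composition): midpoint
    decomposition from a half Euler step, full increment evaluated there."""
    C_mid, U_mid = euler_step_als(Adot_fun(t), C, U, h / 2)           # half step
    C_new, U_new = euler_step_als(Adot_fun(t + h / 2), C_mid, U_mid, h)  # full step
    # apply the midpoint increment to the original state, then retract
    Y = multi_ttm(C, U) + multi_ttm(C_new, U_new) - multi_ttm(C_mid, U_mid)
    return hosvd_retract(Y, ranks)
```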

Figure 2: Errors of the dynamical approximation of (39) with $r = (8, 9, 10, 11)^T$ and $\varepsilon = 10^{-10}$ using the improved Euler method with step sizes $h \in \{0.01, 0.005, 0.0025, 0.00125\}$ (left) and comparison of the errors among each other (right).

Example 3. (Small singular values in higher dimensions) We generalize the example from (Kieri et al., 2016) to higher dimensions. A time-dependent tensor $\mathcal{A}(t) \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d}$ with $I_k = I$ is constructed as

$$\mathcal{A}(t) = e^t\, \mathcal{C} \bigtimes_{k=1}^{d} \exp(t W_k) \qquad (40)$$

with skew-symmetric random matrices $W_k \in \mathbb{R}^{I \times I}$ and a super-diagonal tensor $\mathcal{C} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_d}$ with diagonal entries $c_j = 2^{-(d-1)j}$, $j = 1, \ldots, I$. We now use the Euler method of Alg. 1 with regularization $\alpha = h^2$. Figure 3 shows relative errors at $t = 0.3$ against the step length $h$ for different ranks in the two-dimensional case with $I = 100$. We repeat these computations for $d = 3$ and $I = 50$ and give the associated results in Figure 4. The tolerance for the change of the fit was $10^{-6}$ and was typically reached after two loops of Algorithm 1.
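A sketch of the construction (40); `scipy.linalg.expm` computes the matrix exponential, and the decay exponent $-(d-1)j$ is an assumption consistent with the $d = 2$ case of (Kieri et al., 2016), where the singular values are $2^{-j}$.

```python
import numpy as np
from scipy.linalg import expm  # matrix exponential

def make_example3(d, I, seed=0):
    """Data for (40): random skew-symmetric W_k and a super-diagonal
    core with decaying entries c_j = 2**(-(d-1)*j)."""
    rng = np.random.default_rng(seed)
    W = []
    for _ in range(d):
        M = rng.standard_normal((I, I))
        W.append(M - M.T)                      # skew-symmetric
    C = np.zeros((I,) * d)
    j = np.arange(1, I + 1)
    C[tuple(np.arange(I) for _ in range(d))] = 2.0 ** (-(d - 1) * j)
    return lambda t: np.exp(t) * multi_ttm(C, [expm(t * Wk) for Wk in W])

A = make_example3(2, 100)   # d = 2, I = 100 as in Figure 3
```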

Figure 3: Errors at $t = 0.3$ of the dynamical approximation of (40) in the case $d = 2$ and $I = 100$ for different step sizes $h$ and ranks $r = (r, r)$, $r \in \{4, 8, 16\}$.

Figure 4: Errors at $t = 0.3$ of the dynamical approximation of (40) in the case $d = 3$ and $I = 50$ for different step sizes $h$ and ranks $r = (r, r, r)$, $r \in \{4, 6, 8\}$.

6 Conclusion

An optimization-based approach for dynamical low-rank matrix and Tucker tensor approximation of parameter-dependent data tensors and of solutions of tensor differential equations such as MCTDH of molecular quantum dynamics was presented. The approach is motivated by the effectiveness of alternating schemes for static tensor fitting problems, which reduce quadratic optimization subproblems to eigenproblems accessible via SVD. The presented method uses alternating least squares to establish an Euler scheme for the dynamical fitting problem on the tangent space without a gauge constraint. The quadratic subproblems are found to be equivalent to trace optimization problems whose solutions are explicitly computable via SVDs of small size. Remarkably, the method needs no pseudo inverse of matrix unfoldings of the core tensor (or inverse of density matrices in MCTDH), which enhances stability in the presence of small singular values. This makes the method robust and eliminates numerical stability problems coming from "overestimated" initial ranks. Regularization of Tikhonov type is suggested to compensate for the lack of uniqueness of the tangent space representations, with the positive side effect of further stabilizing the scheme in rank-deficient cases. Higher order methods can be composed, for which a second order example was given. In forthcoming work the approach can be extended to the hierarchical Tucker format and to tensor trains (matrix product states) (Lubich et al., 2013).

Acknowledgments

Financial support by the Austrian Science Fund (FWF) via the SFB ViCoM (grant F41) is acknowledged. The computations were carried out on the Vienna Scientific Cluster (VSC).

References

Bader, B. W. and Kolda, T. G. (2007). Efficient MATLAB computations with sparse and factored tensors, SIAM Journal on Scientific Computing 30(1): 205–231. https://doi.org/10.1137/060676489.

Bader, B. W., Kolda, T. G. et al. (2015). MATLAB Tensor Toolbox version 2.6, available online. http://www.sandia.gov/~tgkolda/TensorToolbox/.

Beck, M. H., Jäckle, A., Worth, G. and Meyer, H.-D. (2000). The multiconfiguration time-dependent Hartree (MCTDH) method: a highly efficient algorithm for propagating wavepackets, Physics Reports 324(1): 1–105. https://doi.org/10.1016/S0370-1573(99)00047-2.

De Lathauwer, L., De Moor, B. and Vandewalle, J. (2000). On the best rank-1 and rank-(R1, R2, ..., RN) approximation of higher-order tensors, SIAM Journal on Matrix Analysis and Applications 21(4): 1324–1342. https://doi.org/10.1137/S0895479898346995.

Grasedyck, L., Kressner, D. and Tobler, C. (2013). A literature survey of low-rank tensor approximation techniques, GAMM-Mitteilungen 36(1): 53–78. https://doi.org/10.1002/gamm.201310004.

Hackbusch, W. (2012). Tensor Spaces and Numerical Tensor Calculus, Vol. 42, Springer Science & Business Media. https://doi.org/10.1007/978-3-642-28027-6.

Kieri, E., Lubich, C. and Walach, H. (2016). Discretized dynamical low-rank approximation in the presence of small singular values, SIAM Journal on Numerical Analysis 54(2): 1020–1038. https://doi.org/10.1137/15M1026791.

Koch, O. and Lubich, C. (2007). Dynamical low-rank approximation, SIAM Journal on Matrix Analysis and Applications 29(2): 434–454. https://doi.org/10.1137/050639703.

Koch, O. and Lubich, C. (2010). Dynamical tensor approximation, SIAM Journal on Matrix Analysis and Applications 31(5): 2360–2375. https://doi.org/10.1137/09076578X.

Kokiopoulou, E., Chen, J. and Saad, Y. (2011). Trace optimization and eigenproblems in dimension reduction methods, Numerical Linear Algebra with Applications 18(3): 565–602. https://doi.org/10.1002/nla.743.

Kolda, T. G. and Bader, B. W. (2009). Tensor decompositions and applications, SIAM Review 51(3): 455–500. https://doi.org/10.1137/07070111X.

Lubich, C. (2015). Time integration in the multiconfiguration time-dependent Hartree method of molecular quantum dynamics, Applied Mathematics Research eXpress 2015(2): 311–328. https://doi.org/10.1093/amrx/abv006.

Lubich, C. and Oseledets, I. V. (2014). A projector-splitting integrator for dynamical low-rank approximation, BIT Numerical Mathematics 54(1): 171–188. https://doi.org/10.1007/s10543-013-0454-0.

Lubich, C., Oseledets, I. V. and Vandereycken, B. (2015). Time integration of tensor trains, SIAM Journal on Numerical Analysis 53(2): 917–941. https://doi.org/10.1137/140976546.

Lubich, C., Rohwedder, T., Schneider, R. and Vandereycken, B. (2013). Dynamical approximation by hierarchical Tucker and tensor-train tensors, SIAM Journal on Matrix Analysis and Applications 34(2): 470–494. https://doi.org/10.1137/120885723.

Tucker, L. R. (1966). Some mathematical notes on three-mode factor analysis, Psychometrika 31(3): 279–311. https://doi.org/10.1007/BF02289464.
