A Singular Value Decomposition of a k-Way Array for a Principal Component Analysis of Multiway Data, PTA-k

Didier Leibovici*
School of Computing and Mathematical Sciences
University of Greenwich, Woolwich Campus
Wellington Street
London SE18 6PF, U.K.

and

Robert Sabatier
Faculté de Pharmacie
Avenue Charles Flahaut
34000 Montpellier, France

Submitted by Richard A. Brualdi

ABSTRACT

Employing a tensorial approach to describe a k-way array, the singular value decomposition of this type of multiarray is established. The algorithm given to attain a singular value, based on a generalization of the transition formulae, has a Gauss-Seidel form. A recursive algorithm leads to the decomposition termed SVD-k. A generalization of the Eckart-Young theorem is introduced by consideration of new rank concepts: the orthogonal rank and the free orthogonal rank. The application of this generalization in data analysis is illustrated by a principal component analysis (PCA) over k modes, termed PTA-k, which conserves most of the properties of a PCA. © 1998 Elsevier Science Inc.

*E-mail: [email protected].

LINEAR ALGEBRA AND ITS APPLICATIONS 269:307-329 (1998)
© 1998 Elsevier Science Inc. All rights reserved. 0024-3795/98/$19.00
655 Avenue of the Americas, New York, NY 10010. PII S0024-3795(97)90229-2

1. INTRODUCTION

While in factorial data analysis the duality scheme introduced by Cailliez and Pages (1976) has permitted adequate comprehension in algebraic terms of such simple exploratory methods as principal component analysis (PCA) and correspondence analysis or multiple correspondence analysis (CA or MCA), its capacity to explain a multimode analysis is limited. Thus, it is most appropriate for two-way arrays. For a three-way array, three duality schemes can be drawn, but each entry does not play the same role, as for example in the case of the statis and prestatis methods (i.e. statis on the arrays); see Lavit (1988).

Algebraists and statisticians, attempting to generalize PCA or singular value decomposition (SVD), have developed models linked to this composite duality schemes problem. The main models for three-way arrays have been the Parafac model (Harshman, 1970; Kruskal, 1977) and Candecomp, introduced by Carroll and Chang (1970), which are actually the same model, and the Tucker model (Tucker, 1966), along with other models such as the PCA over three modes (PCA-3) developed by Kroonenberg and De Leeuw (1980). These authors used the Kronecker product. The difference between the models lies, however, in the manner in which the data may be represented and in what is optimized.

Preliminary work on extending these models to k-way arrays was conducted by Kaptein et al. (1986) and Yoshisawa (1987), for which an extension of the algebraic framework was also required. It is in this context that Franc (1992), in his thesis, has described a new algebraic approach which, apart from the Kronecker product, operates on the tensor product of k vector spaces: a k-way array is seen as a tensor of order k, an element of this tensor product. This new approach enabled Franc to describe existing models such as Candecomp, Parafac, and PCA-3 algebraically and analytically, and to extend them to k modes without difficulty.

Focusing on SVD, the purpose of this presentation is to base an extension of PCA to a PCA with k modes on the tensorial approach (see also Leibovici, 1993), by deriving singular values and the SVD of the tensor, in order to obtain a theorem similar to that of Eckart and Young (1936). In the second section, simple theoretical elements of the tensor product are described. Two further sections are devoted to the explanation of the SVD for a tensor of order 2, 3, or k. An algorithm to obtain the SVD-k will be shown in Section 5, and a generalization of the Eckart-Young theorem for a tensor of order k in Section 6. This last part leads to the elaboration of a method termed principal tensor analysis over k modes (PTA-k), which can be used as a standard method for multiway multidimensional analysis, as PCA is for multidimensional analysis.

2. TENSOR PRODUCT AND MULTIWAY ARRAYS

Firstly, it is essential to recall some definitions for a simple construction of the tensor product and some of the main properties of the calculus from which subsequent methodologies will be derived. These points are given in greater detail in Chambadal and Ovaert (1968), Schwartz (1975), Charles and Allouch (1984), and Lang (1984).

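DEFINITION 1.

(i) Let $E_1, \ldots, E_k$ be k Euclidean vector spaces of finite dimensions, with metrics $D_1, \ldots, D_k$. With a k-tuple $(a_1, \ldots, a_k)$ of vectors in these spaces, let the element denoted $a_1 \otimes a_2 \otimes \cdots \otimes a_k$ be a k-linear map on $E_1 \times E_2 \times \cdots \times E_k$ defined by

$$a_1 \otimes a_2 \otimes \cdots \otimes a_k\,(x_1, x_2, \ldots, x_k) = \prod_{t=1}^{k} \langle a_t, x_t \rangle_{E_t}, \qquad (1)$$

where $\langle\ ,\ \rangle_{E_t}$ indicates the inner product in $E_t$. This element is termed a decomposed tensor.

(ii) The space generated by all the decomposed tensors is termed the tensor product of the k spaces $E_1, \ldots, E_k$, and is denoted $E_1 \otimes E_2 \otimes \cdots \otimes E_k$. Its dimension is the product of the dimensions.

(iii) The inner product in $E_1 \otimes E_2 \otimes \cdots \otimes E_k$ is defined for decomposed tensors as

$$\langle a_1 \otimes \cdots \otimes a_k,\ b_1 \otimes \cdots \otimes b_k \rangle_{E_1 \otimes E_2 \otimes \cdots \otimes E_k} = \prod_{t=1}^{k} \langle a_t, b_t \rangle_{E_t}. \qquad (2)$$

Let $\{e_{j1}, \ldots, e_{jn_j}\}$ be a basis of $E_j$, and X and A be two tensors:

$$A = \sum_{i_1 i_2 \ldots i_k} A_{i_1 i_2 \ldots i_k}\, e_{1i_1} \otimes e_{2i_2} \otimes \cdots \otimes e_{ki_k}, \qquad X = \sum_{i_1 i_2 \ldots i_k} X_{i_1 i_2 \ldots i_k}\, e_{1i_1} \otimes e_{2i_2} \otimes \cdots \otimes e_{ki_k}.$$

Then

$$\langle X, A \rangle_{E_1 \otimes E_2 \otimes \cdots \otimes E_k} = \sum_{i_1 \ldots i_k} \sum_{j_1 \ldots j_k} X_{i_1 \ldots i_k} A_{j_1 \ldots j_k} \prod_{t=1}^{k} \langle e_{ti_t}, e_{tj_t} \rangle_{E_t} = {}^{t}\vec{x}\,(D_1 \otimes_K D_2 \otimes_K \cdots \otimes_K D_k)\,\vec{a},$$

where $\otimes_K$ means the Kronecker product, and $\vec{x}$ is the vectorialization of the tensor X, i.e., its representation as a vector of length $\dim(E_1 \otimes E_2 \otimes \cdots \otimes E_k)$ (and likewise $\vec{a}$ for A). This definition leads to the expression

$$(E_1 \otimes E_2 \otimes \cdots \otimes E_k)^* = E_1^* \otimes E_2^* \otimes \cdots \otimes E_k^*, \qquad (3)$$

where * means the dual space.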
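As a numerical illustration of Definition 1(iii), the sketch below checks on a small three-way array that the inner product computed coordinate by coordinate agrees with the vectorialized form using the Kronecker product of the metrics. It is only a sketch under stated assumptions: the dimensions, the random metrics, and the helper name `random_metric` are hypothetical choices, and the vectorialization is taken in NumPy's row-major (lexicographic) order.

```python
import numpy as np

rng = np.random.default_rng(0)
dims = (2, 3, 4)  # assumed sizes of E_1, E_2, E_3

def random_metric(m):
    """Hypothetical helper: a random symmetric positive definite m x m matrix."""
    B = rng.standard_normal((m, m))
    return B @ B.T + m * np.eye(m)

D1, D2, D3 = (random_metric(m) for m in dims)
X = rng.standard_normal(dims)
A = rng.standard_normal(dims)

# Coordinate form: sum over all index pairs, weighted by the metric entries.
inner_coord = np.einsum("abc,ai,bj,ck,ijk->", X, D1, D2, D3, A)

# Vectorialized form: row-major ravel matches np.kron's lexicographic ordering.
K = np.kron(np.kron(D1, D2), D3)
inner_kron = X.ravel() @ K @ A.ravel()

# For decomposed tensors the inner product is the product of the factor inner products.
a = [rng.standard_normal(m) for m in dims]
b = [rng.standard_normal(m) for m in dims]
T_a = np.einsum("a,b,c->abc", *a)
T_b = np.einsum("a,b,c->abc", *b)
decomposed_inner = T_a.ravel() @ K @ T_b.ravel()
decomposed_product = np.prod([u @ D @ v for u, D, v in zip(a, (D1, D2, D3), b)])

print(inner_coord, inner_kron)
print(decomposed_inner, decomposed_product)
```

The agreement depends on using the same index ordering for the vectorialization and for the Kronecker product, which is the fixed lexicographic choice the text attributes to the Kronecker product.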

(iv) Chambadal and Ovaert (1968) generalize the tensor product of two linear applications. Let $A_1 : E_1 \to F_1$ and $A_2 : E_2 \to F_2$; then let $A : E_1 \otimes E_2 \to F_1 \otimes F_2$ such that $A(x_1 \otimes x_2) = A_1(x_1) \otimes A_2(x_2)$; this unique linear application is expressed as $A = A_1 \otimes A_2$.

(v) A useful operation is proposed by Schwartz (1975), generalizing the assertion (3) defining the image of a vector by a linear application, as the contracted product of a vector by a tensor, here denoted ".." (no notation having been given by the author). It consists of tensor multiplication of the tensor and the vector, followed by contraction on the space to which the vector belongs. For example, let A be a tensor of $E \otimes F \otimes G$, and let $\{e_i\}_{1,n}$, $\{f_j\}_{1,q}$, and $\{g_k\}_{1,p}$ be bases of E, F, G:

$$A = \sum_{ijk} A_{ijk}\, e_i \otimes f_j \otimes g_k.$$

Consider a vector $z^* \in G^*$. Then

$$A..z^* = \sum_{ijk} A_{ijk}\, e_i \otimes f_j\, \langle g_k, z^* \rangle = \sum_{ijk} A_{ijk}\, e_i \otimes f_j\, \Big\langle g_k, \sum_{l} z_l\, g_l^* \Big\rangle = \sum_{ijk} A_{ijk}\, z_k\, e_i \otimes f_j. \qquad (4)$$

$A..z^*$ is an element of $E \otimes F$. With z an element of G, $A..z$ will often be expressed in the same way, explaining a contraction as an inner product. In (4), $\langle g_k, z^* \rangle$ is then changed to $\langle g_k, z \rangle_G$. Thus the inner product of two tensors can be seen as the contracted product between them, and so the metric may be expressed [see (iv)] as

$$\langle A, X \rangle_{E_1 \otimes E_2 \otimes \cdots \otimes E_k} = A..X = A..(D_1 \otimes D_2 \otimes \cdots \otimes D_k)..X. \qquad (5)$$
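The contracted product in (4) can be tried out on coordinates. The following sketch assumes canonical bases, so that a tensor of $E \otimes F \otimes G$ is simply an n x q x p array; the sizes and the metric `D_G` on G are hypothetical. It contracts the array with a vector on its last way, first by duality and then through a metric, and shows the inner product of two tensors as a full contraction.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, p = 2, 3, 4  # assumed dimensions of E, F, G
A = rng.standard_normal((n, q, p))
X = rng.standard_normal((n, q, p))
z = rng.standard_normal(p)

# A..z* : contraction over the G index, as in (4); the result lies in E (x) F.
Az = np.tensordot(A, z, axes=([2], [0]))
Az_explicit = sum(A[:, :, k] * z[k] for k in range(p))

# With z in G and a metric D_G on G, the contraction becomes an inner product in G.
B = rng.standard_normal((p, p))
D_G = B @ B.T + p * np.eye(p)  # hypothetical symmetric positive definite metric
Az_metric = np.einsum("ijk,kl,l->ij", A, D_G, z)

# Inner product of two tensors (identity metrics) as a contraction over all ways.
inner_full = np.tensordot(A, X, axes=3)

print(Az.shape, float(inner_full))
```

With identity metrics the two contractions coincide, which is the sense in which the text lets $A..z$ stand for both.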

REMARK 1.

(1) Note that (4) can be obtained by transforming A to a matrix with $\dim(E \otimes F)$ rows and $\dim(G)$ columns, expressed as

$$A_G, \quad \text{with } qn \text{ rows and } p \text{ columns}. \qquad (6)$$

Computing the image of z by this matrix leads to

$$\overrightarrow{A..z^*} = A_G\, z^*. \qquad (7)$$

If complete vectorialization is put into bijection, for example, $E \otimes F \otimes G$ and $L(\mathbb{R}; E \otimes F \otimes G)$, then the indexed vectorialization as in (6) identifies $E \otimes F \otimes G$ as $L(G^*; E \otimes F)$.

(2) The fundamental difference between $\otimes_K$ and $\otimes$ is that the Kronecker product operates with a specific and fixed choice of base (lexicographic order of indices), i.e., $\otimes$ is algebraic, whereas $\otimes_K$ is arithmetic (on coordinates). The advantage of the tensor product is the flexibility of its representations. They depend on the operation applied.

(3) Using the contracted product with the inner product enables one to have an underlying use of metrics.

There are several important properties of the tensor product which may be considered fundamental to factorial data analysis.

PROPERTY 1.

(a) Definition 1(iii) describes the universal property of the tensor product, which is generally taken for the definition and construction of the tensor product. For any bilinear map S on $E \times F$, the tensor product space implies the commutative diagram

$$E \times F \xrightarrow{\ \otimes\ } E \otimes F \xrightarrow{\ \text{linear}\ } \text{(target space of } S\text{)}, \quad \text{with composite } S \text{ (bilinear)}. \qquad (8)$$

(b) The tensor product of two subspaces of E and F is a subspace of $E \otimes F$.

(c) $E \otimes F^*$ is isomorphic to $L(F; E)$, the space of linear maps from F to E, and $E^* \otimes F$ is isomorphic to $L(E; F)$. Even if $E \otimes F \neq F \otimes E$, they are isomorphic.

(d) The operation $\otimes$ is associative.

By Property 1(c) a matrix is identified with a linear map and with a tensor of order two: $E \otimes F \simeq L(F^*; E) \simeq M(n, q; \mathbb{R})$. The factorial analysis methods can thus be described by tensor calculus. This approach can be generalized to an array with k ways, by consideration of the latter as a tensor of order k, i.e., an element of a tensor product of k vector spaces. In practice, and in our presentation, those spaces will be $\mathbb{R}^{m_t}$, where $m_t$ is the number of cells in way t.

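3. SINGULAR VALUES FOR TWO MODES

Let $S_X : E^* \times F^* \to \mathbb{R}$ be the bilinear map defined by $S_X(e_i^*, f_j^*) = X_{ij}$, with $\{e_i^*\}_{1,n}$, $\{f_j^*\}_{1,q}$ the canonical bases of the spaces. The universal property of the tensor product implies

$$E^* \times F^* \xrightarrow{\ \otimes\ } E^* \otimes F^* \xrightarrow{\ X\ } \mathbb{R}, \quad \text{with composite } S_X. \qquad (9)$$

Then for all $\psi^*$ and $\varphi^*$ in $E^*$ and $F^*$,

$$X..(\psi^* \otimes \varphi^*) = \langle X..\psi^*, \varphi^* \rangle = S_X(\psi^*, \varphi^*). \qquad (10)$$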

PROPERTY 2.

(i) The first singular value can be expressed by different maximizations:

$$\sigma_1 = \max_{\|\psi^*\|_{E^*} = 1,\ \|\varphi^*\|_{F^*} = 1} S_X(\psi^*, \varphi^*) = \max_{\|\psi^*\|_{E^*} = 1,\ \|\varphi^*\|_{F^*} = 1} \langle \psi^* \otimes \varphi^*, X \rangle = \max_{\|\psi\|_E = 1,\ \|\varphi\|_F = 1} X..(\psi \otimes \varphi) = \max_{\|\psi\|_E = 1,\ \|\varphi\|_F = 1} \langle \psi \otimes \varphi, X \rangle_{E \otimes F} = X..(\psi_1 \otimes \varphi_1). \qquad (11)$$

(ii) The tensor solution in (11) is unique up to an orthogonal transformation leaving X invariant.

Proof. It is a maximization of a continuous linear map over

$$\{\psi \otimes \varphi \mid \|\psi\|_E = 1,\ \|\varphi\|_F = 1\} \subset \{X \mid \|X\|_{E \otimes F} = 1\}, \qquad (12)$$

which is closed in a compact set (the unit sphere); thus it is itself compact. This implies the existence of $\sigma$. The uniqueness is because of the linear map. ∎

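In expressing the Lagrange problem associated with this maximization, the classical transition formulae which lead to the eigenequations of the well-known operators are found. In matrix form these are

$$X\varphi = \sigma\psi, \quad {}^{t}X\psi = \sigma\varphi; \qquad X\,{}^{t}X\psi = \sigma^2\psi, \quad {}^{t}XX\varphi = \sigma^2\varphi. \qquad (13)$$

If there are metrics D and Q on E and F respectively, (13) becomes

$$XQ\varphi = \sigma\psi, \quad {}^{t}XD\psi = \sigma\varphi; \qquad XQ\,{}^{t}XD\psi = \sigma^2\psi, \quad {}^{t}XDXQ\varphi = \sigma^2\varphi. \qquad (14)$$

In a tensorial form the transition formulae are

$$X..\varphi = \sigma\psi, \qquad X..\psi = \sigma\varphi, \qquad (15)$$

where X is the tensor equivalent to the matrix X.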
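The transition formulae with metrics can be checked numerically. The sketch below is one possible way to obtain a solution, not a procedure taken from the text: it assumes the metrics D and Q are symmetric positive definite, reduces the problem to an ordinary SVD through Cholesky factors (an assumption of this example), and then verifies the two transition formulae, the two eigenequations, and the unit norms in the metrics.

```python
import numpy as np

rng = np.random.default_rng(3)
n, q = 5, 4  # assumed dimensions of E and F
X = rng.standard_normal((n, q))

def random_metric(m):
    """Hypothetical helper: a random symmetric positive definite m x m matrix."""
    B = rng.standard_normal((m, m))
    return B @ B.T + m * np.eye(m)

D, Q = random_metric(n), random_metric(q)

# Cholesky factors: D = L_D L_D^T and Q = L_Q L_Q^T.
L_D = np.linalg.cholesky(D)
L_Q = np.linalg.cholesky(Q)

# Ordinary SVD of the transformed matrix Y = L_D^T X L_Q.
U, s, Vt = np.linalg.svd(L_D.T @ X @ L_Q, full_matrices=False)

# Map the first pair of singular vectors back to E and F.
sigma = s[0]
psi = np.linalg.solve(L_D.T, U[:, 0])
phi = np.linalg.solve(L_Q.T, Vt[0])

# Residuals of the transition formulae and of one eigenequation.
res_psi = X @ Q @ phi - sigma * psi
res_phi = X.T @ D @ psi - sigma * phi
res_eig = X @ Q @ X.T @ D @ psi - sigma**2 * psi

print(sigma, np.abs(res_psi).max(), np.abs(res_phi).max())
```

Setting D and Q to identity matrices reduces the check to the unweighted formulae, where `psi` and `phi` are simply the first left and right singular vectors of X.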

The other singular values can be obtained by consideration of orthogonality constraints in the optimization. The lemma given in Section 4 (for the generalization) enables us to affirm the existence and uniqueness. A priori there is a choice with regard to orthogonality in E or F or both. Let $E_1$ and $F_1$ be the subspaces generated by the first solution, so that

$$E \otimes F = (E_1 \oplus E_1^{\perp}) \otimes (F_1 \oplus F_1^{\perp}) = (E_1 \otimes F_1) \oplus (E_1^{\perp} \otimes F_1) \oplus (E_1 \otimes F_1^{\perp}) \oplus (E_1^{\perp} \otimes F_1^{\perp}) = (E_1 \otimes F_1) \oplus (E_1 \otimes F_1)^{\perp}; \qquad (16)$$

note that $E_1^{\perp} \otimes F_1^{\perp} \subset (E_1 \otimes F_1)^{\perp}$. Given the duality [(13) or (15)] in the first solution, the maximization with constraint in $(E_1 \otimes F_1)^{\perp}$ is in fact in $E_1^{\perp} \otimes F_1^{\perp}$, i.e., the projections of X on the other subspaces of the orthogonal space lead to the null tensor. Thereafter, this space is termed the orthogonal-tensorial space of $E_1$ and $F_1$.

REMARK 2. The well-known core matrix in the PCA-3 of Kroonenberg and De Leeuw (1980) derives from this observation and from the lack of duality in these solutions. That is to say, for three modes the solution in the orthogonal space of the first solution is not always in the orthogonal-tensorial space of the preceding solution.

After reiterating the process of solution for singular values or, in this case, after diagonalization (13), an orthogonal decomposition of the tensor, the singular values decomposition SVD-2, may be expressed as

$$X = \sum_{s=1}^{\operatorname{rank} X} \sigma_s\, \psi_s \otimes \varphi_s, \qquad (17)$$

or in matrix form,

$$X = \sum_{s=1}^{\operatorname{rank} X} \sigma_s\, \psi_s\, {}^{t}\varphi_s. \qquad (18)$$
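The reiteration described above can be sketched as follows: find the first singular tensor by alternating the two transition formulae, project the array onto the orthogonal-tensorial space of that solution, and repeat. This is a hedged illustration with identity metrics, not the algorithm of the paper; the function names `first_singular_tensor` and `svd2_by_deflation`, the fixed iteration count, and the test matrix with prescribed singular values are all assumptions of the example.

```python
import numpy as np

def first_singular_tensor(X, n_iter=200, seed=0):
    """Alternate psi <- X phi and phi <- tX psi (normalized) for a fixed number of steps."""
    rng = np.random.default_rng(seed)
    phi = rng.standard_normal(X.shape[1])
    phi /= np.linalg.norm(phi)
    for _ in range(n_iter):
        psi = X @ phi
        psi /= np.linalg.norm(psi)
        phi = X.T @ psi
        phi /= np.linalg.norm(phi)
    sigma = psi @ X @ phi
    return sigma, psi, phi

def svd2_by_deflation(X, rank):
    """Collect `rank` terms, projecting onto the orthogonal-tensorial space each time."""
    R = X.copy()
    terms = []
    for _ in range(rank):
        sigma, psi, phi = first_singular_tensor(R)
        terms.append((sigma, psi, phi))
        P_E = np.eye(X.shape[0]) - np.outer(psi, psi)  # projector on the orthogonal of E_1
        P_F = np.eye(X.shape[1]) - np.outer(phi, phi)  # projector on the orthogonal of F_1
        R = P_E @ R @ P_F
    return terms

# Test matrix with prescribed singular values 5, 3, 2, 1 (an assumption of the example).
rng = np.random.default_rng(42)
U, _ = np.linalg.qr(rng.standard_normal((5, 4)))
V, _ = np.linalg.qr(rng.standard_normal((4, 4)))
X = U @ np.diag([5.0, 3.0, 2.0, 1.0]) @ V.T

terms = svd2_by_deflation(X, rank=4)
sigmas = np.array([t[0] for t in terms])
X_rebuilt = sum(s * np.outer(psi, phi) for s, psi, phi in terms)

print(sigmas)
```

For two modes the projection onto the orthogonal-tensorial space coincides with subtracting the rank-one term, which is why the sum of the collected terms rebuilds X as in (18). Remark 2 indicates that this coincidence should not be expected for three modes.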

The well-known matrix approximation theorem may thus be formulated; it permits the performance of a PCA:

THEOREM 1 (Eckart and Young, 1936). The best rank r (r < q) approximation of a rank q matrix X, according to the norm coming from the inner product in $E \otimes F$, is given by the matrix built with the first r tensors of the SVD:

$$X_r = \sum_{s=1}^{r} \sigma_s\, \psi_s \otimes \varphi_s,$$

the squared distance being

$$\min_{Z,\ \operatorname{rank} Z = r} \|X - Z\|^2 = \|X - X_r\|^2 = \sum_{s=r+1}^{q} \sigma_s^2.$$
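Theorem 1 can be checked numerically for identity metrics. In the sketch below the matrix, the rank r, and the random competitors are hypothetical; the code builds the truncated SVD, compares its squared error with the sum of the discarded squared singular values, and confirms that a few random rank r matrices do no better.

```python
import numpy as np

rng = np.random.default_rng(4)
n, q, r = 6, 4, 2  # assumed sizes and target rank
X = rng.standard_normal((n, q))

# Truncated SVD: keep the first r singular triplets.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_r = (U[:, :r] * s[:r]) @ Vt[:r]

err_best = np.linalg.norm(X - X_r) ** 2  # squared Frobenius distance
tail = np.sum(s[r:] ** 2)                # sum of the discarded squared singular values

# Random rank r competitors Z = B C.
errs_random = [
    np.linalg.norm(X - rng.standard_normal((n, r)) @ rng.standard_normal((r, q))) ** 2
    for _ in range(20)
]

print(err_best, tail, min(errs_random))
```

The random competitors only illustrate the inequality; they are not a proof of optimality.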
