Recursive Exponentially Weighted N-way Partial Least Squares Regression with Recursive-Validation of Hyper-Parameters in Brain-Computer Interface Applications

Andrey Eliseyev1*, Vincent Auboiroux1, Thomas Costecalde1, Lilia Langar2, Guillaume Charvet1, Corinne Mestais1, Tetiana Aksenova1, Alim-Louis Benabid1

1 Univ. Grenoble Alpes, CEA, LETI, CLINATEC, MINATEC Campus, 38000 Grenoble, France

2 Centre Hospitalier Universitaire Grenoble Alpes, 38700 La Tronche, France

* Corresponding author: [email protected]

Appendix A REW-NPLS algorithm

Input: $\mathbf{X}^{(t)}$, $\mathbf{Y}^{(t)}$, $\mathbf{C}_{XX}^{(t-1)}$, $\mathbf{C}_{XY}^{(t-1)}$, $\{\mathbf{w}_f^1, \ldots, \mathbf{w}_f^n\}^{(t-1)}$, forgetting coefficient $\lambda$, maximum number of factors $F_{\max}$.

Output: $\widetilde{\mathbf{B}}^{(t)}$, $\mathbf{Y}_0^{(t)}$, $\mathbf{C}_{XX}^{(t)}$, $\mathbf{C}_{XY}^{(t)}$, $\{\mathbf{w}_f^1, \ldots, \mathbf{w}_f^n\}^{(t)}$.

1. $F_{\mathrm{RV}}^{*} = \mathbf{RecursiveValidation}(\mathbf{X}^{(t)}, \mathbf{Y}^{(t)})$ % $F_{\mathrm{RV}}^{*}$ provides the minimum error, taking into account the current and previous datasets
2. $\{\mathbf{X}_{\mathrm{Norm}}^{(t)}, \mathbf{Y}_{\mathrm{Norm}}^{(t)}, \boldsymbol{\mu}_X, \boldsymbol{\mu}_Y, \boldsymbol{\sigma}_X, \boldsymbol{\sigma}_Y\} = \mathbf{Normalization}(\mathbf{X}^{(t)}, \mathbf{Y}^{(t)})$ % normalization of the new data
3. $\mathbf{C}_{XX}^{(t)} = \lambda \mathbf{C}_{XX}^{(t-1)} + \mathbf{X}_{\mathrm{Norm}}^{(t)} \times_1 \mathbf{X}_{\mathrm{Norm}}^{(t)}$ % updating of the covariance tensors
4. $\mathbf{C}_{XY}^{(t)} = \lambda \mathbf{C}_{XY}^{(t-1)} + \mathbf{X}_{\mathrm{Norm}}^{(t)} \times_1 \mathbf{Y}_{\mathrm{Norm}}^{(t)}$
5. $\mathbf{C}_{XX}^{(t)} = \mathbf{Reshape}(\mathbf{C}_{XX}^{(t)}) \in \mathbb{R}^{(I_1 \cdots I_n) \times (I_1 \cdots I_n)}$ % reshape tensor to matrix
6. $\mathbf{C}_{XY}^{(t)} = \mathbf{Reshape}(\mathbf{C}_{XY}^{(t)}) \in \mathbb{R}^{(I_1 \cdots I_n) \times (J_1 \cdots J_m)}$
7. $\mathbf{P} = \mathbf{0} \in \mathbb{R}^{(I_1 \cdots I_n) \times F_{\max}}$, $\mathbf{R} = \mathbf{0} \in \mathbb{R}^{(I_1 \cdots I_n) \times F_{\max}}$, $\mathbf{Q} = \mathbf{0} \in \mathbb{R}^{(J_1 \cdots J_m) \times F_{\max}}$ % initialization of the matrices P, R, Q with zeros
8. for $f = 1, \ldots, F_{\max}$
9.   if $m > 1$: $\widetilde{\mathbf{q}} = \mathbf{eig}(\mathbf{C}_{XY}^{(t)\,T} \mathbf{C}_{XY}^{(t)})$, else $\widetilde{\mathbf{q}} = \mathbf{C}_{XY}^{(t)}$ % the eigenvector with the largest eigenvalue
10.  $\mathbf{C} = \mathbf{Reshape}(\mathbf{C}_{XY}^{(t)} \widetilde{\mathbf{q}}) \in \mathbb{R}^{I_1 \times \ldots \times I_n}$ % reshape vector to tensor
11.  $\{\mathbf{w}_f^1, \ldots, \mathbf{w}_f^n\}^{(t)} = \mathbf{PARAFAC}(\mathbf{C}, \{\mathbf{w}_f^1, \ldots, \mathbf{w}_f^n\}^{(t-1)})$ % PARAFAC decomposition from the initial approximation
12.  $\mathbf{w} = \mathbf{w}_f^1 \circ \ldots \circ \mathbf{w}_f^n$ % tensor formation
13.  $\mathbf{w} = \mathbf{Reshape}(\mathbf{w}) \in \mathbb{R}^{(I_1 \cdots I_n)}$ % reshape tensor to vector
14.  $\mathbf{w} = \mathbf{w} / \|\mathbf{w}\|$ % normalization
15.  $\mathbf{r} = \mathbf{w}$
16.  for $k = 1, \ldots, f-1$
17.    $\mathbf{r} = \mathbf{r} - (\mathbf{P}(:,k)^T \mathbf{w})\, \mathbf{R}(:,k)$
18.  end for
19.  $\tau = \mathbf{r}^T \mathbf{C}_{XX}^{(t)} \mathbf{r}$
20.  $\mathbf{p} = (\mathbf{r}^T \mathbf{C}_{XX}^{(t)})^T / \tau$, $\mathbf{q} = (\mathbf{r}^T \mathbf{C}_{XY}^{(t)})^T / \tau$
21.  $\mathbf{C}_{XY}^{(t)} = \mathbf{C}_{XY}^{(t)} - \tau \mathbf{p} \mathbf{q}^T$
22.  $\mathbf{Q}(:,f) = \mathbf{q}$, $\mathbf{P}(:,f) = \mathbf{p}$, $\mathbf{R}(:,f) = \mathbf{r}$
23.  $\mathbf{B}^f = \mathbf{R}(:,1{:}f)\, \mathbf{Q}(:,1{:}f)^T$ % matrix of the regression coefficients for the current $f$
24.  $\mathbf{B}^f = \mathbf{Reshape}(\mathbf{B}^f) \in \mathbb{R}^{I_1 \times \ldots \times I_n \times J_1 \times \ldots \times J_m}$ % reshape matrix to tensor
25.  $\widetilde{\mathbf{B}}^f = \mathbf{B}^f \boldsymbol{\sigma}_Y / \boldsymbol{\sigma}_X$, $\mathbf{Y}_0^f = \boldsymbol{\mu}_Y - \boldsymbol{\mu}_X \widetilde{\mathbf{B}}^f$ % the regression coefficients and bias for the non-normalized data (see the Normalization algorithm)
   end for
26. $\widetilde{\mathbf{B}}^{(t)} = \widetilde{\mathbf{B}}^{F_{\mathrm{RV}}^{*}}$, $\mathbf{Y}_0^{(t)} = \mathbf{Y}_0^{F_{\mathrm{RV}}^{*}}$ % the optimal model to be used for prediction

- The $n$-mode vector product of a tensor, "$\times_n$"; see [57].
- PARAFAC algorithm description; see [32].
- The vector outer product, "$\circ$"; see [57].
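
For readers who want to experiment with the update, the following is a minimal NumPy sketch of steps 3-23 for the special case $n = m = 1$ (matrix-valued $\mathbf{X}^{(t)}$ and $\mathbf{Y}^{(t)}$), in which the PARAFAC step degenerates to taking $\mathbf{w} = \mathbf{C}$ directly. It assumes the data blocks were already normalized (Appendix B), applies the step-21 deflation to a working copy of $\mathbf{C}_{XY}^{(t)}$ so that the undeflated covariance can be carried to the next update, and uses function and variable names of our own choosing, not the paper's.

```python
# Sketch of one REW-NPLS update for matrix data (n = m = 1); illustration only.
import numpy as np

def rew_npls_update(X, Y, Cxx, Cxy, lam=0.99, F_max=5):
    """X: (N, I) and Y: (N, J) normalized data block; Cxx: (I, I), Cxy: (I, J)
    covariances from the previous update. Returns updated covariances and a
    list B of coefficient matrices, one per factor count f."""
    I, J = X.shape[1], Y.shape[1]
    Cxx = lam * Cxx + X.T @ X                   # step 3: weighted covariance update
    Cxy = lam * Cxy + X.T @ Y                   # step 4
    P = np.zeros((I, F_max))                    # step 7: loading matrices
    R = np.zeros((I, F_max))
    Q = np.zeros((J, F_max))
    B = []
    Cxy_f = Cxy.copy()                          # working copy deflated in the loop
    for f in range(F_max):
        # step 9 (m > 1 branch): eigenvector with the largest eigenvalue
        _, eigvecs = np.linalg.eigh(Cxy_f.T @ Cxy_f)
        q_tilde = eigvecs[:, -1]
        w = Cxy_f @ q_tilde                     # steps 10-13 collapse to a vector
        w /= np.linalg.norm(w)                  # step 14: normalization
        r = w.copy()                            # steps 15-18: orthogonalization
        for k in range(f):
            r -= (P[:, k] @ w) * R[:, k]
        tau = r @ Cxx @ r                       # step 19
        p = (r @ Cxx) / tau                     # step 20
        q = (r @ Cxy_f) / tau
        Cxy_f = Cxy_f - tau * np.outer(p, q)    # step 21: deflation
        P[:, f], R[:, f], Q[:, f] = p, r, q     # step 22
        B.append(R[:, :f + 1] @ Q[:, :f + 1].T) # step 23: coefficients for this f
    return Cxx, Cxy, B
```

The returned list B holds one coefficient matrix per factor count $f$; the Recursive Validation step (Appendix C) then selects the entry with the smallest accumulated error.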

Appendix B Normalization algorithm

Input: $\mathbf{X}^{(t)} \in \mathbb{R}^{N_t \times I_1 \times \ldots \times I_n}$, $\mathbf{Y}^{(t)} \in \mathbb{R}^{N_t \times J_1 \times \ldots \times J_m}$, $N^{\mathrm{eff}(t-1)}$, $\mathbf{S}_X^{\mathrm{eff}(t-1)}$, $\mathbf{SS}_X^{\mathrm{eff}(t-1)}$, $\mathbf{S}_Y^{\mathrm{eff}(t-1)}$, $\mathbf{SS}_Y^{\mathrm{eff}(t-1)}$, forgetting coefficient $\lambda$.

Output: $\mathbf{X}_{\mathrm{Norm}}^{(t)}$, $\mathbf{Y}_{\mathrm{Norm}}^{(t)}$, $N^{\mathrm{eff}(t)}$, $\mathbf{S}_X^{\mathrm{eff}(t)}$, $\mathbf{SS}_X^{\mathrm{eff}(t)}$, $\mathbf{S}_Y^{\mathrm{eff}(t)}$, $\mathbf{SS}_Y^{\mathrm{eff}(t)}$, $\boldsymbol{\mu}_X$, $\boldsymbol{\sigma}_X$, $\boldsymbol{\mu}_Y$, $\boldsymbol{\sigma}_Y$.

1. $N^{\mathrm{eff}(t)} = \lambda N^{\mathrm{eff}(t-1)} + N_t$ % effective number of points
2. $\mathbf{S}_X^{\mathrm{eff}(t)} = \lambda \mathbf{S}_X^{\mathrm{eff}(t-1)} + \mathbf{X}^{(t)} \times_1 \mathbf{1}_{N_t}$
3. $\mathbf{S}_Y^{\mathrm{eff}(t)} = \lambda \mathbf{S}_Y^{\mathrm{eff}(t-1)} + \mathbf{Y}^{(t)} \times_1 \mathbf{1}_{N_t}$
4. $\boldsymbol{\mu}_X = \mathbf{S}_X^{\mathrm{eff}(t)} / N^{\mathrm{eff}(t)}$, $\boldsymbol{\mu}_Y = \mathbf{S}_Y^{\mathrm{eff}(t)} / N^{\mathrm{eff}(t)}$ % mean(X), mean(Y)
5. $\mathbf{SS}_X^{\mathrm{eff}(t)} = \lambda \mathbf{SS}_X^{\mathrm{eff}(t-1)} + \mathbf{X}^{(t)*2} \times_1 \mathbf{1}_{N_t}$
6. $\mathbf{SS}_Y^{\mathrm{eff}(t)} = \lambda \mathbf{SS}_Y^{\mathrm{eff}(t-1)} + \mathbf{Y}^{(t)*2} \times_1 \mathbf{1}_{N_t}$
7. $\boldsymbol{\sigma}_X = \sqrt{\left(\mathbf{SS}_X^{\mathrm{eff}(t)} - (\mathbf{S}_X^{\mathrm{eff}(t)})^{*2} / N^{\mathrm{eff}(t)}\right) / (N^{\mathrm{eff}(t)} - 1)}$ % std(X)
8. $\boldsymbol{\sigma}_Y = \sqrt{\left(\mathbf{SS}_Y^{\mathrm{eff}(t)} - (\mathbf{S}_Y^{\mathrm{eff}(t)})^{*2} / N^{\mathrm{eff}(t)}\right) / (N^{\mathrm{eff}(t)} - 1)}$ % std(Y)
9. $\mathbf{X}_{\mathrm{Norm}}^{(t)} = (\mathbf{X}^{(t)} - \boldsymbol{\mu}_X) / \boldsymbol{\sigma}_X$ % element-wise normalization of the tensor X
10. $\mathbf{Y}_{\mathrm{Norm}}^{(t)} = (\mathbf{Y}^{(t)} - \boldsymbol{\mu}_Y) / \boldsymbol{\sigma}_Y$ % element-wise normalization of the tensor Y
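
As a companion sketch under the same assumptions, the recursive statistics above can be written in a few lines of NumPy. A sum over the first array axis replaces the $\times_1 \mathbf{1}_{N_t}$ products (and works for arrays of any order), and an eps guard, which is not in the paper, protects against a zero standard deviation. The function name is ours.

```python
# Sketch of the recursive Normalization update; illustration only.
import numpy as np

def recursive_normalize(X, N_eff, S, SS, lam=0.99, eps=1e-12):
    """Update exponentially weighted mean/std statistics with a new block X
    of shape (Nt, ...) and return the normalized block."""
    Nt = X.shape[0]
    N_eff = lam * N_eff + Nt                    # step 1: effective number of points
    S = lam * S + X.sum(axis=0)                 # steps 2-3: weighted sums
    SS = lam * SS + (X ** 2).sum(axis=0)        # steps 5-6: weighted sums of squares
    mu = S / N_eff                              # step 4: running mean
    sigma = np.sqrt((SS - S ** 2 / N_eff) / (N_eff - 1))   # steps 7-8: running std
    X_norm = (X - mu) / np.maximum(sigma, eps)  # steps 9-10, guarded against zero std
    return X_norm, N_eff, S, SS, mu, sigma
```

The same function is applied independently to the X and Y blocks, each with its own running statistics.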

Appendix C Recursive Validation algorithm

Input: $\mathbf{X}^{(t)}$, $\mathbf{Y}^{(t)}$, $\{\widetilde{\mathbf{B}}^f, \mathbf{Y}_0^f\}_{f=1}^{F_{\max}}$, $\{e_f^{(t-1)}\}_{f=1}^{F_{\max}}$, forgetting coefficient $\gamma$.

Output: $F_{(t)}^{*}$, $\{e_f^{(t)}\}_{f=1}^{F_{\max}}$.

1. for $f = 1, \ldots, F_{\max}$
2.   $\widehat{\mathbf{Y}}^f = \mathbf{X}^{(t)} \widetilde{\mathbf{B}}^f + \mathbf{Y}_0^f$ % estimation of the prediction on the new data with the old model
3.   $e_f^{(t)} = \gamma e_f^{(t-1)} + \mathbf{ERROR}(\widehat{\mathbf{Y}}^f, \mathbf{Y}^{(t)})$ % error between predicted and observed values on the new data, plus previous errors
4. end for
5. $F_{(t)}^{*} = \arg\min_f \{e_f^{(t)}\}_{f=1}^{F_{\max}}$ % find the number of factors providing the minimal error
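
Finally, a sketch of the Recursive Validation loop under the same matrix-case assumptions as above. The paper leaves the $\mathbf{ERROR}$ metric generic, so mean squared error is used here purely as an example; the function name is ours.

```python
# Sketch of Recursive Validation for matrix data; illustration only.
import numpy as np

def recursive_validation(X, Y, models, errors, gamma=0.99):
    """models: list of (B_f, Y0_f) pairs; errors: accumulated e_f^(t-1) per
    factor count. Returns the optimal F* (1-based) and the updated errors."""
    new_errors = np.empty(len(models))
    for f, (B_f, Y0_f) in enumerate(models):
        Y_hat = X @ B_f + Y0_f                   # step 2: predict with the old model
        err = np.mean((Y_hat - Y) ** 2)          # ERROR(Y_hat, Y), here MSE
        new_errors[f] = gamma * errors[f] + err  # step 3: exponential accumulation
    F_star = int(np.argmin(new_errors)) + 1      # step 5: best number of factors
    return F_star, new_errors
```

Because the accumulated errors are exponentially discounted by $\gamma$, the selected factor count can track slow changes in the signal, which is the point of validating the hyper-parameter recursively rather than on a fixed held-out set.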
