A Comparison of Column Subset Selection Methods

SPIE Defense & Commercial Sensing, April 17, 2018
Maher Aldeghlawi and Miguel Velez-Reyes
Electrical and Computer Engineering Department
University of Texas at El Paso
500 West University Avenue, El Paso, Texas 79968

Outline
• Background
  – Motivation
  – Linear Mixing Model (LMM)
  – Geometry of the LMM
  – Hyperspectral Unmixing
    Standard two-stage unmixing
    Geometric endmember extraction
  – Column Subset Selection Problem (CSSP)
    CSSP for endmember extraction
• Experimental Results
• Conclusions

SPIE, April 17, 2018

Defense & Commercial sensing


Motivation
• Matrix factorizations (SVD, RRQR) are widely used in data analytics
  – Dimensionality reduction
  – Feature extraction
• Long history of computational tools (numerical linear algebra) available
• Interest in using them for hyperspectral image analysis
• Studying the application of column subset selection algorithms to
  – Band subset selection (tall matrices)
  – Endmember extraction (wide matrices)


Linear Mixing Model

$$\mathbf{x}_j = \sum_{i=1}^{p} a_{ji}\,\mathbf{e}_i$$

where
• $p$ is the number of endmembers
• $\mathbf{e}_i$ is the spectral signature of endmember $i$
• $a_{ji}$ is the fractional abundance of endmember $i$ in pixel $j$
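As a minimal sketch of the mixing sum above, the following Python snippet forms one mixed pixel from a few endmember spectra. The function name `mix_pixel`, the toy spectra, and the assumption that the abundances are non-negative and sum to one are illustrative choices, not taken from the slides.

```python
def mix_pixel(endmembers, abundances):
    """Return x = sum_i a_i * e_i for a single pixel.

    endmembers: list of p spectra, each a list of m band values.
    abundances: list of p fractional abundances for this pixel.
    """
    if len(endmembers) != len(abundances):
        raise ValueError("need exactly one abundance per endmember")
    n_bands = len(endmembers[0])
    return [
        sum(a * e[band] for a, e in zip(abundances, endmembers))
        for band in range(n_bands)
    ]


# Two hypothetical endmember spectra with three bands each.
E = [[1.0, 0.0, 2.0],
     [0.0, 4.0, 2.0]]
a = [0.25, 0.75]  # assumed: non-negative, summing to one
x = mix_pixel(E, a)
```

Under the sum-to-one assumption every mixed pixel is a convex combination of the endmembers, which is what places the data inside a simplex with the endmembers at its vertices.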


Linear Mixing Model: Geometric Interpretation

$$\mathbf{x} = \mathbf{E}\mathbf{a}$$


Hyperspectral Unmixing

[Figure: the mixing and unmixing processes]


Standard Two-Stage Approach for Unmixing


Endmember Extraction based on Geometric Properties
• Endmembers are the vertices of the simplex
  – Examples: PPI (Boardman 1994), SMACC (Gruninger 2004), VCA (Nascimento 2005)
• The volume of the simplex formed by the endmembers is larger than the volume formed from any other combination of pixels
  – Example: NFINDR (Winter 1999)


Unfolding the Image Cube Into a Matrix

[Figure: the image cube is unfolded into a matrix $\mathbf{X} \in \mathbb{R}^{m \times N}$]

Column Subset Selection Problem (CSSP)
• Let $\mathbf{X} \in \mathbb{R}^{m \times N}$ be a matrix (e.g. the unfolded hyperspectral cube).
• CSSP can be stated as the problem of finding a permutation matrix $\mathbf{P}$ such that
  $$\mathbf{X}\mathbf{P} = [\,\mathbf{X}_1 \;\; \mathbf{X}_2\,]$$
  where $\mathbf{X}_1 \in \mathbb{R}^{m \times p}$ is the matrix of selected columns and $\mathbf{P}$ is selected to satisfy some optimality criterion.
• CSSP is widely studied in linear algebra and data mining.


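Column Subset Selection Problem (CSSP)

• The standard CSSP can be stated as the following approximation problem:

$$\min_{\mathbf{P}} \left\| \mathbf{X} - \mathbf{X}_1 \mathbf{X}_1^{\#} \mathbf{X} \right\|_F^2 \quad \text{s.t. } \mathbf{X}\mathbf{P} = [\,\mathbf{X}_1 \;\; \mathbf{X}_2\,]$$

$$= \min_{\mathbf{P}} \left\| \mathbf{X} - \mathbf{X}_1 \mathbf{C} \right\|_F^2 \quad \text{s.t. } \mathbf{X}\mathbf{P} = [\,\mathbf{X}_1 \;\; \mathbf{X}_2\,],\; \mathbf{C} = \mathbf{X}_1^{\#} \mathbf{X}$$

• The optimal $\mathbf{P}$ chooses the columns that best predict the other columns in terms of the residual error.
• The selected columns $\mathbf{X}_1$ are called the “most representative” columns.
• This is a combinatorial optimization problem.
• Algorithms that choose a “good” column subset have been proposed in the linear algebra literature.

[Figure: $\mathbf{X}$, its approximation $\mathbf{X}_1\mathbf{C}$ in the range space of $\mathbf{X}_1$, and the residual $e = \left\| \mathbf{X} - \mathbf{X}_1 \mathbf{C} \right\|_F^2$, drawn on axes D1, D2, D3]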
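The objective above can be made concrete with a small, dependency-free Python sketch. It assumes that $\mathbf{X}_1^{\#}$ denotes the Moore-Penrose pseudoinverse, so that $\mathbf{X}_1 \mathbf{X}_1^{\#} \mathbf{X}$ is the orthogonal projection of $\mathbf{X}$ onto the range space of $\mathbf{X}_1$; the residual is then obtained by Gram-Schmidt deflation instead of forming the pseudoinverse. The helper names, the list-of-columns data layout, and the toy data are hypothetical. The exhaustive search is there only to show why the problem is combinatorial, and it is not one of the algorithms compared in the talk.

```python
from itertools import combinations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-12):
    """Modified Gram-Schmidt basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(q, w)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip columns already in the span
            basis.append([wi / norm for wi in w])
    return basis


def cssp_residual(columns, selected):
    """Squared Frobenius norm of X - X1 X1^# X.

    columns:  the matrix X stored as a list of its N columns.
    selected: indices of the p columns that form X1.
    """
    basis = orthonormal_basis([columns[k] for k in selected])
    total = 0.0
    for x in columns:
        r = list(x)
        for q in basis:  # remove the part of x lying in range(X1)
            c = dot(q, r)
            r = [ri - c * qi for ri, qi in zip(r, q)]
        total += dot(r, r)
    return total


def best_subset(columns, p):
    """Exhaustive search over all p-column subsets (tiny N only)."""
    candidates = combinations(range(len(columns)), p)
    return min(candidates, key=lambda s: cssp_residual(columns, s))


# Hypothetical data: m = 3 bands, N = 4 pixels stored column by column.
cols = [[1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.5, 0.5, 0.0],
        [0.1, 0.1, 0.2]]
best = best_subset(cols, 2)
```

The number of candidate subsets is the binomial coefficient of N and p, which is why practical methods rely on factorizations or greedy rules rather than enumeration.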

CSS, Why and Where?
• Motivation
  – Data interpretation through identifying relevant columns
  – Speed-up by performing computationally expensive operations on a small column subset
• Some applications in data analysis
  – HSI Band Subset Selection (Velez-Reyes 1998)
  – Neuroimaging data (NNCN) (Strauch 2014)
  – Cardiovascular and respiratory modeling (Ellwein 2016)
  – Population genetics summarization (Khan 2015)
  – Electronic circuits testing (Abadir 2013)
  – Recommendation systems (Amatriain 2011)
  – Machine learning and statistics (variable selection) (Altschuler 2016)


CSS for Endmember Extraction
• Let $\mathbf{X} \in \mathbb{R}^{m \times N}$ be the unfolded hyperspectral cube
• CSS can be used to select a subset of pixels that predict the other pixels with a low residual
  – $\mathbf{X}_1$ → Representative Pixels
• It can be shown that the optimal low-residual solution for the CSSP also solves the problem of finding the maximum-volume sub-matrix of a matrix (Çivril 2009)
  – Relates to the simplex volume maximization problem (e.g. NFINDR)


Examples of Algorithms to Solve the CSSP
• DETERMINISTIC ALGORITHMS
  – Singular value decomposition (SVD) CSS (G. H. Golub and C. Reinsch 1970). We refer to this algorithm as SVDSS.
  – Rank-revealing QR (RRQR) factorization CSS
    QR factorization with pivoting CSS (G. H. Golub and C. Reinsch 1970)
    RRQR CSS (T. F. Chan 1987)
  – Greedy Nyström approximation CSS (A. K. Farahat et al. 2011)
• RANDOMIZED ALGORITHMS
  – A two-stage algorithm for the CSSP (C. Boutsidis et al. 2009)
  – Greedy CSS (A. K. Farahat et al. 2013)
  – Random greedy CSS (A. K. Farahat et al. 2014)
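To give a flavor of the deterministic family, here is a hedged Python sketch of greedy column pivoting: at each step it picks the column with the largest remaining norm and removes its direction from all columns, which is the pivoting rule behind QR factorization with column pivoting. It is written as plain Gram-Schmidt for readability; it is not the RRQR algorithm of Chan, not SVDSS, and not the authors' Matlab implementation. The function name `pivoted_column_select`, the tolerance, and the toy data are assumptions.

```python
def pivoted_column_select(columns, p, tol=1e-12):
    """Pick up to p column indices by greedy column pivoting.

    columns: the matrix stored as a list of its columns.
    Returns the selected indices in the order they were chosen.
    """
    residuals = [list(c) for c in columns]
    selected = []
    for _ in range(p):
        norms = [sum(v * v for v in r) for r in residuals]
        k = max(range(len(residuals)), key=norms.__getitem__)
        if norms[k] <= tol:
            break  # everything left lies in the span already chosen
        selected.append(k)
        scale = norms[k] ** 0.5
        q = [v / scale for v in residuals[k]]
        # Deflate: remove the component along q from every column.
        for j, r in enumerate(residuals):
            c = sum(qi * ri for qi, ri in zip(q, r))
            residuals[j] = [ri - c * qi for ri, qi in zip(r, q)]
    return selected


# Hypothetical data: 3 bands, 4 pixels stored column by column.
cols = [[3.0, 0.0, 0.0],
        [0.0, 2.0, 0.0],
        [1.5, 1.0, 0.0],
        [0.3, 0.2, 0.1]]
picked = pivoted_column_select(cols, 2)
```

For real images a library routine is the better choice; for example, SciPy's `scipy.linalg.qr` accepts `pivoting=True` and returns the column permutation, whose first p entries play the role of the selected columns.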


Experiments
• Multiple data sets
  – Simulated (4 endmembers)
  – HYDICE Urban
• Endmember extraction using
  – PPI (Boardman 1994)
  – VCA (Nascimento 2005)
  – NFINDR (Winter 1999)
  – SVDSS (G. H. Golub and C. Reinsch 1970)
  – Implemented in Matlab
• The volume of the generated p-simplex is computed for the extracted endmembers


Experiments (cont.) • Volume of a p-simplex in an m-dimensional space (p
