SVD for dimensionality reduction of hyperspectral images. .... C. Egho, & T. Vladimirova, âAdaptive hyperspectral image compression using the KLT and integer ...
HyspIRI Symposium-2015 NASA Goddard Spaceflight Center
An Overflow-Free Fixed-point Singular Value Decomposition Algorithm for Dimensionality Reduction of Hyperspectral Images Bibek Kabi, Anand S. Sahadevan, Ramanarayan Mohanty Aurobinda Routray, Bhabani. S. Das, Anmol Mohanty Indian Institute of Technology Kharagpur
Motivation Hyperspectral Algorithms
Floating-point Program
Hardware
Conversion to Fixed-Point
Floating-Point Processor
Uniform Wordlength
Fixed-Point Processor Wordlength Optimization
Fixed- Point FPGA and ASIC Optimized Wordlength
FPGA: Field Programmable Gate Array ASIC: Application Specific Integrated Circuit
Price
Power Consumption
Overflowing
Hyperspectral Image
My Cost
Processed Image
Previous Works on Numerical Linear Algebra Algorithms in Fixed-Point
Szecówka and Malinowski, 2010
Nikolic et al., 2007 *SVD – Singular Value Decomposition
Jerez et al., 2015
Overflow & Cost Diverse Range
Scale : Log
2
Large Range
Hyperspectral Images for Validation
• Hyperion (Space-borne): Hyperion image contains the Chilika Lake site, India. • ROSIS (Air-borne): ROSIS image contains Pavia, University site, Italy.
Overflow Data Hyperion ROSIS
Variables Variables A a σ
Hyperion (IWLs) 19 47 24
IWLs - Integer Wordlengths
ROSIS (IWLs) 20 49 25
PCs PC1 PC2 PC3 PC4 PC5
SQNR 3.73 10.01
MSE Hyperion 2.0628e+05 1.0199e+03 1.5305e+04 262.3714
SQNR - Signal-to-quantization-noise-ratio
ROSIS 6.7682e+04 6.5052e+03 4.6489e+03 1.3140e+03 3.2786e+03
PCs - Principal Components
Hyperion
MSE - Mean Squared Error
ROSIS
Related Articles General trend in previous studies Fixed-point SVD implementation uses the simulation-based approach for estimating the ranges of variables. It can’t produce promising bounds Jerez et al., 2015 applied a scaling procedure to control the bounds on variables of Lanczos algorithm. Similar scaling procedure cannot be used for SVD because the spectrum of the original matrix changes after scaling. This prevents the recovery of the original singular values and singular vectors.
Egho et al., 2014 Fixed-point implementation of SVD algorithm leads to inaccurate computation of singular values and singular vectors due to overflow. They implemented Jacobi algorithm in FPGA with floating-point arithmetic.
Lopez et al., 2013 The diverse range of the elements of the input data matrices for different HSIs limits the use of fixed-point SVD for dimensionality reduction of hyperspectral images.
Simulation-based approach
• Can produce exact bounds for a specific input matrix. The same bounds can’t avoid overflow for other input matrices. • For every new input matrix, simulation has to be run again and again to find the range of the variables which is a tedious process.
• Bounds grow with the increase in the iterations • Can't handle algorithms that are recursive like SVD • Can’t handle range analysis of algorithms with comparator units and if-statements (Sarbishei and Katarzyna, 2012)
• Difficult to derive same bounds for all range of input data • Difficult to derive same bounds for different stages of the algorithm.
Scaling Method Works !! A =U VT If each element of a matrix is divided by the square root of the product of its one-norm and infinity-norm or Frobenius norm then all the variables generated during the computation of SVD will have tight analytical ranges
U VT
A
m
Aˆ 1 2
m = 3.1227E+7
m=
A1 A
Scaling
m = 2.0629E+7
m= A F
PROOF Derivation in brief Proof: Using vector and matrix norm properties, the ranges of the variables can be derived. We start by bounding the elements of the input matrix as
max | Aˆ xy | Aˆ 2 1 xy
U is the left singular vector matrix, which is orthogonal and each column of U has unity norm. Hence all elements of are in the range [-1,1] following (*).
U (:, i) U (:, i) 2 = 1
(*)
Similar is the case for right singular vectors V.
Improved SQNR and MSE
MSE [ROSIS] Scale: log10 10.00 5.00
4.83
47 bits (Overflow)
3.66
3.81
47 bits
25 bits
3.52
3.11
0.00 -5.00
PC1
PC2
PC3 -6.74
-10.00
-5.35
PC4
-5.19
PC5
-15.00 -20.00
-25.00
-20.00 -20.00
-20.00
-20.00
-20.00
SQNR and MSE (scaling vs without scaling) with double precision floating-point as the reference
-20.00
-5.62
High-level synthesis [HLS] of fixed-point SVD algorithm on Xilinx Virtex 7 XC7VX485 FPGA
Reduced Hardware Cost
Fixed-point code for HLS is implemented using SystemC
Hardware Cost [Hyperion] Without Scaling
47 bits (With Scaling)
25 bits (With Scaling)
30 20 10
14.27
12.72 6.71 6.29
3.63 1.73 1.62
27
5.34 3.88
5
2.86
0.817 0.45 0.41
0
FFs (%)
LUTs (%)
BRAM (%)
DSP48 (%)
Power (W)
Hardware Cost [ROSIS] Without Scaling
47 bits (With Scaling)
25 bits (With Scaling)
30
14.25
20
10
3.64 1.73 1.62
27
12.72 6.79 6.32
5.34 3.88
4.96 2.82
0.78 0.49 0.42
0 FFs (%)
LUTs (%)
BRAM (%)
DSP48 (%)
Power (W)
Percentage reduction in hardware cost after scaling 53%–55% in FF (flip flop) 53%–56% in LUT (look-up-table)
59%–69% in BRAM 82%–89% in DSP48
38%–50% in On-Chip power
Indian Institute of Technology Kharagpur, India Aurobinda Routray
Bibek Kabi
Bhabani Sankar Das
It is exciting to do Remote Sensing
Ramanarayan Mohanty
Anand S Sahadevan
Anmol Mohanty
References S. Lopez, T. Vladimirova, C. Gonzalez, J. Resano, D. Mozos, & A. Plaza, “The Promise of reconfigurable computing for hyperspectral imaging onboard systems: A review and trends,” Proceedings of the IEEE, vol. 101, no. 3, pp. 698-722, 2013. C. Egho, & T. Vladimirova, “Adaptive hyperspectral image compression using the KLT and integer KLT algorithms,” in Proc. 9th NASA/ESA Conf. Adapt. Hardware Syst., 2014, pp. 112-119. Z. Nikoli´c, H. T. Nguyen, G. Frantz, “Design and Implementation of Numerical Linear Algebra Algorithms on Fixed Point DSPs,” EURASIP J. Adv. Signal Process., pp. 1-22, 2007, article 87046. P. M. Szecowka and P. Malinowski, “CORDIC and SVD Implementation in Digital Hardware,” in Proc. Int. Conf. Mixed Design of Integrated Circuits and Systems (MIXDES), 2010, pp. 237–242. J. L. Jerez, G. A. Constantinides and E. C. Kerrigan, “A Low Complexity Scaling Method for the Lanczos Kernel in Fixed-Point Arithmetic,” IEEE Trans. on Computers, vol. 64, no. 2, pp. 303-315, 2015.
O. Sarbishei and R. Katarzyna, “Verification of fixed-point datapaths with comparator units using Constrained Arithmetic Transform (CAT),” in Proc. IEEE ISCAS, 2012, pp. 592-595.
Backup slides
SQNR and MSE
NIL
SQNR and MSE (scaling vs without scaling) with double precision floating-point as the reference
Va l i d a t e d Without Scaling (Overflow , 47 bits )
SQNR and MSE in fixed-point arithmetic (scaling vs without scaling) with double precision floating-point result as the reference. Without Scaling (Overflow, 47 bits ) MSE PC1 PC2 PC3 PC4 PC5
Hyperion 2.0e+05 1.0e+03 1.5e+04 262.3714 NIL
SQNR
47 bit
Hyperion
3.73
ROSIS
10.01
With Scaling (Overflow-free)
ROSIS 6.7e+04 6.5e+03 4.6e+03 1.3e+03 3.3e+03
SQNR
47 bit
40 bit
35 bit
32 bit
25 bit
Hyperion
176.76
132.15
106.44
84.45
78.03
ROSIS
180.13
135.65
134.79 120.60
74.96
With Scaling (Overflow-free) MSE 25 32 35 40 47
bit bit bit bit bit
PC1 8.1e-7 4e-9 0 0 0
Hyperion PC2 PC3 4.9e-7 3.7e-6 1.3e-7 2.1e-6 0 8.5e-7 0 9.5e-11 0 0
PC4 4.3e-6 1.8e-6 1.9e-7 2.2e-9 0
PC1 0 0 0 0 0
PC2 1.8e-7 4.6e-7 0 0 0
ROSIS PC3 4.5e-6 3e-6 0 0 0
PC4 6.5e-6 5.8e-6 0 0 0
PC5 2.4e-6 2.3e-6 0 0 0
Reduced Cost & On-chip Power Consumption
With Scaling Utilization (%)
Without Scaling COST FF LUTs BRAM DSP48 On-Chip Power Power
COST
Utilization (%) Hyperion ROSIS 3.63 3.64 14.27 14.25 12.72 12.72 27.00 27.00 Consumption (W) Hyperion ROSIS 0.817 0.783
FF LUT BRAM DSP48 On-Chip Power Power
25 bit 1.62 6.29 3.88 2.86
Hyperion 32 bit 35 bit 40 bit 47 bit 1.63 6.41 4.17 3.29
1.67 6.51 4.66 5
1.71 6.56 5.15 5
1.73 6.71 5.34 5
25 bit 1.62 6.32 3.88 2.82
ROSIS 32 bit 35 bit 40 bit 47 bit 1.63 6.48 4.17 3.25
1.67 6.53 4.66 4.96
1.71 6.62 5.15 4.96
1.73 6.79 5.34 4.96
0.44
0.46
0.49
Consumption (W) 0.41
0.43
0.43
0.45
0.45
0.42
0.44
With Scaling
COST FF LUT BRAM DSP48 Power
Reduction (%)
Hyperion 25 bit 55.37 55.92 69.49 89.4 49.44
32 bit 55.09 55.08 67.21 87.81 47.73
35 bit 53.99 54.37 63.36 81.48 47.36
40 bit 52.89 54.02 59.51 81.48 44.79
47 bit 52.34 52.97 58.01 81.48 44.55
25 bit 55.49 55.64 69.49 89.55 46.36
ROSIS 32 bit 55.21 54.52 67.21 87.96 43.67
35 bit 54.12 54.17 63.36 81.62 42.91
40 bit 53.02 53.54 59.51 81.62 41.63
47 bit 52.47 52.35 58.01 81.62 37.42
One-sided Jacobi SVD algorithm (Hestenes)