Proceedings of the 10th WSEAS International Conference on COMPUTERS, Vouliagmeni, Athens, Greece, July 13-15, 2006 (pp642-645)

Minimum Singular Value Estimation of Bipolar Matrices

THEODORE H. KASKALIS
Department of Early Childhood Education
3rd km. Florina-Niki, 53100 Florina
GREECE

Abstract: Bipolar matrices are a special class of matrices, consisting of elements that assume only the values −1 and +1. Such matrices appear in various areas of research, including Artificial Neural Networks. Their simplicity can lead to the direct determination or estimation of their measures and characteristics, based on their dimensions. Here we discuss the estimation of the minimum singular value of bipolar matrices. The geometric interpretation of singular values is combined with analytical and experimental results to reveal some regularity, which suggests the potential existence of a closed function describing the minimum non-zero singular values of all possible bipolar matrices of the same dimensions. This function could give direct information that can be used constructively in estimating the condition of the respective matrix or the behaviour of the elements of its pseudoinverse, among other things. We conclude by posing two questions regarding the existence of this function and the potential acceleration of the process by which we acquire data points, in order to reveal a curve with the desired characteristics.

Keywords: Bipolar matrix, Singular values, Artificial Neural Networks

1 Introduction

The nature of the elements comprising a matrix often determines some (or most) of its special characteristics, such as its norms or its eigenvalues and eigenvectors [2]. A very popular class of matrices, appearing in various Artificial Neural Network applications [3, 4, 9] among other scientific fields, is the one where all the elements of a matrix assume one of two possible values. For example, these values often represent black and white pixels in optical character recognition applications or binary image processing algorithms. The two most popular schemes for coding such matrix elements are the binary scheme, {0, 1}, and the bipolar scheme, {−1, 1}. The second approach is utilized more frequently, since it provides direct, unbiased operations which lead to results that are symmetric about the origin of the axes. Matrices containing elements that belong to the set {−1, 1} are called bipolar matrices. It appears that the class of bipolar matrices possesses certain characteristics due to their simple nature [5]. These characteristics lead to direct calculations of their important measures, such as their norms. It is easy to identify that several such measures depend only on the dimensions of the respective matrix. As a result, one becomes interested in whether other bipolar matrix characteristics can be encoded in simple closed-function forms.

Here we investigate this matter, focusing on the estimation (and/or determination) of the minimum singular value of bipolar matrices. First, we present the notion of singular values and their potential applications. This discussion is followed by a geometric interpretation of singular values, which directly shows the simplifications that appear in bipolar cases. We then present our experimental results, which exhibit a very regular and smooth evolution of the minimum singular values as the matrix dimensions increase. We investigate this characteristic and, finally, we pose two different questions on the matter under consideration.

2 Special Properties of Bipolar Matrices

Let us assume the existence of a bipolar matrix A of dimensions m × n. Representing this matrix as a collection of row or column vectors, we can write:

$$A = [\, a_1 \mid a_2 \mid \cdots \mid a_m \,]^T \qquad (1)$$

$$A = [\, a_1 \mid a_2 \mid \cdots \mid a_n \,] \qquad (2)$$

where $a_i$, i ∈ {1, 2, . . . , m}, represents the i-th row and $a_j$, j ∈ {1, 2, . . . , n}, the j-th column of A, of dimensions n and m, respectively. We can directly write the following special properties of various norm values:


$$\max_{i,j} |a_{ij}| = \|a_i\|_\infty = \|a_j\|_\infty = 1 \qquad (3)$$

$$\|a_i\|_1 = n \,, \qquad \|a_j\|_1 = m \qquad (4)$$

$$\|a_i\|_2 = \sqrt{a_i^T a_i} = \sqrt{n} \qquad (5)$$

$$\|a_j\|_2 = \sqrt{a_j^T a_j} = \sqrt{m} \qquad (6)$$

for every i ∈ {1, 2, . . . , m}, j ∈ {1, 2, . . . , n}. As regards the norms of the matrix A itself, we can write:

$$\|A\|_\infty = n \,, \qquad \|A\|_1 = m \qquad (7)$$

$$\|A\|_2 = \sigma_{\max}(A) \le \sqrt{mn}\,\max_{i,j}|a_{ij}| = \sqrt{mn} \qquad (8)$$

where σ_max(A) is the largest singular value of A. We recall that the singular values of a matrix are the positive elements of the diagonal matrix Σ in the Singular Value Decomposition [2]:

$$A = U \Sigma V^T \qquad (9)$$

where U ∈ ℝ^(m×m) and V ∈ ℝ^(n×n) are orthogonal matrices and Σ = diag(σ_1, σ_2, . . . , σ_r, 0, . . . , 0) ∈ ℝ^(m×n), with r = rank(A). A very useful application of the Singular Value Decomposition is, among others, the direct calculation of the pseudoinverse of the matrix A [1, 6]:

$$A^+ = V \Sigma^+ U^T \qquad (10)$$

where:

$$\Sigma^+ = \mathrm{diag}\!\left(\frac{1}{\sigma_1}, \frac{1}{\sigma_2}, \dots, \frac{1}{\sigma_r}, 0, \dots, 0\right) \in \mathbb{R}^{n \times m} \qquad (11)$$

The formulation of the pseudoinverse in Equation 10 directly leads to:

$$\|A^+\|_2 = \frac{1}{\sigma_{\min}(A)} \qquad (12)$$

where σ_min(A) is the smallest non-zero singular value of A. Equation 12 is related to the condition number of rectangular full-rank matrices [2, 6]:

$$\kappa(A) = \frac{\sigma_{\max}(A)}{\sigma_{\min}(A)} = \|A\|_2 \|A^+\|_2 \qquad (13)$$

where rank(A) = min(m, n). The condition number of a square matrix actually represents how close to singularity the matrix is. In a similar manner, Equation 13 represents how close to being linearly dependent the column or row vectors of A are.

The above discussion directly poses an interesting problem. Can we estimate or, better, calculate the lower limit of σ_min(A), in a similar fashion to the upper limit of Equation 8? Such a result can be of significant importance in the determination of the characteristics of A^+ (Equation 12) or κ(A) (Equation 13), for full-rank cases. The prior knowledge of such measures can often prove valuable during various computations and algorithms.
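As an illustrative aside (not part of the original derivation), the following minimal Python sketch numerically checks the relations of Equations 3 to 13 for an arbitrarily chosen 4 × 7 bipolar example; numpy and the specific dimensions and random seed are assumptions made here purely for demonstration.

```python
# A minimal numerical check of Equations 3-13 for a randomly drawn
# bipolar matrix (illustrative dimensions m = 4, n = 7).
import numpy as np

m, n = 4, 7
rng = np.random.default_rng(0)
A = rng.choice([-1.0, 1.0], size=(m, n))        # bipolar matrix

# Row/column norms (Equations 3-6)
assert np.all(np.abs(A) == 1)                                # max |a_ij| = 1
assert np.all(np.sum(np.abs(A), axis=1) == n)                # ||a_i||_1 = n (rows)
assert np.all(np.sum(np.abs(A), axis=0) == m)                # ||a_j||_1 = m (columns)
assert np.allclose(np.linalg.norm(A, axis=1), np.sqrt(n))    # ||a_i||_2 = sqrt(n)
assert np.allclose(np.linalg.norm(A, axis=0), np.sqrt(m))    # ||a_j||_2 = sqrt(m)

# Matrix norms (Equations 7-8)
assert np.isclose(np.linalg.norm(A, np.inf), n)
assert np.isclose(np.linalg.norm(A, 1), m)
assert np.linalg.norm(A, 2) <= np.sqrt(m * n) + 1e-12

# SVD, pseudoinverse and condition number (Equations 9-13)
U, s, Vt = np.linalg.svd(A)
A_pinv = np.linalg.pinv(A)
sigma_min = s[s > 1e-12].min()                  # smallest non-zero singular value
assert np.isclose(np.linalg.norm(A_pinv, 2), 1.0 / sigma_min)
assert np.isclose(s.max() / sigma_min,
                  np.linalg.norm(A, 2) * np.linalg.norm(A_pinv, 2))
print("sigma_min =", sigma_min, "kappa =", s.max() / sigma_min)
```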

3 Minimum Singular Values

A first indication of the special characteristics of bipolar matrices comes from the geometric interpretation of the Singular Value Decomposition. Without loss of generality, we will assume from now on that m ≤ n. The singular values of any matrix A represent the lengths of the semi-axes of a hyper-ellipsoid E, defined as E = {Ax : ||x||_2 = 1}. Fig. 1 represents the transformations imposed by V, Σ and U for a simple planar case.
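As a quick numerical companion to this interpretation, the sketch below uses a generic (not necessarily bipolar) 2 × 2 example, chosen here only for illustration and assuming numpy, to show that the extreme lengths of Ax over the unit circle coincide with the singular values, that is, with the semi-axes of E.

```python
# The unit circle is mapped by A onto an ellipse whose semi-axes are the
# singular values sigma_1 >= sigma_2 (planar case of Fig. 1).
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])                           # generic 2 x 2 example
sigma = np.linalg.svd(A, compute_uv=False)

theta = np.linspace(0.0, 2.0 * np.pi, 10_000)
circle = np.vstack([np.cos(theta), np.sin(theta)])   # points with ||x||_2 = 1
lengths = np.linalg.norm(A @ circle, axis=0)         # ||Ax||_2 along the circle

print("sigma_1, sigma_2    :", sigma)
print("max/min of ||Ax||_2 :", lengths.max(), lengths.min())
# For a non-singular square matrix the two pairs of values coincide
# (up to the sampling resolution of the circle).
```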

Fig. 1: Geometric Interpretation of Singular Values (the unit circle is rotated by V^T, stretched by Σ to semi-axes σ_1, σ_2, and rotated by U).

The smallest non-zero singular value of A is related to the smallest vector that can be constructed from the rows of A through linear combinations [2, 7, 8]. Another way of stating this is to relate σ_min(A) to the smallest distance of one of the rows of A from the space formed by the other rows. This means that the value of σ_min(A) becomes smaller as the various row vectors approach each other. However, this latter statement takes a different meaning in the bipolar case. Fig. 2 presents the simple case of two vectors of equal length. The smallest singular value of the matrix X = [ x_1 | x_2 ]^T is:

$$\sigma_2 = \frac{\|x_2 - x_1\|_2}{\sqrt{2}} \qquad (14)$$
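The following minimal sketch, assuming numpy and two arbitrarily chosen bipolar vectors (they are not taken from the paper), checks Equation 14 numerically.

```python
# For two vectors of equal length, the smallest singular value of
# X = [x1 | x2]^T equals ||x2 - x1||_2 / sqrt(2)  (Equation 14).
import numpy as np

x1 = np.array([ 1, -1,  1,  1, -1], dtype=float)
x2 = np.array([ 1, -1, -1,  1, -1], dtype=float)   # Hamming distance 1 from x1

X = np.vstack([x1, x2])
sigma = np.linalg.svd(X, compute_uv=False)

print("sigma_min(X)          :", sigma.min())
print("||x2 - x1||_2 / sqrt2 :", np.linalg.norm(x2 - x1) / np.sqrt(2))
# Both values equal sqrt(2) here, since the two vectors differ in a
# single element.
```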


Fig. 2: Vector Distance and Singular Values (the vectors x_1, x_2 together with the semi-axis vectors σ_1√2·u_1 and σ_2√2·u_2).

For bipolar vectors of dimension n, the value ||x_2 − x_1||_2 is minimized when the two vectors differ in only one of their elements, in other words, when their Hamming distance is 1. We therefore deduce that, for every full-rank bipolar 2 × n matrix A, it holds that:

$$\sigma_{\min}(A) \ge \frac{2}{\sqrt{2}} = \sqrt{2} \qquad (15)$$

For m > 2, of course, the situation is not that simple. We proceed to examine the minimum singular values of bipolar matrices by means of a computational procedure. For a fixed m we produce all the possible bipolar matrices for increasing values of n (≥ m). For all of these matrices we calculate the Singular Value Decomposition and identify the smallest non-zero singular value. It is easily proven [2] that this value is always produced by full-rank matrices (i.e. rank(A) = m ≤ n). This last statement directs us to exclude identical row vectors within the same matrix, producing a total of:

$$\binom{2^n}{m} = \frac{(2^n)!}{m!\,(2^n - m)!} \qquad (16)$$

possible matrices for a specific pair (m, n). Fig. 3 presents the results of this process for m = 3, 4 and 5. We can identify a very regular evolution of the minimum singular values, which appears similar across the three plots, but at different scales. This smooth behaviour poses the following interesting question: can there be some form of a closed function f(m, n) describing, or lying very close to, these curves? We can observe that there is probably a limiting value of the form:

$$\lim_{n \to \infty} f(m, n) = g(m) \qquad (17)$$

Unfortunately, the computational burden of this process becomes extremely high as m and n assume greater values, as is evident from Equation 16.
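A minimal sketch of this exhaustive procedure is given below, assuming numpy and itertools; the helper name min_singular_value and the printed range of n are illustrative choices, not part of the original text.

```python
# Enumerate all C(2^n, m) bipolar matrices with distinct rows for a fixed
# (m, n) and record the smallest non-zero singular value found.
from itertools import combinations, product
import numpy as np

def min_singular_value(m, n):
    rows = [np.array(r, dtype=float) for r in product((-1.0, 1.0), repeat=n)]
    best = np.inf
    for combo in combinations(rows, m):          # C(2^n, m) row selections
        A = np.vstack(combo)
        s = np.linalg.svd(A, compute_uv=False)
        s_nz = s[s > 1e-10]                      # keep non-zero singular values
        if s_nz.size:
            best = min(best, s_nz.min())
    return best

# Example: a few points of the m = 3 curve of Fig. 3.
for n in range(3, 6):
    print(n, min_singular_value(3, n))
```

The number of matrices examined grows as the binomial coefficient of Equation 16, which is what makes the procedure prohibitively expensive for larger m and n.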

Fig. 3: Minimum Singular Values of Bipolar Matrices (m = 3, 4, 5): the minimum singular value plotted against the dimension n for each fixed m.

Fig. 4 combines the plots of Fig. 3 and adds the result of a similar process for:

$$f(m, n)\big|_{m = n} \qquad (18)$$

In this case, the curve appears to tend asymptotically to zero for increasing values of m. This is actually quite expected and can be formally represented by:

$$\lim_{m \to \infty} f(m, m) = 0 \qquad (19)$$

We understand that the construction of this function f(m, n) will be of significant importance, since the largest elements of the matrix A^+ will be directly bounded by:

$$\max_{i,j} |a'_{ij}| \le \|A^+\|_2 = \frac{1}{\sigma_{\min}(A)} \le \frac{1}{f(m, n)} \qquad (20)$$

where a'_{ij} are the respective elements of A^+, for i ∈ {1, 2, . . . , n} and j ∈ {1, 2, . . . , m}. Moreover, the condition number, for full-rank cases, will also assume an upper limit, since:

$$\kappa(A) = \|A\|_2 \|A^+\|_2 = \frac{\sigma_{\max}(A)}{\sigma_{\min}(A)} \le \frac{\sqrt{mn}}{f(m, n)} \qquad (21)$$
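Since no closed form f(m, n) is available, the small numpy sketch below uses σ_min(A) itself in place of its hypothetical lower limit to check the inequalities of Equations 20 and 21; the 3 × 5 bipolar matrix is an arbitrary full-rank example, not taken from the paper.

```python
# Check max|a'_ij| <= ||A+||_2 = 1/sigma_min  and  kappa(A) <= sqrt(mn)/sigma_min.
import numpy as np

A = np.array([[ 1,  1,  1,  1,  1],
              [ 1, -1,  1, -1,  1],
              [ 1,  1, -1, -1,  1]], dtype=float)
m, n = A.shape

s = np.linalg.svd(A, compute_uv=False)
sigma_max, sigma_min = s.max(), s.min()
A_pinv = np.linalg.pinv(A)

print(np.abs(A_pinv).max(), "<=", 1.0 / sigma_min)             # Equation 20
print(sigma_max / sigma_min, "<=", np.sqrt(m * n) / sigma_min)  # Equation 21
```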

Fig. 4: Combined Curves and the m = n Case (the minimum singular value against the dimension n for m = 3, m = 4, m = 5 and m = n).

4 Concluding Remarks and Posed Questions

The minimum singular values of bipolar matrices appear to follow a very regular evolution. As a result, it is of interest to gather enough data points so as to reveal, through regression or similar techniques, a function that describes the produced curves or stays closely under them. With this work we pose two questions regarding the above process:

1. Can there be a faster way of producing the particular bipolar matrix which gives the minimum singular value, for a fixed pair (m, n)? This is closely related to the m = 2 case, where the minimum Hamming distance gave the answer. If such a way exists, we can omit the full examination of all possible bipolar matrices, given by Equation 16, and accelerate the data-point gathering procedure.

2. Is there some form of a function f(m, n), depending on the dimensions, or rank, of a bipolar matrix A, that can be used to produce lower limits for the minimum non-zero singular values of A?

We understand that the answer to the second question raises the interest in the first. On the other hand, the answer to the first question can lead to the gathering of an adequate amount of data points. However, this does not necessarily mean that the desired function will be recognized. And, finally, if such an identification really happens, the next step will be the mathematical formulation of its proof. All these points comprise our future work on this interesting matter and pave the way for more elaborate research on the properties of bipolar matrices.

References:

[1] T. L. Boullion and P. L. Odell, Generalized Inverse Matrices, Wiley Interscience, New York, 1971.
[2] G. H. Golub and C. F. Van Loan, Matrix Computations (3rd Ed.), Johns Hopkins University Press, London, 1996.
[3] M. H. Hassoun, Fundamentals of Artificial Neural Networks, MIT Press, Cambridge, MA, 1995.
[4] S. Haykin, Neural Networks: A Comprehensive Foundation (2nd Ed.), Prentice-Hall, New York, 1998.
[5] W. C. Kwong, G. C. Yang and C. Y. Chang, Wavelength-Hopping Time-Spreading Optical CDMA with Bipolar Codes, Journal of Lightwave Technology, 23(1), 2005, pp. 260-267.
[6] M. Z. Nashed (Ed.), Generalized Inverses and Applications, Academic Press, New York, 1976.
[7] A. J. van der Veen, E. F. Deprettere and A. L. Swindlehurst, Subspace Based Signal Analysis Using Singular Value Decomposition, Proceedings of the IEEE, 81(9), 1993, pp. 1277-1308.
[8] A. J. van der Veen, A Schur Method for Low-Rank Matrix Approximation, SIAM Journal on Matrix Analysis and Applications, 17(1), 1996, pp. 139-160.
[9] J. Wu, J. Ma and Q. Cheng, Further Results on the Asymptotic Memory Capacity of the Generalized Hopfield Network, Neural Processing Letters, 20(1), 2004, pp. 23-38.
