Rate-Distortion Function Upper Bounds for Gaussian Vectors ... - MDPI

1 downloads 0 Views 799KB Size Report
May 23, 2018 - Abstract: In this paper, we give upper bounds for the rate-distortion function (RDF) of any ... Keywords: source coding; rate-distortion function (RDF); Gaussian vector; autoregressive (AR) ...... Inf. Theory 1956, 2, 102–108.
Article

Rate-Distortion Function Upper Bounds for Gaussian Vectors and Their Applications in Coding AR Sources Jesús Gutiérrez-Gutiérrez ∗ and Xabier Insausti ID

ID

, Marta Zárraga-Rodríguez and Fernando M. Villar-Rosety

Tecnun, University of Navarra, Paseo de Manuel Lardizábal 13, 20018 San Sebastián, Spain; [email protected] (M.Z.-R.); [email protected] (F.M.V.-R.); [email protected] (X.I.) * Correspondence: [email protected]; Tel.: +34-94-3219877 Received: 16 April 2018 ; Accepted: 20 May 2018; Published: 23 May 2018

 

Abstract: In this paper, we give upper bounds for the rate-distortion function (RDF) of any Gaussian vector, and we propose coding strategies to achieve such bounds. We use these strategies to reduce the computational complexity of coding Gaussian asymptotically wide sense stationary (AWSS) autoregressive (AR) sources. Furthermore, we also give sufficient conditions for AR processes to be AWSS. Keywords: source coding; rate-distortion function (RDF); Gaussian vector; autoregressive (AR) source; discrete Fourier transform (DFT)

1. Introduction In 1956, Kolmogorov [1] gave a formula for the rate-distortion function (RDF) of Gaussian vectors and the RDF of Gaussian wide sense stationary (WSS) sources. Later, in 1970 Gray [2] obtained a formula for the RDF of Gaussian autoregressive (AR) sources. In 1973, Pearl [3] gave an upper bound for the RDF of finite-length data blocks of Gaussian WSS sources, but he did not propose a coding strategy to achieve his bound for a given block length. In [4], we presented two tighter upper bounds for the RDF of finite-length data blocks of Gaussian WSS sources, and we proposed low-complexity coding strategies, based on the discrete Fourier transform (DFT), to achieve such bounds. Moreover, we proved that those two upper bounds tend to the RDF of the WSS source (computed by Kolmogorov in [1]) when the size of the data block grows. In the present paper, we generalize the upper bounds and the two low-complexity coding strategies presented in [4] to any Gaussian vector. Therefore, in contrast to [4], here no assumption about the structure of the correlation matrix of the Gaussian vector has been made (observe that since the sources in [4] were WSS the correlation matrix of the Gaussian vectors there considered was Toeplitz). To obtain such generalization we start our analysis by first proving several new results on the DFT of random vectors. Although in [4] (Theorem 1) another new result on the DFT was presented, it cannot be used here, because such result and its proof rely on the power spectral density (PSD) of a WSS process and its properties. The two low-complexity strategies here presented are applied in coding finite-length data blocks of Gaussian AR sources. Specifically, we prove that the rates (upper bounds) corresponding to these two strategies tend to the RDF of the AR source (computed by Gray in [2]) when the size of the data block grows and the AR source is asymptotically WSS (AWSS). The definition of AWSS process was introduced by Gray in [5] (Chapter 6) and it is based on his concept of asymptotically equivalent sequences of matrices [6]. Sufficient conditions for AR processes to be AWSS can be found in [5] (Theorem 6.2) and [7] (Theorem 7). In this paper we present other sufficient conditions which make easier to check in practice whether an AR process is AWSS. Entropy 2018, 20, 399; doi:10.3390/e20060399

www.mdpi.com/journal/entropy

Entropy 2018, 20, 399

2 of 15

The paper is organized as follows. In Section 2 we obtain several new results on the DFT of random vectors which are used in Section 3. In Section 3 we give upper bounds for the RDF of Gaussian vectors, and we propose coding strategies to achieve such bounds. In Section 4 we apply the strategies proposed in Section 3 to reduce the computational complexity of coding Gaussian AWSS AR sources. In Section 5 we give sufficient conditions for AR processes to be AWSS. We finish the paper with a numerical example and conclusions. 2. Several New Results on the DFT of Random Vectors We begin by introducing some notation. C denotes the set of (finite) complex numbers, i is the imaginary unit, Re and Im denote real and imaginary parts, respectively. ∗ stands for conjugate transpose, > denotes transpose, and λk ( A), k ∈ {1, . . . , n}, are the eigenvalues of an n × n Hermitian matrix A arranged in decreasing order. E stands for expectation, and Vn is the n × n Fourier unitary matrix, i.e., 2π ( j−1)(k −1) 1 i n [Vn ] j,k = √ e− , j, k ∈ {1, . . . , n}. n If z ∈ C then b z denotes the real (column) vector b z=

! Re(z) . Im(z)

If zk ∈ C for all k ∈ {1, . . . , n} then zn:1 is the n-dimensional vector given by  zn:1

zn

 z  n −1  z =  n −2  ..  . z1

    .   

In this section, we give several new results on the DFT of random vectors in two theorems and one lemma. Theorem 1. Let yn:1 be the DFT of an n-dimensional random vector xn:1 , that is, yn:1 = Vn∗ xn:1 . 1.

If k ∈ {1, . . . , n} then

and 2.

  ∗ ∗ λn ( E ( xn:1 xn:1 )) ≤ E | xk |2 ≤ λ1 ( E ( xn:1 xn:1 ))

(1)

  ∗ ∗ λn ( E ( xn:1 xn:1 )) ≤ E |yk |2 ≤ λ1 ( E ( xn:1 xn:1 )).

(2)

If the random vector xn:1 is real and k ∈ {1, . . . , n − 1} \ { n2 } then

and

    > ) > ) λn ( E xn:1 xn:1 λ1 ( E xn:1 xn:1 2 ≤ E (Re(yk )) ≤ , 2 2

(3)

    > ) > ) λn ( E xn:1 xn:1 λ1 ( E xn:1 xn:1 2 ≤ E (Im(yk )) ≤ . 2 2

(4)

Proof. (1) We first prove that if Wn is an n × n unitary matrix then h i  ∗ ∗ λn ( E ( xn:1 xn:1 )) ≤ Wn diag1≤ j≤n λ j ( E ( xn:1 xn:1 )) Wn∗

n−k +1,n−k +1

∗ ≤ λ1 ( E ( xn:1 xn:1 )).

(5)

Entropy 2018, 20, 399

3 of 15

We have h

 i  ∗ Wn diag1≤ j≤n λ j ( E ( xn:1 xn:1 )) Wn∗

n

k1 ,k2

∑ [Wn ]k ,h

=

h

1

h =1 n

 i  ∗ diag1≤ j≤n λ j ( E ( xn:1 xn:1 )) Wn∗

n

∑ [Wn ]k ,h ∑

=

1

h =1 n

=

h



∗ diag1≤ j≤n λ j ( E ( xn:1 xn:1 ))

h,k2

i

l =1

h,l

[Wn∗ ]l,k2

∗ ))[Wn ]k ,h ∑ [Wn ]k ,h λh (E (xn:1 xn:1

(6)

2

1

h =1

for all k1 , k2 ∈ {1, . . . , n}, and hence, h

i  ∗ Wn diag1≤ j≤n λ j ( E ( xn:1 xn:1 )) Wn∗

n

n−k +1,n−k +1

=

∗ ))|[Wn ]n−k+1,h |2 . ∑ λh (E (xn:1 xn:1

h =1

Consequently, ∗ λn ( E ( xn:1 xn:1 ))

n

∑ |[Wn ]n−k+1,h |2 ≤

h

i  ∗ Wn diag1≤ j≤n λ j ( E ( xn:1 xn:1 )) Wn∗

h =1

∗ ≤ λ1 ( E ( xn:1 xn:1 ))

n−k +1,n−k +1

n

∑ |[Wn ]n−k+1,h |2 ,

h =1

and applying n



|[Wn ]n−k+1,h |2 =

h =1

n

∑ [Wn ]n−k+1,h [Wn∗ ]h,n−k+1 = [Wn Wn∗ ]n−k+1,n−k+1 = [ In ]n−k+1,n−k+1 = 1,

h =1

where In denotes the n × n identity matrix, we obtain Equation (5).   −1  ∗ ∗ ∗ Let E xn:1 xn:1 = Un diag1≤ j≤n λ j E xn:1 xn:1 Un be a diagonalization of E xn:1 xn:1 where the eigenvector matrix Un is unitary. As   h  i ∗ ∗ E | xk |2 = [ E ( xn:1 xn:1 )]n−k+1,n−k+1 = Un diag1≤ j≤n λ j ( E ( xn:1 xn:1 )) Un∗

n−k +1,n−k +1

Equation (1) follows directly by taking Wn = Un in Equation (5). Since   E |yk |2 = [ E (yn:1 y∗n:1 )]n−k+1,n−k+1   ∗ = E Vn∗ xn:1 xn:1 (Vn∗ )∗ n−k+1,n−k+1   ∗ = Vn∗ E ( xn:1 xn:1 ) (Vn∗ )∗ n−k+1,n−k+1 h i  ∗ = Vn∗ Un diag1≤ j≤n λ j ( E ( xn:1 xn:1 )) Un∗ (Vn∗ )∗ n−k +1,n−k +1 i h  ∗ ∗ ∗ ∗ , = Vn Un diag1≤ j≤n λ j ( E ( xn:1 xn:1 )) (Vn Un ) n−k +1,n−k +1

taking Wn = Vn∗ Un in Equation (5) we obtain Equation (2). (2) Applying [4] (Equation (10)) and taking Wn = Un in Equation (6) yields   E (Re(yk ))2

= =

1 n 1 n

n



cos

 2π (1 − k1 )k 2π (1 − k2 )k cos E xn−k1+1 xn−k2+1 n n



cos

i 2π (1 − k1 )k 2π (1 − k2 )k h  > cos E xn:1 xn:1 n n k1 ,k2

k1 ,k2 =1 n k1 ,k2 =1

,

(7)

Entropy 2018, 20, 399

=

1 n

=

1 n

=

1 n

=

1 n

4 of 15

n



cos

k1 ,k2 =1 n

    i 2π (1 − k1 )k 2π (1 − k2 )k h > cos Un diag1≤ j≤n λ j E xn:1 xn:1 Un∗ n n k1 ,k2

   2π (1 − k1 )k 2π (1 − k2 )k n > [Un ]k1 ,h λh E xn:1 xn:1 cos [Un ]k2 ,h ∑ n n k1 ,k2 =1 h =1 ! !    n n n 2π (1 − k2 )k 2π (1 − k1 )k > λ E x x cos [ U ] [ U ] cos n k2 ,h n k1 ,h n:1 n:1 ∑ h ∑ ∑ n n h =1 k 1 =1 k 2 =1 2    n n 2π (1 − l )k > [Un ]l,h , ∑ λh E xn:1 xn:1 ∑ cos n h =1 l =1



cos

and therefore,    1 > λn E xn:1 xn:1 n

2 n 2π (1 − l )k [Un ]l,h ∑ ∑ cos n h =1 l =1 n



≤ E (Re(yk ))

2



   1 > ≤ λ1 E xn:1 xn:1 n

2 n 2π (1 − l )k [Un ]l,h . ∑ ∑ cos n h =1 l =1 n

Analogously, it can be proved that    1 > λn E xn:1 xn:1 n

2 n 2π (1 − l )k [Un ]l,h ∑ ∑ sin n h =1 l =1 n



≤ E (Im(yk ))

2



   1 > ≤ λ1 E xn:1 xn:1 n

2 n 2π (1 − l )k [Un ]l,h . ∑ ∑ sin n h =1 l =1 n

To finish the proof we only need to show that 1 n

2 n 2π (1 − l )k 1 cos [ U ] n l,h = ∑ ∑ n n h =1 l =1 n

2 n 2π (1 − l )k 1 sin [ U ] n l,h = . ∑ ∑ n 2 h =1 l =1 n

(8)

If b1 , . . . , bn are n real numbers then 2 1 n n 1 n ∑ bl [Un ]l,h = ∑ ∑ n h =1 n h =1 l =1

=

1 n



k 1 =1

n



!

n

k1 ,k2 =1

bk1 [Un ]k1 ,h

!

n



bk2 [Un ]k2 ,h

k 2 =1

bk1 bk2 [Un Un∗ ]k1 ,k2 =

1 n

=

1 n

n



k1 ,k2 =1

n



k1 ,k2 =1

n

bk 1 bk 2

bk1 bk2 [ In ]k1 ,k2 =

∑ [Un ]k ,h [Un∗ ]h,k 1

2

h =1

1 n 2 bl , n l∑ =1

and thus, 2     ! 2π (1 − l )k 1 n 2π (1 − l )k 2 1 n 2π (1 − l )k 2 1 n n sin [ U ] = sin = 1 − cos ∑ n l,h n l∑ n h∑ n n n l∑ n =1 l =1 =1 =1 2   1 n 2π (1 − l )k 2 1 n n 2π (1 − l )k = 1 − ∑ cos = 1 − ∑ ∑ cos [Un ]l,h . n l =1 n n h =1 l =1 n

Equation (8) now follows directly from [4] (Equation (15)). Lemma 1. Let yn:1 be the DFT of an n-dimensional random vector xn:1 . If k ∈ {1, . . . , n} then      ∗ 1. E |yk |2 = Vn∗ E xn:1 xn:1 Vn .     n−k+1,n−k+1 > V 2. E y2k = Vn∗ E xn:1 xn:1 . n n−k +1,n−k +1  3. E (Re (yk ) Im (yk )) = 12 Im E y2k .

(9)

Entropy 2018, 20, 399

5 of 15

  E(|yk |2 )+Re( E(y2k )) . E (Re(yk ))2 = 2   2 2 E y − Re E y (| k | ) ( ( k )) E (Im(yk ))2 = . 2

4. 5.

Proof. (1) It is a direct consequence of Equation (7). (2) We have i h  i   h  > = E Vn∗ xn:1 xn:1 E y2k = E yn:1 y> (Vn∗ )> n:1 n−k +1,n−k +1 n−k +1,n−k +1 h  i h   i ∗ > ∗ > = E Vn xn:1 xn:1 Vn = Vn E xn:1 xn:1 Vn . n−k +1,n−k +1

n−k +1,n−k +1

(3) Observe that     E y2k = E (Re(yk ))2 − (Im(yk ))2 + 2Re(yk )Im(yk )i     = E (Re(yk ))2 − E (Im(yk ))2 + 2E (Re(yk )Im(yk )) i, and hence,

(10)

   = 2E (Re(yk )Im(yk )) . Im E y2k

(4) and (5) From Equation (10) we obtain        Re E y2k = E (Re(yk ))2 − E (Im(yk ))2 . Furthermore,         E |yk |2 = E (Re(yk ))2 + (Im(yk ))2 = E (Re(yk ))2 + E (Im(yk ))2 .

(11)

(12)

(4) and (5) follow directly from Equations (11) and (12). Theorem 2. Let yn:1 be the DFT of a real n-dimensional random vector xn:1 . If k ∈ {1, . . . , n − 1} \ { n2 } then         > ) > ) λ1 ( E xn:1 xn:1 λn ( E xn:1 xn:1 > > ≤ ≤ λ1 E ybk (ybk ) ≤ λ2 E ybk (ybk ) . 2 2 > Proof.   Fix r ∈{1, 2} and consider a real unit eigenvector v = (v1 , v2 ) corresponding to λr E ybk (ybk )> . We have

             λr E ybk (ybk )> = λr E ybk (ybk )> v> v = v> λr E ybk (ybk )> v = v> E ybk (ybk )> v. From [4] (Equation (10)) we obtain   E ybk (ybk )>    2π (1−k1 )k 2π (1−k2 )k 2π (1−k1 )k 2π (1−k2 )k cos E xn−k1+1 xn−k2+1 cos sin E xn−k1+1 xn−k2+1 1 n cos n n n n =   2π (1−k1 )k 2π (1−k2 )k nk ,k∑=1 sin 2π (1−k1 )k cos 2π (1−k2 )k E x x sin sin E x x n − k + 1 n − k + 1 n − k + 1 n − k + 1 1 2 n n n n 2 2 1 1 i 1 n h  > E xn:1 xn:1 wk1 wk>2 = n k ,k∑=1 k1 ,k2 1

2

with wl =

cos sin

2π (1−l )k n 2π (1−l )k n

! ,

l ∈ {1, . . . , n},

Entropy 2018, 20, 399

6 of 15

and consequently,    1 λr E ybk (ybk )> = n

=

1 n

n



h  i > E xn:1 xn:1

k1 ,k2 =1 n n

∑ ∑ [Un ]k1 ,h λh

k1 ,k2

v> wk1 wk>2 v

   > E xn:1 xn:1 [Un ]k2 ,h v> wk1 wk>2 v

k1 ,k2 =1 h=1 > n n  wk>1 v [Un ]k1 ,h λh h =1 1 ,k 2 =1

1 = nk



1 = n

n

h =1

1 = n

2  n   > > w v [ U ] λ E x x n n:1 n:1 ∑ l ∑ h l,h l =1 h =1





   > λh E xn:1 xn:1

n



k 1 =1

   > E xn:1 xn:1 [Un ]k2 ,h wk>2 v ! wk>1 v[Un ]k1 ,h

n



k 2 =1

! wk>2 v[Un ]k2 ,h

n

  −1  > > > where with E xn:1 xn:1 = Un diag1≤ j≤n λ j E xn:1 xn:1 Un being a diagonalization of E xn:1 xn:1 the eigenvector matrix Un is unitary. Therefore, 2 2       1 n n  1 n n   > > > > > ≤ λ1 E xn:1 xn:1 λn E xn:1 xn:1 ∑ wl v[Un ]l,h ≤ λr E ybk (ybk ) ∑ wl v[Un ]l,h . ∑ ∑ n h =1 l =1 n h =1 l =1

To finish the proof we only need to show that 1 n

2 n 1 > ∑ ∑ wl v[Un ]l,h = 2 . h =1 l =1 n

Applying Equation (9) and [4] (Equations (14) and (15)) yields 2 1 n n > ∑ wl v[Un ]l,h ∑ n h =1 l =1 2  2 1 n  1 n 2π (1 − l )k 2π (1 − l )k = ∑ wl> v = ∑ cos v1 + sin v2 n l =1 n l =1 n n 2  n  n  1 1 n 2π (1 − l )k 1 2π (1 − l )k 2 2π (1 − l )k 2π (1 − l )k = v21 ∑ cos + v22 ∑ sin + 2v1 v2 ∑ cos sin n l =1 n n l =1 n n l =1 n n   1 n 2π (1 − l )k 2 v22 1 n 4π (1 − l )k = v21 ∑ cos + + v1 v2 ∑ sin n l =1 n 2 n l =1 n !   2 v2 1 n 2π (1 − l )k 1 n 4π (l − 1)k + 2 − v1 v2 ∑ sin = v21 ∑ 1 − sin n l =1 n 2 n l =1 n 2 !   2 n  n 4π (l −1)k v 1 2π (1 − l )k 1 = v21 1 − ∑ sin + 2 − v1 v2 ∑ Im e n i n l =1 n 2 n l =1 ! n 4π (l −1)k v2 v2 1 1 1 = 1 + 2 − v1 v2 Im ∑ e n i = v> v = . 2 2 n 2 2 l =1

3. RDF Upper Bounds for Real Gaussian Vectors We first review the formula for the RDF of a real Gaussian vector given by Kolmogorov in [1].

Entropy 2018, 20, 399

7 of 15

Theorem 3. If xn:1 is a real zero-mean Gaussian n-dimensional vector with positive definite correlation matrix, its RDF is given by 1 R xn:1 ( D ) = n

(

> 1 λk E xn:1 xn:1 ∑ max 0, 2 ln θ k =1 n

 )

∀D ∈

> tr E xn:1 xn:1 0, n

 # ,

where tr denotes trace and θ is a real number satisfying D=

1 n

n



n   o > min θ, λk E xn:1 xn:1 .

k =1

We recall that R xn:1 ( D ) can be thought of as the minimum rate (measured in nats) at which one must encode (compress) xn:1 in order to be able to recover it with a mean square error (MSE) per dimension not larger than D, that is:   2 E k xn:1 − xg n:1 k2 n

≤ D,

where xg n:1 denotes the estimation of xn:1 and k · k2 is the spectral norm. The following result provides an optimal coding strategy for xn:1 in order to achieve R xn:1 ( D )   > > whenever D ≤ λn E xn:1 xn:1 . Observe that if D ≤ λn E xn:1 xn:1 then 1 R xn:1 ( D ) = 2n

> λk E xn:1 xn:1 ∑ ln D k =1 n



> det E xn:1 xn:1 1 = ln 2n Dn

 .

(13)

  −1 > > = Un diag1≤k≤n λk E xn:1 xn:1 Un Corollary 1. Suppose that xn:1 is as in Theorem 3. Let E xn:1 xn:1  > be a diagonalization of E xn:1 xn:1 where the eigenvector matrix Un is real and orthogonal.  > If D ∈ 0, λn E xn:1 xn:1 then 1 R xn:1 ( D ) = n

n

1 ∑ Rzk ( D) = 2n k =1

E z2k ln ∑ D k =1 n

 (14)

with zn:1 = Un> xn:1 .   Proof. We encode z1 , . . . , zn separately with E kzk − zek k22 ≤ D for all k ∈ {1, . . . , n}. g xg n:1 : = Un z n:1 , where   zen  ..  zg n:1 : =  .  . ze1

Let

As Un> is unitary (in fact, it is a real orthogonal matrix) and the spectral norm is unitarily invariant, we have     

2  2 2

E k xn:1 − xg E Un> xn:1 − Un> xg E kzn:1 − zg n:1 k2 n:1 2 n:1 k2 = = n n n       2 2 2 n n E ∑k=1 (zk − zek ) ∑k=1 E (zk − zek ) ∑nk=1 E kzk − zek k2 = = = ≤ D, n n n and thus, R xn:1 ( D ) ≤

1 n

n

∑ R z k ( D ).

k =1

Entropy 2018, 20, 399

8 of 15

To finish the proof we show Equation (14). Since           > > > > > E zn:1 z> , n:1 = E Un xn:1 xn:1 Un = Un E xn:1 xn:1 Un = diag1≤k ≤n λk E xn:1 xn:1 we obtain   h  i E z2k = E zn:1 z> n:1

n−k +1,n−k +1

      > > = λn−k+1 E xn:1 xn:1 ≥ λn E xn:1 xn:1 ≥ D > 0.

Hence, applying Equation (13) yields 1 n

n

1 ∑ Rzk ( D ) = n k =1

=

 1 E z2k 1 ∑ 2 ln D = 2n k =1

1 2n

n

n



ln

k =1

> λn−k+1 E xn:1 xn:1 ln ∑ D k =1  n

> λk E xn:1 xn:1 D



= R xn:1 ( D ).

Corollary 1 shows that an optimal coding strategy for xn:1 is to encode z1 , . . . , zn separately. We now give two coding strategies for xn:1 based on the DFT whose computational complexity is lower than the computational complexity of the optimal coding strategy provided in Corollary 1. > Theorem 4. Let xn:1 be as in Theorem 3. Suppose that yn:1 is the DFT of xn:1 and D ∈ 0, λn E xn:1 xn:1

 . Then

n 2 e x ( D ) ≤ R˘ x ( D ) ≤ 1 ∑ ln E(|yk | ) R xn:1 ( D ) ≤ R n:1 n:1 2n k=1 D

      ∗

 > − V diag ∗ E x x> V x x V V

E

n n n:1 n:1 n n 1 ≤ k ≤ n n:1 n:1 k,k 1 F  ≤ R xn:1 ( D ) + ln 1 + , √ > 2 n λn E xn:1 xn:1

(15)

(16)

where k · k F is the Frobenius norm,

e x ( D ) := R n:1

      

and R˘ xn:1 ( D ) :=

        

Ry n ( D )+2 ∑n−1n Ryc ( D 2 ) + Ryn ( D ) k = +1 2

2

2 ∑n−1n+1 k= 2

Ryc ( k

D 2

k

n

if n is even,

) + Ryn ( D ) if n is odd,

n

  D Ry n ( D )+∑n−1n RRe(y ) ( D 2 ) + RIm(yk ) ( 2 ) + Ryn ( D ) k = 2 +1 k 2 n   n −1 D ∑ n+1 RRe(y ) ( 2 )+ RIm(y ) ( D 2 ) + Ryn ( D ) k= 2

k

k

n

if n is even, if n is odd.

Proof. Equations (15) and (16) were presented in [4] (Equations (16) and (20)) for the case where the  > is Toeplitz. They were proved by using a result on the DFT of random correlation matrix E xn:1 xn:1 vectors with Toeplitz correlation matrix, namely, ref. [4] (Theorem 1). The proof of Theorem 4 is similar to the proof of [4] (Equations (16) and (20)) but using Theorem 1 instead of [4] (Theorem 1). Observe  > has been made. that in Theorems 1 and 4 no assumption about the structure of E xn:1 xn:1 Theorem 4 shows that a coding strategy for xn:1 is to encode yd n e , . . . , yn separately, where d n2 e 2 denotes the smallest integer higher than or equal to n2 . Theorem 4 also shows that another coding strategy for xn:1 is to encode separately the real part and the imaginary part of yk instead of encoding yk when k ∈ {d n2 e, . . . , n − 1} \ { n2 }. The computational complexity of these two coding strategies based on the DFT is lower than the computational complexity of the optimal coding strategy provided in

Entropy 2018, 20, 399

9 of 15

Corollary 1. Specifically, the complexity of computing the DFT (yn:1 = Vn∗ xn:1 ) is O(n log n) whenever the fast Fourier transform (FFT) algorithm is used, while the complexity of computing zn:1 = Un> xn:1 is O(n2 ). Moreover, when the coding strategies based on the DFT are used, we do not need to compute  > . It should also be mentioned that for these a real orthogonal eigenvector matrix Un of E xn:1 xn:1  > is not even required, in fact, for them coding strategies based on the DFT the knowledge of E x x n:1 n:1  

we only need to know E ybk (ybk )> with k ∈ {d n2 e, . . . , n}.

e x ( D ) and R˘ x ( D ), The rates corresponding to the two coding strategies given in Theorem 4, R n:1 n:1  > can be written in terms of E xn:1 xn:1 and Vn by using Lemma 1 and the following lemma. Lemma 2. Let yn:1 and D be as in Theorem 4. Then 1. 2. 3. 4.

E ( y2 ) Ryk ( D ) = 21 ln Dk for all k ∈ {1, . . . , n} ∩ { n2 , n}.   E((Re(yk ))2 ) E((Im(yk ))2 )−( E(Re(yk )Im(yk )))2 for all k ∈ {1, . . . , n − 1} \ { n2 }. Rybk D2 = 14 ln 2 ( D2 )   E((Re(yk ))2 ) RRe(yk ) D2 = 12 ln for all k ∈ {1, . . . , n − 1} \ { n2 }. D 2   E((Im(yk ))2 ) RIm(yk ) D2 = 21 ln for all k ∈ {1, . . . , n − 1} \ { n2 }. D 2

Proof. (1) Applying Equation (2) and [4] (Lemma 1) yields        > 0 < D ≤ λn E xn:1 xn:1 ≤ E |yk |2 = E y2k . Assertion (1) now follows directly from Equation (13). (2) Applying Theorem 2 we have     > ) λn ( E xn:1 xn:1 D 0< ≤ ≤ λ2 E ybk (ybk )> . 2 2 Consequently, from Equation (13) we obtain   Rybk

D 2



  E (Re (yk ))2

   det  > b b y y E (Im (yk ) Re (yk )) det E ( ) k k 1 1 = ln = ln  2  2 4 4 D D 2

 E (Re (yk ) Im (yk ))    E (Im (yk ))2

.

2

Assertions (3) and (4) Applying Equations (3) and (4) yields

and

> λn E xn:1 xn:1 D 0< ≤ 2 2



  ≤ E (Re (yk ))2 .

> λn E xn:1 xn:1 D 0< ≤ 2 2



  ≤ E (Im (yk ))2 .

Assertions (3) and (4) now follow directly from Equation (13). We end this section with a result that is a direct consequence of Lemma 2. This result shows e x ( D ) and R˘ x ( D ), when the rates corresponding to the two coding strategies given in Theorem 4, R n:1 n:1 are equal. Lemma 3. Let xn:1 , yn:1 , and D be as in Theorem 4. Then the two following assertions are equivalent: 1. 2.

e x ( D ) = R˘ x ( D ). R n:1 n:1 E (Re (yk ) Im (yk )) = 0 for all k ∈ {d n2 e, . . . , n − 1} \ { n2 }.

Entropy 2018, 20, 399

10 of 15

Proof. Fix k ∈ {d n2 e, . . . , n − 1} \ { n2 }. From Lemma 2 we have  2Rybk

D 2



    2 2 E Re y E Im y − ( E (Re (yk ) Im (yk )))2 ( ( )) ( ( )) k k 1 = ln  2 2 D 2







2 2 1 E (Re (yk )) E (Im (yk )) ≤ ln  2 2 D



2



1 ln 2

E (Re (yk ))

2



  E (Im (yk ))2

1 ln 2     D D = RRe(yk ) + RIm(yk ) . 2 2

=

D 2

+

D 2

4. Low-Complexity Coding Strategies for Gaussian AWSS AR Sources We begin by introducing some notation. The symbols N, Z, and R denote the set of positive integers, integers, and (finite) real numbers, respectively. If f : R → C is continuous and 2π-periodic, we denote by Tn ( f ) the n × n Toeplitz matrix given by

[ Tn ( f )] j,k = t j−k , where {tk }k∈Z is the sequence of Fourier coefficients of f , i.e., tk =

1 2π

Z 2π 0

f (ω )e−kωi dω

∀k ∈ Z.

If An and Bn are n × n matrices for all n ∈ N, we write { An } ∼ { Bn } if the sequences { An } and { Bn } are asymptotically equivalent, that is, {k An k2 } and {k Bn k2 } are bounded and k A −B k limn→∞ n√n n F = 0 (see [5] (Section 2.3) and [6]). We now review the definitions of AWSS processes and AR processes. Definition 1. A random process { xn } is said to be AWSS if it has constant mean (i.e., E( x j ) = E( xk ) for all  ∗ } ∼ { T ( f )}. j, k ∈ N) and there exists a continuous 2π-periodic function f : R → C such that { E xn:1 xn:1 n The function f is called (asymptotic) PSD of { xn }. Definition 2. A real zero-mean random process { xn } is said to be AR if n −1

xn = wn −

∑ a−k xn−k

∀n ∈ N,

k =1

or equivalently, n −1

∑ a−k xn−k = wn

∀n ∈ N,

(17)

k =0

 where a0 = 1, a−k ∈ R for all k ∈ N, and {wn } is a real zero-mean random process satisfying that E w j wk = δj,k σ2 for all j, k ∈ N with σ2 > 0 and δj,k being the Kronecker delta (i.e., δj,k = 1 if j = k, and it is zero otherwise). The AR process { xn } in Equation (17) is of finite order if there exists p ∈ N such that a−k = 0 for all k > p. In this case, { xn } is called an AR( p) process.

Entropy 2018, 20, 399

11 of 15

The following theorem shows that if xn:1 is a large enough data block of a Gaussian AWSS AR source, the rate does not increase whenever we encode it using the two coding strategies based on the DFT presented in Section 3, instead of encoding xn:1 using an eigenvector matrix of its correlation matrix. Theorem 5. Let { xn } be as in Definition 2. Suppose that { ak }k∈Z , with ak = 0 for all k ∈ N, is the sequence of Fourier coefficients of a function a : R → C which is continuous and 2π-periodic. Then  2 > 1. infn∈N λn E xn:1 xn:1 ≥ max σ |a(ω )|2 > 0. ω ∈[0,2π ]  > 2. Consider D ∈ 0, infn∈N λn E xn:1 xn:1 . (a)

If { xn } is Gaussian, 1 σ2 e x (D) ≤ R˘ x (D) ≤ K1 (n, D) ≤ K2 (n, D) ≤ K3 (n, D) ln = R xn:1 (D) ≤ R n:1 n:1 2 D

(b)

∀n ∈ N, (18)

where K1 (n, D ) is given by Equation (16), and K2 (n, D ) and K3 (n, D ) are obtained by replacing   2 > > λn E xn:1 xn:1 in Equation (16) by infn∈N λn E xn:1 xn:1 and max σ | a(ω )|2 , respectively. ω ∈[0,2π ] If { xn } is Gaussian and AWSS, e x ( D ) = lim R˘ x ( D ) = lim K3 (n, D ). lim R xn:1 ( D ) = lim R n:1 n:1

n→∞

n→∞

n→∞

(19)

n→∞

Proof. (1) Equation (17) can be rewritten as Tn ( a) xn:1 = wn:1

∀n ∈ N.

Consequently,       > > Tn ( a) E xn:1 xn:1 = σ2 In ( Tn ( a))> = E Tn ( a) xn:1 ( Tn ( a) xn:1 )> = E wn:1 wn:1

∀n ∈ N.

As det( Tn ( a)) = 1, Tn ( a) is invertible, and therefore,   −1   −1    −1 > = σ2 ( Tn ( a))∗ Tn ( a) = σ2 ( Tn ( a))> Tn ( a) E xn:1 xn:1 = σ2 ( Tn ( a))−1 ( Tn ( a))> !    −1  σ2 2 ∗ 2 = Nn diag1≤k≤n = σ Nn diag1≤k≤n (σk ( Tn ( a))) Nn Nn∗ (20) (σk ( Tn ( a)))2 for all n ∈ N, where Tn ( a) = Mn diag1≤k≤n (σk ( Tn ( a))) Nn∗ is a singular value decomposition of Tn ( a). Thus, applying [8] (Theorem 4.3) yields    > λn E xn:1 xn:1 =

σ2

(σ1 ( Tn ( a)))

2



σ2 maxω ∈[0,2π ] | a(ω )|2

>0

∀n ∈ N.

(2a) From Equation (13) we have    −1  det σ2 ( Tn ( a))−1 ( Tn ( a))>

> det E xn:1 xn:1 1 1 ln = ln n 2n D 2n Dn  n n 2 2 σ σ 1 1 1 σ2   = = ln ln = ln 2n D n det ( T ( a)) det ( T ( a))> 2n Dn 2 D



R xn:1 ( D ) =

n

n

Assertion (2a) now follows from Theorem 4 and Assertion (1).

∀n ∈ N.

Entropy 2018, 20, 399

12 of 15

(2b) From Assertion (2a) we only need to show that

     ∗

> − V diag ∗ >

E xn:1 xn:1 n 1≤k ≤n Vn E xn:1 xn:1 Vn k,k Vn F √ lim = 0. n→∞ n

(21)

As the Frobenius norm is unitarily invariant we obtain

 h    i 

∗ > ∗

E xn:1 x > − Vn diag 1≤k≤n Vn E xn:1 xn:1 Vn k,k Vn n:1

F √ 0≤ n

h   i 





 ∗ E x x> ∗ −C

Vn diag

b > V V V ( f ) b n n n:1 n:1 n n 1≤ k ≤ n

Tn ( f )− Cn ( f )

E xn:1 xn:1 − Tn ( f ) k,k F F F √ √ √ ≤ + + n n n

h     i 





 ∗ > ∗



> bn ( f )

E xn:1 xn:1 − Tn ( f )

Tn ( f )− C

Vn diag1≤k≤n Vn E xn:1 xn:1 − Tn ( f ) Vn k,k Vn F F F √ √ √ = + + n n n

h     i 





 ∗ E x x>



diag

> V − T ( f ) Vn b n n:1 n:1 n 1≤ k ≤ n

Tn ( f )− Cn ( f )

E xn:1 xn:1 − Tn ( f ) k,k F F F √ √ √ = + + n n n



 

  





> > b

Tn ( f )− Cn ( f )

Vn E xn:1 xn:1 − Tn ( f ) Vn

E xn:1 xn:1 − Tn ( f ) F F F √ √ √ ≤ + + n n n





> − T ( f ) bn ( f )

E xn:1 xn:1

Tn ( f )− C

n F F √ √ =2 + , n n ∗ ∗ bn ( f ) = Vn diag where f is (asymptotic) PSD of { xn } and C 1≤k ≤n ([Vn Tn ( f )Vn ]k,k ) Vn . Assertion (2b)  > } ∼ { T ( f )} and [9] (Lemma 4.2). now follows from { E xn:1 xn:1 n

If ∑0k=−∞ | ak | < ∞, there always exists such function a and it is given by a(ω ) = ∑0k=−∞ ak ekωi for all ω ∈ R (see, e.g., [8] (Appendix B)). In particular, if { xn } is an AR( p) process, a(ω ) = ∑0k=− p ak ekωi for all ω ∈ R. 5. Sufficient Conditions for AR Processes to be AWSS In the following two results we give sufficient conditions for AR processes to be AWSS. Theorem 6. Let { xn } be as in Definition 2. Suppose that { ak }k∈Z , with ak = 0 for all k ∈ N, is the sequence of Fourier coefficients of a function a : R → C which is continuous and 2π-periodic. Then the following assertions are equivalent: 1. 2. 3. 4.

{ xn } is AWSS.  > } is bounded.

{ E xn:1 xn:1 2 { Tn ( a)} is stable (that is, {k( Tn ( a))−1 k2 } is bounded). a(ω ) 6= 0 for all ω ∈ R and { xn } is AWSS with (asymptotic) PSD

σ2 . | a |2

Proof. (1)⇒(2) This is a direct consequence of the definition of AWSS process, i.e., of Definition 1. (2)⇔(3) From Equation (20) we have

 

>

E xn:1 xn:1

= 2

2  



1 −1 2 ∗ 2

= σ N diag M = σ T ( a )) (

n n n 1≤k ≤n σ ( T ( a ))

2 n k (σn ( Tn ( a)))2 2 σ2

2

for all n ∈ N. (3)⇒(4) It is well known that if f : R → C is continuous and 2π-periodic, and { Tn ( f )} is stable then f (ω ) 6= 0 for all ω ∈ R. Hence, a(ω ) 6= 0 for all ω ∈ R.

Entropy 2018, 20, 399

13 of 15

n o  Applying [8] (Lemma 4.2.1) yields ( Tn ( a))> = ( Tn ( a))∗ = { Tn ( a)}. Consequently, from [7] (Theorem 3) we obtain n  o n o ( Tn ( a))> Tn ( a) = { Tn ( a) Tn ( a)} ∼ { Tn ( aa)} = Tn | a|2 . Observe that the sequence         −1   

1 

1 >

E xn:1 x >

( Tn ( a))> Tn ( a) = x x =

E n:1 n:1 n:1

σ2

2 σ2 2 2  is bounded. As the function | a|2 is real, applying [8] (Theorem 4.4) we have that Tn | a|2 is Hermitian  and 0 < minω ∈[0,2π ] | a(ω )|2 ≤ λn ( Tn | a|2 ) for all n ∈ N, and therefore,

  −1

1 1

Tn | a|2

= ≤

2 )) 2 λ T a | min ( (| n n ω ∈[0,2π ] | a ( ω )| 2

∀n ∈ N.

Thus, from [5] (Theorem 1.4) we obtain     −1     −1  1  > > 2 E x x = T ( a )) ∼ T | a | . T ( a ) ( n n n n:1 n:1 σ2 Hence, applying [10] (Theorem 4.2) and [5] (Theorem 1.2) yields 

 1  > E x x n:1 n:1 σ2







 Tn

1 | a |2

 .

Consequently, from [8] (Lemma 3.1.3) and [8] (Lemma 4.2.3) we have    2   n  o  σ 1 > 2 E xn:1 xn:1 = Tn . ∼ σ Tn | a |2 | a |2 (4)⇒(1) It is obvious. k Corollary 2. Let { xn } be as in Definition 2 with ∑0k=−∞ | ak | < ∞. If ∑∞ k =0 a−k z 6 = 0 for all | z | ≤ 1 then { xn } is AWSS.

Proof. It is well known that if a sequence of complex numbers {tk }k∈Z satisfies that ∑∞ k =−∞ | tk | < ∞ k 6 = 0 for all | z | ≤ 1 then { T ( f )} is stable with f ( ω ) = ∞ kωi for all ω ∈ R. and that ∑∞ t z t e ∑k=−∞ k n k =−∞ k kωi for all ω ∈ R. Thus, Therefore, { Tn (b)} is stable with b(ω ) = ∑∞ k =0 a − k e     n

o 

o n >  −1





−1 > −1

( T ( a )) = ( T ( a )) = ( T ( b ))

( Tn ( a))−1 =

n n n



2 2 2

2

is bounded with a(ω ) = ∑0k=−∞ ak ekωi for all ω ∈ R. As { Tn ( a)} is stable, from Theorem 6 we conclude that { xn } is AWSS. 6. Numerical Example and Conclusions 6.1. Example Let { xn } be as in Definition 2 with a−k = 0 for all k > 1. Observe that

σ2 maxω ∈[0,2π ] | a(ω )|2

=

σ2 . (1+| a−1 |)2

If | a−1 | < 1 from Corollary 2 we obtain that the AR(1) process { xn } is AWSS. Figure 1 shows R xn:1 ( D ), σ2 e x ( D ), and R˘ x ( D ) by assuming that { xn } is Gaussian, a−1 = − 1 , σ2 = 1, D = R = 49 , n:1 n:1 2 (1+| a |)2 and n ≤ 100.

−1

Figure 1 also shows the highest upper bound of R xn:1 ( D ) presented in Theorem 5,

Entropy 2018, 20, 399

14 of 15

namely, K3 (n, D ). Observe that the figure bears evidence of the equalities and inequalities given in Equations (18) and (19).

Rxn:1 (D) exn:1 (D) R ˘ xn:1 (D) R K3 (n, D)

0.7

Nats per dimension

0.65

0.6

0.55

0.5

0.45

0.4 10

20

30

40

50

60

70

80

90

100

n

Figure 1. Considered rates for a Gaussian AWSS AR(1) source.

6.2. Conclusions The computational complexity of coding finite-length data blocks of Gaussian sources can be reduced by using any of the two low-complexity coding strategies here presented instead of the optimal coding strategy. Moreover, the rate does not increase if we use those strategies instead of the optimal one whenever the Gaussian source is AWSS and AR, and the considered data block is large enough. Author Contributions: Authors are listed in order of their degree of involvement in the work, with the most active contributors listed first. All authors have read and approved the final manuscript. Funding: This work was supported in part by the Spanish Ministry of Economy and Competitiveness through the CARMEN project (TEC2016-75067-C4-3-R). Conflicts of Interest: The authors declare no conflict of interest.

References 1. 2. 3. 4. 5. 6. 7. 8.

Kolmogorov, A.N. On the Shannon theory of information transmission in the case of continuous signals. IRE Trans. Inf. Theory 1956, 2, 102–108. [CrossRef] Gray, R.M. Information rates of autoregressive processes. IEEE Trans. Inf. Theory 1970, 16, 412–421. [CrossRef] Pearl, J. On coding and filtering stationary signals by discrete Fourier transforms. IEEE Trans. Inf. Theory 1973, 19, 229–232. [CrossRef] Gutiérrez-Gutiérrez, J.; Zárraga-Rodríguez, M.; Insausti, X. Upper bounds for the rate distortion function of finite-length data blocks of Gaussian WSS sources. Entropy 2017, 19, 554. [CrossRef] Gray, R.M. Toeplitz and circulant matrices: A review. Found. Trends Commun. Inf. Theory 2006, 2, 155–239. [CrossRef] Gray, R.M. On the asymptotic eigenvalue distribution of Toeplitz matrices. IEEE Trans. Inf. Theory 1972, 18, 725–730. [CrossRef] Gutiérrez-Gutiérrez, J.; Crespo, P.M. Asymptotically equivalent sequences of matrices and multivariate ARMA processes. IEEE Trans. Inf. Theory 2011, 57, 5444–5454. [CrossRef] Gutiérrez-Gutiérrez, J.; Crespo, P.M. Block Toeplitz matrices: Asymptotic results and applications. Found. Trends Commun. Inf. Theory 2011, 8, 179–257. [CrossRef]

Entropy 2018, 20, 399

9. 10.

15 of 15

Gutiérrez-Gutiérrez, J.; Zárraga-Rodríguez, M.; Insausti, X.; Hogstad, B.O. On the complexity reduction of coding WSS vector processes by using a sequence of block circulant matrices. Entropy 2017, 19, 95. [CrossRef] Gutiérrez-Gutiérrez, J.; Crespo, P.M. Asymptotically equivalent sequences of matrices and Hermitian block Toeplitz matrices with continuous symbols: Applications to MIMO systems. IEEE Trans. Inf. Theory 2008, 54, 5671–5680. [CrossRef] c 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access

article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).