IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 61, NO. 24, DECEMBER 15, 2013
Vandermonde Factorization of Toeplitz Matrices and Applications in Filtering and Warping

Tom Bäckström, Member, IEEE

Abstract—By deriving a factorization of Toeplitz matrices into the product of Vandermonde matrices, we demonstrate that the Euclidean norm of a filtered signal is equivalent to the Euclidean norm of the appropriately frequency-warped and scaled signal. In effect, we obtain an equivalence between the energies of frequency-warped and filtered signals. While the result does not provide tools for warping per se, it does show that the energy of the warped signal can be evaluated efficiently, without explicit and costly computation of the warped transform. The main result is closely related to the Vandermonde factorization of Hankel matrices. Application examples for ACELP, perceptual modelling, and the time-warped MDCT are discussed.

Index Terms—Convolution, filtering, Hankel, line spectral frequencies, time-frequency analysis, Toeplitz, Vandermonde, warping.

I. INTRODUCTION

Toeplitz matrices appear frequently in signal processing applications, especially when filtering is involved. In fact, filtering can be expressed by a convolution matrix, which is a non-square Toeplitz matrix, and further, the product of a convolution matrix with its conjugate transpose is a square, Hermitian, positive definite Toeplitz matrix. The purpose of this work is to demonstrate a connection between filtering, Toeplitz matrices and warped time-frequency transforms. Here, warping refers to sampling of a signal at points which are not necessarily regularly spaced. For example, audio applications frequently employ frequency-warping onto the perceptually motivated Bark-scale, whereby the signal representation corresponds to the characteristics of the ear [1].

In short, the main result of this paper shows that in terms of signal energy, that is, Euclidean distance, filtering corresponds to frequency-warping. In practical terms, whenever our objective is to measure the energy of a filtered signal, we can equivalently measure the energy of an appropriately frequency-warped and scaled signal, and vice versa. The strictly analytic relation always contains all three operations, namely filtering, warping and scaling, but as we will find out, the relationship can in some cases be approximated by a relation between filtering and warping only.

The result will be useful for example in speech and audio coding applications, where the quantization error is usually optimized by a mean square error criterion in a perceptually weighted domain. Since such weighting can be represented by time-domain filtering, as in speech coding by the ACELP algorithm [2]–[4], by frequency-domain weighting, as in speech and audio coding in the MDCT domain [4], [5], or by frequency-warping, as described by perceptual theory [1], [5], the results presented here establish relationships between these domains.

The presented results are based on a Vandermonde factorization of Toeplitz matrices, which is very similar to the Vandermonde factorization of Hankel matrices [6] and is in fact also a special case of the Carathéodory parametrization of covariance matrices [7, Sec. 4.9.2]. While the factorization is thus known within numerical analysis, system identification and spectral analysis, to the author's knowledge, its interpretation with respect to filtering and warping has not been provided before.

II. VANDERMONDE FACTORIZATION

A Vandermonde matrix is defined as [8]

$$
V = \begin{bmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^{N-1} \\
1 & x_2 & x_2^2 & \cdots & x_2^{N-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_N & x_N^2 & \cdots & x_N^{N-1}
\end{bmatrix}. \qquad (1)
$$

In this work, the scalars $x_k$ are always distinct, whereby $V$ is full rank. A special type of Vandermonde matrix that we will encounter is that with all $x_k$ on the unit circle, $|x_k| = 1$, which we call a balanced Vandermonde matrix. Observe that Fourier transform matrices are thus, up to scaling, equal to balanced Vandermonde matrices, since the corresponding $x_k$ are equi-spaced on the unit circle. Other balanced Vandermonde matrices also correspond to time-frequency transforms, even if the $x_k$ lie unevenly distributed on the unit circle. Balanced Vandermonde matrices are thus warped Fourier transforms. Observe that Fourier matrices are often scaled by $1/\sqrt{N}$, whereby the transform is unitary, but here we omit this normalization in the interest of notational simplicity. Toeplitz and Hankel matrices, $T$ and $H$, are defined as [8]

Manuscript received January 29, 2013; revised July 19, 2013; accepted September 11, 2013. Date of publication September 17, 2013; date of current version November 13, 2013. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Lawrence Carin. The author is with the International Audio Laboratories Erlangen, Friedrich Alexander University of Erlangen-Nürnberg (FAU) and the Fraunhofer Institute of Integrated Circuits, Erlangen 91058, Germany (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TSP.2013.2282271 1053-587X © 2013 IEEE

$$
T = \begin{bmatrix}
t_0 & t_{-1} & \cdots & t_{-(N-1)} \\
t_1 & t_0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & t_{-1} \\
t_{N-1} & \cdots & t_1 & t_0
\end{bmatrix} \qquad (2)
$$

$$
H = \begin{bmatrix}
h_0 & h_1 & \cdots & h_{N-1} \\
h_1 & h_2 & \cdots & h_N \\
\vdots & \vdots & & \vdots \\
h_{N-1} & h_N & \cdots & h_{2N-2}
\end{bmatrix} \qquad (3)
$$
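As an illustrative aside, the claim above that Fourier matrices are, up to scaling, balanced Vandermonde matrices, and that unevenly spaced but distinct unit-circle points still yield a full-rank warped transform, can be checked numerically. The following Python/NumPy sketch is not part of the original derivation; the function name `vandermonde` and the perturbed point placement are arbitrary illustrative choices.

```python
import numpy as np

def vandermonde(points, N):
    """Vandermonde matrix whose k-th row holds successive powers of points[k]."""
    points = np.asarray(points, dtype=complex)
    return points[:, None] ** np.arange(N)

N = 8

# Equi-spaced points on the unit circle give exactly the DFT matrix.
x_dft = np.exp(-2j * np.pi * np.arange(N) / N)
V = vandermonde(x_dft, N)
F = np.fft.fft(np.eye(N))  # DFT matrix: row k is the DFT of the k-th unit impulse
assert np.allclose(V, F)

# Unevenly spaced (but distinct) unit-circle points give a warped
# Fourier transform; the matrix remains full rank.
rng = np.random.default_rng(0)
x_warp = np.exp(-2j * np.pi * (np.arange(N) + rng.uniform(-0.3, 0.3, N)) / N)
Vw = vandermonde(x_warp, N)
assert np.linalg.matrix_rank(Vw) == N
```

Here the sign convention of `np.fft.fft` fixes the points as $e^{-2\pi i k/N}$; with the opposite convention the conjugate points apply.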

The current work is closely related to the Vandermonde factorization of Hankel matrices [6],

$$H = V^T D V, \qquad (4)$$

where $D$ is a diagonal matrix and the superscript $T$ denotes the transpose without complex conjugation. Observe that fast methods for computing this factorization of $H$ exist [9]. Since a Toeplitz matrix is an upside-down Hankel matrix, we can readily find the Vandermonde factorization of Toeplitz matrices as

$$T = W^T D V, \qquad (5)$$

where the second Vandermonde matrix $W$ is defined by the scalars $x_k^{-1}$. Note that here we use the superscript $H$ to indicate the complex conjugate transpose of a matrix and the superscript $*$ to indicate complex conjugation of a scalar. Since this factorization can be obtained from the Hankel factorization by trivial steps, the fast factorization methods for Hankel matrices can also be used here.

Some Toeplitz matrices, most notably positive definite Hermitian Toeplitz matrices, can be factorized such that the Vandermonde matrix is balanced, whereby $x_k^{-1} = x_k^*$ and thus $W^T = V^H$. We then obtain the balanced Vandermonde factorization

$$T = V^H D V. \qquad (6)$$

The proof has appeared in [10], but for completeness, it is also presented in the Appendix of this work.

The set of Toeplitz matrices for which a balanced Vandermonde factorization can be generated is not restricted to positive definite Hermitian matrices. For example, consider an $N \times N$ circulant matrix $B$, defined by its elements $b_{ij} = \beta_{(i-j) \bmod N}$, where $\bmod$ is the modulo operator. Recall that circulant matrices have a well-known factorization employing Fourier matrices $F$, such that $B = F^H \Lambda F$, where $\Lambda$ is again some diagonal matrix. Clearly this is also a balanced Vandermonde factorization, although $\Lambda$ is not necessarily definite nor $B$ Hermitian. Other Toeplitz matrices, such as convolution matrices and lower-triangular matrices, also have Vandermonde factorizations with interesting structure, but those are beyond the scope of this article and left for further study.

A. Euclidean Norms

Let $C$ be an $(N+m) \times N$ convolution matrix of the filter coefficients $c_0, \dots, c_m$, defined as

$$
C = \begin{bmatrix}
c_0 & & & \\
c_1 & c_0 & & \\
\vdots & c_1 & \ddots & \\
c_m & \vdots & \ddots & c_0 \\
 & c_m & & c_1 \\
 & & \ddots & \vdots \\
 & & & c_m
\end{bmatrix}. \qquad (7)
$$

Then consider a signal $x$ convolved with $C$, that is, consider the product $Cx$. Its Euclidean norm can be written as

$$\|Cx\|^2 = x^H C^H C x = x^H T x, \qquad (8)$$

where $T = C^H C$ is a positive definite Hermitian Toeplitz matrix, corresponding to the autocorrelation of the sequence $c_k$. We can then apply the balanced Vandermonde factorization to obtain

$$\|Cx\|^2 = x^H V^H D V x. \qquad (9)$$

Since $T$ is definite, also $D$ will be definite, that is, its diagonal elements are real and positive. We can then define a definite diagonal matrix $D^{1/2}$ such that $D = (D^{1/2})^H D^{1/2}$, whereby

$$\|Cx\|^2 = \|D^{1/2} V x\|^2. \qquad (10)$$

Recall that balanced Vandermonde matrices $V$ correspond to warped Fourier transforms. This formula thus states that, instead of evaluating the energy of a filtered signal $Cx$, we can equivalently evaluate the energy of the frequency-warped signal $Vx$ weighted by $D^{1/2}$. While (10) is already a useful relation, sometimes the multiplication by $D^{1/2}$ can be undesirable. It is however generally possible to find factorizations such that

$$\|Cx\|^2 \approx \|Vx\|^2. \qquad (11)$$

The approximation is accurate as long as $D$ is well-behaved. Estimation of such factorizations will be discussed in more detail in Section III.

Observe that while the length of the vector $Cx$ always exceeds that of $x$, the terms $D^{1/2}Vx$ and $Vx$ have the same length as $x$. In coding theory, such transforms are said to have critical sampling. Moreover, since $V$ is full rank, the transform also has the perfect reconstruction property, that is, from the transform $Vx$ we can perfectly reconstruct $x$ by applying $V^{-1}$. Note that this inverse can be readily obtained [11]. The critical sampling and perfect reconstruction properties are important in coding applications, since they ensure that no information is lost or added in the transform. Since the objective is to minimize quantization error energy in a perceptual domain, if the filtered signal $Cx$ describes the signal in a perceptual domain, we can equivalently minimize the error in the transform domain $Vx$.

Time-warping is the dual of frequency-warping in the sense that the sampling grid of the time-signal is modified. This effect can be achieved by applying a warped transform on a frequency-domain signal. In other words, if $V$ is a balanced Vandermonde matrix and $y = Fx$ is the Fourier transform of $x$, then $Vy$ will be a time-warped version of $x$. From (10) it follows that

$$\|D^{1/2} V y\| = \|C y\|. \qquad (12)$$

This relationship can be interpreted such that a signal warped, or re-sampled, in the time domain has the same energy as a signal suitably filtered in the frequency domain. Note that if our task is to find a convolution matrix $C$ for a given time-warping operation $V$, then the diagonal matrix $D$ is a free parameter, for which we can choose for example the trivial case $D = I$. However, if our objective is to find the time-warp corresponding to a given frequency-domain filter $C$, then the diagonal matrix $D^{1/2}$ is required.
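To make the identity (10) concrete, the following Python/NumPy sketch constructs a Toeplitz matrix directly from chosen unit-circle points and positive weights, and verifies both the Toeplitz structure of $V^H D V$ and the energy identity. Recovering an explicit filter $C$ from $T$ would additionally require spectral factorization, which is omitted here; the specific points, weights and test signal are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6

# Distinct unit-circle analysis points and positive weights.
x = np.exp(1j * np.sort(rng.uniform(0, 2 * np.pi, N)))
d = rng.uniform(0.5, 2.0, N)

V = x[:, None] ** np.arange(N)           # balanced Vandermonde, V[k, n] = x_k^n
T = V.conj().T @ np.diag(d) @ V          # T = V^H D V

# T is Hermitian Toeplitz: each diagonal is constant.
for off in range(1 - N, N):
    diag = np.diagonal(T, off)
    assert np.allclose(diag, diag[0])
assert np.allclose(T, T.conj().T)

# Energy identity: x^H T x equals the energy of the warped-and-scaled signal.
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
lhs = np.real(s.conj() @ T @ s)
rhs = np.linalg.norm(np.sqrt(d)[:, None] * V @ s) ** 2   # ||D^{1/2} V s||^2
assert np.allclose(lhs, rhs)
```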


III. ALGORITHMS

A. Vandermonde Factorization

We will here present an algorithm for the calculation of the Vandermonde factorization of an $N \times N$ Toeplitz matrix $T$. While factorizations could be obtained using the algorithms developed for Hankel matrices, the current approach always gives balanced Vandermonde matrices, which is required for the link to warping.

Vandermonde factorizations are not unique, whereby we always have to begin by choosing a starting point, a scalar $x_1$ which is to be the coefficient for the first row of the Vandermonde matrix. The value $x_1 = 1$ is usually as good as any. Define a vector $b$ with elements $b_k = (x_1^*)^{k-1}$ and solve $Ta = b$. Determine the roots of the polynomial $A(z)$ whose coefficients are the elements of $a$. Define the Vandermonde matrix $V$ using $x_1$ together with these roots and calculate the diagonal matrix $D = V^{-H} T V^{-1}$. Since $T$ is definite and Hermitian, the roots are on the unit circle and distinct, and $V$ is full rank (for proof, see the Appendix).

In terms of complexity, the two most difficult steps are the estimation of the roots of $A(z)$ and the calculation of the inverse of $V$. When $N$ is large, then it is important to consider the numerical stability of these operations as well. The inversion of $V$ will be discussed in more detail in the following section. With respect to finding the roots of $A(z)$, observe that since the roots are on the unit circle, they can be found at the zero-crossings of the sufficiently over-sampled Fourier transform of the coefficients of $A(z)$.

B. On Inversion of Vandermonde Matrices

The inverses of Vandermonde matrices have an analytic form and can be readily determined [11], [12]. Using the results of the current paper, we find an additional approach for inverting Vandermonde matrices. Namely, consider the pseudo-inverse of $V$, that is,

$$V^\dagger = (V^H V)^{-1} V^H, \qquad (13)$$

where we have used the fact that $V^H V$ is a positive definite Toeplitz matrix. Since super-fast methods for solving Toeplitz systems exist, e.g. [13], the main complexity of the pseudo-inverse approach comes from the multiplication by $V^H$. The complexity of the pseudo-inverse approach is thus $O(N^2)$.
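The pseudo-inverse route of (13) can be sketched in Python with SciPy's Levinson-based Toeplitz solver: since $V^H V$ is Hermitian Toeplitz, it is fully specified by its first column, and solving $(V^H V)\,y = V^H b$ yields $y = V^{-1} b$ for square $V$. The analysis points below (equi-spaced with a random perturbation) are an arbitrary illustrative assumption.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

rng = np.random.default_rng(1)
N = 8

# Well-separated unit-circle points (perturbed equi-spaced grid).
x = np.exp(1j * (2 * np.pi * np.arange(N) / N + rng.uniform(-0.3, 0.3, N)))
V = x[:, None] ** np.arange(N)            # balanced Vandermonde

b = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Gram matrix G = V^H V is Hermitian Toeplitz: G[m, n] = sum_k x_k^(n - m),
# so its first column G[m, 0] = sum_k x_k^(-m) defines it completely.
first_col = np.array([np.sum(x ** (-m)) for m in range(N)])
rhs = V.conj().T @ b                      # V^H b, the O(N^2) step
y = solve_toeplitz(first_col, rhs)        # solves (V^H V) y = V^H b

assert np.allclose(V @ y, b)              # hence y = V^{-1} b
```

When only the first column is passed, `solve_toeplitz` assumes the first row is its conjugate, which is exactly the Hermitian structure of $V^H V$ here.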
C. Euclidean Distance Estimates

There are four cases to consider here:
1) Filter to warping and scaling—From a given matrix $C$, obtain $V$ and $D^{1/2}$ such that (10) holds.
2) Filter to approximate warping—From a given matrix $C$, obtain $V$ such that (11) holds.
3) Warping to filter—From a given matrix $V$ (and $D$), obtain $C$ such that (10) holds exactly.
4) Warping to approximate filter—From a given matrix $V$ (and $D$), obtain $C$ such that (10) holds approximately.

Case 1 is trivial: we calculate $T = C^H C$ as in (8) and apply the algorithm presented in Section III-A.

For Case 2, the problem is to determine such a $V$ that $\|Cx\|^2 \approx \|Vx\|^2$. A sufficient solution would be to find a $V$ such that $V^H V \approx C^H C$. A heuristic approach would be to place the points $x_k = e^{i\omega_k}$ such that the energy distribution is retained. Specifically, if $\hat{c}(\omega)$ is the Fourier transform of the coefficients of $C$, then one possibility is to define the $\omega_k$ such that

$$\int_0^{\omega_k} |\hat{c}(\omega)|^2 \, d\omega = \frac{k}{N} \int_0^{2\pi} |\hat{c}(\omega)|^2 \, d\omega. \qquad (14)$$

In effect, the formula places an analysis point at every step of $1/N$ in cumulative energy. To understand why this is a valid approximation, consider the following. Define the Toeplitz matrices $T_k = V^H D_k V$, where the diagonal matrix $D_k$ retains only the $k$th diagonal element of $D$ and is otherwise zero. It follows that $T = \sum_k T_k$. It also follows that the Fourier spectrum corresponding to the sequence defining $T_k$ is a peak at the frequency $\omega_k$. Where peaks are dense, their energies accumulate, and where peaks are sparse, there is less energy. Any filter can thus be approximated in this domain by moving the $x_k$ along the unit circle. Note that since (14) can be readily implemented by calculating the cumulative sum of the discrete Fourier transform, this approach can be computationally highly effective. In comparison, Case 1 requires the estimation of the accurate Vandermonde factorization, which involves finding the roots of an order-$N$ polynomial, a computationally intensive operation.

In the opposite direction, Case 3, the objective is to calculate $C$ from $V$ such that (10) holds, that is, $C^H C = V^H D V = T$. Clearly, if only $V$ is given and no $D$, then we can replace $D$ by an identity matrix. In any case, $T$ is Hermitian Toeplitz, and positive definite if $D$ is positive definite. We can then use the normal equations of linear prediction [14] to obtain a predictor $A(z)$. The impulse response of $1/A(z)$ can then be used as a filter to form the convolution matrix $C$. To avoid filtering with the infinite-length impulse response of $1/A(z)$, we can apply $1/A(z)$ directly as an IIR filter. However, filtering the vector $x$ with an order-$N$ IIR filter is still as complex as multiplication by $V$, whereby this form provides no benefit in complexity. Case 4, on the other hand, provides an approximation that can be used to decrease complexity. As in Case 3, we form the Toeplitz matrix $T = V^H D V$, but instead of the full $N \times N$ matrix, we consider only an $(m+1) \times (m+1)$ principal sub-matrix. Solving the normal equations for this sub-matrix again provides an IIR filter $1/A(z)$ that can be used to filter $x$. Since the filter is of order $m \ll N$, filtering can be achieved with much less complexity than multiplication by $V$. The accuracy of the approximation naturally depends on the characteristics of $T$. Generally, if the points $x_k$ are sufficiently evenly distributed over the unit circle, then it is easy to find good approximations.

IV. APPLICATION EXAMPLES

A. ACELP—Filtering to Frequency Warping

As an example of the basic principles of how frequency warping can be used instead of filtering, let us study algebraic code excited linear prediction (ACELP). The ACELP paradigm is currently the most frequently used approach for speech coding


and it is employed in standards such as AMR-WB, G.718 and MPEG USAC [2], [3], [15]. The parameters of ACELP are optimized in a perceptual domain where the perceptual model is represented by a filter. The optimization problem is of the form

$$\min_{\hat{x}} \|C(x - \hat{x})\|^2, \qquad (15)$$

where $C$ is a convolution matrix, $x$ the target signal and $\hat{x}$ the quantized signal. For simplicity, this formulation deviates slightly from convention, the implications of which are described in [16], [17].

Quantization of $\hat{x}$ presents one of the main computational challenges of ACELP, since in (15) the samples of $\hat{x}$ are correlated with each other and can thus not be independently quantized. It follows that the only possibility is to try a large number of different quantizations to find the one which minimizes (15). This process is also known as the analysis-by-synthesis approach, since each potential quantization has to be synthesised by filtering with $C$ in order to find the best quantization. Quantization of $\hat{x}$ is therefore computationally a very complex process.

An alternative would be to quantize the filtered signal $Cx$. However, $C$ is an oversampling transform, which means that the number of samples in $Cx$ is larger than in $x$. It follows that quantization of $Cx$ will be inefficient in terms of entropy, although the samples are uncorrelated in terms of the Euclidean norm of (15). Using (10) we can, however, replace the optimization problem of (15) by

$$\min_{\hat{x}} \|D^{1/2} V (x - \hat{x})\|^2. \qquad (16)$$

The balanced Vandermonde matrix $V$ represents here a warped Fourier transform and the diagonal matrix $D^{1/2}$ corresponds to a weighting of the frequency components. In terms of the objective function, it is then equivalent to evaluate the Euclidean norm of the filtered signal $C(x - \hat{x})$ or of the weighted, warped Fourier transform $D^{1/2}V(x - \hat{x})$. Instead of quantizing $x$ or $Cx$, we can thus consider quantizing samples of $Vx$ or $D^{1/2}Vx$. Since $V$ is a transform with critical sampling, it follows that quantization does not suffer from inefficiency due to oversampling.
Note, however, that the distribution of the error will be different depending on whether $Vx$ or $D^{1/2}Vx$ is quantized, and further study is required to determine whether that has a perceptual effect. Still, the warped signal provides a domain where samples can be independently quantized, such that the cross-products of samples have no effect on the total quantization noise energy. It follows that the computationally complex analysis-by-synthesis approach can be replaced by simple quantization. Note that any other factorization of the Toeplitz matrix $T$, such as the eigenvalue or Cholesky decompositions, could be used equally well to orthogonalize the objective function. The benefit of the Vandermonde factorization is, however, that it bears a physical interpretation as a description of the frequency content of the signal. It follows that this domain has a straightforward perceptual interpretation.
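As the text notes, any orthogonalizing factorization of $T = C^H C$, for example a Cholesky decomposition, removes the cross-terms from the objective (15). A minimal Python sketch, with an arbitrary short FIR filter and a simple uniform rounding in place of a real codebook, confirming that the error energy in the filtered domain is a plain sum of squares in the transform domain:

```python
import numpy as np
from scipy.linalg import cholesky, toeplitz

rng = np.random.default_rng(2)
N, m = 12, 3

# Convolution matrix C of a short FIR filter c, and T = C^T C.
c = rng.standard_normal(m + 1)
C = np.zeros((N + m, N))
for n in range(N):
    C[n:n + m + 1, n] = c
T = C.T @ C
assert np.allclose(T, toeplitz(T[:, 0]))   # T is symmetric Toeplitz

# Orthogonalize with a Cholesky factor T = L L^T.
L = cholesky(T, lower=True)

x = rng.standard_normal(N)
xq = np.round(x * 4) / 4                    # a crude quantized version of x
e = x - xq

# Error energy in the filtered domain equals a plain sum of squares
# in the transform domain y = L^T x: no cross-terms between samples.
assert np.allclose(np.linalg.norm(C @ e) ** 2, np.sum((L.T @ e) ** 2))
```

Replacing the Cholesky factor by $D^{1/2}V$ from the balanced Vandermonde factorization gives the same orthogonalization with the frequency-domain interpretation described above.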

Fig. 1. Illustration of Euclidean evaluation on a Bark-scale. (a) The autocorrelation corresponding to Bark-scale and (b) its Fourier spectrum. The autocorrelation obtained by straightforward evaluation of the Bark-scale is depicted by a black line, the interpolated Bark-scale by the dashed blue line, and two different linear predictive models by the red and green lines.

B. Perceptual Models—Frequency Warping to Filtering

Perceptual frequency scales, such as the Mel- and Bark-scales, describe auditory signals on a scale where the frequency steps are perceptually equal. For example, the Bark-scale corresponds to the frequency bandwidth of the auditory filter and thus relates to the perceptual resolution [1], [18]. In speech and audio coding applications, it can clearly be useful to evaluate coding performance in a domain whose resolution corresponds to perceptual resolution. The Bark-scale is thus an attractive warped frequency scale.

For a given sampling frequency, we can determine the corresponding Bark-scale frequencies $\omega_k$. The corresponding Vandermonde matrix can be formed by setting the analysis points to $x_k = e^{i\omega_k}$ and their complex conjugates. The autocorrelation matrix is obtained by $T = V^H V$, and its first column is illustrated in Fig. 1(a), where the black line depicts the autocorrelation obtained this way. In the Fourier spectrum, Fig. 1(b), we can clearly see the analysis points, for example at 4800, 5800, 7000 and 8500 Hz. In other words, the spectrum is sampled at these points, but the bandwidth of the analysis points is too narrow to capture the space between them.

By interpolating frequencies between the analysis points $\omega_k$, for example by up-sampling by a factor of 32 and low-pass filtering, we obtain the autocorrelation depicted by the dashed blue line. Clearly, the spectrum is now much more well-behaved, and the noise appearing at the end of the autocorrelation in Fig. 1(a) is also gone. It seems reasonable to assume that the spectral characteristics of perception are continuous, whereby this interpolated model is likely to correspond more accurately to human perception.

To obtain a filter corresponding to the Bark-scale, a first-order linear predictive (LP) filter was calculated from the interpolated autocorrelation and it is depicted by the red dotted line and


crosses. Since the LP filter models only the first two autocorrelation points accurately, it remains suboptimal at the higher lags. To get an overall better match, we manually tuned a second filter to match the overall shape of the autocorrelation. This manually tuned filter is depicted by the green dash-dot line. These LP filters can then be used in (11) to evaluate the energy of a signal on a Bark-scale.

We can thus see that the sampling points $\omega_k$ must be sufficiently dense that an accurate autocorrelation can be formed. As a rule of thumb, the largest distance between neighbouring points should be at most as large as the corresponding linear sampling interval. In other words, the warped domain must have more analysis points than the linear domain. In our example, however, the LP model was a first-order filter with two taps, whereby the oversampling ratio of our example was well above what is necessary.

Concluding, it is clear that the evaluation of signal energy on the Bark-scale can be readily approximated by determining the energy of a filtered signal. By avoiding the computationally complex frequency warp, this approximation can provide a significant advantage. In other words, a codec which models the signal efficiently in the linear domain can evaluate the quantization error energy on a warped scale, without the computationally complex, explicit evaluation of the warped signal, but with a simple first-order filter in the time domain.

C. Time-Warped MDCT—Time Warping to Frequency Domain Filtering

As a final example, consider the case of estimating the SNR of a signal modelled in a time-warped domain but evaluated in the linear domain. Specifically, consider coding of speech signals with a frequency-domain codec. Speech signals have a rapidly changing fundamental frequency, which distorts the frequency-domain representation. However, by time-warping the signal prior to the time-frequency transform, the harmonic structure of voiced signals can be retained [19].
Let $V$ be the warped Fourier transform corresponding to the transform from the time-warped frequency domain to the linear time domain. If we want to evaluate the energy of the time-warped frequency-domain error signal $e$ in linear time, we have to evaluate $\|Ve\|^2$. In other words, by applying an appropriately approximated convolution matrix $C$, we can evaluate the signal-to-noise ratio of the frequency-domain signal $x$ as

$$\mathrm{SNR} = \frac{\|Vx\|^2}{\|Ve\|^2} \approx \frac{\|Cx\|^2}{\|Ce\|^2}. \qquad (17)$$

If $C$ is approximated by a simple filter, then significant gains in computational complexity can be achieved by replacing $V$ by $C$. In a speech and audio coding application, time-warping is usually a tool which is used only if it improves the SNR. If the SNR does not improve with time-warping, then non-warped coding will be used. Moreover, if the SNR can be estimated without a time-warp operation, then the choice between warped and non-warped coding can be made without an explicit warping operation, and the warping operation is needed only when the option of time-warping is chosen. Since time-warping is a computationally complex operation, it can in this way be avoided in many cases, and the overall complexity is significantly reduced.
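The low-order filter approximation underlying this SNR estimate (Case 4 of Section III-C) rests on the normal equations of linear prediction: from the first $m+1$ autocorrelation lags of $T = V^H V$, we obtain a predictor $A(z)$, and the all-pole model $\sigma^2/|A(z)|^2$ matches those lags exactly. The following Python/SciPy sketch verifies this matching property through a long impulse response of $1/A(z)$; the warping points and the model order are arbitrary illustrative choices.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter

N, m = 64, 8

# Warping points: mildly uneven spacing on the unit circle.
w = 2 * np.pi * (np.arange(N) + 0.5) / N \
    + 0.3 * np.sin(2 * np.pi * np.arange(N) / N)
x = np.exp(1j * w)

# First m+1 autocorrelation lags of T = V^H V (normalized by N).
r = np.array([np.sum(x ** (-k)) for k in range(m + 1)]).real / N

# Normal equations of linear prediction: toeplitz(r[:m]) a = -[r_1 ... r_m]^T.
a = solve_toeplitz(r[:m], -r[1:m + 1])
A = np.concatenate([[1.0], a])         # A(z) = 1 + a_1 z^-1 + ... + a_m z^-m
sigma2 = r[0] + a @ r[1:m + 1]         # prediction error variance
assert sigma2 > 0

# Autocorrelation of the all-pole model sigma^2 / |A(z)|^2, computed from a
# long (truncated) impulse response of the IIR filter 1/A(z), matches r.
h = lfilter([1.0], A, np.r_[1.0, np.zeros(4095)])
rh = np.array([h[: len(h) - k] @ h[k:] for k in range(m + 1)]) * sigma2
assert np.allclose(rh, r, atol=1e-6)
```

The short filter $1/A(z)$ can then stand in for the full warped transform when only error energies, and hence the SNR, are needed.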


V. DISCUSSION AND CONCLUSION

This article presents a Vandermonde factorization of Toeplitz matrices, similar to the well-known factorization of Hankel matrices [6]. The presented factorization can be applied to give a relation between the Euclidean distance of filtered signals and the Euclidean distance of frequency-warped and scaled signals. By an approximation, the relation can be shown to apply also without scaling, whereby the density of the spectral lines comes to correspond to the energy of the corresponding filter. In retrospect, this result is not surprising; analyzing a signal at frequencies close to each other gives, in terms of the norm, more weight to that area, and the same weighting can be performed by filtering the time-signal.

Observe that this result has already been implicitly used in the coding of linear prediction coefficients with line spectral frequencies (LSF) [10]. There, a linear predictive filter is converted to an ordered set of spectral lines, whose positions uniquely encode the information of that filter. The positions of the spectral lines exactly correspond to the analysis points of the Vandermonde matrix which factorizes the autocorrelation matrix corresponding to the filter. In contrast, however, LSFs encode the information through two sets of frequency lines but omit the information embedded in the diagonal matrix $D$. Due to the symmetric relationship of the two sets of line frequencies, the information can be retained even without the diagonal matrix.

Windowing is the time-frequency dual of filtering, whereby a natural question is whether the presented results apply also to windowing functions. It should be noted, however, that in discrete time, windowing in the time domain corresponds to circular filtering in the frequency domain. Circulant matrices already have a well-known decomposition employing Fourier matrices, whereby this case is not interesting from the current perspective. For band-limited signals it is, however, possible to employ relationships between (non-circular) filtering and windowing, such that the Vandermonde factorization can again be employed. Still, detailed analysis of this topic is beyond the scope of the current work.

The presented results can be applied to a wide array of topics within signal processing. For example:
i. Speech coding by the ACELP algorithm employs the norm of a filtered signal in its objective function. Applying a convolution matrix on a vector introduces a correlation between samples such that they cannot be quantized independently. By quantizing the frequency-warped signal instead, the samples become independent, whereby the complexity of the quantization can be significantly reduced.
ii. Perceptual analysis of signals often employs warped frequency scales. Using the results of the current work, the energy of such warped signals can be calculated without explicit warping, by filtering only. Since re-sampling and warped time-frequency transforms are both computationally complex operations, replacing them by filtering can significantly reduce complexity.
iii. Signals with a continuously varying pitch, such as speech, are difficult to analyze or encode with methods which assume local stationarity. Time-warping of such signals alleviates the problem but introduces computational complexity. Through the results of the current paper, such complexity can be reduced, since the SNR can be estimated without explicit calculation of the warped signal, using filtering only.


These applications represent the most important combinations of the filtering and warping results of this paper. In fact, the presented results demonstrate that the commonly used perceptual models, which take the forms of warping, frequency weighting and time-domain filtering, all have an analytic connection to each other. We have not, however, taken a position on whether perceptual theory has to be adjusted accordingly, but leave that for further study. Simultaneously, the presented applications demonstrate the wide variety of applications where the results can be used. Most important, however, are the theoretical implications of this work. Namely, with respect to the Euclidean norm, filtering is equivalent to an appropriately re-sampled and scaled frequency representation of the signal.

APPENDIX
PROOF OF VANDERMONDE FACTORIZATION

Theorem 1: Every positive definite Hermitian Toeplitz matrix $T$ can be factorized as $T = V^H D V$, where $V$ is a balanced Vandermonde matrix and $D$ is a positive definite diagonal matrix.

Proof: Let $T$ be a positive definite, Hermitian Toeplitz matrix. Choose a starting point $x_1$ with $|x_1| = 1$, define the vector $b$ with elements $b_k = (x_1^*)^{k-1}$, and let $a$ be the solution to

$$Ta = b. \qquad (18)$$

The solution exists because $T$ is positive definite. The polynomial corresponding to $a$, in a slight abuse of notation, can be factorized in terms of its zeros $x_2, \dots, x_N$ as

$$A(z) = \sum_{k=1}^{N} a_k z^{k-1} = a_N \prod_{k=2}^{N} (z - x_k). \qquad (19)$$

Since Hermitian Toeplitz matrices are persymmetric, $J T^* J = T$, where $J$ is the exchange matrix, it follows that $J (T^{-1})^* J = T^{-1}$. Moreover, $J b^* = x_1^{N-1} b$, since $|x_1| = 1$. We therefore have

$$J a^* = J (T^{-1})^* J \, (J b^*) = x_1^{N-1} \, T^{-1} b = x_1^{N-1} a. \qquad (20)$$

In other words, the coefficients of $A(z)$ are conjugate-symmetric up to a unimodular factor, whereby $A(z)$ is self-inversive: if $\zeta$ is a zero, then so is $1/\zeta^*$. From [10] we know that, for positive definite $T$, the zeros are moreover exactly on the unit circle and always distinct. In other words, if $x_1$ is on the unit circle, then all $x_k$ are also on the unit circle.

Now define the balanced Vandermonde matrix $V$ from the starting point $x_1$ and the zeros $x_2, \dots, x_N$; it is full rank because the $x_k$ are distinct. The columns $\ell_k$ of $V^{-1}$ are the coefficient vectors of the Lagrange polynomials

$$L_k(z) = \prod_{j \neq k} \frac{z - x_j}{x_k - x_j}, \qquad L_j(x_k) = \delta_{jk}, \qquad (21)$$

that is, each $L_k(z)$ is, up to scaling, the polynomial $\prod_j (z - x_j)$ with the zero $x_k$ removed. In particular, $L_1(z)$ is proportional to $A(z)$, whereby, by (18), $T \ell_1$ is proportional to $b$, the exponential vector of the point $x_1$. Removing one zero at a time and repeating the steps (18)–(20) with each $x_k$ in place of $x_1$ shows, in the same manner, that

$$T \ell_k = \mu_k b^{(k)} \qquad (22)$$

for some scalars $\mu_k$, where $b^{(k)}$ is the vector with elements $(x_k^*)^{m-1}$. It follows that

$$\left( V^{-H} T V^{-1} \right)_{jk} = \ell_j^H T \ell_k = \mu_k \, \ell_j^H b^{(k)} = \mu_k \left( L_j(x_k) \right)^* = \mu_k \, \delta_{jk}, \qquad (23)$$

whereby $D = V^{-H} T V^{-1}$ is diagonal and $T = V^H D V$. Since $T$ is positive definite and $V$ is full rank, the diagonal elements $d_k = \mu_k$ are real and positive. This proves that (6) can be developed for any positive definite, Hermitian Toeplitz matrix. The opposite direction, that the product $V^H D V$ is always Toeplitz, can be readily shown by writing out each element, $(V^H D V)_{mn} = \sum_k d_k x_k^{n-m}$, which depends only on the difference $n - m$.

ACKNOWLEDGMENT

The author would like to thank Professor Peter Stoica for providing useful last-minute comments when the article was already in pre-print.

REFERENCES

[1] E. Zwicker, "Subdivision of the audible frequency range into critical bands (Frequenzgruppen)," J. Acoust. Soc. Amer., vol. 33, p. 248, 1961.
[2] Adaptive Multi-Rate (AMR-WB) Speech Codec, 3GPP TS 26.190 V7.0.0, 2007.
[3] Frame Error Robust Narrow-Band and Wideband Embedded Variable Bit-Rate Coding of Speech and Audio From 8–32 kbit/s, ITU-T G.718, 2008.
[4] M. Neuendorf, P. Gournay, M. Multrus, J. Lecomte, B. Bessette, R. Geiger, S. Bayer, G. Fuchs, J. Hilpert, N. Rettelbach et al., "Unified speech and audio coding scheme for high quality at low bitrates," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP 2009), 2009, pp. 1–4.
[5] M. Bosi and R. E. Goldberg, Introduction to Digital Audio Coding and Standards. Dordrecht, The Netherlands: Kluwer Academic Publishers, 2003.
[6] D. Boley, F. Luk, and D. Vandevoorde, "Vandermonde factorization of a Hankel matrix," in Scientific Computing, 1997, pp. 27–39.
[7] P. Stoica and R. L. Moses, Spectral Analysis of Signals. Upper Saddle River, NJ, USA: Pearson/Prentice Hall, 2005.
[8] G. H. Golub and C. F. van Loan, Matrix Computations, 3rd ed. Baltimore, MD, USA: Johns Hopkins Univ. Press, 1996.


[9] D. Boley, F. Luk, and D. Vandevoorde, "A fast method to diagonalize a Hankel matrix," Linear Algebra and Its Applications, vol. 284, no. 1, pp. 41–52, 1998.
[10] T. Bäckström and C. Magi, "Properties of line spectrum pair polynomials—A review," Signal Process., vol. 86, no. 11, pp. 3286–3298, Nov. 2006.
[11] L. Turner, "Inverse of the Vandermonde matrix with applications," NASA Tech. Note TN D-3547, 1966.
[12] A. Klinger, "The Vandermonde matrix," The American Mathematical Monthly, vol. 74, no. 5, pp. 571–574, 1967.
[13] M. Stewart, "A superfast Toeplitz solver with improved numerical stability," SIAM J. Matrix Anal. Appl., vol. 25, no. 3, pp. 669–693, 2003.
[14] J. Makhoul, "Linear prediction: A tutorial review," Proc. IEEE, vol. 63, no. 4, pp. 561–580, Apr. 1975.
[15] M. Neuendorf, P. Gournay, M. Multrus, J. Lecomte, B. Bessette, R. Geiger, S. Bayer, G. Fuchs, J. Hilpert, N. Rettelbach, F. Nagel, J. Robilliard, R. Salami, G. Schuller, R. Lefebvre, and B. Grill, "A novel scheme for low bitrate unified speech and audio coding—MPEG RM0," in Proc. AES 126th Convention, Munich, Germany, May 7–10, 2009.
[16] T. Bäckström, "Computationally efficient objective function for algebraic codebook optimization in ACELP," presented at Interspeech, Aug. 2013.
[17] T. Bäckström, "Comparison of windowing in speech and audio coding," in Proc. IEEE Workshop Appl. Signal Process. Audio and Acoust. (WASPAA 2013), New Paltz, NY, USA, Oct. 2013.
[18] H. Fastl and E. Zwicker, Psychoacoustics: Facts and Models, vol. 22. Berlin, Germany: Springer-Verlag, 2006.
[19] B. Edler, S. Disch, S. Bayer, G. Fuchs, and R. Geiger, "A time-warped MDCT approach to speech transform coding," in Proc. AES 126th Convention, Munich, Germany, May 2009.

Tom Bäckström (S'02–M'04) is professor of Speech Coding at the International Audio Laboratories Erlangen (AudioLabs), which is a joint research unit of the Friedrich-Alexander University Erlangen-Nürnberg (FAU) and the Fraunhofer Institute of Integrated Circuits IIS, Erlangen, Germany. He received his M.Sc. and D.Sc. (tech.) degrees from Helsinki University of Technology, Finland, in 2001 and 2004, respectively, and continued there as a post-doc until 2008. Since then he has worked at the AudioLabs, first as a researcher and from 2013 as a professor. His research interests include speech and audio coding and perception, speech production, time-frequency analysis, and matrix and polynomial algebra.
