Transparent Differential Coding for High-Resolution Digital Audio*

ENGINEERING REPORTS

M. J. HAWKSFORD, AES Fellow

Centre for Audio Research and Engineering, University of Essex, UK CO4 3SQ

A coding method using cascaded stages of exact differentiation with equalization is presented as an alternative to sigma-delta modulation (SDM). Unlike SDM, the model is inherently linear and can operate losslessly together with an exceptionally wide audio bandwidth. Bit rates are competitive, and signal processing can be performed between successive stages of coding. Applications include future ultrahigh-capacity DVD, and the method bridges the debate between DSD and LPCM.

0 INTRODUCTION

A method of using cascaded stages of differential coding together with noise shaping and equalization is described as an alternative format to linear pulse code modulation (LPCM) and to sigma-delta modulation (SDM). LPCM is the basis code that underpins most digital audio systems and is now incorporated into the newly established DVD-audio standard as proposed by the DVD consortium WG-4. The advantages of LPCM are well established and can be summarized by observing that the only fundamental distortions are brickwall band limitation resulting from uniform sampling together with additive noise introduced because of quantization and dither. In a system that is correctly implemented and functioning, neither of these processes results in correlated distortion.

It has also become topical to consider SDM as a means of transporting or storing digital audio, partly because of the widespread use of 1-bit converters for both analog-to-digital conversion (ADC) and digital-to-analog conversion (DAC) [1]. The argument in favor says that with simple linkage of ADC and DAC, the absence of decimation and oversampling filters eliminates processing-related errors, offers a wider bandwidth, and therefore allows greater system transparency. The counterargument recognizes that this is only valid for 1-bit converters and is therefore limited in appeal and application. ADC and DAC technology now embraces oversampling and noise shaping in conjunction with low-resolution uniform quantizers with randomization methods to decorrelate residual hardware-induced errors [2], offering performance advantages over the 1-bit converter. Multibit converters are theoretically linear devices as long as certain conditions are met. Provided a nonsaturating, uniform quantizer is incorporated and correctly formed dither precedes the quantizer, linearity is theoretically achievable, commensurate with proper supporting processing architectures. In contrast, there is no equivalent case for SDM where, because of the saturating nature of the 1-bit quantizer, there is no known theorem that guarantees linear operation. However, in presenting this argument it is recognized that with dither and advanced loop design, low distortion is possible [3]. Some observations about the range of linear operation of SDM are made in Section 1.

The quest for more signal bandwidth, combined with a welcome reduction in processing requirements, is at the heart of the DVD-audio specification, where against all technical and political odds, a maximum sampling rate of 192 kHz has been adopted. Even this high sampling rate is overshadowed by SDM, whose bandwidth extends to one-half the serial bit rate, albeit with the presence of gross quantization noise. Once the sampling rate of a system is sufficiently high, the option of using noise shaping to reduce the sample resolution is attractive [4]-[7], where, provided the uniform requantizer is not overloaded, it is possible to retain linear operation.

The work presented here takes a hybrid approach. It encompasses the advantages of using high sampling rates with noise shaping but applies differential coding to achieve greater efficiency. It will be shown that a performance exceeding SDM can be obtained at commensurate bit rates, but with the linearity advantage of LPCM and with better overload performance.

* Presented at the 107th Convention of the Audio Engineering Society, New York, 1999 September 24-27; revised 2001 April 27. J. Audio Eng. Soc., Vol. 49, No. 6, 2001 June.

1 COMPARISONS OF DM AND SDM AND OBSERVATIONS ON LINEARITY OF 1-BIT SYSTEMS

The concept of the 1-bit coder was termed delta modulation (DM) [8] and precedes the wide adoption of LPCM. In its simplest form it consists of a comparator with a negative feedback loop incorporating an integrator in the feedback path, as shown in Fig. 1. The classical transformation of this topology from DM to SDM was first presented by Inose and Yasuda in 1962/1963 [9], [10] and is shown in evolutionary form in Fig. 2. This transformation is reproduced because the DM form has a closer relationship to LPCM. For example, it is possible to mimic first-order DM operation using an open-loop model with slew rate constraints applied to enable the slope overload condition to be incorporated. Slope overload is entered where the output code is a sequence of all 1's or all 0's and the error signal in the first-order loop exceeds 1 quantum. Since this model operates using a uniform quantizer, then, provided the slope overload threshold is not exceeded, dither can be added and linear operation inferred as per LPCM. Earlier work has also shown that in the non-slope-overload condition, DM is equivalent to time-quantized phase modulation [11], [12]. Consequently coding linearity in terms of a 1-bit coder is defined where the reconstructed signal can at some stage be configured as LPCM and where during quantization appropriate dither is applied and the signal is constrained so that no clipping or slope-limiting distortion of any form occurs.

Fig. 1. First-order (single-integrator) delta modulator (DM).

Fig. 2. Reconfiguration of DM to SDM.

To illustrate this definition, Fig. 3 shows an open-loop DM that includes slope-limitation circuitry [13]. In essence, it is two interleaved flash converters consisting of a bank of comparators and D-type bistables interleaved on alternate clock cycles to mimic the quantization of first-order DM. However, the D-type bistables are also connected vertically to act as an up-down thermometer-style shift register such that the upward and downward progression of 1 pulses is constrained by the clock rate. This implements exactly the slope-overload condition of a first-order DM. The multibit, multilevel output code of the vertical register is then logically differentiated to form the binary output sequence. The two interleaved quantizers are shown explicitly in Fig. 4(a), although in this configuration the slope-overload circuitry has been omitted. However, by introducing a half-quantum offset on alternate samples [12] the same operation, hence idle channel performance, can be achieved using a single quantizer, as shown in Fig. 4(b). In Fig. 4(a) a positive and a negative ramp are superimposed onto the inputs of the respective quantizers, which then simulates the idle channel behavior of first-order DM. However, provided the input signal with dither is constrained such that the differentiated quantizer output sequence does not exceed the limits +1 or -1, the coder is linear within the coding envelope of DM. If on the other hand the differentiated output sequence exceeds this limit, then DM would demand slope limitation, and this would imply an additional error, which would not be bounded by the error of a linear quantizer. It is this mechanism that is at the core of identifying the nonlinear behavior of DM and by inference that of SDM.

Fig. 3. Open-loop, first-order DM with slope overload circuitry.

SDM and DM can normally accommodate a second integrator to improve the coding performance, although more than two integrators require careful loop design to ensure stability, which is a direct consequence of slope overload constraining the output. Interestingly, when the models of either Fig. 3 or Fig. 4 include integration in a feedback loop [13], this apparently first-order loop is actually that of second-order DM or SDM. Fig. 5 shows the additional integrator in the forward path, whereas Fig. 6 replaces the open-loop DM structure with an equivalent first-order feedback loop. Finally, in Fig. 7 this is reconfigured to form classic second-order SDM, where the thread of equivalence should be observed, although this latter form includes slope-overload limiting.

Consequently, the linear regime of a 1-bit coder can be determined by considering the equivalent model that embeds one or more uniform quantizers and limits the input excitation such that the modulus of the differential of the output does not exceed unity. This applies for both the single- and the double-integration model. If the slope-overload condition is removed, allowing the bound on the signal differential to be relaxed, then provided that dither is applied correctly at the input to each quantizer, linearity can be inferred. Eliminating the slope-overload condition also allows higher order noise shaping to be applied. It is here that a combination of multilevel code and noise shaping offers a fundamental advantage over systems using only a 1-bit code as it enables linear encoding.

2 OVERSAMPLING, NOISE SHAPING, AND EQUALIZATION

The application of noise shaping has been researched in depth for a range of applications that include ADC, DAC, and PWM together with signal requantization as part of a more general signal processing architecture. Provided a signal processor includes uniform quantization with optimal dither, then an exchange between amplitude resolution and sample rate can be made [4]-[7]. Also, by including complementary pre- and deemphasis equalization the relationship between noise floor and amplitude clipping can be modified.

A front-end encoder with a complementary decoder is shown in Fig. 8. It uses an equalizer cascaded with a kth-order noise shaper. We assume here that the source information is encoded with LPCM and that the sampling rate is $8f_s$. The aim is to use a sufficiently high sampling rate such that most of the bandwidth advantage claimed of SDM is achieved, but with the additional advantage of linearity implied by using multilevel uniform quantization with dither. In its basic form the recovered output is derived using a complementary deemphasis filter in cascade with the noise shaper output.
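The exchange between amplitude resolution and sample rate can be illustrated with a deliberately simple requantizer. The sketch below is not the kth-order shaper of the paper: it assumes first-order error feedback, giving a noise transfer function of (1 - z^-1), a non-saturating uniform quantizer, and TPDF dither spanning plus or minus one step. The function name, the 16-quantum step, the 1-kHz test tone, and the 384-kHz rate are illustrative assumptions.

```python
import math
import random

def noise_shape_first_order(x, step, rng):
    """Requantize x to multiples of `step` with first-order error feedback.

    Assumed structure (not from the paper): uniform non-saturating quantizer,
    TPDF dither of +/- one step, output y[n] = x[n] + e[n] - e[n-1].
    """
    y, err = [], 0.0
    for sample in x:
        wanted = sample - err                       # subtract previous total error
        dither = (rng.random() - 0.5 + rng.random() - 0.5) * step
        q = step * math.floor((wanted + dither) / step + 0.5)  # uniform quantizer
        err = q - wanted                            # error fed back on next sample
        y.append(q)
    return y

rng = random.Random(0)                              # fixed seed for repeatability
fs = 8 * 48000                                      # assumed oversampled rate, Hz
x = [1000.0 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(4096)]
y = noise_shape_first_order(x, step=16.0, rng=rng)

# The accumulated error telescopes to the final error term, so it stays
# bounded by half a step plus the peak dither however long the sequence is.
residual = abs(sum(y) - sum(x))
```

Because the error is differenced, its running sum never grows: the coarse output tracks the low-frequency content of the input while the quantization noise is pushed toward high frequencies. Higher order shaping follows the same principle with a longer feedback filter.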

Fig. 4. Open-loop, first-order DM without slope overload circuitry. (a) Two interleaved uniform quantizers. (b) Single uniform quantizer.

Fig. 5. Open-loop DM enclosed with single-integrator feedback.

Fig. 6. Equivalent second-order DM (integrator in forward path).

Fig. 7. Classic second-order DM.

The aim of this processor is to mimic as closely as possible the performance of a 24-bit LPCM system, except for a relaxation in the high-frequency overload margin to match the characteristic of real-world audio signals. As such, the noise spectrum over the 0- to 24-kHz band should be comparable with (or better than) the noise floor of 24-bit LPCM, although this can be relaxed in the ultrasonic region. A direct approach is to cascade an equalizer and a noise shaper, as shown in Fig. 8. The output data of the equalizer need only be truncated to a word length compatible with the noise shaper input word length (such as 32 bit), so with proper design minimal compromise is implied. However, using the structure of Fig. 9(a), the pre- and deemphasis networks are both driven by the same quantized signal, whereby quantization can be incorporated into the encoder loop together with its own dither signal. By synchronizing encoder-decoder dither (or using no dither), the same signal can be recovered at the output of the side chain. Complementary equalization is achieved using the classic feedback-feedforward topology, which is conceptually similar to that first used in the Dolby-A noise-reduction system, although this was an analog realization.

To investigate the effect of the two quantization processes within this system, forward and reverse noise sources $q_f$ and $q_r$ are shown in the additive noise model of Fig. 9(b). An additive noise model is supported because both quantizers are uniform (although different) and optimal dither is assumed. However, a difficulty encountered with this technique is possible additional constraints on stability, especially when high-order noise shapers are used. As such the cascaded equalizer may prove more tractable and has been employed in the simulations presented in Section 5.

A characteristic of the noise shaper topology shown in Figs. 8 and 9 is that the signal transfer function is unity. This is achieved by including a feedforward path directly to the input of the quantizer and also delaying the main input by one sample period in order to compensate for the unit sample delay required in the feedback path. This process is demonstrated in the following analysis together with complementary equalization. The intermediate sequence $V_{int}(z)$ can be expressed in terms of the input sequence $V_{in}(z)$ and noise sources $q_f$ and $q_r$ as

$$V_{int}(z) = \frac{1}{1 + z^{-1}E(z)}\left[V_{in}(z) + \frac{q_f}{1 + z^{-1}H(z)} - z^{-1}q_r\right] \tag{1}$$

Fig. 8. Cascade of preemphasis, noise shaping, and deemphasis.

Fig. 9. Conceptual system combining equalization and noise shaping together with noise model. (a) System structure. (b) Additive noise model.

while the recovered output $V_{out}(z)$ is

$$V_{out}(z) = V_{int}(z)\left[1 + z^{-1}E(z)\right] + z^{-1}q_r. \tag{2}$$

Hence substituting for $V_{int}(z)$, the overall input-output function is

$$V_{out}(z) = V_{in}(z) + \frac{q_f}{1 + z^{-1}H(z)}. \tag{3}$$

Provided the same error signal $q_r$ appears in both the encoder and the decoder, then the final output is independent of the truncation noise in the side chain. This suggests that dither is not required in the truncation of the side chain output, removing the need for dither synchronization. Eq. (3) confirms that the overall signal transfer function is unity, whereas the noise shaping transfer function is $[1 + z^{-1}H(z)]^{-1}$. However, inspection of the intermediate signal shows that $E(z)$ performs preemphasis given by the function $[1 + z^{-1}E(z)]^{-1}$, which has a direct effect on overload performance. Overload can be specified by placing a frequency-domain bound on the intermediate signal $V_{int}(f)$ weighted by a function $W(f)$, such that

$$|V_{int}(f)|\,|W(f)| \le \lambda \tag{4}$$

where $\lambda$ is a constant. However, to determine $|W(f)|$, the transfer function of the intermediate coding stage must be determined, which depends directly on the number of cascaded differentiators [see Eq. (7)]. Fig. 10 shows a possible method of implementation for the filters $H(z)$ and $E(z)$.
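The substitution leading to Eq. (3) can be checked numerically at a single point on the unit circle. The sketch below assumes the forms of Eqs. (1) and (2) written in its docstrings; the values chosen for z, E(z), H(z), and the noise terms are arbitrary test numbers, not data from the paper, and the function names are hypothetical.

```python
import cmath

def v_int(v_in, q_f, q_r, z, E, H):
    """Eq. (1): Vint = [Vin + qf / (1 + z^-1 H) - z^-1 qr] / (1 + z^-1 E)."""
    zi = 1 / z
    return (v_in + q_f / (1 + zi * H) - zi * q_r) / (1 + zi * E)

def v_out(v_mid, q_r, z, E):
    """Eq. (2): Vout = Vint (1 + z^-1 E) + z^-1 qr."""
    zi = 1 / z
    return v_mid * (1 + zi * E) + zi * q_r

# Arbitrary complex test values (assumptions for illustration only).
z = cmath.exp(1j * 0.3)
E, H = 0.4 - 0.2j, -1.5 + 0.7j
v_in, q_f, q_r = 1.0 + 2.0j, 0.01 - 0.02j, 0.03 + 0.01j

recovered = v_out(v_int(v_in, q_f, q_r, z, E, H), q_r, z, E)
expected = v_in + q_f / (1 + H / z)                 # Eq. (3)
mismatch = abs(recovered - expected)

# Changing the side-chain noise qr should leave the recovered output unchanged.
other_qr = -0.5 + 0.25j
recovered_other = v_out(v_int(v_in, q_f, other_qr, z, E, H), other_qr, z, E)
qr_sensitivity = abs(recovered_other - recovered)
```

Both `mismatch` and `qr_sensitivity` should be at the level of floating-point rounding, which reflects the cancellation of $q_r$ and of the equalizer $E(z)$ in the overall transfer function.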

3 LOSSLESS DIFFERENTIAL CODING

Section 2 described a method that combined equalization and noise shaping. In this section greater efficiencies are explored by employing lossless multistage differential coding, where our investigation incorporates up to three stages together with overload correction. Differential encoding requires subsequent integration to recover the signal, so it is critical that there be no errors between encode and decode stages, including saturation or requantization. In practice a small amount of side channel information is required to reset the integrators in order to account for errors. Also, effective "ac coupling" should be employed in the decoder to eliminate long-term signal drift and to protect against startup transients. However, it will be shown in Section 4 that certain distortions are permissible, provided the area under the encoded sequence does not change over a given period of time and that errors in the higher order integral waveforms are accommodated.

Fig. 11 shows a two-stage lossless processor that uses complementary differentiation and integration, although additional stages of differentiation and integration can be added. The respective encoding and decoding z transforms are

$$T_{enc}(z) = (1 - z^{-1})^{2} \tag{5}$$

$$T_{dec}(z) = (1 - z^{-1})^{-2}. \tag{6}$$

From Eq. (5) the encoding magnitude frequency response $|EN(f)|$ is given as

$$|EN(f)| = 2\left(1 - \cos(2\pi f T)\right) \tag{7}$$

where $T$ is the sampling period.

As the coder differentiates the output of the noise shaper, there is a coding advantage in band-limiting the output of the noise shaper to minimize the rate of change. A simple process can average over two adjacent samples, and this option was included in the model. However, there is an interesting problem. As the signal is band-limited, it produces longer words at the output, so the difference between adjacent samples in terms of quanta may increase. A balance therefore has to be found.
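A minimal sketch of the two-stage scheme of Eqs. (5) and (6) is given below, assuming integer samples and zero initial state in every differentiator and integrator. The function names, the test data, and the 384-kHz rate used to evaluate Eq. (7) are illustrative assumptions.

```python
import cmath
import math
from itertools import accumulate

def differentiate(x, stages):
    """Apply (1 - z^-1) `stages` times, assuming zero initial state."""
    for _ in range(stages):
        x = [b - a for a, b in zip([0] + x[:-1], x)]
    return x

def integrate(d, stages):
    """Apply 1 / (1 - z^-1) `stages` times (running sums, zero initial state)."""
    for _ in range(stages):
        d = list(accumulate(d))
    return d

samples = [0, 3, 7, 12, 10, 4, -5, -9, -6, 0]       # arbitrary integer test data
encoded = differentiate(samples, stages=2)
decoded = integrate(encoded, stages=2)               # exact in integer arithmetic

def encoder_magnitude(f, T):
    """|(1 - z^-1)^2| evaluated on the unit circle, z = exp(j 2 pi f T)."""
    z_inv = cmath.exp(-2j * math.pi * f * T)
    return abs((1 - z_inv) ** 2)

T = 1 / 384000                                       # assumed sampling period, s
f = 20000.0
eq7 = 2 * (1 - math.cos(2 * math.pi * f * T))        # right-hand side of Eq. (7)
```

With integer data the decoded sequence equals the input exactly, which is the sense in which the coding is lossless. Any single altered value in `encoded` would instead propagate through both running sums, which is why the overload correction of Section 4 is needed.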

Fig. 11. Two-stage complementary differentiation and integration.

4 SOFT AMPLITUDE LIMITING USING TIME-DISPERSIVE CORRECTION

For the output bit rate of an N-stage differential encoder to be bounded, the output word length must be limited, implying amplitude clipping and error in the recovered signal. However, as decoding can include up to N = 3 cascaded integrators (see Section 3), any uncontrolled modification of the encoded signal can cause the recovered signal to diverge. Hence when clipping distortion occurs, pulse area must be conserved so that the integrated signal remains stable and converges to the required signal level. However, a simple control of the signal average, although necessary, is insufficient, as it is shown that errors in the integral waveforms that result from pulse dispersion must also be taken into account.

Introducing appropriately metered time dispersion into signal elements that experience overload can conserve pulse area. To demonstrate the correction process employed in the encoder, Fig. 12 illustrates the principle [14]. In the first waveform a pulse is shown to exceed the overload threshold where the error component is shaded. The first example uses a single backward pulse correction procedure where the excess pulse amplitude of sample n is transmuted and added to the sample n + 1. Consequently when pulses n and n + 1 are considered together, their total area is conserved. However, although the area under this curve is correct and results in the first integral converging to the correct value, there is a finite loss of area under the first integral. Consequently the second integral does not converge to the correct value (should two or more stages of differentiation-integration be used).

The second example shows a similar procedure, but here half the excess pulse amplitude is added to sample n + 1 whereas half is added to sample n - 1. This process yields a time-symmetric dispersion of the error, with the error remaining symmetrically centered on sample n, where the effect is similar to symmetrical slew-rate distortion. However, because pulses are amplitude quantized and must remain so when the error dispersion is added, dividing the overload error into two equal parts requires the error to be an even number of quanta. Consequently if the overload error is an odd number of quanta, then the error is increased artificially by one quantum prior to division. The waveforms shown in Fig. 12 reveal that there is no longer an error in the area under the first integral waveform, just a small redistribution of the waveform in time. However, extending to the second integral, as illustrated in Fig. 13, shows that although the correct amplitude is reconstructed after the third pulse, an error in the area under the curve remains, implying a convergence error in the third integral waveform. This requires additional processing to secure the accuracy of the third integral, which is relevant if three stages of differentiation-integration are used.

To correct for convergence error in the third integral, a symmetrical five-pulse substitution sequence is used, as shown in Fig. 14. In all the integral calculations that follow, the waveforms are integrated from the sequence beginning at sample n - 2 to the sequence ending at sample n + 2. For acceptable error correction, samples n + 2 and above in the third integral waveform must be identical to the third integral of the nonclipped waveform. The integrals of the five-pulse sequence at sample n + 2 are evaluated and compared to the integrals of the nonclipped sample as follows, where $L$ is the amplitude of the nonclipped sample n and $a_0$, $a_1$, and $a_2$ are the substitution pulse amplitudes at samples n, n ± 1, and n ± 2, respectively (see Fig. 14).

For the first integral evaluated at sample n + 2,

$$L = a_0 + 2(a_1 + a_2).$$

For the second integral evaluated at sample n + 2,

$$L = a_0 + 2(a_1 + a_2). \tag{8}$$

For the third integral evaluated at sample n + 2,

$$6L = 6a_0 + 13a_1 + 16a_2. \tag{9}$$

Defining the error in sample n as $e$ and equating $e$ to the sum of the substitution pulses in samples n - 1, n - 2, n + 1, and n + 2,

$$e = 2(a_1 + a_2). \tag{10}$$
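The behavior of the three substitution schemes can be checked with exact arithmetic. The sketch below assumes an isolated pulse of height L on a zero background, models each integrator as a running sum, and solves Eqs. (8)-(10) for the five-pulse amplitudes before testing them. The values L = 100 and e = 12 quanta, the nine-sample window, and the helper names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import accumulate

def nth_integral(seq, order):
    """Running sum applied `order` times (cascade of discrete integrators)."""
    for _ in range(order):
        seq = list(accumulate(seq))
    return seq

def settles(test, ref, order, start):
    """True if the order-th integrals agree from index `start` onward."""
    return nth_integral(test, order)[start:] == nth_integral(ref, order)[start:]

L, e = Fraction(100), Fraction(12)       # illustrative pulse height and excess
# Eqs. (8) and (10) give a0; Eq. (9) with a1 + a2 = e/2 then gives a1 and a2.
a0 = L - e
a1 = (6 * e - 16 * (e / 2)) / (13 - 16)
a2 = e / 2 - a1

n = 4                                    # index of the overloaded sample
ref = [0] * 9
ref[n] = L                               # nonclipped reference pulse

backward = list(ref)                     # excess moved to sample n + 1
backward[n] = L - e
backward[n + 1] = e

three = list(ref)                        # excess split between n - 1 and n + 1
three[n - 1 : n + 2] = [e / 2, L - e, e / 2]

five = list(ref)                         # symmetric five-pulse substitution
five[n - 2 : n + 3] = [a2, a1, a0, a1, a2]
```

For these values the solution is a0 = 88, a1 = 8, and a2 = -2. Calling `settles(backward, ref, 2, n + 1)` returns False while `settles(three, ref, 2, n + 1)` returns True, and only the five-pulse sequence also satisfies `settles(five, ref, 3, n + 2)`, in line with the argument above.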

Solving Eqs. (8)-(10) yields the amplitudes of the five substitution pulses,

$$a_0 = L - e \tag{11}$$

$$a_1 = \tfrac{2}{3}e \tag{12}$$

$$a_2 = -\tfrac{1}{6}e. \tag{13}$$

Eq. (13) reveals a weighting of 1/6 in the error $e$. As a consequence, errors must be quantized to multiples of six quanta to avoid further quantization distortion if an increase in output resolution is to be avoided.

Fig. 12. Time dispersive correction of an overload error.

Fig. 13. Two-stage integration performance.

Fig. 14. Five-pulse substitution for third-integral error correction.

The correction procedure operates as follows. First the undistorted differentials are computed (either one stage, two stages, or three stages, depending upon choice) from the output of the noise shaper. The resolution of the output code then establishes the upper and lower clipping levels. Where an overload error occurs, the absolute value of a sample is reduced by the nearest multiple of six quanta. The four remaining substitution pulses are then calculated and summed with the adjacent samples in the output sequence. The resulting waveform is scanned again for overload errors. If any remain, the procedure is repeated as many times as required until the resulting output falls within the overload limits. As noise shaping is used prior to differentiation, there can be considerable interpulse differences such that if one pulse is, say, a high positive value, then the adjacent pulses are usually negative and can accommodate the error without overload. This factor helps accommodate the dispersive error sequences.

The procedure can be extended to the fourth integral using a symmetrical seven-pulse substitution. Using an approach similar to that used before (but performed from n - 3 to n + 3), it follows that

$$e = 2(a_1 + a_2 + a_3).$$

For the first integral evaluated at sample n + 3,

$$L = a_0 + 2(a_1 + a_2 + a_3).$$

For the second integral evaluated at sample n + 3,

$$L = a_0 + 2(a_1 + a_2 + a_3).$$

For the third integral evaluated at sample n + 3,

$$10L = 10a_0 + 21a_1 + 24a_2 + 29a_3$$

and for the fourth integral evaluated at sample n + 3,

$$4L = 4a_0 + 9a_1 + 12a_2 + 17a_3.$$

Then,

$$a_0 = L - e \tag{14}$$

$$a_1 = -0.5e \tag{15}$$

$$a_2 = 1.7e \tag{16}$$

$$a_3 = -0.7e. \tag{17}$$

The factor 1.7 implies that a minimum value for $e$ is 10 quanta, making the pulse weightings in the substitution sequence $a_0 = L - 10$, $a_1 = -5$, $a_2 = 17$, and $a_3 = -7$. However, the coefficient $a_2$ being greater than $a_0$ implies gain, so there is an increased probability that this substitution could actually push adjacent samples into clipping. As a result, this higher order correction was not pursued in the present study.

5 SIMULATION AND RESULTS

The coder and decoder, including equalization, noise shaping, differentiation-integration, and clipping correction, were simulated in MATLAB. (See Appendix for listing; the MATLAB simulation program is available under this paper's title in Supplementary Material to Papers on the AES website at http://www.aes.org/journal/suppmat/.) However, it is not intended to give an exhaustive search of the coding options but to illustrate the performance achievable and in particular to illustrate the performance of the overload option using both the three- and the five-pulse substitution procedure. The spectral plots show the line spectrum of the input together with the noise floor of the coding-decoding process, including emphasis and deemphasis. To interpret the noise floor, a 24-bit/96-kHz LPCM dithered reference signal is generated and scaled so that both the output of the recovered signal and that of the reference signal have the same maximum amplitude. The first set of results is shown in Fig. 15 and corresponds to the following data:

order = 6       %noise shaper order
A1 = 10000      %input amplitude
A2 = 10000      %input amplitude
fin1 = 1000     %input frequency of A1
fin2 = 20000    %input frequency of A2
fs = 96000      %digital audio sampling frequency (96 kHz)
m = 4           %oversampling ratio (relative to fs)
v = 16          %vector length, bits
qout = 11       %output word length of differentiator
qin = 24        %reference signal resolution
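The channel bit rates quoted for these simulations (4.2240 Mbit/s with an 11-bit output word and 3.4560 Mbit/s with a 9-bit output word) are consistent with the product of the oversampled rate m × fs and the differentiator output word length qout. The sketch below assumes that relation, which is inferred from the quoted figures rather than taken from the MATLAB listing; the function name is hypothetical.

```python
def channel_bit_rate(fs, m, qout):
    """Bit rate in bit/s, assuming one qout-bit word per sample at rate m * fs."""
    return fs * m * qout

# Parameter values from the two simulations described in this section.
first = channel_bit_rate(fs=96000, m=4, qout=11)    # 4 224 000 bit/s
second = channel_bit_rate(fs=96000, m=4, qout=9)    # 3 456 000 bit/s
```

On this reading the bit rate scales linearly with the output word length, so each bit removed from qout at a 384-kHz sample rate saves 384 kbit/s per channel.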

In this example the input consists of two equal amplitude sine waves (that is, A1 = A2 = 10 k) of respective frequencies 1 kHz and 20 kHz and the system sampling rate is set at 4 times 96 kHz. The parameters are selected to achieve a performance close to 24-bit LPCM at 20 kHz, where the channel bit rate is 4.2240 Mbit/s. Note how in Fig. 15(b) the output of the third differentiator is much lower than that of the second differentiator and remains within the clipping bounds determined by the output word length (qout). However, because of sixth-order noise shaping and differential coding a dynamic range well in excess of 24 bit is achievable at lower frequencies.

Fig. 15. (a) Recovered spectrum of coder-decoder (bandwidth 0 to fs/2 Hz). (b) Output of first, second, and third differentiators.

In Fig. 16 similar results are computed but with input signal frequencies of 5 kHz and 1 kHz. Here the input levels have undergone a five-fold increase in level (that is, A1 = A2 = 50 k), the noise shaper is reduced to fifth order, and the output word length is reduced from 11 bit to 9 bit. Fig. 16(b) reveals a lower output level from the third differentiator, where this signal is now dominated mainly by noise shaper activity rather than the signal. These factors imply that a lower bit rate can be used before clipping distortion occurs, where in the second simulation the bit rate is now reduced to 3.4560 Mbit/s. To demonstrate the effect of clipping and the two proposed forms of error correction that can be applied after the third differentiator, the simulations are repeated with the same 5-kHz and 1-kHz signals but with the input signal raised in level. It was observed that a 6-dB increase gave occasional clipping so the signal was
