ATSC DTV Receiver Implementation JOHN G. N. HENDERSON, FELLOW, IEEE, WAYNE E. BRETL, SENIOR MEMBER, IEEE, MICHAEL S. DEISS, SENIOR MEMBER, IEEE, ADAM GOLDBERG, SENIOR MEMBER, IEEE, BRIAN MARKWALTER, MEMBER, IEEE, MAX MUTERSPAUGH, SENIOR MEMBER, IEEE, AND AZZEDINE TOUZNI, MEMBER, IEEE Invited Paper
Receivers of broadcast digital television (DTV) service operate in an often difficult environment of electrical interference, multiple other TV signals in close frequency proximity, multipath, wide dynamic range input signals, and uncertain antenna choice and installation. Receivers must demodulate and decode the signal and optimize its processing for different display technologies—a process that can include format conversion between progressive and interlaced scanning and different screen pixel counts. Data that supports the new services enabled by digital transmission must be decoded and processed. Receiver designers must implement the required functions cost-effectively even as they strive to provide performance and feature differentiation from their competitors’ products. This paper describes all of the DTV receiver functions and references the associated standards. Emphasis is given to the difficult areas of signal reception and demodulation and to sections of the receiver that enable attractive and recognizable consumer features. Keywords—AC-3, CRT, display, digital television (DTV), equalizer, forward error correction, LCD, MPEG-2, multipath, plasma display panel (PDP), receiver, eight-level vestigial sideband (8-VSB).
Manuscript received June 28, 2005; revised October 15, 2005. J. G. N. Henderson is with Hitachi America Ltd., Cape May Point, NJ 08212 USA (e-mail: [email protected]). W. E. Bretl is with Zenith Electronics Corp., Lincolnshire, IL 60069 USA. M. S. Deiss is with Thomson Inc., Indianapolis, IN 46290 USA. A. Goldberg is with Sharp Laboratories America, Fairfax, VA 22031 USA. B. Markwalter is with the Consumer Electronics Association, Arlington, VA 22201-3834 USA. M. Muterspaugh is with Thomson Inc., Indianapolis, IN 46250 USA. A. Touzni is with ATI Research Inc., Yardley, PA 19067 USA. Digital Object Identifier 10.1109/JPROC.2005.861023
I. INTRODUCTION The ubiquity, commercial success, and technological maturity of analog television service are giving way to the improved performance and opportunities for new features made possible by digital transmission and digital image compression. Receivers for broadcast digital TV (DTV) service must
recover the signal under the greatest possible range of transmission impairments. They must also condition the decoded video data for the scanning format (e.g., interlaced or progressive) and row and column pixel count of the display. In these matters, digital receivers offer new signal recovery and processing opportunities—a lower received S/N requirement for essentially error-free decoding, multipath reduction, and decoupling of the display scanning functions from the actual transmitted signal. Analog television transmissions contain within their signals the actual pulses for display scan synchronization, which consume display time while delivering very little information. Digital transmissions and receivers achieve synchronization more efficiently. As explained in other papers within this issue, DTV transmitted according to the Advanced Television Systems Committee (ATSC) Standard uses eight-level vestigial sideband transmission (8-VSB), a constrained form of MPEG-2 video compression, AC-3 for digital audio, and a customized MPEG-2 transport layer [1]. These major areas of the receiver offer different degrees of product differentiation opportunities. Fully compliant video and audio decoders are fairly well constrained by their standards. On the other hand, features enabled by signals contained in the transport layer or within the video data offer a variety of new features with associated user interfaces that differ among manufacturers. Recovery and demodulation of the signal, involving interdependencies among the designs of the tuner, demodulator, error correction, and multipath mitigation circuits, offer a significant opportunity for product performance differentiation. The training signals buried within DTV transmissions and the receiver processing for gain control and multipath handling also enable receiver control of “smart” antennas, especially useful for indoor reception. II. 
SIGNAL RECEPTION
0018-9219/$20.00 © 2006 IEEE PROCEEDINGS OF THE IEEE, VOL. 94, NO. 1, JANUARY 2006
Fig. 1. Digital TV receiver “front-end” subsystem.
The circuits that determine the quality of signal recovery are contained in the portion of a digital TV receiver known as
the “front end,” which includes all circuitry from the antenna input through the process of forward error correction (FEC) that is associated with demodulation of the 8-VSB signal [2]. (See Fig. 1.) The output of the receiver front end is the input to the Transport Layer decoder. Specifically, the functional blocks are the following.
• Tuner, including RF amplifier(s) (with automatic gain control), associated filtering, and the local oscillator (LO) (or pair of LOs in the case of double conversion tuners, with automatic fine tuning) and mixer required to bring the incoming RF channel frequency down to that of the intermediate frequency (IF) amplifier/filter.
• IF amplification (with automatic gain control) and filtering, including the major portion of predecoding gain, channel selectivity, and at least a portion of the desired-channel band-shaping.
• Digital demodulation, including in-band interference rejection, multipath cancellation, and signal recovery.
• FEC, wherein errors in the demodulated digital stream caused by transmission impairments are detected and corrected for incoming signals with signal-to-impairment ratios above a specified threshold. Packets with uncorrectable errors are “flagged” for possible mitigation in the video and audio decoders.
A. Expected Signal Conditions
ATSC document A/74, “Recommended Practice: Receiver Performance Guidelines,” [3] describes the conditions of signals and transmission impairments under which digital TV receivers should be expected to operate. Specific performance guidelines for sensitivity, overload, phase noise, selectivity, and multipath are enumerated.
B. Evolution of ATSC Receivers’ Front End Performance
1) The Original ATSC Receiver: The original prototype “Grand Alliance” ATSC receiver was designed to meet
Table 1 ATSC Receiver Performance Comparison
specific requirements related to the selection of competing transmission system proposals. As a result, it had the best possible performance for error threshold under white noise conditions and for interference rejection. In the area of multipath (“ghost”) correction, however, it was limited in both the range of ghost delays and the amplitude of ghosts that it could compensate, as well as the rate of change (Doppler rate) of the multipath. This was partially due to hardware limitations of the equalizer size and speed that could readily be built at the time. It was also partially due to the prevailing expert opinion in those early days of DTV, which underestimated the difficulty of the multipath conditions likely to be encountered by an in-home receiver. While the initial design worked well with a fixed outdoor antenna, it became apparent that improvement was required to allow eventual replacement of analog portable devices using nonoptimum antennas.
Demodulator and equalizer designs in production receivers have improved since the “Grand Alliance” prototype that was built in 1993. The improvements are summarized in two types of laboratory tests: tests that show (over time) the increase in multipath delay and amplitude of single ghosts that can be corrected, and ensemble tests using multiple signal paths. Ability to handle dynamic multipath has also improved.
Fig. 2. Performance of current-design ATSC receiver [3].
Table 2 State-of-the-Art Performance With Brazil Ensembles
2) Single Ghosts: Table 1 shows the performance evolution over time of equalizer range and maximum correctable single-ghost amplitude. Fig. 2 shows (for current-generation receivers only) the combination of these variables, and adds SNR as a parameter [4].
3) Ensembles: Many different ensembles of laboratory ghosts have been proposed over the years, at first based on expert opinion of the types of ghosts likely to be encountered. The early ensembles are no longer a challenge to any reasonable receiver design, and later ensembles have incorporated longer delay, more complex, and higher ghost energy cases. A selection of ensembles that has become known as the Brazil ensembles [4], because of their initial use in that country, contains one relatively easy case, three cases of increasing complexity, and one pathological case representing a worst case multiple transmitter scenario. The performance of current state-of-the-art design receivers with these ensembles is summarized in Table 2.
4) Equalizers: Most early-generation receivers, like the ATSC prototype, relied on some variation of a time-domain
equalizer using an LMS algorithm [5] and having a fixed main tap position. Such an equalizer typically uses feed forward finite impulse response (FIR) operation for the pre-main-tap (preghost) portion and some portion of the taps immediately following the main tap. The remaining taps are used in an infinite impulse response (IIR) configuration, which allows use of decision feedback to eliminate the growth of noise in the operation of the circuit for nonzero taps in the IIR portion. Nonzero taps in the FIR portion, however, produce a linear filtering action, which results in amplification of noise at the same frequencies that are emphasized in the multipath compensation.
One improvement to ATSC receiver equalizers has been extending the length. This may be done for the postghost taps at the expense of more hardware and chip size, but without greatly affecting the SNR threshold. Greater preghost length has also been added as shown in recent designs, but this requires a tradeoff of ghost predelay capability versus noise increase. Generally, some small tradeoff of white-noise-only performance is acceptable, since the performance for the
Fig. 3. Functional DTV receiver.
more common case of combined preghosts and noise is improved.
Another improvement has been the adoption of special initialization algorithms such as “blind” equalization [6] to improve the acquisition performance of the basic LMS algorithm. Precalculation of the initial taps based on a channel estimate is also useful. Current designs include the ability to dynamically define the center tap position, and to operate symmetrically on pre- and postghosts, as illustrated by the curves in Fig. 2.
5) Doppler Performance: Doppler (varying multipath) performance has been improved greatly over the original prototype receiver, which was hampered by the inability to have all the desired interconnects among the equalizer parts and related circuits. Optimum Doppler performance involves not only the equalizer, but also the automatic gain control (AGC), which may occur in two or more stages of the receiver, the synchronization of carrier and symbol clocks, and any other feedback loops in the receiver. The ATSC 8-VSB signal contains a channel reference at a 43-Hz rate, which is insufficient for following fast Doppler. Current receivers use the VSB data itself in addition to the reference in order to follow fast Doppler changes. Current receivers can follow Doppler rates well over 100 Hz, compared to approximately 10 Hz for early prototypes.
C. ATSC 8-VSB Signal Recovery and Multipath Equalization Issues
In the range of radio frequency spectrum used for ATSC television transmission, difficulties in signal reception are due mostly to frequency selective fades in the signal power (regions of relative attenuation in the frequency domain), giving rise to both static and dynamic multipath distortions that must be attenuated sufficiently in the receiver.
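The LMS adaptation described in the Equalizers discussion above can be illustrated with a minimal sketch. It is not the design of any particular receiver: it models only a linear feed-forward equalizer in training mode (no decision feedback, no blind start-up), and the channel (one postghost at 0.3 amplitude, one symbol late), the tap count, and the step size `mu` are all hypothetical values chosen for illustration.

```python
import random

# 8-VSB symbol levels; the pilot offset is ignored in this sketch.
LEVELS = [-7, -5, -3, -1, 1, 3, 5, 7]


def run_lms(num_symbols=6000, num_taps=11, mu=5e-4, ghost_gain=0.3, seed=0):
    """Train a feed-forward LMS equalizer on a single-postghost channel.

    Returns the mean squared error over the first and last 500 symbols.
    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    symbols = [rng.choice(LEVELS) for _ in range(num_symbols)]
    # Hypothetical static channel: direct path plus one ghost, one symbol late.
    received = [symbols[n] + (ghost_gain * symbols[n - 1] if n > 0 else 0.0)
                for n in range(num_symbols)]

    taps = [0.0] * num_taps
    taps[0] = 1.0                      # start from a pass-through main tap
    delay_line = [0.0] * num_taps
    sq_errors = []
    for n in range(num_symbols):
        delay_line = [received[n]] + delay_line[:-1]
        output = sum(w * x for w, x in zip(taps, delay_line))
        error = symbols[n] - output    # training mode: reference symbol is known
        # LMS update: move each tap along the instantaneous error gradient.
        taps = [w + mu * error * x for w, x in zip(taps, delay_line)]
        sq_errors.append(error * error)
    head = sum(sq_errors[:500]) / 500
    tail = sum(sq_errors[-500:]) / 500
    return head, tail


if __name__ == "__main__":
    mse_start, mse_end = run_lms()
    print(f"MSE over first 500 symbols: {mse_start:.4f}")
    print(f"MSE over last 500 symbols:  {mse_end:.8f}")
```

Because the sketch adds no noise, it does not show the noise amplification of the FIR portion that the text describes; adding a noise term to `received` would be the natural next experiment.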
To facilitate a more specific discussion, we will describe the receiver as being a device with three main functionalities: an optimization of the energy received by the receiver, a demodulation from IF (typically 44 MHz) to baseband signal, and, finally, a combined error decoding function of the
symbols and data formatting function into corrected MPEG-2 packets that are ready for video processing. The output of the second function is a soft estimate of the transmitted digital symbols. This operation consists of solving a mathematical problem well known in the signal processing community as the channel deconvolution problem. For simplicity, the first two functionalities will be henceforth referred to as front-end demodulation. (See Fig. 3.)
In the case of simple error coding procedures, such as the 4-state trellis coded modulation and the (207, 187, t = 10) Reed–Solomon code defined in the ATSC A/53 standard [1], the decoding algorithms and their optimized implementations are described in textbooks. See [7] and [8] for examples. (A detailed description of the ATSC encoder is given in a separate paper.) Improving decoding capabilities for algorithms such as the maximum likelihood symbol estimation (Viterbi algorithm) and Reed–Solomon decoder (both of which are used in the ATSC standard) is possible at the cost of a significant increase in numerical complexity. For most practical situations, the FEC is perceived as not being a performance limitation in ATSC terrestrial receivers. Additionally, as this function is not explicitly related to the RF signal modulation, we can assume that the decoding functions, as well as the formatting of the data into MPEG packets, are well-understood design engineering problems that are fairly orthogonal to the front-end demodulation design.
The discussion on the signal recovery architecture for 8-VSB will, therefore, be limited to the front-end demodulation portion of the receiver, which specifically integrates components related to VSB modulation. Notice that a reference design for the receiver architecture of the 8-VSB transmission system has been addressed from within the ATSC standard organization (see ATSC A/54A, [2]).
To maximize the performance of the 8-VSB DTV receiver, multiple complementary options are possible. Optimization of the incoming signal energy to the receiver could be addressed by solutions such as, but not limited to, the use of appropriate antenna devices, the use of smart antennas with directional programmable capabilities, and
the optimization of AGC functionalities to control RF and IF gains in different types of environments (e.g., low channel sensitivity, digital/analog adjacent interferences, narrow band interferences, impulse noise, etc.). Notice that, for the most part, the problems associated with the maximization of the incoming energy are not specific to the modulation, but rather to the shape of the spectrum, the effective channel bandwidth, the rolloff factor, etc. Additionally, it must be understood that maximizing the energy is not necessarily the best option leading to the minimum bit error rate (BER). This point, perhaps nonintuitive at first, will be further discussed later in the paper.
Optimization of the demodulation signal (i.e., the function translating the signal from IF to the soft symbol estimates) is, among all the functions needed in the receiver, the one that is the most directly related to the modulation characteristics of the transmitted signal. One approach typically used in demodulation is to take advantage of side information (e.g., pilot signal, additional redundancy in the transmitted data, structural information, etc.) as reference information, known by the receiver, which, compared to the received signal, allows construction of an estimate of the added interference that ultimately would lead to an estimate of the desired signal. Although this information is often explicitly introduced by the standard for specific purposes (we will give precise examples in the rest of the paper of this type of information in the case of the 8-VSB ATSC standard), there are cases where additional structural information could be derived from the nature of the signal, and/or its statistical properties, to construct estimators that have not necessarily been anticipated by the authors of the standard. These techniques may have a great advantage when designers aim for performance that might not have been fully defined by the specifications of the standard.
It is because 8-VSB is a fairly original modulation (at least in the digital domain) that we will discuss some aspects associated with the nature of that modulation and explain, as an example, how one may benefit from the properties of the VSB signal to improve the demodulation function.
D. Energy Maximization of Desired Signal
1) Tuner Design Considerations: Fig. 4 presents a block diagram of the RF processing, the IF amplifier and 44-MHz filtering, and the demodulator/equalizer. Let us begin by examining the interactive design constraints between the RF amplifier and the mixer. The two primary performance parameters are noise figure (NF) and rejection of interference. Noise figure is the ratio, in decibels, of the input signal to noise to the output signal to noise. Ideally, this would be 0 dB; in practice it is about 7 dB for a typical DTV tuner. The rejection of interference is most often expressed as the desired signal to undesired signal (a stronger station on a different channel) or D/U ratio that can be tolerated.
Historically, there have been unique interference mechanisms in the mixer relating to the choice of the IF of 44 MHz. The “half IF beat” is caused by an undesired signal at a frequency 22 MHz above the desired channel and is caused by the second harmonic of the LO beating with the second harmonic of the RF input
Fig. 4. Tuner block diagram.
signal to provide a 44-MHz IF signal. The “oscillator beat” is generated by an undesired signal at the LO frequency. The “image frequency” interference comes from an undesired signal 88 MHz above the desired signal and is a normal mixing product except that the signal is 44 MHz above the LO instead of 44 MHz below. These will be discussed later.
The primary cause of reception problems from interference is third-order nonlinear distortion caused by the strong interfering station overloading one or more stages in the tuner. The intercept point concept is used to calculate interference caused by nonlinear distortion. Third-order distortion is proportional to the cube of applied voltage and causes cross-modulation (XM3) and intermodulation (IM3). In theory, at the intercept point (IP3), the distortion would be equal to the signal. For each decibel of signal reduction below that point, the distortion drops 3 dB (i.e., 2 dB relative to the signal). For example, assume a commercial quality double-balanced Schottky diode mixer. Typically, such a device has an IP3 (third-order intercept point) of +15 dBm when supplied with +10-dBm LO drive. Thus, if the interfering signal is at −10 dBm, the distortion products would be 2 × (15 − (−10)) = 50 dB below the interfering signal. The tolerable interference level at the input to the tuner is reduced by the net gain of the RF stage. Of course, the RF stage may have distortion limitations as well, although the dominant cause of distortion is most often the mixer.
Let us next examine the tracking RF filters. There is usually a single (L–C) resonator filter at the input of the RF amplifier and a double resonator (two L–C circuits) filter between the RF amplifier output and the mixer. Of particular importance is the use of varactor diodes to achieve convenient electronic tuning. Current technology produces diodes with 0.5-ohm series resistance, which limits the device Q factor to 80 at 500 MHz.
While it might be desirable to design these filters with narrow bandwidths, the loss in such filters increases rapidly with decreasing bandwidth. The well-known formula below gives the relationship
$$\mathrm{Loss\ (dB)} = 20 \log_{10}\!\left(\frac{Q_U}{Q_U - Q_L}\right)$$
where $Q_L$ is the loaded $Q$ and $Q_U$ is the unloaded $Q$ of the varactor/inductor.
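The trade between bandwidth and loss can be checked numerically. The sketch below assumes the standard single-resonator relation Loss = 20 log10(Q_U / (Q_U − Q_L)), takes the loaded Q as center frequency divided by 3-dB bandwidth, and uses the unloaded Q of 80 at 500 MHz quoted for the varactor; the 500-MHz center frequency and the list of bandwidths are illustrative assumptions, not values taken from Table 3.

```python
import math


def loaded_q(center_mhz, bandwidth_mhz):
    """Loaded Q of a single-tuned resonator from its 3-dB bandwidth."""
    return center_mhz / bandwidth_mhz


def insertion_loss_db(q_unloaded, q_loaded):
    """Insertion loss of a single resonator, in dB.

    Assumes the standard relation 20*log10(Qu / (Qu - Ql)).
    """
    if q_loaded >= q_unloaded:
        raise ValueError("loaded Q must be smaller than unloaded Q")
    return 20.0 * math.log10(q_unloaded / (q_unloaded - q_loaded))


if __name__ == "__main__":
    q_u = 80.0          # varactor-limited unloaded Q at 500 MHz (from the text)
    center = 500.0      # MHz, assumed tuning frequency
    for bw in (30.0, 20.0, 10.0):   # MHz, illustrative bandwidths
        q_l = loaded_q(center, bw)
        loss = insertion_loss_db(q_u, q_l)
        print(f"BW {bw:4.0f} MHz: loaded Q {q_l:5.1f}, loss {loss:4.1f} dB")
```

With these assumptions the 30-MHz case comes out at about 2 dB, and the loss grows quickly as the bandwidth is narrowed.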
Table 3 Loss and Selectivity of a Single-Tuned Filter
Table 3 gives examples of single-tuned filters with varying designs. For the present, assume the first design with 30-MHz bandwidth and 2-dB loss. There will be one resonator before the amplifier giving a 2-dB loss and two between the amplifier and mixer giving a 4-dB loss. We can now finish the RF amplifier design with the interference constraints and predict other performance parameters. Assume the RF amplifier has a 1-dB noise figure and gain of 15 dB. To calculate the system noise figure, we use the cascaded noise figure formula
$$NF = NF_1 + \frac{NF_2 - 1}{G_1}$$
where $NF_1$ and $NF_2$ are noise figures of the first and second stages, and $G_1$ is the first-stage gain in power ratios. Combining the first filter, the amplifier, and the second double filter, we have a noise figure of 3 dB and gain of 9 dB. The mixer loss is 7 dB. If the following 44-MHz IF amplifier has a 3-dB noise figure (NF), then the mixer/IF has a 10-dB NF. With $NF_1 = 2$, $NF_2 = 10$, and $G_1 = 8$, the total tuner noise figure is now 10 Log(3.13) or 5.0 dB. Note that the RF amplifier noise figure and/or antenna power match are not the limiting factors. The limitations from the technologies of the mixer and varactor diodes have an equal contribution. If the RF amplifier were to have a NF of 3.0 dB, perhaps due to an input mismatch that did not provide the optimum source impedance, the system noise figure would be 6.3 dB. If the resonator loss were improved to 1 dB each (and a 3-dB NF amplifier), the system noise figure would become 4.9 dB. However, with present varactors the resulting bandwidth would be far too wide for adequate interference rejection. If the bandwidth is made narrower to improve interference performance, the noise figure increases rapidly.
Regarding tolerance to interference, if input signals to the mixer are roughly constrained to be −10 dBm or lower, the distortion products will be tolerable, but this bears closer examination with the design of the RF stage and filters.
Fig. 5. Input RF filter response.
Fig. 5 is an actual plot of the input filter selectivity of an exemplary RF stage. Fig. 6 is the corresponding plot of the entire RF stage (as indicated by the dotted box in Fig. 1) tuned to channel number N, including the three tracking filters and their losses. Each horizontal division is 12 MHz or two channels. Note that the net gain is 9 dB as in the example above. The filters are somewhat narrower in bandwidth than the above example. It is also customary to choose filter topologies that give greater attenuation to the high frequency side and improve image rejection. Some filters incorporate an
Fig. 6. RF amplifier and filters response. 3-dB BW is 14.8 MHz; N + 4 (+22 MHz) is −21.4 dBc; N + 8 (+44 MHz) is −44 dBc; N + 14 (+88 MHz) is −69 dBc. Input is −20 dBm; center band gain is 9 dB.
“image trap” that provides extra rejection 88 MHz above the desired channel, the image frequency for a single-conversion tuner.
Table 4 shows the ATSC A/74 Receiver Guidelines [3] for DTV interference for receiving “weak” or −68-dBm signals. Consider the first adjacent channel rejection at 40 dB (NTSC), and assume no appreciable selectivity from the RF filters. With a desired signal at −68 dBm (weak ATSC signal), the interfering signal would be −28 dBm. With an RF gain of 9 dB, the inputs to the mixer are −59 dBm (desired) and −19 dBm (undesired). Using the third-order intercept of the mixer at +15 dBm, the distortion products would be 68 dB below the undesired or −87 dBm. Using a 15-dB signal to noise/distortion threshold for ATSC, this might imply a 13-dB margin on meeting the guideline. It is more important to look at the margin of the
Table 4 Taboo Channel D/U in dB From the ATSC A/74 Receiver Guidelines
undesired input level. If the D/U ratio were 45 dB (instead of 40), the distortion products are at −72 dBm and the system fails; thus, there is only a 5-dB margin against the guideline. The RF amplifier may also be a contributing factor and limit performance, but it is useful to note the limitations imposed by the mixer.
The guideline becomes more restrictive for interferers further away in frequency. At six-channel spacing, the guideline is 57 dB, which would limit the RF gain to 3 dB. The tracking RF filters meet this need easily for the mixer, but now the RF amplifier and the protective selectivity of Fig. 4 must be considered. As a practical matter, the RF amplifier and the mixer must both be considered. The distribution of gain and selectivity is a complex compromise.
Returning to the mixer device, there are configurations with better distortion, but they also require much higher LO drive levels. In order to meet the FCC broadcast and cable requirements, the leakage of LO energy at the antenna port must be reduced to very low levels, and this limits such use of high LO power. Further, the RF amplifier must also have a high reverse attenuation to meet these limits.
Dual conversion is often suggested as a solution for interference problems, but the limitations above exist equally for single or dual conversion. Dual conversion can help specific interference problems: the “half IF beat” (22 MHz above the desired channel), the “oscillator beat” (at the LO frequency), and the image problem (88 MHz above the desired channel). However, a double-balanced mixer tends to cancel the half IF beat and oscillator beat interference mechanisms, and special “image trap” circuits are often able to improve the image rejection. In addition, a tuner IC with an image canceling mixer has recently become available. This device uses two mixers with controlled matching such that the required 90° phase shifts can be achieved over the required TV bands and offers performance meeting the A/74 guidelines.
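The tuner budget arithmetic used in this subsection (the cascaded noise figure and the third-order distortion margin for a first adjacent channel) can be reproduced with a short sketch. The numbers come from the worked example in the text, taking the signal levels as negative dBm values; the function names and the way the RF stage is lumped into a single noise figure and gain are simplifying assumptions made here for illustration.

```python
import math


def db_to_ratio(db):
    return 10.0 ** (db / 10.0)


def cascaded_nf_db(nf1_db, gain1_db, nf2_db):
    """Two-stage cascaded noise figure; inputs and result in dB."""
    f_total = db_to_ratio(nf1_db) + (db_to_ratio(nf2_db) - 1.0) / db_to_ratio(gain1_db)
    return 10.0 * math.log10(f_total)


def im3_level_dbm(interferer_dbm, ip3_dbm):
    """Third-order product level: 2 dB further below the interferer
    for every dB the interferer sits below the intercept point."""
    return interferer_dbm - 2 * (ip3_dbm - interferer_dbm)


def adjacent_channel_margin_db(desired_dbm, du_db, rf_gain_db, ip3_dbm, threshold_db=15):
    """Margin of signal-to-distortion over the reception threshold at the mixer.

    du_db is how much stronger the undesired signal is than the desired one.
    RF filter selectivity is ignored, as in the text's example.
    """
    desired_at_mixer = desired_dbm + rf_gain_db
    undesired_at_mixer = desired_dbm + du_db + rf_gain_db
    distortion = im3_level_dbm(undesired_at_mixer, ip3_dbm)
    return desired_at_mixer - distortion - threshold_db


if __name__ == "__main__":
    # RF stage lumped as 3 dB NF / 9 dB gain, followed by a 10 dB NF mixer/IF.
    print(f"Tuner NF: {cascaded_nf_db(3, 9, 10):.1f} dB")
    # Same cascade with a 3 dB NF amplifier (RF stage NF of 5 dB).
    print(f"Tuner NF with 3 dB amplifier: {cascaded_nf_db(5, 9, 10):.1f} dB")
    for du in (40, 45):
        margin = adjacent_channel_margin_db(-68, du, 9, 15)
        print(f"Undesired {du} dB stronger: margin {margin:+d} dB")
```

A negative margin corresponds to the failure case described for the 45-dB undesired level.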
Dual conversion can offer some aid in the LO leakage because the oscillator operates at 900 MHz or more above the RF signal; thus, the RF filters can offer more rejection than with single conversion. Offsetting this, internal leakage of the first and second LOs and harmonics will generate interference on many channels. Both designs require careful isolation techniques and the choice of single or dual conversion is not obvious.
Once the signal has been converted to the standard 44-MHz IF, it can be filtered using surface acoustic wave (SAW) devices. Technology has provided very good performance, as shown in Fig. 7. The stop band can be 40–50 dB
Fig. 7. SAW filter characteristic.
with modest devices and two such devices are often cascaded with an amplifier between to compensate for losses. With such rejection, the adjacent channel performance is almost always limited by the distortion issues occurring before these filters. There are some minor issues of temperature drift and losses, and improvements are being made.
A very significant issue for the future is the adjacent channel energy or “splatter” radiated from DTV transmitters. With the present FCC mask, first adjacent D/U for DTV into DTV is limited to 27–29 dB. The issue is in the transmitter technology, which involves compromises with power efficiency and various linearity techniques to reduce the emissions. As the TV spectrum is reduced and repacked with DTV stations having channels adjacent to each other, this will become a greater issue. The receivers are capable of 33 dB or better and are not the limiting factor at present.
Digital reception additionally requires good phase noise performance in the LO. While this must be considered in DTV tuner design, it is not an issue with present technology.
2) Automatic Gain Control (AGC): AGC is employed to maintain a virtually constant signal level to the demodulator, both for NTSC and DTV. Historically, NTSC receivers sampled the demodulated video output during the horizontal synchronizing pulse interval, compared this to a desired reference voltage, generated an error voltage proportional to the difference, and applied that error voltage to control the gain of the IF amplifier and the RF amplifier in the tuner. For weak signals, both IF and RF amplifiers were operated at maximum gain. For stronger signals, the IF amplifier gain was reduced while the RF was operated at maximum gain. At some signal level, usually −49 dBm, the tuner was reaching the overload limit, and for signals greater than this, the IF amplifier gain was held nearly constant while the RF amplifier gain was reduced.
This system maintained the optimum signal levels in both the RF and IF amplifiers so that noise and overloading were avoided. The AGC function also minimized reception degradation by filtering out Doppler ripple (airplane flutter) and canceling or clipping impulse noise.
For NTSC transmissions, FCC practice prevented adjacent channel assignments in a given area, so first adjacent channel interference was not a large problem. Thus, the AGC function was totally derived from the desired signal; adjacent channels had no significant influence.
With DTV, the signal and signal environment are somewhat different. During the transition period when both DTV and analog signals are broadcast, virtually twice as many signals are in the TV spectrum. After analog NTSC broadcasts are removed, the TV spectrum will be decreased by removing channels 52–69. This requires assignment of adjacent channels in both cases. The adjacent channel(s) may be occupied by either an NTSC signal or another DTV signal that is significantly stronger. Filtering in the RF stages of the tuner and in the IF amplifier removes most of the interference, but it is often desirable to modify the RF AGC to avoid overloading by very large interferers; this is often called “wideband AGC.”
The DTV receiver often has two independent AGC control loops. The “IF AGC” controls only the gain of the IF amplifier and is responsive only to the desired channel signal. The demodulated data signal amplitude is sampled and compared to a reference, and the difference creates an error signal to alter IF gain until the data level matches the reference. The response time is rather fast to remove part of any Doppler ripple in the signal. This must be adjusted and optimized with the equalizer function that also operates to remove Doppler effects.
The “RF AGC” loop controls only the RF amplifier gain. The 44-MHz output of the tuner is sampled and compared to a reference level, and the resulting error signal is used to adjust the RF gain in a similar manner. Typically, this level is set to reduce the RF gain for input signals greater than perhaps −59 dBm, although this choice may vary with individual products.
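The feedback action of the IF AGC loop described above (sample the level, compare it to a reference, and use the error to adjust the gain) can be sketched as a first-order loop working in decibels. This is only an illustration of the control principle: the reference level, loop gain, gain limits, and the per-sample update are hypothetical choices, not parameters of any actual receiver.

```python
def run_if_agc(input_levels_dbm, reference_dbm=-10.0, loop_gain=0.2,
               min_gain_db=0.0, max_gain_db=60.0):
    """Simulate a first-order AGC loop in the dB domain.

    Returns a list of (output_level_dbm, gain_db) pairs, one per input sample.
    All default values are illustrative assumptions.
    """
    gain_db = max_gain_db                 # start at full gain, as for a weak signal
    history = []
    for level in input_levels_dbm:
        output = level + gain_db          # level seen by the demodulator
        error = reference_dbm - output    # positive error means more gain is needed
        gain_db += loop_gain * error      # integrate the error into the gain
        gain_db = min(max_gain_db, max(min_gain_db, gain_db))
        history.append((output, gain_db))
    return history


if __name__ == "__main__":
    # Constant -50 dBm input followed by a 6 dB step up, as a crude fade recovery.
    levels = [-50.0] * 60 + [-44.0] * 60
    trace = run_if_agc(levels)
    for index in (0, 59, 60, 119):
        out, gain = trace[index]
        print(f"sample {index:3d}: output {out:7.2f} dBm, gain {gain:6.2f} dB")
```

A larger `loop_gain` makes the loop follow ripple faster, which is the trade the text mentions when it says the response time must be optimized together with the equalizer.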
This level is often called the “RF AGC threshold” and determines the noise and overload performance of the system. A property of gain control is that the signal to noise is determined by this threshold: even as the signal level increases, essentially the same signal to noise ratio as measured at that threshold will remain. For example, given a 6-MHz channel, the Gaussian noise is −106.2 dBm. The system may add 10 dB excess noise (as discussed later), yielding noise at −96.2 dBm. With the threshold of −59 dBm, the signal to noise will be 37.2 dB for signals at −59 dBm or higher. If the threshold is set at −54 dBm, the signal to noise will be 42.2 dB. Setting the threshold to a higher level (e.g., −49 dBm) would give better signal to noise but would invite overload from interfering signals. For the general case, the threshold setting of −59 dBm provides a good margin against overload from moderate adjacent channels. It also provides a good signal to noise margin in case of severe multipath, Doppler ripple, or other signal impairments.
Notice that DTV permits a lower threshold than NTSC because the analog NTSC system requires a 50-dB or better signal to noise ratio to display good video; if worse, the video is gradually degraded by the appearance of noise or “snow.” The ATSC digital system requires, in general, a 15-dB SNR as a minimum. The video quality is good for signals better than the minimum and virtually nonexistent if worse. This is often
called the “cliff effect,” and although the ATSC system works well at a worse signal to noise ratio, a margin must be given so that normal signal variations will not cross below that minimum, causing an intermittent total loss of video and sound. The bandwidth of the signal channel going into the RF loop sampler should be wide enough to include part or all of the first adjacent channels. If a large interfering signal is present, its energy also contributes to the error signal for the AGC and acts to reduce the RF amplifier gain. If done properly, this can optimize the RF gain to amplify the desired signal enough for adequate signal to noise but not overload the RF amplifier, mixer, or other stages in the tuner. In the simplest case, assume the signal sampler has a bandwidth that includes the entire interfering first adjacent channels as well as the desired signal. When the adjacent signal is equal to the desired signal, the RF gain will be reduced by an additional 3 dB. If the adjacent signal is 25 dB stronger, the RF gain will be reduced by 25 dB and the system signal to noise will be as if the threshold had been set at -84 dBm, yielding 12.2 dB. Since the minimum for 8-VSB reception is 15 dB, this will not work properly. The ATSC A/74 guideline [3] requires operation with a 33-dB stronger adjacent DTV signal or a 40-dB stronger analog signal. In practice, one of several methods is used. The sampler may have a controlled bandwidth such that only a portion of the adjacent channel energy is sampled. This reduces the contribution to the gain reduction and gives better system performance. Other techniques involve measuring the IF AGC control and comparing that to the RF AGC in a microprocessor. Various algorithms can then be employed to determine the level of interference and derive a modified RF gain control signal. This allows reception in a difficult environment by trading signal to noise against overload from interference.
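The threshold arithmetic above can be checked with a short sketch. The thermal noise density of -174 dBm/Hz and the function names are assumptions introduced here for illustration; the 10-dB excess noise, the 6-MHz bandwidth, and the thresholds are the figures quoted in the text.

```python
import math

# Assumed thermal noise density near room temperature, in dBm/Hz.
THERMAL_NOISE_DBM_PER_HZ = -174.0


def noise_floor_dbm(bandwidth_hz=6e6, excess_noise_db=10.0):
    """Noise power in the channel, including the receiver's excess noise."""
    return THERMAL_NOISE_DBM_PER_HZ + 10 * math.log10(bandwidth_hz) + excess_noise_db


def snr_at_threshold_db(threshold_dbm, bandwidth_hz=6e6, excess_noise_db=10.0):
    """SNR that the RF AGC holds for any input at or above the threshold."""
    return threshold_dbm - noise_floor_dbm(bandwidth_hz, excess_noise_db)


def wideband_gain_reduction_db(adjacent_rel_db):
    """Extra RF gain reduction when the sampler sees desired plus adjacent power.

    adjacent_rel_db is the adjacent-channel level relative to the desired signal.
    """
    return 10 * math.log10(1 + 10 ** (adjacent_rel_db / 10))


for threshold in (-59.0, -54.0, -49.0):
    print(threshold, round(snr_at_threshold_db(threshold), 1))

# A 25-dB stronger neighbor moves the effective threshold from -59 to about -84 dBm.
effective = -59.0 - wideband_gain_reduction_db(25.0)
print(round(effective, 1), round(snr_at_threshold_db(effective), 1))
```

With the full adjacent channel inside the sampler bandwidth, the resulting SNR falls below the 15-dB minimum, which is why practical designs limit how much adjacent-channel energy reaches the RF AGC detector.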
Note, however, that if the signal is also impaired by severe multipath or Doppler ripple or is very weak, reception may not be possible regardless of the AGC parameters.
3) Smart Antennas: A receiver that automatically controls the directional pattern of a single antenna or a group of antennas could be an attractive feature for optimizing the overall performance of the receiver. The basic idea is that for a given channel (i.e., RF frequency) there may be a direction leading to optimized reception. The direction depends on the receiver capability and the geographical location of the receiver. The best possible antenna steering may change from one channel to another. Field experience suggests that, for a given channel, it is unusual for good reception to be achieved in all possible antenna steering directions. Automatically controlled antennas can be facilitated by including in the 8-VSB receiver an antenna control interface such as the one described in the EIA/CEA-909 standard [9]. This standard provides information to properly equipped antennas for direction, amplifier gain, polarization, and channel number for each channel as it is tuned. Examples are described in [10] and [11]. Considerations of antenna design and steering optimization algorithms are beyond the scope of this paper. A scan procedure would determine the optimum orientation. It
is possible that preferred antenna configurations for each channel would be stored in receivers as a starting point for channel acquisition in the future. Even if the antenna system is not under automatic control, a good outdoor antenna, perhaps associated with a mast-mounted low noise amplifier, can improve reception in fringe areas.
E. Demodulator Design Issues: Taking Advantage of 8-VSB Modulation Properties
In this section, we address the problems of symbol time estimation and carrier estimation for 8-VSB modulation. More precisely, we will show how well-known solutions for QAM modulation can be generalized to the case of 8-VSB modulation. To demonstrate the effectiveness of the proposed solutions, the data model will increase in complexity in conjunction with the complexity of the problem addressed. Additional information on this subject can be found in [12]–[21].
1) Signal Model and Problem Formulation for Symbol Time Synchronization: The purpose of symbol time synchronization is to lock the receiver’s local symbol clock to the transmitted clock so that proper sampling of the transmitted data occurs and the intersymbol interference introduced by the pulse shape of the transmitted symbol is minimized. Assume that $T$ is the VSB data symbol period (i.e., $T \approx 92.9$ ns) and $\tau$ is the fractional sampling (phase) error; the problem of timing synchronization lies in matching the sampling rate $1/\hat{T}$ and the phase $\hat{\tau}$ of the received data to the transmitted sampling rate $1/T$ and phase $\tau$, where $\hat{T}$ and $\hat{\tau}$ denote respectively the estimated values of the parameters $T$ and $\tau$. A simplified model of the sampled received signal $y_n(\tau)$, expressed as a function of the timing offset $\tau$ representing the relative fraction of the symbol time offset, can be defined by the expression
(1)
where $s_k$ are symbols that belong to the finite alphabet $\{\pm 1, \pm 3, \pm 5, \pm 7\}$, which correspond to the real (in-phase) part of the 8-VSB constellation. We assume that the symbols form a sequence of independent and identically distributed (i.i.d.) variables.
For the ATSC terrestrial broadcast system, the filter $p(t)$ is the time-domain square-root raised cosine pulse shape function defined as
(2)
where the parameter $\alpha$ stands for the so-called rolloff factor. ($\alpha \approx 0.115$ for ATSC modulation.) The noise perturbation $w_n$ of the model (1) is assumed to be i.i.d., of Gaussian distribution and zero-mean. We will denote by $\sigma_w^2 = E\{w_n^2\}$¹ the variance of the noise distribution. In this simplified model, we assume that the dc offset (which creates the small in-phase pilot in the modulator) introduced in the transmitted signal has been removed at the receiver. Notice also that this model includes neither the transmission channel nor the imaginary (quadrature) contribution of the VSB signaling, as this information is not relevant for the subsequent developments. Since the pulse shape filter conforms to the Nyquist conditions defined by $p(kT) = 0$ for $k \neq 0$, the received signal is synchronized with the transmitter when $\tau = 0$. We can assume, without restriction, that $\tau$ belongs to the interval (0, 1), and to simplify the notation we will choose $T = 1$, since practically speaking the value of $T$ is not relevant for the synchronization algorithm. A robust estimation of $\tau$ is proposed in the next section. More precisely, we will show how one can synchronize the received signal to the transmitter using only the information furnished by the received signal (1).
a) Example of the cost function for time synchronization: We address the synchronization problem with the following Constant Modulus (CM) cost function defined by
$J(\tau) = E\{(y_n^2(\tau) - \gamma)^2\}$ (3)
where $\gamma$ is referred to as the dispersion constant. The CM cost function is simply a measure of the “distance” between the square of the received signal $y_n^2(\tau)$ and a constant value (threshold) $\gamma$. If both the in-phase and the quadrature components of the VSB signal had been considered in (1), the term $y_n^2(\tau)$ would have been replaced by the square of the modulus of the received signal.
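As an illustration of how such a CM cost behaves against the timing offset, the following minimal sketch evaluates the expectation in closed form for i.i.d. 8-level symbols. Several choices are assumptions made here and are not taken from the paper: an ordinary raised cosine (a Nyquist pulse) stands in for the pulse shape, the rolloff is set to 0.1152, the model is noise-free with $T = 1$, and the dispersion constant uses the ratio $E\{s^4\}/E\{s^2\}$.

```python
import math

ALPHA = 0.1152                        # assumed rolloff factor
SYMBOLS = (-7, -5, -3, -1, 1, 3, 5, 7)


def raised_cosine(t, alpha=ALPHA):
    """Nyquist pulse with T = 1, used as a stand-in for the pulse shape."""
    sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
    denom = 1.0 - (2.0 * alpha * t) ** 2
    if abs(denom) < 1e-12:            # removable singularity at |t| = 1/(2*alpha)
        return (math.pi / 4.0) * sinc
    return sinc * math.cos(math.pi * alpha * t) / denom


def cm_cost(tau, gamma, half_span=200):
    """Exact E{(y^2 - gamma)^2} for y = sum_k s_k p(k + tau), without noise."""
    m2 = sum(s ** 2 for s in SYMBOLS) / len(SYMBOLS)
    m4 = sum(s ** 4 for s in SYMBOLS) / len(SYMBOLS)
    taps = [raised_cosine(k + tau) for k in range(-half_span, half_span + 1)]
    a = sum(h ** 2 for h in taps)
    b = sum(h ** 4 for h in taps)
    ey2 = m2 * a
    # Fourth moment of a linear mix of i.i.d., zero-mean, symmetric symbols.
    ey4 = 3 * m2 ** 2 * a ** 2 + (m4 - 3 * m2 ** 2) * b
    return ey4 - 2 * gamma * ey2 + gamma ** 2


gamma = 37.0                          # E{s^4}/E{s^2} for the 8-level alphabet
taus = [i / 20 for i in range(-10, 11)]
best = min(taus, key=lambda t: cm_cost(t, gamma))
print(best, cm_cost(0.0, gamma), cm_cost(0.5, gamma))
```

Under these assumptions the grid search lands on a zero timing offset. A receiver would replace the grid search with a stochastic gradient update on the sampling phase.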
b) Minima: Justification of the CM criterion for time synchronization is given by the fact that, under a condition that is verified for ATSC modulation, it can be proven that there exists a constant (threshold) $\gamma_0$, defined by the equation at the bottom of the page, such that for any $\gamma > \gamma_0$ the only minimum of $J(\tau)$ is given by $\tau = 0$. In other words, as long as $\gamma$ is large enough, the minimum of $J(\tau)$ always leads to the expected optimum estimator, which corresponds to a synchronized received signal. Interestingly, the threshold $\gamma_0$ depends only on the statistics of the VSB symbols (which are known) as well as on the value of the rolloff factor, which is also known. Notice that the minima of the CM criterion are independent of the variance of the noise. Practical steps for minimizing $J(\tau)$ effectively can be based on one of the numerous variants of the stochastic gradient algorithm. (This is essentially the same type of technique used in channel equalization.)
(4)
¹ $E\{x\}$ denotes the mathematical expectation of the random variable $x$.
2) Signal Model/Problem Formulation for RF Carrier Synchronization:
a) VSB modulation: The purpose of RF carrier synchronization is to lock the receiver’s LO in both frequency and phase for synchronous detection of the VSB carrier so that proper demodulation of the data signal occurs with minimal intersymbol interference. While both an in-phase and a quadrature component can be output from the synchronous detector, all the data is carried in the in-phase portion of the signal. The transmitted small in-phase pilot can be used when available (i.e., when it has not been notched by the channel multipath) in concert with other more advanced methods of RF carrier synchronization that do not require a pilot. A continuous, time domain VSB modulation $x(t)$ of a discrete source $\{s_k\}$ can be expressed as follows:
(5) where $p(t)$ is the pulse shape filter defined in the previous section. The first part of the terms in brackets in (5) corresponds to the in-phase component of the VSB signal, whereas the second part corresponds to the quadrature component of the VSB signal. This will become more apparent in the simplified model proposed below.
b) Simplified data model: An approximation of the pulse shape filter for VSB modulation when the rolloff is small is given by (6), where the remainder term follows the standard notation for a Taylor approximation of the second order. The equation above denotes an expansion (Taylor development) of the pulse shape filter in terms of the rolloff factor $\alpha$. (The simplification above is perfectly justified in the case of the 8-VSB modulation used in the ATSC standard for DTV terrestrial broadcast.) In the rest of the paper, we will assume without restriction that $T = 1$. This parameter will depend solely on the sampling frequency used at the receiver, and therefore, as stated above, the parameter is arbitrary as long as the Nyquist condition is respected. A sampled version of $x(t)$ leads to the expression
(7) for the even components and
(8) for the odd components. Notice the permutation of the roles of the in-phase and quadrature components in (7) and (8). This is essentially the property (which does not exist for QAM modulation) that will be used in VSB modulation to create an estimator of the RF frequency offset that does not rely on the in-phase pilot. Prior to presenting the criterion, let us further clarify the data model. Consider the imaginary (quadrature) contribution of (7) and the real (in-phase) contribution of (8). For both (7) and (8), it can be shown that this contribution is the convolution of the impulse response of the Hilbert transform with the even/odd symbols of the transmitted source sequence. We recall that the $k$th tap of the Hilbert impulse response is defined by $h_k = 2/(\pi k)$ for $k$ odd and $h_k = 0$ for $k$ even. This implies basically that digital VSB modulation can be seen as a discrete-time analytic signal for which the real (in-phase) component corresponds to a pulse amplitude modulated (PAM) sequence. The baseband VSB constellation obtained, for example, at the output of a complex linear equalizer can be approximated as (9), where $w_n$ is a complex, circular, zero-mean Gaussian noise statistically independent of the source. Moreover, in the case of 8-VSB modulation for ATSC, the variances of the even and odd sequences of the transmitted symbols are identical and are denoted by $\sigma_s^2$. We also assume that the real (in-phase) component and the imaginary (quadrature) component of the noise contribution have the same variance (due to the unitary nature of the Hilbert filter transformation, this model is quite realistic in practice).
c) RF carrier phase estimation: Equation (9) can simply be understood as a model of RF carrier phase rotation by a phase factor $\varphi$, which possibly varies over time, in the complex plane of the modulated 8-VSB symbols. A key point is to remark that, in the absence of noise, the received signal (9) exhibits the following special properties: (10) (11)
In the rest of the paper, we will take advantage of these properties to construct an estimate of $\varphi$. In Fig. 8, we illustrate the effect of a phase rotation of a received 8-VSB signal (9) in the complex plane for a constant phase offset $\varphi$.
Fig. 8. 8-VSB constellation rotated by $\varphi = \pi/4$ (no noise).
d) Diversity blind criterion for phase/carrier synchronization: We are looking for the phase estimate $\hat{\varphi}$ such that
(12)
where $J(\varphi)$ is given by
(13) Notice that one can verify that, for 8-VSB modulation, the symbol source generates a sub-Gaussian sequence (i.e., its normalized kurtosis is smaller than that of a Gaussian variable). Roughly speaking, this means that the statistical distribution of the source is “somewhere” between a uniform and a Gaussian distribution. This is a key technical point that will be useful in the analysis of the extrema of $J(\varphi)$. Basically, the minima of the cost function depend on the statistical properties of the source. It will be shown in the next section that the sub-Gaussian property of the VSB signal is key for constructing an adequate estimator of $\varphi$ based on $J(\varphi)$. The advantage of the criterion (13) with respect to a mean square type criterion that could easily be derived from (10) and (11) is that the criterion does not require knowledge of the transmitted data. In other words, it does not require the use of any pilot information provided by the PN sequence embedded in the transmitted signal. This is particularly important in the case where the RF phase offset is continuously changing over time.
e) Extrema analysis: The criterion $J(\varphi)$ is clearly multimodal with respect to the parameter of interest $\varphi$. Because of its multimodal nature, it is important to establish clearly the structure of the possible attractors of the cost function in cases where a gradient technique is used to minimize the function. Indeed, the gradient estimation is guaranteed to converge only locally. In other words, the attractor (extremum of the cost function) to which the algorithm will converge depends on the position of the initial condition. The presence of spurious minima may cause the algorithm to converge toward a biased estimator. By setting the derivative of the function above equal to zero, we found that the extrema of the CM criterion are given by the equation at the bottom of the page, where it can be shown that
(14), with the solutions labeled (a), (b), (c), and (d).
In the light of the comments above on the possible presence of spurious attractors for the estimate, it is important to interpret the solutions (b) and (c) before presenting any further analysis. To do so, we will assume that the impulse response of the Hilbert filter is not truncated. In that case, it can be shown that the solutions (b) and (c) reduce to a single solution. The stability of the extrema can be addressed through a tedious but straightforward analysis of the sign of the second-order derivative of $J(\varphi)$. The result can be summarized as follows. Under the assumption of sub-Gaussianity of the transmitted signal, the solutions (a) and (d) are respectively classified as a global minimum and a local minimum. The solution (b) is a global maximum. Furthermore, it can be shown that the attractor (d) can be removed by slightly modifying the original criterion. Notice that the solution (a) has a periodicity of $\pi$, which implies that the source is recovered with a sign ambiguity. This ambiguity can easily be removed by using synchronization information embedded in the transmitted signal (which has a well-defined polarity). Notice finally that the criterion is transparent to the effect of the channel multipath, since the operation of RF carrier recovery occurs after the channel equalization.
3) Multipath Impairments and Channel Modeling: 8-VSB demodulation for DTV can be quite challenging when the channels exhibit high multipath distortions, dynamic variation over a short period of time (in the range of a fraction of a millisecond), and low SNR conditions (typically a few decibels above the threshold of visibility (TOV), where TOV $\approx$ 15 dB). A possible option to cope with these difficulties is to develop a receiver design that takes advantage of all possible information available in the standard specification. Every DTV receiver requires an equalizer whose function is to remove the intersymbol interference generated by the channel multipath. This is accomplished, for example, by using available reference signals, in the time and frequency domains, and specific properties of VSB modulation. This is the approach we describe hereafter.
4) Channel Models: Let the received signal be defined by
(15)
The complex envelope of the transmitted signal is denoted $s(t)$, and $f_c$ is the carrier frequency. Finally, $\mathrm{Re}\{x\}$ denotes the real part of $x$. If the channel is composed of $N$ propagation paths (assumed to be distinct), then a model of the received bandpass signal can roughly be expressed as
(16)
where $a_i$ and $\tau_i$ are, respectively, the amplitude and time delay associated with the $i$th propagation path. Assuming the receiver is fixed, the Doppler shift for the $i$th propagation path is given by
(17)
where $\lambda$ is the wavelength of the transmitted signal, and the two angles are those between the velocity direction of the reflecting object and the paths from the reflecting object to the transmitter and to the receiver, respectively. Field studies of DTV signal reception have illustrated a wide range of varying multipath and noise conditions. Besides the noise limitation, it is generally admitted that receiver performance is limited by the ability to equalize long impulse response channels and by the ability to track fast time-varying channels. To be able to specify the appropriate algorithm (and the associated complexity and cost), it is necessary to first evaluate the typical channel conditions for indoor and outdoor reception.
5) The Experience of the Field: Examples of captured DTV signals are furnished below as an illustration of the various conditions that can be observed in the field.
a) Impulse Response Length: The length of the impulse response is the primary element to be investigated, as the complexity of the receiver is, in most cases, still widely dominated by the length of the equalizer filters used to mitigate the multipath effect of the propagation channel. The complexity of the equalizer, defined in terms of gate count, represents roughly 60% to 80% of the complexity of a typical terrestrial 8-VSB front end. If we assume, for the sake of simplicity, that the equalizer filter is defined as a transverse filter, then experience suggests a rule of thumb for selecting the equalizer length relative to the length of the channel impulse response. Theoretically, one could determine exactly the minimal length of the equalizer for a given residual equalization error if the distance to the unit circle of the zeros of the polynomial function characterizing the channel impulse response is known. In practice this metric is difficult to obtain; that is why in most cases designers simply attempt to (slightly) overestimate the equalizer length.
To do so, one must not underestimate the typical channel length for a given statistical coverage area target. Practically speaking, one would attempt to identify “corner case” scenarios to provide enough margin for error. The experience of field measurement is critical, as typical propagation models derived from antenna planning factors may result in assumptions that are erroneous by a scaling factor of up to one order of magnitude. In the example provided by Figs. 9 and 10, which correspond to TV channel (impulse response) signals captured with an outdoor antenna, the channel spans cover respectively -14 and 57 µs. This condition can easily occur in Rayleigh propagation conditions; i.e., where there is no direct path from the transmitter to the receiver, as might occur in mountainous areas or densely packed urban (downtown) areas with tall buildings.
Fig. 9. Channel impulse response with long postechoes.
Fig. 10. Channel impulse response with long preechoes.
Assuming that these situations are somehow exceptional (which seems to be confirmed by several field measurements), a conservative estimate of the maximum channel span would be […] µs. This would effectively lead
to a requirement for an equalizer length of several hundreds of coefficients! Assuming that the complexity is not a determining factor, the question is how to estimate these coefficients, knowing that the training PN511 sequence embedded in each 8-VSB data field is most likely not long enough to estimate the channel coefficients with a least square technique. This question will be addressed in the section on the equalizer architecture.
Fig. 11. Channel impulse response snapshot of field capture channel Was-49-36-06 142 000-opt observed at t = 5.167.
6) Time-Varying Channel: Another critical issue concerns the ability of the receiver to track dynamic impulse response channels. As we saw above, the dynamic behavior of the channel is a function, for the most part, of the speed and the size of the reflecting object. In the example below, we illustrate a situation where the impulse response, consisting of two main taps, has the interesting property that the taps oscillate, with each tap becoming for a period of time the main propagation path. This dynamic variation of the channel encompasses simultaneously two behaviors that might create difficulty for the receiver. The first behavior is the variation of the channel coefficients over time. The variation of each channel coefficient must be tracked and compensated by the equalizer in the receiver. The difficulty here is that the variation of one coefficient of the channel results in a modification of all the coefficients of the equalizer, and the variation would typically be different for each equalizer tap. One can construct academic examples to be convinced of this claim. The second behavior is a phenomenon where the notion of a dominant path is no longer well defined over time. In this particular case, there is no dominant path. This can cause additional difficulties for the receiver, as the notion of a dominant path can be used in the design of the equalizer. For example, one can design an equalizer such that the global impulse response combining the channel and the equalizer corresponds to a time delay furnished
by the dominant path. The mean square error performance of the equalizer is basically a function of the distance of the recovered symbol to the dominant path. If the dominant path changes over time, then the time domain reference of the estimated symbol must either change with the channel (which again implies a redesign of the equalizer) or remain fixed (which leads to a degradation of performance). (See Figs. 11–13.) This phenomenon will be further discussed in the section on equalizer architecture.
Fig. 12. Channel impulse response snapshot of field capture channel Was-49-36-06 142 000-opt observed at t = 5.227.
Fig. 13. Channel impulse response snapshot of field capture channel Was-49-36-06 142 000-opt observed at t = 5.229.
7) Receiver and Equalizer Architecture: Motivations for different types of equalizer structures are proposed in this section. Equalizer design techniques are not specifically addressed, as equalizer design is a topic that has been widely documented in the specialized literature over the past years.
a) Decision feedback equalizer (DFE) architecture: In this section we will focus on the DFE architecture described below, since this architecture represents the dominant solution used for terrestrial DTV receivers. The architecture is described in Fig. 14. We assume in this structure that the input of the equalizer is a signal synchronized in time and frequency to the source. As was briefly described above, the function of the equalizer is to remove the intersymbol interference generated by the channel multipath. In this specific structure, the role of the forward filter is, roughly speaking, to transform the channel into a global impulse response equivalent to a channel with postecho multipath only, since the feedback section of a DFE cannot remove preechoes. The feedback filter is introduced to remove the postecho contributions of the remaining multipath. A decision device is introduced in the feedback loop to estimate the transmitted symbols.
Fig. 14. DFE equalizer structure.
Although it is true that a DFE is not optimum with respect to minimizing error probability (equivalent to the criterion of maximizing the a posteriori likelihood), it is nevertheless also true that an optimum receiver rapidly becomes impossible to implement when the impulse response is as long as is the case for terrestrial DTV channels. Linear transverse equalizers are another option that could be used, but both theory and experience demonstrate that these equalizers offer lower performance than a DFE in terms of the mean square error between the estimated symbols (at the output of the equalizer) and the transmitted symbols. In the case of VSB modulation, we will assume that the forward filter is a real- or complex-valued coefficient filter and the feedback filter is a real-valued filter. In addition to the equalizer itself, it may be necessary to remove additional residual frequency offset in the received signal. This function could be merged with the equalizer and integrated at different locations in Fig. 14. In the case of a complex-valued forward equalizer, for example, this function could be added before or after the forward equalizer. In the case of a real-valued forward equalizer, this function could be added before or after the equalizer. These kinds of implementation details may have a substantial effect on the final performance of the receiver, but their coverage is beyond the scope of this paper.
b) Estimation of the equalizer coefficients: After the structure of the equalizer, the choice of the technique used for the estimation of the equalizer filter coefficients is perhaps the second most important item in the design of the receiver. Below we will separate the cost function to be optimized, expressed in terms of the equalizer coefficients, from the method leading to the optimization of the cost function.
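The DFE structure described above can be sketched in a few lines. This is a deliberately reduced illustration under assumptions that are not from the paper: the forward filter is a single real tap, the channel is a noise-free two-path model with one postecho, the coefficients are adapted with a plain LMS update, and the first 2000 symbols are treated as known training data before switching to decision-directed operation.

```python
import random

LEVELS = (-7, -5, -3, -1, 1, 3, 5, 7)


def slicer(z):
    """Decision device: nearest 8-level symbol."""
    return min(LEVELS, key=lambda a: abs(a - z))


def run_dfe(received, training, n_fb=4, mu=1e-3):
    """One-tap forward filter plus n_fb feedback taps, adapted by LMS."""
    f = 1.0                          # forward tap
    b = [0.0] * n_fb                 # feedback taps
    past = [0.0] * n_fb              # previous decisions, most recent first
    decisions = []
    for n, r in enumerate(received):
        z = f * r - sum(bj * dj for bj, dj in zip(b, past))
        # Use the known symbol while training, the slicer output afterwards.
        d = training[n] if n < len(training) else slicer(z)
        e = d - z
        f += mu * e * r
        b = [bj - mu * e * dj for bj, dj in zip(b, past)]
        past = [d] + past[:-1]
        decisions.append(d)
    return decisions, f, b


random.seed(0)
symbols = [random.choice(LEVELS) for _ in range(3000)]
# Hypothetical channel: main path plus a postecho of gain 0.4 two symbols later.
received = [s + 0.4 * (symbols[n - 2] if n >= 2 else 0)
            for n, s in enumerate(symbols)]
decisions, f, b = run_dfe(received, training=symbols[:2000])
errors = sum(d != s for d, s in zip(decisions[2000:], symbols[2000:]))
print(errors, round(f, 3), [round(x, 3) for x in b])
```

In this toy case the second feedback tap converges toward the echo gain and the postecho is cancelled from past decisions. A practical receiver needs a long forward filter as well, since preechoes cannot be handled by the feedback section.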
In terms of cost functions leading to cost-effective optimization techniques there are essentially two criteria (which can be used to some extent simultaneously) that are widely described in the literature. The first basic solution consists of minimizing the mean square error between the
source (transmitted signal) and the received signal after the equalizer. Assuming that the receiver parameters such as time and frequency offset are known, the mean square error becomes essentially a function of the equalizer coefficients. The mean square technique assumes, of course, that the receiver will use known binary frame sync information sent by the transmitter according to the ATSC standard. This information could be any of the known symbol sequences embedded in the frames being sent. The effectiveness of such criteria depends on how long the channel impulse response is relative to the length of the sequence. If the channel is too long, the problem of finding the equalizer coefficients becomes ill-conditioned. The problem will therefore not have a unique solution, and this will degrade the performance considerably because there is no guarantee in practice that the equalizer solution of that optimization problem will maximize the output signal to noise ratio of the equalizer. The second set of criteria involves specific statistical and/or geometrical properties that the equalizer intends to recover via the optimization of a specific cost function. In contrast with the first set of criteria, defined as training-based solutions, the second set is called blind criteria because knowledge of the transmitted signal is not specifically required. One of the most widely used blind criteria for equalization is the constant modulus criterion (a flavor of which has been provided in the sections above on symbol time synchronization and RF carrier synchronization). The criterion is defined as an average square distance between the modulus (square root of the sum of the square of the in-phase component and the square of the quadrature component) of the estimated symbols at the output of the equalizer and a fixed constant that depends on the statistical properties of the transmitted signal.
In the case of the 8-VSB modulation used in the ATSC standard, the constant modulus criterion appears to be a very effective solution because it does not suffer from the same limitations
as the mean square criterion. Indeed, it has been shown that the MSE criterion remains ineffective for a large range of DTV channels, as the training sequence is both too short and too infrequent for typical time-varying channel conditions. Additionally, the CM criterion equalizer has properties which are very comparable to those of the MSE equalizer. Typically, low-complexity solutions are required in DTV receivers for the optimization of the above-mentioned criteria. A very common solution, which has numerous variations, is simply to use an iterative gradient descent algorithm (based on the derivative of the criterion) to find the minimum point (i.e., filter) of each cost function. Numerous studies and solutions have been proposed in the literature to optimize the convergence speed and/or the tracking capability of such algorithms. The improved techniques may use frequency domain implementation, additional constraints (to limit the range of the filter coefficients), and improved initialization techniques to start with a closed form solution that is as close as possible to the minimum point.
III. TRANSPORT DECODING AND PROCESSING
Receivers must parse and process data ancillary to audio and video in order to locate and properly decode the audio and video programming as well as to present information to the viewer that may be essential to the viewing experience (e.g., closed captioning to a hearing-impaired viewer) or that provides information helpful to the viewer (e.g., an electronic program guide). The MPEG-2 Transport Stream is a bit stream framed into 188-byte transport stream (TS) packets. Each packet contains a field that identifies the stream of data of which it is a part [packet identifier (PID)], thereby enabling the multiplexing of many different streams of data. There are other fields that indicate other pieces of necessary data. For example, the program clock reference (PCR) enables synchronization of the 27-MHz MPEG clocks.
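As a minimal sketch of the packet framing just described, the fixed 4-byte header of a 188-byte TS packet can be unpacked as follows. The function name and the dictionary keys are illustrative choices made here; the bit layout follows the MPEG-2 Systems transport packet header.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def parse_ts_header(packet: bytes) -> dict:
    """Extract the fixed header fields of one MPEG-2 transport stream packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet must be exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing 0x47 sync byte")
    return {
        "transport_error": bool(packet[1] & 0x80),
        "payload_unit_start": bool(packet[1] & 0x40),
        # 13-bit PID: low 5 bits of byte 1 followed by all of byte 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "adaptation_field_control": (packet[3] >> 4) & 0x03,
        "continuity_counter": packet[3] & 0x0F,
    }


# Example: start of a payload unit on PID 0, payload only, counter 0.
example = bytes([0x47, 0x40, 0x00, 0x10]) + bytes(184)
header = parse_ts_header(example)
print(header["pid"], header["payload_unit_start"])
```

A demultiplexer would route each packet by its PID. The PCR itself is carried in the optional adaptation field, which this sketch does not decode.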
A complete description of PCR and of the effects of jitter is contained in a companion paper on transport found elsewhere in this issue. That paper also contains a full discussion and explanation of the acronyms used below.
A. PSI/PSIP Structure and Processing
PSIP is a collection of tables, descriptors, and other data. Receivers must use a certain amount of this data to locate programming and present it. For example, PSIP facilitates DTV receiver tuning of specific programming using “virtual” channels, without viewer knowledge of actual RF channels and frequencies. Other portions of PSIP data are useful for enabling optional features (including an electronic program guide).
B. Accumulating Programs
The transport stream broadcast is continuous, which is to say that there is no beginning or end and it will change over time. Furthermore, the data that describes the stream is carried in many different pieces (different tables, descriptors, etc.), and as a result the stream must be constantly examined for ongoing changes. Typically, a receiver will maintain
a database of the information carried in the transport stream (and of information from other transport streams, left over from the last time the receiver was tuned to them). The transport stream is monitored, and changes are reflected in the database. Programs may be discovered and tuned using information carried in either of two sets of tables: PAT and PMT, or VCT and SLD.
1) CVCT and TVCT: There are small but significant differences between the cable virtual channel table (CVCT) and the terrestrial virtual channel table (TVCT). The CVCT includes “path_select,” to indicate on which of two coax cables the channel is carried,² whether the channel is carried via an out-of-band path, and a one-part channel number. The TVCT does not include “path_select” or any out-of-band options, and requires a two-part major/minor channel number.
2) One- and Two-Part Numbers: Digital cable systems continue to use one-part channel numbers; each channel is identified by a single number, whether that channel is carried via an analog path or a digital one. Terrestrial broadcast, in order to maintain long-entrenched branding and identity, uses a two-part channel number. Two-part numbers allow broadcasters to continue to be known as “Channel 4,” and allow consumers to choose among several virtual channels included in the “Channel 4” programming. Though the standard does not specify the delimiter between the major and minor numbers, a “dot” or “dash” is typically used. PSIP (and, to some extent, FCC rules) requires certain numbers be used for the major and minor parts. For example, for terrestrial broadcast, the major channel number is limited to the range 1 to 99 for ATSC DTV or audio services and the minor channel number is required to be zero for analog broadcasts. See ATSC A/65B Annex B for more details.
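The numbering rules above can be illustrated with a short Python sketch. The function names are hypothetical, and the check that a digital service uses a nonzero minor number is an assumption inferred from the rule that zero is reserved for analog broadcasts; ATSC A/65B Annex B remains the authority.

```python
from typing import Optional


def is_valid_terrestrial_number(major: int, minor: int, is_analog: bool) -> bool:
    """Check a terrestrial two-part number against the rules quoted in the text."""
    if not 1 <= major <= 99:
        return False
    if is_analog:
        return minor == 0          # analog broadcasts use minor number zero
    return minor != 0              # assumption: zero is reserved for analog


def format_channel(major: int, minor: Optional[int] = None, delimiter: str = ".") -> str:
    """Format a one-part (cable) or two-part (terrestrial) channel number.

    The delimiter is a receiver design choice; "." and "-" are typical.
    """
    if minor is None:
        return str(major)
    return f"{major}{delimiter}{minor}"
```

For example, `format_channel(4, 1)` gives the familiar "4.1" style, while `format_channel(57)` gives a one-part cable number.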
3) Hide Guide and Hidden: The VCT carries two bits which are used to indicate whether a program should be listed in the program guide, should be tunable, and whether it should be completely hidden (for example, an alternate channel used for directed channel change). If a channel is not “hidden” (the hidden bit is ‘0’), the channel should be accessible, displayed on the guide, etc. If a channel is hidden, but with hide_guide = ‘0’, it should not be tunable but should be displayed in the guide (e.g., it is an inactive channel). If a channel is hidden and with hide_guide = ‘1’, then it should not be displayed in the guide and should not be directly tunable by a viewer.
C. Program-Related Data The transport stream includes significant detail about current and future programming, intended to allow a receiver to accumulate and display an electronic program guide.
1) Event Information Table (EIT)/Extended Text Table (ETT) Parsing: The EIT and ETT provide information on programming (“television shows”). This information includes the name of the event, the start time and length, and optionally a description (carried in the ETT). Each event also has a descriptor loop, which optionally carries content advisory information (“V-Chip”), audio information, and caption information.
2) Content Advisory: The content advisory descriptor describes what rating levels a program has in each of the ratings dimensions defined by the ratings system in use. Several ratings systems are defined [and may be broadcast via the rating region table (RRT)]. Rating region 0x01 duplicates the analog ratings system (“TV-G,” “TV-PG,” etc.), and new ratings systems may be defined and announced via RRT broadcast. See CEA-766 for details on defined ratings regions.
3) Caption Descriptor: The caption service descriptor is required to be in the EIT for each event with captioning. The descriptor identifies the caption language and other information (“easy reader” designation, for example).
² This feature is mostly deprecated. There are some cable systems still using dual cables, but this number is decreasing.
Table 5 ATSC Transmission Formats (from ATSC A/53D Annex A)
D. Maintenance of Channel Map Similar to PAT and PMT or VCT and SLD acquisition and caching, typically a receiver will construct and maintain a database containing what can be broadly referred to as “event data”: what is on, when, what captions are available, what languages, etc. However, this database is significantly more complicated than the PAT and PMT or VCT and SLD data, for several reasons: there is much more data, it is much more interconnected (more relational), and it is carried on many different channels at varying repetition rates. During normal operation receivers are likely to accumulate data, continually updating the database as new data is seen and old data expires. At other times (e.g., in “standby mode”), receivers are likely to tune to known broadcast signals and accumulate information, as well as search for new stations coming on-air. CEA CEB-12-A has a detailed description and recommendations on this process.
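A minimal sketch of such a cached channel map is given below in Python. It combines the continual-update behavior described here with the hidden and hide_guide bits of Section III-B. The class names, the use of a table version number to detect changes, and the keying by transport stream identifier are assumptions made for illustration; a real receiver database would also hold event data and expiry times.

```python
from dataclasses import dataclass


@dataclass
class VirtualChannel:
    major: int
    minor: int
    hidden: bool
    hide_guide: bool

    @property
    def tunable_by_viewer(self) -> bool:
        # Only channels that are not hidden can be tuned directly.
        return not self.hidden

    @property
    def shown_in_guide(self) -> bool:
        # A hidden channel still appears in the guide when hide_guide is 0.
        return (not self.hidden) or (not self.hide_guide)


class ChannelMap:
    """Hypothetical cache of VCT contents, one entry list per transport stream."""

    def __init__(self) -> None:
        self._entries = {}
        self._versions = {}

    def update_from_vct(self, ts_id: int, version: int, channels) -> bool:
        """Store a newly received table; return True if the cache changed."""
        if self._versions.get(ts_id) == version:
            return False               # same version already cached
        self._entries[ts_id] = list(channels)
        self._versions[ts_id] = version
        return True

    def guide_channels(self):
        """Sorted (major, minor) pairs that belong in the program guide."""
        return sorted(
            (c.major, c.minor)
            for chans in self._entries.values()
            for c in chans
            if c.shown_in_guide
        )
```

The same pattern (compare a version, then replace the cached copy) extends naturally to EIT and ETT data, at the cost of a more relational store.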
IV. DISPLAY SYSTEMS A. Format Conversion For the purpose of this paper, format conversion is defined as a spatiotemporal translation of a transmitted image format into another format best suited for display to the viewer. Within an HDTV receiver design, format conversion enables lower implementation complexity, improved performance, and increased consumer convenience. Specific examples of where implementation complexity can be lowered and performance improved are as follows.
• In CRT-based designs, format conversions may permit near constant horizontal and vertical deflection rates, which simplify convergence systems and deflection energy recovery systems.
• Format conversions allow efficient use of a wider range of display technologies, which may not have a native resolution equivalent to any of the ATSC transmission formats.
• Format conversion can mitigate the effects of burn-in with phosphor-based displays by resizing the image so that inactive portions of video are not persistently displayed at the same screen coordinates.
Specific examples relating to consumer convenience include the following.
• Format conversion permits backward compatibility to existing display systems.
• Format conversion allows the maximum interoperability with new and evolving display systems.
• Format conversion can minimize transients presented to display subsystems as transmission formats change during programming, allowing visually seamless transitions.
• Format conversion can resize the transmitted image to best suit viewing preferences.
Table 5 lists the transmission formats allowed in ATSC broadcast signals. There are several noteworthy points in this table. First, look at the frame rate codes 1 and 2. These two codes refer to rates of 23.976 and 24 Hz, respectively, and they differ
Table 6 The 18 ATSC Formats
by a factor of exactly 1001/1000. (The same relationship applies to codes 4 and 5, and 7 and 8.) The significance of this factor is backward compatibility to NTSC frame rates, and by implication, frequency relationships among chrominance, luminance, and audio components in an RF modulated signal, or chrominance and luminance components in a composite signal. The best down-conversion to NTSC display formats will be served with frame rate codes of 1, 4, and 7, as the spectra of the signal components will interleave properly, presenting minimum mutual interference. For receivers with integrated displays, or set-top boxes with component outputs, there should be no performance impact.
A second item to note is aspect ratio information; there are two target image aspect ratios: 16 : 9 and 4 : 3. For 1920 × 1080 and 1280 × 720 transmission formats, square pixels result when displayed on a 16 : 9 aspect ratio display. And for 640 × 480, square pixels result when displayed on a 4 : 3 display. However, for 704 × 480 transmission formats, square pixels never result, and in fact, the target display can be either 4 : 3 or 16 : 9 for this image format. As a result, the ATSC table can be rewritten as 18 explicit transmission formats, as given in Table 6.
Since a consumer generally watches one display at a time, a common goal in HDTV receiver design is to optimally convert all 18 ATSC transmission formats into a single narrow range of output formats best suited for the display of choice. To complicate matters, however, the actual active video area could be smaller than the transmission format. A user option is sometimes involved in such an event to allow format conversion to perform further processing, such as resizing the active image area to fit the display (Table 7). Although HDTV receivers are nominally defined with 16 : 9 format displays, some manufacturers may opt to design an integrated receiver incorporating a 4 : 3 format display.
Furthermore, set-top HDTV receivers may have to account for connectivity to both 16 : 9 and 4 : 3 displays.
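The square-pixel observations above follow from simple arithmetic, sketched here in Python with exact fractions. The function names are illustrative assumptions, and the display in `fit_active_area` is assumed to have square pixels; the sketch also shows how the active picture and bars can be sized when the content and display aspect ratios differ.

```python
from fractions import Fraction


def pixel_aspect_ratio(width: int, height: int, display_aspect: Fraction) -> Fraction:
    """Pixel width/height needed for a width x height raster to fill the target aspect ratio."""
    return display_aspect / Fraction(width, height)


def fit_active_area(display_w: int, display_h: int, content_aspect: Fraction):
    """Largest (width, height) of the given aspect ratio that fits the display.

    Assumes a square-pixel display; the remainder becomes letterbox bars
    or side panels.
    """
    display_aspect = Fraction(display_w, display_h)
    if content_aspect < display_aspect:
        # Content is narrower than the display: side panels ("pillar-box").
        return int(display_h * content_aspect), display_h
    # Content is as wide or wider than the display: letterbox bars.
    return display_w, int(display_w / content_aspect)
```

For instance, `pixel_aspect_ratio(704, 480, Fraction(4, 3))` evaluates to 10/11 and the 16 : 9 case to 40/33, so neither is square, whereas 1920 × 1080 at 16 : 9 evaluates to exactly 1.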
Some display technologies may not react favorably to extended display of images having black horizontal and vertical bars, such as resulting from letterboxing, side panels, or “pillar-box.” Phosphor-based displays may “burn-in” these bars to the extent that manufacturers often take proactive measures to mitigate these effects. Examples of proactive measures include: 1) resize (zoom) the image so that the active video fills the display area; 2) translate the luminance value of bars from black to a neutral gray (or some function of the average picture luminance level) so that burn-in is equalized across the display area; 3) read the active format description (AFD) and bar_data() in the ATSC bitstream to process the images, per above; 4) directly read luminance value to deduce locations of active area video in the case that AFD and bar_data() are inadequately coded. B. Decoder Film Mode Detection Film mode detection, as used within an encoder, attempts to undo the 3 : 2 pull-down (telecine) process on the video, thus dropping some degree of redundant video. This process is sometimes known as detelecine, and creates a 24 frames per second (fps) transmitted sequence from the source video, as follows:
A bit stream savings is realized, and image quality can be improved or additional video can be placed within
Table 7 Display Options
the transport multiplex. Following an explicitly defined 3 : 2 pull-down cadence indicated in the video bit stream, a decoder will redo the telecine process for a 60 (59.94) Hz display, as follows:
In the case that detelecine is not done at the encoder, a transmitted interlaced signal will offer the best conversion to progressive frames at the decoder if a method is devised to perfectly reconstruct original film frames (by detelecine), then redo telecine to create a progressive display. The following transform shows a telecine conversion of 24-fps film material to 30-fps interlace video, which is transmitted from the encoder:
The above interlaced frames may be transmitted with some bitstream savings. The following process, at the decoder, shows decoded fields (30-fps interlace video has 60 fields per second) that are detelecined into film frames at 24 fps, then telecined into progressive video frames at 60 fps:
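The cadences discussed in this subsection can be sketched in a few lines of Python. Film frames are represented only by labels, and the detelecine step assumes that the source film frame of each field is already known (for example, from cadence flags in the bit stream); the function names are hypothetical.

```python
def pulldown_fields(film_frames):
    """3:2 pull-down of film frames into alternating top/bottom fields."""
    fields = []
    top = True
    for i, frame in enumerate(film_frames):
        for _ in range(3 if i % 2 == 0 else 2):    # 3 fields, then 2, repeating
            fields.append((frame, "top" if top else "bottom"))
            top = not top
    return fields


def interlaced_frames(fields):
    """Pair consecutive fields into interlaced video frames."""
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


def detelecine(fields):
    """Recover film frames, assuming each field's source frame is known."""
    film = []
    for frame, _parity in fields:
        if not film or film[-1] != frame:
            film.append(frame)
    return film


def pulldown_progressive(film_frames):
    """3:2 pull-down of film frames into progressive display frames."""
    out = []
    for i, frame in enumerate(film_frames):
        out.extend([frame] * (3 if i % 2 == 0 else 2))
    return out
```

With four film frames A, B, C, D, `pulldown_fields` yields ten fields, which pair into five interlaced frames; two of those five mix fields from different film frames, which is why a decoder that deinterlaces without recovering the cadence produces double images on motion.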
Failure to properly reconstruct the original film frames in this scenario will result in progressive frames for display that have elements of two film frames, causing an apparent double image (feathering, blurring, or “mouse teeth” on edges) if motion occurs. In the scenario where interlaced material is broadcast having content originally telecined from film (which may have been done much earlier), the encoder is oblivious to the film origin, so no ancillary information about the 3 : 2 cadence is found in the video bit stream. The decoder is left to sort out by itself which decoded fields belong to which film frames. The decoder’s film mode detector provides hints as to the missing information, performing comparisons of fields for nearly identical content. A 3 : 2 pull-down cadence can be inferred from field similarities and differences, but this process is not always perfect. Nevertheless, the improvements are substantial enough to be useful to include in consumer products. In addition to the possibility of improper reconstruction of the original 3 : 2 cadence, a perfectly reconstructed cadence still exhibits some slight artifacts: A slight judder (a recurring stutter in the apparent motion of video) is perceived due
Fig. 15. Lip sync error as a function of frame drop/repeat.
to the fact that 3 : 2 pulldown results in some film frames being displayed for three display frames (3/60th of a second) while other film frames are displayed for two display frames (2/60th of a second). The three-frame images slightly dominate, since they are displayed longer, and motion is perceived as being less “smooth.” One is most likely to see judder during horizontal pans. Using 3 : 3 pulldown, a 72-Hz display rate could mitigate this judder for 24-fps film material, but this causes problems with other frame rates. A few manufacturers offer the feature of 72-Hz display for film material, but the display switches to 60 Hz for nonfilm video.
C. Display Rate Control MPEG-2 video compression, as used in ATSC, specifies all frame rates in terms of a specific number of ticks of a 27-MHz system time clock, which can be recovered from the transport stream using the embedded PCRs (program_clock_reference). The MPEG PTSs (presentation_time_stamps) are further specified in units of the system time clock divided by 300. It is therefore possible to display each decoded frame at a time precisely defined by PTS, and such a constraint preserves a constant end-to-end system delay from encoder input frames to decoder output frames. Display of video at a time precisely defined by the PTS implies a display frame rate that is precisely locked to a recovered MPEG system time clock. Every coded frame is displayed: no decoded frames need be dropped nor unexpectedly repeated. Precise display rate control is important for those manufacturers who believe that every transmitted frame should be displayed precisely the number of times intended, and that lip sync errors should be held to a minimum. An architecture that requires a recovered MPEG system time clock to drive a display pixel clock is not always efficient to implement. Some ATSC receivers have display system architectures that run asynchronously with the system time clock.
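As a worked example of the clock relationships above, the following Python sketch computes the frame period in 27-MHz ticks and in PTS units (the system time clock divided by 300) for several ATSC frame rates. The constant and function names are illustrative assumptions; exact fractions are used so that the 1001/1000 rates are not rounded.

```python
from fractions import Fraction

SYSTEM_CLOCK_HZ = 27_000_000   # MPEG-2 system time clock
PTS_DIVISOR = 300              # PTS counts the system clock divided by 300


def frame_period_ticks(frame_rate: Fraction) -> Fraction:
    """Frame period expressed in 27-MHz system clock ticks."""
    return Fraction(SYSTEM_CLOCK_HZ) / frame_rate


def pts_increment(frame_rate: Fraction) -> Fraction:
    """Frame period expressed in PTS units."""
    return frame_period_ticks(frame_rate) / PTS_DIVISOR


if __name__ == "__main__":
    for name, rate in [
        ("24", Fraction(24)),
        ("24/1.001", Fraction(24000, 1001)),
        ("30/1.001", Fraction(30000, 1001)),
        ("60/1.001", Fraction(60000, 1001)),
    ]:
        print(name, frame_period_ticks(rate), pts_increment(rate))
```

Every listed rate gives a whole number of 27-MHz ticks per frame, but not always a whole number of PTS units, so a receiver that schedules display from PTS values has to carry the fractional remainder from frame to frame.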
Decoded frames become available at a rate that precesses through the display frames. If decoded frames are received faster than the display rate, then decoded frames must be dropped occasionally from display. If decoded frames are received slower, then those frames must be repeated occasionally to the display. Some manufacturers find this architecture tolerable, since the drop or repeat can be held to a single event per several minutes. Beyond an occasional visual judder, the frame precession between decode and display introduces a lip sync error (of up to one display frame time), which grows throughout the precession until a frame is dropped or repeated. (See Fig. 15.)
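The precession described above can be quantified with an idealized model, sketched here in Python: the two rates differ by a constant amount, one frame is dropped or repeated each time the accumulated difference reaches one frame, and the lip sync error ramps linearly between those events. The function names and the linear sawtooth are simplifying assumptions, not a description of any particular receiver.

```python
import math


def seconds_between_slips(decode_hz: float, display_hz: float) -> float:
    """Time for the decode and display frame counts to differ by one frame."""
    diff = abs(decode_hz - display_hz)
    return math.inf if diff == 0 else 1.0 / diff


def slip_action(decode_hz: float, display_hz: float) -> str:
    """Frames arrive too fast -> drop; too slow -> repeat."""
    if decode_hz > display_hz:
        return "drop"
    if decode_hz < display_hz:
        return "repeat"
    return "none"


def lip_sync_error_s(t: float, decode_hz: float, display_hz: float) -> float:
    """Idealized sawtooth: error grows to one display frame time, then resets."""
    period = seconds_between_slips(decode_hz, display_hz)
    if math.isinf(period):
        return 0.0
    return ((t % period) / period) * (1.0 / display_hz)
```

Under this model a 100-ppm offset at 60 Hz gives one drop or repeat roughly every 167 s, consistent with the "single event per several minutes" figure, with the audio/video offset swinging through about 16.7 ms over each cycle.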
For displays with frame periods of 1/30th of a second, lip sync errors may be intolerable; with frame periods of 1/60th of a second, they may be marginally tolerable. Several standards and findings have been written on the subject of tolerable lip sync errors: ATSC IS-191 [22], ITU-R BT.1359-1 (1998), ITU-R BR.265. The ATSC finding states that a receiver should present audio and video with a ±15 ms tolerance. In practice, it is possible to design a system with much tighter constraints, as described below.
An improved alternative to dropping and repeating frames is dropping and repeating nondisplayed (vertical interval) lines in a frame. Assuming a raster generator that can be dynamically programmed on a frame by frame basis, lip sync errors can be held to a few tens of microseconds, even though the display clock runs asynchronously with the system time clock used by MPEG. Additionally, this approach allows each frame to be displayed as intended (at the time it was intended) and without judder. Proper display timing is achieved by having the decoder’s system controller compare the actual display time versus the intended display time (via PTS), and programming the raster generator to speed up or slow down accordingly.
For conversion of HD video to standard definition displays using composite or RF remodulation to NTSC, it is almost always desirable to recover the 27-MHz MPEG system clock to be used directly as the NTSC encoder timebase. This approach preserves frequency relationships from MPEG system time clock to frame rate timing, and from frame rate timing to chrominance encoding.
Display startup is another display related quality issue that manufacturers may choose to address in their products. Fig. 16 illustrates input frame timing at the encoder, transmission of coded frames, and output frame timing at the decoder. For all frames of a given video bitstream, D minus E is a constant.
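The vertical-interval line adjustment described earlier in this subsection can be sketched as a simple per-frame correction. This Python fragment is a hedged illustration only: the function name, the 90-kHz time units, the rounding to whole lines, and the clamp `max_adjust` are assumptions, and a practical controller would likely filter the error rather than correct it in one step.

```python
def lines_to_adjust(intended_pts_90k: int, actual_display_90k: int,
                    line_time_us: float, max_adjust: int = 4) -> int:
    """Change in blanking lines for the next frame (hypothetical controller).

    A positive result lengthens the frame (display is early); a negative
    result shortens it (display is late).
    """
    # Timing error in microseconds; positive means the frame was shown late.
    error_us = (actual_display_90k - intended_pts_90k) * 1e6 / 90_000
    lines = round(error_us / line_time_us)
    # Limit the per-frame change so the display device sees a gentle variation.
    lines = max(-max_adjust, min(max_adjust, lines))
    return -lines
```

With a line time of about 29.63 µs (1125 total lines at 30 frames/s), a frame displayed 100 µs late calls for removing three blanking lines from the next frame; the residual error is then smaller than one line time, which is in keeping with the "few tens of microseconds" figure quoted in the text.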
An additional display quality issue that receivers address is management of display frame timing. This is not a problem in and of itself, but what happens if the decoder switches from the current decoded video stream to another video stream, either from the same transport multiplex or a different one, whose display times are not aligned with those of the current program? Fig. 17 illustrates the display frame timing issue. Either due to different source video phasing or different end-to-end system delay, it is likely that the display raster generator will need to be reset to force an initialization in display timing at the decoder when switching the decoding from the first stream to the second, given that precise MPEG PTS timing is to be maintained. The resulting display discontinuity may be visible as a partial frame, or worse, impart a transient to the display device that will require more than one display frame time to recover. While some manufacturers may find this frame timing discontinuity tolerable (or perhaps having little effect), it could play havoc with on-screen displays which are usually present during channel changes. An alternative to resetting the raster generator timing may be to impart an additional display delay at the decoder, such that a new video stream display timing
Fig. 16. Input frame timing relationships.
Fig. 17. Frame timing relationship.
aligns perfectly with the old display timing. The amount of display alignment delay to be added will be computed at bitstream acquisition and will be a constant for a given bitstream and display startup. The maximum magnitude of this alignment delay should be no larger than one display frame time. Rather than add a new video delay buffer in the decoder, it may be more efficient to add delay by way of the decoder bit buffer (additional bit buffer margin) and delay MPEG decoding and display by the amount of alignment delay needed. Audio should be delayed by an equivalent amount to maintain acceptable lip sync error.
V. COMMON CONSUMER HIGH DEFINITION DISPLAY TECHNOLOGIES Selection of display technologies has a major impact on the viewing experience of HDTV, but often this is simply
a matter of user preference since there is no perfect choice for all viewing situations. A comparison among popular HD display technologies is represented in Table 8. A. CRT-Based Displays Of the display technologies listed above, CRT-based displays come closest to a purely analog system. This low noise, high bandwidth, analog characteristic is the reason for the CRT’s capacity to display a high pixel depth. CRTs also differ from most other display types in that multiformat deflection systems obviate the need for format converters. Monitor manufacturers have occasionally chosen to forgo multiformat deflection systems in favor of digital format converters, with benefits of simpler designs (e.g., high-voltage CRT supplies which piggyback off of the
Table 8 Comparison of Display Technologies
deflection circuitry) and deflection energy recovery circuits that boost power efficiency. Downsides of electronic format conversion, however, include the potential for introduction of conversion artifacts, such as reduced pixel depth and image contouring at low luminance levels. Direct view CRTs have a significant weight problem in the HDTV application. The preferred 16 : 9 picture aspect ratio and higher resolution (which implies larger image size) run afoul of a structural design that can safely hold back atmospheric pressure in such a large volume. Unfortunately, the common solution is “more glass,” which quickly adds to weight. Consumer preferences for a flat faceplate further exacerbate structural integrity issues. Again, more glass is a solution. The “screen size”-to-weight ratio easily favors the rear projector for CRT-based HDTV systems. One manufacturer’s 34-in diagonal 16 : 9 direct view HDTV monitor weighs in at a hefty 170 lb. Larger screen sizes can even approach 300 lb. By comparison, a 57-in diagonal rear projector from another manufacturer weighs in at a slightly higher 198 lb.
B. Plasma Display Panel Plasma display panel technology is a moderately robust system in which an electric discharge ionizes a mixture of inert gases (helium, xenon, and perhaps others) that are permanently sealed in a small cell, causing it to emit UV
light, which in turn fluoresces colored phosphors at the rear of the cell. The colored light is emitted through the transparent electrode at the front of the cell. Color gamut and contrast ratio are improved with colored filters and masking over the transparent electrode. (EMI filters are also part of this filter structure.) The colored phosphors are subject to some degree of decay in efficiency over the lifetime of the panel. Plasma display panels are an inherently digital system in which individual cells are turned full-on and full-off at a rate above the flicker fusion threshold of the human eye. The ratio of on-time to off-time precisely determines the brightness perceived, but is also a source of some spatiotemporal artifacts on moving edges within an image. Additionally, care must be taken in how cells are modulated, even for static images, or false image contouring may result. A cell (or subpixel) is one of red, green, or blue, and at least three cells compose a single pixel. Plasma display panels have the desirable characteristics of being thin, flat, and large. Weight is reasonable: a 50-in diagonal screen weighs in at about 90 lb from one manufacturer. Plasma display panels shown by manufacturers at the 2005 Winter CES achieved resolutions as high as 1920 × 1080 with screen sizes up to 80 in for production units. The physical structure of a plasma display panel is illustrated in Fig. 18.
Fig. 18. Major elements of a plasma display panel device.
Fig. 19. Three-imager LCD display.
C. Liquid Crystal Display Liquid crystal displays are a transmissive technology that enjoys a perfect spatial uniformity and no screen burn-in effect. LCD displays are rather trendy due to the very thin form factor and low weight. These displays also have good color reproduction and longevity, but some downsides are mechanical fragility and difficulty in manufacturing large screen sizes. Nevertheless, one popular LCD manufacturer has announced a 65-in diagonal display having a 1920 × 1080 resolution. Traditionally, these displays have been hampered by low contrast ratio and limited viewing angles, but improvements in back-lighting technology have helped performance. Contrast ratios as high as 1000 : 1 and wide viewing angles are advertised, as well as faster image response times, reducing the motion “smearing” of earlier displays. One small technology company has advertised a hybrid LCD/LED technology in which a modulated
LED back-light array boosts contrast ratio of the LCD to 40 000 : 1. LCD technology is also instantiated in some rear projection designs; however, low contrast ratio and low optical efficiency remain as carryovers from direct view characteristics. As a result, high gain screens are used which can limit the viewing angle. An example three-imager LCD design is shown in Fig. 19. D. Liquid Crystal on Silicon (LCOS) This LCD variant utilizes a reflective imager rather than a transmissive imager (Fig. 20). LCOS technology has several of the LCD technology advantages (high resolution, good color reproduction, and no burn-in). Instantiated in one-chip or three-chip rear projector designs, this technology can meet adequate to good contrast ratios and optical efficiency. However, pixel uniformity in dark colors is difficult to achieve. This technology is still maturing: one manufacturer utilizes
Fig. 20. Functional elements of an LCOS display.
a design with a larger pixel fill-factor, improving overall optical efficiency and contrast ratio. The brightness response curve is also said by the manufacturer to improve low brightness pixel uniformity.
E. Digital Micromirror Device (DMD) DMD technology utilizes a grid of microscopic electronically tiltable mirrors (each of which can be independently tilted by 12°) manufactured on a single slice of semiconductor silicon, as the active imaging element. Each pixel location, as represented by a single mirror, cycles between full on and full off at the extremes of the tilt positions. The ratio between on-times and off-times determines pixel brightness. As a result, the DMD has an inherent and precise digital brightness response, as in PDP display technology. The DMD differs from PDP in the fact that most consumer display implementations are color sequential, thus obviating the need for three imagers. For a single DMD imager implementation, each pixel must cycle between ON and OFF for each of three primaries: R, G, and B per frame. In order to minimize color decomposition (rainbow effects) when a viewer’s gaze shifts from point to point on the image, red, blue, and green light is imaged multiple times (4–6) per frame (1/60th of a second) on the imager. But as a consequence, the short on/off times per color and per frame limit the levels of brightness that can be rendered. This is particularly problematic at low brightness levels, where images can begin appearing noisy (dark noise). A spinning circular color wheel composed of red, green, and blue dichroic filters at the lamp output creates the color sequential illumination for the DMD imager. Although simple, this structure has two downsides: 1) at any instant, light that composes one primary is used and remaining light is thrown away (some systems try to recover a certain degree of spoke light) and 2) light modulated by wheel “spokes” and imager can interact to create additional artifacts.
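The trade-off between color cycling rate and renderable brightness levels can be illustrated with a rough time budget, sketched here in Python. The minimum mirror on-time of 20 µs is a purely hypothetical figure chosen for the arithmetic, the color wheel is assumed to have equal-width segments with no spoke loss, and the function names are illustrative.

```python
def color_slot_us(frame_rate_hz: float = 60.0, primaries: int = 3,
                  color_cycles: int = 4) -> float:
    """Duration of one color segment, assuming equal segments and no spoke time."""
    return 1e6 / (frame_rate_hz * primaries * color_cycles)


def levels_per_slot(slot_us: float, min_on_time_us: float) -> int:
    """Number of distinct on-times (including zero) that fit in one color slot."""
    return int(slot_us // min_on_time_us) + 1


if __name__ == "__main__":
    MIN_ON_US = 20.0   # hypothetical shortest usable mirror on-time
    for cycles in (1, 4, 6):
        slot = color_slot_us(60.0, 3, cycles)
        print(cycles, round(slot, 1), levels_per_slot(slot, MIN_ON_US))
```

Under these assumptions, going from four to six color cycles per frame shortens each color slot from about 1389 µs to about 926 µs and reduces the number of on-time steps available within a slot, which is the mechanism behind the limited low-brightness rendering noted above.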
On the upside, single imager DMD based displays have very good color uniformity, high contrast ratio, and precise convergence. Three imager projectors, as available for digital cinema, are rather expensive for consumer use.
Fig. 21. DMD.
A functional illustration of the DMD system is given in Fig. 21. Most popular DMD imagers for HDTV application have a native resolution of 1280 × 720 pixels (mirrors), and frame update rates are very high, as previously indicated. The highest resolution DMD-based light engines announced at the 2005 Winter Consumer Electronics Show incorporate a feature called “pixel smoothing,” which can double the effective resolution of the imager. “Pixel smoothing” is achieved by micropositioning the image registered on the projection screen by a half-pixel, shifting at a 120 Hz rate. Shifting is achieved by deflecting the entire image using a mirror driven by a voice coil or piezo actuator; a full cycle of two shift positions is completed in 1/60th of a second. Given less than a 100% fill factor per pixel, an improved resolution can be achieved in addition to the “smoothing” feature. Manufacturers have claimed resolution as high as 1920 × 1080 pixels at 60 fps.
VI. DIGITAL CABLE READY REQUIREMENTS A. Digital Cable Ready Defined Over the last 30 years, the dominant television delivery means in the United States has switched from broadcast television to cable television. That switch is significant for designers of DTVs today. The impact is most evident by examining the treatment of cable television transmission and television receivers in Federal Communications Commission (FCC) regulations. According to 47 C.F.R. 76.605, NTSC (analog) channels delivered by cable systems must meet the following requirement: “The cable television channels delivered to the subscriber’s terminal shall be capable of being received and displayed by TV broadcast receivers used for off-the-air reception of TV broadcast signals, as authorized under part 73 of this chapter” [23]. That is to say that the analog cable channels must be delivered in a manner that a broadcast television receiver can receive them. As the U.S.
television industry made its transition to digital, the cable television industry developed its transmission standards through the Society of Cable Telecommunications Engineers (SCTE) and chose QAM as the modulation method,
Table 9 RF Requirements for U.S. Digital Cable Ready Receivers
while the broadcast television industry worked through ATSC and chose VSB modulation. In the digital realm, there is no longer a requirement that cable transmit digital video using the same system as broadcast television. However, the cable television industry has agreed to transmit digital video according to a set of standards. As a result of the cable and television manufacturing industries reaching an agreement on both parties’ obligations to ensure DTVs can be compatible with digital cable systems and vice versa, there is now a regulatory definition of what is required for a television to be marketed in the United States as “digital cable ready” [24] and a complementary obligation for cable systems [25]. These regulations require that a digital cable ready product, which necessarily tunes and demodulates QAM channels, also includes a “DTV broadcast tuner.” According to the NCTA, 67.1% of U.S. households subscribe to cable [26]. This penetration level provides the marketplace incentive to design digital receivers that receive cable television, and due to the additional requirement that digital cable ready receivers include a digital broadcast tuner, many receivers today accommodate both transmission systems. The technical details of what is required to be called digital cable ready can be traced from 47 C.F.R. 15.123 to a document called Uni-Dir-PICS-I01-030903: “Uni-Directional Receiving Device: Conformance Checklist: PICS Proforma.” This PICS document lists receiver requirements and traces those requirements to published standards.
B. RF Requirements In contrast to broadcast television, where the expected signal conditions can be estimated but not governed, digital cable television signals are required to meet certain transmission parameters at the receiver. These transmission system standards from SCTE, which cable systems are required to meet by FCC regulation, then serve as the basis for PICS requirements on digital cable ready receivers. Table 9 provides an example set of important RF requirements taken from the PICS document.
C. CableCARD and Other Interfaces The most distinguishing feature of a digital cable ready receiver is the inclusion of a slot for the cable operator provided security module, called a CableCARD. This architecture allows the conditional access (security) system to be completely separated from the television, so that the operator retains control over and can replace the security system without compromising the television. This interface and the related processing blocks in the television and CableCARD are shown in Fig. 22. The CableCARD interface is electrically and mechanically compliant with the PC Card standard but undergoes a personality change during initialization to support a custom interface subset of the CEA PC Card standard. The RF input at the television contains QPSK signaling from the headend to the CableCARD in a band from 70 to 130 MHz. This link is called forward data channel (FDC) in ANSI/SCTE 40 2004 and carries data at a rate between 1.544 and 3.088 Mb/s according to two different standards [27]. The FDC contains entitlement management messages (EMMs) for digital channel conditional access and other general messaging. There is no reverse data channel (upstream signaling) in a unidirectional product. The RF input also contains NTSC analog channels and quadrature amplitude modulation digital channels in the
range 54–864 MHz. Six-megahertz-wide QAM channels are called forward application transport (FAT) channels and use either 64 or 256 constellation points to deliver approximately 27 or 39 Mb/s, respectively. FAT channels contain MPEG-2 transport streams. Once demodulated, the transport stream is delivered to the CableCARD over the In Band interface, where the conditional access encryption is removed if the subscriber has access to that channel. The stream is then reencrypted with the DFAST algorithm to protect it from being copied when returned to the television.
Fig. 22. CableCARD POD-host interface.
D. System Information and Emergency Alert Messages
An extended channel application in the CableCARD opens a permanent session with the host over which several “flows” can be opened. These flows are requested by the host through a command that specifies what type of service is being requested (always MPEG table sections for unidirectional devices) and the 13-b MPEG-2 Packet Identifier (PID) value for which the host is asking the CableCARD to filter table sections from the forward data channel. The FDC contains QPSK data that is demodulated by the host and passed to the CableCARD, where some data, such as EMMs, terminates and other data is PID-filtered and passed back to the host via extended channel flows.
In accordance with ANSI/SCTE 65 2002, CableCARDs are required to supply system and service information (SI) in the form of MPEG table sections so that the television can navigate the available services [28]. Cable systems may deliver out-of-band SI using one of six profiles, each of which defines the combination of tables and descriptors it uses. In addition to handling the variety of profiles that might be delivered on U.S. cable systems, digital cable ready TVs must use the two-part channel number in the two_part_channel_number_descriptor() for
channel identification and navigation, if such descriptor is present for a given channel.
System information will also be delivered in the FAT channel (in-band) transport stream multiplex in support of channel navigation when a CableCARD is not present. This SI is also delivered in the form of MPEG-2 table sections. The transport stream multiplex, including audio, video, program-specific information, and program elementary stream constraints, as well as service, program guide, and emergency alert information, is defined in ANSI/SCTE 54 2004. This standard relies upon ATSC A/65B for much of the in-band program and system information protocol (PSIP) definition [29].
Digital cable ready televisions must be able to process emergency alert information that is delivered through either the FDC or the FAT channel in accordance with ANSI J-STD-042 [30]. This standard defines a cable_emergency_alert() message delivered in packets identified with the SI_base PID value. For transport streams carrying programs in the clear, that is, unscrambled and viewable without a CableCARD, packets are identified by PID 0x1FFB. For FDC use on the extended channel, emergency alert messages are carried in packets with SI_base PID value 0x1FFC.
VII. INDUSTRY STANDARDS AND REGULATIONS
A. FCC Regulations
Digital cable systems in the U.S. are now required to support CableCARDs by complying with the two main standards defining the CableCARD interface, which are, as referenced by the FCC in [25], SCTE 28 2003 [27] and SCTE 41 2003. Digital cable systems with an activated channel capacity of at least 750 MHz are additionally required to support a core set of transmission standards,
which are, as referenced in FCC rules, SCTE 40 2003 [31], ANSI/SCTE 65 2002, ANSI/SCTE 54 2003, and ATSC A/65B. FCC rules provide details and exceptions to these standards as mutually agreed by consumer electronics manufacturers and cable system operators.
FCC rules also define the requirements for a device that seeks to be marketed as digital cable ready and to make use of an operator-supplied CableCARD [24]. The rules require that a digital cable ready product:
• tunes NTSC analog channels transmitted in the clear;
• tunes digital channels that are transmitted in compliance with SCTE 40 2003;
• allows navigation of channels based on channel information provided through the cable system in compliance with ANSI/SCTE 65 2002 and/or PSIP-enabled navigation defined in ANSI/SCTE 54 2003;
• includes the POD-host interface specified in SCTE 28 2003 and SCTE 41 2003;
• responds to emergency alerts that are transmitted in compliance with ANSI/SCTE 54 2003;
• includes a DTV (ATSC) broadcast tuner;
• provides either a DVI or HDMI interface with HDCP copy protection on a phased-in schedule according to screen size (480p-only products are allowed a Y Pr Pb interface).
Manufacturers are also required to show compliance with the procedures set forth in Uni-Dir-PICS-I01-030903 at a qualified test facility for the first digital cable ready model and to provide a description of functionality in postsale literature.
REFERENCES
[1] ATSC A/53D, Digital Television Standard, Advanced Television Systems Committee, Washington, DC, 2005.
[2] ATSC A/54A, Guide to the use of the ATSC digital television standard, Advanced Television Systems Committee, Washington, DC, Dec. 4, 2003.
[3] ATSC A/74, Recommended practice: Receiver performance guidelines, Advanced Television Systems Committee, Washington, DC, Jun. 18, 2004.
[4] Communications Research Centre Canada, “Results of the laboratory evaluation of Zenith 5th generation VSB television receiver for terrestrial broadcasting report (Version 1.1),” Sep. 2003 [Online]. Available: http://www.crc.ca/en/html/crc/home/research/broadcast/zenith_lab_report.pdf
[5] S. U. H. Qureshi, “Adaptive equalization,” Proc. IEEE, vol. 73, no. 9, pp. 1349–1387, Sep. 1985.
[6] M. Ghosh, “Blind decision feedback equalization for terrestrial television receivers,” Proc. IEEE, vol. 86, no. 10, pp. 2070–2081, Oct. 1998.
[7] S. Lin and D. J. Costello, Jr., Error Control Coding: Fundamentals and Applications, ser. Prentice-Hall Series in Computer Applications in Electrical Engineering, F. F. Kuo, Ed. Englewood Cliffs, NJ: Prentice-Hall, 1983.
[8] G. C. Clark, Jr. and J. B. Cain, “Error-correction coding for digital communications,” in Applications of Communication Theory, R. W. Lucky, Ed. New York: Plenum, 1981.
[9] Consumer Electronics Association, Antenna control interface, EIA/CEA-909, Jun. 2002.
[10] A. Youtz, J. Zygmaniak, S. Reichgott, and D. Koeger, “An EIA/CEA-909 compatible smart antenna system for digital terrestrial broadcasting applications,” presented at the TPS Conf., 2002.
[11] J. D. Kraus, Antennas, 2nd ed. New York: McGraw-Hill, 1988.
[12] I. Fijalkow, A. Touzni, and J. R. Treichler, “Fractionally spaced equalization using CMA: Robustness to channel noise and lack of disparity,” IEEE Trans. Signal Process., vol. 45, no. 1, pp. 55–66, Jan. 1997.
[13] T. Irmer, “Digital multiprogram systems for television sound and data services for cable,” in ITU-T Recommendation J.83, ITU-T COM09 R R006E1.WW2, International Telecommunications Union, Nov. 2, 1995.
[14] L. Deniere, E. de Carvalho, and D. T. M. Slock, “Identifiability conditions for blind and semi-blind multiuser multichannel identification,” in Proc. SSAP’98, pp. 372–375.
[15] E. de Carvalho and D. T. M. Slock, “Cramer-Rao bounds for semiblind, blind and training sequence based channel estimation,” in Proc. SPAWC’97, pp. 129–132.
[16] S. Adireddy, L. Tong, and H. Viswanathan, “Optimal placement of training for unknown channels,” presented at the 2001 Conf. Information Sciences and Systems (CISS 2001), Baltimore, MD.
[17] A. Touzni, I. Fijalkow, and J. P. LeBlanc, “Semi-blind spatiotemporal equalization of FIR filters with controlled delay,” presented at the IEEE Digital Signal Processing Workshop (DSP’98).
[18] E. A. Lee and D. G. Messerschmitt, Digital Communication, 2nd ed. Boston, MA: Kluwer, 1999.
[19] D. N. Godard, “Self-recovering equalization and carrier tracking in two-dimensional data communication systems,” IEEE Trans. Commun., vol. COMM-28, no. 11, pp. 1867–1875, Nov. 1980.
[20] A. Touzni et al., “Phase recovery based on minimization of single axis constant modulus criterion: Performance analysis,” in Proc. CISS’2001, vol. 2, pp. 811–816.
[21] G. D. Forney, Jr., “Maximum likelihood estimation of digital sequences in the presence of intersymbol interference,” IEEE Trans. Inf. Theory, vol. IT-18, no. 3, pp. 363–378, May 1972.
[22] ATSC implementation subcommittee finding: Relative timing of sound and vision for broadcast operations, Advanced Television Systems Committee, Washington, DC, Jun. 26, 2003.
[23] U.S. Code of Federal Regulations, 47 C.F.R. 76.605(a)(1)(i), Technical standards.
[24] U.S. Code of Federal Regulations, 47 C.F.R. 15.123, Labeling of digital cable ready products.
[25] U.S. Code of Federal Regulations, 47 C.F.R. 76.640, Support for unidirectional digital cable products on digital cable systems.
[26] 2004 year-end industry overview, National Cable and Telecommunications Association [Online]. Available: http://www.ncta.com/pdf_files/NCTAYearEndOverview04.pdf
[27] ANSI/SCTE 28 2004, Host-POD interface standard, Society of Cable Telecommunication Engineers.
[28] ANSI/SCTE 65 2002, Service information delivered out-of-band for digital cable television, Society of Cable Telecommunication Engineers.
[29] ATSC A/65B, Program and system information protocol (PSIP) for terrestrial broadcast and cable, Advanced Television Systems Committee, Washington, DC, Mar. 18, 2003.
[30] Emergency alert message for cable, ANSI J-STD-042, Joint Standard of the Consumer Electronics Association and Society of Cable Telecommunication Engineers, Dec. 2002.
[31] Digital cable network interface standard, ANSI/SCTE 40 2004, Society of Cable Telecommunication Engineers.
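ILLUSTRATIVE SKETCH: PID MATCHING FOR SI AND EMERGENCY ALERT PACKETS
Section VI-D describes filtering by the 13-b MPEG-2 PID and names the SI_base PID values that identify emergency alert packets. The following minimal Python sketch shows one way such a match might look on raw transport packets. It is not taken from any of the cited standards: the function names (packet_pid, filter_by_pid, emergency_alert_source, make_packet) are hypothetical, and only the 188-byte packet length, the 0x47 sync byte, and the position of the PID in the packet header are general MPEG-2 transport stream facts. On an actual extended channel the host receives table sections rather than raw packets, so this illustrates the PID comparison only.

```python
# Minimal sketch: extract the 13-bit PID from MPEG-2 transport packets and
# match it against the SI_base PID values quoted in Section VI-D.
# All function names here are hypothetical illustrations.

TS_PACKET_SIZE = 188             # bytes in one MPEG-2 transport packet
SYNC_BYTE = 0x47                 # first byte of every transport packet
SI_BASE_PID_IN_BAND = 0x1FFB     # FAT channel, programs in the clear
SI_BASE_PID_OUT_OF_BAND = 0x1FFC # FDC, via the extended channel


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID carried in a transport packet header."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte transport packet")
    # The PID is the low 5 bits of byte 1 followed by all 8 bits of byte 2.
    return ((packet[1] & 0x1F) << 8) | packet[2]


def filter_by_pid(packets, wanted_pid: int):
    """Keep only the packets whose PID equals wanted_pid."""
    return [p for p in packets if packet_pid(p) == wanted_pid]


def emergency_alert_source(pid: int):
    """Name the delivery path implied by an SI_base PID, or None."""
    if pid == SI_BASE_PID_IN_BAND:
        return "in-band (FAT channel)"
    if pid == SI_BASE_PID_OUT_OF_BAND:
        return "out-of-band (FDC extended channel)"
    return None


def make_packet(pid: int) -> bytes:
    """Hypothetical test helper: build a zero-payload packet for a PID."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))


if __name__ == "__main__":
    stream = [make_packet(0x1FFB), make_packet(0x0000), make_packet(0x1FFC)]
    for pkt in stream:
        pid = packet_pid(pkt)
        print(f"PID 0x{pid:04X}: {emergency_alert_source(pid)}")
```

A receiver would still need to reassemble and parse the table sections carried in the matching packets before it could act on a cable_emergency_alert() message; that step is omitted here.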
John G. N. Henderson (Fellow, IEEE) received the B.S.E.E. degree cum laude from the University of Pennsylvania, Philadelphia, and the M.S.E. degree from Princeton University, Princeton, NJ. He is Senior Director and Special Consultant for Hitachi America, Ltd. and its Home Electronics (America) Division. At Hitachi, he has managed development projects in digital television, cable modem, and wireless modem technologies. He represents Hitachi at such standards organizations as the Advanced Television Systems Committee (where he is a Vice-Chair of the Technology and Standards Group) and the Consumer Electronics Association. He was deeply involved with the ACATS process in which a digital TV system for the United States was selected. Previously, at RCA Laboratories and its successor, Sarnoff Corp., he contributed to developments in television tuner control systems, IF filter design, video, and to systems brought by Sarnoff to the ACATS process. He holds more than 25 issued U.S. patents and has presented numerous papers on television-related topics.
PROCEEDINGS OF THE IEEE, VOL. 94, NO. 1, JANUARY 2006
Wayne Bretl (Senior Member, IEEE) received the BSEE from Illinois Institute of Technology in 1966. He joined Zenith Electronics in 1975. He is a Principal Engineer in the R&D Department, Zenith Electronics, Lincolnshire, IL. He holds over 15 patents in television technology and related areas. Mr. Bretl is a member of the Society of Motion Picture and Television Engineers, the Audio Engineering Society, and the Society for Information Display, and represents Zenith in ATSC and a number of professional and industry associations.
Michael S. Deiss received the B.S. degree in electrical engineering from the University of Illinois, Champaign-Urbana, in 1975. In 1975, he began working for RCA at the David Sarnoff Research Center, Princeton, NJ, studying tuning control systems. As a Senior Staff Engineer for Thomson, Indianapolis, IN, he has participated within Grand Alliance, ATSC (T3, T3/S8, T3/S11), and MPEG committees in creation of the US-HDTV and ISO/IEC 13818-1 standards. Additionally, he has participated in creation of the standards used within the DirecTV satellite system and has led the design team that created Thomson’s first high-definition MPEG decoder IC and HDTV products. He currently manages a research group at Thomson investigating display technologies, image processing, and video compression. He currently holds 38 foreign and domestic patents in areas of tuning control systems, cryptographic systems, displays, video compression, and electronic program guides.
Adam Goldberg (Senior Member, IEEE) received the B.S. degree in computer science from Iowa State University, Ames, in 1992. Previously, he held positions at C-Cube Microsystems, DiviCom, and Microware Systems. He is currently the Director, Television Standards and Policy Development at Sharp Laboratories of America, Fairfax, VA. In that role, he is Sharp’s primary representative to television-related standards-making activities in the United States. He is also the current Chair of the CEA R4.3 committee and was the Chair of the CEA working group that developed a recommended practice for PSIP implementation for receivers. He has extensive experience in digital television and particularly MPEG-2 Systems (ISO/IEC 13818-1) and its applications (like PSIP), and has been involved in ATSC, DVB, and SCTE engineering committees. Mr. Goldberg is a member of the Society of Motion Picture and Television Engineers (SMPTE) and the Society of Cable Telecommunications Engineers (SCTE).
Brian Markwalter received the B.S. and M.S.E.E. degrees from the Georgia Institute of Technology, Atlanta. He was Director of Program Management at Intellon Corporation, a fabless semiconductor company specializing in power line communications. In this role, he helped develop and launch the technology adopted by the HomePlug Powerline Alliance for data networking over residential power lines and received a patent for this work. He is currently Vice President of Technology in the Technology and Standards department of the Consumer Electronics
Association (CEA), Arlington, VA. He represents CEA technical interests in interindustry venues related to digital television, spectrum management, cable compatibility, and copy protection. He also supports accredited standards work within CEA’s committee structure for video systems and cable compatibility of television receivers and related video products. In the technology policy arena, he works with CEA member companies to develop the technical underpinnings of CEA’s spectrum and DTV-related FCC filings.
Max Muterspaugh (Senior Member, IEEE) received the B.S.E.E. and M.S.E.E. degrees from Purdue University, West Lafayette, IN, in 1966 and 1967, respectively. He joined RCA and worked in television design, including video IF and tuners. He has participated in ATV Advisory Committee PS/WP-3 on Spectrum Utilization. After the formation of the Grand Alliance, he participated on the Transmission Group and continues to participate in DTV committees. He has worked on the DSS satellite system and other digital television projects. He is currently a Principal Member of the Engineering Staff with Thomson, Indianapolis, IN, working on digital communication systems and DTV. Recently, he was chairman of the CEA committee to formulate standard EIA/CEA-909 “Antenna Control Interface.” He has been granted 29 U.S. patents.
Azzedine Touzni received the M.S. degree in optical telecommunications from the Institut Galilée, Paris, France, in 1994, the M.S. degree in signal processing and image processing from the University of Cergy-Pontoise, Cergy-Pontoise, France, in 1995, and the Ph.D. degree with highest honors from Ecole Nationale Supérieure de l’Electronique et de ses Applications (ENSEA), Cergy-Pontoise, in 1998. He participated in the definition and the analysis of one of the first multigigabyte all-optical network designs at Alcatel Research in France. In 1999, he was affiliated with the Institut National de Recherche en Informatique et Automatique (INRIA), Rocquencourt, France, and was a faculty member at Cornell University, Ithaca, NY, as a Research Associate at the School of Electrical Engineering pursuing postdoctoral research in signal processing techniques. In 2000, he founded AT Consulting, Paris, and was a consultant in digital communication and signal processing applied to communication. After spending nearly two years with Nxtwave Communications, he became part of ATI Research, Yardley, PA, and has worked in several different areas of digital receivers, from RF demodulation to digital rights management for content protection, in particular for HDTV and handheld mobile TV. He has authored several dozen conference and journal publications and has written several book chapters. Dr. Touzni is a member of several international standards organizations, such as Digital Video Broadcast (DVB) and the Advanced Television Systems Committee (ATSC), and was involved in the definition of the Enhanced VSB modulation (E-VSB) specifications adopted by the ATSC organization.
He is a reviewer for several international signal processing and communications conferences and for IEEE TRANSACTIONS ON SIGNAL PROCESSING, IEEE SIGNAL PROCESSING LETTERS, IEEE TRANSACTIONS ON INFORMATION THEORY, IEEE TRANSACTIONS ON COMMUNICATIONS, IEEE TRANSACTIONS ON BROADCASTING, and IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION.