An Algorithm for Protecting Medical Images that are Transmitted over Channels with Packet Loss
Alexander E. Mohr
A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Engineering University of Washington 1999
Program Authorized to Offer Degree: Bioengineering
University of Washington Graduate School
This is to certify that I have examined this copy of a master's thesis by Alexander E. Mohr and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made.
Committee Members: Eve A. Riskin Richard E. Ladner Yongmin Kim
Date:
In presenting this thesis in partial fulfillment of the requirements for a Master's degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that extensive copying of this thesis is allowable only for scholarly purposes, consistent with "fair use" as prescribed in the U.S. Copyright Law. Any other reproduction for any purpose or by any means shall not be allowed without my written permission. Signature Date
University of Washington

Abstract
An Algorithm for Protecting Medical Images that are Transmitted over Channels with Packet Loss

by Alexander E. Mohr

Chair of Supervisory Committee: Professor Eve A. Riskin, Department of Bioengineering

This thesis presents a framework in which images can be transmitted over channels with high packet loss rates with the addition of unequal amounts of forward error correction (FEC). It develops an algorithm that can optimize the allocation of FEC to the image data so as to maximize the expected image quality. By using both the importance of the output of the image coder and an estimate of the probability of losing packets, it considers distortion-rate tradeoffs in its assignments to provide graceful degradation when packets are lost. The algorithm is also modular in that it can use any compression scheme that produces a progressive bitstream.
TABLE OF CONTENTS

List of Figures

Chapter 1: Introduction

Chapter 2: Background
  2.1 Set Partitioning in Hierarchical Trees
  2.2 Joint Source/Channel Coding Using SPIHT
  2.3 Prior Work on Transmitting SPIHT Over Lossy Channels
  2.4 Forward Error Correction for Packet Erasure Channels

Chapter 3: An Algorithm for Assigning Forward Error Correction
  3.1 Notation
  3.2 Channel Loss Model
  3.3 The Algorithm

Chapter 4: Results
  4.1 Lena Test Image
  4.2 Ultrasound Test Image
  4.3 MRI Test Image

Chapter 5: Conclusions and Suggestions for Future Work

Bibliography
LIST OF FIGURES

2.1 In Leicher's application of PET to MPEG [11], he applied 60% priority to message fragment M1 (the I frames), 80% priority to M2 (the P frames), and 95% priority to M3 (the B frames). Each message and its associated FEC use the same range of bytes in every packet.

3.1 Each of the rows is a stream and each of the columns is a packet. A stream contains one byte from each packet. Here, a message of 32 bytes of data (numbers 1-32) and ten bytes of FEC (F) is divided into seven streams and sent in six packets.

3.2 Demonstration of how much data can be recovered when one of six packets is lost. Here, stream one is unaffected by the loss, streams two through five use FEC to recover from the loss, and in stream seven, only the bytes up to the lost packet are useful to the decoder.

3.3 At each iteration of the optimization algorithm, Q bytes of data can be added to or subtracted from any of the L streams.

4.1 The 512×512 Lena test image. (a) The original. (b) Compressed at 0.2 bpp with SPIHT.

4.2 Effect of packet loss on PSNR for unequal erasure protection, equal erasure protection, Rogers and Cosman's PZW [16], and unprotected SPIHT. The two erasure protection results are from an exponential packet loss model with a mean loss rate of 20%.

4.3 Image quality of "Lena" at 0.2 bpp total rate for unequal erasure protection on a channel with an exponential loss model that averages 20%. Loss rates from left to right are: 30%, 40%, 50%, and 60%.

4.4 Data fraction for each stream. Note that the FEC fraction is (1 - Data Fraction). Stream 1 is the first stream (most important data) and stream 47 is the last stream (least important data).

4.5 The 439×374 ultrasound of a jugular vein thrombosis. (a) The original. (b) Compressed at 0.79 bpp with SPIHT.

4.6 Comparison of PSNR vs. fraction of packets lost for unequal erasure protection, equal erasure protection, and no loss protection for the jugular vein ultrasound image. The channel loss model is an exponential with a mean of 20%.

4.7 Ultrasound image quality at 0.79 bpp total rate for unequal erasure protection on a channel with an exponential loss model that averages 20%. Loss rates from left to right: 12.5%, 25%, 37.5%, and 50%.

4.8 The 256×256 MRI test image. (a) The original. (b) Compressed at 1.0 bpp with SPIHT.

4.9 Comparison of PSNR of MRI test image vs. fraction of packets lost for unequal erasure protection, equal erasure protection, and no loss protection. The channel loss model is an exponential with a mean of 10%.

4.10 MRI test image quality at 1.0 bpp total rate for UEP on a channel with an exponential loss model that averages 10%. Loss rates from left to right: 10%, 20%, 30%, and 40%.
ACKNOWLEDGMENTS

All manners of profuse thanks go to my advisors, Professors Eve A. Riskin and Richard E. Ladner, for their help and guidance throughout my studies. They have been kind, thoughtful, and supportive as I've searched for an area of study. I also wish to thank the other member of my thesis reading committee, Professor Yongmin Kim, both for his understanding and for graciously allowing me to defend during his hectic first week of serving as Chair of the department. I must also thank my parents, Iris and Ray Mohr, for teaching me to look at the world with curiosity; my brother, Steven Mohr, for his wonderful perspective; Professor Pedro Verdugo, for introducing me to bioengineering; Jill Goldschneider, for paper tips and LaTeX tricks; the current and former members of the Data Compression Lab, M. Y. Jaisimha, B. S. Srinivas, Holly Johnson, Agnieszka Miguel, and Jill, for their insight and inspiration; Professor William Pearlman, for the use of SPIHT source code; Phil Chou, Albert Wang, and Sanjeev Mehrotra, for allowing me to spend a summer with them at Microsoft; the many people who read drafts of my papers and sat through practice talks; and lastly, the uncountable researchers upon whose work this thesis builds.
Chapter 1
INTRODUCTION

When an ambulance arrives at the scene of a medical emergency, the paramedic team is responsible for assessing the situation, stabilizing the victims, and then transporting them to a medical facility for treatment. The ambulance will usually radio ahead to the medical facility and inform them of the patient's condition before arrival. This information includes a description of the accident site; the patient's vital statistics, such as heart rate, pulmonary rate, and blood pressure; and the paramedics' assessment of injuries through visual inspection and tactile probing. Recently, researchers have been trying to give the medical facility more information about the patient before he arrives. Giovas et al. [9] equipped an ambulance with a notebook computer and connected it to an electrocardiogram (ECG) recorder. By transmitting the ECG signal over a cellular telephone, they estimate that acute myocardial infarction could be diagnosed 25 minutes earlier. That extra time could afford the patient a better chance of survival. It would be helpful if a similar approach could be used with a portable ultrasound device to send images to the destination medical facility. With a near real-time feed from the ultrasound device, that facility could monitor the patient as he is transported and evaluate the patient's internal injuries to speed triage, taking the patient directly to the medical unit most suited for dealing with a particular type of injury.
Many elements of such a system are currently in development. Battelle's Pacific Northwest National Laboratory has created a prototype portable ultrasound device, the 85-pound MUSTPAC-1. It consists of a Silicon Graphics Indy computer with a flat panel display, a modified Hitachi two-dimensional ultrasound scanner, and specialized software [3]. Though originally developed for battlefield use, the MUSTPAC-1 could also be installed in an ambulance. Another important element of such a system is the wireless communication channel between the ambulance and the medical facility. Although a cellular telephone is adequate for transmitting ECG traces, a faster channel is desirable for images. For example, Metricom, Inc. is deploying its Ricochet2 wireless access technology to the general public in large metropolitan areas over the next few months. Ricochet2 provides a 128 kbps connection from a mobile unit to a corporate LAN using a standard Internet Protocol packet interface. Unfortunately, wireless communications are often plagued by bursty errors, which cause packets to be discarded. Those discarded packets cause loss of data and most likely decoding failure if the lost data are not replaced. When each packet is assigned a unique sequence number, it is known which packets are received and which have been discarded, so an attempt can be made to recover from the error burst. The sequence number allows the receiver to sort the packets according to their transmission order: any gaps in the sequence are known to be lost packets, and out-of-order packets can be positioned correctly. The receiver takes whatever action it deems best to decode the data.

Because we would like the medical facility to receive near real-time updates of the ultrasound images, the automatic retransmission request mechanism of TCP/IP cannot be used. With TCP/IP, not only are packet losses treated as a sign of congestion, which is not an issue in this scenario, but the receiver requests that the sender try transmitting a particular packet again. We instead need some way of adding redundancy to the transmitted images. That redundancy can then be used by the receiver to display a possibly degraded image without having to wait for any retransmissions. Of course, we want that image to be degraded as little as possible, even when a large fraction of transmitted packets is lost. This thesis focuses on the issue of adding controlled redundancy to a progressive image coder so that the overall expected image quality can be maximized when an estimate of packet loss probabilities exists. The integration of this algorithm into a complete system that includes a portable ultrasound device and a wireless communication device is left as potential future work. Chapter 2 covers background material on a wavelet-based image compression algorithm that uses Set Partitioning In Hierarchical Trees (SPIHT) to encode significance information about transform coefficients. Chapter 2 also discusses previous work on protecting the SPIHT coder from packet losses. Chapter 3 introduces a framework that relies on Forward Error Correction (FEC) to recover from lost packets, and then uses that framework to develop an algorithm that can efficiently determine a near-optimal assignment of FEC to the image data. Chapter 4 presents the results of this algorithm on test images, while Chapter 5 concludes and mentions possible future applications.
Chapter 2
BACKGROUND

Because packets in a wireless channel are discarded at random, there is no way to specify an intrinsic priority level for a particular packet. However, the transmitted data vary in importance: some bytes of an image coder's output will be extremely important to the received image's quality, while other bytes will have much less effect on image quality. An efficient coding strategy needs to quantify the importance of different chunks of data and protect important data more than it protects less-important data. A compression algorithm that provides a progressive ordering of the data will determine which data are most important to the signal quality. In this chapter, a progressive image compression algorithm is described, and previous work on protecting it from bit errors and channel losses is presented.
2.1 Set Partitioning in Hierarchical Trees

An example of a progressive image compression algorithm is the SPIHT coder [17]. This algorithm is an extension of Shapiro's Embedded Zerotree Wavelet algorithm [20]. These two algorithms are a significant breakthrough in lossy image compression in that they give substantially higher compression ratios than prior lossy compression techniques, including JPEG [13], vector quantization [8], and the discrete wavelet transform [2] combined with quantization. In addition, the algorithms allow for progressive transmission [24], which means that coarse approximations of an image can be reconstructed quickly from the beginning parts of the bitstream. They also
require no training and are of low computational complexity. In addition, the effects of wavelet-based compression on the diagnostic quality of images have been well studied in recently published literature [14, 6, 18, 27]. Given the apparent acceptance of wavelet-based compression by the medical community, the SPIHT coder should give excellent compression results while providing medical professionals with documented quality measures.
2.2 Joint Source/Channel Coding Using SPIHT

Joint source/channel coding is an area that has attracted a significant amount of research effort. Although Shannon's separation theorem [19] states that for a noisy channel, the source and channel coders can be independently designed and cascaded with the same results as given by a joint source/channel coder, complexity considerations have led numerous researchers to develop joint source/channel coding techniques [7, 25]. To date, a majority of this effort has been for fixed-rate codes because they do not suffer from the synchronization problems that occur with variable-rate codes. (Notable exceptions that have considered joint source/channel coding schemes for variable-rate codes include work on reversible variable-length codes that can be decoded in both directions [26]. However, these codes can still have problems with synchronization.) SPIHT has the advantage of yielding impressive compression ratios for still images, but images compressed with SPIHT are vulnerable to data loss. Furthermore, because SPIHT produces an embedded or progressive bitstream, meaning that the later bits in the bitstream refine earlier bits, the earlier bits are needed for the later bits to even be useful. However, SPIHT's excellent compression performance is leading
researchers to consider the problem of transmitting images compressed with SPIHT over lossy channels and networks.
2.3 Prior Work on Transmitting SPIHT Over Lossy Channels

Sherwood and Zeger [21] protected images compressed with SPIHT against noise from the memoryless binary symmetric channel with RCPC codes [10], with good results. They extended this work to images transmitted over the Gilbert-Elliott channel (a fading channel) in [22]. In the latter case, they implement a product code of RCPC and Reed-Solomon codes and find that it outperforms the work in [21] even for the binary symmetric channel. Although their work was done for channels prone to bit errors, some of the concepts they present will be useful for a channel with packet loss. Rogers and Cosman [16] used a fixed-length packetization scheme called packetized zerotree wavelet (PZW) compression to transmit images compressed with a modified SPIHT over lossy packet networks. The algorithm does not use any channel coding, and hence can be considered joint source/channel coding. They implemented a scheme to fit as many complete wavelet trees (i.e., one coefficient from the lowest frequency wavelet subband along with all its descendants) as possible into a packet. One drawback to this packetization procedure is that trees may not fit perfectly into a packet, which reduces the efficiency of the compression. On the other hand, the packetization ensures that catastrophic derailment will not occur, because erasures do not propagate between the packets; instead, this algorithm degrades gracefully in the presence of packet loss because the packets are independent. If a packet is lost, they attempt to reconstruct the lowest frequency coefficients from the missing trees of wavelet coefficients by interpolating from neighboring low
frequency coefficients that have been correctly received by the decoder. To simplify their algorithm, they used fewer levels of wavelet decomposition and removed the arithmetic coder from the SPIHT algorithm. The modification of the SPIHT algorithm caused a decrease of about 1.1 dB in the PSNR at 0.209 bits per pixel for the case of a channel without losses. Note that their algorithm relies on intrinsic redundancy within the data, and thus it cannot handle packet losses in excess of a few percent. These two schemes were combined into a hybrid scheme in [4]. The authors consider the case where, in addition to packet loss, packets can arrive with bit errors in them. They use channel coding to correct bit errors and PZW to conceal packet losses. If they cannot correct all of the bit errors in a packet, they consider the packet to be erased. The hybrid scheme shows resilience to packet loss, bit errors, and error bursts. It is still based on the modified SPIHT algorithm used in [16], which does not perform as well as the original SPIHT algorithm.
2.4 Forward Error Correction for Packet Erasure Channels

Priority Encoding Transmission (PET) [1] is an algorithm that assigns FEC, according to priorities specified by the user, to message fragments (also specified by the user) sent over lossy packet networks. Each of these fragments is protected from packet erasures by added FEC. Priorities can range from high, which means that the message fragment can be recovered if relatively few packets are received by the decoder, to low, meaning that most or all packets need to be received to recover the message fragment. This recovery is possible by treating the message fragment as the coefficients of a polynomial in a Galois field and evaluating it at a number of additional points, thus creating redundant data [15]. The receiver can recover the message fragment
by interpolation from any subset of the transmitted packets, so long as it receives a fraction of packets at least as large as the priority of the message fragment. In the PET algorithm, each message fragment is assigned a fixed position within each packet. In Figure 2.1, the first fragment M1 and its FEC F1 consist of the first L1 bytes of each packet, the second fragment M2 and its FEC F2 consist of bytes (L1 + 1) to L2 of each packet, and M3 and F3 consist of the remaining bytes of each packet. PET determines the value of Li for each fragment and the total number of packets N, making the assumption that the number of fragments is much smaller than the number of bytes in each packet, and constrained by the user-specified priorities. The PET algorithm does not specify how to choose the priorities to assign to the various message fragments: this assignment is left to the user. It defines priorities as the fraction of transmitted packets that must be received to decode the message; thus a high priority is represented by a low percentage. Leicher [11] applied PET to video compressed with MPEG and transmitted over packet loss channels. He used a simple three-class system in which M1 was the intraframe (I) frames and had priority 60%, M2 was the forward-only predicted (P) frames and had priority 80%, and M3 was the forward-backward predicted (B) frames and had priority 95%. Thus, he can recover the I frames from 60% of the packets, the I and P frames from 80% of the packets, and all the data from 95% of the packets. This is diagrammed in Figure 2.1. In related work, Davis, Danskin, and Song [5] presented fast lossy Internet image transmission (FLIIT), a joint source/channel coding algorithm that, like PET, assigns different levels of FEC to different types of data, but considers distortion-rate tradeoffs in its assignments.
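As a concrete illustration (not part of the thesis), the byte budget implied by a PET-style priority assignment can be sketched as follows; the function name and the percent-based interface are hypothetical:

```python
def pet_allocation(priorities_pct, n_packets):
    """For each message fragment, split its per-stream byte budget into
    (data, FEC) bytes, given the fraction of packets (in percent) that
    must arrive for the fragment to be recovered."""
    layout = []
    for pct in priorities_pct:
        data = pct * n_packets // 100            # data bytes
        layout.append((data, n_packets - data))  # remainder is FEC
    return layout

# Leicher's three classes over a hypothetical N = 100 packets:
# I frames at 60%, P frames at 80%, B frames at 95%.
print(pet_allocation([60, 80, 95], 100))  # [(60, 40), (80, 20), (95, 5)]
```

A fragment protected at 60% priority thus devotes 40% of its bytes to FEC and survives the loss of any 40% of the packets.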
They begin with a 5-level discrete wavelet transform, create an embedded bitstream by quantizing each subband's coefficients in bit planes, apply entropy coding, and pack the bitstream from each subband into 64-byte blocks.
Figure 2.1: In Leicher's application of PET to MPEG [11], he applied 60% priority to message fragment M1 (the I frames), 80% priority to M2 (the P frames), and 95% priority to M3 (the B frames). Each message and its associated FEC use the same range of bytes in every packet.

To do bit allocation, they determine the reduction in distortion due to each block, similar to work in [23]. They then compare the greatest decrease in distortion from those blocks with the addition of a block of FEC data to the already-allocated blocks, and allocate the block of data or block of FEC that decreases the expected distortion the most. They only consider three simple cases of assigning FEC to a block: no protection, protection that consists of one FEC block shared among a group of blocks, and replication of the block. They find that, as expected, it is advantageous to apply more FEC to
the coarse/low frequency wavelet scales and to the most significant bit planes of the quantization. The FLIIT algorithm is one of the first pieces of work to explicitly consider distortion-rate tradeoffs in making FEC assignments for lossy packet networks. However, it is limited by the coarse assignment of only three cases and by its reliance on the compression algorithm they selected (for example, SPIHT can yield a PSNR that is over 1 dB higher than their algorithm). In the next chapter, we discuss our algorithm for adding FEC to images compressed with SPIHT that are transmitted over lossy packet networks. It considers distortion-rate tradeoffs in its assignments and uses a powerful error correction code that is derived from Reed-Solomon codes.
Chapter 3
AN ALGORITHM FOR ASSIGNING FORWARD ERROR CORRECTION

While the algorithms in [22, 16, 4, 5] yield good results for memoryless and fading channels and for lossy packet networks, there are additional ways to transmit compressed images over a channel that discards packets such that image quality gracefully degrades with increasing packet loss, even at high loss rates. Specifically, we will protect images with unequal amounts of FEC in a manner similar to the PET scheme, but we will consider the effect on image quality of each data byte when assigning protection. This chapter introduces a framework that relies on FEC to recover from lost packets, and then uses that framework to develop an algorithm that can efficiently determine a near-optimal assignment of FEC to the image data. In our approach to assigning unequal amounts of FEC to progressive data, we remove PET's restriction that the number of message fragments be much less than the number of bytes in each packet. Instead, we use a number of message fragments equal to the number of available bytes in each packet and have our algorithm dynamically choose the length and content of each message fragment. We add FEC to each message fragment to protect against packet loss such that the fragment and the FEC form a stream. The message is divided into L streams such that each stream has one byte in each of N packets. In Figure 3.1, each of the L = 7 rows is a stream and each of the N = 6 columns
is a packet. For a given stream i, i = 1, 2, ..., L, containing both data bytes and FEC bytes, as long as the number of lost packets is less than or equal to the number of FEC bytes, the entire stream can be decoded [1]. Figure 3.1 shows one possible way to send a message of 32 bytes of data (numbers 1-32) and ten bytes of FEC (F). Notice that in the figure, more bytes of FEC are applied to the earlier parts of the message and fewer are used for the later parts of the message. For SPIHT, which produces an embedded bitstream, the earlier parts of the message should have the highest priority because they are most important to the overall quality of the reproduction.

Figure 3.1: Each of the rows is a stream and each of the columns is a packet. A stream contains one byte from each packet. Here, a message of 32 bytes of data (numbers 1-32) and ten bytes of FEC (F) is divided into seven streams and sent in six packets.
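The erasure-recovery property used here (a stream with f_i FEC bytes survives the loss of any f_i packets) comes from Reed-Solomon-style codes. A toy sketch over the prime field GF(257) illustrates the idea; real implementations work in GF(256), and all names here are illustrative:

```python
# Toy Reed-Solomon erasure code over the prime field GF(257).
P = 257

def eval_lagrange(pts, x):
    # Evaluate at x the unique degree < k polynomial through the k points.
    total = 0
    for i, (xi, yi) in enumerate(pts):
        num, den = 1, 1
        for j, (xj, _) in enumerate(pts):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # den^-1 mod P
    return total

def encode(data, n):
    # Systematic encoding: share x carries the polynomial's value at x,
    # where the polynomial interpolates the data at x = 0..k-1.
    k = len(data)
    pts = list(enumerate(data))
    return [data[x] if x < k else eval_lagrange(pts, x) for x in range(n)]

def decode(shares, k):
    # shares: dict x -> value with at least k surviving entries.
    pts = list(shares.items())[:k]
    return [eval_lagrange(pts, x) for x in range(k)]

data = [42, 7, 199, 13]                          # k = 4 data bytes
shares = encode(data, 6)                         # n = 6 packets, 2 FEC
received = {x: shares[x] for x in (0, 2, 4, 5)}  # any 4 of 6 suffice
assert decode(received, 4) == data
```

Any k of the n shares determine the degree < k polynomial uniquely, so up to n - k erasures are tolerated, which is exactly the per-stream guarantee above.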
Figure 3.2: Demonstration of how much data can be recovered when one of six packets is lost. Here, stream one is unaffected by the loss, streams two through five use FEC to recover from the loss, and in stream seven, only the bytes up to the lost packet are useful to the decoder.

Figure 3.2 shows the case where one packet out of six is lost and five are received correctly. In this case, the first six streams can be recovered, since each contains five or fewer data bytes. The last stream cannot be decoded, since it contains six bytes of data and no FEC. We point out that bytes 27-29 from the seventh stream are useful, since they were received correctly, but for an embedded bitstream, bytes 31 and 32 are not useful without byte 30. Similarly, if two packets are lost, bytes 1-11 are guaranteed to be recovered and bytes 12-15 may or may not be recovered. In messages of practical length, however, those few extra bytes have only a small effect on image quality. Analogous to progressive transmission [24], even if severe packet loss occurred, we could recover a lower fidelity version of the image from the earlier streams that are decoded correctly. Each additional stream that is successfully decoded improves the quality of the received message, as long as all previous streams are correctly decoded.
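The guaranteed-recoverable prefix in these examples follows directly from the FEC vector. A small sketch; the allocation f = (3, 2, 2, 1, 1, 1, 0) is read off Figure 3.1 and is otherwise an assumption:

```python
def recoverable_bytes(f, n_packets, n_lost):
    # f: non-increasing per-stream FEC allocation; stream i holds
    # m_i = n_packets - f[i] data bytes. A stream decodes iff the number
    # of lost packets is at most its FEC budget, and an embedded
    # bitstream is only useful up to the first undecodable stream.
    total = 0
    for fi in f:
        if n_lost > fi:
            break
        total += n_packets - fi
    return total

f = (3, 2, 2, 1, 1, 1, 0)          # Figure 3.1's allocation (inferred)
print(recoverable_bytes(f, 6, 1))  # 26: streams 1-6 decode
print(recoverable_bytes(f, 6, 2))  # 11: streams 1-3 decode (bytes 1-11)
```

With two losses this reproduces the guaranteed bytes 1-11 from the text; with one loss, the first six streams (26 bytes) decode.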
3.1 Notation

Assume we have a message M, which is simply a sequence of data bytes to be transmitted. For example, this could be a still image compressed with SPIHT to 0.5 bits per pixel. If, instead of sending M, we send a prefix of M and some FEC, we can still maintain the same overall bit rate. We let m_i equal the number of data bytes assigned to stream i and let f_i = N - m_i equal the number of FEC bytes assigned to stream i. We define an L-dimensional FEC vector whose entries are the amount of FEC assigned to each stream:
f = (f_1, f_2, ..., f_L).

For a given f, we divide M into fragments M_i(f) and define M_i(f) to be the sequence of data bytes in the i-th stream. That is, M_i(f) includes the bytes of message M from position \sum_{j=1}^{i-1} m_j + 1 to position \sum_{j=1}^{i} m_j for i = 2, 3, ..., L, with M_1(f) composed of the m_1 bytes of stream 1. We denote a prefix of M containing the first j fragments for redundancy vector f as:
M(j, f) = M_1(f) M_2(f) ... M_j(f).

We now introduce the incremental PSNR of stream i:
g_i(f) = PSNR[M(i, f)] - PSNR[M(i-1, f)].

The quantity g_i(f) is the amount by which the PSNR increases when the receiver decodes fragment i, given that all fragments prior to i have already been decoded. We set g_1(f) to be the difference in PSNR between the case in which M_1(f) is received and the case in which no information is received.
Because the data are progressive, we require that f_i >= f_{i+1} for i = 1, 2, ..., L-1; that is, the FEC assigned to the streams is non-increasing with i. With this requirement, if M_i(f) can be decoded, then M_1(f), M_2(f), ..., M_{i-1}(f) can also be decoded. The packet loss model we use is given by a PMF p_n, n = 0, 1, ..., N, such that p_n is the probability that n packets are lost. To simplify calculations, we determine the probability that k or fewer packets are lost; thus the CDF is c(k) = \sum_{n=0}^{k} p_n for k = 0, 1, ..., N. The quantity c(f_i) is the probability that the receiver can decode stream i. We can now calculate the expected PSNR of the received message as a function of f by summing over the L streams:
G(f) = \sum_{i=1}^{L} c(f_i) g_i(f).

In designing our algorithm to assign FEC, we seek the f that maximizes G(f) subject to our packet loss model p_n.
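The objective G(f) is cheap to evaluate. A minimal sketch, assuming a caller-supplied psnr_of_prefix function (hypothetical) that maps a decoded prefix length in bytes to PSNR, with psnr_of_prefix(0) as the no-information baseline:

```python
def expected_psnr(f, n_packets, pmf, psnr_of_prefix):
    # Build the CDF c(k) = p_0 + ... + p_k from the loss PMF.
    cdf, run = [], 0.0
    for p in pmf:
        run += p
        cdf.append(run)
    # G(f) = sum_i c(f_i) * g_i(f): each stream contributes its
    # incremental PSNR, weighted by its probability of decoding.
    total, prefix, prev = 0.0, 0, psnr_of_prefix(0)
    for fi in f:
        prefix += n_packets - fi      # m_i data bytes extend the prefix
        cur = psnr_of_prefix(prefix)
        total += cdf[fi] * (cur - prev)
        prev = cur
    return total
```

On a lossless channel (p_0 = 1) every weight c(f_i) is 1 and the sum telescopes to the PSNR of the full decoded prefix, a useful sanity check.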
3.2 Channel Loss Model

To determine the FEC vector f, we need an estimate of the channel loss model p_n that a message is likely to encounter. In keeping with a modular design philosophy, our approach assumes that channel loss behavior can be modeled by an estimator that outputs a PMF indicating the likelihood that a particular number of packets is lost. That estimate will be affected by a number of factors. In the case of the mobile ultrasound machine, the estimator could account for the type of surrounding terrain, any inclement weather, and interference from other transmitters, or it could use only the previous history of losses. Indeed, one might imagine a simple estimator that uses a sliding window of recent packet losses to estimate future channel conditions. However, as channel estimation is an active field of research, the specifics of how the estimator derives its estimate of current channel conditions are not explicitly addressed by this thesis.
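As one concrete, entirely hypothetical instance of such an estimator (the thesis deliberately leaves this open), a sliding window of recent outcomes can be turned into a PMF by assuming independent losses at the observed rate:

```python
from collections import deque
from math import comb

class WindowLossEstimator:
    # Sketch only: records recent packet outcomes and emits a PMF.
    def __init__(self, window=200):
        self.history = deque(maxlen=window)   # 1 = lost, 0 = received

    def record(self, lost):
        self.history.append(1 if lost else 0)

    def pmf(self, n_packets):
        # Binomial model: P(n of N packets lost) at the windowed rate.
        r = sum(self.history) / len(self.history) if self.history else 0.0
        return [comb(n_packets, n) * r**n * (1 - r)**(n_packets - n)
                for n in range(n_packets + 1)]
```

Calling record() per packet and pmf(N) per message block would feed the optimizer directly; a bursty channel would instead call for a Gilbert-Elliott-style model.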
3.3 The Algorithm

Finding the globally optimal assignment of FEC data to each stream is computationally prohibitive for a useful amount of data, nor can a dynamic programming approach be used, because the solution to one subproblem affects the solution of another. We therefore developed a hill-climbing algorithm that makes limited assumptions about the data but is computationally tractable. As mentioned in Section 3.1, we constrain f_i >= f_{i+1}. This constraint is a direct consequence of using progressive compression: the importance of each byte of data is generally decreasing, with more important data appearing earlier in the sequence and less important data later. Additionally, we assume that a single byte missing from the progressive bitstream causes all later bytes to become useless. That is, the image coder makes extensive use of context when compressing the images, both in its zerotree coding and in the arithmetic entropy coder. Even if the bytes of data that follow the missing byte are received correctly by the decoder, without context, any improvement in image quality is negligible. Finally, the PMF should be reasonably well-behaved: the more it deviates from a unimodal function, the larger the search distance must be to capture the irregularities. We initialize each stream to contain only data bytes, such that m_i = N and
Figure 3.3: At each iteration of the optimization algorithm, Q bytes of data can be added to or subtracted from any of the L streams.
fi = 0; i = 1, 2, ..., L. In each iteration, our algorithm examines a number of possible assignments equal to 2QL, where Q is the search distance (the maximum number of FEC bytes that can be added to or subtracted from a stream in one iteration) and L is the number of streams. We determine G(f) after adding or subtracting between 1 and Q bytes of FEC data to each stream (see Figure 3.3), while satisfying our constraint fi ≥ fi+1. We choose the f corresponding to the highest G(f), update the allocation of FEC data to all affected streams, and repeat the search until none of the cases examined improves the expected PSNR. This algorithm will find a local maximum that we believe is quite close to the global maximum and, in some cases, may be identical to it. Note that for every byte of FEC data that we add to a stream, one byte of data needs to be removed. When changing the FEC assignment, we start at the first stream affected by the new allocation, move its last data byte to the next stream,
move the last data byte of that stream to the next stream, and so on. This causes a cascade of data bytes to move down the streams until the last data byte from stream L is discarded. This part of the algorithm makes use of our assumption that the compressed sequence is progressive, because the data byte that we discard is among the least important in the embedded bitstream. In the next chapter, we present the results of applying our algorithm to real images and compare those results to ones previously published.
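The iteration just described can be sketched as below. This is a simplified reading rather than the thesis implementation: `psnr_at` is a hypothetical stand-in for the coder's operational distortion-rate curve, a stream is treated as recoverable exactly when the number of lost packets does not exceed its FEC allocation, and the sketch accepts the first improving move rather than ranking all 2QL candidates per iteration:

```python
import math

def expected_psnr(f, p, N, psnr_at):
    """G(f): expected PSNR over the loss PMF p, where p[n] is the
    probability that n packets are lost.  A stream with f[i] bytes of
    FEC is recovered whenever at most f[i] packets are lost; because f
    is non-increasing, the recovered streams form a prefix of the
    embedded bitstream, and the decoder uses that prefix's data bytes."""
    total = 0.0
    for n, pn in enumerate(p):
        data = sum(N - fi for fi in f if fi >= n)
        total += pn * psnr_at(data)
    return total

def hill_climb(L, N, p, psnr_at, Q=1):
    """Search the 2QL neighborhood: add or subtract 1..Q bytes of FEC
    in a single stream, rejecting moves that break f[i] >= f[i+1],
    and repeat until no move improves the expected PSNR."""
    f = [0] * L                              # start with all data, no FEC
    best = expected_psnr(f, p, N, psnr_at)
    improved = True
    while improved:
        improved = False
        for i in range(L):
            for q in range(-Q, Q + 1):
                if q == 0:
                    continue
                g = list(f)
                g[i] += q
                if not 0 <= g[i] <= N:
                    continue                 # FEC must fit in the stream
                if i > 0 and g[i] > g[i - 1]:
                    continue                 # would violate monotonicity
                if i < L - 1 and g[i] < g[i + 1]:
                    continue
                score = expected_psnr(g, p, N, psnr_at)
                if score > best:
                    best, f, improved = score, g, True
    return f, best

# Toy example: either 0 or 2 of 10 packets are lost, with a concave
# stand-in quality curve; the search settles on 2 bytes of FEC per stream.
f, g = hill_climb(L=3, N=10, p=[0.5, 0.0, 0.5], psnr_at=math.log1p, Q=2)
```

On this toy instance the search converges to f = [2, 2, 2]: with either loss count, every stream is then recoverable, which a concave quality curve prefers over gambling on the no-loss case.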
Chapter 4
RESULTS
In this chapter, the algorithm developed in the previous chapter is applied to a variety of test images. Because the work in this thesis draws upon previous publications in the areas of image compression, joint source/channel coding, and networking, the first test image is chosen to be the standard within those fields, the "Lena" image. After placing the algorithm within that context, results are given for an ultrasound image of a jugular vein thrombosis. Finally, to show that the algorithm also works well on other kinds of medical images, results for a sagittal brain MRI are presented.
4.1 Lena Test Image
For these experiments, we used the standard 512 × 512 gray-scale Lena image compressed with SPIHT. The original Lena image is in Figure 4.1(a) and its SPIHT-compressed version at 0.2 bpp is in Figure 4.1(b). In our experiments, we chose a total bit rate of 0.2 bits per pixel for the combination of data and FEC bytes and used an exponential packet loss model with a mean loss rate of 20%. Because ATM packets have a payload length of 48 bytes and one byte is required for a sequence number, we place 47 bytes of data in each packet and send 137 packets, giving a total payload size of 6576 bytes, of which 6439 are data. Including the sequence number, the bit rate is 0.201 bits per pixel. Excluding it, the bit rate is 0.197 bits per pixel. Convergence of the algorithm is typically reached in about 27 iterations and requires 0.05 seconds on an Intel Pentium II 300 MHz workstation. Our algorithm maximizes the expected PSNR for two cases: unequal erasure protection (UEP), in which the algorithm could freely allocate FEC to the compressed data; and equal erasure protection (EEP), in which the algorithm was constrained to assign FEC equally among all of the streams.
Figure 4.1: The 512 × 512 Lena test image. (a) The original. (b) Compressed at 0.2 bpp with SPIHT.
For UEP assignment, the result was an allocation with an expected PSNR of 29.42 dB. For EEP assignment, the result was an allocation with an expected PSNR of 28.94 dB, or 0.48 dB lower than the UEP result. As shown in Figure 4.2, under good channel conditions (erasure rates of up to 32%, which occur 80% of the time), UEP yields a PSNR that is 0.66 dB higher than EEP. This is because more bytes are used for data and fewer for FEC. EEP does surpass UEP when loss rates are 33% to 51%, but those rates occur with only 11% probability. In addition, UEP degrades gracefully, whereas EEP has a sharp transition at loss rates near 51%. As expected, both of these cases substantially outperform not using any
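The packet accounting and the quoted channel probabilities can be reproduced numerically. The sketch below assumes the "exponential packet loss model" is a truncated, discretized exponential PMF over loss counts with the stated 20% mean; that exact form is our reading, not something this chapter specifies:

```python
import math

# --- Packetization arithmetic for the Lena experiment ---
packets, payload = 137, 48            # ATM payload bytes per packet
data_per_packet = payload - 1         # one byte reserved for a sequence number
total_payload = packets * payload             # 6576 bytes
data_bytes = packets * data_per_packet        # 6439 bytes of image data
pixels = 512 * 512
bpp_incl = 8 * total_payload / pixels         # ~0.201 bpp with sequence numbers
bpp_excl = 8 * data_bytes / pixels            # ~0.197 bpp without them

# --- Assumed exponential loss model with a 20% mean loss rate ---
mean_lost = 0.20 * packets                    # 27.4 packets lost on average
p = [math.exp(-n / mean_lost) for n in range(packets + 1)]
z = sum(p)
p = [x / z for x in p]                        # normalized PMF over 0..137 losses

# Probability of "good" channel conditions (at most 32% of packets lost)
p_good = sum(p[n] for n in range(packets + 1) if n / packets <= 0.32)
# Probability that 33% to 51% of packets are lost
p_mid = sum(p[n] for n in range(packets + 1) if 0.33 <= n / packets <= 0.51)
```

Under this model, p_good comes out near the 80% quoted above and p_mid near 11%, consistent with the regions where UEP and EEP each dominate in Figure 4.2.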
Figure 4.2: Effect of packet loss on PSNR for unequal erasure protection, equal erasure protection, Rogers and Cosman's PZW [16], and unprotected SPIHT. The two erasure protection results are from an exponential packet loss model with a mean loss rate of 20%.
protection on the data, except when the loss rate is very low. At those low loss rates, e.g. below 1-2%, unprotected SPIHT will often survive with a significant prefix of the transmitted data remaining intact, and the more robust PZW coder [16] will perform slightly better. On the other hand, performance for those two techniques degrades rapidly as losses increase, while the addition of FEC allows protected data to survive at larger loss rates. We also note that the protected data is affected only by the number of lost packets, but the reconstruction quality of unprotected SPIHT, and to a lesser extent PZW, depends upon which packets are lost.
Figure 4.3: Image quality of "Lena" at 0.2 bpp total rate for unequal erasure protection on a channel with an exponential loss model that averages 20%. Loss rates from left to right are: 30%, 40%, 50%, and 60%.
We display results of using UEP in Figure 4.3. It shows the graceful degradation of the image transmitted over a lossy packet network with loss rates of 30%, 40%, 50%, and 60%. Notice that the image quality remains high at a loss rate of 40%, that the image is recognizable at a loss rate of 50%, and that gross features of the image can still be seen at a loss rate of 60%. Finally, we show how the UEP algorithm assigns data and FEC to the different data streams for the Lena image compressed with the SPIHT algorithm in Figure 4.4. Stream 1 is the first stream (the most important data from the SPIHT algorithm) and it has an assignment of 24% data and 76% FEC. Stream 47 is the last stream (the least important data) with 70% data and only 30% FEC. The amount of FEC decreases with increasing stream number, as required by our algorithm.
4.2 Ultrasound Test Image
We also applied our assignment algorithm to a 439 × 374 grayscale ultrasound image of a jugular vein thrombosis compressed with SPIHT (see Figure 4.5). To present results applicable to a mobile ultrasound device, we made use of the specifications of Metricom's Ricochet2 technology, which transports packets of approximately 500
Figure 4.4: Data fraction for each stream. Note that the FEC fraction is (1 - Data Fraction). Stream 1 is the first stream (most important data) and stream 47 is the last stream (least important data).
bytes. We also want images to update continuously approximately every second, so the image was transmitted in 32 500-byte packets over a channel with an exponential mean loss rate of 20%. The total bit rate was 0.79 bits per pixel for the combination of data and FEC bytes. Convergence of the algorithm for UEP was typically reached in about 45 iterations and required 0.14 seconds of CPU time on an Intel Pentium II 300 MHz workstation. In Figure 4.6, we demonstrate the effects of both the unconstrained UEP assignment and the constrained EEP assignment. Under better channel conditions (erasure rates of up to 35%), the UEP assignment yields a PSNR of 38.65 dB, which is 0.99 dB higher than the 37.66 dB result of EEP assignment. Again, this is because more bytes are used for data and fewer for FEC. As before, UEP degrades gracefully, whereas EEP would give very poor image quality if the experienced loss rate rose
Figure 4.5: The 439 × 374 ultrasound of a jugular vein thrombosis. (a) The original. (b) Compressed at 0.79 bpp with SPIHT.
above 53%. As expected, both of these cases once again substantially outperform not using any protection on the data, except when the loss rate is very low, e.g. below 1%. We also examine image quality for the ultrasound image at a variety of loss rates. Figure 4.7 shows the graceful degradation of the image transmitted over a lossy packet network with loss rates of 12.5%, 25%, 37.5%, and 50%. Notice that the image quality remains high, even when the loss rates exceed the 20% mean.
4.3 MRI Test Image
Finally, to show that the FEC assignment algorithm also works on MRI images, it was applied to a 256 × 256 grayscale magnetic resonance image of a brain compressed with SPIHT (see Figure 4.8(a)). In this case, the image was transmitted in 174 47-byte payloads over a channel with an exponential mean loss rate of 10%. The total bit rate
Figure 4.6: Comparison of PSNR vs. fraction of packets lost for unequal erasure protection, equal erasure protection, and no loss protection for the jugular vein ultrasound image. The channel loss model is an exponential with a mean of 20%.
was 1.0 bits per pixel for the combination of data and FEC bytes. Convergence of the algorithm was typically reached in about 46 iterations and required 0.08 seconds of CPU time on an Intel Pentium II 300 MHz workstation. In Figure 4.9, we show the results of using our algorithm for both UEP and for EEP. Under better channel conditions (erasure rates of up to 24%), UEP yields a PSNR of 36.02 dB, which is 1.08 dB higher than the 34.94 dB result of EEP. As before, UEP degrades gracefully, whereas EEP would give very poor image quality if the experienced loss rate were above 33%. Figure 4.10 shows the graceful degradation of the image protected with UEP and transmitted over a lossy packet network with
Figure 4.7: Ultrasound image quality at 0.79 bpp total rate for unequal erasure protection on a channel with an exponential loss model that averages 20%. Loss rates from left to right: 12.5%, 25%, 37.5%, and 50%.
loss rates of 10%, 20%, 30%, and 40%. Notice that the image quality remains high at a 30% loss rate and the image is still recognizable as a sagittal brain slice at the 40% loss rate.
Figure 4.8: The 256 × 256 MRI test image. (a) The original. (b) Compressed at 1.0 bpp with SPIHT.
Figure 4.9: Comparison of PSNR of MRI test image vs. fraction of packets lost for unequal erasure protection, equal erasure protection, and no loss protection. The channel loss model is an exponential with a mean of 10%.
Figure 4.10: MRI test image quality at 1.0 bpp total rate for UEP on a channel with an exponential loss model that averages 10%. Loss rates from left to right: 10%, 20%, 30%, and 40%.
Chapter 5
CONCLUSIONS AND SUGGESTIONS FOR FUTURE WORK
We have presented a framework that allows recovery of image data, even when the transmission medium is affected by large packet loss rates. We then developed an algorithm that can optimize the allocation of FEC within that framework. By using both the importance of the output of the SPIHT coder and an estimate of the probability of losing packets, it considers distortion-rate tradeoffs in its assignments to provide graceful degradation when packets are lost. Those assignments result in guarantees about image quality for a particular assignment: the image quality is fixed for a given fraction of lost packets, regardless of which specific packets are lost. The algorithm is also modular in that it can use any compression scheme that produces a progressive bitstream and any channel estimator that can provide a probability distribution function of packet losses. We imagine three distinct directions for potential future work. First, this algorithm could be incorporated into a complete mobile ultrasound system by combining a portable ultrasound device, such as the MUSTPAC-1, with a wireless communication network, such as the one provided by Metricom's Ricochet2 technology. Constructing this device would involve numerous systems engineering challenges. The wireless communication network that is employed would have to be studied carefully to guide the selection of a channel estimator that can accurately predict packet loss probabilities. Getting access to a portable ultrasound device and obtaining clearance to use it in field trials would also not be a trivial task.
Alternatively, the principles of FEC assignment presented in this thesis could be used to guide study in related fields. For example, it might be possible to apply unequal protection to schemes that do not use explicit FEC codes. Rather, by integrating redundant data directly into the image compression process, a joint source-channel coder might be created. Preliminary work in this area can be found in [12]. In addition, some networks allow the user to negotiate quality of service requirements. Because providing higher quality service is more expensive for the network, the various service classes are likely to have different costs associated with them. If a large price differential exists between a high-quality (no loss) channel and a low-quality (medium-to-high loss) channel, then UEP might be advantageous. It could be used to transform a high-bandwidth, low-quality channel into a series of lower-bandwidth channels with much higher qualities of service. Finally, improvements might be made to the UEP assignment algorithm itself. Careful optimization of the source code would decrease its execution time, as would implementation on a programmable multimedia processor. It might also be possible to design an algorithm that uses successive approximation to refine the allocation of FEC in a progressive manner, an enhancement that would allow the algorithm to terminate before completion and still find a reasonably good allocation. Successive approximation could be useful for situations in which time constraints require an extremely fast algorithm.
BIBLIOGRAPHY
[1] Andres Albanese, Johannes Blomer, Jeff Edmonds, Michael Luby, and Madhu Sudan. Priority encoding transmission. IEEE Transactions on Information Theory, 42(6):1737–1744, November 1996.
[2] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies. Image coding using wavelet transform. IEEE Transactions on Image Processing, 1(2):205–220, April 1992.
[3] J. Coleman, A. Savchenko, A. Goettsch, K. Wang, P. Bono, R. Littlefield, and M. Macedonia. Teleinvivo: a collaborative volume visualization application. In Medicine Meets Virtual Reality, pages 115–124, 1997.
[4] Pamela C. Cosman, Jon K. Rogers, P. Greg Sherwood, and Kenneth Zeger. Combined forward error control and packetized zerotree wavelet encoding for transmission of images over varying channels. IEEE Transactions on Image Processing, 1998. Submitted for publication.
[5] Geoffrey M. Davis, John M. Danskin, and Xiyong Song. Joint source and channel coding for Internet image transmission. In Proceedings of ICIP, volume 1, pages 21–24, September 1996.
[6] B.J. Erickson, A. Manduca, P. Palisson, K. Persons, F. Earnest, V. Savcenko, and N.J. Hangiandreou. Wavelet compression of medical images. Radiology, 206(3):599–607, March 1998.
[7] N. Farvardin. A study of vector quantization for noisy channels. IEEE Transactions on Information Theory, 36(4):799–809, July 1990.
[8] A. Gersho and R. M. Gray. Vector Quantization and Signal Compression. Kluwer Academic Publishers, Boston, 1992.
[9] P. Giovas, D. Papadoyannis, D. Thomakos, G. Papazachos, M. Rallidis, D. Soulis, C. Stamatopoulos, S. Mavrogeni, and N. Katsilambros. Transmission of electrocardiograms from a moving ambulance. Journal of Telemedicine and Telecare, 4(1):5–7, 1998.
[10] J. Hagenauer. Rate-compatible punctured convolutional codes (RCPC Codes) and their applications. IEEE Transactions on Communications, 36(4):389–400, April 1988.
[11] Christian Leicher. Hierarchical encoding of MPEG sequences using priority encoding transmission (PET). Technical Report TR-94-058, ICSI, November 1994.
[12] Agnieszka C. Miguel, Alexander E. Mohr, and Eve A. Riskin. SPIHT for generalized multiple description coding. Submitted for publication in Proceedings of ICIP, 1999.
[13] W. B. Pennebaker and J. L. Mitchell. JPEG: Still Image Data Compression Standard. Van Nostrand Reinhold, New York, 1993.
[14] K. Persons, P. Palisson, A. Manduca, B.J. Erickson, and V. Savcenko. An analytical look at the effects of compression on medical images. Journal of Digital Imaging, 10(3 Suppl 1):60–66, August 1997.
[15] L. Rizzo. Effective erasure codes for reliable computer communication protocols. ACM Computer Communication Review, 27(2):24–36, April 1997.
[16] Jon K. Rogers and Pamela C. Cosman. Robust wavelet zerotree image compression with fixed-length packetization. In Proceedings Data Compression Conference, pages 418–427, March–April 1998.
[17] Amir Said and William A. Pearlman. A new, fast, and efficient image codec based on set partitioning in hierarchical trees. IEEE Transactions on Circuits and Systems for Video Technology, 6(3):243–250, June 1996.
[18] D.F. Schomer, A.A. Elekes, J.D. Hazle, J.C. Huffman, S.K. Thompson, C.K. Chui, and W.A. Murphy. Introduction to wavelet-based compression of medical images. Radiographics, 18(2):469–481, March–April 1998.
[19] C. E. Shannon. A mathematical theory of communication. Bell Systems Technical Journal, 27:379–423, 1948.
[20] J. M. Shapiro. Embedded image coding using zerotrees of wavelet coefficients. IEEE Transactions on Signal Processing, 41(12):3445–3462, December 1993.
[21] P. G. Sherwood and K. Zeger. Progressive image coding for noisy channels. IEEE Signal Processing Letters, 4(7):189–191, July 1997.
[22] P. G. Sherwood and K. Zeger. Error protection for progressive image transmission over memoryless and fading channels. IEEE Transactions on Communications, 1998. Accepted for publication.
[23] Y. Shoham and A. Gersho. Efficient bit allocation for an arbitrary set of quantizers. IEEE Transactions on Acoustics, Speech and Signal Processing, 36(9):1445–1453, September 1988.
[24] K-H Tzou. Progressive image transmission: a review and comparison of techniques. Optical Engineering, 26(7):581–589, July 1987.
[25] Ren-Yuh Wang, Eve A. Riskin, and Richard Ladner. Codebook organization to enhance maximum a posteriori detection of progressive transmission of vector quantized images over noisy channels. IEEE Transactions on Image Processing, 5(1):37–48, January 1996.
[26] J. Wen and J. D. Villasenor. Reversible variable length codes for efficient and robust image and video coding. In Proceedings Data Compression Conference, pages 471–480, March–April 1998.
[27] B. Zhao, L.H. Schwarz, and P.K. Kijewski. Effects of lossy compression on lesion detection: predictions of the nonprewhitening matched filter. Medical Physics, 25(9):1621–1624, September 1998.