THIS PAPER APPEARED IN THE PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, CHICAGO, ILLINOIS, OCTOBER 4--7, 1998.

IMAGE/VIDEO SCALING ALGORITHM BASED ON MULTIRATE SIGNAL PROCESSING

Soontorn Oraintara and Truong Q. Nguyen

Electrical and Computer Engineering Department, Boston University
8 St. Mary's St., Boston, MA 02215 USA
Email: [email protected] and [email protected]

ABSTRACT


This paper presents an approach for image and video scaling using multirate signal processing. The main objective is to scale images with an arbitrary rational scaling ratio without visible aliasing or distortion artifacts. The approach can be applied to grey scale images, color images and video signals in both the spatial domain and the time domain. Cosine modulation is used to minimize the required on-chip memory, since only a prototype filter is stored together with some cosine modulation factors. The filters are shown to have comparable regularity for each scaling factor. An efficient structure is proposed for limited-bit-length filter coefficients which has no imaging artifact after the filter coefficients are quantized. Simulations on still image and video scaling are presented.

1. A REVIEW OF IMAGE/VIDEO SCALING

Digital image/video processing is an active research area for multimedia applications due to the efficiency of digital representation of signals. Image/video scaling is desirable in systems with different display resolutions. Figure 1 shows a structure for scaling an input signal x(n) with a scaling ratio L/M. Without loss of generality, we assume that gcd(L, M) = 1, where gcd is the greatest common divisor, i.e. AL + BM = 1 for some integers A and B.

Figure 1: A block diagram for L/M image scaler: x(n) → [↑L] → u(n) → [H_{L,M}(z)] → v(n) → [↓M] → y(n).

The input signal x(n) is upsampled by L, filtered by an interpolation filter H_{L,M}(z), and downsampled by M. Hence the size of the output y(n) is L/M of the size of the input x(n). The relation among x(n), u(n), v(n) and y(n) can be summarized as follows [1]:



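\[ u(n) = \begin{cases} x(n/L) & \text{if } L \mid n \\ 0 & \text{otherwise} \end{cases} \iff U(z) = X(z^{L}) \tag{1} \]

\[ v(n) = h(n) * u(n) \iff V(z) = H(z)\,U(z) \tag{2} \]

\[ y(n) = v(Mn) \iff Y(z) = \frac{1}{M} \sum_{k=0}^{M-1} V\!\left(e^{j2\pi k/M} z^{1/M}\right) \tag{3} \]

\[ y(n) = \sum_{k \in \mathbb{Z}} h(k)\,u(Mn - k) = \sum_{m \in \mathbb{Z}} h(Mn - Lm)\,x(m). \tag{4} \]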
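As a quick numerical check of Eqs (1)-(4), the following sketch compares the direct upsample/filter/downsample chain with the right-hand side of Eq (4). The function names and the random test filter are illustrative assumptions and are not part of the paper; any FIR filter h(n) can be substituted.

```python
import numpy as np

def scale_direct(x, h, L, M):
    """Upsample by L, filter with h, downsample by M (Eqs. 1-3)."""
    u = np.zeros(len(x) * L)
    u[::L] = x                  # Eq. (1): insert L-1 zeros between samples
    v = np.convolve(h, u)       # Eq. (2): filter at the high rate
    return v[::M]               # Eq. (3): keep every M-th sample

def scale_polyphase(x, h, L, M):
    """Evaluate Eq. (4) directly: y(n) = sum_m h(M n - L m) x(m)."""
    full_len = len(x) * L + len(h) - 1
    n_out = -(-full_len // M)   # ceil(full_len / M), same length as the direct form
    y = np.zeros(n_out)
    for n in range(n_out):
        for m in range(len(x)):
            k = M * n - L * m   # only taps with k = Mn (mod L) ever contribute
            if 0 <= k < len(h):
                y[n] += h[k] * x[m]
    return y

# Hypothetical test data: random input and random 12-tap filter.
rng = np.random.default_rng(0)
x = rng.standard_normal(20)
h = rng.standard_normal(12)
print(np.allclose(scale_direct(x, h, 3, 4), scale_polyphase(x, h, 3, 4)))
```

For each output sample, only the taps h(k) with k ≡ Mn (mod L) are touched, which is why roughly one tap in L is used per output.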

Note that from the right-hand side of Eq (4), the length of the filter is effectively reduced by a factor of L, and thus the number of multipliers needed for the computation is reduced. Such a computation can be implemented using the polyphase structure [1]. Although the block diagram presented in Figure 1 is for one-dimensional processing, the same method can be applied to an image, where the horizontal and vertical directions are scaled independently. Furthermore, color image scaling can also be done by using a similar algorithm on the color components, either in the original domain (R-G-B) or in the transformed domain (Y-I-Q). The advantage of processing in the Y-I-Q domain is that one can reduce the computation requirement by processing the decimated I and Q components. Not only can the spatial directions be processed using the system in Figure 1, but the signal can also be scaled along the time axis. This is an application to frame rate conversion for video signals.

There are two major artifacts in multirate signal processing: aliasing and imaging, which arise from downsampling and upsampling respectively [2]. These artifacts degrade the image quality and must be suppressed in the algorithm. Notice that the frequency characteristics of the interpolation filter H_{L,M}(z) depend on the values of L and M. From Eq (1), upsampling by L introduces imaging components at frequencies 2πk/L, 1 ≤ k ≤ L − 1, and thus the upsampled signal u(n) has to be filtered by a lowpass filter with cutoff frequency π/L. On the other hand, from Eq (3), downsampling by M introduces aliasing by expanding the signal spectrum of v(n) by a factor of M. Therefore an anti-aliasing lowpass filter with cutoff frequency π/M is required. Figure 2(a) and (b) show the frequency spectra of the signals for the cases L > M and L < M respectively. When L > M (Figure 2(a)), the cutoff frequency of H_{L,M}(z) has to be π/L to avoid the imaging artifact, and the aliasing is then automatically suppressed. If L < M (Figure 2(b)), the cutoff frequency of H_{L,M}(z) has to be π/M; otherwise the aliasing artifact will be created. Therefore the cutoff frequency of H_{L,M}(z) must be π/max(L, M).

Figure 2: Frequency responses X(ω), U(ω), V(ω) and Y(ω) for the cases when (a) L > M and (b) L < M.

In practice, implementing ideal filters is impossible, and therefore the transition band of each filter cannot be zero. The nonzero stopband response of the filter will pick up some energy around the imaging frequencies. If the filter has high attenuation, these imaging artifacts are not visible in the scaled image. However, it is well known that human visual perception is very sensitive to the DC value of an image, and thus it is important to have exactly zero response at the imaging (aliasing) frequencies 2kπ/L for k = 1, 2, ..., L − 1. Figure 6 demonstrates the imaging artifact when the filter has high stopband attenuation but is not exactly zero at the aliasing frequencies. The image is scaled at a 5:4 scaling ratio. In order to suppress this imaging artifact (the vertical and horizontal lines), the processing filter needs to have zeros at the aliasing frequencies. This is another constraint on the processing filter H_{L,M}(z). In summary, the properties of the filter H_{L,M}(z), for each L and M with gcd(L, M) = 1, are as follows:

• The filter H_{L,M}(z) has ideal cutoff frequency at ω_c = π/max(L, M).
• H_{L,M}(e^{j2kπ/L}) = 0 for 1 ≤ k ≤ L − 1.

For each scaling ratio, a different filter H_{L,M}(z) is needed, and typically it has to be re-designed, which is not efficient for arbitrary scaling ratios. Another difficulty when all the filters are independent is that large on-chip memory is required. Moreover, when the filter coefficients have to be quantized to a limited bit length, the regularity property can be destroyed easily, which results in visible imaging artifacts. Recently, an approach to FIR filter design using cosine modulation has been proposed [3], which provides an efficient mapping, with a small number of parameters, between two FIR filters. A special case in which the maximally-flat halfband filter is used for the mapping has been derived in closed form; the resulting filters have approximately-flat frequency response (very high stopband attenuation) while the transition bandwidth is comparable to that of the original maximally-flat halfband filter. However, regularity cannot be imposed in such a mapping, and thus some modification has to be made in order to use this type of filter in our application.

2. A DISCUSSION ON FILTER DESIGN WITH LOW ON-CHIP MEMORY REQUIREMENT

As discussed in Section 1, a different scaling filter H_{L,M}(z) is needed for each combination of L and M. In other words, lowpass filters with arbitrary cutoff frequencies are required for arbitrary scaling ratios. However, storing the coefficients of many independent filters requires large on-chip memory, especially when the filters are very long, which is not preferred in commercial products. To overcome this problem, we propose to use a cosine modulation filter: starting from a narrow-band (linear phase) lowpass prototype filter, filters with different cutoff frequencies can be obtained using cosine modulation. However, when the cutoff frequency of the desired lowpass filter is much larger than that of the prototype filter, the passband and stopband ripples will collectively increase, and thus the prototype filter has to be properly chosen.

Figure 3: Frequency response of a prototype filter P(ω), with passband ripple δ_p and stopband ripple δ_s (magnitude versus normalized frequency from 0 to 0.5).

Let P(z), a lowpass filter with cutoff frequency ω_c = π/M for some integer M (which is assumed to be large), be the prototype filter. Figure 3 shows a typical frequency response of P(z) and its passband (δ_p) and stopband (δ_s) ripples. Let

\[ h(n) = 2\,p(n) \sum_{k=1}^{R/2} \cos\!\left(\frac{(2k-1)\pi n}{M}\right) \tag{5} \]

be a modulated lowpass filter generated by p(n), with cutoff frequency Rπ/M for some even integer R. The passband and stopband ripples are δ_p + (R − 1)δ_s and Rδ_s respectively, which grow linearly with R. Conversely, for δ_p + (R − 1)δ_s and Rδ_s to be small, δ_p and δ_s have to be small; hence, in order to obtain high-attenuation filters for arbitrary R, the ripples of the prototype filter should be small. In our simulations, we use the approximately-flat filter obtained from [3] as a prototype filter, which is given by

\[ p(n) = \lim_{x \to n} f(x)\,\frac{\sin(x\pi/M)}{\sin(x\pi/2)} \tag{6} \]

where f(n) is the maximally-flat halfband filter and is given by [1]

\[ f(n) = 2 \sum_{k=0}^{2K} \sum_{l=0}^{K-1} \frac{(-1)^{n-K+k}}{2^{2(K+l)}} \binom{2K}{k} \binom{K+l-1}{l} \binom{2l}{K-k+l-n}. \tag{7} \]
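The cosine modulation of Eq (5) can be sketched numerically as follows. As a loudly labeled assumption, the prototype used here is a Hamming-windowed sinc with cutoff π/M, chosen only because it is easy to write down; it is a stand-in for, and not the same as, the approximately-flat prototype of Eq (6). The function names and the values M = 12, R = 4 are illustrative.

```python
import numpy as np

def prototype(M, half_len):
    """Stand-in prototype (assumption, not Eq. (6)): windowed sinc, cutoff pi/M."""
    n = np.arange(-half_len, half_len + 1)
    return n, np.sinc(n / M) / M * np.hamming(n.size)

def modulate(n, p, R, M):
    """Eq. (5): h(n) = 2 p(n) sum_{k=1}^{R/2} cos((2k-1) pi n / M), R even."""
    if R % 2:
        raise ValueError("Eq. (5) is stated for even R")
    carriers = sum(np.cos((2 * k - 1) * np.pi * n / M)
                   for k in range(1, R // 2 + 1))
    return 2 * p * carriers

def response(h, n, w):
    """Magnitude of the frequency response of h (indexed by n) at frequency w."""
    return abs(np.sum(h * np.exp(-1j * w * n)))

M, R = 12, 4                       # target cutoff R*pi/M = pi/3
n, p = prototype(M, half_len=60)
h = modulate(n, p, R, M)
for w in (0.0, np.pi / 6, 0.9 * np.pi):
    print(f"|H({w:.3f})| = {response(h, n, w):.4f}")
```

Only p(n) and the cosine table need to be stored; changing R changes the cutoff without redesigning the filter.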

Therefore filters with different cutoff frequencies can be obtained as in Eq (5). Although the M-th band filter calculated in Eq (6) has high attenuation, zeros at the aliasing frequencies are not guaranteed, and therefore some imaging artifact appears in the scaled image.

Let {L_1/M_1, L_2/M_2, ..., L_J/M_J} be the set of all possible scaling ratios with gcd(L_j, M_j) = 1 for all j. Let M = lcm{L_1, ..., L_J, M_1, ..., M_J}, where lcm is the least common multiple. We first design an M-th band filter p(n) using Eq (6). For a particular scaling ratio L_j/M_j, define

\[ K_j \triangleq \frac{M}{\max(L_j, M_j)} \]

and the scaling filter H_j(z) = H_{L_j,M_j}(z) is given by

\[ h_j(n) = g_j(n) * d_j(n) \tag{8} \]

where g_j(n) is the modified version of Eq (5):

\[ g_j(n) = \begin{cases} 2\,p(n) \displaystyle\sum_{i=1}^{K_j/2} q_j(2i-1) \cos\!\left(\frac{\pi(2i-1)n}{M}\right), & \text{even } K_j \\[2ex] p(n) \left( 2 \displaystyle\sum_{i=1}^{(K_j-1)/2} q_j(2i) \cos\!\left(\frac{2\pi i n}{M}\right) + 1 \right), & \text{odd } K_j \end{cases} \tag{9} \]

The purposes of d_j(n) and q_j(n) are as follows. Conventionally, if d_j(n) = δ(n) and q_j(n) ≡ 1, h_j(n) is a lowpass filter with cutoff frequency at π/max(L_j, M_j). To prevent imaging artifacts in the scaled image, zeros at the aliasing frequencies 2πk/L_j need to be imposed. The shortest filter with zeros at the aliasing frequencies 2πk/L_j is the rectangular window of length L_j. By letting d_j(n) be a rectangular window of length L_j, h_j(n) is guaranteed to have zeros at the aliasing frequencies; however, this is equivalent to cascading with another filter whose frequency response is given by

\[ D_j(\omega) = \frac{\sin(L_j \omega / 2M)}{L_j \sin(\omega / 2M)}. \tag{10} \]

The purpose of q_j(n) is to compensate for this magnitude distortion, and it can be described by

\[ q_j(k) = \frac{L_j \sin(k\pi / 2M)}{\sin(L_j k\pi / 2M)}. \tag{11} \]

To summarize, given a set of possible scaling ratios, the parameters that need to be stored are:

• the prototype filter coefficients p(n),
• the cosine modulation table, and
• the modulation factors q_j(n).

Figure 4 shows the frequency response of the prototype filter (M = 12), the scaling filter when L_j = 3, M_j = 4 and when L_j = 4, M_j = 3 with length 23.

Figure 4: A 12-th band prototype filter and lowpass filters with length 27 for the cases L_j = 3, M_j = 4 and L_j = 4, M_j = 3 (log-magnitude response versus normalized frequency from 0 to 0.5).

Notice that the structure of the filter mapping in Eq (6) satisfies the M-th band condition, i.e. h(kM) = δ(k), but regularity is not guaranteed. On the other hand, by modifying the filter as in Eq (8), with d_j(n) chosen as a rectangular window of length L_j, the filter now has one regularity but the M-th band condition no longer holds. In image scaling applications, zeros at the aliasing frequencies are more important than the M-th band condition of the filter, and thus this is acceptable. In the next section, we will illustrate that the structure proposed in Eq (8) can be used even when the numerical precision is limited.

Figure 7 shows an example in image scaling where a part of the original TV image (Figure 7(a)) is scaled with ratios 3:4 and 4:3. The filter has length 27 with cutoff frequency at π/4. The resulting images are shown in Figure 7(b) and Figure 7(c) respectively. The filters used in this example are those shown in Figure 4.
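The construction of Eqs (8)-(11) can be sketched as follows for the ratios 3/4 and 4/3, for which M = lcm{3, 4} = 12 and K_j = 12/4 = 3, giving a cutoff of K_jπ/M = π/4. Two assumptions are made and should be kept in mind: the prototype is again a Hamming-windowed sinc (a stand-in for Eq (6), with an arbitrarily chosen length), and d_j(n) is normalized by 1/L_j to match the factor in Eq (10). All function names are hypothetical.

```python
from math import lcm
import numpy as np

def prototype(M, half_len):
    """Stand-in prototype (assumption, not Eq. (6)): windowed sinc, cutoff pi/M."""
    n = np.arange(-half_len, half_len + 1)
    return n, np.sinc(n / M) / M * np.hamming(n.size)

def q(k, L, M):
    """Eq. (11): gain that compensates the droop of the length-L window."""
    return L * np.sin(k * np.pi / (2 * M)) / np.sin(L * k * np.pi / (2 * M))

def g_filter(n, p, L, Mj, M):
    """Eq. (9): cosine-modulated prototype with compensation factors q_j."""
    K = M // max(L, Mj)
    if K % 2 == 0:
        s = sum(q(2 * i - 1, L, M) * np.cos(np.pi * (2 * i - 1) * n / M)
                for i in range(1, K // 2 + 1))
        return 2 * p * s
    s = sum(q(2 * i, L, M) * np.cos(2 * np.pi * i * n / M)
            for i in range(1, (K - 1) // 2 + 1))
    return p * (2 * s + 1)

def scaling_filter(n, p, L, Mj, M):
    """Eq. (8): h_j = g_j * d_j with d_j a length-L rectangular window."""
    d = np.ones(L) / L            # 1/L normalization assumed from Eq. (10)
    return np.convolve(g_filter(n, p, L, Mj, M), d)

def response(h, w):
    """Magnitude response of h at frequency w (time origin is irrelevant)."""
    return abs(np.sum(h * np.exp(-1j * w * np.arange(h.size))))

ratios = [(3, 4), (4, 3)]
M = lcm(*(v for r in ratios for v in r))
n, p = prototype(M, half_len=60)
for L, Mj in ratios:
    h = scaling_filter(n, p, L, Mj, M)
    worst = max(response(h, 2 * np.pi * k / L) for k in range(1, L))
    print(f"L={L}, M={Mj}: |H(0)|={response(h, 0.0):.4f}, max at 2*pi*k/L={worst:.2e}")
```

The zeros at 2πk/L_j come entirely from d_j(n), so they hold regardless of which prototype is plugged in.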

3. LIMITED BIT LENGTH PROCESSING

For low power consumption, rational filter coefficients are preferred, so the coefficients need to be quantized to a finite word length. If the filter coefficients h_j(n) are quantized directly, then although the stopband attenuation is high, the frequency response of the scaling filter will not be exactly zero at the aliasing frequencies, which results in imaging artifacts after the image is scaled. The structure of the filter presented in Eq (8) can be divided into two parts:

1. d_j(n), which imposes the zeros at the aliasing frequencies and has integer (rational) values, and
2. g_j(n), which has floating point values.

In order to quantize h_j(n), it suffices to quantize only the g_j(n) part while keeping d_j(n) unchanged. Let g̃_j(n) denote the quantized version of g_j(n). The resulting quantized filter h̃_j(n) will now have rational values (because d_j(n) and g̃_j(n) are rational numbers) with zeros at the aliasing frequencies guaranteed. Figure 5 shows the cascading structure of the quantized filter coefficients h̃_j(n).

Figure 8 shows an example in which a video sequence is interpolated along the time axis. Figure 8(a) shows six consecutive frames of the SUZIE sequence. This video is converted to 50% of its frame rate and then converted back to the original frame rate. Figure 8(b) shows the corresponding frames of the interpolated video sequence.
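The effect of quantizing only g_j(n) can be illustrated with a small sketch. The word length (8 bits), the rounding rule, and the windowed-sinc stand-in for g_j(n) are all assumptions made for illustration; the paper does not specify them. The sketch compares the cascade g̃_j(n) * d_j(n) with a directly quantized h_j(n) at the aliasing frequencies 2πk/L_j.

```python
import numpy as np

def quantize(c, bits):
    """Round coefficients to the nearest multiple of 2**-bits (assumed rule)."""
    return np.round(c * 2**bits) / 2**bits

def response(h, w):
    """Magnitude of the frequency response of h at frequency w."""
    return abs(np.sum(h * np.exp(-1j * w * np.arange(h.size))))

L, M, bits = 4, 3, 8
n = np.arange(-13, 14)                       # 27 taps
wc = np.pi / max(L, M)                       # cutoff pi/max(L, M)
# Stand-in for g_j(n): Hamming-windowed sinc lowpass (assumption).
g = (wc / np.pi) * np.sinc(wc * n / np.pi) * np.hamming(n.size)
d = np.ones(L) / L                           # rectangular window, kept exact

h_cascade = np.convolve(quantize(g, bits), d)   # quantize g only, then cascade
h_direct = quantize(np.convolve(g, d), bits)    # quantize the full filter

for k in range(1, L):
    w = 2 * np.pi * k / L
    print(f"k={k}: cascade {response(h_cascade, w):.2e}, "
          f"direct {response(h_direct, w):.2e}")
```

In the cascade form the zeros survive up to floating-point rounding for any word length, because they are produced by d_j(n) alone; the directly quantized filter generally shows a small but nonzero response at the same frequencies.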

4. CONCLUSION

In this paper, we have presented a method for image scaling using multirate signal processing. The approach can be applied to any kind of signal in any dimension, including grey scale images, color images and video signals. A modified structure of the cosine modulation filter is presented, by which the zeros at the aliasing frequencies are structurally imposed even when the filter coefficients are quantized to a finite word length. Some examples of filter designs and the corresponding scaled images are also presented, with no visible aliasing or imaging artifacts. A simulation of frame rate conversion of a video sequence is also presented.

5. REFERENCES

[1] P. P. Vaidyanathan. Multirate Systems and Filter Banks. Prentice-Hall, Englewood Cliffs, NJ, 1993.

[2] G. Strang and T. Nguyen. Wavelets and Filter Banks. Wellesley-Cambridge, Wellesley, MA, 1996.

[3] S. Oraintara and T. Nguyen. M-th band filter design based on cosine modulation. In Proc. ISCAS, May 1998.

Figure 5: A cascading structure of the scaling filter, g̃_j(n) followed by d_j(n) to form h̃_j(n), which preserves the zeros at aliasing frequencies after being quantized into finite word length.

Figure 6: Imaging artifact created by using a scaling filter without exact zero response at aliasing frequencies.

Figure 7: (a) Original image, (b) 3/4-scaled image and (c) 4/3-scaled image.

Figure 8: Six consecutive frames of (a) the original SUZIE sequence, and (b) the interpolated SUZIE sequence (downsampling by 2 followed by upsampling by 2).
