IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 3, MARCH 2012
Antialiasing Filter Design for Subpixel Downsampling via Frequency-Domain Analysis
Lu Fang, Oscar C. Au, Ketan Tang, and Aggelos K. Katsaggelos
Abstract—In this paper, we are concerned with image downsampling using subpixel techniques to achieve superior sharpness for small liquid crystal displays (LCDs). Such a problem exists when a high-resolution image or video is to be displayed on low-resolution display terminals. Limited by the low-resolution display, we have to shrink the image. Signal-processing theory tells us that optimal decimation requires low-pass filtering with a suitable cutoff frequency, followed by downsampling. In doing so, we need to remove many useful image details causing blurring. Subpixel-based downsampling, taking advantage of the fact that each pixel on a color LCD is actually composed of individual red, green, and blue subpixel stripes, can provide apparent higher resolution. In this paper, we use frequency-domain analysis to explain what happens in subpixel-based downsampling and why it is possible to achieve a higher apparent resolution. According to our frequency-domain analysis and observation, the cutoff frequency of the low-pass filter for subpixel-based decimation can be effectively extended beyond the Nyquist frequency using a novel antialiasing filter. Applying the proposed filters to two existing subpixel downsampling schemes called direct subpixel-based downsampling (DSD) and diagonal DSD (DDSD), we obtain two improved schemes, i.e., DSD based on frequency-domain analysis (DSD-FA) and DDSD based on frequency-domain analysis (DDSD-FA). Experimental results verify that the proposed DSD-FA and DDSD-FA can provide superior results, compared with existing subpixel or pixel-based downsampling methods. Index Terms—Downsampling, frequency analysis, subpixel rendering.
I. INTRODUCTION
A single pixel on a color liquid-crystal display (LCD) consists of several primary colors, which are typically three colored stripes ordered (depending on the display) either as blue, green, and red (BGR), or as red, green, and blue (RGB). The colored stripes are called subpixels. The colors of the three subpixels are fused together to appear as a single color to the human visual system (HVS) due to the blurring by the optics
Manuscript received December 14, 2010; revised May 11, 2011; accepted August 07, 2011. Date of publication August 30, 2011; date of current version February 17, 2012. This work was supported in part by the Research Grants Council (RGC) of the Hong Kong Special Administrative Region, China, under Grant GRF 610109. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Kiyoharu Aizawa. L. Fang, O. C. Au, and K. Tang are with the Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong (e-mail:
[email protected];
[email protected];
[email protected]). A. K. Katsaggelos is with the Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208 USA (e-mail:
[email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIP.2011.2165550
Fig. 1. Red, green, and blue subpixels of a pixel in an LCD stripe display.
and the spatial integration by the nerve cells in the human eyes [1]–[3]. Although the subpixels are easily visible when viewed at a very close distance (particularly with the help of a magnifying glass), as shown in Fig. 1, they are not individually visible beyond a certain distance. However, researchers found that, by controlling the subpixel values of neighboring pixels, it is possible to microshift the apparent position of a line and/or to achieve higher edge sharpness. Methods that take this into account are called subpixel rendering algorithms [4]. Subpixel rendering techniques stem from the problem of font rendering on LCDs. For a long time, a simple pixel-based font display has been used, and the smallest level of detail that a computer can display on an LCD is a single pixel [5]. On the other hand, subpixel rendering takes advantage of the fact that each pixel on a color LCD is actually composed of individual subpixel stripes to give greater font details [6], [7]. Simple pixel-based rendering can cause sawtooth artifacts in sloping edges, as shown in Fig. 2(a) [8]. Because a pixel is composed of three separable subpixels, the number of points that may be independently addressed to reconstruct the image can be increased three times, and we can ‘borrow’ subpixels from adjacent whole pixels. Fig. 2(b) depicts a much smoother result compared with Fig. 2(a) using subpixel rendering, which microshifts the apparent position of the sloping edge by a distance of one or two subpixels. However, subpixel rendering causes an imbalance in the local color, as shown in Fig. 2(c), which is an artifact commonly called ‘color fringing’ [6], [9], because, for some pixels, only one or two (but not three) subpixels are turned on/off. In this paper, we are concerned with image downsampling using subpixel techniques to achieve sharper images for small LCDs.
Such a problem exists when a high-resolution image or video is to be displayed on low-resolution display terminals. For example, while digital pictures are usually captured at very high resolutions (e.g., 10 megapixels), many of them would be displayed on LCD computer monitors, photoframes, or small LCD screens on mobile phones or personal digital assistants, which have considerably lower resolutions (e.g., 0.8 megapixels on SVGA or 0.2 megapixels on some smartphones). A similar situation exists for videos, where full-high-definition (HD, 1920 × 1080) TV or full-HD movies in Blu-ray may be viewed
1057-7149/$26.00 © 2011 IEEE
Fig. 2. Rendering of a sloping edge. (a) Pixel-based rendering. (b) Subpixel rendering (conceptual) result. (c) Subpixel rendering (actual color pattern).
Fig. 3. (a) DPD. (b) Magnified result of DPD, where grass is “broken” due to aliasing artifacts. (c) DSD. (d) Magnified result of DSD, where grass is smooth but has color fringing artifacts.
on HD-ready (1366 × 768) or standard-definition (720 × 576 or 720 × 480) TVs or monitors. To view high-resolution images/videos on low-resolution displays, a downsampling procedure is required. A simple way, called Direct Pixel-based Downsampling (DPD) in this paper, is to perform downsampling by keeping one pixel out of every few and discarding the rest. (In this paper, the term "direct" means that no antialiasing filter is applied.) It can incur severe aliasing artifacts in regions with high spatial frequency [such as staircase artifacts and broken lines, as shown in Fig. 3(b)] [10]. An improved method is called Pixel-based Downsampling with an Anti-aliasing Filter (PDAF), in which the antialiasing filter is applied before DPD. It suppresses aliasing artifacts at the expense of blurring the image, as only low-frequency information can be retained in the process [10], [11]. Note that neither DPD nor PDAF incurs color artifacts. Since the number of individual reconstruction points in an LCD can be increased three times by considering the subpixels, the application of subpixel rendering in downsampling schemes
may lead to an improvement in apparent resolution. Higher apparent resolution is always attractive to consumers, because, for a given physical size of display, higher resolutions can make images more realistic by rendering more details. For example, for two displays with the same physical size of 4 in × 3 in, a 704 × 576 display would convey more details than a 352 × 288 display. This is also why HDTV looks nicer than SDTV on displays with the same physical size. Unfortunately, the direct application of the subpixel approach to downsampling may create a color-fringing problem and may result in very annoying artifacts [6], [9]. Thus, some filtering is needed to suppress the color-fringing artifacts without significantly damaging the improved apparent resolution. In [6], Gibson uses a five-tap low-pass filter to smooth the results of the subpixel-based downsampling. However, the low-pass filter relieves color fringing at the expense of image blurring and can only be adopted as an enhancement technique for achromatic images. Based on psychophysical experiments, Platt [12] defines an error metric in the frequency domain and derives the filter coefficients by minimizing this metric. Betrisey et al. [13] have applied the results of Platt [12] to achromatic (grayscale) font rendering, which is the basis of Microsoft's ClearType system. In [14] and [15], an algorithm based on the HVS is proposed to suppress visible chrominance aliasing. However, all these methods process subpixel-based downsampling horizontally. Researchers typically do not attempt to apply subpixel-based downsampling vertically as there is a common perception that little can be gained in the vertical direction due to the horizontal arrangement of the subpixels. In this paper, we attempt to use frequency-domain analysis to analyze what happens in subpixel-based downsampling and why it is possible to achieve higher apparent resolution.
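We then design novel antialiasing filters for existing subpixel-based downsampling schemes based on our frequency-domain analysis. The rest of this paper is organized as follows: In Section II, we first analyze the frequency characteristics of DPD and two existing subpixel-based downsampling schemes, i.e., direct subpixel-based downsampling (DSD) and diagonal DSD (DDSD). Since DSD and DDSD cause color-fringing artifacts, we design novel antialiasing filters for them based on our frequency-domain analysis, and we call the resulting schemes DSD based on frequency-domain analysis (DSD-FA) and DDSD based on frequency-domain analysis (DDSD-FA), respectively. In Section III, we introduce the objective measures to evaluate the proposed methods. Experimental results are described in Section IV. Finally, Section V concludes this paper.

II. PROPOSED SUBPIXEL DOWNSAMPLING BASED ON FREQUENCY-DOMAIN ANALYSIS

A. DSD-FA

For simplicity, we assume that an input high-resolution image $L$ (meaning large) of size $3M \times 3N$ is to be downsampled to a low-resolution image $S$ (meaning small) of size $M \times N$, to be displayed on an $M \times N$ device. (Note that if $L$ is not of size $3M \times 3N$, i.e., the downsampling ratio is not 3, we can use regular interpolation or decimation methods to resize $L$ to be $3M \times 3N$.) In this paper, we will use $(m, n)$ to index $L$ and $(i, j)$ to index $S$ such that $(R_{m,n}, G_{m,n}, B_{m,n})$ is the $(m,n)$th pixel of $L$ and $(r_{i,j}, g_{i,j}, b_{i,j})$ is the $(i,j)$th pixel of $S$.

Daly and Kovvuri [16] proposed a simple subpixel-based downsampling scheme that we call DSD in this paper. DSD decimates the red, green, and blue components alternately in the horizontal direction. DSD copies the red, green, and blue components (i.e., the three subpixels) of the $(i,j)$th pixel of $S$ from three different pixels in $L$, such that $r_{i,j} = R_{3i-2,3j-2}$, $g_{i,j} = G_{3i-2,3j-1}$, and $b_{i,j} = B_{3i-2,3j}$, as shown in Fig. 3(c). A close examination of Fig. 3 reveals that, while the aliasing artifacts in DPD cause the grass to be "broken", DSD fills in the gaps in the grass, making it continuous and sharp at the expense of introducing undesirable color fringing artifacts.

To better understand the spatial multiplexing of the RGB components of a DSD image, we define an auxiliary image $L'$ of size $3M \times 3N$, and we use $(R'_{m,n}, G'_{m,n}, B'_{m,n})$ to represent the $(m,n)$th pixel of $L'$. We define the red component of $L'$ such that $R'_{3i-2,3j-2} = R_{3i-2,3j-2}$ for all possible $(i,j)$, and zero elsewhere, i.e.,

$$R'_{m,n} = \begin{cases} R_{m,n}, & m = 3i-2,\ n = 3j-2 \\ 0, & \text{otherwise.} \end{cases} \tag{1}$$

Effectively, out of every 3 × 3 submatrix of $R'$, one value is copied from $R$, and the other eight are zero. Similarly, we define $G'_{m,n} = G_{m,n}$ for $m = 3i-2$, $n = 3j-1$; $B'_{m,n} = B_{m,n}$ for $m = 3i-2$, $n = 3j$; and zero elsewhere. As such, if DSD is applied to either $L$ or $L'$, the same small image $S$ will be obtained. We define the auxiliary luminance component for the $(m,n)$th pixel as $I'(m,n) = R'_{m,n} + G'_{m,n} + B'_{m,n}$. As this is defined for DSD, we will call this $I'_{\mathrm{DSD}}$. Later, we will define a similar $I'$ for DPD and DDSD, and we will call them $I'_{\mathrm{DPD}}$ and $I'_{\mathrm{DDSD}}$, respectively. By definition, $I'_{\mathrm{DSD}}$ can be written as

$$I'_{\mathrm{DSD}}(m,n) = \sum_{k \in \{R,G,B\}} \phi_k^{\mathrm{DSD}}(m,n)\, C_k(m,n) \tag{2}$$

where $\phi_R^{\mathrm{DSD}}$, $\phi_G^{\mathrm{DSD}}$, and $\phi_B^{\mathrm{DSD}}$ are the RGB modulation functions, and $C_k$ are the three color components of $L$ ($k = R$, $G$, or $B$). With

$$a(x) = \tfrac{1}{3}\left[1 + 2\cos\left(\tfrac{2\pi x}{3}\right)\right] = \begin{cases} 1, & \text{for } x = 0, \pm 3, \pm 6, \ldots \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

the modulation functions are

$$\phi_R^{\mathrm{DSD}}(m,n) = a(m-1)\,a(n-1), \quad \phi_G^{\mathrm{DSD}}(m,n) = a(m-1)\,a(n-2), \quad \phi_B^{\mathrm{DSD}}(m,n) = a(m-1)\,a(n). \tag{4}$$

The Fourier transform of $I'_{\mathrm{DSD}}$ is

$$\hat{I}'_{\mathrm{DSD}}(u,v) = \sum_{k \in \{R,G,B\}} \hat{\phi}_k^{\mathrm{DSD}}(u,v) \circledast \hat{C}_k(u,v) \tag{5}$$

where $\hat{\ }$ represents the discrete-time Fourier transform (with zero padding outside the support of $\phi_k$ and $C_k$), $(u,v)$ are the horizontal and vertical frequencies in cycles per sample, and $\circledast$ is the convolution operator. As the three RGB modulation functions in (3) and (4) are based on the cosine function, their Fourier transforms are expressed as Diracs, i.e.,

$$\hat{\phi}_R^{\mathrm{DSD}}(u,v) = \frac{1}{9} \sum_{p=-1}^{1} \sum_{q=-1}^{1} e^{-j2\pi (p+q)/3}\, \delta\left(u - \tfrac{q}{3},\ v - \tfrac{p}{3}\right). \tag{6}$$

Similarly, we can compute the Fourier transforms of $\phi_G^{\mathrm{DSD}}$ and $\phi_B^{\mathrm{DSD}}$. We then represent them in a matrix form

$$\begin{bmatrix} \hat{\phi}_R^{\mathrm{DSD}} \\ \hat{\phi}_G^{\mathrm{DSD}} \\ \hat{\phi}_B^{\mathrm{DSD}} \end{bmatrix}(u,v) = \frac{1}{9} \sum_{p=-1}^{1} \sum_{q=-1}^{1} e^{-j2\pi p/3}\, \mathbf{w}_q\, \delta\left(u - \tfrac{q}{3},\ v - \tfrac{p}{3}\right) \tag{7}$$

where $\mathbf{w}_q = [e^{-j2\pi q/3},\ e^{-j4\pi q/3},\ 1]^T$, and $T$ denotes transposition. Rewriting (5) with (7) and considering the relationship between $\hat{\mathbf{c}} = [\hat{C}_R, \hat{C}_G, \hat{C}_B]^T$ and $(\hat{Y}, \hat{C}_1, \hat{C}_2)$

$$\mathbf{w}_0^T \hat{\mathbf{c}} = 3\hat{Y}, \quad \mathbf{w}_1^T \hat{\mathbf{c}} = 3 e^{-j2\pi/3}\, \hat{C}_1, \quad \mathbf{w}_{-1}^T \hat{\mathbf{c}} = 3 e^{j2\pi/3}\, \hat{C}_2 \tag{8}$$

we can obtain

$$\hat{I}'_{\mathrm{DSD}}(u,v) = \frac{1}{3} \sum_{p=-1}^{1} e^{-j2\pi p/3} \left[ \hat{Y}\left(u,\ v - \tfrac{p}{3}\right) + e^{-j2\pi/3}\, \hat{C}_1\left(u - \tfrac{1}{3},\ v - \tfrac{p}{3}\right) + e^{j2\pi/3}\, \hat{C}_2\left(u + \tfrac{1}{3},\ v - \tfrac{p}{3}\right) \right] \tag{9}$$

where, writing $R$, $G$, and $B$ for $C_R$, $C_G$, and $C_B$,

$$Y = \tfrac{1}{3}(R + G + B), \quad C_1 = \tfrac{1}{3}\left(R + e^{-j2\pi/3} G + e^{-j4\pi/3} B\right), \quad C_2 = \tfrac{1}{3}\left(R + e^{j2\pi/3} G + e^{j4\pi/3} B\right). \tag{10}$$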
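The sampling rules and the auxiliary luminance described above can be illustrated with a minimal NumPy sketch. It assumes 0-based array indexing, image dimensions divisible by 3, and that DSD takes R, G, and B from three horizontally adjacent pixels of the same row; the function names are hypothetical and are not taken from the paper. For a flat gray image the chrominance-like terms cancel, so the DSD auxiliary luminance keeps only the vertically replicated spectra, while DPD keeps all nine.

```python
import numpy as np

def dpd(L):
    """Direct pixel-based downsampling: keep one pixel out of every 3 x 3 block."""
    return L[0::3, 0::3, :].copy()

def dsd(L):
    """Direct subpixel-based downsampling (assumed offsets: R, G, B are taken
    from three horizontally adjacent pixels of the same row)."""
    S = np.empty((L.shape[0] // 3, L.shape[1] // 3, 3), dtype=L.dtype)
    for k in range(3):                 # k = 0, 1, 2 for R, G, B
        S[..., k] = L[0::3, k::3, k]   # column offset k picks the k-th neighbor
    return S

def auxiliary_luminance(L, col_offsets):
    """Full-size zero-filled luminance: only the sampled subpixels are kept."""
    aux = np.zeros(L.shape[:2], dtype=float)
    for k, off in enumerate(col_offsets):
        aux[0::3, off::3] += L[0::3, off::3, k]
    return aux

def count_spectral_replicas(aux, rel_tol=1e-9):
    """Number of 2-D DFT bins that hold non-negligible energy."""
    mag = np.abs(np.fft.fft2(aux))
    return int(np.count_nonzero(mag > rel_tol * mag.max()))

# Flat gray test image: every color plane is identical.
gray = np.ones((12, 12, 3))
replicas_dpd = count_spectral_replicas(auxiliary_luminance(gray, (0, 0, 0)))
replicas_dsd = count_spectral_replicas(auxiliary_luminance(gray, (0, 1, 2)))
```

Passing column offsets `(0, 0, 0)` reproduces the DPD sampling pattern and `(0, 1, 2)` the DSD pattern, so the same helper can be used to inspect either spectrum.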
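To compare the difference between DSD and DPD, we define the corresponding $I'_{\mathrm{DPD}}$ similarly to (2), of which the three RGB modulation functions are $\phi_k^{\mathrm{DPD}}$, as defined in (11). Note that $\phi_k^{\mathrm{DPD}}$ is identical for the R, G, and B color components in DPD, i.e.,

$$\phi_R^{\mathrm{DPD}}(m,n) = \phi_G^{\mathrm{DPD}}(m,n) = \phi_B^{\mathrm{DPD}}(m,n) = \tfrac{1}{9}\left[1 + 2\cos\left(\tfrac{2\pi (m-1)}{3}\right)\right]\left[1 + 2\cos\left(\tfrac{2\pi (n-1)}{3}\right)\right]. \tag{11}$$

The Fourier transform of $I'_{\mathrm{DPD}}$ is given by

$$\hat{I}'_{\mathrm{DPD}}(u,v) = \frac{1}{3} \sum_{p=-1}^{1} \sum_{q=-1}^{1} e^{-j2\pi p/3}\, e^{-j2\pi q/3}\, \hat{Y}\left(u - \tfrac{q}{3},\ v - \tfrac{p}{3}\right). \tag{12}$$

Fig. 4. (a) Frequency spectra of $I'_{\mathrm{DPD}}$. (b) Frequency spectra of $I'_{\mathrm{DSD}}$.

Fig. 4 shows the magnitude of $\hat{I}'_{\mathrm{DSD}}$ in (9) and $\hat{I}'_{\mathrm{DPD}}$ in (12). We clearly see nine replicated spectra situated at $u \in \{0, \pm 1/3\}$ and $v \in \{0, \pm 1/3\}$ in each, corresponding to the nine Dirac locations. Examining Fig. 4(a), all nine replicated spectra of $\hat{I}'_{\mathrm{DPD}}$ are $\hat{Y}$ with equal magnitude since the magnitude of $e^{-j2\pi p/3}$ and $e^{-j2\pi q/3}$ in (12) is equal to 1. The magnitude of $\hat{I}'_{\mathrm{DSD}}$ is obviously different from that of $\hat{I}'_{\mathrm{DPD}}$. While the three center replicated spectra of $\hat{I}'_{\mathrm{DSD}}$ are $\hat{Y}$, the three on the right are $\hat{C}_1$, and the three on the left are $\hat{C}_2$, corresponding to (9). The horizontal shifting of the sampling locations of the three colors in DSD leads to the replacement of $\hat{Y}$ by $\hat{C}_1$ and $\hat{C}_2$ on the two sides of the center $\hat{Y}$. It also appears that $\hat{C}_1$ and $\hat{C}_2$ are significantly different from $\hat{Y}$.

To examine the difference between $\hat{Y}$, $\hat{C}_1$, and $\hat{C}_2$, we compare their magnitudes, as shown in Fig. 5. All three have more energy at low than high frequencies, as expected. Although $Y$, $C_1$, and $C_2$ are weighted sums of $R$, $G$, and $B$ according to (10), their magnitudes in Fig. 5 can be quite different. Both $\hat{C}_1$ and $\hat{C}_2$ appear to have more compact spectra than $\hat{Y}$. This can be explained as follows: Any signal such as $R$ can be decomposed into a low-frequency term $R_L$ and a high-frequency term $R_H$. It is well known that the high-frequency components of different colors tend to be similar [17], i.e., $R_H \approx G_H \approx B_H$. This implies that

$$C_1 = \tfrac{1}{3}\left(R_L + e^{-j2\pi/3} G_L + e^{-j4\pi/3} B_L\right) + \tfrac{1}{3}\left(R_H + e^{-j2\pi/3} G_H + e^{-j4\pi/3} B_H\right) \approx \tfrac{1}{3}\left(R_L + e^{-j2\pi/3} G_L + e^{-j4\pi/3} B_L\right) \tag{13}$$

since $1 + e^{-j2\pi/3} + e^{-j4\pi/3} = 0$. This means that $C_1$ is mainly a low-frequency signal, with very little high-frequency energy. A similar argument applies to $C_2$. While $C_1$ and $C_2$ are low-frequency signals, $Y$ contains significant high-frequency energy because

$$Y = \tfrac{1}{3}(R_L + G_L + B_L) + \tfrac{1}{3}(R_H + G_H + B_H) \approx \tfrac{1}{3}(R_L + G_L + B_L) + R_H. \tag{14}$$

With the low-frequency nature of $\hat{C}_1$ and $\hat{C}_2$, the horizontal overlap in DSD between the center $\hat{Y}$ and the $\hat{C}_1$ on the right and the $\hat{C}_2$ on the left is significantly lower than the horizontal overlap in DPD among the $\hat{Y}$s. To prevent aliasing in DPD, it is well known that a low-pass filter with a cutoff frequency of 1/6 should be applied as this is a 3:1 downsampling situation. On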
Fig. 5. (a) Frequency spectra of the $Y$ component. (b) Frequency spectra of the $C_1$ component. (c) Frequency spectra of the $C_2$ component.
Fig. 6. Spectra of $\hat{Y}$, $\hat{C}_1$, and $\hat{C}_2$.
the other hand, with the smaller amount of horizontal overlap in DSD, it is perhaps possible for us to use a higher cutoff frequency (larger than 1/6) to retain more of the signal details in $\hat{Y}$. In the following, we will seek to find suitable cutoff frequency values according to certain criteria. As the horizontal and vertical overlapping situations are different, the horizontal and vertical cutoff frequencies will be different.

Let $\nu_0 = 1/3$ be both the horizontal and vertical shifts of the replicated spectra such that, for DSD, $\hat{Y}$ is located at $(0, 0)$, $(0, \nu_0)$, and $(0, -\nu_0)$; $\hat{C}_1$ at $(\nu_0, 0)$, $(\nu_0, \nu_0)$, and $(\nu_0, -\nu_0)$; and $\hat{C}_2$ at $(-\nu_0, 0)$, $(-\nu_0, \nu_0)$, and $(-\nu_0, -\nu_0)$. For DPD, it is $\hat{Y}$ at the same nine locations. Let $f_h$ and $f_v$ be the horizontal and vertical cutoff frequencies for different downsampling methods. As previously mentioned, for DPD, the horizontal and vertical cutoff frequency to prevent aliasing should be $\nu_0/2 = 1/6$. For DSD, as the three vertical replicated spectra are identical, the vertical cutoff frequency should be the same as DPD with $f_v = 1/6$. We seek to find a larger horizontal cutoff frequency $f_h > 1/6$.

As noted in [18], all natural images possess a certain degree of spatial correlation, and their energy is sparsely distributed and usually concentrated in the lower frequency bands. We studied the spectra of a set of natural images, as shown in Fig. 6, and found that the majority of the energy is highly concentrated in the lower frequencies, as pointed out by [18]. We further observe that the Laplacian probability density function appears to be a good model. We thus apply a zero-mean circularly symmetric Laplacian model to represent the normalized spectra of $\hat{Y}$, $\hat{C}_1$, and $\hat{C}_2$, i.e., each magnitude spectrum divided by $A$, where $A$ is the sum of the magnitudes of the corresponding frequency spectrum over all frequencies. The variance of the corresponding symmetric Laplacian is $\sigma^2$. In other words, for any 2-D frequency expressed in polar coordinates $(\rho, \theta)$, $|\hat{Y}|$, $|\hat{C}_1|$, and $|\hat{C}_2|$ are modeled as scaled circularly symmetric Laplacian distributions, as given in (15).

In the proposed DSD-FA, we choose the horizontal cutoff frequency to be the frequency where the main spectrum $\hat{Y}$ has power at least greater than or equal to that of the aliasing spectrum $\hat{C}_1$ (or $\hat{C}_2$). Under the Laplacian model (15), this crossover frequency is given by (16). Therefore, we apply a rectangular antialiasing low-pass filter in DSD-FA, and the horizontal and vertical cutoff frequencies are given by (17): the horizontal cutoff is the crossover frequency from (16), and the vertical cutoff remains 1/6.

B. DDSD-FA

In our simulations, we observe that the enhanced apparent resolution of DSD tends to take place in regions with edges with a vertical component (i.e., nonhorizontal edges). In general, DSD does not have improvement in smooth regions or regions with horizontal edges. In [19], we propose the DDSD scheme, which decimates the red, green, and blue components alternately in a diagonal direction, as shown in Fig. 7. (Our proposed DDSD-FA approach works also for the antidiagonal direction, but we use diagonal in this explanation.)

Fig. 7. DDSD.

To further understand the potential and limitation of various downsampling schemes, we generate an artificial large image ($L$) of size 420 × 420 containing four subimages, as
shown in Fig. 8. Each of the four subimages, which we call Subimage-H, Subimage-V, Subimage-D, and Subimage-AD, contains 15 pairs of black and white lines in the horizontal, vertical, diagonal, and antidiagonal directions, respectively. The width of each black or white line is 7 pixels (with a total of 21 subpixels). In our experiment, $L$ is downsampled by a factor of 3 with DPD, DSD, and DDSD to produce three 140 × 140 images, as shown in Fig. 8(b)–(d), respectively.

Fig. 8. (a) Original $L$ image. (b) $S$ obtained by DPD. (c) $S$ obtained by DSD. (d) $S$ obtained by DDSD. $L$ is of size 420 × 420, and $S$ is of size 140 × 140.

We define a subpixel-based regularity measure for each subimage in (18), where $K$ is the number of black lines; $w$ is the width of the black lines in the $L$ image; $w_k$ is the width of the $k$th black line in the DPD, DSD, or DDSD image; and the unit of $w$ and $w_k$ is the subpixel. In our experiment, $K = 15$. The mean $\mu$ and variance $\sigma^2$ for DPD, DSD, and DDSD are shown in Table I.

TABLE I MEAN AND VARIANCE OF DPD, DSD, AND DDSD FOR SUBIMAGE-H, SUBIMAGE-V, SUBIMAGE-D, SUBIMAGE-AD

As expected, $\mu = 0$ for all methods and subimages, suggesting that the average line width is correct for all methods. For DPD, $\sigma^2$ is nonzero in all four directions, indicating that the line spacing of DPD is irregular due to the aliasing artifacts, as shown in Fig. 8. However, DPD has no color fringing artifacts. For DSD, $\sigma^2 = 0$, and the line spacing is regular for three directions (Subimage-V, Subimage-D, and Subimage-AD) at the expense of annoying color fringing artifacts. However, $\sigma^2$ is nonzero in Subimage-H, which has irregular line spacing but no color fringing. For DDSD, $\sigma^2 = 0$, and the line spacing is also regular for three directions (Subimage-V, Subimage-H, and Subimage-AD), and $\sigma^2$ is nonzero for the diagonal direction. Both DSD and DDSD use subpixel-based downsampling to keep line spacing regular in three out of four directions at the expense of color fringing. In the fourth direction, both fail to keep regular line spacing, but no color fringing is observed either.

Following a similar definition to (2), we have

$$I'_{\mathrm{DDSD}}(m,n) = \sum_{k \in \{R,G,B\}} \phi_k^{\mathrm{DDSD}}(m,n)\, C_k(m,n) \tag{19}$$

where the three RGB modulation functions for DDSD are

$$\phi_R^{\mathrm{DDSD}}(m,n) = a(m-1)\,a(n-1), \quad \phi_G^{\mathrm{DDSD}}(m,n) = a(m-2)\,a(n-2), \quad \phi_B^{\mathrm{DDSD}}(m,n) = a(m)\,a(n) \tag{20}$$

with $a(x) = \tfrac{1}{3}\left[1 + 2\cos(2\pi x/3)\right]$, so that the red, green, and blue subpixels are taken from three diagonally adjacent pixels. Using a similar derivation, the Fourier transform of $I'_{\mathrm{DDSD}}$ can be expressed as

$$\hat{I}'_{\mathrm{DDSD}}(u,v) = \frac{1}{3} \sum_{p=-1}^{1} \sum_{q=-1}^{1} \hat{Z}_{p+q}\left(u - \tfrac{q}{3},\ v - \tfrac{p}{3}\right), \quad \hat{Z}_s = \begin{cases} \hat{Y}, & s \equiv 0 \pmod 3 \\ e^{-j2\pi/3}\, \hat{C}_1, & s \equiv 1 \pmod 3 \\ e^{j2\pi/3}\, \hat{C}_2, & s \equiv 2 \pmod 3 \end{cases} \tag{21}$$

where $\hat{Y}$, $\hat{C}_1$, and $\hat{C}_2$ are given in (10). Fig. 9(b) shows the magnitude of $\hat{I}'_{\mathrm{DDSD}}$. We clearly see nine replicated spectra located at $u \in \{0, \pm 1/3\}$ and $v \in \{0, \pm 1/3\}$ corresponding to the nine Dirac locations in (21). The arrangement of the nine replicated spectra in DDSD is obviously different from that in DPD and DSD. The three replicated $\hat{Y}$ are located in the vertical direction at the center of DSD, whereas they are in the antidiagonal direction in DDSD. Both the horizontal and vertical shiftings of the sampling locations of the three colors in DDSD as compared with DPD lead to the replacement of six $\hat{Y}$ in the DPD spectra by $\hat{C}_1$ and $\hat{C}_2$. With the low-frequency nature of $\hat{C}_1$ and $\hat{C}_2$, the center $\hat{Y}$ in DDSD has significantly lower overlapping with the horizontally and vertically neighboring spectra than DPD. Compared with DSD, DDSD has the advantage that the center $\hat{Y}$ overlaps less with the vertical neighbors, even though its overlap with the antidiagonal neighboring $\hat{Y}$ can be considerable. Therefore, while
DSD can extend its cutoff frequency beyond the Nyquist frequency horizontally, DDSD can be extended both horizontally and vertically. To suppress aliasing, we need to remove the effect of all eight replicated spectra on the center main spectrum. While a rectangular antialiasing low-pass filter is possible, we find that a circular low-pass filter is also possible. Here, we find the optimal cutoff frequency of each and compare them.

For a rectangular filter, the larger the cutoff frequency, the more high-frequency content of the center $\hat{Y}$ will be retained. Thus, we seek to maximize the area of the passband, where the horizontal and vertical cutoff frequencies are equal due to symmetry. To suppress the aliasing effect of the horizontal and vertical replicated spectra, the cutoff frequency is bounded by the constraint defined in (16). To suppress the aliasing effect of the antidiagonal replicated spectra, the constraint is that the radial cutoff should separate the two $\hat{Y}$ spectra, i.e., the center spectrum and its antidiagonal replica. The problem can be formulated as the constrained maximization in (22), with the bounds given in (23). Using the relations shown in the Appendix, the optimal solution can be simplified, and the corresponding optimal passband area is given by (24).

For the circular antialiasing filter, we need to find the radial cutoff frequency that maximizes the area of the passband. To suppress aliasing from the horizontal and vertical replicated spectra, the radial cutoff is again bounded by the constraint from (16). To suppress aliasing of the antidiagonal replicated spectra, the radial cutoff must separate the center $\hat{Y}$ from its antidiagonal replica. The optimal solution to the problem in (25) follows directly from these two constraints, and the corresponding passband area is given by (26).

We now compare the rectangular and circular antialiasing filters by plotting the passband area in Fig. 11, according to (24), (26), and (27), where the red and green line segments together represent the optimal passband area of DDSD-FA, with red marking the situations when the rectangular filter is dominant and green when the circular filter is. The circular filter is better when its passband area is larger than that of the rectangular filter. The two situations are illustrated in Fig. 10. Fig. 10(a) shows the case when the circular filter has a larger passband area than the rectangular filter. This tends to occur when $\hat{C}_1$ and $\hat{C}_2$ are considerably "smaller" than $\hat{Y}$. In our experiments, this occurs often. Fig. 10(b) depicts the case when the rectangular filter is better. This tends to occur when $\hat{C}_1$ and $\hat{C}_2$ are rather "large." It seldom occurs. We further plot the optimal passband area of DSD-FA according to (17), which is represented by the blue line in Fig. 11. The passband area of DSD-FA intersects that of DDSD-FA at the point given by (28).
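The cutoff selection and the passband-area comparison discussed above can be sketched numerically. The code below is an illustration under stated assumptions, not the paper's closed-form result: it assumes the magnitude of each spectrum along one axis follows `amplitude / (sqrt(2) * sigma) * exp(-sqrt(2) * |rho| / sigma)` (a scaled Laplacian with variance `sigma**2`), that replicas are shifted by 1/3 cycles per sample, and that frequencies are normalized so the rectangular passband area is `4 * f_h * f_v` and the circular one is `pi * f_r**2`. All function names and parameter values are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def laplacian_magnitude(rho, amplitude, sigma):
    """Assumed spectrum model: scaled 1-D Laplacian with variance sigma**2."""
    return amplitude / (SQRT2 * sigma) * math.exp(-SQRT2 * abs(rho) / sigma)

def crossover_frequency(a_main, s_main, a_alias, s_alias, shift=1.0 / 3.0):
    """Frequency f where the main spectrum (centered at 0) equals the aliasing
    replica (centered at `shift`), clamped to the interval [0, shift]."""
    num = math.log((a_main * s_alias) / (a_alias * s_main)) + SQRT2 * shift / s_alias
    den = SQRT2 * (1.0 / s_main + 1.0 / s_alias)
    return min(max(num / den, 0.0), shift)

def rect_area(f_h, f_v):
    """Passband area of a rectangular low-pass filter with cutoffs f_h, f_v."""
    return 4.0 * f_h * f_v

def circ_area(f_r):
    """Passband area of a circular low-pass filter with radial cutoff f_r."""
    return math.pi * f_r ** 2

# Identical main and aliasing spectra: the crossover is the midpoint, 1/6.
f_equal = crossover_frequency(1.0, 0.05, 1.0, 0.05)
# A weaker aliasing spectrum pushes the crossover beyond 1/6.
f_weak = crossover_frequency(1.0, 0.05, 0.2, 0.05)
# Rectangular filter with the extended horizontal cutoff and vertical cutoff 1/6.
area_rect = rect_area(f_weak, 1.0 / 6.0)
```

With equal spectra the sketch reproduces the conventional cutoff of 1/6, and a weaker aliasing spectrum moves the crossover toward the replica, which is the mechanism that lets the cutoff extend beyond the Nyquist frequency.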
Fig. 9. (a) Frequency spectra of I
. (b) Frequency spectra of I
Fig. 12. Histogram distributions of chrominance components ($U$, $V$): (upper) original image; (middle) PDAF; (lower) DDSD. (a) $U$. (b) $V$.
Fig. 13. Source images 1–12 used in the experiment.
Fig. 10. (a) The circular filter has larger passband area than the rectangular. (b) The rectangular filter has larger passband area than the circular.
Fig. 11. Area of the passband for DSD-FA and DDSD-FA.

Therefore, when the condition following from (28) holds, the passband area of DDSD-FA is larger than that of DSD-FA; thus, DDSD-FA retains more high-frequency energy than DSD-FA, and vice versa.

III. OBJECTIVE MEASURES

Our goal of performing subpixel-based downsampling is to improve the apparent luminance resolution and to suppress color fringing artifacts. In this section, we describe four objective measures for such improvements. Here, we choose the YUV color space [20], although other similar color spaces could have been used as well.

A. LSM

As pointed out in Section II-B, in smooth regions with low-frequency details, subpixel-based downsampling methods barely provide improvement. The major resolution differences of various methods happen in their high-frequency details. Thus, for any image to be measured, we apply a 1-D high-pass filter in four directions (horizontal, vertical, diagonal, and antidiagonal), although other high-pass filters can be used as well. Luminance sharpness measure (LSM) [19] is the average of the directional high-frequency "energy", as defined in (29).

By definition, LSM is nonnegative. A larger LSM indicates that there is more luminance high-frequency energy, which suggests higher apparent luminance resolution.

B. LAM

Recall that the DPD images tend to have severe aliasing artifacts, with many broken lines and sawtooth edges. If we use LSM to measure the sharpness of DPD, the LSM would be quite large since DPD aliasing generates much high-frequency energy. While LSM is good for many subpixel downsampling schemes, it misses the aliasing effects of DPD. Therefore, we need another measure for such aliasing effects. In DPD, most of the aliasing artifacts appear as broken lines. This inspires us to propose a novel measure for "discontinuity energy" called luminance aliasing measure (LAM). For any test image to be considered, we use the corresponding PDAF image as a reference to identify the locations of lines and edges, as lines and edges
TABLE II LSM FOR SOURCE IMAGES 1–12
TABLE III LCM FOR SOURCE IMAGES 1–12
TABLE IV LAM FOR SOURCE IMAGES 1–12
in PDAF are continuous (not broken) [10]. The following procedure is applied to the $Y$ component of a downsampled image $X$ to compute its LAM:

1) Compute the edge map $E$ of the PDAF image, and label the edge locations as 1 and nonedge locations as 0. Thus, $E(i,j) = 1$ if $(i,j)$ is an edge pixel in PDAF. Otherwise, $E(i,j) = 0$.
2) Any pixel $(i,j)$ of $X$ with $E(i,j) = 1$ is called an edge pixel of $X$. For any edge pixel of $X$, compute the square difference between it and each of the edge pixels within its eight connected neighbors.
3) Compute the global average $D_X$ of all square differences computed in step 2 for $X$.
4) Similarly, compute $D_{\mathrm{PDAF}}$ for the PDAF image and $D_{\mathrm{DPD}}$ for the DPD image.
5) Compute LAM defined as

$$\mathrm{LAM} = \frac{D_X - D_{\mathrm{PDAF}}}{D_{\mathrm{DPD}} - D_{\mathrm{PDAF}}}. \tag{30}$$

LAM will be large for images with broken lines and small for those without. By definition, LAM is 1 for DPD and 0 for PDAF. LAM can be negative for images with edges and lines even smoother than PDAF.
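The five steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not specify the edge detector, so a simple gradient-magnitude threshold is assumed here, and the function names and the default threshold are hypothetical.

```python
import numpy as np

def edge_map(y, threshold):
    """Step 1: binary edge map (assumed detector: gradient-magnitude threshold)."""
    gy, gx = np.gradient(np.asarray(y, dtype=float))
    return np.hypot(gx, gy) > threshold

def mean_edge_discontinuity(y, edges):
    """Steps 2-3: average squared difference between every edge pixel and each
    edge pixel among its eight connected neighbors."""
    y = np.asarray(y, dtype=float)
    h, w = y.shape
    total, count = 0.0, 0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            # Overlapping windows so that a[i, j] is paired with b[i + di, j + dj].
            a = (slice(max(0, -di), h - max(0, di)), slice(max(0, -dj), w - max(0, dj)))
            b = (slice(max(0, di), h - max(0, -di)), slice(max(0, dj), w - max(0, -dj)))
            both = edges[a] & edges[b]
            total += float(np.sum((y[a] - y[b])[both] ** 2))
            count += int(np.count_nonzero(both))
    return total / count if count else 0.0

def lam(y_test, y_pdaf, y_dpd, threshold=0.1):
    """Steps 4-5: normalize so that PDAF maps to 0 and DPD maps to 1."""
    edges = edge_map(y_pdaf, threshold)      # edges always come from PDAF
    d_test = mean_edge_discontinuity(y_test, edges)
    d_pdaf = mean_edge_discontinuity(y_pdaf, edges)
    d_dpd = mean_edge_discontinuity(y_dpd, edges)
    return (d_test - d_pdaf) / (d_dpd - d_pdaf)
```

Because the edge map is always taken from the PDAF image, all three averages are computed over the same pixel pairs, which is what makes the normalization meaningful.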
TABLE V CDM FOR SOURCE IMAGES 1–12
C. LCM
IV. IMPLEMENTATION AND RESULTS
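We notice that, for two images (downsampled from the same large image) having similar values of LSM and LAM, the one with higher contrast tends to provide better visual quality. As noted in [21], the contrast of these images can be measured by their global variance, which we will call luminance contrast measure (LCM), i.e.,

$$\mathrm{LCM} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left(Y_{i,j} - \bar{Y}\right)^2 \tag{31}$$

where $Y_{i,j}$ is the luminance of the $(i,j)$th pixel of the $M \times N$ downsampled image and $\bar{Y}$ is its mean.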
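As a small illustration of the global-variance measure, the sketch below computes the population variance of a luminance plane. The function name is hypothetical, and any normalization applied to the reported table values is not reproduced here.

```python
import numpy as np

def lcm(y):
    """Luminance contrast measure: global (population) variance of the Y plane."""
    y = np.asarray(y, dtype=float)
    return float(np.mean((y - y.mean()) ** 2))

# A flat image has no contrast; a two-level image has a positive variance.
flat_contrast = lcm(np.full((4, 4), 7.0))
two_level_contrast = lcm([[0.0, 2.0], [0.0, 2.0]])
```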
D. CDM

In our experiments, we observe that chrominance distortion typically is negligible in PDAF but very severe in DSD and DDSD. This can be partly illustrated in Fig. 12, which shows that the chrominance histogram of $U$ and $V$ of the original large image is very similar to that of PDAF but significantly different from that of DDSD. Therefore, for any image to be measured, we choose the chrominance components of PDAF as the reference and compute $\mathrm{PSNR}_U$ and $\mathrm{PSNR}_V$ as follows:

$$\mathrm{PSNR}_U = 10 \log_{10} \frac{255^2}{\mathrm{MSE}_U}, \quad \mathrm{MSE}_U = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left(U_{i,j} - U^{\mathrm{PDAF}}_{i,j}\right)^2 \tag{32}$$

where $U$ and $U^{\mathrm{PDAF}}$ are the $U$ components of the image to be measured and PDAF, respectively. $\mathrm{PSNR}_V$ is defined similarly.
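The chrominance distortion measure can be sketched as a PSNR against the PDAF chrominance plane. This is an illustrative sketch: the peak value of 255 assumes 8-bit data, and the function name is hypothetical.

```python
import numpy as np

def chroma_psnr(chroma, chroma_pdaf, peak=255.0):
    """PSNR of one chrominance plane (U or V) with the PDAF plane as reference."""
    chroma = np.asarray(chroma, dtype=float)
    chroma_pdaf = np.asarray(chroma_pdaf, dtype=float)
    mse = float(np.mean((chroma - chroma_pdaf) ** 2))
    if mse == 0.0:
        return float("inf")   # identical planes: no chrominance distortion
    return float(10.0 * np.log10(peak ** 2 / mse))

# A uniform error of one gray level on every pixel gives MSE = 1.
reference = np.zeros((4, 4))
psnr_unit_error = chroma_psnr(reference + 1.0, reference)
```

The same function is applied to the $U$ and $V$ planes separately; a lower value indicates stronger color fringing relative to PDAF.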
In this section, we simulate the proposed DSD-FA and DDSD-FA subpixel-based downsampling methods and compare them with PDAF, which is pixel based, and Betrisey [13] and Gibson [6], which are subpixel based. Twelve test images shown in Fig. 13 are used. Various methods are applied to these images to downsample them by a factor of 3 both horizontally and vertically. Betrisey et al. [13] defined an error metric in the frequency domain based on psychophysical experiments and derived the filter coefficients by minimizing this metric. The optimal filter consists of nine filters applied to the three color channels. They further use a one-pixel-wide box filter to approximate the nine optimal filters. They treat the R, G, and B subpixels as individual signal samples and apply the one-pixel-wide three-tap box filter directly to smooth the result of subpixel rendering. Instead of the one-pixel-wide box filter, Gibson applies a five-tap low-pass filter directly to smooth the result of subpixel rendering. As a result, the R, G, and B values within adjacent pixels tend to be very similar, if not identical, making the resulting image appear monochrome. Therefore, we expect the chrominance distortions of Betrisey and Gibson to be large for full-color images. We examine the apparent resolution of luminance ($Y$) and the distortion of chrominance ($U$ and $V$) of various methods using our proposed objective measures. To compare the luminance performance, we show their LSM, LAM, and LCM values in Tables II–IV, respectively. For the chrominance performance, we show their chrominance distortion measure (CDM) values in Table V. Due to limited space, only some typical downsampled images are shown in Figs. 14 and 15. As Platt [12] noted,
Fig. 14. Experimental results for source image 10. (a) DPD. (b) DSD. (c) DDSD. (d) PDAF. (e) Betrisey. (f) Gibson. (g) DSD-FA. (h) DDSD-FA.
printing is a subtractive process, and thus, showing our results in printed form may be a problem. Since the special color fringing artifacts and blurring are best seen on an LCD, we urge the readers to view our image results without scaling on our homepage directly: http://ihome.ust.hk/~fanglu/DDSD-FA.htm. For LSM, a larger value implies larger high-frequency energy, and methods with larger LSM tend to be better. In Table II, DPD has the highest LSM (1.000), implying that it is very sharp. The average LSM of DSD (0.937) and DDSD (0.900) are slightly lower than that of DPD; thus, they tend to be slightly less sharp. PDAF has a significantly smaller average LSM (0.706), as expected, because PDAF is considerably more blurred than DPD. The proposed DSD-FA and DDSD-FA can achieve larger LSM (0.851 and 0.837, respectively) and, thus, sharper images than PDAF and other downsampling schemes with prefiltering. We have similar observations about LCM in Table III. Again, the proposed DSD-FA and DDSD-FA have the best LCM and, thus, the best contrast among the methods with prefiltering. In both tables, the best downsampling results with prefiltering are underlined. Recall that, when the condition following from (28) holds, the passband area of DDSD-FA is larger than that of DSD-FA; thus, DDSD-FA retains more high-frequency energy than DSD-FA. In all our experiments, this condition is satisfied, and DDSD-FA should have higher LSM and LCM than DSD-FA. However, as shown in Table II, sometimes, DSD-FA can have higher LSM than DDSD-FA. This is mainly because ideal low-pass filters are infinitely long in the spatial domain, which is impossible to implement, and in our implementation, we truncate the ideal low-pass filter to a finite number of taps. As the approximation is not always good, sometimes, DSD-FA can have higher LSM and LCM than DDSD-FA. Although DPD has the largest LSM and LCM, implying it is very sharp with high contrast, it actually has very significant aliasing artifacts. The replicated spectra $\hat{Y}$ of DPD introduce many high-frequency aliasing components, causing it to have large LSM and LCM. In comparison, DSD has similar vertical aliasing but smaller horizontal aliasing due to the replicated spectra $\hat{C}_1$ and $\hat{C}_2$ being "smaller" than $\hat{Y}$. Similarly, DDSD has smaller aliasing than DSD, because the dominant replicated spectrum $\hat{Y}$ is farther away from the center spectrum. Thus, examining LSM and LCM alone is not sufficient, and LAM is another good performance measure to consider. For LAM, a smaller value implies smoother lines and edges, and methods
Fig. 15. Experimental results for source image 13. (a) DPD. (b) DSD. (c) DDSD. (d) PDAF. (e) Betrisey. (f) Gibson. (g) DSD-FA. (h) DDSD-FA.
Fig. 16. Source images 13–16 of size 1024 × 768.
with small LAM are better. In Table IV, we observe that DPD has the largest LAM (1.000) among all methods. This suggests that DPD incurs the most severe aliasing artifacts, in the form of severely broken lines and jagged edges, as shown in Fig. 14, in spite of its large LSM and LCM. Similarly, although DSD and DDSD have large LSM and LCM, their LAM values are rather large at 0.719 for DSD and 0.532 for DDSD. The proposed DSD-FA significantly reduces the LAM of DSD by 0.463 on average, and the LAM of the proposed DDSD-FA is 0.309 smaller than that of DDSD. The LAM values of Betrisey and Gibson are negative, suggesting that their lines and edges are even smoother than those of PDAF.
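The smoothing behavior attributed above to Betrisey and Gibson can be illustrated with a small sketch. It is not the implementation of [13] or [6]; it only assumes, as described in this section, that the R, G, and B subpixels of a row are treated as one sequence of samples and filtered with a one-pixel-wide (three-tap) box filter. The function names and the edge handling (replicating the first and last subpixel) are hypothetical choices made for illustration.

```python
def interleave_subpixels(row):
    """Flatten a row of (R, G, B) pixels into one sequence of subpixel samples."""
    return [value for pixel in row for value in pixel]


def box_filter_subpixels(row, taps=3):
    """Apply a `taps`-wide moving average across the subpixel sequence.

    Edges are handled by replicating the first and last subpixel
    (an assumption made for this sketch, not taken from the paper).
    """
    samples = interleave_subpixels(row)
    half = taps // 2
    padded = [samples[0]] * half + samples + [samples[-1]] * half
    # Each output sample is the mean of the subpixel and its neighbors,
    # regardless of which color channel the neighbors belong to.
    smoothed = [sum(padded[k:k + taps]) / taps for k in range(len(samples))]
    # Regroup the filtered samples into (R, G, B) triples.
    return [tuple(smoothed[k:k + 3]) for k in range(0, len(smoothed), 3)]


# A saturated red pixel between two black pixels.
row = [(0, 0, 0), (255, 0, 0), (0, 0, 0)]
filtered = box_filter_subpixels(row)
print(filtered)
```

In this toy row, the energy of the single red subpixel is spread equally over the neighboring blue, red, and green subpixels, so the original color is no longer preserved. This is consistent with the expectation stated above that such filtering yields smooth edges at the cost of chrominance fidelity.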
TABLE VI LSM FOR SOURCE IMAGES 13–16 WITH SAMPLING RATIO 4:1
TABLE VII LAM FOR SOURCE IMAGES 13–16 WITH SAMPLING RATIO 4:1
TABLE VIII CDM FOR SOURCE IMAGES 13–16 WITH SAMPLING RATIO 4:1
Note that these images are rather blurred, as reflected by their low LSM and LCM. In general, the proposed DSD-FA and DDSD-FA have far fewer aliasing artifacts than DSD and DDSD, respectively, and have less blurring than the other downsampling schemes with prefiltering.
CDM is a measure of chrominance distortion. Larger PSNR values of the two chrominance channels imply fewer color fringing artifacts; thus, methods with higher values tend to be better. In Table V, we observe that the chrominance PSNR values of DPD are much higher than those of all subpixel-based methods because DPD (like PDAF) is pixel based, and pixel-based methods tend not to introduce any chrominance distortion. As shown in Figs. 14 and 15, both DSD and DDSD produce objectionable visual artifacts (e.g., color fringing artifacts) in edge areas, whereas pixel-based methods are largely free of color distortion. As expected, the chrominance PSNR values of Betrisey and Gibson are much worse than those of the other methods, because they tend to remove all colors, leading to huge chrominance distortion. We also observe that the chrominance PSNR values of the proposed DSD-FA and DDSD-FA are about 3–6 dB higher than those of DSD or DDSD, suggesting that the proposed methods can suppress chrominance distortion effectively.
To further confirm the differences from other methods, we have added four more source images, including cartoon and texture images, which are shown in Fig. 16. To test the proposed algorithms for sampling factors other than 3:1, we also simulate the cases of 4:1 and 5:1. Due to limited space, we only show the LSM, LAM, and CDM values for the 4:1 case in Tables VI–VIII, respectively.
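As a rough illustration of how a per-channel chrominance PSNR of the kind reported above could be computed, the following sketch compares two equally sized RGB images. It is not the paper's CDM definition: the function names are hypothetical, the chrominance channels are assumed to be Cb and Cr obtained with the full-range BT.601 conversion used in JFIF, and both the reference and the test image are simply passed in as arguments.

```python
import math


def rgb_to_cbcr(pixel):
    """Convert one (R, G, B) pixel to (Cb, Cr), full-range BT.601 (assumed)."""
    r, g, b = pixel
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return cb, cr


def chroma_psnr(reference, test, peak=255.0):
    """Return (PSNR_Cb, PSNR_Cr) in dB for two non-empty images of equal size.

    Images are lists of rows, each row a list of (R, G, B) tuples.
    """
    squared_error = [0.0, 0.0]
    count = 0
    for ref_row, test_row in zip(reference, test):
        for ref_px, test_px in zip(ref_row, test_row):
            ref_c = rgb_to_cbcr(ref_px)
            test_c = rgb_to_cbcr(test_px)
            for ch in range(2):
                squared_error[ch] += (ref_c[ch] - test_c[ch]) ** 2
            count += 1
    psnr = []
    for ch in range(2):
        mse = squared_error[ch] / count
        # Identical chrominance gives zero error, reported as infinite PSNR.
        psnr.append(math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse))
    return tuple(psnr)


# A flat gray patch versus a version with reddish and bluish fringes.
gray = [[(128, 128, 128), (128, 128, 128)]]
fringed = [[(148, 128, 128), (128, 128, 108)]]
print(chroma_psnr(gray, fringed))
```

Under these assumptions, a method that introduces color fringes lowers both values, while a method that leaves chrominance untouched yields very large or infinite values.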
V. CONCLUSION
In this paper, we have used frequency-domain analysis to explain the aliasing behavior of several downsampling schemes, i.e., DPD (pixel-based), DSD, and DDSD (subpixel-based), and why it is possible to achieve a higher apparent resolution. According to our frequency analysis and observation, the cutoff frequency of the antialiasing filter for DSD and DDSD can be effectively extended beyond the Nyquist frequency using novel antialiasing filters. Applying the proposed filters to DSD and DDSD, we have obtained two improved schemes: DSD-FA and DDSD-FA. Experimental results verify that the proposed DSD-FA and DDSD-FA can provide superior results compared to existing subpixel or pixel-based downsampling methods.
APPENDIX A
Recall that (33) where is the sum of the magnitudes of the corresponding frequency spectra over all frequencies. The variance of the corresponding symmetric Laplacian is Rewriting (33), we have (34) For the right side of (34), we calculate the values for a variety of source images and plot them in Fig. 17(a). We observe that the values of are slightly larger than or similar to those of
Fig. 17. Values for the corresponding normalized spectra of Ŷ (i = 0), Ĉ (i = 1), and Ĉ (i = 2), plotted in panels (a) and (b).
,( and ). We also calculate the values and plot them in Fig. 17(b). It is clear that and . This is mainly because the sum of and the magnitudes of is significantly larger than that of ,( and ). Therefore, is a positive number and of
(35) Similarly, we have .
REFERENCES
[1] Y. Amano, “A flat-panel TV display system in monochrome and color,” IEEE Trans. Electron. Devices, vol. ED-22, no. 1, pp. 1–7, Jan. 1975.
[2] T. Benzschawel and W. E. Howard, “Method of and apparatus for displaying a multicolor image,” U.S. Patent 5 341 153, Aug. 23, 1994.
[3] L. M. Chen and S. Hasegawa, “Influence of pixel-structure noise on image resolution and color for matrix display devices,” J. Soc. Inf. Display, vol. 1, no. 1, pp. 103–110, Jan. 1993.
[4] P. Barten, “Effects of quantization and pixel structure on the image quality of color matrix displays,” in Proc. IEEE Int. Conf. Display Res., 1991, pp. 167–170.
[5] ClearType information [Online]. Available: http://www.microsoft.com/typography/cleartypeinfo.mspx
[6] S. Gibson, Sub-pixel font rendering technology [Online]. Available: http://www.grc.com/cleartype.htm
[7] M. A. Klompenhouwer, G. De Haan, and R. A. Beuker, “Subpixel image scaling for color matrix displays,” J. Soc. Inf. Display, vol. 11, no. 1, pp. 99–108, Mar. 2003.
[8] J.-S. Kim and C.-S. Kim, “A filter design algorithm for subpixel rendering on matrix displays,” in Proc. 15th EUSIPCO, 2007, pp. 1487–1491.
[9] S. Daly, “Analysis of subtriad addressing algorithms by visual system models,” in SID Int. Symp. Digest Tech. Papers, 2001, vol. 32, pp. 1200–1203.
[10] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 2005.
[11] P. S. R. Diniz, Digital Signal Processing. Cambridge, U.K.: Cambridge Univ. Press, 2010.
[12] J. C. Platt, “Optimal filtering for patterned displays,” IEEE Signal Process. Lett., vol. 7, no. 7, pp. 179–181, Jul. 2000.
[13] C. Betrisey, J. F. Blinn, B. Dresevic, B. Hill, G. Hitchcock, B. Keely, D. P. Mitchell, J. C. Platt, and T. Whitted, “Displaced filtering for patterned displays,” in Proc. SID Int. Symp. Digest Tech. Papers, 2000, vol. 31, pp. 296–301.
[14] D. S. Messing and S. Daly, “Improved display resolution of subsampled colour images using subpixel addressing,” in Proc. IEEE ICIP, 2002, vol. 1, pp. I-625–I-628.
[15] D. S. Messing, L. Kerofsky, and S. Daly, “Subpixel rendering on nonstriped colour matrix displays,” in Proc. IEEE ICIP, 2003, vol. 2, pp. II-949–II-952.
[16] S. J. Daly and R. R. K. Kovvuri, “Methods and systems for improving display resolution in images using subpixel sampling and visual error filtering,” U.S. Patent 09/735 424, Aug. 19, 2000.
[17] N. X. Lian, L. Chang, Y. P. Tan, and V. Zagorodnov, “Adaptive filtering for color filter array demosaicking,” IEEE Trans. Image Process., vol. 16, no. 10, pp. 2515–2525, Oct. 2007.
[18] J. W. Goodman and Y. Lam, “A mathematical analysis of the DCT coefficient distributions for images,” IEEE Trans. Image Process., vol. 9, no. 10, pp. 1661–1666, Oct. 2000.
[19] L. Fang and O. C. Au, “Novel 2-D MMSE subpixel-based image down-sampling for matrix displays,” in Proc. IEEE ICASSP, 2010, pp. 986–989.
[20] Studio encoding parameters of digital television for standard 4:3 and wide-screen 16:9 aspect ratios, ITU-R BT.601-5, 1995.
[21] T. Ojala, M. Pietikainen, and T. Maenpaa, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 971–987, Jul. 2002.
Lu Fang received the B.S. degree from the University of Science and Technology of China, Hefei, China, in 2007 and the Ph.D. degree from Hong Kong University of Science and Technology (HKUST), Kowloon, Hong Kong, in 2011. She has visited Northwestern University, Evanston, IL, with the support of Prof. A. K. Katsaggelos. She is currently with the Department of Electronic and Computer Engineering, HKUST.
Oscar C. Au received the B.A.Sc. degree from the University of Toronto, Toronto, ON, Canada, in 1986 and the M.A. and Ph.D. degrees from Princeton University, Princeton, NJ, in 1988 and 1991, respectively. After being a Postdoctoral Researcher with Princeton University for one year, he joined the Hong Kong University of Science and Technology (HKUST), Kowloon, Hong Kong, as an Assistant Professor, in 1992. He has been a Professor of the Department of Electronic and Computer Engineering, Director of Multimedia Technology Research Center (MTrec), and Director of the Computer Engineering (CPEG) Program in HKUST. His fast motion estimation algorithms were accepted into the ISO/IEC 14496-7 MPEG-4 international video coding standard and the China AVS-M standard. His lightweight encryption and error resilience algorithms were accepted into the China AVS standard. He has performed forensic investigation and stood as an expert witness in the Hong Kong courts many times. He has published about 320 technical journal and conference papers. He is the holder of eight U.S. patents and more than 60 pending patents on signal-processing techniques. His research interests include video and image coding and processing; watermarking and lightweight encryption; speech and audio processing; fast motion estimation for MPEG-1/2/4, H.261/3/4, and AVS; optimal and fast suboptimal rate control; mode decision; transcoding; denoising; deinterlacing; postprocessing; multiview coding; scalable video coding; distributed video coding; subpixel rendering; JPEG/JPEG2000; high dynamic range imaging; compressive sensing; halftone image data hiding; graphics processing unit processing; and software–hardware co-design.
Dr. Au is a Board of Governors member of the Asia Pacific Signal and Information Processing Association. He has been an Associate Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE TRANSACTIONS ON IMAGE PROCESSING, and IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—PART 1. He is also on the Editorial Boards of the Journal of Signal Processing Systems, Journal of Multimedia, and Journal of Franklin Institute. He has been Chair of the CAS Technical Committee (TC) on Multimedia Systems and Applications; Vice Chair of SP TC on Multimedia Signal Processing; and a member of the CAS TC on Video Signal Processing and Communications, CAS TC on DSP, and SP TC on Image, Video and Multidimensional Signal Processing. He served on the Steering Committee of the IEEE TRANSACTIONS ON MULTIMEDIA and the IEEE International Conference of Multimedia and Expo (ICME). He also served on the organizing committee of the IEEE International Symposium on Circuits and Systems in 1997, IEEE International Conference on Acoustics, Speech and Signal Processing in 2003, the ISO/IEC MPEG 71st Meeting in 2005, International Conference on Image Processing in 2010, and other conferences. He was General Chair of the Pacific-Rim Conference on Multimedia in 2007, IEEE ICME in 2010, and Packet Video Workshop in 2010. He was an IEEE Distinguished Lecturer in 2009 and 2010, and has been a keynote speaker several times. He was the recipient of the Best Paper Awards in the IEEE Workshop on Signal Processing Systems 2007 and the Pacific Rim Conference on Multimedia 2007.
Ketan Tang received the B.S. degree from the University of Science and Technology of China, Hefei, China, in 2009. He is currently working toward the Ph.D. degree in the Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong.
Aggelos K. Katsaggelos received the Diploma degree in electrical and mechanical engineering from Aristotelian University of Thessaloniki, Thessaloniki, Greece, in 1979 and the M.S. and Ph.D. degrees in electrical engineering from Georgia Institute of Technology, Atlanta, in 1981 and 1985, respectively. In 1985, he joined the Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL, where he is currently a Professor and holder of the AT&T Chair. From 1997 to 2003, he was the holder of the Ameritech Chair of Information Technology. He is also the Director of the Motorola Center for Seamless Communications; a member of the Academic Staff, NorthShore University Health System; and an affiliated faculty member with the Department of Linguistics. He has an appointment with the Argonne National Laboratory, Argonne, IL. He has published extensively in the areas of multimedia processing and communications (180 journal papers, more than 400 conference proceeding papers, and 40 book chapters). He is the holder of 19 international patents. He is a coauthor of Rate-Distortion Based Video Compression (Kluwer, 1997), Super-Resolution for Images and Video (Claypool, 2007), and Joint Source-Channel Video Transmission (Claypool, 2007). Dr. Katsaggelos is a Fellow of The International Society for Optical Engineers (2009). He was the Editor-in-Chief of the IEEE SIGNAL PROCESSING MAGAZINE (1997–2002), a Board of Governor Member of the IEEE Signal Processing Society (1999–2001), and a member of the Publication Board of the IEEE Proceedings (2003–2007). He was the recipient of the IEEE Third Millennium Medal (2000), the IEEE Signal Processing Society Meritorious Service Award (2001), the IEEE Signal Processing Society Technical Achievement Award (2010), an IEEE Signal Processing Society Best Paper Award (2001), an IEEE ICME Paper Award (2006), an IEEE ICIP Paper Award (2007), and an ISPA Paper Award (2009). He was also a Distinguished Lecturer of the IEEE Signal Processing Society (2007–2008).