Hyper-Spectral Content Aware Resizing

Jesse Scott1, Richard Tutwiler2, Michael Pusateri1

1 Electronic and Computer Services, Pennsylvania State University, 149 Hammond Building, University Park, PA 16802

2 Applied Research Laboratory, Pennsylvania State University, 450 Science Park Road, University Park, PA 16802

Abstract- Image resizing is performed for many reasons in image processing. Often, it is done to reduce or enlarge an image for display. It is also done to reduce the bandwidth needed to transmit an image. Most image resizing algorithms work based on principles of spatial or spatial frequency interpolation. One drawback to these algorithms is that they are not image content aware and can fail to preserve relevant features in an image, especially during size reduction. Recently, a content aware image resizing algorithm, called seam carving, was developed. In this paper we discuss an extension of the seam carving algorithm to hyper-spectral imagery. For a hyper-spectral image with an MxN field of view and with P spectral layers, our algorithm identifies a one pixel wide path through the image field of view containing a minimum of information and then removes it. This process is repeated until the image size is reduced to the desired dimension. Information content is assessed using normalized spatial power metrics. Several such metrics have been tested with varying results. For a given resizing, the carved hyper-spectral image has the minimum reduction in information as measured by the energy metrics used to quantify it. We present the results of seam carving applied to three imagery sets: three spectra RGB imagery from a standard still camera, two spectra imagery generated synthetically, and three spectra imagery captured with VNIR, SWIR, and LWIR cameras.

I. INTRODUCTION

Spatial image resizing is a fundamental problem with many viable solutions. The goal of resizing is to change the dimension of an image to fit a specific resolution or display while retaining as much information in the image as possible. Single image resizing has been the focus of scholarly works, such as [1][2][3][4][5][6], that attempt to adjust an image for different sizes. A variety of approaches have been attempted to improve the performance of image resizing.

The simplest and most ubiquitous version of spatial resizing is the scaling method, but it has the drawback of detail loss during spatial interpolation [9]. Cropping is often used to reduce images for small displays like those on many mobile devices, but the main issue with cropping is the inability to retain all pertinent information if the crop size is not big enough [9]. Other resizing methods are more promising, but also have some limitations. The saliency based methods [7][8] attempt to extract the important information from the scene, but they fail when saliency does not indicate the most important parts of the image.

A promising resizing method is a content aware image resizing algorithm called seam carving [1]. Seam carving has been implemented for visible band grayscale images and has also been extended to color imagery. Both processes use a grayscale version of the image to generate the seams for carving and then carve the seams from all layers uniformly. We propose an extension to multi- and hyper-spectral imagery that generates the algorithm’s spatial power metric based upon all the spectral layers of the image; our objective is to extend the content awareness of the algorithm to encompass the various modalities available within a multi-spectral image.

II. BACKGROUND

Many methods have been presented in the literature for single image resizing including cropping, scaling, and various resizing techniques [3][4][5][6][7][8]. When selecting a method, a core requirement is preserving the pixel correlation between layers. If a method reduces the size of each layer of the image differently, the spatial correlation between layers is lost. The subsections A through C and figure 1 present two fundamental methods along with the seam carving approach.

A. Cropping

Cropping is a simple and common method for resizing images. Cropping selects a window of the image for preservation and removes any pixels outside the window. However, the cropping method often sacrifices some important regions in order to retain other important portions of the image, as illustrated in figure 1.
There are automated cropping techniques [7][8][10] that retain specified regions, such as faces, but they require significant preprocessing to select the regions of interest (ROI). A secondary result is discontinuity within the image, which distracts the person viewing the image and makes post processing more difficult.

Figure 1. An example to illustrate resizing techniques [1]. From left to right: raw original imagery, reduction through cropping, reduction through interpolation, and reduction through seam carving.

B. Interpolation

Both spatial and spatial frequency interpolation result in the loss of high spatial frequency detail from the image. This process can even eliminate small, but important, objects. Additionally, interpolation results in image warping when applied with different reduction ratios in each dimension, making it unsuitable for changing the image’s aspect ratio.

C. Seam Carving

Content aware image resizing by seam carving [1] reduces the resolution of an image by removing data from the image that has minimal impact on the image content. Seam carving begins by using the image to generate a content energy map; the energy function used determines the content that will be preserved in the image. The energy map is spatially integrated in either the horizontal or vertical dimension to produce the spatial power map in that direction. The minimal power path through the map is selected by a dynamic programming algorithm; this path becomes the seam removed from the image and energy map. The direction of integration of the updated energy map is alternated to produce paths in both directions; the process is continued until the desired dimensionality is reached.

D. Multi-spectral Data

The purpose of any imagery is to collect information about a scene. The added benefit of multi-spectral imagery is that each band of imagery can provide unique scene information not available in the remaining bands. These data sets are co-located imagery sensed over a broad set of wavelength ranges. They can be acquired through various techniques including multiple filters over a single sensor, an optically aligned sensor suite, and multi-camera imagery (e.g., VNIR, SWIR, and LWIR cameras) that is aligned via post processing.
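The integration and path selection steps of subsection C can be illustrated with a minimal sketch. It assumes the standard dynamic programming recurrence of seam carving [1], in which each pixel's power is its energy plus the smallest power among its three neighbors in the previous row. The function names and the toy energy map are hypothetical, and this is not necessarily how the implementation in this paper was written.

```python
def power_map(energy):
    """Integrate energy top to bottom: each cell holds the cheapest
    cumulative energy of any 8-connected path reaching it from row 0."""
    rows, cols = len(energy), len(energy[0])
    power = [list(energy[0])]
    for r in range(1, rows):
        prev = power[r - 1]
        row = []
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            row.append(energy[r][c] + min(prev[lo:hi + 1]))
        power.append(row)
    return power


def min_vertical_seam(energy):
    """Return one column index per row tracing the minimal power path."""
    power = power_map(energy)
    rows, cols = len(power), len(power[0])
    # Start at the cheapest cell of the last row and backtrack upward.
    c = min(range(cols), key=lambda j: power[-1][j])
    seam = [c]
    for r in range(rows - 2, -1, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: power[r][j])
        seam.append(c)
    seam.reverse()
    return seam


def remove_vertical_seam(grid, seam):
    """Delete one pixel per row, shrinking the width by one."""
    return [row[:c] + row[c + 1:] for row, c in zip(grid, seam)]


# Hypothetical 3x4 energy map with a low-energy path through columns 1, 1, 2.
energy = [
    [5, 1, 9, 9],
    [9, 1, 9, 9],
    [9, 9, 1, 9],
]
seam = min_vertical_seam(energy)
carved = remove_vertical_seam(energy, seam)
```

Horizontal seams follow from the same code applied to the transposed map, which is one way to alternate the direction of integration.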

III. PROPOSED EXTENSION

Seam carving has been shown in [1] and [2] to perform well in forward looking scenes, to provide deterministic behavior, and to introduce minimal artifacts. The visual artifacts are generally minor discontinuities in scene content that are the result of seam carving through continuous, uniform portions of the image. This paper proposes to extend the seam carving method by applying it to multilayer data including multi-spectral data sets.

This paper investigates two aspects of extending the seam carving method to a wide range of wavelengths. Generating energy maps for each layer of the imagery set and a strategy to merge that energy are the focus points of the processing algorithm. The energy of a set is determined by filtering the individual layers of the set with a kernel and then merging these individual energy layers into one energy map. The energy map is then utilized by a dynamic programming algorithm to generate a spatial power map that we term the power image; it is the directional spatial integration of the energy image. The method is deterministic, but results vary widely based upon the energy function and the merge function utilized. The resulting imagery may contain artifacts, but these discontinuities are not distracting until significant data reduction occurs.

A. Energy Detection

The energy detection occurs in each layer of the imagery. The detection process has a simple requirement of one to one pixel mapping from input to output. The operation used in the conversion is the sole controlling factor in extracting the unique properties of the layer. The current energy functions used are kernel based, but any method that meets the one to one mapping requirement can be applied.

The properties of the function are vital to multi-spectral seam carving. Each layer of the imagery set contains a subset of the total scene information. To extract this information, our function needs to address two types of information. The first type is information shared between layers that must be accounted for, but should not be duplicated. The second type is unique information existing only in this spectral layer. Extracting this information is vital to information retention with data reduction.

Figure 2. Three bands of a color image for rose. The organization left to right is red, green, and blue.

Figure 3. Sobel based energy of rose color bands. The organization left to right is red, green, and blue energies.

The function selection process is currently a manual, subjective operation performed offline by the user. The kernels that we have implemented and tested are: Sobel, Prewitt, Laplacian, LoG, and impulse. The Sobel, Prewitt, Laplacian, and LoG kernels are edge detectors and their application requires a normalization operation. We suspect that other edge detection kernels would provide similar operation. The impulse kernel is a simple gain weighting on each pixel in the image; we also allow a layer to be given a weight of zero in the energy map.

The application of any specific kernel is subjective, but high pass filters provide good results in the visible spectrum while the thermal spectrum seems to perform better with impulse kernels. Shortwave infrared imagery performs well using high pass filters with small pass bands. Image noise affects the results of any kernel convolution, and the noise is often passed directly to the energy image.

The kernel for extracting the unique scene information varies from wavelength to wavelength. Figures 2 and 3 show the effects of a Sobel filter on each of the color bands for the flower image. Each wavelength range presents its own unique and common scene information. The red channel’s unique information is the edge transition from the background green leaves to the foreground pink rose petals. The green channel’s unique information is the distinction between the individual petals as well as the texture of the petals and foreground leaves.
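As a rough illustration of kernel based energy detection on a single layer, the sketch below applies the two standard 3x3 Sobel kernels and an impulse (gain) kernel. Several details are assumptions made only for this example, since the paper does not specify them: the gradient magnitude is taken as |gx| + |gy|, edge pixels are replicated at the border, and the normalization divides by the peak value. All function names are hypothetical.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter3x3(layer, kernel):
    """Correlate a 2D layer with a 3x3 kernel, replicating edge pixels,
    so the output has exactly one value per input pixel."""
    rows, cols = len(layer), len(layer[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    acc += kernel[dr + 1][dc + 1] * layer[rr][cc]
            out[r][c] = acc
    return out


def normalize(energy):
    """Scale to [0, 1] by the peak value (assumed normalization)."""
    peak = max(max(row) for row in energy)
    if peak == 0:
        return [[0.0] * len(row) for row in energy]
    return [[v / peak for v in row] for row in energy]


def sobel_energy(layer):
    """Edge energy of one layer: |gx| + |gy|, then normalized."""
    gx = filter3x3(layer, SOBEL_X)
    gy = filter3x3(layer, SOBEL_Y)
    mag = [[abs(a) + abs(b) for a, b in zip(ra, rb)]
           for ra, rb in zip(gx, gy)]
    return normalize(mag)


def impulse_energy(layer, gain=1.0):
    """Impulse kernel: a plain per-pixel gain; gain=0 drops the layer."""
    return [[gain * v for v in row] for row in layer]


# Hypothetical layer with a vertical step edge between columns 1 and 2.
layer = [[0, 0, 10, 10] for _ in range(3)]
edges = sobel_energy(layer)
```

Both functions return an image of the same size as the input, which is the one to one mapping the detection step requires.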

Figure 4. Results of median energy fusion. Top is raw color image and bottom is merged energy image.

Figure 5. Example of synthetic imagery seam carving. Columns left to right represent original, energy, and resized images. Rows top to bottom represent visible band and thermal band imagery.

Figure 6. Intermediate steps of the seam carving algorithm. The left image is the fused energy and the right is the horizontally integrated power image.

The blue channel’s unique information is a combination of background/foreground transition and foreground details. The common information is indicated within the background noise generated from the green leaves.

B. Energy Fusion

Once a set of energy images has been generated, the layers must be fused to incorporate both the unique and common information from each layer. The unique and common information must be weighted to avoid having the composite energy image dominated by common information.
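The fusion step can be sketched as a per-pixel reduction: the energies of all layers at one pixel form a vector, and a reducer maps that vector to a scalar. The sketch offers mean, median, max, and sum reducers. The `fuse_energy` name and the optional per-layer `weights` argument are assumptions for illustration and are not taken from the paper's implementation.

```python
from statistics import mean, median

# Each reducer maps a vector of per-layer energies to one scalar.
REDUCERS = {"mean": mean, "median": median, "max": max, "sum": sum}


def fuse_energy(layers, method="sum", weights=None):
    """Fuse equally sized energy layers into a single energy map."""
    if weights is None:
        weights = [1.0] * len(layers)  # equal weighting by default
    reducer = REDUCERS[method]
    rows, cols = len(layers[0]), len(layers[0][0])
    fused = []
    for r in range(rows):
        fused_row = []
        for c in range(cols):
            vector = [w * layer[r][c] for w, layer in zip(weights, layers)]
            fused_row.append(reducer(vector))
        fused.append(fused_row)
    return fused


# Three hypothetical 1x2 energy layers.
a = [[0.0, 1.0]]
b = [[0.5, 0.0]]
c = [[1.0, 0.25]]
fused_sum = fuse_energy([a, b, c], method="sum")
fused_median = fuse_energy([a, b, c], method="median")
```

Passing a weight of zero for a layer removes its contribution under the sum reducer, which is one simple way to emphasize some layers over others.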

The methods for merging are currently defined as mean, median, max, and sum. These are not the only methods that can be applied. The only constraint on the process is that it must take a vector input generated from all the layers and produce a scalar output value. All the approaches tested weight each value in the input set equally; however, this is not a requirement, and unequal weighting may be advantageous when determining the best fusion method. Weighting the layers allows more importance to be placed on a single layer relative to the other layers. Figure 4 illustrates the results of the median fusion method as it correlates to the original input image. The fusion result is mapped to a color image to help illuminate the high and low energy portions of the composite energy image.

Figure 7. Multi-spectral three band seam carving example. The rows from top to bottom are visible near infrared, long wave infrared, and short wave infrared. The columns from left to right are original raw, energy, and resized image.

IV. RESULTS

The results of this algorithm extension are further illustrated in the remaining two examples: synthetic two band imagery and multi-spectral three band imagery. The results illustrate the unique scene information integrated in each of the layers.

A. Synthetic, two bands

In figure 5, the raw visible and thermal band images are generated by the DIRSIG [11] simulation tools for a VNIR wavelength range of 400 to 700 nm and an LWIR wavelength range of 8 to 12 µm. This pair was run through our algorithm using the Prewitt kernel for the visible band and the LoG kernel for the thermal band. These bands were fused via a sum operation. Once the energy was fused, 150 vertical seams were calculated and then carved from the imagery set. The process reduces the total pixel count of both the visible band and the thermal band by 29% while maintaining spatial pixel correlation between them.

The fused energy image, generated via the sum operation, is shown in the left image of figure 6. The layers of the energy image used to make the fused energy are in the center column of figure 5. The 150 seams were calculated using the power image shown in the right of figure 6. Image power is generated from the fused energy by integrating across the image; the right image of figure 6 is the horizontal integration. Both the fused energy and power images are shown in a “jet” color map to illuminate the differential intensities.

The energy images can be inspected to see the unique information in each layer. In this case, the thermal layer can detect the individual in the tree, but the visible layer can see through the car windows. The combination of these two unique channels allows both situations to be considered while seam carving.
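Spatial pixel correlation between layers is maintained because the seam computed from the fused energy is removed from every layer at identical pixel positions. The sketch below illustrates this together with the resulting pixel count reduction. The function names are hypothetical, and the 480x512 layer size in the usage lines is purely an assumption chosen for illustration, since the image dimensions are not stated here.

```python
def remove_vertical_seam(layer, seam):
    """Delete one pixel per row from a single layer."""
    return [row[:c] + row[c + 1:] for row, c in zip(layer, seam)]


def carve_all_layers(layers, seam):
    """Remove the same seam from every spectral layer so that pixel
    (r, c) still refers to the same scene point in all layers."""
    return [remove_vertical_seam(layer, seam) for layer in layers]


def pixel_reduction(rows, cols, rows_removed=0, cols_removed=0):
    """Fraction of pixels removed after carving the given seam counts."""
    kept = (rows - rows_removed) * (cols - cols_removed)
    return 1.0 - kept / (rows * cols)


# Two hypothetical 2x3 layers carved with one shared seam.
visible = [[1, 2, 3], [4, 5, 6]]
thermal = [[7, 8, 9], [10, 11, 12]]
carved_visible, carved_thermal = carve_all_layers([visible, thermal], [1, 2])

# Assumed 480x512 layer: removing 150 vertical seams drops about 29% of pixels.
fraction = pixel_reduction(480, 512, cols_removed=150)
```

For vertical seams alone the fraction depends only on the width, since each seam removes exactly one pixel from every row.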

B. Multi-spectral, three bands

The following example presents the multi-spectral seam carving method applied to a real three band example. The imagery presented in figure 7 contains a visible near infrared layer, a long wave infrared layer, and a short wave infrared layer. These layers were processed using the Prewitt, Laplacian of Gaussian (LoG), and Sobel operators respectively. The resulting energy images are shown adjacent to the raw inputs. The three energy images are fused using the average method and then integrated, with the results shown in figure 8. The three images resized using the information in the power image of figure 8 are shown in the third column of figure 7. Observe that the VNIR layer provides information about the ground in the scene, the SWIR layer contains most of the tree line information, and the thermal layer provides information at the sky line and of the people in the center of the scene. In this example, the original image set is reduced by 47% with a reduction of 150 rows and 150 columns.

Figure 8. The intermediate steps in the multi-spectral three band seam carving example. The top image is the fused energy and the bottom image is the horizontal power image.

V. FUTURE WORK

This work is a first step toward content aware multi-spectral image resizing by seam carving. We have investigated the basic operation required for application of seam carving to multi-layer imagery. The options considered here for both energy calculation and merging are not the only possibilities.

An objective method is needed for determining the proper kernel for a specific layer. Currently, kernel selection is done by the operator and is a subjective process. Determining the kernel based on the wavelength range of the imagery is a vital requirement to automate the algorithm.

A second aspect to investigate is improving the seam carving algorithm by implementing a process similar to the graph cut process illustrated in [2], allowing the merge process to be incorporated in the carving process by making seam carving a 3D process. This should improve visual performance while including dimensionality reduction in the carving process and eliminating the merge process, including the approximations resulting from it.

VI. CONCLUSIONS

We have investigated a method for reducing multi-layer images according to the image content. This method takes into consideration the unique information from each layer of the imagery based on the wavelength of each layer. The method combines the information from each layer to accommodate the unique scene content. This allows the imagery set to be reduced by removing the least important information while considering all information about the scene.

The methods used to determine the energy of a layer are vital to incorporating that layer’s unique scene information. These are kernel based methods that map the image layer to an energy image. The kernels currently need to be selected subjectively based on the image type and wavelength range. The kernel must extract scene information unique to this range and must also consider the effects of introducing non-unique information.

The results of our method show significant reduction in imagery data size with little reduction in information. The method considers the unique scene information of each layer and incorporates it into the seam carving process. The resulting imagery maintains portions of the imagery set corresponding to objects anywhere in the entire set. This method is also applicable to standard visible band imagery.

VII. ACKNOWLEDGEMENTS

The authors would like to thank Allen Waxman for providing the real three band imagery and Robert Collins for his input at the beginning of this work.

VIII. REFERENCES

[1] S. Avidan and A. Shamir, “Seam Carving for Content-Aware Image Resizing,” ACM Transactions on Graphics, vol. 26, no. 3, article 10, July 2007.
[2] M. Rubinstein, A. Shamir and S. Avidan, “Improved Seam Carving for Video Resizing,” ACM Transactions on Graphics, vol. 27, no. 3, article 16, August 2008.
[3] V. Setlur, S. Takagi, R. Raskar, M. Gleicher and B. Gooch, “Automatic image retargeting,” Proceedings of the 4th International Conference on Mobile and Ubiquitous Multimedia (MUM '05), Christchurch, New Zealand, vol. 154, ACM, New York, NY, pp. 59–68, December 2005.
[4] H. Liu, X. Xie, W. Ma and H. Zhang, “Automatic browsing of large pictures on mobile devices,” Proceedings of the eleventh ACM international conference on Multimedia, pp. 148–155, 2003.
[5] F. Liu and M. Gleicher, “Automatic Image Retargeting with Fisheye-View Warping,” ACM Symposium on User Interface Software and Technology, pp. 153–162, 2005.
[6] F. Liu and M. Gleicher, “Video Retargeting: Automating Pan and Scan,” ACM international conference on Multimedia, pp. 241–250, 2006.
[7] B. Suh, H. Ling, B. B. Bederson and D. W. Jacobs, “Automatic thumbnail cropping and its effectiveness,” Proceedings of the 16th annual ACM symposium on User interface software and technology, ACM Press, New York, NY, pp. 95–104, 2003.
[8] L. Itti, C. Koch and E. Niebur, “A model of saliency based visual attention for rapid scene analysis,” PAMI, vol. 20, no. 11, pp. 1254–1259, 1998.
[9] R. C. Gonzalez and R. E. Woods, “Digital Image Processing, Third Edition,” Prentice Hall, 2008.
[10] P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” Conference on Computer Vision and Pattern Recognition, 2001.
[11] Digital Imaging and Remote Sensing Image Generation (DIRSIG) software. [Online]. Available from: http://dirsig.cis.rit.edu/