
Efficient Scene Preparation and Downscaling Prior to Stimulation in Retinal Prosthesis

Walid Al-Atabany 1,2 and Patrick Degenaar 2

1 Department of Biomedical Engineering, Helwan University, Egypt. [email protected]
2 School of Electrical, Electronic and Computer Engineering, Newcastle University, Newcastle upon Tyne, UK. [email protected]

Abstract—Retinal prostheses are moving towards returning some functional vision to those with retinitis pigmentosa. Optoelectronic/optogenetic retinal prostheses hold particular promise. As the various techniques are unlikely to return perfect vision in the first instance, we need to explore how best to present the visual scene. The key task is to restore mobility and scene recognition to patients; therefore, some form of reduction of the visual information should be applied before it is transferred to the retina. In particular, scene segmentation can remove unimportant textures and thereby increase the contrast of the key features and objects. In this paper we present a new processing platform that transfers only the important objects, segmented on the basis of their thermal characteristics using mixed visible-infrared imaging. With this new segmentation approach, complex objects remain distinguishable even at low effective resolution.

I. INTRODUCTION

Vision is perhaps our most important sense and its loss can be devastating. Around the world, millions are blighted by conditions such as glaucoma, age-related macular degeneration (AMD), diabetic retinopathy and cataracts. These cause severe visual impairment but are becoming increasingly treatable with pharmaceutical or surgical approaches. Retinitis pigmentosa (RP), a class of hereditary degenerative disorders, is much less treatable. It affects about 1 in 3000 live births [1] and causes dysfunction of the rod photoreceptors, leading to night blindness, tunnel vision and eventual total blindness. There is no effective treatment for most patients with AMD and RP. In the 1990s, Stone et al. [2] discovered that the retinal ganglion cells remain intact in RP patients even after the onset of full blindness. This formed the basis for subsequent work on electrical retinal prostheses targeting retinal ganglion cells, and several research groups have since endeavoured to develop retinal prostheses as a treatment option for these retinal degenerative diseases. The quality of the prosthetic vision is determined by the number of pixels, where they are placed, and their ability to stimulate a signal the brain can understand. Simulations of pixelized vision have suggested that at least several hundred electrodes are needed to perform useful visual functions such as navigation or face recognition [3]. In particular, Margalit et al. [4] suggested that at least 625 pixels are needed for resolving images in a tiny (1.7° or less) central field. However, current visual prosthesis devices contain only a limited number of electrodes, which constrains the visual resolution implant recipients are likely to perceive.


Nevertheless, recent patient trials have demonstrated that intraocular electrical stimulation using retinal implants can return a form of basic vision based on phosphene percepts [5, 6]. We have previously proposed an optogenetic/optoelectronic retinal prosthesis, which we believe can improve both the resolution and the dynamic range (contrast) that could be returned to the patient [7]. This technology is based on the photosensitization of neurons using light-gated ion channels and pumps, which have been successfully expressed in RGCs [8], bipolar cells [9] and photoreceptors [10]. Such an approach requires high-brightness illumination, and we have therefore demonstrated individually addressable LED arrays with 256 individual stimulators [11], with the capacity to scale to 10,000 with larger arrays.

Whichever approach is used, electronic or optogenetic, perfect vision is unlikely to be restored in the near future. More likely, the vision resulting from a higher-resolution retinal prosthesis will have characteristics of both early-stage RP and age-related macular degeneration (i.e. tunnel vision with very poor visual acuity). Previously, we have shown that effective contrast enhancement algorithms such as cartoonization can improve visual recognition in visually impaired but not blind patients with retinal degenerative disorders [12]. For those with tunnel vision, we have also demonstrated a non-linear shrinking approach [13] that spatially compresses the unimportant features of the visual scene, increasing the effective field of view.

To cope with the low resolution of these prosthetic devices, non-trivial image processing techniques are required in order to make maximal use of the limited number of available stimulation points. Most previously published work has focused on representing the scene by its image edges or on enhancing the contrast using histogram equalization. We have previously presented an image processing platform that enhances the visual scene before downscaling and transmitting it to our micro-LED arrays [14, 15]. The platform was based on enhancing the scene contrast by cartoonizing it and non-linearly downscaling it to a smaller size. To maximize the information in the downscaled image, it is necessary to segment it into important and less important regions. Although many image segmentation techniques exist in the literature, we require ones that can be implemented on a real-time, low-power platform (a blind person might not want to recharge batteries every 30 minutes). In this paper we use the information extracted from an infrared camera to assist in segmenting key features of the visual scene based on object temperature, and we explore how this may offer a useful alternative to processor-heavy feature recognition systems.

II. METHODS

The image processing components are described in the flow chart of Figure 1 and consist of two main paths: one processing the input visible (CMOS) image and one processing the input infrared (IR) image.

Figure 1 System flowchart showing the main stages of the segmentation and mixing approach. The visible path converts the input CMOS image to grayscale, removes noise, extracts gradients, dilates the edges and creates the edge-weighted image. The infrared path exponentially scales the input IR image and its negative, segments the IR image, and applies Gaussian smoothing and normalization (0-1). The two paths are combined into a segmented edge-weighted image, which is then downscaled.

A. Scene segmentation

Infrared cameras capture thermal photons from objects acting as black-body radiators. Based on this concept, we are interested in segmenting objects whose temperature differs from the ambient temperature. Infrared cameras use colour palettes to differentiate between hot and cold objects; in a grayscale palette, hot objects are coded with brighter gray levels and cold ones with darker gray levels. To separate both the cold and the hot objects from the surrounding background, we apply an exponential scaling function to the original infrared image and to its negative. Exponential scaling changes the dynamic range of the image, boosting the high-intensity pixel values while decreasing the low-intensity ones; in effect, it stretches the gray levels in the brighter regions of the image at the expense of compressing those in the darker regions:

$$ I_e = \frac{e^{\alpha I} - 1}{e^{\alpha} - 1}, \qquad I_e^{n} = \frac{e^{\alpha (1 - I)} - 1}{e^{\alpha} - 1} \qquad (1) $$

where $I$ is the original infrared image (normalized to $[0,1]$), $\alpha$ is the scaling gain, and $I_e$ and $I_e^{n}$ are the exponentially scaled images of the original and of its negative. The two exponentially scaled images are then added together and the sum is itself exponentially scaled in order to suppress the remaining low-intensity pixel values:

$$ S = \frac{e^{\alpha (I_e + I_e^{n})} - 1}{e^{\alpha} - 1} \qquad (2) $$

The resulting image $S$ contains the segmented cold and hot objects of the original infrared image. Figure 2 shows how the grayscale range of an image is exponentially stretched and compressed for the original image and for its negative. To remove any discontinuity, the segmented image is smoothed by convolving it with a Gaussian filter,

$$ \hat{S} = S * G_{\sigma} \qquad (3) $$

where $G_{\sigma}$ is a Gaussian kernel of standard deviation $\sigma$. The smoothed image $\hat{S}$, normalized between 0 and 1, is used as a decision map to fuse the information between the grayscale input image and the edge-weighted image.

Figure 2 Exponential stretching and compression of the grayscale dynamic range for the original image (Exp. Scaled IR Image) and its negative (Exp. Scaled Negative-IR Image) in (A). In (B), the sum of the two images is exponentially scaled (Exp. Modified Grayscale Image).
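As a concrete illustration, the following Python sketch implements Eqs. (1)-(3) as reconstructed above. The exponential scaling form, the gain alpha, and all function and parameter names are our assumptions for illustration, not definitions from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def exp_scale(img, alpha=4.0):
    # Assumed form of the exponential scaling in Eq. (1): stretches the
    # bright gray levels at the expense of compressing the dark ones.
    return (np.exp(alpha * img) - 1.0) / (np.exp(alpha) - 1.0)

def segment_ir(ir, alpha=4.0, sigma=3.0):
    # ir: IR frame normalized to [0, 1] (bright = hot, dark = cold).
    hot = exp_scale(ir, alpha)            # boosts hot objects
    cold = exp_scale(1.0 - ir, alpha)     # boosts cold objects (negative image)
    s = exp_scale(hot + cold, alpha)      # Eq. (2): suppress remaining lows
    s_hat = gaussian_filter(s, sigma)     # Eq. (3): remove discontinuities
    s_hat -= s_hat.min()                  # normalize the decision map to [0, 1]
    return s_hat / max(s_hat.max(), 1e-12)
```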


B. Creating an edge-weighted image

At this stage, all retinal prosthesis devices stimulate the retina with grayscale information only, so the visible colour image is first converted into a grayscale image. In previous work we showed that simplifying the visual scene is a crucial step in suppressing its irrelevant features while preserving the relevant ones [16]. We use anisotropic smoothing, as it better maintains the high-frequency content of important features while suppressing low-frequency textures; it therefore eliminates noise and low-importance textures while avoiding smoothing across object boundaries. It is an iterative process that progressively smooths the image while maintaining the significant boundaries by reducing the diffusivity at locations with a larger likelihood of being edges [17]. The discrete equation of the anisotropic diffusion filter is

$$ I^{t+1} = I^{t} + \lambda \left[ c_N \nabla_N I + c_S \nabla_S I + c_E \nabla_E I + c_W \nabla_W I \right]^{t} \qquad (4) $$

where $\nabla$ represents the gradient operator, $c$ is the diffusion coefficient (the diffusivity of the equation), $t$ denotes the iteration number, $\lambda$ is the time step (it controls the accuracy and the speed of the smoothing), and $\nabla_N I$, $\nabla_S I$, $\nabla_E I$ and $\nabla_W I$ represent the gradients of the image in the four directions. Although there are different methods to calculate the gradients [18], we use the Sobel operator for its simplicity of implementation and its low processing time. The diffusion coefficient is then calculated from the following equation:

$$ c = \frac{1}{\sqrt{1 + \left( \| \nabla I \| / k \right)^{2}}} \qquad (5) $$

where $k$ is a contrast parameter. The gradient image is calculated as

$$ \| \nabla I \| = \sqrt{G_x^{2} + G_y^{2}} \qquad (6) $$

where $G_x$ and $G_y$ are the horizontal and vertical Sobel responses.
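A minimal Python sketch of the diffusion iteration of Eqs. (4)-(6) follows. For brevity it uses nearest-neighbour differences with periodic boundaries rather than the paper's Sobel gradients, and the diffusivity is the form assumed in Eq. (5); names are ours.

```python
import numpy as np

def anisotropic_diffusion(img, iterations=15, lam=0.2, k=0.1):
    # img: grayscale image in [0, 1]; lam: time step (lambda in Eq. 4).
    out = img.astype(np.float64).copy()
    for _ in range(iterations):
        # Differences towards the four neighbours (periodic boundaries).
        dN = np.roll(out, -1, axis=0) - out
        dS = np.roll(out, 1, axis=0) - out
        dE = np.roll(out, -1, axis=1) - out
        dW = np.roll(out, 1, axis=1) - out
        # Eq. (5): diffusivity shrinks where the local gradient is large,
        # so significant boundaries are preserved.
        cN = 1.0 / np.sqrt(1.0 + (dN / k) ** 2)
        cS = 1.0 / np.sqrt(1.0 + (dS / k) ** 2)
        cE = 1.0 / np.sqrt(1.0 + (dE / k) ** 2)
        cW = 1.0 / np.sqrt(1.0 + (dW / k) ** 2)
        # Eq. (4): one explicit diffusion step.
        out += lam * (cN * dN + cS * dS + cE * dE + cW * dW)
    return out
```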

In practice, patients would be able to adjust the smoothness or simplification level by increasing or decreasing the number of iterations, e.g. by turning an analog knob. To increase the edge thickness and remove any discontinuity in the gradient image, we convolve it with a Gaussian filter, as described in Eq. (3), and normalize it between 0 and 1. We then define two threshold values: all pixels of the normalized gradient image below the lower threshold are set to 0, and all pixels above the upper threshold are set to 1, according to how dense the required edges are. We further define a threshold value $K$ below which all pixels are raised to $K$. The value of $K$, which is controlled by the patient, determines how much of the background information of the image is preserved. The gradient image has now become a weighting matrix $W$ that determines how much detail of the visible image is retained while increasing the brightness of the relevant edges. The edge-weighted image is then

$$ I_{EW} = W \cdot I_{g} \qquad (7) $$

where $I_{g}$ is the simplified grayscale image and the product is pixel-wise.
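The sketch below assembles the weighting matrix and the edge-weighted image of Eq. (7). The labels t_low, t_high and k_floor (standing in for the paper's unnamed thresholds and K) are hypothetical names of ours.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def edge_weight_image(gray, t_low=0.1, t_high=0.4, k_floor=0.2, sigma=2.0):
    # Eq. (6): Sobel gradient magnitude of the simplified grayscale image.
    grad = np.hypot(sobel(gray, axis=1), sobel(gray, axis=0))
    # Thicken edges and remove discontinuities, then normalize to [0, 1].
    w = gaussian_filter(grad, sigma)
    w = (w - w.min()) / (w.max() - w.min() + 1e-12)
    w[w > t_high] = 1.0         # strong edges get full weight
    w[w < t_low] = 0.0          # weak responses are discarded
    w = np.maximum(w, k_floor)  # K: floor that preserves some background
    return w * gray             # Eq. (7): pixel-wise edge weighting
```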

The infrared segmented image is used to create weighted decision regions, through which a linear combination of the pixels in the visible image and in the edge-weighted visible image generates the corresponding pixels of the fused image:

$$ I_{F} = \hat{S} \cdot I_{g} + \left( 1 - \hat{S} \right) \cdot I_{EW} \qquad (8) $$
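Putting the pieces together, a hypothetical end-to-end use of the sketches above might look like the following; it assumes the visible and IR frames are already registered and normalized to [0, 1], which our lab system achieves optically plus by software registration.

```python
import numpy as np

def fuse(gray, edge_weighted, s_hat):
    # Eq. (8): the decision map selects full grayscale detail inside the
    # segmented hot/cold objects and the edge-weighted image elsewhere.
    return s_hat * gray + (1.0 - s_hat) * edge_weighted

# Hypothetical pipeline (all names are ours):
# s_hat = segment_ir(ir_frame)                    # Section II-A, Eqs. (1)-(3)
# simplified = anisotropic_diffusion(gray_frame)  # Eq. (4)
# fused = fuse(simplified, edge_weight_image(simplified), s_hat)
```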

Figure 3 The IR scene segmentation pathway. (A) An IR image showing a person with a body temperature higher than the surroundings and a cold cup on a table. The image is segmented into hot and cold objects in (B) and (C), respectively. (D) The combined hot and cold segmented image.

As discussed previously [15], scene simplification has a great impact on suppressing irrelevant texture when the scene is downscaled. To demonstrate this effect, the spatial derivative was calculated for a 512×512 image and for a simplified version of it, and the derivative images were scaled to 32×32 and 16×16 pixels to show the effect across resolutions. As shown in the bottom row of Figure 4, scene simplification removed low-importance textures while keeping the relevant boundaries intact. This scene simplification step has a striking effect for low-spatial-resolution electronic prosthesis systems: by simplifying the scene, the key features, in this case a deer on a low-contrast background, are enhanced. The effect is scalable across resolutions and is also striking for spatiotemporal processing (data not shown).
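For reference, a simple block-average downscaler of the kind assumed in these resolution comparisons is sketched below. This is only an illustrative stand-in: the paper's platform also supports non-linear, content-aware retargeting [13], which this sketch does not attempt.

```python
import numpy as np

def downscale(img, out_h=32, out_w=32):
    # Average each block of pixels onto one stimulator site (e.g. 32x32).
    h, w = img.shape
    ys = np.arange(out_h + 1) * h // out_h  # row block boundaries
    xs = np.arange(out_w + 1) * w // out_w  # column block boundaries
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = img[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean()
    return out
```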

III. RESULTS AND DISCUSSION

Fusing information extracted from infrared and visible imagers has been explored previously [19, 20]. In this paper, however, we use the infrared image to segment its corresponding image in the visible range. The infrared camera used in this work is the Optris PI160 imager, with a resolution of 160×120 pixels; the visible camera is the mvBlueFOX-220AC, with a resolution of 648×488 pixels. The two cameras are optically aligned to receive the same scene by dividing the incoming light into two beams using an IR/visible plate beam splitter designed for this work, which gives 85% visible transmission (420 to 700 nm) and more than 80% reflection (8 to 12 microns) at 45 degrees. As our lab-built system was not perfectly opto-mechanically aligned, we additionally used a software image registration technique, the details of which are beyond the scope of this paper. Figure 3 shows a captured infrared image that includes a person standing in a lab, with a body temperature higher than the surrounding environment, and a cold cup on a table. The hot and cold objects in the image are extracted by exponentially scaling the dynamic range of the original image and of its negative, so that brighter objects are boosted to higher intensity values while darker objects are suppressed. Adding the two scaled images together forms the segmented image shown in Figure 3 (D).


Figure 4 The effect of scene simplification in reducing irrelevant information while keeping only the relevant objects. Top row, from left to right: a 512×512 scene of a deer on a low-contrast background, its spatial derivative image at the same resolution, and the derivative scaled to 32×32 and 16×16 pixels, respectively. The bottom row shows the same for the simplified and edge-enhanced scene.

Figure 5 The segmented visible image pathway. (A) The visible image corresponding to the IR image of Figure 3, and (B) its edge-weighted image. (C) The visible image segmented with the assistance of the IR image. (D) Image (A) downscaled to 32×32 pixels. (E) and (F) show how (A) and (C), respectively, may look to someone with a 32×32-stimulator retinal prosthesis.

The process of creating the edge-weighted image and the segmentation pathway is shown in Figure 5. The level of object detail in the edge-weighted image of Figure 5 (B) can be controlled through the value of K mentioned previously. Both the original visible image and the visible segmented image (Figure 5 (C)) were downscaled to 32×32 pixels. At this low resolution, the details of the foreground objects fuse with the background of the image, as shown in Figure 5 (D). However, downscaling the segmented image combined with the edge-weighted image enhances the perception of the important objects in the scene (Figure 5 (F)).

Our hypothesis is that using a low number of electrodes to stimulate the residual cells in the retina creates blurry vision, whereas during navigation patients urgently need assistance in detecting moving objects, ideally without any delay or blurring. The edge-weighted image aims to increase the contrast between objects by highlighting the edges of moving objects and the edges between distinct objects while suppressing the other, homogeneous pixels in the scene. In addition, adding the segmented objects to this edge-weighted image lets the patients continue to see the important details.


Figure 6 The preference at higher resolutions. The two left images, (A) and (B), show how the image in Figure 5 (A) may look to someone with a 64×64 and a 128×128 stimulator retinal prosthesis, respectively. The right images, (C) and (D), show the same for the visibly segmented image of Figure 5 (C).


Going higher in electrode or micro-array resolution, as in the optogenetic retinal prosthesis approach [7], means more information can be perceived, as shown in Figure 6. Although it will be up to the patient's preference to switch between the original and the segmented images, the visible edge-weighted segmented image is still more distinguishable than the original image when downscaled to 64×64. Going up to 128×128 pixels, the original image contains more information, especially in the background, but is still blurry.


IV. CONCLUSIONS

We have shown that segmenting the visual scene into important and less important regions is an essential step for optoelectronic retinal prostheses with a low number of stimulating electrodes. Using the infrared camera sped up this segmentation process. The new approach can be implemented on portable processing devices such as mobile GPU systems, though a dedicated ASIC implementation is also a possibility.

V. ACKNOWLEDGEMENTS

We would like to thank The Royal Society, The BRC, the Engineering and Physical Sciences Research Council (F029241) and the European Research Council (FP7 249867) for funding.



VI. REFERENCES

[1] C. Hamel, "Retinitis pigmentosa," Orphanet Journal of Rare Diseases, vol. 1, Oct 2006.
[2] J. L. Stone et al., "Morphometric analysis of macular photoreceptors and ganglion cells in retinas with retinitis pigmentosa," Archives of Ophthalmology, vol. 110, pp. 1634-1639, Nov 1992.
[3] G. J. Chader et al., "Artificial vision: needs, functioning, and testing of a retinal electronic prosthesis," in Progress in Brain Research, vol. 175, Elsevier, 2009, pp. 317-332.
[4] E. Margalit et al., "Retinal prosthesis for the blind," Survey of Ophthalmology, vol. 47, pp. 335-356, Jul-Aug 2002.
[5] M. S. Humayun et al., "Preliminary 6 month results from the Argus II epiretinal prosthesis feasibility study," in Proc. 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2009, pp. 4566-4568.
[6] E. Zrenner et al., "Subretinal electronic chips allow blind patients to read letters and combine them to words," Proceedings of the Royal Society B: Biological Sciences, vol. 278, pp. 1489-1497, May 2011.
[7] P. Degenaar et al., "Optobionic vision--a new genetically enhanced light on retinal prosthesis," Journal of Neural Engineering, vol. 6, p. 035007, Jun 2009.
[8] A. D. Bi et al., "Ectopic expression of a microbial-type rhodopsin restores visual responses in mice with photoreceptor degeneration," Neuron, vol. 50, pp. 23-33, Apr 2006.
[9] P. S. Lagali et al., "Light-activated channels targeted to ON bipolar cells restore visual function in retinal degeneration," Nature Neuroscience, vol. 11, pp. 667-675, Jun 2008.
[10] V. Busskamp et al., "Genetic reactivation of cone photoreceptors restores visual responses in retinitis pigmentosa," Science, vol. 329, pp. 413-417, Jul 2010.
[11] P. Degenaar et al., "Individually addressable optoelectronic arrays for optogenetic neural stimulation," in Proc. IEEE Biomedical Circuits and Systems Conference (BioCAS), 2010, pp. 170-173.
[12] W. Al-Atabany et al., "Designing and testing scene enhancement algorithms for patients with retina degenerative disorders," BioMedical Engineering OnLine, vol. 9, p. 27, 2010.
[13] W. Al-Atabany et al., "Improved content aware scene retargeting for retinitis pigmentosa patients," BioMedical Engineering OnLine, vol. 9, p. 52, 2010.
[14] W. Al-Atabany and P. Degenaar, "Scene optimization for optogenetic retinal prosthesis," in Proc. IEEE Biomedical Circuits and Systems Conference (BioCAS), 2011, pp. 432-435.
[15] W. Al-Atabany et al., "A processing platform for optoelectronic/optogenetic retinal prosthesis," IEEE Transactions on Biomedical Engineering, Nov 2011.
[16] W. Atabany and P. Degenaar, "A robust edge enhancement approach for low vision patients using scene simplification," in Proc. Cairo International Biomedical Engineering Conference (CIBEC), 2008, pp. 1-4.
[17] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, pp. 629-639, Jul 1990.
[18] M. Sharifi et al., "A classified and comparative study of edge detection algorithms," in Proc. International Conference on Information Technology: Coding and Computing, 2002, pp. 117-120.
[19] S. Singh et al., "Infrared and visible image fusion for face recognition," Biometric Technology for Human Identification, vol. 5404, pp. 585-596, 2004.
[20] J. M. Kriesel and N. Gat, "True-color night vision (TCNV) fusion system using a VNIR EMCCD and a LWIR microbolometer camera," Signal Processing, Sensor Fusion, and Target Recognition XIX, vol. 7697, 2010.
