Master in 3D Multimedia Technology
A study of image quality assessment and color image reconstruction algorithms for mono-sensor camera
Master Thesis Report
2014/06/05
Presented by
Franck JOURNES and defended at
Université Jean Monnet Saint-Etienne
Academic Supervisor(s): Xavier GONON, Alain Tremeau
Jury Committee:
Abstract
To infer the color information in a mono-sensor camera system, a color filter array is placed on top of the single sensor so that only one color is registered at each pixel of the sensor. Those camera systems use algorithms to restore a full color image, and the quality of the reconstructed image can vary from one algorithm to another. In this thesis, we propose to study the impact of the different components of the acquisition chain on the image quality. We also study some of the state-of-the-art algorithms and the different quality metrics for image quality assessment. We observe that the choice of an appropriate algorithm depends on the frequency content, which evolves along the acquisition chain.
Preface
Foremost, I would like to express my sincere gratitude to all the members of Developpement Amont et Nouveau Domaines of THALES Angenieux for their encouragement and friendship during the last six months. I especially want to thank my tutor, Xavier GONON, for his patience, motivation, enthusiasm and immense knowledge. I would like to express my deepest acknowledgment to Prof. Alain TREMEAU and Elvir MUJIC, who proposed the internship to me. I also thank all my professors from GUC and UJM for their interesting courses, which helped me a lot during this internship. Last but not least, I would like to thank my family and my girlfriend for their encouragement and patience.
Contents
Abstract
Preface
Contents
List of Figures
List of Tables
1 Introduction
2 Video camera presentation
   2.1 Video camera architecture
      2.1.1 Optical system
      2.1.2 Digital sensors
      2.1.3 Technologies
      2.1.4 Signal processing in a video camera
3 Image demosaicking algorithms
   3.1 Introduction
   3.2 Formalization and properties
   3.3 Non-adaptive algorithm
      3.3.1 Nearest neighbor replication
      3.3.2 Bilinear interpolation
      3.3.3 Smooth hue transition
   3.4 Adaptive algorithm
      3.4.1 Pattern recognition interpolation
      3.4.2 Edge sensing interpolation
      3.4.3 Linear weighted interpolation
      3.4.4 Frequency domain algorithm
      3.4.5 Other algorithms
4 Image Quality Assessment
   4.1 Introduction
   4.2 Demosaicking Artefact
   4.3 Test Images
   4.4 Objective evaluation
      4.4.1 Fidelity measurement
      4.4.2 Perceptual Measure
   4.5 Frequency domain measure
   4.6 Artefact analysis
      4.6.1 Blurring Measure
      4.6.2 False color measure
      4.6.3 Zipper measure
      4.6.4 Alternating hue in a video
   4.7 Conclusion
5 Result
   5.1 Methodology
   5.2 Experimentation and observation
   5.3 Conclusion
Bibliography
Table: CPSNR
Table: ∆E
List of Figures
1 Main steps of acquisition of a single sensor camera (dotted steps are optional)
2 Example of geometrical distortion: Left, barrel distortion; Right, pincushion distortion
3 Influence of a diaphragm on the geometrical distortion: Top, barrel distortion; Bottom, pincushion distortion
4 Example of vignetting effect
5 Chromatic aberration due to the lens
6 Image with chromatic aberration
7 (a) Image suffering from aliasing; (b) image with antialiasing filter
8 Illustration of how an OLPF removes aliasing
9 Acquisition of a 3-sensor device
10 Acquisition of the Foveon sensor
11 Acquisition of a single sensor
12 Energy band formation of an atom of silicon
13 Simplified representation of the photoelectric effect in a sensor
14 (a) Bayer pattern, (b) Yamanaka pattern, (c) vertical stripe pattern, (d) diagonal stripe pattern, (e) diagonal Bayer pattern, (f,g) pseudo-random patterns, (h) HVS-based pattern
15 Modeling of the effect of micro-lenses (right) compared to a system without micro-lenses (left)
16 Left: image without white balance; Right: image with white balance
17 Left: spectral response of a mono-sensor; Right: MacBeth color checker
18 Example of missed interpolation of an edge
19 Nearest neighbor interpolation
20 Bayer pattern
21 Different smooth hue transition algorithms: (c,e) based on ratios and (d,f) based on differences
22 Cok's kernel classifier
23 Bayer pattern
24 Hirakawa's algorithm
25 Filter used to estimate the luminance
26 Alleysson's algorithm
27 Fourier spectrum of an image with the luminance and the 8 chrominance bands
28 Filters proposed by Alleysson (11 × 11) (a) and Lian (5 × 5) (b)
29 Pipeline of Gunturk's algorithm
30 Locations of the four polyphase components
31 Lu's algorithm, where s01 and g01 are respectively the signal and the green at position 01, and F00, F10, F11 are the polyphase filters computed for the kth iteration
32 An example of zipper artifact
33 Example of possible artefacts
34 Left: Kodak images; Right: McMaster images
35 Statistics of the Kodak and the McMaster datasets from [42]
36 Left: input sine image; Right: output sine image with affected contrast
37 Example of computation of the MTF
38 Profile of an edge used to measure the blur in an image
39 Example of over-detection of zipper
40 Example of patterns considered as zipper
41 Example of alternating hue by shifting the phase of the Bayer matrix
42 Pipeline of image quality assessment
43 Airy function
44 Example of displacement introduced by an OLPF
45 MTF of the pipeline with and without the PSF
46 MTF of the different configurations. Top left: without OLPF; Top right: OLPF 0.5 pix; Bottom left: 1 pix; Bottom right: 1.5 pix of displacement on the horizontal and vertical
47 Left: average PSNR of the Kodak images over the displacement introduced by the OLPF; Right: average ∆E of the Kodak images over the displacement introduced by the OLPF
48 Image demosaicking with the algorithm of Hamilton and Adams according to different OLPFs
49 From top to bottom and left to right: original image, bilinear interpolation, smooth hue transition, pattern recognition (Cok), Chang, proposed Chang + smooth hue transition, Hamilton and Adams, Wang, Pekkucuksen, Lian, Dubois, Lu
50 Alternating hue. Left: bilinear interpolation; Right: Pekkucuksen algorithm
51 CPSNR: Kodak dataset
52 CPSNR: IMAX dataset
53 ∆E: Kodak dataset
54 ∆E: IMAX dataset

List of Tables
1 Introduction
Thales Angenieux is a world-renowned manufacturer of precision optics, with the best zoom lenses for film production as well as specific products for surveillance and security applications. This reputation is due to the core competence of the company, which allowed it to become a major player in these markets. One of the most significant years was 1993, when it was bought by the Thomson-CSF group, now known as THALES. In 2013, the THALES group generated more than 14 billion euros of turnover. The THALES group now provides complete and reliable solutions that meet the needs of demanding customers such as air, naval or land forces. The main objective is to give its customers the means to fulfill their missions through high-tech tools. With its knowledge of the markets and a presence in 56 countries, Thales is close to its customers and can better understand their needs. Thales Angenieux now has nearly 400 employees who master all phases of design, industrialization, manufacture and marketing of subassemblies and finished products. The company has capitalized on and maintained more than 75 years of industry expertise and strives to remain at the forefront of innovation. It offers its customers the best professional zooms, based on mechanisms combining precision and reliability, as well as outstanding optical designs and achievements. Many development projects allow Thales Angenieux to keep the highest level of expertise in the fields of optical technology, optoelectronics and mechanics. The company has sales offices in the United States and Singapore, and relies on a network of 50 distributors. The company is positioned in three different strategic areas:
• Cinema: professional optics for the cinema industry. Thales Angenieux is particularly recognized for developing numerous ranges of zooms. Today, the company provides high-end zooms for large productions but also has mid-range products that allow it to expand its scope of activity. These mid-range products came out on the market at the same time as digital zooms, offering users a more accessible price.
• Defense: night vision equipment, security systems and monitoring equipment, used worldwide for the control of borders and coastlines, search and rescue operations, or the observation of sensitive sites.
• New Domain: television, with a live 3D capture system for the broadcast industry. Thales Angenieux, in partnership with a French expert in 3D shooting - Binocle 3D - wants to
position itself today as a systems manufacturer and to propose a 3D offer adapted to customer needs and to the specificities of the broadcast market. The AB One system, commercialized under the new brand AB Live, brings new functions through a unique Plug & Shoot concept that is easy to install. This system brings innovative features which give access to a comprehensive approach, more compatible with current 2D equipment. The teams have primarily endeavored to develop an efficient system that optimizes the overall operating costs.
It was in the context of the Nouveau Domaine department, which works on the new 3D broadcast camera system, that my internship took place. The goals of my internship were to study what is done in terms of image quality metrics, to define a simulation process for the camera pipeline, and to investigate what is done in the field of demosaicking of color filter array images.
2 Video camera presentation
2.1 Video camera architecture
A video camera is made of three main elements ([1], [2]): an optical system, which projects the image of the scene onto the sensor; the sensor itself (made of a color filter array, in the case of a mono-sensor camera, and a photosensitive surface), which samples the scene observed by the camera to realize the acquisition of an image; and finally the processing stage, which reconstructs the image according to a chosen processing pipeline.
Figure 1: Main steps of acquisition of a single sensor camera (dotted steps are optional)
Figure 1 gives an idea of the basic architecture of a digital camera (it does not take the optical system into account). The quality of each element influences the overall quality of the captured image. A deeper explanation of each step of the acquisition system is given in [1] and [2].
2.1.1 Optical system
Related problems: as shown before, the intensity of the luminous signal of the scene passes through the optical system of lenses to the photosensitive surface of the digital sensor. This is the first step of the acquisition of the image, and it is directly linked to the fundamental properties of geometrical optics. Due to the curved shape of the lenses, the optical properties of the material, and the projection of the scene onto a plane surface, some inherent problems may happen [3]. As optical systems are not ideal, they are the main sources of errors: they introduce many artifacts, like geometric distortions, chromatic aberrations, the vignetting effect and/or a blurring effect.
Geometric distortion: a deviation by which a straight line of the scene appears curved after projection onto the sensor. Usually, geometric distortions are introduced by the position of the diaphragm, which does not satisfy the Gauss approximation at low incidence angles [1-3]. Two well-known distortions are illustrated in figure 2: barrel distortion (curved lines bow outwards) and pincushion distortion (curved lines bow inwards).
Figure 2: Example of geometrical distortion: Left, barrel distortion; Right, pincushion distortion
Figure 3: Influence of a diaphragm on the geometrical distortion: Top, barrel distortion; Bottom, pincushion distortion
Vignetting effect: the brightness is reduced in the periphery of the image with respect to the center (figure 4). This problem happens when the optical system has difficulty concentrating the light at the periphery of the image (when the opening of the diaphragm is large, the vignetting effect is strong).
Chromatic aberration: due to the refractive index of the lens varying with the wavelength of the light, the focal length is different for each wavelength. As a result, when we focus at a chosen wavelength, we can see color fringing at the edges of objects (figures 5-6).
Aliasing and antialiasing: a video camera is an example of an imaging system which samples a scene with an array of pixels organized at a regular pitch. According to the Whittaker-Shannon sampling theorem, aliasing appears when the spatial frequencies involved in an image are higher than half the sampling rate.
Figure 4: Example of vignetting effect
Figure 5: Chromatic aberration due to the lens
Figure 6: Image with chromatic aberration
Thus, as an example, an imaging system which samples at a rate of 40 cyc/mm can only carry spatial frequencies lower than or equal to 20 cyc/mm (denoted the Nyquist frequency). Figure 7a shows the effect of aliasing in an image: the high frequencies contained in the image are interpreted as low frequencies by the sampling matrix of the sensor. Fortunately, an antialiasing technique (the optical low-pass filter) exists to limit the bandwidth of the spatial frequencies involved in the imaging system. Figure 7b illustrates an image resulting from an antialiasing filter: although some resolution is lost due to the optical low-pass filter (OLPF), figure 7b is more pleasing to see because it respects the spatial content, whereas in figure 7a some low-frequency patterns appear.
Note: the lenses involved in a zoom can also reduce the spatial frequencies in an image, but according to [1] the reduced frequencies depend on the zoom (focal number) and the
Figure 7: (a) Image suffering from aliasing; (b) image with antialiasing filter
Figure 8: Illustration of how an OLPF removes aliasing
focus; thus a lens is not a good way to control the spatial frequencies. It has usually been observed that the spatial frequencies remaining after the zoom are higher than the Nyquist frequency [1], which may cause aliasing problems if we don't use an OLPF.
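As a minimal numeric sketch of the sampling limits above (the 25 µm pitch is simply the value matching the 40 cyc/mm example in the text, not a parameter of any camera discussed here):

```python
# Nyquist limit of a sampled imaging system, mirroring the 40 cyc/mm example.
pitch_mm = 0.025                  # pixel pitch: 25 um (assumed for illustration)
sampling_rate = 1.0 / pitch_mm    # 40 samples/mm
nyquist = sampling_rate / 2.0     # 20 cyc/mm
print(f"sampling rate: {sampling_rate:.0f} samples/mm")
print(f"Nyquist frequency: {nyquist:.0f} cyc/mm")
# Scene frequencies above `nyquist` fold back as low-frequency aliases
# unless an OLPF attenuates them before sampling.
```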
2.1.2 Digital sensors
As explained in [1], [2], [3] and [6], CMOS and CCD sensors can only register information about the intensity of the light (not the color information), according to the spectral sensitivity of silicon. In other words, each photosite of the sensor measures the quantity of photons hitting it. There are several methods to infer the color information; we describe those methods in the next section.
2.1.3 Technologies
- Three-sensor device: to infer the color information, the three-sensor device uses a beam splitter to split a beam of light into three components (usually red, green and blue); with some (difficult) mechanical adjustment [7] of the different paths, one color component is registered on each of the three sensors. This system usually gives the best results, but it is expensive due to the three sensors
Figure 9: Acquisition of a 3-sensor device
and the beam splitter. Generally, that kind of system is reserved for professionals (broadcast TV, ...).
- Foveon X3 technology-based device: a single color sensor which uses the property of light to be absorbed by silicon layers of different thicknesses as a function of the wavelength.
Figure 10: Acquisition of the Foveon sensor
Compared to a simple single sensor (next paragraph), we avoid the artifacts due to the demosaicking step, the colors seem purer, the image is sharper, the image resolution is better [3] and we limit the loss of light inside the color filters. The drawbacks [2][3] are due to the non-ideal transmission properties of the silicon layers, which result in greyish images. We can also note that those sensors are really sensitive to noise. As they are expensive, they find most of their applications in the medical, scientific or industrial fields.
- Single-sensor device: to infer the color information with a single sensor, a basic solution is to superimpose a Color Filter Array (CFA) on the CCD/CMOS sensor [2][3]. Each photosite of the sensor registers only one color, according to the organisation of the mosaic of the CFA. Then, we have to process the resulting image to obtain the two other colors and reconstruct the color image; this step is called demosaicking.
Figure 11: Acquisition of a single sensor
Those systems are the cheapest solution on the market. The main drawback of this method comes from the CFA [4], where we get some loss by absorption and saturation caused by the filters. We also need efficient algorithms to perform a full reconstruction of a color image.
Photoelectric effect
The photoelectric effect, which depends on the material (usually silicon for sensor devices), generates electrons when the material is hit by light. Considering a semiconductor material such as silicon, the energy states are split into two bands: the conduction band (Ec) and the valence band (Ev). Those bands are separated by a forbidden energy gap (Eg) whose energy levels are unreachable by the electrons. The gap, Eg = Ec − Ev, is larger or smaller depending on the considered material.
Figure 12: Energy band formation of an atom of silicon
Thus, to create some conductivity inside the material (moving electrons), an electron needs to be in the conduction band. To reach the conduction band, the electron needs to receive an energy (carried by a photon) greater than or equal to Eg. Thus, the required photon energy is expressed as:
$$E_{photon} = h\nu = \frac{hc}{\lambda} \geq E_g \qquad (2.1)$$

where h is the Planck constant, c the speed of light and λ the wavelength of the light. When E_photon ≥ E_g, an electron-hole pair is created and the charge, stored in a capacitor, is proportional to the number of photons hitting the sensor. Then, a converter converts the analog signal into a digital signal.
Figure 13: Simplified representation of the photoelectric effect in a sensor
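As a quick sanity check of equation 2.1, the sketch below computes the cutoff wavelength of silicon; the band gap of about 1.12 eV is a standard textbook value, not a figure taken from this thesis.

```python
# Cutoff wavelength above which photons cannot create an electron-hole pair.
h = 6.626e-34          # Planck constant (J*s)
c = 2.998e8            # speed of light (m/s)
eV = 1.602e-19         # one electron-volt in joules
E_g = 1.12 * eV        # band gap of silicon (assumed textbook value)

# Pair creation requires h*c/lambda >= E_g, i.e. lambda <= h*c/E_g.
lambda_max = h * c / E_g
print(f"cutoff wavelength: {lambda_max * 1e9:.0f} nm")   # ~1107 nm
# Silicon thus responds over the whole visible range (~380-780 nm) and into
# the near infrared, which is why cameras usually add an IR-cut filter.
```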
CCD/CMOS sensors
- CCD sensor: made of an array of photosensitive cells. As related in [1][3][6], the photoreceptors have to be addressed sequentially; at the output, the registered voltage is converted and amplified. To access the value of a pixel, the charges are shifted and emptied between that pixel and the output register. The main drawbacks of this process are that reading a pixel flushes all the pixels between it and the output of the sensor, and that this step is computationally expensive. As stated in [6], there are several architectures of CCD sensors:
- Full-frame architecture: generally used in photography; all the area of the sensor is active and used for the image.
- Frame transfer architecture: more expensive than the full-frame architecture. There are two matrices in one CCD sensor: one matrix is masked and the other one is exposed to the light. With this architecture, we can perform a fast transfer from the exposed matrix to the masked matrix, which plays the role of a memory.
- Interline transfer architecture: the most complex architecture. It adds a photodiode to each cell of the CCD to get a better spectral response in the visible spectrum.
As related in [3], improvements are regularly made in the field of CCD sensors. One example is the super-CCD sensor, whose aim is to improve the sensitivity of the sensor to light and thus give a better signal-to-noise ratio. Those results are obtained by replacing the classical square pixel array by an octagonal pixel array, which increases the active surface of the sensor.
- CMOS: composed of pixels where each photodiode (compared to the CCD sensor) has its own charge/voltage converter and amplifier [3][9]. Compared to the CCD sensor, the CMOS sensor has a lower electric consumption, a lower production cost, and a quicker reading step. This is the main reason why CMOS sensors are more and more popular.
Both technologies allow great image quality, and each sensor has its own advantages and drawbacks [3][6][9]. For CMOS, research is mainly oriented towards improving the image quality, while for CCD research is more focused on reducing the energy consumption. But the differences are so small that we can find CMOS sensors in acquisition systems that require high image quality, and CCD sensors in mobile devices like smartphones.
Color Filter Array
As explained before, CCD and CMOS sensors are sensitive over the whole spectrum of visible light, but the photosites are unable to distinguish the different wavelengths of the light spectrum: they just measure the amount of light hitting each photosite. One method to tackle this problem is to superimpose a color filter array (CFA) on top of the sensor to infer the color information [1-5]. There are various CFAs with different arrangements of the mosaic of color filters. Those different CFAs have an influence, as related in [2-4], on the overall quality of the reconstructed image and on the processing time needed to perform the reconstruction. As stated in [2][10], there are other CFAs which use more than the classical three tristimulus values (RGB, YMC): those CFAs can mix primary and complementary colors, or use several colors [11]. They usually have a better accuracy in hue gamut, but they increase the complexity; that is why tristimulus CFAs are preferred for demosaicking. The design of the CFA is also important. A random tristimulus CFA or an aperiodic CFA increases the complexity of the algorithm, but it can reduce false color artifacts thanks to its arrangement [10]. With a periodic CFA (Bayer and Yamanaka CFAs), the way the pattern is built reduces the sensitivity to some color artifacts and keeps the complexity lower, but remains sensitive to false colors [10].
Figure 14: (a) Bayer pattern, (b) Yamanaka pattern, (c) vertical stripe pattern, (d) diagonal stripe pattern, (e) diagonal Bayer pattern, (f,g) pseudo-random patterns, (h) HVS-based pattern
Micro-lens system
The increasing resolution of sensors, without an increase of the sensor format, causes a reduction of the photosensitive surface of the photosites. Thus, the fill factor (the fraction of the incoming light actually collected) of the photosites becomes lower and lower [58]. The consequence is a reduction of the sensitivity and a degradation of the signal-to-noise ratio of the sensor. Thus, it was proposed to use a system of micro-lenses to concentrate the beams of light onto the photosites of the sensor.
Figure 15: Modeling of the effect of micro-lenses (right) compared to a system without micro-lenses (left)
2.1.4 Signal processing in a video camera
In a video camera system, the signal cannot be used as it is: some pre-processing or post-processing steps are required to give the best possible quality to the image. For a color image, three primary colors are required to perform the reconstruction. Due to the spectral properties of the imaging system (defined by the optical system, the lighting conditions and the spectral response of the color filter array), the raw signal may not produce a faithful color rendition. Thus, some processing steps are
required to reproduce the most faithful colors according to the human visual perception of the scene. Before making any colorimetric correction in the imaging system, a black balance step is required to adjust the voltage measured for the three colors of the single sensor array. The black balance is done automatically, or manually by closing the diaphragm, with a slight offset (not too much, otherwise the image will appear greyish) to avoid some relative noise in the sensor. In a second step, we need to set up the white balance. The white balance is a pre-processing step of the camera pipeline where the levels of the R and B components are adjusted to the level of G, so that a white object in the scene is reproduced as a white object at the camera output. White balance is required because white will not appear white in a video camera if the color temperature is not taken into account (figure 16). Most professional cameras have preset functions, which allow a reporter to be ready to shoot (without particular conditions of illumination) or let the operator respect a particular ambience like candlelight or sunset. Other special conditions, like multiple-camera setups, a video shot on two different days, mixed light sources, or colorimetric effects, require keeping the reference at a neutral white point (a small code sketch of the gain idea follows figure 16).
Figure 16: Left : Image without white balance; Right : Image with white balance
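As an illustration of the channel-gain idea described above, here is a minimal sketch; it estimates the gains with a gray-world style assumption (the R and B means are pulled to the G mean), which is an assumption of this example rather than the preset mechanism of any particular camera.

```python
import numpy as np

def white_balance(img):
    """Scale R and B so their means match the G mean.

    img: float array of shape (H, W, 3), RGB, values in [0, 1].
    """
    g_mean = img[..., 1].mean()
    out = img.copy()
    out[..., 0] *= g_mean / img[..., 0].mean()   # red gain
    out[..., 2] *= g_mean / img[..., 2].mean()   # blue gain
    return np.clip(out, 0.0, 1.0)
```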
Considering the spectral response of the Bayer color filter array, the native color image resulting from the demosaicking step depends on the colorimetric space of analysis, defined by the spectral response of the color filter array (itself defined for some lighting condition). Thus, to reproduce the colors as we perceive them in a scene, we need a color correction step to pass from a colorimetric space of analysis to a colorimetric space of synthesis. The color correction applied in the synthesis space tries to perform a better reconstruction of the colors outside the gamut of the color space and adds some contrast. Generally, the color correction is computed using a 3 × 3 matrix whose parameters are computed according to a reference color checker (such as the MacBeth color checker). Then, knowing the original color of each patch of the color checker,
we can solve equation 2.2, using a least squares error for the simplest method [59-61], where (RS, GS, BS) are the components of the synthesis color space and (RA, GA, BA) are the components of the analysis color space.
$$\begin{pmatrix} R_S \\ G_S \\ B_S \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} R_A \\ G_A \\ B_A \end{pmatrix} \qquad (2.2)$$
The correction should preserve the neutrality of the white point from the analysis image to the synthesis one. This constraint is used to avoid a modification of the grayscale of the image when the correction matrix is not well computed. Thus, RA = GA = BA = 1 implies RS = GS = BS = 1. Using this constraint, the computation of the matrix requires only six parameters, and equation 2.2 can be rewritten as equations 2.3-2.5:
$$R_S = R_A + a_1(R_A - G_A) + a_2(R_A - B_A) \qquad (2.3)$$
$$G_S = G_A + a_3(G_A - R_A) + a_4(G_A - B_A) \qquad (2.4)$$
$$B_S = B_A + a_5(B_A - R_A) + a_6(B_A - G_A) \qquad (2.5)$$
Figure 17: Left : Spectral response of a mono-sensor; Right : MacBeth color checker
The colorimetric correction in the RGB space leads to some improvement in terms of contrast and color saturation in the video image, but we should keep in mind that the correction is done to the detriment of the number of colors displayed. More processing is done in a video camera, like distortion correction, noise reduction or gamma correction (which characterizes the contrast rendering of a photosensitive sensor). More insight on the subject can be found in [1][2].
3 Image demosaicking algorithms
3.1 Introduction
As related in the previous chapter, a CCD or a CMOS converts a luminous signal into a digital signal. We have seen several techniques to reconstruct color images. The three-sensor method is one of them, but those systems are complex to set up, and even more costly if we consider a 3D stereovision system. This is why most cameras use only one sensor. To infer the color information, a Color Filter Array (CFA) is used to record one color component at each photosite of the sensor. One of the most popular CFAs in the industry is the Bayer pattern, which is composed of twice as many green patches as red or blue ones. Those patches subsample the luminous input signal according to the Bayer pattern, gathering only one color per pixel. The goal of the demosaicking algorithm is to reconstruct the two missing colors at each pixel. In the following sections, we will identify the different state-of-the-art algorithms, as done in many papers [1-3][35][34][39], and try to add our contribution by reviewing some new techniques applied in the demosaicking field which could be considered as new state-of-the-art algorithms.
3.2 Formalization and properties
As related before, the Bayer CFA selects one color at each photosite of the sensor. In the configuration where the first two pixels of the Bayer matrix are green and red, we can write the image resulting from the CFA (I^CFA) as:
$$I^{CFA}_{x,y} = \begin{cases} R_{x,y} & \text{if } x \text{ odd and } y \text{ even} \\ B_{x,y} & \text{if } x \text{ even and } y \text{ odd} \\ G_{x,y} & \text{otherwise} \end{cases}$$
An image is originally composed of three channels I_{x,y}; thus, to reconstruct a full color image, we need to estimate the two missing colors at each pixel to obtain an estimated color image Î_{x,y}:
$$\hat{I}_{x,y} = \begin{cases} (R_{x,y}, \hat{G}_{x,y}, \hat{B}_{x,y}) & \text{if } x \text{ odd and } y \text{ even} \\ (\hat{R}_{x,y}, \hat{G}_{x,y}, B_{x,y}) & \text{if } x \text{ even and } y \text{ odd} \\ (\hat{R}_{x,y}, G_{x,y}, \hat{B}_{x,y}) & \text{otherwise} \end{cases}$$
Thus, the goal of a demosaicking algorithm is to estimate accurately the missing colors of a CFA image I^CFA to reconstruct a full color image Î. To help the color reconstruction, some properties are used to perform the interpolation step in an efficient manner:
Property 1: Spectral correlation
Most algorithms assume, and were designed according to, the fact that the R, G and B channels are highly correlated. In [16], Gunturk measures the inter-channel correlation between the red/green planes and the blue/green planes. He observed that the correlation in the Kodak dataset is high, especially in the high-frequency bands [16].
Note: it was observed in [42-43] that the high spectral correlation observed by Gunturk in the Kodak dataset does not always hold in natural scenes. Problems may appear when some noise is present in the sensor or for highly saturated colors. In this case, Zhang et al. [41] point out that overusing the spectral correlation in color demosaicking can lead to unexpected results in practice. The main cause expressed is that researchers optimize their algorithms on the Kodak dataset, which has smooth hue transitions and a high spectral correlation that is not always representative of what we get in reality.
Property 2: Spatial correlation
Spatial correlation is an important property used in many demosaicking algorithms. It relates to the fact that the color levels are similar within homogeneous regions of the image; thus, we can use the levels of the neighboring pixels to estimate the missing color levels. It is important to notice that this property doesn't hold in transition regions, where the levels are abruptly different, which can lead to misinterpolation.
Note: many algorithms use this property by first interpolating the green levels (which are less affected by aliasing, as they have twice as many samples as the red/blue pixels) and then estimating the red and blue components using the spectral correlation property. Figure 18 shows a characteristic misinterpolation problem between two homogeneous regions: between the black and white regions, some intermediate levels appear, which illustrates the importance of respecting the spatial correlation and the limits of the spectral correlation on edges.
3.3 Non-adaptive algorithm
In this section, we describe the algorithms which use a fixed method to perform the interpolation step. This class of algorithms doesn't take spatial information into account, which is why they are more affected by the different kinds of artifacts described in the next chapter. On the other hand, those algorithms require less computation to perform the demosaicking step.
Figure 18: Example of missed interpolation of an edge
3.3.1 Nearest neighbor replication
A simple way to reconstruct the different color channels is to replicate the nearest neighboring pixel [3]. The interpolation step can be performed in any direction (North, South, West, East) around the pixel. As an example, figure 19 shows an interpolation by nearest neighbor in the East direction.
Figure 19: Nearest neighbor interpolation
This method is generally considered as the worst method in the demosaicking field. Nearest neighbor replication tends to produce misinterpolations, especially in high-frequency zones. The result is the apparition of zipper or false color artifacts, explained in the next chapter.
3.3.2 Bilinear interpolation
One of the first techniques for the reconstruction of the missing colors of a CFA was bilinear interpolation [35]. As the pattern of the Bayer CFA is known and periodic, it was proposed to use a low-pass filter which averages the neighboring pixels (usually in a 3 × 3 window) to reconstruct the missing values. The interpolation step is performed by convolution, independently for each of the R, G and B channels (a sketch with the usual kernels is given below).
Figure 20: Bayer pattern
To illustrate the process of bilinear interpolation, here is the interpolation of the blue pixel B12 and the green pixel G12:

$$B_{12} = \frac{B_6 + B_8 + B_{16} + B_{18}}{4}$$
$$G_{12} = \frac{G_7 + G_{11} + G_{13} + G_{17}}{4}$$

With this example, we can clearly see that the bilinear interpolation is an average of the neighboring pixels of the same channel. As the method computes a convolution in a 3 × 3 window for each pixel, bilinear interpolation can be considered a low-complexity method. But we observe some blurring effect, false colors and zipper artifacts in the produced image, due to the low-pass property of the filter and to misinterpolations caused by the averaging behavior of the algorithm, which smooths abrupt changes of hue in an unnatural manner (as the algorithm averages the neighboring pixels, the result is an intermediate level between two adjacent zones).
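A minimal sketch of bilinear demosaicking by convolution, reusing the `bayer_masks` helper from section 3.2; the two kernels are the usual normalized averaging kernels for the Bayer pattern.

```python
import numpy as np
from scipy.ndimage import convolve

G_KERNEL = np.array([[0, 1, 0],
                     [1, 4, 1],
                     [0, 1, 0]], dtype=float) / 4
RB_KERNEL = np.array([[1, 2, 1],
                      [2, 4, 2],
                      [1, 2, 1]], dtype=float) / 4

def bilinear_demosaic(cfa, masks):
    """cfa: (H, W) CFA image; masks: (red, green, blue) boolean masks."""
    out = np.empty(cfa.shape + (3,))
    kernels = (RB_KERNEL, G_KERNEL, RB_KERNEL)
    for ch, (mask, k) in enumerate(zip(masks, kernels)):
        # Known samples stay untouched (the kernel's center weight is 1);
        # missing samples become the average of their known neighbors.
        out[..., ch] = convolve(cfa * mask, k, mode='mirror')
    return out
```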
3.3.3 Smooth hue transition
In [26][45], Cok proposes to use the spectral correlation property to perform the interpolation step. In his algorithm, he assimilates the Bayer CFA to a pattern composed of a luminance component (the green pixels) and two chromatic components (the red/blue pixels). After a preliminary estimation of the green levels (luminance), the ratios of the chromatic components over the luminance are computed, a bilinear interpolation of those ratios is performed, and they are then multiplied by the luminance to obtain the two chromatic components. If we use Cok's algorithm to estimate the blue
pixel B12, we can compute it as:
$$B_{12} = G_{12} \times \frac{1}{4}\left(\frac{B_6}{G_6} + \frac{B_8}{G_8} + \frac{B_{16}}{G_{16}} + \frac{B_{18}}{G_{18}}\right)$$
Here, the green levels are estimated by bilinear interpolation, but we will see in the next paragraphs that other algorithms use the smooth hue transition to compute their chromatic channels. In the survey paper of Losson [35], we can find a simplified version of the demonstration of color ratio constancy over homogeneous regions, made by Kimmel in [19]. This demonstration, under the assumption that an image can be seen as a Lambertian surface, shows that the ratio or the difference of two colors is constant. From this observation, Pei [53] rewrites the smooth hue transition formula using the difference of chromatic and luminance levels rather than their ratio. Thus, the new smooth hue transition can be written as (for the blue pixel B12):
$$B_{12} = G_{12} + \frac{1}{4}\left((B_6 - G_6) + (B_8 - G_8) + (B_{16} - G_{16}) + (B_{18} - G_{18})\right)$$

In [35], Losson makes a comparative study of the different smooth hue transition algorithms, denoted by the ratio [26], the difference [53], and a ratio affected by an offset [35]. It is reported that the ratio method is more affected by errors than the difference method. This is due to the fact that when the red and/or blue values are saturated, the ratio is sensitive to small variations of those values. In figure 21, Losson shows that the ratio method tends to be more artefact-prone when there are more high frequencies (see the Sobel filter output in figure 21), as in the plumage of the red parrot, which can lead to misinterpolation if we use a bilinear method of interpolation.
In a second step, Losson [35] compares the three smooth hue transition algorithms over twelve images of the Kodak dataset. The difference method seems to give the best results, but we should take some precautions with that conclusion: the results are really close to each other. In our experimentation, we observed for different phases of the Bayer pattern that the same algorithm may show some bias in objective measurement results. And, given that most papers focus on one phase of the Bayer pattern, we prefer to take those results with a small factor of uncertainty and consider the difference technique equivalent to the ratio affected by offsets. Using the spectral correlation of the green plane versus the red and blue planes gives a fairly good improvement over the bilinear interpolation method (around +3 dB in PSNR (see chapter 4) on the red/blue channels); but as the spectral correlation doesn't hold on the transitions between two regions, we observe artifacts similar to those of bilinear interpolation, like false colors, zipper or some blur. For a few more computations, we have seen the importance of using the spectral correlation (a code sketch of the difference rule follows figure 21).
Figure 21: Different smooth hue transition algorithms: (c,e) based on ratio and (d,f) based on differences
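A minimal sketch of the difference-based rule for the blue channel, assuming the `bayer_masks` helper from section 3.2 and an already interpolated full green plane:

```python
import numpy as np
from scipy.ndimage import convolve

def smooth_hue_blue(cfa, green, blue_mask):
    """cfa: (H, W) CFA image; green: (H, W) interpolated green plane."""
    diff = (cfa - green) * blue_mask           # B - G, known at blue sites only
    k = np.array([[1, 2, 1],
                  [2, 4, 2],
                  [1, 2, 1]], dtype=float) / 4
    diff_full = convolve(diff, k, mode='mirror')  # bilinear fill of B - G
    return green + diff_full                      # back to absolute blue levels
```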
3.4 Adaptive algorithm
Adaptive algorithms, as opposed to non-adaptive algorithms, are the class of algorithms which use some content analysis to perform the demosaicking step. Generally, those algorithms use local spatial features (like gradients) of the neighboring pixels to drive the interpolation, which creates fewer artifacts.
3.4.1 Pattern recognition interpolation
As related for the smooth hue transition algorithm, the hypothesis of hue constancy doesn't hold in transition regions (edges). Cok, in [45], proposes a scheme which performs some pattern recognition on the neighboring structure of a pixel and adapts the interpolation in transition regions. Cok proposed an algorithm which classifies the different pixels of the green channel as edge, stripe or corner. At that time, it was supposed that the improvement of the algorithms relied on the quality of interpolation of the green pixels; the interpolation of the red/blue pixels by smooth
hue transition was supposed to be enough.
1. First, we take the mean (M) of the four neighboring pixels (North, South, East, West).
2. Then, each of the four neighbors is classified as equal to (E), higher than (H) or lower than (L) the mean value.
3. According to the patterns of figure 22, we apply the following scheme:
(a) Edge pattern: the pixel G = median(N, S, E, W); assuming N > S > E > W, median(N, S, E, W) = (S + E)/2.
(b) Stripe pattern: we add another level of decision by taking a 5 × 5 window according to the bottom-left pattern of figure 22. We take the average of the eight pixels A (S = AVERAGE(A)), then we constrain the value x = 2M − S to lie between S and E: if x is above S, the green pixel is set to S; if x is below E, the green pixel is set to E; otherwise the green pixel is set to x.
(c) Corner pattern: here also we need another level of decision. We use the 5 × 5 kernel of the bottom-right pattern of figure 22. We compute the average of the pixels C (S = AVERAGE(C)). Then x is computed as x = M − (S − M)/4, and the same constraints as for the stripe pattern are applied to reconstruct the green pixel.
Figure 22: Cok’s kernel classifier
Cok's method was really a new way to perform demosaicking at that time: it was the first one to take the structure of the image into account and adapt the interpolation accordingly. This method was the starting point of many other methods that use the spatial content of the image to perform an efficient interpolation.
Unfortunately, the results were not as good as Cok expected: the algorithm still produces some misinterpolations in high-frequency zones (see figure 48). We can also notice that the improvement over the smooth hue transition comes at the price of many more computations, as for each pixel we need to perform some costly tests to classify it. Nevertheless, Cok's algorithm led the way to a new kind of algorithm.
3.4.2 Edge sensing interpolation
As related in many papers [35][62][17][46], the human visual system is especially sensitive to edges. As related before, the spectral correlation property fails on edges, which is the cause of the failure of non-adaptive methods. One intuitive idea was to perform an edge detection to make an efficient interpolation which respects the high frequencies of the image.
Gradient method
As the green channel has twice as many samples as the red or blue channels, Hibbard [62] proposed to interpolate the green value at the red and blue pixels by computing two gradients, one horizontal and one vertical, and then classifying the optimal direction of interpolation. According to figure 20 and Hibbard's algorithm, to perform the interpolation at G12, we should do the following (a code sketch follows this list):
1. First, compute the two estimators, horizontal and vertical: ∆H = |G11 − G13| and ∆V = |G7 − G17|.
2. Then, choose the direction of interpolation and reconstruct G12:
if ∆H < ∆V then G12 = (G11 + G13)/2
else if ∆H > ∆V then G12 = (G7 + G17)/2
else G12 = (G7 + G11 + G13 + G17)/4
3. Use the smooth hue transition to reconstruct the red and blue channels.
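A minimal sketch of Hibbard's directional choice at one red or blue site (i, j) of the CFA image, where the four direct neighbors are green:

```python
def hibbard_green(cfa, i, j):
    """Estimate the green value at red/blue site (i, j) of a 2-D CFA array."""
    dh = abs(cfa[i, j - 1] - cfa[i, j + 1])   # horizontal estimator
    dv = abs(cfa[i - 1, j] - cfa[i + 1, j])   # vertical estimator
    if dh < dv:
        return (cfa[i, j - 1] + cfa[i, j + 1]) / 2
    if dh > dv:
        return (cfa[i - 1, j] + cfa[i + 1, j]) / 2
    return (cfa[i, j - 1] + cfa[i, j + 1] + cfa[i - 1, j] + cfa[i + 1, j]) / 4
```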
Gradient with Laplacian correction
In [63], Laroche and Prescott adapt Hibbard's algorithm to a 5 × 5 neighborhood, where they assume that a gradient is the same in the different channels due to the spectral correlation property. Using this assumption, they modify Hibbard's horizontal and vertical gradient estimators by taking the horizontal and vertical Laplacians of the red or blue pixels to perform the interpolation of the green pixels. Later, Hamilton and Adams [17] combined Hibbard's method with the method of Laroche and Prescott, used as a correction term for the estimator.
The algorithm of Hamilton and Adams led to a huge improvement over the previously proposed algorithms, especially because it combines the information of the different color planes (spectral correlation property) with an estimator that performs the interpolation with regard to the frequency content of the neighboring pixels (spatial correlation property). According to figure 23, we estimate the green pixel G5 as:
Figure 23: Bayer pattern
1. First, compute the two estimators, horizontal and vertical: ∆H = |G4 − G6| + |2B5 − B3 − B7| and ∆V = |G2 − G8| + |2B5 − B1 − B9|.
2. Then, choose the direction of interpolation and reconstruct G5:
if ∆H < ∆V then G5 = (G4 + G6)/2 + (2B5 − B3 − B7)/4
else if ∆H > ∆V then G5 = (G2 + G8)/2 + (2B5 − B1 − B9)/4
else G5 = (G4 + G6 + G2 + G8)/4 + (4B5 − B3 − B7 − B1 − B9)/8
3. Use the smooth hue transition to reconstruct the red and blue channels.
The correction of the gradients by the Laplacian is an efficient method and remains a starting point for a lot of edge sensing algorithms [54]. As the algorithm works on a 5 × 5 window and takes a few more computations (a Laplacian, horizontal/vertical gradients and a decision step), it is slightly more complex than Hibbard's algorithm or the classical bilinear interpolation; but we can notice the huge improvement over the classical methods in terms of image quality, and the few extra computations are not out of the range of most embedded systems (like FPGAs). A sketch of the green step follows.
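A minimal sketch of the Hamilton and Adams green estimate at a blue site, following the numbering of figure 23 (B5 at the center, with the second-neighbor blues two pixels away along the row and column); the red-site case is symmetric.

```python
def hamilton_adams_green(cfa, i, j):
    """Green estimate at blue (or red) site (i, j) of a 2-D CFA array."""
    dh = (abs(cfa[i, j - 1] - cfa[i, j + 1])
          + abs(2 * cfa[i, j] - cfa[i, j - 2] - cfa[i, j + 2]))
    dv = (abs(cfa[i - 1, j] - cfa[i + 1, j])
          + abs(2 * cfa[i, j] - cfa[i - 2, j] - cfa[i + 2, j]))
    if dh < dv:   # interpolate horizontally, with a Laplacian correction
        return ((cfa[i, j - 1] + cfa[i, j + 1]) / 2
                + (2 * cfa[i, j] - cfa[i, j - 2] - cfa[i, j + 2]) / 4)
    if dh > dv:   # interpolate vertically
        return ((cfa[i - 1, j] + cfa[i + 1, j]) / 2
                + (2 * cfa[i, j] - cfa[i - 2, j] - cfa[i + 2, j]) / 4)
    # No preferred direction: average both, with a 2-D Laplacian term.
    return ((cfa[i, j - 1] + cfa[i, j + 1] + cfa[i - 1, j] + cfa[i + 1, j]) / 4
            + (4 * cfa[i, j] - cfa[i, j - 2] - cfa[i, j + 2]
               - cfa[i - 2, j] - cfa[i + 2, j]) / 8)
```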
Other edge sensing methods
Many other algorithms use edge sensing methods to perform an efficient interpolation. One intuitive technique, proposed by Chang [46], is to use a variable number of gradients to perform the interpolation step. In a 5 × 5 neighborhood, several gradients are computed in the North, South, East, West, North-West, North-East, South-East and South-West directions, and a threshold value is defined for those gradients (according to the minimum/maximum values of the gradients and some weights). The directions whose gradients are under this threshold are considered as the most similar colors in the neighborhood, while the values above represent the pixels which contain details (high frequencies). Finally, the most similar pixel directions are averaged to reconstruct the missing color. By experimentation, we observe that using a variable number of gradients doesn't improve the interpolation compared to the interpolation proposed by Hamilton and Adams [17]. We also propose to replace the interpolation of the red and blue channels made with the variable number of gradients by the popular smooth hue transition, which improves the overall image quality and reduces the number of computations.
In [18], Hirakawa proposes to perform, in a first step, the interpolation of the three channels (RGB) horizontally and vertically using a filter bank technique. Then, he proposes a homogeneity criterion between the neighboring pixels and the considered pixel to estimate whether the reconstructed pixel is misguided or not. Those measures were done according to human perception, in the CIE Lab color space. Nevertheless, as good as the algorithm is, some interpolation artifacts may remain; thus he proposed to perform an iterative (non-linear) post-processing step by median filtering the hue, similar to the algorithm of [64]. At the end of the paper, he also considered the real-time implementation of the algorithm, where he proposed some alternatives to reduce the overall cost: using the YCbCr color space rather than the computationally more expensive CIE Lab, considering the use of an L1-norm instead of an L2-norm, and obviously reducing the size of the window used to compute the distance metric.
Figure 24: Hirakawa’s algorithm
We note many other techniques using edge sensing methods. Nevertheless, those methods use a scheme similar to the Hamilton and Adams algorithm. Usually, they try to use a superior criterion
to perform the interpolation, or more efficient estimators, like the criterion used by Hirakawa [18] or the algorithm of Chang [46], which uses more estimators to perform the interpolation step. Those choices make the algorithms more or less complex with regard to the adopted scheme.
3.4.3 Linear weighted interpolation
One problem with the edge sensing algorithms is their computation cost, which is variable due to the decision phase, and their overall behavior, which requires two steps: first the classification and second the interpolation. One solution to overcome this problem was proposed by Kimmel [19]: integrating the classification and the interpolation in the same step. To do so, he proposed to compute, in a 3 × 3 neighborhood, weights according to the gradients and to use them to weight the neighboring green pixels; those weights are adjusted for each green pixel. The red and blue pixels are interpolated by a weighted smooth hue transition, and a refinement step is used to improve the convergence of the result. For example, the interpolation of the green is done as:
$$G_{i,j} = \frac{w_{i,j-1}\, G_{i,j-1} + w_{i,j+1}\, G_{i,j+1} + w_{i-1,j}\, G_{i-1,j} + w_{i+1,j}\, G_{i+1,j}}{w_{i,j-1} + w_{i,j+1} + w_{i-1,j} + w_{i+1,j}}$$

where $w$ is the weighting function, computed as

$$w_{i+k,j+l} = \frac{1}{\sqrt{1 + \mathrm{grad}(P_{i,j})^2 + \mathrm{grad}(P_{i+k,j+l})^2}}, \qquad k \in \{-1, 0, 1\},\ l \in \{-1, 0, 1\}$$

and grad is the gradient of the CFA image in the horizontal, vertical, diagonal and anti-diagonal directions:

$$\mathrm{grad}_h(P_{i,j}) = \frac{P_{i,j-1} - P_{i,j+1}}{2}, \qquad \mathrm{grad}_v(P_{i,j}) = \frac{P_{i-1,j} - P_{i+1,j}}{2}$$

$$\mathrm{grad}_d(P_{i,j}) = \frac{P_{i-1,j+1} - P_{i+1,j-1}}{2\sqrt{2}}, \qquad \mathrm{grad}_{ad}(P_{i,j}) = \frac{P_{i-1,j-1} - P_{i+1,j+1}}{2\sqrt{2}}$$
In his algorithm, Kimmel underlines that the segmentation of the image is a crucial point for an accurate estimation of the weighting factor. Lu and Tan [65] proposed a cost-effective way to compute it: the gradient of the weighting factor is computed with a Sobel filter, and the square root and the squared gradients are replaced by the absolute values of the gradients. They also use the same iterative step as the one proposed by Freeman [64] and used by Hirakawa [18], which removes most of the visible artifacts. Other methods are reported in [35]; for example, Muresan modifies the computation of the weights by using only the green pixels (an average is taken for the missing greens), followed by an interpolation with a weighted smooth hue transition.
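As an illustration, here is a minimal NumPy sketch of the weighted green interpolation described above. The function names and the assumption that `cfa` is a float array holding the raw Bayer mosaic are ours, not from [19]; a full implementation would also handle image borders and the red/blue refinement steps.

```python
import numpy as np

def kimmel_weight(cfa, i, j, k, l):
    """Weight between center pixel (i, j) and neighbor (i+k, j+l),
    using the gradient of the CFA image along the neighbor's direction."""
    def grad(y, x, dy, dx):
        # Diagonal gradients are normalized by 2*sqrt(2), axis-aligned by 2.
        norm = 2.0 * np.sqrt(2.0) if dy and dx else 2.0
        return (cfa[y - dy, x - dx] - cfa[y + dy, x + dx]) / norm
    g_c = grad(i, j, abs(k), abs(l))          # gradient at the center pixel
    g_n = grad(i + k, j + l, abs(k), abs(l))  # gradient at the neighbor
    return 1.0 / np.sqrt(1.0 + g_c**2 + g_n**2)

def interpolate_green(cfa, i, j):
    """Weighted average of the four green neighbors of a red/blue pixel."""
    neighbors = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    w = [kimmel_weight(cfa, i, j, k, l) for k, l in neighbors]
    g = [cfa[i + k, j + l] for k, l in neighbors]
    return sum(wi * gi for wi, gi in zip(w, g)) / sum(w)
```

The key design point is that classification and interpolation are merged: flat directions get weights close to 1, while directions crossing an edge get small weights, without an explicit decision step.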
3.4.4 Frequency domain algorithm

Frequency selection Using a spatial frequency representation to perform the demosaicking of images is an interesting way to see the problem, as it allows us (in the case of periodic patterns such as the Bayer CFA) to localize the chrominance components in the Fourier spectrum. We can then use this representation to select the appropriate signal for the reconstruction. Alleysson [13][66][67], in his algorithm, proposed to write a sampled image as the sum of three signals modulated by functions m defined by the Bayer CFA matrix:
$$f_{CFA}(x, y) = \sum_{k \in \{R,G,B\}} f_k(x, y)\, m_k(x, y) \tag{3.1}$$

where $m_k(x, y)$ is a function equal to 1 or 0 according to the position of color k in the Bayer matrix, and $f_k(x, y)$ is the input signal (before the Bayer subsampling):

$$m_R(x, y) = \tfrac{1}{4}(1 - \cos(\pi x))(1 + \cos(\pi y))$$
$$m_G(x, y) = \tfrac{1}{2}(1 + \cos(\pi x)\cos(\pi y)) \tag{3.2}$$
$$m_B(x, y) = \tfrac{1}{4}(1 + \cos(\pi x))(1 - \cos(\pi y))$$

Making the observation that the luminance is present at each pixel (so the luminance is present at full resolution), Alleysson proposed to rewrite equation (3.1) as the sum of a luminance and a chrominance term:

$$f_{CFA}(x, y) = \sum_{k \in \{R,G,B\}} f_k(x, y)\, p_k(x, y) + \sum_{k \in \{R,G,B\}} f_k(x, y)\, \tilde{m}_k(x, y) \tag{3.3}$$

where $\sum_k f_k(x, y)\, p_k(x, y)$, with $p_k(x, y)$ the probability that color k of the input signal is present on the sensor, is assimilated to the luminance signal, and $\sum_k f_k(x, y)\, \tilde{m}_k(x, y)$ is the chrominance signal modulated by $\tilde{m}_k(x, y)$ according to the Bayer matrix. Alleysson then proposed to design luminance and chrominance filters to extract those pieces of information. One problem is to estimate the bandwidth limits of the luminance and of the chrominance. Those limits are image dependent; Alleysson therefore decided to use a dataset of images and to optimize his filter to get the best overall filter for the image set.
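A small sketch of this forward model, under our own naming: we build the modulation functions of equation (3.2) on a pixel grid and multiply them with the full-resolution color planes to synthesize the CFA signal. This only reproduces the forward model of equation (3.1); the luminance/chrominance extraction filters themselves are the optimized part of [66] and are not reproduced here.

```python
import numpy as np

def bayer_modulation(height, width):
    """Modulation functions m_R, m_G, m_B of equation (3.2); on integer
    pixel coordinates each one is 1 where the Bayer CFA samples that
    color and 0 elsewhere."""
    y, x = np.mgrid[0:height, 0:width]
    cx, cy = np.cos(np.pi * x), np.cos(np.pi * y)
    m_r = 0.25 * (1 - cx) * (1 + cy)
    m_g = 0.50 * (1 + cx * cy)
    m_b = 0.25 * (1 + cx) * (1 - cy)
    return m_r, m_g, m_b

def make_cfa(rgb):
    """Forward model of equation (3.1): sum of the modulated color planes.
    `rgb` is a float array of shape (H, W, 3)."""
    m = bayer_modulation(*rgb.shape[:2])
    return sum(rgb[..., k] * m[k] for k in range(3))
```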
Alleysson proposed to use 11 × 11 kernels (sums of Gaussians) optimized on 12 images of the Kodak set. For a real implementation, further tests should be done, as the optimization may not be relevant for other particular scenes [27].
Figure 25: Filter used to estimate the luminance
In his algorithm, Alleysson first estimates the luminance at each pixel of the CFA image. The remaining scheme is quite similar to a smooth hue transition, except that we subtract the estimated luminance from the chrominance components and perform a bilinear interpolation. Finally, we add the luminance back to the interpolated values to recover the chrominance components, as shown in the next figure:
Figure 26: Alleysson’s algorithm
Later, building on the frequency selection algorithm, Dubois proposed to simplify the frequency-domain representation of Alleysson and introduced new algorithms which use asymmetric filters to extract the luminance and chrominance components. To simplify equation (3.2), he replaced the cosine functions in the modulation by −1 raised to the power of the pixel coordinates:

$$m_R(x, y) = \tfrac{1}{4}(1 - (-1)^x)(1 + (-1)^y)$$
$$m_G(x, y) = \tfrac{1}{2}(1 + (-1)^{x+y}) \tag{3.4}$$
$$m_B(x, y) = \tfrac{1}{4}(1 + (-1)^x)(1 - (-1)^y)$$
Substituting these modulation functions into equation (3.1), we have:

$$f_{CFA}(x,y) = \left(\tfrac{1}{4} f_R + \tfrac{1}{2} f_G + \tfrac{1}{4} f_B\right) + \left(-\tfrac{1}{4} f_R + \tfrac{1}{2} f_G - \tfrac{1}{4} f_B\right)(-1)^{x+y} + \left(-\tfrac{1}{4} f_R + \tfrac{1}{4} f_B\right)\left((-1)^x - (-1)^y\right) \tag{3.5}$$

which is equivalent, in a luminance/chrominance space, to:

$$f_{CFA}(x,y) = f_L(x,y) + f_{C1}(x,y)\,(-1)^{x+y} + f_{C2}(x,y)\left((-1)^x - (-1)^y\right) \tag{3.6}$$

or, using $-1 = e^{j\pi}$:

$$f_{CFA}(x,y) = f_L(x,y) + f_{C1}(x,y)\, e^{j2\pi(x+y)/2} + f_{C2}(x,y)\left(e^{j2\pi x/2} - e^{j2\pi y/2}\right) \tag{3.7}$$

From equation (3.6), Dubois made the observation that three components can be distinguished: a luminance component (L) present at each pixel, a chrominance component (C1) modulated at the spatial frequency (0.5, 0.5) (in cycles/pixel), and a second chrominance component (C2) modulated at (0.5, 0) and (0, 0.5). Extracting L, C1 and C2 thus allows us to estimate the RGB values:

$$\begin{pmatrix} f_L \\ f_{C1} \\ f_{C2} \end{pmatrix} = \begin{pmatrix} \tfrac{1}{4} & \tfrac{1}{2} & \tfrac{1}{4} \\ -\tfrac{1}{4} & \tfrac{1}{2} & -\tfrac{1}{4} \\ -\tfrac{1}{4} & 0 & \tfrac{1}{4} \end{pmatrix} \begin{pmatrix} f_R \\ f_G \\ f_B \end{pmatrix} \tag{3.8}$$

$$\begin{pmatrix} f_R \\ f_G \\ f_B \end{pmatrix} = \begin{pmatrix} 1 & -1 & -2 \\ 1 & 1 & 0 \\ 1 & -1 & 2 \end{pmatrix} \begin{pmatrix} f_L \\ f_{C1} \\ f_{C2} \end{pmatrix} \tag{3.9}$$

Dubois then proposed to use optimized asymmetric filters to efficiently extract the components $f_L$, $f_{C1}$ and $f_{C2}$ while avoiding the spectral overlap of the three signals. He observed that the overlap between the luminance and the chrominance mainly occurs on the horizontal or the vertical axis of the Fourier spectrum. Thus, in his algorithm, he chose to give a higher weight to the C2 component that is less affected by overlap with the luminance:

$$F_{CFA}(u,v) = F_L(u,v) + F_{C1}(u - 0.5,\, v - 0.5) + F_{C2}(u - 0.5,\, v) - F_{C2}(u,\, v - 0.5) \tag{3.10}$$
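The matrices (3.8)–(3.9) translate directly into code. The sketch below (our own naming) converts full-resolution color planes to the L/C1/C2 space and re-modulates the chrominances as in equation (3.6); the optimized asymmetric extraction filters of Dubois are not reproduced here.

```python
import numpy as np

# Forward and inverse transforms of equations (3.8) and (3.9).
RGB_TO_LC1C2 = np.array([[ 0.25, 0.5,  0.25],
                         [-0.25, 0.5, -0.25],
                         [-0.25, 0.0,  0.25]])
LC1C2_TO_RGB = np.array([[1.0, -1.0, -2.0],
                         [1.0,  1.0,  0.0],
                         [1.0, -1.0,  2.0]])

def cfa_from_lc1c2(rgb):
    """Build the CFA signal of equation (3.6) from an (H, W, 3) RGB image:
    luminance plus the two chrominances modulated by (-1)^x and (-1)^y."""
    l, c1, c2 = np.einsum('ij,yxj->iyx', RGB_TO_LC1C2, rgb)
    y, x = np.mgrid[0:rgb.shape[0], 0:rgb.shape[1]]
    sx, sy = (-1.0) ** x, (-1.0) ** y
    return l + c1 * sx * sy + c2 * (sx - sy)
```

One can verify that `LC1C2_TO_RGB @ RGB_TO_LC1C2` is the identity, i.e. the pair of matrices is consistent.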
Frequency and spatial analysis In [27][49][50], Lian exposed some limitations of the algorithms based on frequency selection. As the involved frequencies are image dependent, the optimization of an efficient filter is a little bit "tricky". Lian also made another observation: due to some limitations
Figure 27: Fourier spectrum of an image with the luminance and the 8 chrominance bands.
of the frequency selection algorithms, the high frequencies along the horizontal and vertical axes of the Fourier spectrum (where the human visual system is most sensitive) are cut by the low-pass filter. Thus, he proposed to use only the green components, which do not overlap with C2, to estimate the luminance. Due to the particular arrangement of the green pixels, Lian notes that x + y is even there, so $\cos(\pi(x+y)) = 1$ and $\sin(\pi(x+y)) = 0$, and therefore:

$$\cos(\pi x) = \cos(\pi(x + y - y)) = \cos(\pi(x+y))\cos(\pi y) + \sin(\pi(x+y))\sin(\pi y) = \cos(\pi y) \tag{3.11}$$
Thus, at green pixels, equation (3.5) can be rewritten as:

$$f_{CFA}(x,y) = \tfrac{1}{4}\left(f_R + 2 f_G + f_B\right) + \tfrac{1}{4}\left(-f_R + 2 f_G - f_B\right)\cos(\pi x)\cos(\pi y) \tag{3.12}$$

As we can see, the components on the horizontal and vertical axes cancel at green pixels and only the components in the corners remain. Lian thus proposed a technique which preserves the luminance estimation at green pixels more accurately for horizontal and vertical frequencies. Moreover, compared to the frequency selection algorithms, the low-pass filter used to estimate the luminance does not require a large support (5 × 5 for Lian versus 11 × 11 for Alleysson), which is obviously beneficial for the complexity of the algorithm. According to the results in [48], the proposed method gives a better reconstruction of the luminance components.
To get a better insight into what the Lian algorithm does, we summarize some of its key points:
Figure 28: Filters proposed by Alleysson (11 × 11) (a) and by Lian (5 × 5) (b)
1. Estimate the luminance $\hat{L}$ at the green pixels using the 5 × 5 kernel proposed in [27].

2. At green pixels, estimate the red and blue components bilinearly.

3. Using a linear weighted interpolation technique, interpolate the difference $L - R$ (or $L - B$) and extract the luminance component $\hat{L}$ according to:

$$\hat{L}(x,y) = R(x,y) + \frac{\sum_{p \in \Omega} w(p)\, \hat{L}_R(p)}{\sum_{p \in \Omega} w(p)} \tag{3.13}$$

where $\Omega = \{(x-1,y),\ (x+1,y),\ (x,y-1),\ (x,y+1)\}$ and, for $p = (x-1,y)$ at R pixels,

$$w(x-1,y) = \frac{1}{1 + |R(x,y) - R(x-1,y)| + |L(x+1,y) - L(x-1,y)|}$$

4. An iteration step is then required to refine the luminance component by updating the values of R and B as:

$$\hat{R}(x-1,y) = L(x-1,y) + \tfrac{1}{2}\left(\hat{L}_R(x,y) + \hat{L}_R(x-2,y)\right)$$

Equation (3.13) is then used again to back-project the luminance with the refined values $\hat{R}$ and $\hat{B}$.

5. Finally, perform a bilinear interpolation of the differences against the chrominance components ($L - R$, $L - G$, $L - B$) and add the luminance to each interpolated plane to recover R, G and B.
Wavelet method As related before, Gunturk [16] made the observation that the color channels are highly correlated, especially in the high frequencies. This implies that the textured zones and the edge locations are highly correlated (see properties 1 and 2). Another key observation of Gunturk is that we have twice as many green samples; the green channel is therefore less affected by aliasing and is more likely to preserve the high frequencies of the image than the red and blue channels. More generally, color artifacts are mostly caused by aliasing in the red and blue channels and by the inability of the algorithm to use the inter-channel correlation efficiently, and thus to reconstruct the missing colors with their original frequency content. To remove the aliasing, Gunturk proposed an algorithm which alternately projects the high frequencies onto two constraint sets. Those sets exploit the frequency correlation between the image channels by forcing the high frequencies of the channels to correlate, and by constraining the projection to match the observed CFA data. Using a filter bank technique, Gunturk decomposes the different channels of a CFA image into several sub-bands (LL, HL, LH, HH). We can then copy the high frequencies between channels to enforce their correlation, and reconstruct the image from the sub-bands while forcing the channels to keep the values observed in the CFA. More details on the algorithm are given in [16] and [34]. Here we summarize its main steps:

1. Estimate the R, G and B components using the Hamilton and Adams algorithm [17].

2. Subsample the G, R and B planes by two (in rows and columns) to keep only the pixels present in the CFA image (G is subsampled according to the considered line, R or B, so two subsampled G components are extracted).

3. Analysis: using the low-pass filter $h_0 = [1\ 2\ 1]/4$ and the high-pass filter $h_1 = [1\ {-2}\ 1]/4$, decompose each signal into sub-bands, $k \in \{R, G, B\}$:
$$I_k^{LL}(x,y) = h_0(x) * [h_0(y) * I_k(x,y)] \tag{3.14}$$

$$I_k^{LH}(x,y) = h_0(x) * [h_1(y) * I_k(x,y)] \tag{3.15}$$

$$I_k^{HL}(x,y) = h_1(x) * [h_0(y) * I_k(x,y)] \tag{3.16}$$

$$I_k^{HH}(x,y) = h_1(x) * [h_1(y) * I_k(x,y)] \tag{3.17}$$
4. Synthesis: update the high frequencies of the G channel with the high frequencies of the R and B channels, and use the synthesis filters to reconstruct the G channel ($g_0 = [-1\ 2\ 6\ 2\ {-1}]/8$ and $g_1 = [1\ 2\ {-6}\ 2\ 1]/8$):
$$I^{G(R)}(x,y) = g_0(x) * [g_0(y) * I^{G(R)}_{LL}(x,y)] + g_0(x) * [g_1(y) * I^{G(R)}_{LH}(x,y)] + g_1(x) * [g_0(y) * I^{G(R)}_{HL}(x,y)] + g_1(x) * [g_1(y) * I^{G(R)}_{HH}(x,y)] \tag{3.18}$$

We obtain two green planes, $I^{G(R)}(x,y)$ and $I^{G(B)}(x,y)$, according to the R and B lines.

5. Put the values of $I^{G(R)}(x,y)$ and $I^{G(B)}(x,y)$ at the right positions to reconstruct the G plane.
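A minimal sketch of one analysis/synthesis pass with the filters above (separable 1-D convolutions; the helper names are ours). The full algorithm of [16] iterates this projection while re-imposing the observed CFA samples, which is omitted here.

```python
import numpy as np
from scipy.ndimage import convolve1d

H0 = np.array([1, 2, 1]) / 4.0          # analysis low-pass
H1 = np.array([1, -2, 1]) / 4.0         # analysis high-pass
G0 = np.array([-1, 2, 6, 2, -1]) / 8.0  # synthesis low-pass
G1 = np.array([1, 2, -6, 2, 1]) / 8.0   # synthesis high-pass

def analyze(img):
    """Decompose an image into (LL, LH, HL, HH) sub-bands, eqs. (3.14)-(3.17)."""
    def sep(fx, fy):
        return convolve1d(convolve1d(img, fy, axis=0), fx, axis=1)
    return sep(H0, H0), sep(H0, H1), sep(H1, H0), sep(H1, H1)

def synthesize(ll, lh, hl, hh):
    """Reconstruct an image from its sub-bands, eq. (3.18)."""
    def sep(band, fx, fy):
        return convolve1d(convolve1d(band, fy, axis=0), fx, axis=1)
    return sep(ll, G0, G0) + sep(lh, G0, G1) + sep(hl, G1, G0) + sep(hh, G1, G1)

def project_high_freq(green, red):
    """One projection step: keep the low-pass of `red` but replace its
    high-frequency sub-bands by those of `green`."""
    ll_r, _, _, _ = analyze(red)
    _, lh_g, hl_g, hh_g = analyze(green)
    return synthesize(ll_r, lh_g, hl_g, hh_g)
```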
Figure 29: Pipeline of Gunturk's algorithm
A similar process is applied to the red and blue pixels, where the high frequencies of the green pixels are used to enforce the correlation of the details in the image. This step is iterative and converges in approximately 8 iterations. The different steps of analysis, synthesis and iteration make this algorithm impossible to implement in real time. Nevertheless, the Gunturk algorithm is widely used as a benchmark because of the very good results obtained from it. Later, building on the alternating projections of Gunturk's algorithm, Yue M. Lu [20] proposed a one-step version which substantially reduces the number of computations of the famous Gunturk algorithm. In his paper, Lu proposed to decompose a 2D image into four polyphase components (see figure 30), $X_{00}$, $X_{01}$, $X_{10}$ and $X_{11}$, where X corresponds to the components present in the Bayer matrix ($G_{00}$, $R_{01}$, $B_{10}$ and $G_{11}$). We can thus write the input signal in the z-domain [68] as:

$$X(z) = X_{00}(z^2) + z_2^{-1} X_{01}(z^2) + z_1^{-1} X_{10}(z^2) + z_1^{-1} z_2^{-1} X_{11}(z^2) \tag{3.19}$$

and expand the low-pass filter $L(z)$ (used to compute the LL sub-band in equation (3.14)) in terms of its polyphase components:

$$L(z) = L_{00}(z^2) + z_2^{-1} L_{01}(z^2) + z_1^{-1} L_{10}(z^2) + z_1^{-1} z_2^{-1} L_{11}(z^2) \tag{3.20}$$
Figure 30: Locations of the four polyphase components
Thus, to obtain the LL sub-band $Y(z) = X(z)L(z)$, we substitute the expressions of $X(z)$ and $L(z)$ and obtain a sum of 16 terms, which can be written compactly as:

$$Y(z) = \phi_L(z)\, \xi_X(z) \tag{3.21}$$

where $\phi_L(z)$ gathers the polyphase components of $L(z)$ (see [20] for its explicit expression) and $\xi_X(z)$ is the polyphase vector of the signal $X(z)$. Lu also demonstrated that Gunturk's algorithm, written as:

$$Y(\omega) = A_0(\omega)\tilde{A}_0(\omega)X(\omega) + \sum_{i=1}^{3} A_i(\omega)\tilde{A}_i(\omega)\hat{G}(\omega) \tag{3.22}$$

can be simplified to:

$$Y(\omega) = L(\omega)X(\omega) + (1 - L(\omega))\hat{G}(\omega) \tag{3.23}$$

where $(1 - L(\omega)) = \sum_{i=1}^{3} A_i(\omega)\tilde{A}_i(\omega)$ (as the set of filters satisfies the perfect reconstruction of the image; see the low-pass filters proposed by Gunturk) and $L(\omega) = A_0(\omega)\tilde{A}_0(\omega)$. From equation (3.22), Lu observed that only the low-pass filter is required to obtain $X(\omega)$ and $\hat{G}(\omega)$; the high-pass filters are not needed and the problem can be solved as in equation (3.23). A full demonstration of the algorithm is given in [20]. To have a better
insight into the algorithm, we give its overall pipeline and focus on the convergence part, which is less complex in terms of computation:
Figure 31: Lu's algorithm, where s01 and g01 are respectively the signal and the green component at position 01, and F00, F10, F11 are the polyphase filters computed for the kth iteration.
3.4.5 Other algorithms

Coherence of the direction of interpolation In [25], Zhang proposed to perform the demosaicking step by directional linear minimum mean square-error estimation (DLMMSE). He proposed a more efficient way than Hamilton and Adams by using two estimates (along the horizontal and vertical directions at green pixels). Zhang then uses the inter-channel difference and combines the two estimates with a linear minimum mean square-error estimator to perform the interpolation step. Later, Pekkucuksen [54] addressed some limitations of the DLMMSE algorithm. He observed that DLMMSE only considers the pixels contained in the same horizontal or vertical line to compute the estimator, which may be insufficient for an efficient decision. Another limitation concerns the direction selected for pixels falling near edges or in textured regions. Pekkucuksen proposed to overcome those two problems by selecting local windows in the north, south, east and west directions and taking into account all the pixels inside each window:

1. Estimate the horizontal and vertical interpolations according to the Hamilton and Adams algorithm [17].

2. For red pixels (blue pixels are estimated in a similar manner), compute the color differences $\tilde{\Delta}^H_{g,r}$ and $\tilde{\Delta}^V_{g,r}$ along the horizontal and vertical directions.

3. Then compute a weighted-sum interpolation as:
$$\tilde{\Delta}_{g,r}(i,j) = \left[ w_N\, f * \Delta^V_{g,r}(i,j) + w_S\, f * \Delta^V_{g,r}(i,j) + w_E\, \Delta^H_{g,r}(i,j) * f' + w_W\, \Delta^H_{g,r}(i,j) * f' \right] / w_T \tag{3.24}$$

where $w_T = w_N + w_S + w_E + w_W$ and $f = [1\ 1\ 1\ 1\ 1]/5$. The weights are computed according to:

$$w_N = 1 \Big/ \sum_{a=i-4}^{i} \sum_{b=j-2}^{j+2} (D^V_{a,b})^2 \tag{3.25}$$

$$w_S = 1 \Big/ \sum_{a=i}^{i+4} \sum_{b=j-2}^{j+2} (D^V_{a,b})^2 \tag{3.26}$$

$$w_W = 1 \Big/ \sum_{a=i-2}^{i+2} \sum_{b=j-4}^{j} (D^H_{a,b})^2 \tag{3.27}$$

$$w_E = 1 \Big/ \sum_{a=i-2}^{i+2} \sum_{b=j}^{j+4} (D^H_{a,b})^2 \tag{3.28}$$

where the gradients are defined on the color differences as $D^V_{i,j} = |\Delta^V_{i-1,j} - \Delta^V_{i+1,j}|$ and $D^H_{i,j} = |\Delta^H_{i,j-1} - \Delta^H_{i,j+1}|$.
4. We can then estimate the green pixels as:

$$\tilde{G}(i,j) = R(i,j) + \tilde{\Delta}_{g,r}(i,j) \tag{3.29}$$
5. Then, we estimate the blue pixels at red locations and the red pixels at blue locations according to:

$$\tilde{B}_{i,j} = \tilde{G}_{i,j} - \tilde{\Delta}_{g,b}(i-3:i+3,\ j-3:j+3) \otimes p_{rb} \tag{3.30}$$

$$\tilde{R}_{i,j} = \tilde{G}_{i,j} - \tilde{\Delta}_{g,r}(i-3:i+3,\ j-3:j+3) \otimes p_{rb} \tag{3.31}$$

with:

$$p_{rb} = \frac{1}{32}\begin{pmatrix} 0 & 0 & -1 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 10 & 0 & 10 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 10 & 0 & 10 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & -1 & 0 & 0 \end{pmatrix}$$
6. Finally, we interpolate the remaining missing red and blue pixels using a bilinear interpolation over a V4 neighborhood.

The algorithm of Pekkucuksen gives really good results thanks to its high respect of the high frequencies and the low amount of false colors it generates. It is also free of any iterative step, which makes it a relevant choice for real-time applications. A sketch of the directional weighting is given below.
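Below is a minimal sketch of the directional weighting of equations (3.25)–(3.28), under our own naming; `delta_v` and `delta_h` are assumed to hold the vertical and horizontal color-difference planes from step 2, and the small `eps` term (our addition) only guards against division by zero in perfectly flat regions.

```python
import numpy as np

def gbtf_weights(delta_v, delta_h, i, j, eps=1e-10):
    """Directional weights of eqs. (3.25)-(3.28) at pixel (i, j).
    delta_v, delta_h: vertical/horizontal color-difference planes
    (e.g. G - R estimates from the Hamilton-Adams step)."""
    def d_v(a, b):  # D^V_{a,b} = |Delta^V_{a-1,b} - Delta^V_{a+1,b}|
        return abs(delta_v[a - 1, b] - delta_v[a + 1, b])
    def d_h(a, b):  # D^H_{a,b} = |Delta^H_{a,b-1} - Delta^H_{a,b+1}|
        return abs(delta_h[a, b - 1] - delta_h[a, b + 1])
    w_n = 1.0 / (sum(d_v(a, b)**2 for a in range(i - 4, i + 1)
                                  for b in range(j - 2, j + 3)) + eps)
    w_s = 1.0 / (sum(d_v(a, b)**2 for a in range(i, i + 5)
                                  for b in range(j - 2, j + 3)) + eps)
    w_w = 1.0 / (sum(d_h(a, b)**2 for a in range(i - 2, i + 3)
                                  for b in range(j - 4, j + 1)) + eps)
    w_e = 1.0 / (sum(d_h(a, b)**2 for a in range(i - 2, i + 3)
                                  for b in range(j, j + 5)) + eps)
    return w_n, w_s, w_e, w_w
```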
4 Image Quality Assessment

4.1 Introduction
In order to assess the demosaicking performance of the algorithms, we have to define and review the different processes of image quality assessment (IQA). Over the last decades, a wide range of IQA algorithms [69] were proposed, with or without reference images and with or without a model of human perception. To assess the quality of an image, several techniques can be applied: objective or subjective measurement. The first relies on mathematical models and the measurement of an error to evaluate image quality. The second is based on numerous experiments with human observers. In this thesis we focus on the review of the objective measurement algorithms used in the field of demosaicking rather than on the subjective methods, as was done in [68] and [34]. Subjective measurements can be considered the most accurate way to quantify impairments of our visual experience, but they are harder to set up, expensive and time-consuming. For our application, subjective measurement would probably not be relevant, as most benchmarks in the industry use objective measurement techniques to qualify a camera. Before presenting the different metrics, we first define some of the inherent problems of the demosaicking process, and in the next section we select relevant datasets to define an IQA process.
4.2 Demosaicking Artefact
In our experimentation we have seen that none of the algorithms is able to reconstruct the image perfectly, as it is in the reference. We also observe that different kinds of patterns or artifacts appear and are recurrent for some algorithms. Generally, most survey papers [3][34][68] explain those artifacts in the spatial domain according to the chosen algorithm, but a more elegant way was proposed by Alleysson [66], who studies the resulting artifacts in the frequency domain.

1. Zipper effect: a well-known pattern in the demosaicking field, it looks like a high-low-high or low-high-low pattern. Usually, this effect appears on the edge between two surfaces with strong color-level transitions. The non-adaptive algorithms are the most affected by the zipper artifact, as they do not take into account the content of the image, resulting in mis-interpolation between two different color levels. By experimentation, we also observed that an image can be affected or not by zippering depending on the phase used in the Bayer pattern. We also notice that shifting the Bayer pattern changes the hue of the zipper pattern, which can be even more problematic in a video camera. As the camera moves, we can observe in some
cases this hue shifting, which results in a glitter effect that is unpleasant and eye-catching.
Figure 32: An example of zipper artifact
2. False color: another well-known artifact in demosaicking, mainly characterized by the emergence of aberrant colors without a particular pattern. This phenomenon is highly visible in areas of high spatial frequencies, where the color components of neighboring pixels are weakly correlated, which leads to mis-interpolation and a loss of correlation between the reconstructed colors of a pixel. Alleysson explains in [66] that the false color artifact results from selecting too wide a band in the high frequencies for the luminance estimation (overestimation of the chrominance), which produces a spectral overlap between chrominance and luminance. Like the zipper effect, the hue of the false color artifact is affected by the phase shift of the Bayer matrix. This artifact is even more problematic as it appears in textured zones.

3. Water color: mainly related to the false color artifact. Where false color is an overestimation of the chromatic band, the water color effect is an underestimation of it, resulting in dull colors.

4. Grid effect: generally visible when the bandwidth of the low-pass filter estimating the luminance is overestimated, which results in cross-talk between the achromatic and chromatic channels. In the spatial domain, this artifact can be seen in some edge-directed algorithms which use only a horizontal/vertical estimator and tend to struggle when the content of an object in the scene is diagonal.

5. Blurring: mainly due to the low-pass filtering in the demosaicking algorithms (like bilinear interpolation), which tends to smooth or cut the high spatial frequencies. Consequently, the image loses sharpness. Alleysson explains this phenomenon as an underestimation of the band chosen for the low-pass filter which estimates the luminance.
Figure 33: Examples of possible artefacts
4.3 Test Images
To illustrate and assess the performance of each implemented algorithm, a relevant dataset is required. Most objective measurement techniques need a reference image. According to the pipeline of the camera system, the image will be affected by the zoom response and the optical low-pass filtering (and by the performance of the demosaicking algorithm if the camera is mono-sensor). Considering that, not every image is suitable to assess the performance of the algorithms, as some deviations are introduced by the acquisition process of the camera [29]. One very popular dataset, used in most publications [29], is the Kodak image dataset, composed of 24 images of resolution 768 × 512 scanned from film at high resolution. Those images are really popular as they push the different algorithms to their limits and reveal whether they are prone to different kinds of artifacts.
Figure 34: left: Kodak images, right: McMaster images
More recently, [29] and [42] exposed the problem of demosaicking images with weak spectral correlation. Gunturk noted in [29] that most demosaicking models assume a slow variation of the colors, which is not always the case in natural scenes. This is why [42] proposed a new dataset from McMaster University in Canada. Those images were digitized from film, like the Kodak dataset, at a resolution of 2310 × 1814; because of their size, cropped regions (generally sharp regions with strong spectral transitions) were defined to get 18 images. As shown in the table of figure 35 from [42], the spectral content of the Kodak images is highly correlated. The mean saturation and gradients of the Kodak dataset appear smooth and less saturated compared to the McMaster dataset, which may also indicate some processing of the Kodak images [42]. To cover both high and low spectral correlation, it was proposed to use both datasets to assess the performance of the algorithms [29].
Figure 35: Statistics of the Kodak and the McMaster datasets from [42]
To perform the IQA step, we also propose to use computer-generated sinusoid images varying in spatial frequency and phase, which give a good indication of the frequency at which an algorithm starts to fail (measures of modulation over increasing frequencies), as well as several other synthetic images such as a slanted edge, a Fresnel function, and some special color patterns that reveal the limits of the demosaicking algorithms or, in some cases, of the IQA algorithms themselves.
4.4 Objective evaluation
This section refers to objective measurement techniques, which are mainly based on mathematical models. We distinguish three classes of IQA algorithms in this thesis. The first class contains what we call fidelity measurement techniques, which perform a pure comparison of the reference image and the resulting image (pixel-wise for most of them) without
taking into account the Human Visual System (HVS). In the second class, we introduce metrics referred to as perceptual measures, which take the HVS into account when comparing a reference image to the result. In the final class, we define frequency measurement techniques, which generally use some pattern (sinusoid, step, ...) to highlight a loss of modulation at some spatial frequencies. Among the most popular methods in the demosaicking field [34], we can identify measurement techniques which use full-reference images (Kodak dataset) and measure the signal-to-noise ratio (see the PSNR subsection) or the mean squared error (see the MSE subsection). Those methods have a low computation cost and are often used as regularization terms when synthesizing filters. But their correlation with our visual perception is limited [34][35]; thus, in the demosaicking field, we usually combine fidelity measures with measures weighted according to human perception. More rarely seen in IQA for demosaicking, we propose to use frequency measurement techniques, which give an indication of the behavior of our algorithms in terms of spatial frequency.
4.4.1 Fidelity measurement

Mean Absolute Error The Mean Absolute Error (MAE) measures the absolute variation between the reference image $I$ and the estimated image $\hat{I}$. The lower the MAE, the closer the two images. It is computed as:

$$MAE(I, \hat{I}) = \frac{1}{3XY} \sum_{k \in \{R,G,B\}} \sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} |I^k_{x,y} - \hat{I}^k_{x,y}|$$

where k is the considered channel of the image, (x, y) are the pixel coordinates and X, Y are the horizontal and vertical resolutions of the image. For 8-bit images, $0 \le MAE(I, \hat{I}) \le 255$.

Mean Square Error The Mean Square Error (MSE) measures the squared variation between the reference image $I$ and the estimated image $\hat{I}$. As with the MAE, the lower the MSE, the closer the two images. It is computed as:
$$MSE(I, \hat{I}) = \frac{1}{3XY} \sum_{k \in \{R,G,B\}} \sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} (I^k_{x,y} - \hat{I}^k_{x,y})^2$$

with the same notations as for the MAE.
For 8-bit images, $0 \le MSE(I, \hat{I}) \le 255^2$.

Peak Signal-to-Noise Ratio The peak signal-to-noise ratio (PSNR) is a measure especially used to assess the performance of an encoder in data coding and compression. In the literature, the PSNR is generally preferred to assess the quality of demosaicking algorithms. Here, the higher the PSNR, the better the quality of the estimated image. The PSNR is computed as:

$$PSNR(I, \hat{I}) = 10 \log_{10}\left(\frac{d^2}{MSE(I, \hat{I})}\right)$$

where d depends on the number of bits used to encode a pixel (e.g., for 8-bit images, $d = 2^8 - 1 = 255$). Generally, the average PSNR of the different algorithms lies between 30 dB and 40 dB on the Kodak dataset.

Cross-Correlation In [34] it was proposed to use a cross-correlation technique for the IQA of their demosaicking algorithm. The cross-correlation measures the covariance/similarity of a reference signal against the reconstructed signal:
$$C(I, \hat{I}) = \left| \frac{\left(\sum_{x=0}^{X-1}\sum_{y=0}^{Y-1} I_{x,y}\, \hat{I}_{x,y}\right) - XY \mu \hat{\mu}}{\left[\left(\sum_{x=0}^{X-1}\sum_{y=0}^{Y-1} I^2_{x,y}\right) - XY\mu^2\right]^{\frac{1}{2}} \left[\left(\sum_{x=0}^{X-1}\sum_{y=0}^{Y-1} \hat{I}^2_{x,y}\right) - XY\hat{\mu}^2\right]^{\frac{1}{2}}} \right|$$

where $\mu$ and $\hat{\mu}$ are the average gray levels of a color component of the reference and demosaicked images, and X and Y are the numbers of columns and rows. The correlation is computed for each color plane and then averaged to obtain the overall correlation of the two images. The result falls between 0 and 1: the better the reconstruction, the closer the result is to 1. This metric is used more sparsely in publications; we think the main reason is the overall complexity of the cross-correlation, and another reason could be that error-based methods like PSNR or MSE are preferred by their users for their simplicity.
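For reference, a compact implementation of these fidelity measures is straightforward; the sketch below (our own naming) assumes `ref` and `est` are float arrays of shape (H, W, 3).

```python
import numpy as np

def mae(ref, est):
    """Mean absolute error over all pixels and channels."""
    return np.mean(np.abs(ref - est))

def mse(ref, est):
    """Mean squared error over all pixels and channels."""
    return np.mean((ref - est) ** 2)

def psnr(ref, est, d=255.0):
    """Peak signal-to-noise ratio in dB; d is the peak value (255 for 8-bit)."""
    return 10.0 * np.log10(d ** 2 / mse(ref, est))

def cross_correlation(ref, est):
    """Normalized cross-correlation, averaged over the color planes."""
    scores = []
    for k in range(ref.shape[2]):
        a, b = ref[..., k], est[..., k]
        num = np.sum(a * b) - a.size * a.mean() * b.mean()
        den = np.sqrt(np.sum(a**2) - a.size * a.mean()**2) * \
              np.sqrt(np.sum(b**2) - b.size * b.mean()**2)
        scores.append(abs(num / den))
    return float(np.mean(scores))
```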
Structural Similarity Index (SSIM) Similar to the cross-correlation metric, Zhou Wang [70] proposed to measure structural distortion, because the HVS is more prone to see structural deformations of a scene rather than pixel errors. Denoting $x = (x_i \mid i = 1, 2, \dots, N)$ the reference image vector and $y = (y_i \mid i = 1, 2, \dots, N)$ the demosaicked image vector:

$$SSIM = \frac{(2\bar{x}\bar{y} + C_1)(2\sigma_{xy} + C_2)}{(\bar{x}^2 + \bar{y}^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

where $\bar{x}$ and $\bar{y}$ are the averages of the x and y vectors, $\sigma_x^2$ and $\sigma_y^2$ their variances, and $\sigma_{xy}$ their covariance. The measured value lies between 0 and 1 and, as for the cross-correlation, the best quality value is 1 (meaning that the two signals are identical).
4.4.2 Perceptual Measure
To give a better insight into how the human visual system perceives color differences, some metrics and dedicated color spaces were designed to assess the overall image quality.

∆E CIELAB The CIELAB color space (CIE L*, a*, b*) was born from the willingness to design a colorimetric space that models the way our visual system encodes a signal. CIELAB is also considered a uniform color space in terms of differential color perception: two equal distances correspond to the same perceived variation of a color. The conversion of RGB tristimulus values to the L*a*b* space is done by first applying the linear conversion from RGB to CIE XYZ, and then converting the XYZ values to L*a*b* with the following relationships:

$$L^* = \begin{cases} 116 \left(\dfrac{Y}{Y_W}\right)^{1/3} - 16 & \text{if } \dfrac{Y}{Y_W} > 0.008856 \\[2mm] 903.3 \left(\dfrac{Y}{Y_W}\right) & \text{if } \dfrac{Y}{Y_W} \le 0.008856 \end{cases}$$

and

$$a^* = 500 \left(f\!\left(\frac{X}{X_W}\right) - f\!\left(\frac{Y}{Y_W}\right)\right), \qquad b^* = 200 \left(f\!\left(\frac{Y}{Y_W}\right) - f\!\left(\frac{Z}{Z_W}\right)\right)$$

where:

$$f(x) = \begin{cases} \sqrt[3]{x} & \text{if } x > 0.008856 \\[1mm] 7.787\,x + \dfrac{16}{116} & \text{if } x \le 0.008856 \end{cases}$$
$(X_W, Y_W, Z_W)$ are the coordinates of the reference white. Looking closer at the formulas, we can see that the CIELAB color space organizes our visual perception as three opponent components: black/white, red/green and yellow/blue. The distance between two colors is defined as a simple Euclidean distance; the lower the ∆E value, the smaller the perceived difference:
$$\Delta E^{L^*a^*b^*}(I, \hat{I}) = \frac{1}{XY} \sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} \sqrt{\sum_{k \in \{L^*, a^*, b^*\}} (I^k_{x,y} - \hat{I}^k_{x,y})^2}$$
Generally, it is said that if ∆E < 2 the color difference is hard to perceive, and if ∆E > 4 the color difference is large and easily perceived. As the CIELAB space has some perceptual non-uniformities, other ∆E measures (CIE 1994, CIE 2000, CMC) were introduced, which add specific weighting parameters for each component of the tristimulus values.
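A sketch of the XYZ-to-L*a*b* conversion and the average ∆E computation follows, using the formulas above; the array layout and function names are our own convention.

```python
import numpy as np

def xyz_to_lab(xyz, white):
    """CIE XYZ -> L*a*b*; xyz has shape (H, W, 3), white = (Xw, Yw, Zw)."""
    t = xyz / np.asarray(white)
    f = np.where(t > 0.008856, np.cbrt(t), 7.787 * t + 16.0 / 116.0)
    l = np.where(t[..., 1] > 0.008856,
                 116.0 * np.cbrt(t[..., 1]) - 16.0,
                 903.3 * t[..., 1])
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([l, a, b], axis=-1)

def delta_e_mean(lab_ref, lab_est):
    """Average CIE76 Delta-E between two L*a*b* images."""
    return float(np.sqrt(np.sum((lab_ref - lab_est) ** 2, axis=2)).mean())
```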
S-CIELAB As related before, the ∆E CIELAB measures the difference between two colors according to our visual perception. S-CIELAB simply introduces a spatial extension of this perception. As proposed in [36][37], the RGB tristimulus values are first converted into a device-independent format, CIE XYZ. The XYZ values are then represented in an opponent color space $(A, C_1, C_2)$ according to:

$$\begin{pmatrix} A \\ C_1 \\ C_2 \end{pmatrix} = \begin{pmatrix} 0.297 & 0.72 & -0.107 \\ -0.449 & 0.29 & -0.077 \\ 0.086 & -0.59 & 0.501 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix}$$

After the conversion, we define the filters which approximate the contrast sensitivity functions of the human eye for each of the A, C1, C2 components according to:
$$\mathrm{filter} = k \sum_i w_i E_i$$
with

$$E_i = k_i\, e^{-(x^2 + y^2)/\sigma_i^2}$$

where k and $k_i$ normalize the sums of the kernels to one, $w_i$ are weighting parameters, and $\sigma_i$ are the spreads of the Gaussian functions, defined according to our visual sensitivity in each channel of the opponent color space. Then, by convolution, we filter each channel with the corresponding kernel and go back to the CIE XYZ color space according to:

$$\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} 0.297 & 0.72 & -0.107 \\ -0.449 & 0.29 & -0.077 \\ 0.086 & -0.59 & 0.501 \end{pmatrix}^{-1} \begin{pmatrix} A \\ C_1 \\ C_2 \end{pmatrix}$$

Finally, we compute a ∆E measure (after converting XYZ to L*a*b*) as explained in the previous paragraph.
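A rough sketch of the S-CIELAB pre-filtering stage follows, under our own parameter names. Two caveats: the real weights and spreads are tabulated in [36][37] and depend on the viewing conditions, so the `sigmas` below are placeholders only, and we collapse each sum of Gaussians into a single Gaussian per channel for brevity.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Opponent-space transform used by S-CIELAB.
XYZ_TO_OPP = np.array([[ 0.297,  0.72, -0.107],
                       [-0.449,  0.29, -0.077],
                       [ 0.086, -0.59,  0.501]])

def scielab_prefilter(xyz, sigmas=(1.5, 3.0, 3.0)):
    """Filter an XYZ image in opponent space with per-channel Gaussian
    blurs approximating the contrast sensitivity functions, then return
    to XYZ. `sigmas` (pixels) are placeholder spreads, not calibrated."""
    opp = np.einsum('ij,yxj->yxi', XYZ_TO_OPP, xyz)
    for c, s in enumerate(sigmas):
        opp[..., c] = gaussian_filter(opp[..., c], sigma=s)
    return np.einsum('ij,yxj->yxi', np.linalg.inv(XYZ_TO_OPP), opp)
```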
Normalised Criterion Difference (NCD) The NCD is another perceptual measure, used more sparsely in the demosaicking field. It is expressed as the perceptual color error between the reference and output color vectors, normalized by the magnitude of the reference image:

$$NCD(I, \hat{I}) = \frac{\displaystyle\sum_{x=0}^{X-1}\sum_{y=0}^{Y-1} \sqrt{\sum_{k \in \{L,a,b\}} (I^k_{x,y} - \hat{I}^k_{x,y})^2}}{\displaystyle\sum_{x=0}^{X-1}\sum_{y=0}^{Y-1} \sqrt{\sum_{k \in \{L,a,b\}} (I^k_{x,y})^2}}$$

The value of the NCD is between 0 and 1. A value close to 0 indicates good demosaicking quality; 1 is the worst.
4.5 Frequency domain measure
Modulation Transfer Function The Modulation Transfer Function (MTF) is a method of assessment of a signal processing set-up and also a way to measure the response of an optical system. Given an input sine signal, we can measure, in the frequency domain (frequency measured in cycles or line pairs per unit of distance: millimeters, inches, pixels, or image height), the variation/degradation of the contrast of the pattern as a function of spatial frequency. If we assign a value of zero to the black bars and a maximum value to the white ones, we can express the MTF as:

$$MTF = \frac{I_{max} - I_{min}}{I_{max} + I_{min}}$$
Figure 36: left: Input sine image, right: Output sine image with affected contrast
Figure 37: Example of computation of the MTF
The MTF is an important tool because it indicates the ability of a system to transfer contrast at a particular resolution from the object to the image. We thus have the information of resolution and contrast in a single measure: as the frequency of the pattern increases, the system has more and more difficulty transferring the decreasing contrast.
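A minimal sketch of a contrast-based MTF measurement on synthetic sine patterns; the names and the per-frequency loop are our own. In practice, MTF measurements are more often done with the slanted-edge method, which is what we use later for the pipeline.

```python
import numpy as np

def michelson_contrast(signal):
    """Michelson contrast (I_max - I_min) / (I_max + I_min)."""
    return (signal.max() - signal.min()) / (signal.max() + signal.min())

def mtf_from_sines(process, freqs, n=512):
    """Measure the contrast transfer of `process` (any image pipeline,
    e.g. OLPF + Bayer sampling + demosaicking on one channel) for a set
    of spatial frequencies, given in cycles/pixel."""
    mtf = []
    x = np.arange(n)
    for f in freqs:
        pattern = 0.5 + 0.5 * np.sin(2 * np.pi * f * x)  # 1-D sine
        image = np.tile(pattern, (n, 1))                 # vertical bars
        out = process(image)
        mtf.append(michelson_contrast(out) / michelson_contrast(image))
    return np.array(mtf)
```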
4.6 Artefact analysis

4.6.1 Blurring Measure
In [71], Marziliano explains a way to measure blur and ringing artifacts, applied to JPEG2000. The proposed method can work with or without a reference image: if the reference is known, we compare the blurring measure of the output image against that of the reference; otherwise, we compare the same output image under different processings. In his algorithm, Marziliano estimates the blur in an image from the horizontal gradient (Sobel filter). After thresholding the gradient, we pick a line as a profile and find the left and right extrema around each edge, as illustrated in figure 38. We then measure the width between the extrema, and the average of the estimated widths gives a global measure of how much the image is affected by blur. Finally, we repeat the same process on the reference image (if available) and compare it to the output image to estimate the introduced blur. (Note: the pixels estimated from the gradient should be the same in the reference and output images, so that the blurring measure is performed at the same positions.) As explained in [38], the JPEG2000 artifacts only appear along the horizontal and vertical directions, which is probably not representative of the possible blur directions in a demosaicked image. To overcome this problem, [38] proposed a no-reference
Figure 38: Profile of an edge used to measure the blur in an image
algorithm to estimate the overall blur in an image. A profile is drawn according to the gradient magnitude and phase (direction), a measure of the slope is then computed and summed with the other profile measures, and an average value is given at the end. Here is the framework proposed in [38]:
Edge Slope Measure:

1. Reduce the noise by using a Gaussian filter.

2. Compute the horizontal and vertical derivatives:

$$\frac{\partial}{\partial x} = \begin{pmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{pmatrix} \quad \text{and} \quad \frac{\partial}{\partial y} = \begin{pmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}$$

3. Compute the gradient magnitude and phase at each pixel:

$$g = \sqrt{\left(\frac{\partial I}{\partial x}\right)^2 + \left(\frac{\partial I}{\partial y}\right)^2} \quad \text{and} \quad \theta = \tan^{-1}\left(\frac{\partial I / \partial y}{\partial I / \partial x}\right)$$

4. Along the gradient direction, take the local gradient maximum as an edge point.

5. Draw an edge profile by finding the local extrema around the edge point.

6. Compute the slope of that edge profile:

$$S(i,j) = \frac{\Delta_y(i,j)}{\Delta_x(i,j)}$$

where $\Delta_x$ and $\Delta_y$ are the edge width and height.

7. Average the sum of the absolute values of the slopes over the number of edge points:

$$SM = \frac{\sum_{(i,j) \in N} |S(i,j)|}{N}$$

where N is the number of edge points.

From our observations, we found that the blurring measure can be affected by false colors (as on the pickets of the lighthouse in figure 49) which appear in high-frequency areas of the image. There is thus a possible misinterpretation of the proposed sharpness measure.
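A condensed sketch of the idea follows, under our own naming. Steps 5–6 are simplified to a Marziliano-style horizontal edge-width scan (the length of the monotone run around the edge point), which is an approximation of the direction-aware profile of [38].

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def edge_points(img, thresh=0.2):
    """Steps 1-4: denoise, compute gradient magnitude/phase, keep the
    strongest responses as edge points."""
    smooth = gaussian_filter(img, sigma=1.0)
    gx, gy = sobel(smooth, axis=1), sobel(smooth, axis=0)
    mag = np.hypot(gx, gy)
    theta = np.arctan2(gy, gx)
    return mag > thresh * mag.max(), theta

def horizontal_edge_width(row, j):
    """Length of the monotone (non-strict) run of `row` around column j,
    i.e. the width between the enclosing local extrema."""
    rising = row[min(j + 1, len(row) - 1)] >= row[j]
    left, right = j, j
    while left > 0 and (row[left] >= row[left - 1]) == rising:
        left -= 1
    while right < len(row) - 1 and (row[right + 1] >= row[right]) == rising:
        right += 1
    return right - left

def blur_measure(img, thresh=0.2):
    """Average edge width over the detected edge points (lower = sharper)."""
    edges, _ = edge_points(img, thresh)
    widths = [horizontal_edge_width(img[i], j)
              for i, j in zip(*np.nonzero(edges))]
    return float(np.mean(widths)) if widths else 0.0
```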
4.6.2 False color measure
In [38], it was also proposed to use the edge points detected by the Edge Slope Measure (see paragraph 4.6.1) to estimate the amount of false colors, using a constant color difference model:
False Color Measure:

1. Estimate the edge points in the same way as for the Edge Slope Measure.

2. At each edge point (i, j), compute the difference between the $G_{i,j}$ and $R_{i,j}$ channels: $H_{i,j} = G_{i,j} - R_{i,j}$.

3. For each edge point (i, j), compute the median value $M_{i,j}$ of $(G - R)$ in a 5 × 5 neighborhood.

4. Compute the MSE between $H_{i,j}$ and $M_{i,j}$:

$$FCM_R = \frac{\sum_{(i,j) \in N} (H_{i,j} - M_{i,j})^2}{N}$$

where N is the number of edge points. The same measure can be applied to the blue channel by replacing R with B. In zones of high frequency, like the pickets of the lighthouse, false colors can also appear in textured zones and not only on the edges; applying this scheme only on the edges will therefore under-estimate the false colors generated in textured zones.
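A sketch of this measure under our naming, reusing the `edge_points` helper from the blur-measure sketch above; `rgb` is a float (H, W, 3) array.

```python
import numpy as np
from scipy.ndimage import median_filter

def false_color_measure(rgb, channel=0, thresh=0.2):
    """FCM of [38]: MSE between the G-R (or G-B, with channel=2)
    difference and its 5x5 median, accumulated over edge points only."""
    g = rgb[..., 1]
    h = g - rgb[..., channel]         # constant color difference model
    m = median_filter(h, size=5)      # 5x5 median of the difference
    edges, _ = edge_points(g, thresh) # helper from the blur measure
    n = edges.sum()
    return float(np.sum((h[edges] - m[edges]) ** 2) / n) if n else 0.0
```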
4.6.3 Zipper measure
In [72], Lu and Tan proposed an algorithm to detect the perceptual percentage of zipper in an image. This measure needs a reference; for them, the zipper is characterized, for a middle pixel and its closest neighboring pixel in the reference image, by an increase of the color difference between the two same pixels in the demosaicked image.

1. Identify the closest neighboring pixel I of the middle pixel P in a 3 × 3 window of the reference image: $I = \min_{i \in N} \Delta E_{ab}(P, i)$.

2. Compute the color difference $\Delta E$ between the middle pixel P and its closest neighbor I in the reference image.

3. Compute the color difference $\hat{\Delta E}$ between the same pixels P and I in the demosaicked image.

4. Finally, compute the variation of the color difference between the reference and the demosaicked images:

$$\psi = |\hat{\Delta E}_{ab}(P, I) - \Delta E_{ab}(P, I)|$$
A threshold value of 2.3 is then picked, which is the human perception threshold for color differences. In his thesis [34], Yang explains that the algorithm tends to over-detect pixels, as shown in figure 39: following the algorithm described above, the middle pixel P is flagged as a zipper artifact because the color difference between P and P′ increases from the reference image to the demosaicked image, although the color of P itself is not affected at all.
Figure 39: Example of over-detection of zipper
Observing that the zipper artifact is mainly present on horizontal/vertical edges (because most interpolation techniques interpolate along the horizontal/vertical directions) and is characterized by an alternating pattern due to the mis-interpolation of the green pixels, Yang proposed some modifications of the algorithm of Lu and Tan. He estimates the zipper pattern in the horizontal and vertical directions using the green channel:

1. First he estimates the homogeneity of the green level in the horizontal and vertical directions in a 3 × 3 window, through directional variances:

$$\sigma^x(P) = \frac{1}{3}\sum_{i=-1}^{1} \left(I^G_{x+i,y} - \mu^x(P)\right)^2 \quad \text{and} \quad \sigma^y(P) = \frac{1}{3}\sum_{i=-1}^{1} \left(I^G_{x,y+i} - \mu^y(P)\right)^2$$

He then chooses the direction where the green levels are the closest to the middle one (lowest variance among the horizontal and vertical directions):

$$\delta = \arg\min_{d \in \{x, y\}} \sigma^d(P)$$

2. Then, along the chosen direction, he tries to measure an alternating pattern ("high-low-high" or "low-high-low") in the reference image:

$$\alpha^x(I, P) = |I_{x-1,y} - I_{x,y}| + |I_{x,y} - I_{x+1,y}| - |I_{x-1,y} - I_{x+1,y}|$$
$$\alpha^y(I, P) = |I_{x,y-1} - I_{x,y}| + |I_{x,y} - I_{x,y+1}| - |I_{x,y-1} - I_{x,y+1}|$$

If there is an alternating pattern, the result is strictly greater than zero, and zero otherwise.
3. He then computes and compares the amplitude of the alternating pattern in the demosaicked image to that of the reference image. If the amplitude in the demosaicked image is higher, the "high-low-high" or "low-high-low" pattern was amplified, so the considered pixel suffers from zippering.

4. Finally, he applies a modified version of the Lu and Tan algorithm to find the closest pixel P′ to the middle pixel P along the chosen direction; from the second step of Lu and Tan onward, the same method is followed. A sketch of the alternation test is given below.

After some tests of the algorithm, we found that zippering is not detected on the diagonals. The assumption that zipper is more likely to appear on the edges of objects holds because many algorithms interpolate along the horizontal or vertical directions, but for some of them this is not the only case. Thus, if the image contains some diagonal content, it becomes almost impossible to distinguish an algorithm interpolating along the horizontal, vertical and diagonal directions (which may be less affected by zipper) from an algorithm taking into account only the horizontal and vertical (and more affected by zipper in the diagonal).
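Here is a sketch of the alternating-pattern test of step 2 and the amplification check of step 3, with our own naming; the direction selection of step 1 and the Lu and Tan localisation of step 4 are omitted.

```python
import numpy as np

def alternation(v):
    """Strictly positive iff the triple v = (a, b, c) alternates
    (high-low-high or low-high-low); zero for monotone transitions."""
    a, b, c = float(v[0]), float(v[1]), float(v[2])
    return abs(a - b) + abs(b - c) - abs(a - c)

def is_zipper(ref_g, dem_g, x, y, horizontal=True):
    """Step 3: flag (x, y) if the alternating pattern of the green
    channel is amplified in the demosaicked image w.r.t. the reference."""
    if horizontal:
        r, d = ref_g[x - 1:x + 2, y], dem_g[x - 1:x + 2, y]
    else:
        r, d = ref_g[x, y - 1:y + 2], dem_g[x, y - 1:y + 2]
    return alternation(d) > alternation(r)
```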
Figure 40: Example of pattern considered as zipper
When we get a high-frequency pattern like the one on the left of the image above, we would identify this artifact as zipper by following the algorithm, but the pattern looks more like a grid artifact than a zipper artifact; in this case the interpretation depends on how people identify the zipper.
4.6.4 Alternating hue in a video
From our observations, we saw that shifting an image by about one pixel horizontally, vertically or diagonally may change the hue of some artifacts, especially visible on false color or zipper artifacts, which results in a glitter effect. This glitter effect is highly visible in a video: when the phenomenon happens, the HVS tends to focus on those zones and sees only the artifact. In order to measure this effect, we propose a simple algorithm to locate the regions and give an insight into how much the image is affected by the changing hue. We propose to measure the average ∆E over several appropriate translations (diagonal, horizontal, vertical) of the image to
Figure 41: Example of alternating hue obtained by shifting the phase of the Bayer matrix
give an insight into this glitter effect.

1. First, we simulate all the possible phases of the Bayer pattern from a reference image.

2. Then, we apply the chosen demosaicking algorithm to the four images to obtain $I_{GR}$, $I_{GB}$, $I_{RG}$, $I_{BG}$, where GR, GB, RG, BG denote the first two pixels of the Bayer pattern.

3. Next, we measure ∆E for the different possible transitions: $\Delta E(I_{GR}, I_{RG})$, $\Delta E(I_{GR}, I_{BG})$, $\Delta E(I_{GR}, I_{GB})$, $\Delta E(I_{RG}, I_{BG})$, $\Delta E(I_{RG}, I_{GB})$, $\Delta E(I_{GB}, I_{BG})$.

4. Finally, we average all the ∆E values to estimate the glitter. A sketch of this measure is given at the end of this section.

For later work, we aim at a no-reference evolution of this algorithm. We observed during different experimentations that the optical flow (Farneback optical flow) is completely distorted by the alternation of hue in the image. We also found an interesting idea in [40], which detects temporal aliasing by using an n-level wavelet decomposition to perform an accurate motion estimation in video signals. By using this temporal aliasing and an appropriate wavelet decomposition, we may identify the sub-bands affected by the temporal aliasing (and thus the alternating hue pattern) in the video without prior knowledge of the Bayer pattern or the reference image.
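A sketch of the proposed glitter measure, assuming `mosaic(img, phase)` samples the reference with a given Bayer phase, `demosaic` is the algorithm under test, `to_lab` converts RGB to L*a*b*, and `delta_e_mean` is the ∆E helper defined earlier; all names are ours.

```python
from itertools import combinations

def glitter_measure(reference, mosaic, demosaic, to_lab):
    """Average Delta-E over all pairs of Bayer-phase reconstructions."""
    phases = ['GR', 'GB', 'RG', 'BG']      # first two pixels of the pattern
    recons = {p: to_lab(demosaic(mosaic(reference, p))) for p in phases}
    pairs = list(combinations(phases, 2))  # the six possible transitions
    return sum(delta_e_mean(recons[a], recons[b])
               for a, b in pairs) / len(pairs)
```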
4.7 Conclusion
In this chapter, we have seen several types of metrics to assess the quality of the different algorithms. We first identified the different problems related to demosaicking, the most important being the false color, blur and zipper artifacts. Those artifacts were explained in numerous papers, spatially in [34][35] and in terms of frequency by Alleysson [66]. We then proposed to use two different datasets to assess the performance of the algorithms. The first one is the well-known Kodak dataset, used in almost all papers
of the demosaicking field and characterized by a high spectral correlation. A second dataset was proposed to address the problem of demosaicking with low spectral correlation. Finally, we proposed to use some synthetic images to perform measures in the frequency domain.

We then switched to the objective evaluation measures, where we introduced the basic fidelity measures. MAE, MSE and PSNR are really popular measures (not only in the field of demosaicking) because of their low complexity, which makes them easy to use as regularization terms in optimization problems. But those measures only give a global estimate of the fidelity of the reconstruction: an image with "high" errors in a localized zone can be equivalent, in terms of the measure, to an image with "small" errors spread everywhere, while according to our perception a high error on the order of a pixel can be less problematic than errors over the whole image. That is why those metrics were highly criticized in the past, as they do not take the HVS into account. To answer this problem, the perceptual measures were introduced, which use a perceptually uniform space to measure the errors according to our visual system; but, according to [34], their results remain far from those produced by subjective assessment methods. In order to extend those measures and to identify clearly from which spatial frequency an algorithm starts to struggle, we proposed to use frequency measurements with the well-known Modulation Transfer Function (MTF).

Finally, we presented different ways to measure how much an image is affected by the different artifacts: methods to assess the overall blurring of the image, a framework to assess the amount of false colors, and the zipper effect. All those measures give a good idea of the amount of artifacts affecting the image, but because of their underlying assumptions they may be unreliable in some particular cases. We finished with the proposition of an algorithm, and a possible evolution of it, as a new no-reference measure to assess temporal aliasing and alternating hue in an image. At this point, the results of the algorithm do not completely reflect our perception of a video: for example, the alternating hue on the pickets catches our eyes much more than alternating hue located on a few pixels of the grass.
5 Result

5.1 Methodology
In order to assess the performance of the demosaicking algorithms, we propose to model each element of the acquisition chain, so as to obtain an interpretation of the quality of the algorithms that correlates with the frequency content of a real scene. In many publications, the assessment of the algorithms is limited to a classical PSNR measure or sometimes a ∆E; the assessment is thus limited to a comparison between the reconstructed image and a reference image. In this thesis, we propose to evaluate the full chain by assessing the impact of each block of the chain on the demosaicking performance for different configurations. Figure 42 illustrates the processing pipeline applied:
Figure 42: Pipeline of image quality assessment
A scene shot by a camera is not sampled like an image, so we propose to upsample the input image to model more finely the influence of the PSF and OLPF beyond the sampling frequency of the sensor; without it, we could not properly highlight the phenomenon of spectrum overlap. We then simulate the effect of the optical system. The first thing to model is the optical response of the zoom, also called the point spread function (PSF). The PSF is represented by an Airy function (as a first approximation) and has a low-pass effect on the image. It was simulated according to some weighted spectral bands in the visible domain and the properties of a zoom.
Figure 43: Airy function
The next part to simulate is the Optical Low-Pass Filter (OLPF). For an incident beam, the OLPF splits it into two or four beams (depending on the displacement introduced by the plate).
The displacements (horizontal, vertical, diagonal or combined directions) are simulated in our program and supposed to be perfect: we assume that the beam is perfectly split into four beams of equal intensity for a horizontal and vertical displacement, and that for a displacement of 1 pixel (for example), the deviated beam falls exactly on the considered pixel. In our camera simulation, we study the influence of an OLPF on the reconstructed images and on the MTF (see figure 46).
Figure 44: Example of the displacement introduced by an OLPF
In order to simulate the acquisition by the sensor, and according to our assumption on the input image (ten times more resolved than the output image), we first integrate the sampling of the sensor by averaging a 10 × 10 window of the input image to reconstruct one pixel. When all the pixels are sampled, we sample the image once more according to the Bayer pattern. The Bayer sampling is done in the same manner as expressed in the literature [35], by selecting the color component present in the CFA. Then, we apply a demosaicking algorithm to reconstruct the RGB components. We implemented different state-of-the-art algorithms, as well as other algorithms which are as good as the best state-of-the-art ones. Finally, we perform measurements according to the techniques proposed above. We provide results, as done in the literature, in terms of fidelity measures such as the PSNR (computed over the whole image and for each RGB component) and in terms of perceptual measures such as the ∆E; for the slanted-edge measure, we use the frequency domain measure. Because of the drawbacks discussed in chapter 4, we prefer to avoid the artifact analysis methods, judged too dependent on the image content. Note: for all the algorithms, we propose to crop 10 pixels on each side of the image to avoid edge effects. A sketch of the sensor-sampling stage is given below.
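A minimal sketch of the sensor-sampling stage of this pipeline (our naming, assuming the input image is 10× oversampled as described; the PSF and OLPF stages are omitted).

```python
import numpy as np

def sensor_integrate(oversampled, factor=10):
    """Average non-overlapping factor x factor windows: one sensor pixel
    integrates a 10 x 10 block of the oversampled scene. Dimensions are
    assumed to be multiples of `factor`."""
    h, w, c = oversampled.shape
    blocks = oversampled.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

def bayer_sample(rgb):
    """Keep only the CFA color at each pixel (GRBG phase shown; the other
    phases are obtained by shifting the pattern by one pixel)."""
    cfa = np.zeros(rgb.shape[:2])
    cfa[0::2, 0::2] = rgb[0::2, 0::2, 1]   # G
    cfa[0::2, 1::2] = rgb[0::2, 1::2, 0]   # R
    cfa[1::2, 0::2] = rgb[1::2, 0::2, 2]   # B
    cfa[1::2, 1::2] = rgb[1::2, 1::2, 1]   # G
    return cfa
```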
5.2 Experimentation and observation
Our first approach was to study the impact of the different elements of the camera on the demosaicking step. We thus propose to study the influence of the zoom on the frequency content transferred through the pipeline. In this experimentation, we used the PSF of an internal Thales zoom (focused at infinity), an OLPF introducing a displacement of 1 pixel horizontally and vertically (horizontal and vertical low-pass filtering), the sampling method described above, and a demosaicking step which gives high performance. To perform the measure, we use a slanted edge as input image with a resolution of 19200 × 10800 (sampled at 0.5 µm) to get a processed image of 1920 × 1080 (sampled at 5 µm). Then, we compute the MTF for each color channel:
Figure 45: MTF of the pipeline with and without the PSF
From those measurements, we observe the effect of the PSF on the slanted edge. The three MTF measures over the three color channels are really similar with or without the PSF. We also identify a slight chromatism of the lens; the effect is especially visible on the MTF of the blue channel, where the modulation is a little more important compared to the MTF of the red and green. This experimentation could thus be an interesting way to simulate zooms, to identify problems (such as chromatism) and/or to compare the MTF of several optics. For instance, our observations show that the main source of loss in modulation comes from the OLPF; the PSF changes almost nothing in the output image. This observation led us to simplify the pipeline: as the PSF does not affect the frequency content much in comparison with the OLPF, we temporarily removed the PSF from our measurement pipeline. We also point out a weakness of our measurement pipeline: as we assume an input image at ten times the resolution of the sensor, the resulting image is impossible to compare to the input image with objective measures (computed pixel-wise) because of the difference of resolution. To study the next component (the OLPF) and its impact on the chain, we assume that the input image is already discretized. From the input image, we introduce the OLPF by a horizontal and vertical separation of one pixel, as a real OLPF does. Then, we perform the sampling step according to the Bayer matrix. Finally, we choose a demosaicking
to reconstruct a full color image. The goal of this experiment is to see how well the algorithm performs as a function of the OLPF used in the camera pipeline. Introducing OLPFs with different displacements tells us how an algorithm behaves as a function of the frequency content of the image. We also measure the modulation of a slanted edge after the OLPF and the demosaicking algorithms. We perform measures without the OLPF and with an OLPF at three different displacements (0.5, 1 and 1.5 pixels, horizontally and vertically).
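The OLPF model used here can be sketched as a four-spot birefringent filter: the image is averaged with copies of itself displaced by d pixels horizontally and vertically. The sketch below, under that assumption, handles the sub-pixel displacements (0.5 and 1.5 pixels) with linear interpolation; the function name `apply_olpf` is ours, and it operates on one channel at a time.

```python
import numpy as np
from scipy.ndimage import shift

def apply_olpf(channel, d):
    """Average four copies displaced by (0,0), (d,0), (0,d), (d,d) pixels."""
    spots = [(0.0, 0.0), (d, 0.0), (0.0, d), (d, d)]
    return sum(shift(channel, s, order=1, mode='nearest') for s in spots) / 4.0

# Example: the configurations measured above, plus no filter.
# for d in (0.0, 0.5, 1.0, 1.5):
#     filtered = apply_olpf(slanted_edge, d)
```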
Figure 46: MTF of the different configurations. Top left: without OLPF; top right: OLPF with 0.5 pixel; bottom left: 1 pixel; bottom right: 1.5 pixels of displacement, horizontally and vertically
We also perform objective measures (averaged over the four phases of the Bayer CFA) on the Kodak dataset filtered by the OLPF in the four configurations defined above, and present the average results in terms of PSNR and ∆E; the per-image results are given in the annex. From those measures, we can see that the performance of the different algorithms increases (more or less, depending on the algorithm) with the displacement introduced by the OLPF. However, the objective measures have to be correlated with the MTF measures: the higher the displacement, the more we lose in contrast and thus in the high frequencies present in the image. On the one hand, if we lose too many high frequencies, the image appears blurry; on the other hand, keeping too many high frequencies leads to aliasing, in accordance with the Whittaker-Shannon sampling theorem.
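A sketch of the objective measure follows: the CPSNR is computed on an image cropped by 10 pixels on each side (per the note above) and averaged over the four phases of the Bayer CFA. The helper `bayer_sample_with_phase` is hypothetical, standing in for the Bayer sampling step with the CFA shifted to the given phase, and `demosaic` is any of the studied algorithms.

```python
import numpy as np

def cpsnr(ref, rec, border=10):
    """PSNR over all three channels, 10-pixel border discarded."""
    ref = ref[border:-border, border:-border].astype(float)
    rec = rec[border:-border, border:-border].astype(float)
    mse = np.mean((ref - rec) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)

def four_phase_score(ref, demosaic, metric=cpsnr):
    """Average the metric over the four phases of the Bayer CFA."""
    scores = []
    for dy, dx in ((0, 0), (0, 1), (1, 0), (1, 1)):
        # bayer_sample_with_phase: hypothetical helper sampling the CFA
        # shifted to phase (dy, dx).
        cfa = bayer_sample_with_phase(ref, dy, dx)
        scores.append(metric(ref, demosaic(cfa)))
    return float(np.mean(scores))
```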
Figure 47: Left: average PSNR of the Kodak images over the displacement introduced by the OLPF; right: average ∆E of the Kodak images over the displacement introduced by the OLPF
From the results of figures 46-47, we can also say that the best-performing algorithms are those of Lian [27] and Dubois [47], which use frequency selection techniques, and the algorithm of Pekkucuksen (GBTF) [54], which exploits the coherence of the interpolation direction. The three algorithms lead to similar results in terms of objective measures and MTF. For the final choice, however, we should also consider the frequency content allowed by the OLPF: for example, with an OLPF introducing 1.5 pixels of displacement horizontally and vertically, the method of Hamilton and Adams [17] becomes a better choice than the other three algorithms, at a lower complexity cost. We also highlight the importance of the trade-off between the frequency content and the appearance of color aliasing in the image. Without an OLPF, we see strong color aliasing that is highly visible to the human eye. Using an OLPF with a larger displacement substantially reduces the appearance of color aliasing and replaces the high-frequency areas with a uniform pattern, less visible to the human eye. The proposed method will help in the choice of an anti-aliasing filter (to avoid color aliasing) and of an appropriate algorithm (to respect as much as possible the frequency content of the image). We also provide a classical comparative study of some of the state-of-the-art algorithms, as done in most papers [34-35]. As the figure below shows, the algorithms are more or less sensitive to color aliasing and lead to different qualities. We observe from the results that algorithms with frequency selection (Lian [27] and Dubois [47]) and algorithms that adapt their interpolation scheme to the inter- and intra-channel correlations (Wang [57] and Pekkucuksen [54]) can lead to very good results. We extend the comparative study to the context of lower spectral correlation and higher saturation with the McMaster dataset. Most of the algorithms that assume high spectral redundancy fail on the McMaster dataset and can become only as good as (sometimes worse than) the simple algorithm proposed
Figure 48: Image demosaicking with the algorithm of Hamilton and Adams for different OLPF configurations
by Hamilton and Adams [17]. As addressed in [41], the different algorithms should extend their interpolation scheme by exploiting the local spectral correlation and the non-local similarity in the CFA image. Weak spectral correlation in image demosaicking needs further investigation, as only a few papers take this problem into account [29][41-42]. We also use our proposed algorithm to compare alternating hue by moving the phase of the Bayer CFA. We observe that the proposed method highlights the zones affected by alternating hue. The resulting maps are similar to the maps of a simple ∆E measure (as the errors are mostly located on edges), but highlight better the strength of the alternating pattern. As an example, we can contrast a bilinear interpolation with the efficient algorithm of Pekkucuksen: the alternating hue is stronger and more spread over the edges in the map of the bilinear interpolation than in that of the Pekkucuksen algorithm. We also observe that the algorithms of Pekkucuksen and Lian are probably the most cost-effective solutions for an FPGA implementation. Their performances over the different datasets and the proposed camera simulation make them reliable enough to obtain good quality images.
Figure 49: From top to bottom and left to right: Original image, Bilinear interpolation, Smooth hue transition, Pattern recognition Cok, Chang, Proposed Chang + Smooth hue transition, Hamilton and Adams, Wang, Pekkucuksen, Lian, Dubois, Lu
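Returning to the alternating-hue measure above, a minimal sketch of the idea follows: the same scene is sampled at two Bayer phases, demosaicked, and the per-pixel difference between the two reconstructions maps the zones whose hue alternates with the phase. A plain Euclidean color difference stands in for the exact measure of the thesis, and `bayer_sample_with_phase` is the same hypothetical helper as in the CPSNR sketch.

```python
import numpy as np

def alternating_hue_map(ref, demosaic):
    # bayer_sample_with_phase: hypothetical helper (see the CPSNR sketch).
    rec0 = demosaic(bayer_sample_with_phase(ref, 0, 0))
    rec1 = demosaic(bayer_sample_with_phase(ref, 0, 1))  # phase moved by one pixel
    # Large differences mark zones whose hue alternates with the CFA phase.
    return np.linalg.norm(rec0.astype(float) - rec1.astype(float), axis=2)
```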
5.3 Conclusion
This thesis has shown the importance of taking into account the whole acquisition chain of a video camera. Most studies consider the performance of their algorithms only on the well-known Kodak dataset, which is probably not representative of all the situations we may encounter. We find that competing for a higher PSNR or ∆E is probably less relevant than studying the impact of the different blocks that compose a video camera acquisition chain. We proposed a simplified pipeline to perform basic simulations, and we highlighted some results that need to be correlated with experiments on a real camera. We have seen that the optical response of a zoom (when in focus) does not play a major role in the frequency content transmitted through the acquisition chain. We also saw the interest of using an OLPF and the trade-off between
Figure 50: Alternating hue. Left: bilinear interpolation; right: Pekkucuksen algorithm
color aliasing and image blur. Correlating this with the measures of the different OLPFs, we highlight that, given the frequency content the OLPF allows to pass, there is no particular need (in the case of low frequency content) for an algorithm with high PSNR or ∆E performance. Those parameters therefore need special attention to optimize the quality of the image in terms of low color aliasing, frequency content and demosaicking algorithm choice. We also showed the case of a "favorable" phase of the Bayer pattern, which may lead to color aliasing patterns and alternating hue in the context of video. One limitation of our work is that it does not take into account the noise of the camera sensor, which may also influence the demosaicking step; more insight and solutions for the problem of noisy images can be found in [3][73]. More work could also be done on demosaicking under low spectral correlation. To extend the camera simulation, we could also introduce the spectral response of the different patches of the CFA, to be as close as possible to what a real camera does.
Bibliography
[1] R. Lukac, Single Sensor Imaging: Methods and Applications for Digital Cameras, CRC Press, 2008.
[2] R. Lukac, Color Image Processing: Methods and Applications, CRC Press, pp. 363-392, 2006.
[3] H. Phelippeau, Méthodes et algorithmes de dématriçage et de filtrage du bruit pour la photographie numérique, PhD thesis, 2010.
[4] R. Lukac, K.N. Plataniotis, Color Filter Arrays for Single-Sensor Imaging, 23rd Biennial Symposium on Communications, 2006.
[5] B.E. Bayer, Color Imaging Array, United States Patent, 1976.
[6] F. Ubiria, FPGA implementation of camera colour model conversion, Master thesis, 2008.
[7] G. Sharma, H.J. Trussell, Digital color imaging, IEEE Transactions on Image Processing, July 1997.
[8] D. Menon, G. Calvagno, Color image demosaicking: An overview, Signal Processing: Image Communication 26, 2011.
[9] A. Theuwissen, CMOS Image Sensors: State-Of-The-Art and Future Perspectives, IEEE Conference Publications, 2007.
[10] R. Lukac, Color Filter Arrays: Design and Performance Analysis, IEEE Transactions on Consumer Electronics, Vol. 51, No. 4, November 2005.
[11] L. Condat, A New Color Filter Array With Optimal Properties for Noiseless and Noisy Color Image Acquisition, IEEE Transactions on Image Processing, Vol. 20, No. 8, August 2011.
[12] D. Menon, G. Calvagno, Color image demosaicking: An overview, Signal Processing: Image Communication 26, April 2011.
[13] D. Alleysson, S. Süsstrunk, J. Hérault, Color demosaicing by estimating luminance and opponent chromatic signals in the Fourier domain, 2002.
[14] L. Chang, Y.-P. Tan, Effective use of Spatial and Spectral Correlations for Color Filter Array Demosaicking, IEEE, 2004.
[15] K.L. Chung, Demosaicing of Color Filter Array Captured Images Using Gradient Edge Detection Masks and Adaptive Heterogeneity-Projection, IEEE Transactions on Image Processing, Vol. 17, No. 12, December 2008.
[16] B.K. Gunturk, Color Plane Interpolation Using Alternating Projections, IEEE Transactions on Image Processing, Vol. 11, No. 9, September 2002.
[17] J.F. Hamilton, J.E. Adams, Adaptive color plane interpolation in single sensor color electronic camera, United States Patent, July 1997.
[18] K. Hirakawa, T. Parks, Adaptive homogeneity-directed demosaicing algorithm, IEEE Transactions on Image Processing, 14:360-369, 2005.
[19] R. Kimmel, Demosaicing: Image Reconstruction from Color CCD Samples, IEEE Transactions on Image Processing, Vol. 8, No. 9, September 1999.
[20] Y.M. Lu, Demosaicking by Alternating Projections: Theory and Fast One-Step Implementation, IEEE Transactions on Image Processing, Vol. 19, No. 8, August 2010.
[21] H.S. Malvar, L.-W. He, R. Cutler, High-quality linear interpolation for demosaicing of Bayer-patterned color images, Microsoft Research, May 2004.
[22] D. Menon, Demosaicing With Directional Filtering and a posteriori Decision, IEEE Transactions on Image Processing, Vol. 16, No. 1, January 2007.
[23] D. Menon, Regularization Approaches to Demosaicking, IEEE Transactions on Image Processing, Vol. 18, No. 10, October 2009.
[24] D. Paliy, Spatially Adaptive Color Filter Array Interpolation for Noiseless and Noisy Data, International Journal of Imaging Systems and Technology, Special Issue on Applied Color Image Processing, 2007.
[25] L. Zhang, Color Demosaicking Via Directional Linear Minimum Mean Square-Error Estimation, IEEE Transactions on Image Processing, Vol. 14, No. 12, December 2005.
[26] D.R. Cok, Signal processing method and apparatus for producing interpolated chrominance values in a sampled color image signal, U.S. Patent No. 4,642,678, February 1987.
[27] N.-X. Lian, L. Chang, Y.-P. Tan, V. Zagorodnov, Adaptive filtering for color filter array demosaicking, IEEE Transactions on Image Processing, 2007.
[28] J. Gu, P.J. Wolfe, K. Hirakawa, Filterbank-based universal demosaicking, Proceedings of the 2010 IEEE 17th International Conference on Image Processing, September 2010.
[29] X. Li, B.K. Gunturk, L. Zhang, Image demosaicing: a systematic survey, Proceedings of SPIE-IS&T Electronic Imaging, Visual Communications and Image Processing, Vol. 6822, 2008.
[30] R.A. Maschal, S.S. Young, J.P. Reynolds, K. Krapels, J. Fanning, T. Corbin, New Image Quality Assessment Algorithms for CFA Demosaicing, IEEE Sensors Journal, Vol. 13, No. 1, January 2013.
[31] L. Condat, A New Color Filter Array With Optimal Properties for Noiseless and Noisy Color Image Acquisition, IEEE Transactions on Image Processing, Vol. 20, No. 8, August 2011.
[32] K. Sudha Rani, W. Jino Hans, FPGA implementation of Bilinear Interpolation Algorithm for CFA Demosaicing, International Conference on Communication and Signal Processing, April 2013.
[33] C. "Max" Maxfield, FPGAs: World Class Designs, Newnes, March 2009.
[34] Y. Yang, Contribution à l'évaluation objective de la qualité d'images couleur estimées par dématriçage, PhD thesis, Université des Sciences et Technologies de Lille, 2009.
[35] O. Losson, E. Dinet, From the Sensor to Color Images, in Digital Color - Acquisition, Perception, Coding and Rendering, Chapter 6, June 2012.
[36] X. Zhang, B.A. Wandell, A spatial extension of CIELAB for digital color image reproduction, Journal of the Society for Information Display, March 1997.
[37] G.M. Johnson, M.D. Fairchild, A top down description of S-CIELAB and CIEDE2000, Color Research and Application, March 2003.
[38] R.A. Maschal, S. Young, J.P. Reynolds, K. Krapels, J. Fanning, T. Corbin, New Image Quality Assessment Algorithms for CFA Demosaicing, IEEE Sensors Journal, Vol. 13, January 2013.
[39] B. Coulange, L. Moisan, An aliasing detection algorithm based on suspicious colocalizations of Fourier coefficients, IEEE Conference Publications, 2010.
[40] T. Lee, D.V. Anderson, The wavelet-based multi-resolution motion estimation using temporal aliasing detection, Visual Communications and Image Processing, 2007.
[41] L. Zhang, X. Wu, A. Buades, X. Li, Color Demosaicking by Local Directional Interpolation and Non-local Adaptive Thresholding, Journal of Electronic Imaging 20(2), April-June 2011.
[42] G. Wang, X. Zhu, Z. Gan, Image demosaicing by non-local similarity and local correlation, Signal Processing (ICSP), 2012 IEEE 11th International Conference on, October 2012.
[43] F. Zhang, X. Wu, X. Yang, W. Zhang, Improved color demosaicking in weak spectral correlation, Proc. SPIE, 2008.
[44] A. Gabiger-Rose, M. Kube, R. Weigel, An FPGA-Based Fully Synchronized Design of a Bilateral Filter for Real-Time Image Denoising, IEEE Transactions on Industrial Electronics, Vol. 61, No. 8, August 2014.
[45] D.R. Cok, Signal processing method and apparatus for sampled image signals, U.S. Patent 4,630,307, December 1986.
[46] E. Chang, S. Cheung, D. Pan, Color Filter Array Recovery Using a Threshold-based Variable Number of Gradients, IS&T/SPIE Conference on Sensors, Cameras, and Applications for Digital Photography, January 1999.
[47] E. Dubois, Frequency-Domain Methods for Demosaicking of Bayer-Sampled Color Images, IEEE Signal Processing Letters, Vol. 12, No. 12, December 2005.
[48] E. Dubois, G. Jeon, Demosaicking of Noisy Bayer-Sampled Color Images With Least-Squares Luma-Chroma Demultiplexing and Noise Level Estimation, IEEE Transactions on Image Processing, Vol. 22, No. 1, January 2013.
[49] N.-X. Lian, L. Chang, V. Zagorodnov, Y.-P. Tan, Reversing Demosaicking and Compression in Color Filter Array Image Processing: Performance Analysis and Modeling, IEEE Transactions on Image Processing, Vol. 15, No. 11, November 2006.
[50] N. Lian, L. Chang, Y.-P. Tan, Improved Color Filter Array Demosaicking By Accurate Luminance Estimation, ICIP 2005, IEEE International Conference on Image Processing, September 2005.
[51] H.-R. Choi, R.-H. Park, J.W. Lee, Gradient Estimation for Demosaicking in a Color Filter Array Image, Journal of Communication and Computer 10 (2013) 59-71, January 2013.
[52] H.-R. Choi, R.-H. Park, J.W. Lee, Gradient Estimation for Demosaicking in a Color Filter Array Image, Journal of Communication and Computer 10 (2013) 59-71, January 2013.
[53] S.-C. Pei, I.-K. Tam, Effective Color Interpolation in CCD Color Filter Arrays Using Signal Correlation, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 6, June 2003.
[54] I. Pekkucuksen, Y. Altunbasak, Gradient Based Threshold Free Color Filter Array Interpolation, Proceedings of the 2010 IEEE 17th International Conference on Image Processing, September 2010.
[55] I. Pekkucuksen, Y. Altunbasak, Directional Color Filter Array Interpolation Based on Multiscale Color Gradients, IEEE International Conference, pp. 997-1000, May 2011.
[56] I. Pekkucuksen, Y. Altunbasak, Multiscale Gradients-Based Color Filter Array Interpolation, IEEE Transactions on Image Processing, Vol. 22, No. 1, January 2013.
[57] X. Wang, W. Lin, P. Xue, Demosaicing with Improved Edge Direction Detection, IEEE International Symposium, Vol. 3, pp. 2048-2051, May 2005.
[58] http://www.optique-ingenieur.org/fr/cours/OPI_fr_M05_C06/co/Contenu_07.html
[59] D. Pascale, Color correction without color patterns for stereoscopic camera systems, www.BabelColor.com, June 2006.
[60] Y. Hwang, J.W. Kim, B.H. Choi, W. Lee, RGB coordinates of the MacBeth ColorChecker, 11th International Conference on Control, Automation and Systems, June 2011.
[61] S. Wen, Color Management for Future Video Systems, Proceedings of the IEEE, January 2013.
[62] R.H. Hibbard, Apparatus and method for adaptively interpolating a full color image utilizing luminance gradients, United States Patent, January 17, 1995.
[63] C.A. Laroche, M.A. Prescott, Apparatus and method for adaptively interpolating a full color image utilizing chrominance gradients, United States Patent, 1994.
[64] W.T. Freeman, Median filter for reconstructing missing color samples, United States Patent, 1988.
[65] D. Alleysson, J. Hérault, Interpolation d'images couleurs sous-échantillonnées par un modèle de perception, Proc. GRETSI, 2001.
[66] D. Alleysson, S. Süsstrunk, J. Hérault, Linear Demosaicing inspired by the Human Visual System, IEEE Transactions on Image Processing, 2005.
[67] L. Tan, J. Jiang, Digital Signal Processing: Fundamentals and Applications, second edition, Academic Press of Elsevier, 2013.
[68] O. Losson, L. Macaire, Y. Yang, Comparison of color demosaicing methods, Advances in Imaging and Electron Physics 162 (2010) 173-265.
[69] D.M. Chandler, Seven Challenges in Image Quality Assessment: Past, Present, and Future Research, ISRN Signal Processing, 2012.
[70] Z. Wang, L. Lu, A.C. Bovik, Video quality assessment using structural distortion measurement, Signal Processing: Image Communication, 19(2), 121-132, 2004.
[71] P. Marziliano, F. Dufaux, S. Winkler, T. Ebrahimi, Perceptual blur and ringing metrics: application to JPEG2000, Signal Processing: Image Communication, 2004.
[72] W. Lu, Y.-P. Tan, Color filter array demosaicking: New method and performance measures, IEEE Transactions on Image Processing, 2003.
[73] A. Gabiger-Rose, M. Kube, R. Weigel, R. Rose, An FPGA-Based Fully Synchronized Design of a Bilateral Filter for Real-Time Image Denoising, IEEE Transactions on Industrial Electronics, Vol. 61, No. 8, 2014.
Table: CPSNR
Figure 51: CPSNR: Kodak dataset
Figure 52: CPSNR: IMAX dataset
Table: ∆E
Figure 53: ∆E: Kodak dataset
Figure 54: ∆E: IMAX dataset