algorithms Article
A No Reference Image Quality Assessment Metric Based on Visual Perception †
Yan Fu and Shengchun Wang *
College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an 710054, China; [email protected]
* Correspondence: [email protected]; Tel.: +86-29-8558-3173
† This paper is an extended version of our paper published in the International Symposium on Computer, Consumer and Control, Xi'an, China, 4–6 July 2016.
Academic Editor: Hsiung-Cheng Lin
Received: 30 August 2016; Accepted: 12 December 2016; Published: 16 December 2016
Algorithms 2016, 9, 87; doi:10.3390/a9040087
Abstract: Nowadays, how to evaluate image quality reasonably is a basic and challenging problem. Existing no reference evaluation methods cannot accurately reflect the human visual perception of image quality. In this paper, we propose an efficient general-purpose no reference image quality assessment (NRIQA) method based on visual perception, which effectively integrates human visual characteristics into the NRIQA field. First, a novel algorithm for salient region extraction is presented: two feature maps of the original image, texture and edge, are added to the Itti model. Because the normalized luminance coefficients of natural images obey a generalized Gauss probability distribution, we utilize this property to extract statistical features in the regions of interest (ROI) and the regions of non-interest respectively. Then, the extracted features are fused and used as the input to establish a support vector regression (SVR) model. Finally, the IQA model obtained by training is used to predict the quality of an image. Experimental results show that this method has good predictive ability, and its evaluation performance is better than that of existing classical algorithms. Moreover, the predicted results are more consistent with human subjective perception and can accurately reflect human visual perception of image quality.

Keywords: image quality assessment; no reference; visual perception; support vector regression; region of interest; generalized Gauss distribution (GGD)
1. Introduction

With the rapid development of information technologies and the widespread usage of smart phones, digital imaging has become a more and more important medium for acquiring information and communicating with other people. Digital images generally suffer impairments during acquisition, compression, transmission, processing and storage, and these impairments create great difficulties in studying images and understanding the objective world. Because of this, image quality assessment (IQA) has become a fundamentally important and challenging task. Effective IQA metrics can play important roles in applications such as dynamic monitoring and adjustment of image quality, optimizing the parameter settings of image processing systems, and searching for high quality images in medical imaging [1]. For one thing, high quality images can help doctors judge the severity of a disease in the medical field. For another, applications such as high-definition television, personal digital assistants, internet video streaming, and video on demand require the means to evaluate the quality of the images they deliver. Therefore, in order to obtain high quality and high fidelity images, the study of image quality evaluation has important practical significance [2,3].
Because human beings are the terminals for the majority of processed digital images, subjective assessment is the ultimate criterion and the most reliable image quality assessment method. However, subjective assessment is hindered in practical application by its disadvantages: it is time-consuming, expensive, complex and laborious. Thus, the goal of objective IQA is to automatically evaluate the quality of images as close to human perception as possible [2]. Based on the availability of a reference image, objective IQA metrics can be classified into full reference (FR), reduced reference (RR), and no reference (NR) approaches [4]. Only recently did FR-IQA methods reach a satisfactory level of performance, as demonstrated by high correlations with human subjective judgments of visual perception. SSIM [5], MS-SSIM [6] and VSNR [7] are examples of successful FR-IQA algorithms. These metrics are based on measuring the similarity between the distorted image and its corresponding original image. However, in real-world applications where the original is not available, FR metrics cannot be used, which strictly limits the application domain of FR-IQA algorithms; in such cases, NR-IQA is the only kind of algorithm that can be used in practice. However, a number of NR-IQA metrics have been shown not to always correlate with perceived image quality [8,9]. Presently, NR-IQA algorithms generally follow one of two trends: distortion-specific and general-purpose methods. The former evaluate distorted images of a specific type, while the latter measure image quality directly without knowing the type of image distortion. Since the result of evaluation ultimately depends on the perception of the observers, an image evaluation that better matches actual quality must be based on human visual and psychological characteristics and must organically combine subjective and objective evaluation methods. A large number of studies have shown that evaluation methods that consider the human visual system (HVS) are better than those that do not consider the characteristics of the HVS [10–13]. However, existing no-reference image quality assessment algorithms fail to give full consideration to human visual features [10]. At present, there are several open problems in the field of image quality assessment:

1. The original image cannot be obtained.
2. The distortion type of a distorted image is difficult to determine.
3. Existing methods lack consideration of human visual characteristics.
4. It is hard to ensure that the extracted characteristics do not differ significantly for images distorted to the same degree but by different distortion types.
In this paper, based on the existing image quality evaluation methods, we introduce the human visual attention mechanism and the HVS characteristics into no reference image quality assessment, and propose a universal no-reference image quality assessment method based on visual perception. The rest of this paper is organized as follows. In Section 2, we describe the current research on no reference image quality assessment methods and visual saliency. In Section 3, we introduce the extraction model of the visual region of interest. In Section 4, we provide an overview of the method. In Section 5, we describe the blind/referenceless image spatial quality evaluator (BRISQUE) algorithm, and the prediction model is provided in Section 6. We present the results in Section 7, and we conclude in Section 8.

2. Previous Work

Before proceeding, we state some salient aspects of NR-IQA methods. Depending on whether the type of image distortion is known or not, NR-IQA methods can be divided into two categories: distortion-specific approaches and general-purpose approaches [14]. A distortion-specific algorithm is capable of assessing the quality of images distorted by a particular distortion type, such as blockiness, ringing or blur after compression, or noise generated during image transmission. In [15–18], the authors proposed several approaches for JPEG compressed
images. In [17], Zhou et al. aimed to develop NR-IQA algorithms for JPEG images. At first, they established a JPEG image database and conducted subjective experiments on it. Furthermore, they proposed a computationally and memory efficient NR IQA model for JPEG images, and estimated the blockiness by the average differences across block boundaries. They used the average absolute difference between in-block image samples and the zero-crossing rate to estimate the activity of the image signal. At last, the subjective experiment results were used to train the model, which achieved good quality prediction performance. For JPEG2000 compression, distortion in an image is generally modeled by measuring edge spread using an edge-detection based approach, and this edge spread is related to quality [5,19–22]. The works in [23–25] are evaluation algorithms for blur. QingBin Sang et al. calculated image structural similarity by constructing a fuzzy counterpart [20]. Firstly, they used a low-pass filter to blur the original blurred image and produce a re-blurred image. Then they divided the edge dilation image into 8 × 8 blocks, and the sub-blocks were classified into edge dilation blocks and smooth blocks. The gradient structural similarity index was then defined with different weights according to the different types of blocks. Finally, the blur average value of the whole image was produced. The experimental results showed that the method was more reasonable and stable than others. However, these distortion-specific methods are obviously limited by the fact that it is necessary to know the distortion types in advance, which makes their application field limited [3]. Universal methods are applicable to all distortion types and can be used on any occasion. In recent years, researchers have proposed many universal no reference image quality assessment methods. However, most of these methods rely on prior knowledge of subjective values and distortion types, and use machine learning methods to obtain the image quality evaluation score. Some meaningful universal methods are reported in the literature [8,26–31]. In [26], the authors proposed a new two-step framework for no-reference image quality assessment based on natural scene statistics (NSS), namely the Blind Image Quality Index (BIQI). Firstly, the algorithm estimated the presence of a set of distortions in the image. Secondly, it evaluated the quality of the image according to those distortions. In [27], Moorthy et al. proposed a blind IQA method based on the hypothesis that natural scenes have certain statistical properties and that these properties are altered when image quality degrades. This algorithm, called Distortion Identification-based Image Verity and Integrity Evaluation (DIIVINE), is also a two-step framework. DIIVINE mainly consists of three steps. Firstly, the wavelet coefficients were normalized. Then, the statistical characteristics of the wavelet coefficients were calculated, such as scale and orientation selective statistics, orientation selective statistics, correlations across scales and spatial correlation. Finally, the quality of the image was calculated using those statistical features. The method achieved good evaluation performance.
In [8], the authors supposed that changes of image statistical characteristics in the Discrete Cosine Transform (DCT) domain can predict image quality, and proposed the Blind Image Integrity Notator using DCT Statistics (BLIINDS) algorithm. Subsequently, Saad et al. extended the BLIINDS algorithm and proposed an alternative approach that relies on a statistical model of local DCT coefficients, which they dubbed BLIINDS-II [28]. To begin with, they partitioned the image into n × n blocks and computed the local DCT coefficients. Then, they applied a generalized Gaussian density model to each block of DCT coefficients. Furthermore, they computed functions of the derived model parameters. Finally, they used a simple Bayesian model to predict the quality score of the image. In [29], Mittal et al. proposed a natural scene statistics based NR IQA model that operates in the spatial domain, namely the blind/referenceless image spatial quality evaluator (BRISQUE). This method does not compute distortion-specific features, but instead uses scene statistics of locally normalized luminance coefficients to evaluate the quality of the image. Since Ruderman found that these normalized luminance values of natural images strongly tend towards a unit normal Gaussian characteristic [30], this method used this theory to preprocess the image. Then, features were extracted in the spatial domain; these features mainly contain the generalized Gauss distribution characteristics and the correlation of adjacent coefficients. Finally, those features were mapped to human quality ratings via an SVR model. Over the course of the last few years, a number
of researchers have devoted time to improving the assessment accuracy of NR IQA methods by taking advantage of known natural scene statistics characteristics. These methods fail to give full consideration to human visual characteristics [2]. There is also a deviation between the predicted quality and the real quality, since the human visual system is the ultimate assessor of image quality. In addition, almost all no reference image quality assessment methods are based on gray scale images, and they do not make full use of the color characteristics of the image. From the above analysis of NR IQA methods, at present, both distortion-specific and general-purpose methods almost always use statistical characteristics of natural images to establish the quality evaluation model, and have achieved good evaluation results. However, the statistical characteristics of natural images can hardly reflect all regularities of image quality, and there is still a gap with the subjective consistency of human vision. Since the HVS is the ultimate assessor of image quality, a number of image quality assessment methods based on an important feature of the HVS, namely visual perception, are emerging [10]. Among them, research on regions of interest (or visual saliency) is a major branch of HVS study. In the field of image processing, the ROI of an image are the areas that differ significantly from their adjacent regions. They can immediately attract our eyes and capture our attention. Therefore, the ROI is a very important region in image quality assessment. On the other hand, how to build a computational visual saliency model has been attracting tremendous attention in recent years [32]. Nowadays, a number of computational models that simulate human visual attention have been studied, and some powerful models have been proposed. For further details of ROI refer to [9–12,32–36]. In [33], Itti et al. proposed the first influential and best known visual saliency model; in this paper, we call it the Itti model. The Itti model mainly contains two aspects. Firstly, it introduced image pyramids for feature extraction, which makes the visual saliency computation efficient. Secondly, Itti et al. proposed the biologically inspired "center-surround difference" operation to compute feature-dependent saliency maps across scales [32]. This model effectively breaks down the complex problem of scene understanding by rapidly selecting salient locations. Following the Itti model, Harel et al. proposed the graph-based visual saliency (GBVS) model [37]. GBVS introduced a novel graph-based normalization and combination strategy, and is a bottom-up visual saliency model. It mainly consists of two steps: the first step forms activation maps on certain feature channels, and the second step normalizes them in a way that highlights salience and admits combination with other maps. Experimental results show that GBVS predicts ROI more reliably than other algorithms. In this paper, we mainly study human visual characteristics, extract the region of interest, and combine it with the statistical characteristics of the natural image as the measurement index of the distorted image. Finally, we carry out experiments on the LIVE database [38]; the experimental results show that the feature extraction and the learning method are reasonable and correlate quite well with human perception.

3. Extraction Model of the Visual Region of Interest

With the rapid development of information technology, images have become the main medium of information transmission. How to analyze and process a large amount of image information effectively and accurately is an important subject. Researchers have found that the most important information in an image is often concentrated in a few key areas, which they call salient regions or regions of interest. Extracting these salient regions accurately greatly improves the efficiency and accuracy of image processing and analysis. So far, there are many region of interest extraction algorithms. ROI detection technology has gradually developed from interaction-based detection to detection based on visual features. The salient map model proposed by Itti is the most typical ROI detection method based on visual features [33]. This method only considers bottom-up signals and is simple and easy to implement. It is the most influential visual attention model.
3.1. The Principle of the Itti Model

In 1998, Itti et al. proposed an algorithm based on the salient map model. The algorithm defines three kinds of visual characteristics, color, brightness and direction, and uses a Gaussian pyramid to build the scale space. The model uses the center-surround compute strategy to form a set of feature maps, and then these collections are normalized and merged to create the salient map. It includes three processes: feature acquisition, salient map computation, and selection and transfer of the region of interest [34–36]. Figure 1 shows the implementation process of the Itti-Koch model.
Figure 1. General architecture of the model.
3.2. Algorithm Improvement

In this paper, we propose an improved model based on the idea of the Itti model. We add texture features and edge structure characteristics to the original model. The model block diagram is shown in Figure 2.
Figure 2. Improved architecture of the model.
3.2.1. Texture Feature Extraction
Texture is an important visual cue; it is a common but difficult feature of images. The texture feature is consistent with the human visual perception process and plays an important role in the region of interest. There are currently many methods to extract texture features of images, such as wavelet extraction methods, gray level co-occurrence matrix (GLCM) extraction methods and so on. In recent years, wavelet methods have been widely used in many fields. In [39,40], the authors introduced the wavelet method into the medical field and achieved considerable success. In the former article, the authors used the discrete wavelet packet transform (DWPT) to extract wavelet packet coefficients from MR brain images. In the latter article, to help improve the directional selectivity impaired by the discrete wavelet transform, they proposed the dual-tree complex wavelet transform (DTCWT), which was implemented by two separate two-channel filter banks. DTCWT obtains directional selectivity by using approximately analytic wavelets. At each scale of a two-dimensional DTCWT, it produces in total six directionally selective sub-bands (±15°, ±45°, ±75°) for both the real and imaginary parts. The method can obtain more information about the direction of the images. The Gabor wavelet is another important method of texture feature extraction. It not only extracts texture features effectively, but also eliminates redundant information. In addition, the frequency and direction properties of the Gabor filter are close to those of the human visual system, so it is widely used for texture representation and description. A series of filtered images can be obtained by convolution of the image with the Gabor filter, and the information of the image at each scale and direction is described by the corresponding filtered image [41]. In this paper, we use the Gabor wavelet to extract the texture features of the image. The two-dimensional Gabor function g(x, y) can be expressed as:

$$g(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}\exp\left[-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2}+\frac{y^2}{\sigma_y^2}\right)+2\pi i W x\right] \quad (1)$$
where W is the modulation frequency of the Gaussian. In accordance with the human visual system, W = 0.5. The Fourier transform of g(x, y) can be expressed as:

$$G(u, v) = \exp\left\{-\frac{1}{2}\left[\frac{(u-W)^2}{\sigma_x^2}+\frac{v^2}{\sigma_y^2}\right]\right\} \quad (2)$$

where $\sigma_x = \frac{(a+1)\sqrt{2\ln 2}}{2\pi a^m (a-1) U_h}$ and $\sigma_y = \frac{1}{2\pi \tan\left(\frac{\pi}{2N}\right)\sqrt{\frac{U_h^2}{2\ln 2} - \left(\frac{1}{2\pi\sigma_x}\right)^2}}$. The frequency spectrum is shown in Figure 3.

Figure 3. Frequency components of the Gabor filter.
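For readers who want to experiment with this filter design, the following is a minimal sketch (not the authors' code) that samples the Gabor function of Formula (1) and applies the rotation and scaling of Formula (3) below; the grid size and parameter values are illustrative assumptions.

import numpy as np

def gabor_kernel(sigma_x, sigma_y, W, size=31, theta=0.0, a=2.0, m=0):
    """Sample the 2-D Gabor function of Formula (1), rotated and scaled
    as in Formula (3). All parameter values here are illustrative."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    # Rotated and scaled coordinates: x' = a^-m (x cos t + y sin t), etc.
    scale = a ** (-m)
    xr = scale * (x * np.cos(theta) + y * np.sin(theta))
    yr = scale * (-x * np.sin(theta) + y * np.cos(theta))
    envelope = np.exp(-0.5 * (xr ** 2 / sigma_x ** 2 + yr ** 2 / sigma_y ** 2))
    carrier = np.exp(2j * np.pi * W * xr)
    g = envelope * carrier / (2 * np.pi * sigma_x * sigma_y)
    return scale * g  # the a^-m prefactor of Formula (3)

# Example: four orientations at one scale, as in Figure 4.
kernels = [gabor_kernel(4.0, 4.0, W=0.5, theta=k * np.pi / 4) for k in range(4)]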
Algorithms 2016, 9, 87
7 of 21
In order to get a set of self-similar filters, we use g(x, y) as the generating function and apply moderate scale expansion and rotation transformations to it. That is, the Gabor wavelet:

$$g_{mn}(x, y) = a^{-m} g(x', y') \quad (3)$$

$$x' = a^{-m}(x\cos\theta + y\sin\theta), \quad y' = a^{-m}(-x\sin\theta + y\cos\theta),$$

where a > 1 and m, n are integers; $a = (U_k/U_M)^{\frac{1}{M-1}}$, θ = nπ/k, and k is the number of directions. m and n are the scale and direction indices respectively, with n ∈ [0, k]. Figure 4 shows the Gabor wavelet in four different directions. We then use the Gabor wavelet transform to extract the texture features of the image. For an image I(x, y), its Gabor wavelet transform can be defined as:

$$T_{m,n}(x, y) = \sum_{x_1}\sum_{y_1} I(x, y)\, g_{mn}^{*}(x - x_1, y - y_1) \quad (4)$$

where * represents the complex conjugate, and the texture features are calculated as shown in Formula (5). In the model, the center-surround operation is implemented as the difference between fine and coarse scales. The center is a pixel at scale c ∈ {2, 3, 4}, and the surround is the corresponding pixel at scale s = c + δ, δ ∈ {3, 4}. The across-scale difference (denoted "Θ" below) between two maps is obtained by interpolation to the finer scale and point-by-point subtraction.

$$T(c, s) = |T(c)\,\Theta\, T(s)| \quad (5)$$

The above feature maps are combined to get the texture salient map. They are combined through across-scale addition, "⊕", which consists of reduction of each map to scale four and point-by-point addition. The combined formula is as follows:

$$T = \mathop{\oplus}_{c=2}^{4}\ \mathop{\oplus}_{s=c+3}^{c+4} N(T(c, s)) \quad (6)$$
Figure 4. Gabor wavelet based on different directions. (a) θ = 0; (b) θ = π/4; (c) θ = π/2; (d) θ = 3π/4.
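As a concrete illustration of the Θ and ⊕ operations in Formulas (5) and (6), the sketch below builds a Gaussian pyramid for one feature map and combines the center-surround differences at scale four. It is a simplified reading of the procedure, not the authors' implementation: the pyramid depth, interpolation method and normalization operator N(.) are assumptions.

import numpy as np
import cv2  # OpenCV, used here for pyramid construction and resizing

def gaussian_pyramid(feature_map, levels=9):
    """Return a list of progressively down-sampled maps, level 0 = original."""
    pyr = [feature_map.astype(np.float32)]
    for _ in range(1, levels):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def normalize(m):
    """A simple stand-in for the map normalization operator N(.)."""
    m = m - m.min()
    return m / (m.max() + 1e-8)

def conspicuity_map(feature_map):
    """Center-surround differences (Formula (5)) combined by
    across-scale addition at scale 4 (Formula (6))."""
    pyr = gaussian_pyramid(feature_map)
    target_shape = pyr[4].shape[::-1]           # (width, height) of scale 4
    combined = np.zeros(pyr[4].shape, np.float32)
    for c in (2, 3, 4):
        for delta in (3, 4):
            s = c + delta
            surround = cv2.resize(pyr[s], pyr[c].shape[::-1],
                                  interpolation=cv2.INTER_LINEAR)
            diff = np.abs(pyr[c] - surround)     # T(c, s) = |T(c) Θ T(s)|
            diff = cv2.resize(diff, target_shape, interpolation=cv2.INTER_LINEAR)
            combined += normalize(diff)          # across-scale addition ⊕
    return combined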
3.2.2. Edge Structure Feature Extraction

The edge is one of the most important features of an image. Currently, there are many kinds of edge detection methods, such as the Roberts operator, the Sobel operator, the Prewitt operator and so on. The Canny method, however, is not easily disturbed by noise and is able to detect true weak edges. Its advantage is that two different thresholds are used to detect strong edges and weak edges, and weak edges are included in the output image only when they are connected to strong edges. In this paper, we use the Canny edge detection algorithm to extract the edge structure features of the image. The steps are as follows:

(1) We use a Gauss filter to smooth the image. The Canny method first uses a two-dimensional Gaussian function to smooth the image:

$$G(x, y) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right) \quad (7)$$
Its gradient vector is $\nabla G = \left[\partial G/\partial x,\ \partial G/\partial y\right]^{T}$, where σ is the distribution parameter of the Gauss filter.

(2) We calculate the magnitude and direction of the gradient, and suppress the non-maxima of the gradient amplitude.

(3) We perform a dual-threshold method to detect and connect edges. E(x, y) is the final edge structure map.

The edge structure characteristic is calculated as:

$$E(c, s) = |E(c)\,\Theta\, E(s)| \quad (8)$$

The above feature maps are combined to get the edge structure salient map. The combined formula is:

$$E = \mathop{\oplus}_{c=2}^{4}\ \mathop{\oplus}_{s=c+3}^{c+4} N(E(c, s)) \quad (9)$$

In the end, the final image salient region is obtained by linear addition of the 5 salient maps, and the formula is shown below:

$$S = \frac{1}{5}\left(N(I) + N(C) + N(O) + N(T) + N(E)\right) \quad (10)$$

where I, C and O represent the intensity, color and orientation features respectively.
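A compact sketch of Formula (10): given the five conspicuity maps (intensity, color, orientation, texture, edge), the final salient map is their normalized average. The normalize helper mirrors the simple N(.) stand-in used in the previous sketch and is an assumption, not the authors' exact operator.

import numpy as np

def final_saliency(I_map, C_map, O_map, T_map, E_map):
    """S = (N(I) + N(C) + N(O) + N(T) + N(E)) / 5, Formula (10)."""
    def normalize(m):
        m = m - m.min()
        return m / (m.max() + 1e-8)
    maps = [I_map, C_map, O_map, T_map, E_map]
    return sum(normalize(m) for m in maps) / 5.0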
4. Overview of the Method

Current no reference image quality evaluation algorithms can be divided into two categories: distortion-specific methods and universal methods. However, the distortion-specific methods are limited in practical application because they need to know the type of the distorted images in advance. The general-purpose methods do not depend on the type of distorted images, and thus have better prospects for practical application. However, the existing algorithms almost exclusively extract natural statistical characteristics of the image, and rarely take into account the visual characteristics of the human eye. Thus, in practical application, there is a deviation between the evaluation result and the perception of the human eye. As shown in Figures 5 and 6, due to the visual characteristics of the human eye, quality damage of the same degree in different regions of the same picture (such as interest and non-interest areas) produces different visual experiences. Under the same Gauss blur, the visual observation effect of Figure 5 is better than that of Figure 6.
Figure 5. Damaged figure of non-interest region.
Figure 6. Damaged figure of interest region.
When observing an image, the human eye first pays attention to the visually more prominent areas. Although an image may suffer the same damage, the subjective feeling of the human eye differs according to the region where the damage occurs. Based on this background, this paper proposes a method of image quality evaluation based on visual perception. Firstly, we extract the region of interest. Secondly, we extract the features from the regions of interest and the regions of non-interest. Finally, the extracted features are fused effectively, and the image quality evaluation model is established. We call the proposed approach the region of interest blind/referenceless image spatial quality evaluator (ROI-BRISQUE). Figure 7 shows the framework of the method proposed in this paper.
Figure 7. Model of the Region of Interest Blind/Referenceless Image Spatial Quality Evaluator (ROI-BRISQUE).
The method of this paper is described as follows. Firstly, the LIVE image database is divided into two categories: training images and testing images. We extract the region of interest and the features on the training images. Then, we use the feature vectors as the input of the ε-SVR and the DMOS of the corresponding image as the output target to train the image quality evaluation model. Finally, the trained image quality prediction model is used to predict the quality of a distorted image.

This paper focuses on extraction of the image region of interest. The traditional Itti model calculates the salient region of the image by extracting the underlying features of image color, brightness and direction. It can search out the attention region, but there are still some shortcomings; for instance, the contour of the salient region is not clear. In order to extract the region of interest more accurately, we add texture and edge structure features to the Itti model. Then, we use the improved model to extract the region of interest and the region of non-interest, and use the BRISQUE algorithm to extract the natural statistical features of each region respectively. Moreover, the characteristics of the region of interest and the region of non-interest are fused to get the measure factor of the image. We train the image quality evaluation model with the measure factor as the input and the DMOS value of the corresponding image as the output target of the SVR. Finally, the quality of the distorted image is predicted by the trained evaluation model.
5. Feature Extraction of the BRISQUE Algorithm

Ruderman [30] found that the normalized luminance coefficients of natural images obey a unit generalized Gauss probability distribution. He believed that image distortion changes the statistical characteristics of the normalized coefficients; by measuring the change of these statistical features, the distortion type can be predicted and the visual quality of the image can be evaluated. Based on this theory, Mittal et al. proposed the BRISQUE algorithm based on spatial statistical features [29]. This spatial approach to NR IQA can be summarized as follows.

5.1. Image Pixel Normalization

Given a possibly distorted image, we compute locally normalized luminances via local mean subtraction and divisive normalization. Formula (11) is applied to a given intensity image I(i, j) to produce:

$$\hat{I}(i, j) = \frac{I(i, j) - \mu(i, j)}{\sigma(i, j) + c} \quad (11)$$

$$\mu(i, j) = \sum_{k=-K}^{K}\sum_{l=-L}^{L}\omega_{k,l}\, I(i + k, j + l) \quad (12)$$

$$\sigma(i, j) = \sqrt{\sum_{k=-K}^{K}\sum_{l=-L}^{L}\omega_{k,l}\,[I(i + k, j + l) - \mu(i, j)]^2} \quad (13)$$
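The following sketch computes the MSCN coefficients of Formulas (11)–(13) with a Gaussian weighting window; the window parameter σ = 7/6 is an assumption borrowed from common BRISQUE practice, not a value stated in this paper.

import numpy as np
from scipy.ndimage import gaussian_filter

def mscn_coefficients(image, sigma=7.0 / 6.0, c=1.0):
    """Mean Subtracted Contrast Normalized coefficients, Formulas (11)-(13).
    gaussian_filter plays the role of the weighted sums with omega_{k,l}."""
    image = image.astype(np.float64)
    mu = gaussian_filter(image, sigma)                                   # Formula (12)
    sigma_map = np.sqrt(np.abs(gaussian_filter(image ** 2, sigma) - mu ** 2))  # Formula (13)
    return (image - mu) / (sigma_map + c)                                # Formula (11)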
where I(i, j) represents the gray value of the original image, i ∈ {1, 2, ..., M}, j ∈ {1, 2, ..., N}; M and N are the image height and width respectively; c is a small constant that prevents instability when the denominator is close to 0; µ(i, j) and σ(i, j) are the weighted local mean and standard deviation; and ω = {ω_{k,l} | k = −K, ..., K, l = −L, ..., L} is a two-dimensional circularly symmetric Gaussian weighting function. The normalized brightness values Î(i, j) are called MSCN (Mean Subtracted Contrast Normalized) coefficients.

5.2. Spatial Feature Extraction

The presence of distortion destroys the regularity between adjacent MSCN coefficients. The characteristics of the normalized coefficients include the generalized Gauss distribution features and the correlation of adjacent coefficients.

(1) Generalized Gauss distribution characteristics: the model can be expressed as Formula (14).

$$f(x; a, \sigma^2) = \frac{a}{2\beta\Gamma(1/a)}\exp\left(-\left(\frac{|x|}{\beta}\right)^{a}\right) \quad (14)$$

where $\beta = \sigma\sqrt{\frac{\Gamma(1/a)}{\Gamma(3/a)}}$ and Γ(·) is the gamma function. The shape parameter a controls the shape of the generalized Gauss model, and σ² is the variance. This article uses the fast matching method of [42] to estimate (a, σ²) as the features of the generalized Gauss distribution.

(2) Correlation of adjacent coefficients: we obtain correlation images along four directions (horizontal, vertical, main diagonal and secondary diagonal) [43], and use the Asymmetric Generalized Gaussian Distribution (AGGD) to fit them. We use the fast matching method [44] to estimate the parameters (η, ν, σl, σr) for each direction. Thus, we utilize the 16 estimated parameters of the four directions as the correlation features of the adjacent coefficients.
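As an illustration of the kind of moment-matching ("fast matching") estimation cited from [42], the sketch below recovers the GGD shape and variance from a sample of MSCN coefficients. The grid-search over a lookup table is an assumption made here for clarity, not the exact procedure of [42].

import numpy as np
from scipy.special import gamma

def estimate_ggd(coeffs, shapes=np.arange(0.2, 10.0, 0.001)):
    """Moment-matching estimate of the GGD parameters (a, sigma^2) of Formula (14).
    The sample ratio E[x^2] / (E|x|)^2 is matched against its analytic value
    Gamma(1/a) Gamma(3/a) / Gamma(2/a)^2 over a grid of candidate shapes."""
    coeffs = coeffs.ravel()
    sigma_sq = np.mean(coeffs ** 2)
    rho = sigma_sq / (np.mean(np.abs(coeffs)) ** 2 + 1e-12)
    rho_table = gamma(1.0 / shapes) * gamma(3.0 / shapes) / gamma(2.0 / shapes) ** 2
    a = shapes[np.argmin((rho_table - rho) ** 2)]
    return a, sigma_sq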
Moreover, due to the multiscale statistical characteristics of natural images, the authors found it more reasonable to extract the two generalized-distribution characteristics and the 16 neighboring-coefficient correlation characteristics at two scales. Therefore, the total number of extracted features is (2 + 16) × 2 = 36.

6. Prediction Model

By establishing the relationship between image features and subjective evaluation values, we propose in this paper an objective image quality evaluation model based on ROI-BRISQUE. We utilize the measure factor of the distorted image (V_all), obtained by effective fusion of the region-of-interest feature vector (V_roi) and the region-of-non-interest feature vector (V_non-roi), as the input of the SVR model. Moreover, we use the subjective quality evaluation value of the corresponding image as the output target to establish the image quality evaluation model based on visual perception. The calculation formula is given by Formula (15):

$$V_{all} = \lambda V_{roi} + (1 - \lambda)V_{non\text{-}roi} \quad (15)$$

The parameter λ is the weight of the image region of interest. Through training on the LIVE image database, the experimental results show that the model has good learning accuracy and predictive ability.
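A minimal sketch of the prediction model described by Formula (15), using scikit-learn's epsilon-SVR; the kernel, the value of λ and the other hyper-parameters are illustrative assumptions rather than the settings used in the paper.

import numpy as np
from sklearn.svm import SVR

def fuse_features(v_roi, v_non_roi, lam=0.7):
    """V_all = lambda * V_roi + (1 - lambda) * V_non-roi, Formula (15).
    lam = 0.7 is an assumed weight for illustration."""
    return lam * np.asarray(v_roi) + (1.0 - lam) * np.asarray(v_non_roi)

# Training: rows of fused 36-D feature vectors and their DMOS targets.
# X_train = np.vstack([fuse_features(vr, vn) for vr, vn in training_pairs])
# y_train = np.array(training_dmos)
model = SVR(kernel="rbf", C=100.0, epsilon=0.1)   # epsilon-SVR regressor
# model.fit(X_train, y_train)
# predicted_quality = model.predict(X_test)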
7. Results and Discussion

7.1. Experimental Results of the Region of Interest

In this paper, we select some pictures from the image database to test the improved Itti model. Meanwhile, in order to compare the experimental results, the saliency maps of Itti and GBVS are also listed; the results are shown in Figures 8–10.

In Figures 8–10, (a) is the original image; (b), (c) and (d) are the salient overlap maps of the Graph-Based Visual Saliency (GBVS) model, the Itti model and this paper's model respectively; and (e), (f) and (g) are the salient maps of the GBVS model, the Itti model and this paper's model respectively. The white areas represent the region of interest, and the black areas represent the region of non-interest. The higher the brightness of the white part is, the higher the degree of interest of the human eye becomes. From (b), (c) and (d), we can see that the regions of interest obtained by the method of this paper are more accurate than the others. The regions are quite consistent with human perception and the human visual process.

To verify the accuracy of the region of interest obtained by the improved Itti model, we use eye-tracking data [45]. These data were collected in order to better understand how people look at images when assessing image quality. The database consists of 40 original images. Each original image was further processed to produce four different versions, which resulted in a total of 160 images used in the experiment. Participants performed two eye-tracking experiments: one with a free-looking task and one with a quality assessment task. With the eye tracker it was possible to track the eye movements of the viewers and map the salient regions for images given different tasks and with different levels of image quality. Some meaningful progress in the design of eye-tracking data is reported in the literature [46].
Figure 8. Comparison of different visual saliency models. (a) Original image; (b–d) salient overlap maps of GBVS model, Itti model and improved Itti model respectively; (e–g) salient maps of GBVS model, Itti model and improved Itti model respectively.
Figure 9. Comparison of different visual saliency models. (a) Original image; (b–d) salient overlap maps of GBVS model, Itti model and improved Itti model respectively; (e–g) salient maps of GBVS model, Itti model and improved Itti model respectively.
Figure 10. Comparison of different visual saliency models. (a) Original image; (b–d) salient overlap maps of GBVS model, Itti model and improved Itti model respectively; (e–g) salient maps of GBVS model, Itti model and improved Itti model respectively.

Since recording eye movements is so far the most reliable way of studying human visual attention [10], it is highly desirable to use these "ground truth" visual attention data for the ROI extraction model. We use the images of the eye-tracking database to extract the region of interest, and compare the results with those of the eye-tracking database. The resulting ROI are illustrated in Figures 11 and 12.
As an example, (e) shows a saliency map derived from the eye-tracking data obtained in experiment I (the free looking task) for one of the original images, and (i) shows the map obtained in experiment II (the scoring task) for the same image. It can be seen that the improved Itti model achieves comparable testing results and approaches the performance of the eye-tracking data. This model is able to successfully predict ROI in close agreement with human judgments.
[Figure 11 image panels: (a) Original; (b) GBVS; (c) Itti; (d) Improved Itti; (e) SaliencyFree; (f) GBVS; (g) Itti; (h) Improved Itti; (i) SaliencyScore]
Figure 11. Comparison of different visual saliency models. (a) Original image; (b–d) salient overlap maps of GBVS model, Itti model and improved Itti model respectively; (f–h) salient maps of GBVS model, Itti model and improved Itti model respectively; (e) salient map of free looking task; (i) salient map of scoring task.
[Figure 12 image panels: (a) Original; (b) GBVS; (c) Itti; (d) Improved Itti; (e) SaliencyFree; (f) GBVS; (g) Itti; (h) Improved Itti; (i) SaliencyScore]
Figure 12. Comparison of different visual saliency models. (a) Original image; (b–d) salient overlap maps of GBVS model, Itti model and improved Itti model respectively; (f–h) salient maps of GBVS model, Itti model and improved Itti model respectively; (e) salient map of free looking task; (i) salient map of scoring task.

7.2. Experiments and Related Analysis of Image Quality Evaluation

7.2.1. LIVE IQA Database

We use the LIVE IQA database2 [38] to test the performance of the ROI-BRISQUE method. The database consists of 29 reference images and 779 distorted images spanning several distortion categories: JPEG compressed images (169 images), JPEG2000 compressed images (175 images), white noise (145 images), Gaussian blur (145 images) and Rayleigh fading channel (fast fading, 145 images). Further, each distorted image has an associated difference mean opinion score (DMOS), which represents the perceived quality of the image.
In order to calibrate the relationship between the extracted features and the ROI, as well as the DMOS, the ROI-BRISQUE method requires two subsets of images. We randomly divide the LIVE IQA database2 into two non-overlapping sets: 80% for training and 20% for testing. We do this to make sure that the experimental results do not depend on features extracted from known distorted images, which could otherwise inflate the apparent performance of the ROI-BRISQUE method. Further, in order to eliminate performance bias, we repeat this random 80% train and 20% test procedure 1000 times; the figures reported in the following text are the median values of performance across these 1000 iterations. In this paper, we use four commonly used performance indexes to evaluate the IQA algorithms: Spearman's rank ordered correlation coefficient (SROCC), Kendall rank order correlation coefficient (KROCC), Pearson's linear correlation coefficient (PLCC) and root mean squared error (RMSE). The first two indexes measure the prediction monotonicity of an IQA algorithm. The other two indexes are computed after passing the algorithmic value through a logistic nonlinearity as in [47].

7.2.2. Performance Analysis

In this paper, all the procedures are performed using Matlab (R2011a). We use the LIBSVM package [48] to implement the SVR with a radial basis function (RBF) kernel for regression. The operating environment is an Intel® Core™ i5 2.67 GHz processor with 6 GB of RAM. First of all, in order for the proposed method to achieve better performance, we need to choose a reasonable weight value of λ in Formula (15). We use different values of λ to test the proposed model; the results are shown in Table 1. As observed from Table 1, the experimental results obtained with Formula (15) are best when λ = 0.7. In addition, we also utilize the Itti and GBVS models to extract the ROI of the image and test the proposed model. The corresponding performance indices are tabulated in Tables 2 and 3, respectively. In these two tables, since the Itti and GBVS models cannot accurately describe the scope of the region of interest, their performance is inferior to that of the improved Itti model. In order to test the performance of the proposed algorithm further, we report the performance of two FR IQA algorithms, the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM). Since the two algorithms perform well in FR IQA, they have been used as benchmarks for many years. We also select three NR IQA algorithms, BLIINDS-II, DIIVINE and BRISQUE, to compare the performance, because these three methods are general-purpose methods that perform well in the DCT, wavelet and spatial domains respectively. These performance indices are tabulated in Table 4. As seen from Table 4, ROI-BRISQUE performs quite well in terms of correlation with human perception, beating the present-day full-reference and no-reference image quality assessment indices listed there.
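For concreteness, the following sketch outlines the evaluation protocol described above: repeated random 80%/20% splits, SVR regression on the extracted features, and the four indices (with the logistic nonlinearity applied before PLCC and RMSE). It is a minimal illustration, not the authors' MATLAB/LIBSVM code; scikit-learn's SVR stands in for LIBSVM, and the feature matrix `X`, the score vector `dmos`, and the hyperparameters are assumptions.

```python
# Sketch of the evaluation protocol (illustrative only).
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import spearmanr, kendalltau, pearsonr
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split

def logistic5(x, b1, b2, b3, b4, b5):
    # 5-parameter logistic mapping commonly applied before computing
    # PLCC/RMSE (as in ref. [47]).
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (x - b3)))) + b4 * x + b5

def evaluate(pred, dmos):
    srocc, _ = spearmanr(pred, dmos)
    krocc, _ = kendalltau(pred, dmos)
    p0 = [np.max(dmos), 1.0, np.mean(pred), 1.0, np.mean(dmos)]
    beta, _ = curve_fit(logistic5, pred, dmos, p0=p0, maxfev=20000)
    fitted = logistic5(pred, *beta)
    plcc, _ = pearsonr(fitted, dmos)
    rmse = float(np.sqrt(np.mean((fitted - dmos) ** 2)))
    return srocc, krocc, plcc, rmse

def median_performance(X, dmos, n_trials=1000):
    results = []
    for trial in range(n_trials):
        # Random, non-overlapping 80% train / 20% test split per trial.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, dmos, test_size=0.2, random_state=trial)
        # RBF-kernel SVR; C/gamma here are illustrative, not the paper's.
        model = SVR(kernel="rbf", C=1024, gamma="scale").fit(X_tr, y_tr)
        results.append(evaluate(model.predict(X_te), y_te))
    # Median SROCC, KROCC, PLCC, RMSE over all trials.
    return np.median(np.array(results), axis=0)
```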
Table 1. Performance of ROI-BRISQUE with ROI extracted by the improved Itti model. The numbers in bold are the best values.

λ      KROCC     RMSE      SROCC     PLCC
0.0    0.7674    7.9544    0.8805    0.8818
0.1    0.7692    7.7801    0.8936    0.8958
0.2    0.7725    6.9801    0.9169    0.9220
0.3    0.7802    6.2725    0.9234    0.9267
0.4    0.7816    5.9544    0.9303    0.9352
0.5    0.7899    5.5036    0.9439    0.9459
0.6    0.7907    5.0113    0.9443    0.9450
0.7    0.8021    4.7825    0.9495    0.9494
0.8    0.7981    4.9130    0.9493    0.9482
0.9    0.7910    5.1800    0.9454    0.9444
1.0    0.7841    5.5767    0.9413    0.9396
Table 2. Performance of ROI-BRISQUE with ROI extracted by the Itti model. The numbers in bold are the best values in the corresponding column.

λ      KROCC     RMSE      SROCC     PLCC
0.0    0.7452    7.0713    0.9059    0.8998
0.1    0.7585    6.6507    0.9175    0.9119
0.2    0.7714    6.3002    0.9269    0.9214
0.3    0.7829    5.9391    0.9352    0.9304
0.4    0.7926    5.7110    0.9407    0.9354
0.5    0.7916    5.7155    0.9412    0.9353
0.6    0.7849    5.7062    0.9378    0.9356
0.7    0.7848    5.6635    0.9376    0.9365
0.8    0.7736    5.9682    0.9299    0.9294
0.9    0.7572    6.3446    0.9196    0.9200
1.0    0.7509    6.4869    0.9145    0.9162
Table 3. Performance of ROI-BRISQUE with ROI extracted by the GBVS model. The numbers in bold are the best values.

λ      KROCC     RMSE      SROCC     PLCC
0.0    0.7490    6.7257    0.9175    0.9102
0.1    0.7592    6.3346    0.9258    0.9207
0.2    0.7709    6.0183    0.9321    0.9288
0.3    0.7938    5.5164    0.9432    0.9401
0.4    0.7991    5.3783    0.9453    0.9433
0.5    0.7938    5.5856    0.9439    0.9392
0.6    0.7918    5.5501    0.9429    0.9399
0.7    0.7890    5.5012    0.9400    0.9409
0.8    0.7880    5.5172    0.9404    0.9405
0.9    0.7804    5.6695    0.9371    0.9372
1.0    0.7745    5.8075    0.9341    0.9345
Table 4. Performance comparison of some IQA methods and the method of this paper. Italicized algorithms are NR IQA algorithms, others are FR IQA algorithms.

Method         KROCC     RMSE       SROCC     PLCC
PSNR           0.6674    11.2795    0.8873    0.8866
SSIM           0.7410    10.5723    0.9201    0.9110
BLIINDS-II     0.7619     9.8894    0.9123    0.9145
DIIVINE        0.7643     8.1542    0.9265    0.9283
BRISQUE        0.7768     6.5468    0.9314    0.9377
ROI-BRISQUE    0.8021     4.7613    0.9522    0.9465
Ideally, the scatter plot should form a tight cluster, meaning that the subjective score and the objective evaluation score are strongly correlated, since an ideal IQA algorithm should accurately reflect the subjective score, such as the DMOS [49]. The scatter plots of different IQA models are shown in Figure 13, where each point represents one test image, with the vertical and horizontal axes representing its predicted score and its subjective quality score (DMOS), respectively. The scatter plots of the full-reference methods PSNR and SSIM are dispersed, while the plots of the no-reference methods DIIVINE, BLIINDS-II and BRISQUE are evenly distributed along the curve for low distortion, where the prediction is relatively accurate. When the degree of distortion is large, there is a deviation between the prediction results and the true values, and the scatter points become more dispersed. The method proposed in this paper can accurately predict the quality of images with different degrees of distortion; its points cluster tightly along the curve.
[Figure 13 image panels: (a) PSNR; (b) SSIM; (c) DIIVINE; (d) BLIINDS-II; (e) BRISQUE; (f) Method of this paper]
Figure 13. Scatter plots of DMOS versus different model predictions. Each sample point represents one test image in the entire LIVE database. (a) PSNR model; (b) SSIM model; (c) DIIVINE model; (d) BLIINDS-II model; (e) BRISQUE model; (f) Model of this paper (ROI-BRISQUE).

In order to verify the accuracy of the method proposed in this paper further, we test the five categories of image distortion in the LIVE database respectively. In addition, 100 images of various types are randomly selected for the experiment. We list the median performance parameters for each distortion category as well as the overall parameters. The results are shown in Table 5.
Table 5. Performance comparison of different distortion types.
Metric    Method         JPEG      JP2K      BLUR      WN        FF        ALL
RMSE      ROI-BRISQUE    2.5975    3.9836    2.2188    2.6453    3.9494    4.7613
          BRISQUE        4.0339    5.4350    4.0374    3.3340    5.6078    6.5468
KROCC     ROI-BRISQUE    0.8071    0.7927    0.8384    0.8933    0.7398    0.8021
          BRISQUE        0.7691    0.7619    0.8286    0.8792    0.7237    0.7768
PLCC      ROI-BRISQUE    0.9867    0.9693    0.9801    0.9927    0.9719    0.9465
          BRISQUE        0.9551    0.9090    0.9498    0.9903    0.9148    0.9377
SROCC     ROI-BRISQUE    0.9858    0.9690    0.9865    0.9908    0.9561    0.9522
          BRISQUE        0.9647    0.9139    0.9479    0.9843    0.9053    0.9314
As a whole, it is obvious that ROI-BRISQUE performs well in terms of correlation with human perception and is competitive with BRISQUE across distortion types. From Table 5, it can be seen that the ROI-BRISQUE algorithm outperforms the BRISQUE indices for all five types of distortion. This is a remarkable result, as it shows that an algorithm operating without any reference information can offer performance competitive with the IQA algorithms that have been predominant over the last decades. Figure 14 shows the scatter plots for the five categories; they indicate that the proposed method has higher accuracy and stability than conventional image quality evaluation methods. We also list the median correlation values of ROI-BRISQUE as well as the PSNR, SSIM, DIIVINE, BLIINDS-II and BRISQUE algorithms. Although the experimental results show some differences in the median correlation, in this part we evaluate whether these differences in correlation are statistically significant. To visualize the statistical significance of the comparison, we show error bars of the distribution of the SROCC values for each of the 1000 experimental trials. The error bars indicate the 95% confidence interval for each algorithm [50]. Plots are shown in Figure 15.
Obviously, the lower the confidence interval with a higher median SROCC, the better the performance is. The plots show that ROI-BRISQUE is not statistically significantly different in performance. In addition, in order to evaluate the statistical significance of the performance of each algorithm, we use a one-sided t-test between those correlation values [50]. We tabulate the results of this statistical analysis in Table 6. The null hypothesis is that the mean correlation for the row is equal to the mean correlation for the column at the 95% confidence level [51]. The alternative hypothesis is that the mean correlation of the row is greater or lesser than the mean correlation of the column. In Table 6, the value of "1" indicates that the row method is statistically superior to the column method, "−1" indicates that the row method is inferior to the column method, and "0" indicates that the row and the column methods are equivalent.
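A minimal sketch of how such a significance matrix can be assembled from the per-trial SROCC values is given below; it is illustrative only, and the dictionary `srocc` of 1000-element SROCC arrays per algorithm is an assumed input (SciPy 1.6 or newer is needed for the one-sided `alternative` argument).

```python
# Sketch of building a Table 6-style significance matrix (illustrative).
import numpy as np
from scipy.stats import ttest_ind

def significance_matrix(srocc, alpha=0.05):
    """srocc: dict mapping algorithm name -> array of per-trial SROCC values."""
    names = list(srocc)
    n = len(names)
    table = np.zeros((n, n), dtype=int)
    for i, row in enumerate(names):
        for j, col in enumerate(names):
            if i == j:
                continue
            # One-sided tests at the 95% confidence level: is the mean
            # correlation of the row greater (or lesser) than the column's?
            p_greater = ttest_ind(srocc[row], srocc[col],
                                  alternative="greater").pvalue
            p_less = ttest_ind(srocc[row], srocc[col],
                               alternative="less").pvalue
            if p_greater < alpha:
                table[i, j] = 1       # row statistically superior
            elif p_less < alpha:
                table[i, j] = -1      # row statistically inferior
            # otherwise 0: statistically equivalent
    return names, table
```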
[Figure 14 image panels: (a) JPEG; (b) JP2k; (c) WN; (d) BLUR; (e) FF]

Figure 14. Scatter plots of JPEG, JP2k, WN, BLUR and FF: predicted score versus subjective DMOS on the LIVE IQA database. (a) JPEG database subset; (b) JPEG2000 database subset; (c) Gaussian white noise database subset; (d) Gaussian blur database subset; (e) fast-fading database subset.
Figure 15. Mean SROCC value of the algorithms evaluated in Table 4, across 1000 train-test trials on the LIVE IQA database2. The error bars indicate the 95% confidence interval.
Table 6. Results of the one-sided t-test performed between SROCC values of various IQA algorithms. The value of "1" indicates that the row algorithm is statistically superior to the column algorithm; "−1" indicates that the row is inferior to the column; "0" indicates that the row and the column methods are equivalent. Italics indicate no-reference algorithms.
               PSNR    SSIM    BLIINDS-II    DIIVINE    BRISQUE    ROI-BRISQUE
PSNR            0       −1        −1           −1         −1          −1
SSIM            1        0         1           −1         −1          −1
BLIINDS-II      1       −1         0           −1         −1          −1
DIIVINE         1        1         1            0         −1          −1
BRISQUE         1        1         1            1          0          −1
ROI-BRISQUE     1        1         1            1          1           0
As observed from Table 6, ROI-BRISQUE is statistically better than the other IQA methods listed there.

7.2.3. Database Independence

Since the ROI-BRISQUE algorithm is evaluated only on the LIVE IQA database2, we cannot be sure that ROI-BRISQUE maintains its superiority on other IQA databases. To show this, we apply ROI-BRISQUE to the alternate TID2013 database [52]. The TID2013 database contains 25 reference images and 3000 distorted images (25 reference images × 24 types of distortions × 5 levels of distortions). Further, although it has 24 distortion categories, we test ROI-BRISQUE only on those categories for which it has been trained: JPEG, JP2k (JPEG2000 compression), BLUR (Gaussian blur), and WN (additive white noise). The TID2013 database does not contain FF (fast fading) distortion. The results of testing on the TID2013 database are shown in Table 7. Compared with PSNR, SSIM, BLIINDS-II, DIIVINE and BRISQUE, the correlation of ROI-BRISQUE with subjective perception remains consistently competitive.

Table 7. SROCC results obtained by training on the LIVE IQA database and testing on the TID2013 database. Italicized algorithms are NR IQA algorithms, others are FR IQA algorithms.
Method         JP2K      JPEG      WN        BLUR      All
PSNR           0.8250    0.8760    0.9180    0.9342    0.8700
SSIM           0.9630    0.9354    0.8168    0.9600    0.9016
BLIINDS-II     0.9157    0.8901    0.6600    0.8500    0.8442
DIIVINE        0.9240    0.8660    0.8510    0.8620    0.8890
BRISQUE        0.8320    0.9240    0.8290    0.8810    0.8960
ROI-BRISQUE    0.8534    0.9276    0.8488    0.8965    0.8993
In addition, to study whether the algorithm is database dependent, we also test ROI-BRISQUE on a portion of the CSIQ image database [53]. The database consists of 30 original images, each distorted using one of six types of distortions, each at four to five different levels of distortion. The database contains 5000 subjective ratings from 35 different observers, and the ratings are reported in the form of DMOS (the value is in the range [0, 1], where 1 denotes the lowest quality). The distortions used in CSIQ include JPEG compression, JPEG2000 compression, global contrast decrements, additive pink Gaussian noise, and Gaussian blurring. In total, there are 866 distorted images. We test ROI-BRISQUE on the CSIQ database over the four distortion categories in common with the LIVE database: JPEG, JPEG2000, Gaussian noise, and Gaussian blurring. The experimental results are reported in Table 8. It is obvious that the performance of ROI-BRISQUE does not depend on the database; the correlations remain consistently high.
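The cross-database checks reported in Tables 7 and 8 follow the pattern sketched below: train once on LIVE and evaluate SROCC on the shared distortion categories of the other database. The loader producing `other_db` and the SVR hyperparameters are assumptions, not the paper's exact setup.

```python
# Sketch of the cross-database check (illustrative only).
from scipy.stats import spearmanr
from sklearn.svm import SVR

def cross_database_srocc(X_live, dmos_live, other_db, common_types):
    """other_db: dict distortion_type -> (features, subjective_scores)."""
    # Train a single model on the whole LIVE IQA database.
    model = SVR(kernel="rbf", C=1024, gamma="scale").fit(X_live, dmos_live)
    scores = {}
    for dist_type in common_types:          # e.g. "JPEG", "JP2K", "WN", "BLUR"
        X_other, mos_other = other_db[dist_type]
        pred = model.predict(X_other)
        # |SROCC| is reported because TID2013 uses MOS (higher = better)
        # while the model is trained on DMOS (higher = worse).
        scores[dist_type] = abs(spearmanr(pred, mos_other)[0])
    return scores
```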
Table 8. SROCC results obtained by training on the LIVE IQA database and testing on the CSIQ database. Italicized algorithms are NR IQA algorithms, others are FR IQA algorithms.
Method         JP2K      JPEG      WN        BLUR      All
PSNR           0.8905    0.8993    0.9173    0.8620    0.8810
SSIM           0.9606    0.9546    0.8922    0.9609    0.9420
BLIINDS-II     0.9057    0.8809    0.6956    0.8653    0.8452
DIIVINE        0.8084    0.8673    0.8594    0.8312    0.8472
BRISQUE        0.8662    0.8394    0.8253    0.8625    0.8509
ROI-BRISQUE    0.8869    0.8404    0.8277    0.8689    0.8553
8. Conclusions

With the development of information technology, digital images are widely used in every corner of our life and work, so research on image quality assessment algorithms has important practical significance. In this paper, we describe a framework for constructing an objective no-reference image quality assessment measure. The framework is distinctive in that it incorporates features of the human visual system. On the basis of the human visual attention mechanism and the BRISQUE method, this paper proposes a new NR IQA method based on visual perception, which we name ROI-BRISQUE. First, we detailed the algorithm and the extraction model of the ROI, and demonstrated that the extracted ROI are quite consistent with human perception. Then, we used the BRISQUE method to extract image features, evaluated the ROI-BRISQUE index in terms of correlation with subjective quality scores, and demonstrated that the algorithm is statistically better than several state-of-the-art IQA indices. Finally, we demonstrated that ROI-BRISQUE performs consistently across different databases with similar distortions. Further, the experimental results show that the new ROI-BRISQUE algorithm can be easily trained on the LIVE IQA database. In addition, it may be easily extended beyond the distortions considered here, making it suitable for general-purpose blind IQA problems. The method correlates highly with visual perception of quality, outperforms the full-reference PSNR and SSIM measures and the recent no-reference BLIINDS-II and DIIVINE indices, and exceeds the performance of the no-reference BRISQUE index. In the future, we will make further efforts to improve ROI-BRISQUE, such as further exploring the human visual system and extracting the region of interest more accurately.

Author Contributions: Both authors contributed equally to this work. Yan Fu and Shengchun Wang conceived this novel algorithm. Shengchun Wang performed the experiments and analyzed the data. Yan Fu tested the novel algorithm. Shengchun Wang wrote the paper, and Yan Fu revised the paper. Both authors have read and approved the final manuscript.

Conflicts of Interest: The authors declare no conflict of interest.
References

1. Liu, L.; Dong, H.; Huang, H.; Bovik, A.C. No-reference image quality assessment in curvelet domain. Signal Process. Image Commun. 2014, 29, 494–505.
2. Hu, A.; Zhang, R.; Yin, D.; Zhan, Y. Image quality assessment using a SVD-based structural projection. Signal Process. Image Commun. 2014, 29, 293–302.
3. Liu, L.; Liu, B.; Huang, H.; Bovik, A.C. No-reference image quality assessment based on spatial and spectral entropies. Signal Process. Image Commun. 2014, 29, 856–863.
4. Requirements for an Objective Perceptual Multimedia Quality Model. Available online: https://www.itu.int/rec/T-REC-J.148-200305-I/en (accessed on 13 December 2016).
5. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
6. Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multiscale structural similarity for image quality assessment. Signals Syst. Comput. 2003, 2, 1398–1402.
7. Chandler, D.M.; Hemami, S.S. VSNR: A wavelet-based visual signal-to-noise ratio for natural images. IEEE Trans. Image Process. 2007, 16, 2284–2298.
8. Saad, M.A.; Bovik, A.C.; Charrier, C. A DCT statistics-based blind image quality index. Signal Process. Lett. 2010, 17, 583–586.
9. Pedersen, M.; Hardeberg, J.Y. Using gaze information to improve image difference metrics. Proc. SPIE Int. Soc. Opt. Eng. 2008, 6806, 97–104.
10. Liu, H.; Heynderickx, I. Visual attention in objective image quality assessment: Based on eye-tracking data. IEEE Trans. Circuits Syst. Video Technol. 2011, 21, 971–982.
11. Wang, B.; Wang, Z.; Liao, Y.; Lin, X. HVS-based structural similarity for image quality assessment. In Proceedings of the International Conference on Signal Processing, Beijing, China, 26–29 October 2008.
12. Babu, R.V.; Perkis, A. An HVS-based no-reference perceptual quality assessment of JPEG coded images using neural networks. IEEE Int. Conf. Image Process. 2005, 1, I-433-6.
13. Zhang, Y.D.; Wu, L.N.; Wang, S.H.; Wei, G. Color image enhancement based on HVS and PCNN. Sci. Chin. Inf. Sci. 2010, 53, 1963–1976.
14. Zhang, S.; Zhang, C.; Zhang, T.; Lei, Z. Review on universal no-reference image quality assessment algorithm. Comput. Eng. Appl. 2015, 51, 13–23.
15. Feng, X.J.; Allebach, J.P. Measurement of ringing artifacts in JPEG images. Proc. SPIE 2006, 6076, 74–83.
16. Meesters, L.; Martens, J.B. A single-ended blockiness measure for JPEG-coded images. Signal Process. 2002, 82, 369–387.
17. Wang, Z.; Sheikh, H.R.; Bovik, A.C. No-reference perceptual quality assessment of JPEG compressed images. Proc. IEEE Int. Conf. Image Process. 2003, 1, I-477–I-480.
18. Shan, S. No-reference visually significant blocking artifact metric for natural scene images. Signal Process. 2009, 89, 1647–1652.
19. Tong, H.; Li, M.; Zhang, H.J.; Zhang, C. No-reference quality assessment for JPEG2000 compressed images. Int. Conf. Image Process. 2004, 5, 3539–3542.
20. Marziliano, P.; Dufaux, F.; Winkler, S.; Ebrahimi, T. Perceptual blur and ringing metrics: Application to JPEG2000. Signal Process. Image Commun. 2004, 19, 163–172.
21. Sazzad, Z.M.P.; Kawayoke, Y.; Horita, Y. No reference image quality assessment for JPEG2000 based on spatial features. Signal Process. Image Commun. 2008, 23, 257–268.
22. Sheikh, H.R.; Bovik, A.C.; Cormack, L. No-reference quality assessment using natural scene statistics: JPEG2000. IEEE Trans. Image Process. 2005, 14, 1918–1927.
23. Caviedes, J.; Oberti, F. A new sharpness metric based on local kurtosis, edge and energy information. Signal Process. Image Commun. 2004, 19, 147–161.
24. Ferzli, R.; Karam, L.J. A no-reference objective image sharpness metric based on the notion of Just Noticeable Blur (JNB). IEEE Trans. Image Process. 2009, 18, 717–728.
25. Sang, Q.B.; Su, Y.Y.; Li, C.F.; Wu, X.J. No-reference blur image quality assessment based on gradient similarity. J. Optoelectron. Laser 2013, 24, 573–577.
26. Moorthy, A.K.; Bovik, A.C. A two-step framework for constructing blind image quality indices. Signal Process. Lett. 2010, 17, 513–516.
27. Moorthy, A.K.; Bovik, A.C. Blind image quality assessment: From natural scene statistics to perceptual quality. IEEE Trans. Image Process. 2011, 20, 3350–3364.
28. Saad, M.A.; Bovik, A.C.; Charrier, C. DCT statistics model based blind image quality assessment. In Proceedings of the 18th IEEE International Conference on Image Processing (ICIP), Brussels, Belgium, 11–14 September 2011.
29. Mittal, A.; Moorthy, A.K.; Bovik, A.C. No-reference image quality assessment in the spatial domain. IEEE Trans. Image Process. 2012, 21, 4695–4708.
30. Ruderman, D.L. The statistics of natural images. Netw. Comput. Neural Syst. 1994, 5, 517–548.
31. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a "completely blind" image quality analyzer. Signal Process. Lett. 2013, 20, 209–212.
32. Zhang, L.; Shen, Y.; Li, H. VSI: A visual saliency-induced index for perceptual image quality assessment. IEEE Trans. Image Process. 2014, 23, 4270–4281.
33. Itti, L.; Koch, C.; Niebur, E. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 1254–1259.
34. Itti, L.; Koch, C. Computational modeling of visual attention. Nat. Rev. Neurosci. 2001, 2, 194–230.
35. Itti, L.; Koch, C. Feature combination strategies for saliency based visual attention systems. J. Electr. Imaging 2001, 10, 161–169.
36. Navalpakkam, V.; Itti, L. An integrated model of top-down and bottom-up attention for optimizing detection speed. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, NY, USA, 17–22 June 2006.
37. Schölkopf, B.; Platt, J.; Hofmann, T. Graph-Based Visual Saliency. Adv. Neural Inf. Process. Syst. 2006, 19, 545–552.
38. Sheikh, H.R.; Wang, Z.; Cormack, L.; Bovik, A.C. LIVE Image Quality Assessment Database Release 2. Available online: http://live.ece.utexas.edu/research/quality/subjective.htm (accessed on 13 December 2016).
39. Zhang, Y.; Dong, Z.; Wang, S.; Ji, G.; Yang, J. Preclinical diagnosis of magnetic resonance (MR) brain images via discrete wavelet packet transform with Tsallis entropy and generalized eigenvalue proximate support vector machine (GEPSVM). Entropy 2015, 17, 1795–1813.
40. Wang, S.; Lu, S.; Dong, Z.; Yang, J.; Yang, M.; Zhang, Y. Dual-tree complex wavelet transform and twin support vector machine for pathological brain detection. Appl. Sci. 2016, 6, 169.
41. Barina, D. Gabor wavelets in image processing. 2016; arXiv:1602.03308.
42. Sharifi, K.; Leon-Garcia, A. Estimation of shape parameter for generalized Gaussian distributions in sub-band decompositions of video. IEEE Trans. Circuits Syst. Video Technol. 1995, 5, 52–56.
43. Wainwright, M.J.; Simoncelli, E.P. Scale mixtures of Gaussians and the statistics of natural images. Neural Inf. Process. Syst. 2000, 12, 855–861.
44. Lasmar, N.E.; Stitou, Y.; Berthoumieu, Y. Multiscale skewed heavy tailed model for texture analysis. In Proceedings of the 16th IEEE International Conference on Image Processing (ICIP), Cairo, Egypt, 7–10 November 2009.
45. Alers, H.; Liu, H.; Redi, J.; Heynderickx, I. TUD Image Quality Database: Eye-Tracking Release 2. Available online: http://mmi.tudelft.nl/iqlab/eye_tracking_2.html (accessed on 13 December 2016).
46. Alers, H.; Liu, H.; Redi, J.; Heynderickx, I. Studying the risks of optimizing the image quality in saliency regions at the expense of background content. In Proceedings of the IS&T/SPIE Electronic Imaging 2010, Image Quality and System Performance VII, San Jose, CA, USA, 17–21 January 2010.
47. Sheikh, H.R.; Sabir, M.F.; Bovik, A.C. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Trans. Image Process. 2006, 15, 3440–3451.
48. Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 389–396.
49. Tong, Y.; Konik, H.; Cheikh, F.A.; Tremeau, A. Full reference image quality assessment based on saliency map analysis. J. Imaging Sci. Technol. 2010, 54, 3.
50. Sheskin, D.J. Handbook of Parametric and Nonparametric Statistical Procedures, 3rd ed.; Boca Raton, FL, USA, 2012.
51. Steiger, J.H. Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis. Psychol. Methods 2004, 9, 164–182.
52. Ponomarenko, N.; Ieremeiev, O.; Lukin, V.; Egiazarian, K.; Jin, L.; Astola, J.; Vozel, B.; Chehdi, K.; Carli, M.; Battisti, F.; et al. Color image database TID2013: Peculiarities and preliminary results. In Proceedings of the European Workshop on Visual Information Processing, Paris, France, 10–12 June 2013.
53. Larson, E.C.; Chandler, D.M. Most apparent distortion: Full-reference image quality assessment and the role of strategy. J. Electr. Imaging 2010, 19, 143–153.

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).