Objectionable Image Recognition System in Compression Domain Qixiang Ye, Wen Gao, Wei Zeng, Tao Zhang, Weiqiang Wang, Yang Liu Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China {Qxye, Wgao, Wqw}@ict.ac.cn
Abstract. In this paper, we propose an intelligent recognition system for objectionable images in the JPEG compressed domain. First, the system applies a robust skin color model and skin texture analysis to detect the skin regions in an input image based on its DC and AC coefficients. Then, color, texture, shape and statistical features are extracted from the skin regions and fed into a decision tree classifier for classification. A large image library of about 120,000 images is employed to evaluate the system's effectiveness.
1 Introduction
The Internet has brought more and more objectionable information, especially objectionable images, to people. There is a strong and growing demand for shielding underage children from these images. Because text analysis has many limitations when used to recognize objectionable images, several solutions based on image content analysis have been proposed. In [1], Forsyth et al. designed and implemented an automatic system for identifying whether naked people are present in an image. After skin detection, geometric processing is used to group skin regions into human figures for human body detection. Wang et al. [2] presented a system for screening objectionable images in practical applications. Wang's method combines an icon filter, a graph-photo detector, a color histogram filter, a texture filter and a wavelet-based shape matching algorithm. Jones and Rehg developed an objectionable image detection system based on statistical skin color models and a neural network classifier [3].
We observe that most objectionable images on the web are in JPEG format, and that compressed domain processing has two advantages over spatial domain processing: a) smaller data volume, and b) lower computational complexity, since full decompression can be avoided. We therefore develop this objectionable image recognition system for compressed images. The system consists of three steps: skin detection, feature extraction and objectionable image recognition. We introduce them in Sections 2 and 3. Experimental results are summarized in Section 4. Section 5 gives the summary.
2 Skin detection in compressed domain
Before introducing the skin detection algorithm, we briefly introduce the block DCT.
2.1 Block-DCT transformation
Given a JPEG image, after Huffman decoding, the DCT coefficients of each 8 × 8 block can be obtained directly, without performing the IDCT operation [4]. The first coefficient of a given block (DC) represents the average pixel intensity of that block (Fig. 1) [4]. We can therefore judge whether an 8 × 8 block is a skin color block from its DC value alone. Whether an 8 × 8 block is a skin texture block is decided from its AC coefficients by the method proposed in Section 2.3.
Fig.1. DCT features of an 8 × 8 block. ① is the DC coefficient; ② ③ ④ ⑤ are the coefficient groups representing the characteristics of each frequency band.
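The relation between the DC coefficient and the block average can be checked numerically. The following is a minimal sketch, assuming the orthonormal 8 × 8 DCT-II (the same scaling as the JPEG forward DCT), under which the DC coefficient equals 8 times the block mean. The function names and the test block are illustrative only; a real JPEG decoder would additionally handle dequantization and the level shift of 128.

```python
import math

N = 8  # JPEG block size


def dct2_8x8(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers)."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


def block_mean_from_dc(dc):
    """With this normalisation DC = 8 * mean, so the mean is DC / 8."""
    return dc / N


# Hypothetical test block, not taken from the paper.
block = [[(3 * x + 5 * y) % 256 for y in range(N)] for x in range(N)]
coeffs = dct2_8x8(block)
mean_direct = sum(map(sum, block)) / (N * N)
mean_from_dc = block_mean_from_dc(coeffs[0][0])
```

In a compressed-domain system only `coeffs[0][0]` of each component would be read from the bitstream, so the per-block color is available without reconstructing any pixels.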
2.2 Skin color detection
To weaken the influence of illumination, we propose a simple and efficient solution for detecting skin color under different illuminations. We pre-group the sample images into clusters according to their average image brightness and learn a separate color model in the YCbCr color space for each cluster. In the proposed approach, a pixel is represented by its color and the average image brightness L. The posterior probability of a skin pixel is then

$$P(\mathrm{Skin} \mid YCbCr, L) = \frac{P(YCbCr, L \mid \mathrm{Skin})\, P(\mathrm{Skin})}{P(YCbCr, L \mid \mathrm{Skin})\, P(\mathrm{Skin}) + P(YCbCr, L \mid \neg\mathrm{Skin})\, P(\neg\mathrm{Skin})} \tag{2}$$
where $P(YCbCr, L \mid \mathrm{Skin})$ and $P(YCbCr, L \mid \neg\mathrm{Skin})$ are the class-conditional probabilities of skin and non-skin pixels under the average brightness L. They are formulated as

$$P(YCbCr, L \mid \mathrm{Skin}) = \frac{P(YCbCr \mid \mathrm{Skin}, L)\, P(\mathrm{Skin}, L)}{P(\mathrm{Skin})} \tag{3}$$

$$P(YCbCr, L \mid \neg\mathrm{Skin}) = \frac{P(YCbCr \mid \neg\mathrm{Skin}, L)\, P(\neg\mathrm{Skin}, L)}{P(\neg\mathrm{Skin})} \tag{4}$$
Then, Eq. (2) can be rewritten as

$$P(\mathrm{Skin} \mid YCbCr, L) = \frac{P(YCbCr \mid \mathrm{Skin}, L)\, P(\mathrm{Skin}, L)}{P(YCbCr \mid \mathrm{Skin}, L)\, P(\mathrm{Skin}, L) + P(YCbCr \mid \neg\mathrm{Skin}, L)\, P(\neg\mathrm{Skin}, L)} \tag{5}$$
Under the assumption that the prior probabilities $P(\mathrm{Skin}, L)$ and $P(\neg\mathrm{Skin}, L)$ are equal, we get

$$P(\mathrm{Skin} \mid YCbCr, L) = \frac{P(YCbCr \mid \mathrm{Skin}, L)}{P(YCbCr \mid \mathrm{Skin}, L) + P(YCbCr \mid \neg\mathrm{Skin}, L)} \tag{6}$$
A pixel is classified as skin if $P(\mathrm{Skin} \mid YCbCr, L) \ge \theta$, where $\theta \in [0, 1]$ is the threshold. To choose the threshold, the receiver operating characteristic (ROC) curve is drawn from the relationship between correct and false detections as a function of the detection threshold θ. The selection is based on the fact that the optimal threshold should lie near the bend of the ROC curve [5]. Fig. 3 gives the skin color detection results for one image, which show that our method performs better than a single color model.
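The decision rule of Eq. (6) can be illustrated with a minimal sketch. It assumes that each brightness cluster stores two normalized color histograms (skin and non-skin) indexed by a quantized color bin; the histogram layout, the cluster boundaries in `edges`, and all numeric values below are hypothetical and are not taken from the paper.

```python
def skin_posterior(p_color_given_skin, p_color_given_nonskin):
    """Eq. (6): posterior under equal priors P(Skin, L) = P(not Skin, L)."""
    denom = p_color_given_skin + p_color_given_nonskin
    if denom == 0.0:
        return 0.0  # colour never seen in training: treated as non-skin (assumption)
    return p_color_given_skin / denom


def brightness_cluster(avg_brightness, edges):
    """Index of the brightness cluster L; `edges` are ascending upper bounds."""
    for idx, upper in enumerate(edges):
        if avg_brightness < upper:
            return idx
    return len(edges)


def is_skin(color_bin, avg_brightness, models, edges, theta):
    """Pick the colour model for the image brightness, then threshold Eq. (6)."""
    skin_hist, nonskin_hist = models[brightness_cluster(avg_brightness, edges)]
    p = skin_posterior(skin_hist.get(color_bin, 0.0),
                       nonskin_hist.get(color_bin, 0.0))
    return p >= theta


# Hypothetical models: one (skin, non-skin) histogram pair per brightness cluster.
models = [
    ({(3, 5): 0.30, (4, 5): 0.10}, {(3, 5): 0.02, (4, 5): 0.10}),  # dark images
    ({(3, 5): 0.05, (4, 5): 0.40}, {(3, 5): 0.20, (4, 5): 0.05}),  # bright images
]
edges = [128]  # hypothetical boundary between the two clusters

dark_result = is_skin((3, 5), 90, models, edges, theta=0.5)
bright_result = is_skin((3, 5), 200, models, edges, theta=0.5)
```

In this toy setup the same color bin is accepted in a dark image and rejected in a bright one, which is the effect that conditioning on L is meant to capture.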
Fig.2. The ROC curves for skin color detection as a function of θ.
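Threshold selection from an ROC curve can be sketched as follows. The text only states that the threshold should lie near the bend of the curve; taking the ROC point closest to the ideal corner (false positive rate 0, true positive rate 1) is one common heuristic for locating that bend and is an assumption here, as are the scores and labels.

```python
def roc_points(scores, labels, thresholds):
    """Return (theta, false_positive_rate, true_positive_rate) per threshold."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    points = []
    for theta in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= theta)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s >= theta)
        points.append((theta, fp / n_neg, tp / n_pos))
    return points


def pick_threshold(points):
    """Heuristic for 'the bend': the point closest to the corner (FPR=0, TPR=1)."""
    return min(points, key=lambda p: p[1] ** 2 + (1.0 - p[2]) ** 2)[0]


# Hypothetical posterior scores for five skin and four non-skin samples.
scores = [0.95, 0.90, 0.80, 0.70, 0.55, 0.60, 0.30, 0.20, 0.10]
labels = [True, True, True, True, True, False, False, False, False]
thresholds = [i / 10 for i in range(11)]

theta = pick_threshold(roc_points(scores, labels, thresholds))
```

Other criteria, such as fixing a maximum tolerated false positive rate, would fit the same `roc_points` output.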
Fig.3. Results of skin color detection. The first column is the original image, the second column is the result of the single color model, and the third is the result of the presented method.
2.3 Skin texture detection
Skin cannot be distinguished from other things by color alone. Many things in nature share the color of human skin, such as desert, lion fur and yellow grass. Texture properties should therefore be employed to assist skin detection. In this paper, we use the AC coefficients to verify the smoothness of an 8 × 8 skin color block. As shown in Fig. 1, we use the groups ② ③ ④ ⑤ to capture the skin texture information; they represent the texture distribution from low frequency to high frequency [6]. Experiments show that the texture energy of natural materials that share their color with skin, such as desert and yellow grass, is concentrated mainly in the middle frequencies. Therefore, the frequency bands ③ ④ in Fig. 1 are assigned higher importance, and ② ⑤ are assigned lower importance. A block is a skin texture block if the following inequality holds:

$$\sum_{i=2}^{N} e^{-(i-\mu)^2 / 2\sigma^2} \cdot s_i < T_s \tag{7}$$
where $s_i$ is the sum of the squares of the AC coefficients in the ith frequency band, N = 5 is the number of coefficient groups shown in Fig. 1, and $T_s$ is the threshold. µ is set to 3.0 in the experiments and represents the center frequency. σ is set to 2.0 according to Fig. 1.
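The weighting in Eq. (7) can be sketched as below, using µ = 3.0 and σ = 2.0 as stated. The band energies are passed in directly, since the exact assignment of AC coefficients to the bands of Fig. 1 is not reproduced here; the example energies and the threshold `TS` are hypothetical.

```python
import math

MU = 3.0     # centre frequency band, as stated in the text
SIGMA = 2.0  # spread of the Gaussian weighting, as stated in the text


def band_weight(i, mu=MU, sigma=SIGMA):
    """Gaussian weight of frequency band i in Eq. (7)."""
    return math.exp(-((i - mu) ** 2) / (2.0 * sigma ** 2))


def weighted_texture_energy(band_energy):
    """Left-hand side of Eq. (7); band_energy maps band index i (2..5) to s_i."""
    return sum(band_weight(i) * s for i, s in band_energy.items())


def is_skin_texture_block(band_energy, ts):
    """A block passes the texture test when its weighted AC energy is below Ts."""
    return weighted_texture_energy(band_energy) < ts


# Hypothetical band energies s_2..s_5 for a smooth and a textured block.
smooth = {2: 40.0, 3: 10.0, 4: 5.0, 5: 1.0}
rough = {2: 40.0, 3: 900.0, 4: 700.0, 5: 50.0}
TS = 200.0  # hypothetical threshold

smooth_ok = is_skin_texture_block(smooth, TS)
rough_ok = is_skin_texture_block(rough, TS)
```

With these parameters band 3 receives weight 1, bands 2 and 4 receive about 0.88, and band 5 about 0.61, so energy in the middle bands contributes most to rejecting a block.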
The threshold $T_s$ is determined in the same way as the skin color threshold in Section 2.2. Fig. 4 shows two examples of skin texture detection results.
Fig.4. Results of skin texture detection. From left to right: original image, mask from skin color detection only, and mask from skin color detection combined with skin texture analysis.
3 Objectionable Image Recognition System
Based on the detected skin regions, we extract features to describe an image, and a decision tree classifier is employed to classify it as an objectionable or a benign image.
3.1 Image Representation
In our work, the color and texture features are extracted from the skin mask, while the shape and statistical features are extracted from the reduced (1/8) skin mask.
Color features: percentage of skin color area
Texture features: percentage of skin texture area, smoothness of the image
Shape features: color moments, Zernike moments
Statistical features: percentage of the largest connected skin area, the number of connected skin regions
3.2 Image Classification
After the trained model is obtained, the recognition system is built as shown in Fig. 5.
[Flow chart. Blocks: JPEG image; skin detection (skin color detection, skin texture analysis); object feature extraction (color features, texture features, shape features, statistical features); C4.5 classifier. Branch labels: lack of skin, yes, no. Outputs: objectionable image, benign image.]
Fig. 5. Flow chart of the proposed objectionable image recognition system.
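The statistical features and the overall flow can be sketched as follows. Several points are assumptions rather than details given in the paper: regions are taken to be 4-connected, the largest-region percentage is measured against the whole mask, the "lack of skin" branch of Fig. 5 is read as an early exit to the benign class below a hypothetical `min_skin_pct`, and the trained C4.5 tree is represented only by a placeholder callable `classify`.

```python
from collections import deque


def connected_regions(mask):
    """Sizes of the 4-connected regions of truthy cells in a 2-D mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, size = deque([(r, c)]), 0
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


def statistical_features(mask):
    """Skin area share, largest region share, and region count of a skin mask."""
    total = len(mask) * len(mask[0])
    sizes = connected_regions(mask)
    return {
        "skin_area_pct": sum(sizes) / total,
        "max_region_pct": (max(sizes) / total) if sizes else 0.0,
        "num_regions": len(sizes),
    }


def recognise(mask, classify, min_skin_pct=0.05):
    """Early exit when skin is lacking; otherwise defer to the classifier."""
    feats = statistical_features(mask)
    if feats["skin_area_pct"] < min_skin_pct:
        return "benign"
    return classify(feats)


# Hypothetical 4x4 block-level skin mask with two separate regions.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
feats = statistical_features(mask)
```

Because the mask has one entry per 8 × 8 block, this step operates on 1/64 of the pixel count of the original image.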
4 Experimental Result
In the field of information retrieval, the terms precision and recall are widely used [7]. In this paper, we use accuracy instead of precision, defined as

$$\mathrm{Recall} = \frac{a}{c}, \qquad \mathrm{Accuracy} = \frac{a+b}{c+d} \tag{9}$$
where a is the number of truly positive cases classified as positive, b is the number of truly negative cases classified as negative, c is the total number of truly positive cases and d is the total number of truly negative cases. We build an image library of 117,695 images collected from the Internet and Coredraw CDs. The images are manually divided into two categories: objectionable and benign. In each run, 1/10 of the images are selected for training and the remaining 9/10 for testing. The proportion of objectionable to benign images is set to 1:5. The average recall and accuracy are 79.2% and 92.2% respectively. The average processing speed is about 27 images/s on a PC (Pentium IV 1.6 GHz CPU), which is much faster than that of the system in [3], whose speed is about 1 image/s. Because the experiments are carried out on a much larger image library, the experimental results should be more robust than those of [3].
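The two measures can be computed as in the minimal sketch below. It takes b to be the number of correctly classified negative cases, which is what the accuracy formula requires; the counts are hypothetical and only mimic the 1:5 class proportion, they are not the paper's results.

```python
def recall(a, c):
    """Share of truly positive cases that are classified as positive."""
    return a / c


def accuracy(a, b, c, d):
    """Share of all cases, positive and negative, that are classified correctly."""
    return (a + b) / (c + d)


# Hypothetical counts with a 1:5 ratio of positive to negative cases.
a, c = 80, 100    # correctly classified positives, total positives
b, d = 470, 500   # correctly classified negatives, total negatives

example_recall = recall(a, c)
example_accuracy = accuracy(a, b, c, d)
```

With a 1:5 class ratio the accuracy is dominated by the negative class, which is why the two figures can differ considerably.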
5 Summary and Acknowledgement
A simple but robust objectionable image recognition system in the compressed domain is proposed in this paper. Its recognition speed is much faster than that of spatial-domain processing, and the system can easily be extended to MPEG video. This work is supported by the National Hi-Tech Development Programs of China under grant No. 2001AA42140.
References
1. Margaret Fleck, David A. Forsyth, Chris Bregler: Finding Naked People. Proc. 4th European Conference on Computer Vision: ECCV'96. Vol. 2. UK. (1996) 593-602
2. James Z. Wang, Jia Li, Gio Wiederhold, Oscar Firschein: System for screening objectionable images. Computer Communications, Vol. 21, Elsevier. (1998) 1355-1360
3. Michael J. Jones, James M. Rehg: Statistical Color Models with Application to Skin Detection. In Proceedings of CVPR. June (1999) 274-280
4. G.K. Wallace: The JPEG Still Picture Compression Standard. Communications of the ACM, Vol. 34. April (1991) 31-44
5. Leonid Sigal, Stan Sclaroff, Vassilis Athitsos: Estimation and Prediction of Evolving Color Distributions for Skin Segmentation Under Varying Illumination. In Proceedings of CVPR. March (2000) 152-159
6. Hee-Jung Bae, Sung-Hwan Jung: Image Retrieval Using texture based on DCT. ICICS'97. (1997) 1065-1068
7. Raghavan, V., Bollmann, P., Jung, G.: A critical investigation of recall and precision as measures of retrieval system performance. ACM Trans. on Information Systems, Vol. 7, Issue 3. (1989) 205-229