Hindawi Publishing Corporation
International Journal of Distributed Sensor Networks
Volume 2015, Article ID 146067, 10 pages
http://dx.doi.org/10.1155/2015/146067

Research Article

Fast Video Encoding Algorithm for the Internet of Things Environment Based on High Efficiency Video Coding

Jong-Hyeok Lee,1 Kyung-Soon Jang,1 Byung-Gyu Kim,1 Seyoon Jeong,2 and Jin Soo Choi2

1 Department of Computer Engineering, Sun Moon University, 100 Kalsan-ri, Tangjeong-myeon, Asan, Chungnam 305-350, Republic of Korea
2 Realistic Media Research Team, ETRI, 218 Gajeong-ro, Yuseong-gu, Daejeon 34129, Republic of Korea

Correspondence should be addressed to Byung-Gyu Kim; [email protected]

Received 22 May 2015; Revised 7 August 2015; Accepted 9 August 2015

Academic Editor: Qun Jin

Copyright © 2015 Jong-Hyeok Lee et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Video traffic on the Internet is growing, and efficient video transmission is essential for real-time processing in the Internet of Things (IoT). In the IoT environment, video applications are therefore expected to be a valuable component of networks of smart sensor devices. High Efficiency Video Coding (HEVC) has been developed by the Joint Collaborative Team on Video Coding (JCT-VC) as a new-generation video coding standard. HEVC now includes range extensions (RExt), scalable coding extensions, and multiview extensions. HEVC RExt supports high-resolution video with high bit-depths and a wide range of color formats. In this paper, a fast intraprediction unit decision method is proposed to reduce the computational complexity of the HEVC RExt encoder. To design the intramode decision algorithm, the Local Binary Pattern (LBP) of the current prediction unit is used as a texture feature. Experimental results show that encoding complexity can be reduced by 12.35% on average in the AI-Main profile configuration with only a small bit-rate increment and PSNR decrement, compared with the HEVC test model (HM) 12.0-RExt4.0 reference software.

1. Introduction

The Internet of Things (IoT) is a sensing network that connects objects to the Internet through many kinds of sensor equipment. Along with the rapid development of IoT applications, new generations of mobile broadband networks, cloud computing, and video coding technology for real-time video streaming together point toward interactive and realistic next-generation multimedia application networks, which will play a valuable role in the industrial, medical, and television fields [1–3]. MPEG has already started standardization activities to define network protocols for the Internet of Things (e.g., how to connect things). The variety and heterogeneity of "Things" make it difficult to standardize descriptions, data formats, and APIs in a global manner; once the environment is well established, however, this becomes feasible. MPEG is therefore exploring representations of multimedia things as parts of complex distributed systems that involve interaction between things and between humans and things. The multimedia data type elements correspond to descriptions of devices and to messages for "talking to" and "adapting to" devices or services in the Internet of Things.

Recently, video communication services have been shifting from lower-resolution video to ultrahigh-definition (UHD) formats. Mobile device, storage, and network technologies are striving to keep pace with these rapid market changes. Modern compression techniques must allocate significant amounts of data for storage or transmission, and UHD video content carries a very large data rate. Existing video compression technology is widely applied: broadcasting high-definition (HD) TV signals over satellite, cable, and terrestrial transmission systems, video content acquisition and editing systems, camcorders, security applications, Internet and mobile network video, Blu-ray

discs, and real-time conversational applications, such as video chat, video conferencing, and telepresence systems. However, the growing popularity of HD video, the increasing diversity of services, and the emergence of beyond-HD formats (4K × 2K or 8K × 4K resolution, called UHD) require video coding with efficiency superior to that of previous compression standards. Moreover, the traffic generated by video applications on mobile devices and tablet PCs, together with the transmission requirements of video-on-demand services, places severe pressure on existing networks. The desire for higher quality and better resolution is also driving mobile applications. H.264/MPEG-4 AVC [4] is still widely used in many applications, both real-time and non-real-time. However, this standard suffers a bit-rate increase and significant computational complexity for beyond-HD applications. A next-generation video coding scheme, called High Efficiency Video Coding (HEVC), was developed by the Joint Collaborative Team on Video Coding (JCT-VC) of ISO/IEC MPEG and ITU-T VCEG [5]. HEVC version 1 had the primary goal of achieving a 50% higher compression rate than H.264/MPEG-4 AVC, with a primary focus on 8-bit/10-bit YUV 4:2:0 video. Although the standard achieves a high compression rate through improved and modified coding tools, HEVC still requires a large amount of encoding time. HEVC extensions are being developed to support several additional application scenarios, including professional uses with enhanced precision and color format support, scalable video coding, and 3D/stereo/multiview coding. Among these extensions, the HEVC range extension (RExt) provides high bit-depths (larger than 10 bits) and additional color formats for high-resolution sequences.

HEVC RExt has the same structure as HEVC, but additional coding tool options have been added: the 4:2:2 and 4:4:4 enhanced chroma sampling structures and sample bit-depths beyond 10 bits per sample are supported [6]. UHD resolution is expected to emerge in the near future and will be supported by next-generation displays. This increase in data rate will put additional pressure on all types of networks, and data rates for video content are growing faster than network infrastructure capacities for economical delivery. HEVC and its extensions achieve good coding performance at the cost of large computational complexity, because heavy and complicated coding tools are used to improve coding efficiency and to support deep color formats and high bit-depths. To reduce the computational complexity of the HEVC RExt encoder, a fast intramode decision algorithm is proposed based on block texture information.

This paper is organized as follows: in Section 2, the HEVC structure and related works are introduced. Local Binary Patterns (LBPs) and the proposed LBP-based fast intramode decision method are described in Section 3. Section 4 presents the coding performance of the algorithm, and Section 5 presents concluding remarks.

2. HEVC Encoding Structure and Related Works

The HEVC standard adopts highly flexible and efficient block partitioning based on the coding tree unit (CTU). Three block units are defined: the coding unit (CU), prediction unit (PU), and transform unit (TU). The CU is the basic block type, analogous to the macroblock of H.264/AVC. The PU is used for the coding mode decision, including motion estimation and rate-distortion (RD) optimization (RDO). Transform and entropy coding are performed on the TU. Initially, a frame is divided into largest-size CUs, called coding tree units (CTUs). A CTU consists of coding tree blocks (CTBs): one luma block and two chroma blocks. Each CTB is an assemblage of square coding blocks (CBs) divided according to a quadtree structure; each CB is square, with a size of 8, 16, 32, or 64. This structure is more effective than the conventional H.264 approach with its fixed 16 × 16 macroblock (MB): a larger and more flexible block structure is effective for encoding high-resolution video. The CTU size is 2N × 2N, where N is 32, 16, or 8. A CTU can contain a single CU with a 2N × 2N dimension, or it can be split into four smaller CUs of equal size (N × N); each CU is recursively split into four CUs following the quadtree structure. Each CB is predicted by an intra- or interprediction process performed in the PU. The intraprediction process uses two partition modes (PART_2N × 2N and PART_N × N) depending on the encoded PU size. Figures 1(a) and 1(b) show the different intraprediction directions and modes of HEVC and H.264/AVC. To improve coding efficiency, HEVC uses 35 intraprediction modes for PUs from 4 × 4 to 64 × 64, whereas the H.264/AVC standard used only 9 and 4 intraprediction modes for 4 × 4 blocks and 16 × 16 macroblocks, respectively.
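The recursive quadtree CU partitioning described above can be sketched as follows. This is a minimal illustration, not HM's implementation; `rd_cost` is a hypothetical stand-in for the encoder's RD-cost evaluation, which in a real encoder combines distortion with a lambda-weighted bit cost.

```python
# Sketch of HEVC's recursive quadtree CU partitioning (illustrative only).

MIN_CU = 8  # smallest CB size allowed by the standard

def rd_cost(x, y, size):
    # Placeholder: a real encoder runs prediction, transform, and entropy
    # coding here and returns distortion + lambda * bits.
    return float(size)  # dummy value so the sketch runs

def best_partition(x, y, size):
    """Return (cost, tree) for the CU at (x, y): either keep the whole
    block or split it into four quadrants, whichever costs less."""
    cost_whole = rd_cost(x, y, size)
    if size == MIN_CU:
        return cost_whole, size
    half = size // 2
    subs = [best_partition(x + dx, y + dy, half)
            for dy in (0, half) for dx in (0, half)]
    cost_split = sum(c for c, _ in subs)
    if cost_split < cost_whole:
        return cost_split, [t for _, t in subs]
    return cost_whole, size

cost, tree = best_partition(0, 0, 64)  # evaluate one 64 x 64 CTU
```

With the dummy cost function the whole 64 × 64 CU is never split; a real RD cost depends on the content, which is what makes the exhaustive search expensive and motivates the fast decision methods discussed next.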
The increased number of intraprediction modes raises the computational complexity of HEVC compared with H.264. To reduce the time required for intraprediction coding, Yoo and Suh [8] proposed an early termination algorithm for inter- and intra-PUs that checks the coded block flag (CBF) value and the RD cost of the inter-PU; if conditions on these two values are satisfied, the remaining inter-PU and intra-PU evaluations are skipped. A two-stage prediction unit size decision method has also been presented [9], in which texture complexity is analyzed from the video content using variance in order to filter out unnecessary PUs; for intraprediction coding, small PU sizes to skip are then selected based on the PU sizes of the encoded upper-left, upper, and left blocks. Other fast algorithms use dominant edge information [10] or a subset of tree-level PUs [11]. Cho and Kim [12] proposed fast CU splitting and pruning methods based on Bayes decision rules to reduce the computational complexity of the HEVC intraprediction process. Fast intraprediction approaches based on gradients have also been used [7, 13, 14]. Wang and Siu [15] reported an adaptive intramode skipping algorithm and signaling


Figure 1: Different intraprediction directions and modes for (a) HEVC (35 modes, where mode 0 is Intra_Planar, mode 1 is Intra_DC, and mode 35 is Intra_FromLuma) and (b) H.264/AVC (9 modes).

processes using statistical properties of reference samples. An intramode decision strategy that arranges candidate modes into different groups using a circle notation has also been presented [16]. For HEVC RExt, an advanced color Table and Index Map (cTIM) [17], intrablock copy (IntraBC) [18], and angular prediction with a weight function and a modification filter based on a blending filter for DC mode [19] have been proposed.

3. Proposed Work

3.1. Local Binary Patterns (LBPs). The intraprediction process is usually analyzed using image texture information. Local Binary Pattern (LBP) features were originally designed for texture description [20]. The LBP operator transforms an image into an array or image of integer labels describing its small-scale appearance. These labels, or their statistics (most commonly in the form of a histogram), are then used for further image analysis. The approach has advantages such as gray-scale invariance and normalization, and it represents texture information at negligible cost because the LBP operator is simple to compute. The operator is based on the assumption that texture has two locally complementary aspects: a pattern and a pattern strength. In H.264/AVC, the LBP has been used to extract moving objects with motion vectors and to exploit edge information in the motion estimation process [21–23]. The pixels in a block are thresholded against the center pixel value, multiplied by powers of two, and then summed to obtain a label for the center pixel. If the neighborhood consists of 8 pixels, a total of 2^8 = 256 different labels can be obtained, depending on the relative gray values of the center and neighboring pixels.

Circular symmetric neighbor sets for different (P, R) are illustrated in Figure 2, where P is the number of neighboring pixels and R is the radius of the circle. By combining different values of P and R, a variety of LBP sets can be composed. Let g_p denote the gray value of a sampled pixel in an evenly spaced circular neighborhood of P sampling points at radius R around the point (x, y), and let I(x, y) and g_c denote the image of a frame and the gray level of the center position. The neighbor coordinates (x_p, y_p) are given by (x − R sin(2πp/P), y + R cos(2πp/P)):

g_p = I(x_p, y_p),  p = 0, . . . , P − 1.  (1)

For the analysis of local texture patterns, the joint distribution of differences with spatial characteristics can be modeled as

T ≈ t(g_0 − g_c, g_1 − g_c, . . . , g_{P−1} − g_c).  (2)

The resulting LBP is a pattern that discriminates among different relationships between the neighborhood pixels and the center pixel: the LBP code can represent a bright/dark spot, flat areas (when the differences are zero in a constant region), edges, edge ends, and curves. Equation (3) gives the binary bit value calculated at the ith neighbor, where B(i) denotes the binary bit obtained from the neighboring pixel intensity I(i), and I(c) is the pixel at the center position (0, 0); the coordinates of the circular neighbor set I(i) are given by (−R sin(2πi/P), R cos(2πi/P)) relative to the center. The LBP operator for P = 8 and R = 1 is shown in Figure 2(b). Binary bits can be transformed into integer values as a pattern number using (4) when the binary bit stream consists of a combination of each bit calculated using (3) as


Figure 2: Different neighbor sets for Local Binary Patterns: (a) P = 4, R = 1; (b) P = 8, R = 1; (c) P = 8, R = 4. P is the number of neighboring pixels and R is the radius of the circle.

Figure 3: The LBP(8,1) operator: a 3 × 3 neighborhood of gray values is thresholded against the center pixel and read off as the binary pattern 00111011.

the thresholding function. For example, Figure 3 illustrates the LBP(8,1) operator:

B(i) = 1 if I(i) ≥ I(c), and B(i) = 0 otherwise,  (3)

LBP_(P,R) = Σ_{i=0}^{P−1} B(i) · 2^i.  (4)
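The sampling of (1) and the thresholding and weighting of (3)-(4) can be sketched in Python as follows. This is a minimal sketch, not the paper's encoder code: `image` is assumed to be a row-major list of gray values, and nearest-pixel rounding stands in for the interpolation discussed later.

```python
import math

def lbp(image, x, y, P=8, R=1):
    """LBP code of Eq. (4): threshold P circular neighbours at radius R
    against the centre pixel, Eq. (3), and weight the bits by powers of
    two. Neighbour coordinates from Eq. (1) are rounded to the nearest
    pixel for simplicity."""
    gc = image[y][x]
    code = 0
    for p in range(P):
        xp = x - R * math.sin(2 * math.pi * p / P)
        yp = y + R * math.cos(2 * math.pi * p / P)
        gp = image[int(round(yp))][int(round(xp))]
        if gp >= gc:            # B(i) = 1 when I(i) >= I(c)
            code |= 1 << p      # contributes B(i) * 2^i to the label
    return code
```

For a flat 3 × 3 block every neighbour passes the threshold, so all eight bits are set (code 255); a bright isolated centre yields code 0, matching the "spot" primitive of Figure 4.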

Many texture analysis applications require invariance or robustness to rotations of the input image. LBP_(P,R) patterns are obtained by circularly sampling around the center pixel, and most Local Binary Patterns in natural images are uniform; using uniform patterns provides statistical robustness. Local primitives detected by the LBP include spots, flat areas, edges, edge ends, and curves. Figure 4 illustrates examples with the LBP_(8,R) operator, in which ones are represented as gray circles and zeros as white. The LBP distribution can successfully recognize a wide variety of textures to which statistical and structural methods have normally been applied separately, so the texture of the current PU can be identified as a discriminative texture using the LBP. In the HEVC encoder, interpolation is used to place the neighboring sample locations appropriately when applying the LBP model, using (5). The designated center position and the neighboring locations for the LBP according to the PU size are shown in Figure 5. The number of neighboring pixels P is 8 and R is (s/2) − 1, where s is the PU size: 8 × 8, 16 × 16, 32 × 32, or 64 × 64 (4 × 4 is excluded). Consider


Figure 6: Probability distribution of pattern numbers (0–255) for various sequences (Kimono, Seeking, Graphics, KidsSoccer); the vertical axis is the distribution in percent.

Figure 4: Different textures detected using LBP(8,R): spot, spot/flat, line end, edge, and corner.


Figure 5: LBP(8,(s/2)−1) in HEVC, with the interpolated center and its neighboring positions; s is the PU size.

𝐼̂𝑥,𝑦 =

1 𝑁=2 𝑀=2 . ∑ ∑𝐼 𝑁𝑀 𝑖=1 𝑗=1 (𝑥−𝑖−1,𝑦−𝑗−1)

(5)
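A minimal sketch of (5): the interpolated sample is the mean of a 2 × 2 pixel block, so that LBP neighbours at the non-integer radius R = (s/2) − 1 fall on valid samples. The index convention below follows the printed formula literally; the exact offsets relative to (x, y) may have been garbled in extraction, so treat them as an assumption.

```python
def interp_sample(image, x, y, N=2, M=2):
    """Eq. (5): average an N x M block of pixels around the sampling
    position (x, y). The (x - i - 1, y - j - 1) offsets mirror the
    printed equation and are an assumed reading of its notation."""
    total = 0
    for i in range(1, N + 1):
        for j in range(1, M + 1):
            total += image[y - j - 1][x - i - 1]
    return total / (N * M)
```

The averaged value is then fed back into the operator as I(i) = Î_(x,y), as the following paragraph describes.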

Therefore, LBP_(P,R) is calculated using I(i) = Î_(x,y) from (5) as the neighborhood pixels, so that local texture information can be applied and analyzed within the HEVC block structure. To relate binary texture information to intraprediction modes, a probability distribution is first computed for the binary patterns occurring in test sequences of natural video content. Next, the relationship between the bit patterns that occur more frequently than others and the encoded intramodes is analyzed. Probability distributions of patterns and modes in four sequences (20 frames each), using LBP texture information, are shown in Figures 6 and 7; the sequences have a 4:2:2 color format, 10 bits, and 1920 × 1080 resolution in HM version 12.0-RExt4.0. The distribution graphs in Figure 6 are similar for all test sequences, indicating that the texture feature appears consistently across different sequences. Furthermore, before the encoding stage, sets of

Figure 7: Probability distribution of the best intramode (0–34) for the most probable LBPs (patterns 0, 2, 4, 8, 14, 16, 32, 64, 126, 128, 190, 224, 240, 250, 252, and 254).

most probable patterns with high distribution rates in the LBP are prepared in advance as a look-up table. Figure 7 shows that the mode distributions for the different most probable patterns are similar to one another and concentrate on Intra_Planar, Intra_DC, and the vertical mode (0, 1, and 26). Therefore, the most probable patterns are used to quickly restrict the mode decision to the Planar, DC, and 26 modes based on texture information.

3.2. The Overall Procedure of the Proposed Fast Scheme. The complexity of HEVC is significantly higher than that of H.264/AVC because of its improved encoding efficiency. Consequently, HEVC requires a faster coding process while still guaranteeing efficient compression. In HEVC, there

Figure 8: The overall procedure of the proposed algorithm. The LBP of the current PU is calculated; if it is a most probable pattern, only candidate modes 0, 1, and 26 are evaluated; otherwise the size-dependent subset is used (8 modes for 4 × 4 and 8 × 8; 3 modes for 16 × 16, 32 × 32, and 64 × 64), followed by RD cost calculation and the best mode decision.

are fast encoding tools in the prediction, transform, and filtering processes. To support high-speed encoding, the intramode prediction process in the original HM-12.0-RExt4.0 uses a rough mode decision (RMD) and the most probable modes (MPMs), both of which speed up the intraprediction process. Intraprediction selects the N best candidate modes via the RMD, in which all modes are ranked by the minimum sum of absolute Hadamard-transformed coefficients of the residual signal (HSAD) plus the number of mode bits. The number of best RMD candidates N is 8 for 4 × 4 and 8 × 8 PUs and 3 for 16 × 16, 32 × 32, and 64 × 64 PUs. Full RD optimization is applied only to the N + MPM candidates. However, the computational load on the encoder is still high.

The overall procedure of the proposed fast mode decision scheme is illustrated in Figure 8 and proceeds as follows.

Step 1. The LBP is calculated for the current encoded PU.

Step 2. If the LBP is included in the most probable patterns, which are predefined in the look-up table, go to Step 3; otherwise, go to Step 4.

Step 3. Prediction is performed only three times, for the candidate mode set {0, 1, 26}. Go to Step 5.

Step 4. Prediction is performed for the number of modes determined by the PU size. Go to Step 5.

Step 5. The best mode is selected as the one with the minimum RD cost.

In the proposed scheme, an MPM-like condition is applied using the most probable LBPs in the look-up table: if the local texture pattern of the encoded block satisfies the condition, the intraprediction process is performed only three times, for Intra_Planar (0), Intra_DC (1), and the vertical mode (26).
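Steps 1–5 can be sketched as follows. This is an illustrative outline, not HM code: `compute_lbp`, `rmd_candidates`, and `rd_cost_for_mode` are hypothetical stand-ins for the encoder internals, and the look-up table contents come from the offline pattern analysis of Section 3.1.

```python
# Sketch of the proposed fast intra-mode decision (Steps 1-5).

PLANAR, DC, VERTICAL = 0, 1, 26
MODES_IN_SUBSET = {4: 8, 8: 8, 16: 3, 32: 3, 64: 3}  # RMD subset sizes

def decide_intra_mode(pu, pu_size, most_probable_lut,
                      compute_lbp, rmd_candidates, rd_cost_for_mode):
    # Step 1: LBP of the current PU.
    code = compute_lbp(pu)
    if code in most_probable_lut:
        # Steps 2-3: most probable pattern -> test only Planar, DC, Vertical.
        candidates = [PLANAR, DC, VERTICAL]
    else:
        # Step 4: fall back to the size-dependent RMD subset.
        candidates = rmd_candidates(pu, MODES_IN_SUBSET[pu_size])
    # Step 5: the minimum-RD-cost mode wins.
    return min(candidates, key=lambda m: rd_cost_for_mode(pu, m))
```

The speed-up comes from the first branch: when the texture pattern is in the table, only three RD evaluations are needed instead of the full N + MPM candidate list.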

4. Experimental Results

The proposed fast scheme was implemented on HM-12.0-RExt4.0 (the HEVC RExt reference software). The test environment was all-intra, using the AI-Main configuration. For wireless video communication, an IPPP structure, in which one I frame is followed by P frames, was usually employed in the past. Recently, however, wireless video communication must support high-resolution video services owing to rapid advances in network technology, and an all-intra structure should be used to provide better quality than the IPPP structure. Standard sequences of 50 frames were used, three to four sequences per class, with the quantization parameter (QP) range (12, 17, 22, and 27) defined by the superhigh tier (SHT) [24]. Test sequences were classified by color format into RGB4:4:4, YCbCr4:4:4, and YCbCr4:2:2; each class had 1920 × 1080 resolution. Details of the encoding environment can be found in JCTVC-N1006 [24]. To evaluate performance, the measurements ΔBit, ΔPSNR_Y, and ΔTime were used:

ΔBit = Bit_proposed − Bit_original,  (6)

ΔPSNR_Y = PSNR_Y,proposed − PSNR_Y,original,  (7)

ΔTime = ((Time_proposed − Time_anchor) / Time_anchor) × 100.  (8)
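The three measurements in (6)-(8) amount to a few lines; this sketch assumes per-sequence totals have already been measured for each encoder:

```python
def delta_metrics(bit_p, bit_o, psnr_p, psnr_o, time_p, time_o):
    """Eqs. (6)-(8): bit-rate and Y-PSNR differences, and the encoding-time
    change as a percentage of the anchor encoder's time (negative values
    mean the proposed encoder is faster)."""
    d_bit = bit_p - bit_o                      # Eq. (6)
    d_psnr = psnr_p - psnr_o                   # Eq. (7)
    d_time = (time_p - time_o) / time_o * 100  # Eq. (8)
    return d_bit, d_psnr, d_time
```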

ΔTime is a complexity comparison factor indicating the total encoding time saving, where Time_x in (8) is the total time consumed by method x for encoding. The performance of the proposed algorithm and of the algorithm of Jiang et al. [7] on the HM-12.0-RExt4.0 software is shown in Tables 1, 2, and 3, one table per color format (RGB4:4:4, YCbCr4:4:4, and YCbCr4:2:2). Bjøntegaard delta bit-rates (BDBR) [25] are also reported in Tables 1, 2, and 3 as a performance measurement. The time reduction performance of the proposed method in RGB4:4:4 was almost 11.29%,


Table 1: Performance of the proposed algorithm and Jiang's algorithm [7] on the HM-12.0-RExt4.0 reference software in RGB4:4:4 with the superhigh-tier (SHT) QP range. Sequence

QP

12 17 Kimono 22 27 avg 12 17 DucksAndLegs 22 27 avg RGB4:4:4 12 17 OldTownCross 22 27 avg 12 17 Park Scene 22 27 avg Average 0.28

Proposed ΔBit (%) ΔPSNRY (dB) ΔTime (%) BD rate (%) −0.49 −0.12 −11.90 −0.56 −0.09 −11.16 0.6 0.25 −0.01 −10.44 0.90 −0.01 −10.61 0.02 −0.06 −11.03 0.49 −0.49 −11.92 0.01 −0.38 −11.47 1.3 0.08 −0.13 −10.73 0.39 −0.04 −10.92 −0.26 −11.26 0.24 0.19 −0.15 −11.71 0.21 −0.13 −11.73 1.1 0.47 −0.10 −11.27 0.62 −0.06 −11.23 0.37 −0.11 −11.48 0.02 −0.18 −11.88 0.07 −0.15 −11.69 1.6 0.63 −0.08 −11.12 1.25 −0.07 −10.88 0.49 −0.12 −11.39 −0.14 −6.29 −11.29 1.13

Jiang [7] ΔBit (%) ΔPSNRY (dB) ΔTime (%) BD rate (%) −0.43% −0.10 −8.95 −0.43% −0.07 −8.31 0.5 0.16% −0.01 −7.40 0.46% −0.01 −7.07 −0.06% −0.05 −7.93 0.42% −0.45 −9.40 0.00% −0.34 −8.83 1.2 0.04% −0.13 −7.98 0.24% −0.06 −7.86 0.18% −0.25 −8.52 0.73% −0.11 −8.25 1.41% −0.10 −5.51 2.1 1.86% −0.07 −4.95 2.50% −0.05 −4.52 1.62% −0.08 −5.81 −0.07% −0.15 −7.72 −0.09% −0.14 −5.66 1.2 0.24% −0.07 −5.16 0.61% −0.06 −4.43 0.17% −0.10 −5.74 0.48% −0.12 −7.00 1.25
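The BDBR values in these tables follow Bjøntegaard's method [25]. A common formulation, sketched below as a pure-Python assumption (the reference implementation may differ in interpolation details), fits log(bit-rate) as a cubic in PSNR for each encoder and compares the average fitted curves over the overlapping PSNR range:

```python
import math

def _cubic_fit(xs, ys):
    """Exact cubic through four points: solve the 4x4 Vandermonde system
    by Gaussian elimination with partial pivoting."""
    a = [[x ** 3, x ** 2, x, 1.0] for x in xs]
    b = list(ys)
    n = 4
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = b[r] - sum(a[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = s / a[r][r]
    return coef  # [c3, c2, c1, c0]

def bd_rate(rates_anchor, psnr_anchor, rates_test, psnr_test):
    """Bjontegaard delta bit-rate in percent: average the log-rate cubics
    over the overlapping PSNR range and convert back from log domain."""
    ca = _cubic_fit(psnr_anchor, [math.log(r) for r in rates_anchor])
    ct = _cubic_fit(psnr_test, [math.log(r) for r in rates_test])
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    def avg(c):
        # mean of the cubic over [lo, hi] via its antiderivative
        def F(x):
            return c[0] * x**4 / 4 + c[1] * x**3 / 3 + c[2] * x**2 / 2 + c[3] * x
        return (F(hi) - F(lo)) / (hi - lo)
    return (math.exp(avg(ct) - avg(ca)) - 1) * 100
```

Four rate-PSNR points per encoder, one per QP in {12, 17, 22, 27}, are exactly what the SHT test conditions provide, which is why the cubic fit is exact here.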

Table 2: Performance of the proposed algorithm and Jiang's algorithm [7] on the HM-12.0-RExt4.0 reference software in YCbCr4:4:4 with the superhigh-tier (SHT) QP range. Sequence

QP

12 17 Kimono 22 27 avg 12 17 BirdsInCage 22 YCbCr4:4:4 27 avg 12 17 Crowd Run 22 27 avg Average 0.28

Proposed Jiang [7] ΔBit (%) ΔPSNRY (dB) ΔTime (%) BD rate (%) ΔBit (%) ΔPSNRY (dB) ΔTime (%) BD rate (%) −0.70 −0.11 −11.89 −0.57 −0.09 −9.18 −0.36 −0.05 −11.17 −0.16 −0.03 −8.56 0.8 0.7 0.61 0.00 −10.84 0.35 −0.01 −8.15 2.20 −0.01 −11.46 1.52 −0.02 −8.38 0.43 −0.04 −11.34 0.28 −0.04 −8.57 0.12 −0.08 −12.08 0.01 −0.08 −8.73 −0.02 −0.06 −11.48 −0.09 −0.05 −8.47 1.8 1.5 0.46 −0.06 −11.49 0.32 −0.04 −8.17 2.18 −0.02 −12.22 1.60 −0.02 −8.50 −11.82 0.46 −0.05 −8.47 0.69 −0.05 0.49 −0.18 −12.07 0.22 −0.16 −8.38 0.82 −0.15 −12.23 0.38 −0.14 −8.41 3.0 2.1 1.87 −0.10 −12.25 0.97 −0.09 −8.43 3.12 −0.10 −12.24 1.75 −0.09 −7.87 1.58 −0.14 −12.20 0.83 −0.12 −8.27 0.90 −0.08 −11.78 1.87 0.52 −0.07 −8.43 1.44

on average, with losses of 0.28% in bit-rate, 0.14 dB in Y-PSNR, and 1.13% in BDBR. In RGB4:4:4, Jiang's algorithm achieved a 7% complexity reduction on average, with a 0.48% bit-rate increment, a 0.12 dB Y-PSNR loss, and a 1.25% BDBR.

For sequences with YCbCr4:4:4 (Table 2), the proposed algorithm achieved a BDBR loss of 1.87% with a 0.9% bit increment and a 0.08 dB PSNR decrement, on average, along with an 11.78% speed-up gain. The performance of [7] in YCbCr4:4:4


Table 3: Performance of the proposed algorithm and Jiang's algorithm [7] on the HM-12.0-RExt4.0 reference software in YCbCr4:2:2 with the superhigh-tier (SHT) QP range. Sequence

QP

12 17 Kimono 22 27 avg 12 17 DucksAndLegs 22 27 avg YCbCr4:2:2 12 17 OldTownCross 22 27 avg 12 17 Park Scene 22 27 avg Average 0.28

Proposed Jiang [7] ΔBit (%) ΔPSNRY (dB) ΔTime (%) BD rate (%) ΔBit (%) ΔPSNRY (dB) ΔTime (%) BD rate (%) −1.23 −0.17 −14.00 −1.15 −0.16 −12.07 −0.77 −0.06 −13.16 −0.64 −0.05 −11.09 1.1 0.9 1.30 0.00 −12.50 0.76 −0.01 −10.53 2.51 −0.01 −13.12 1.67 −0.02 −10.67 0.45 −0.06 −13.19 0.16 −0.06 −11.09 0.16 −0.25 −13.58 −0.02 −0.21 −12.04 0.26 −0.24 −13.71 −0.04 −0.20 −12.17 2.7 2.0 0.89 −0.16 −13.56 0.37 −0.13 −12.18 2.11 −0.11 −13.15 1.23 −0.10 −11.73 −13.50 0.38 −0.16 −12.03 0.85 −0.19 2.17 −0.25 −13.50 1.42 −0.19 −12.19 2.98 −0.26 −13.30 1.73 −0.21 −11.82 5.8 4.2 3.22 −0.24 −13.40 2.37 −0.18 −11.69 5.69 −0.16 −13.10 4.69 −0.11 −11.26 3.51 −0.22 −13.32 2.55 −0.17 −11.74 −0.25 −0.23 −16.14 −0.26 −0.21 −11.26 −0.23 −0.21 −16.07 −0.32 −0.20 −11.13 2.0 1.7 0.02 −0.13 −15.82 −0.16 −0.12 −11.04 −10.41 0.95 −0.08 −15.37 0.52 −0.07 0.12 −0.16 −15.85 −0.06 −0.15 −10.96 1.24 −0.16 −13.97 2.90 0.76 −0.13 −11.45 2.17

achieved losses of 0.52% in bit-rate, 0.07 dB in Y-PSNR, and 1.44% in BDBR, with an average time reduction of 8.43%. Losses and gains of the proposed HEVC encoding system with YCbCr4:2:2 are shown in Table 3. Simulated sequences with YCbCr4:2:2 achieved a 13.97% time saving and a 2.9% BDBR, with a 1.24% bit-rate loss and a 0.16 dB PSNR loss on average. Jiang's algorithm achieved an 11.45% average complexity improvement and a 2.17% BDBR, with a 0.76% bit-rate increment and a 0.13 dB Y-PSNR loss. The proposed algorithm achieved a speed-up gain of up to 16.14% with a small bit increment in the Seeking sequence with QP = 12 and YCbCr4:2:2. In the DucksAndLegs sequence with QP = 22 and YCbCr4:2:2, Jiang's algorithm achieved a speed-up factor of up to 12.18% with a smaller bit-rate loss. The bit-rate increased for nonnatural sequences and videos with many moving objects, such as the EBUGraphics sequence. Using the proposed approach, an average speed-up gain of over 12.35% was obtained compared with the original intraprediction process, with a negligible bit-rate increment, whereas the gradient-based fast intramode decision algorithm of Jiang et al. [7] achieved an average encoding speed-up of 8.98%. Although Jiang's algorithm gives better BDBR performance than the proposed method, the proposed method reduces more encoding time without the performance degradation of the gradient-based scheme [7]. RD performance for the test sequences, classified by color and line style, over the SHT QP range in AI-Main is shown in

Figure 9. The original standard and the proposed method exhibited similar curves (Figures 9(a) and 9(b)). The Seeking sequence with YCbCr4:2:2 (Figure 9(c)) showed negligible losses of bit-rate and quality. A large loss of image quality was observed using the original HM encoder (up to 0.22 dB, on average, for the EBUGraphics sequence). The proposed method maintained high quality of about 40 dB when the QP value was at most 22, and the video quality was at least 50 dB when QP was set to 12, except for the OldTownCross sequence. Furthermore, the proposed fast intramode decision scheme supports rapid compression with only a small Y-PSNR loss. The proposed algorithm is therefore suitable for real-time encoding systems, without significant degradation of encoding performance, for large video resolutions, deep color formats, and high bit-rate sequences.

5. Conclusions

A fast intramode decision scheme has been proposed for high-resolution video with high bit-depths and rich color formats. The proposed algorithm achieved a 12.35% time saving and a BDBR of 1.96%, on average, over the original HM-12.0-RExt4.0 software, with a Y-PSNR loss of 0.12 dB and a 0.81% bit increment. The proposed algorithm is well suited to real-time operation in the Internet of Things (IoT) environment and can be useful for real-time HEVC video encoding systems that must maintain video quality.

[Figure 9 plots Y-PSNR (dB) versus bit-rate (kbps) under the AI-SHT coding conditions for (a) Kimono1_1920 × 1080_24_10bit_444.rgb, (b) BirdsInCage_1920 × 1080_60_10bit_444.yuv, and (c) Seeking_1920 × 1080_50_10bit_422.yuv, comparing HM12.0-RExt4.0 with the proposed method.]

Figure 9: Rate-distortion (RD) curves for (a) Kimono (RGB4:4:4), (b) BirdsInCage (YCbCr4:4:4), and (c) Seeking (YCbCr4:2:2) sequences in AI-SHT.

Conflict of Interests The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment This work was supported by the ICT R&D program of MSIP/IITP (B0101-15-1280, Development of Cloud Computing Based Realistic Media Production Technology).

References

[1] J. Xu, Y. Andreopoulos, Y. Xiao, and M. van der Schaar, "Non-stationary resource allocation policies for delay-constrained video streaming: application to video over Internet-of-Things-enabled networks," IEEE Journal on Selected Areas in Communications, vol. 32, no. 4, pp. 782–794, 2014.
[2] R. Pereira and E. G. Pereira, "Video streaming considerations for internet of things," in Proceedings of the 2nd International Conference on Future Internet of Things and Cloud (FiCloud '14), pp. 48–52, August 2014.
[3] Z. Liu and T. Yan, "Study on multi-view video based on IOT and its application in intelligent security system," in Proceedings of the International Conference on Mechatronic Sciences, Electric Engineering and Computer (MEC '13), pp. 1437–1440, December 2013.
[4] T. Wiegand, G. J. Sullivan, G. Bjøntegaard, and A. Luthra, "Overview of the H.264/AVC video coding standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560–576, 2003.
[5] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, "Overview of the high efficiency video coding (HEVC) standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649–1668, 2012.
[6] G. J. Sullivan, J. M. Boyce, Y. Chen, J.-R. Ohm, C. A. Segall, and A. Vetro, "Standardized extensions of high efficiency video coding (HEVC)," IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 6, 2013.
[7] W. Jiang, H. Ma, and Y. Chen, "Gradient based fast mode decision algorithm for intra prediction in HEVC," in Proceedings of the International Conference on Consumer Electronics, Communications and Networks (CECNet '12), April 2012.
[8] H.-M. Yoo and J.-W. Suh, "Fast coding unit decision algorithm based on inter and intra prediction unit termination for HEVC," in Proceedings of the IEEE International Conference on Consumer Electronics (ICCE '13), pp. 300–301, IEEE, Las Vegas, Nev, USA, January 2013.
[9] G. Tian and S. Goto, "Content adaptive prediction unit size decision algorithm for HEVC intra coding," in Proceedings of the Picture Coding Symposium (PCS '12), May 2012.
[10] T. L. da Silva, L. V. Agostini, and L. A. D. S. Cruz, "Fast HEVC intra prediction mode decision based on edge direction information," in Proceedings of the 20th European Signal Processing Conference (EUSIPCO '12), pp. 1214–1218, IEEE, Bucharest, Romania, August 2012.
[11] T. L. da Silva, L. A. da Silva Cruz, and L. V. Agostini, "Fast HEVC intra mode decision based on dominant edge evaluation and tree structure dependencies," in Proceedings of the 19th IEEE International Conference on Electronics, Circuits, and Systems (ICECS '12), pp. 568–571, IEEE, Seville, Spain, December 2012.
[12] S. Cho and M. Kim, "Fast CU splitting and pruning for suboptimal CU partitioning in HEVC intra coding," IEEE Transactions on Circuits and Systems for Video Technology, vol. 23, no. 9, pp. 1555–1564, 2013.
[13] G. Chen, Z. Pei, L. Sun, Z. Liu, and T. Ikenaga, "Fast intra prediction for HEVC based on pixel gradient statistics and mode refinement," in Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP '13), pp. 514–517, Beijing, China, July 2013.
[14] G. Chen, L. Sun, Z. Liu, and T. Ikenaga, "Fast mode and depth decision HEVC intra prediction based on edge detection and partitioning reconfiguration," in Proceedings of the International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS '13), pp. 38–41, IEEE, Naha, Japan, November 2013.
[15] L.-L. Wang and W.-C. Siu, "Novel adaptive algorithm for intra prediction with compromised modes skipping and signaling processes in HEVC," IEEE Transactions on Circuits and Systems for Video Technology, vol. 23, no. 10, pp. 1686–1694, 2013.
[16] Y. Quanhe, R. Yaocheng, and H. Yun, "Fast intra mode decision strategy for HEVC," in Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP '13), pp. 500–504, Beijing, China, July 2013.
[17] Z. Ma, W. Wang, M. Xu, and H. Yu, "Advanced screen content coding using color table and index map," IEEE Transactions on Image Processing, vol. 23, no. 10, pp. 4399–4412, 2014.
[18] D.-K. Kwon and M. Budagavi, "Fast intra block copy (IntraBC) search for HEVC screen content coding," in Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS '14), pp. 9–12, June 2014.
[19] S. Matsuo, S. Takamura, and A. Shimizu, "Intra angular prediction with weight function and modification filter," in

International Journal of Distributed Sensor Networks

[20]

[21]

[22]

[23]

[24]

[25]

Proceedings of the Picture Coding Symposium (PCS ’13), pp. 77– 80, IEEE, San Jose, Calif, USA, December 2013. T. Ojala, M. Pietik¨ainen, and T. M¨aenp¨aa¨, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971–987, 2002. T. Wang, J. Liang, X. Wang, and S. Wang, “Background modeling using local binary patterns of motion vector,” in Proceedings of the IEEE Visual Communications and Image Processing (VCIP ’12), pp. 1–5, IEEE, San Diego, Calif, USA, November 2012. R. Verma and M. Y. Dabbagh, “Binary pattern based edge detection for motion estimation in H.264/AVC,” in Proceedings of the IEEE Canadian Conference of Electrical and Computer Engineering (CCECE ’13), May 2013. J. Yang, S. Wang, Z. Lei, Y. Zhao, and S. Z. Li, “Spatiotemporal LBP based moving object segmentation in compressed domain,” in Proceedings of the 9th International Conference on Advanced Video and Signal-Based Surveillance (AVSS ’12), pp. 252–257, IEEE, Beijing, China, September 2012. D. Flynn and C. Rosewarne, “Common test conditions and software reference configurations for HEVC range extensions,” in Proceedings of the 14th Meeting of Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Vienna, Austria, August 2013. G. Bjøntegaard, “Calculation of average PSNR differences between RD-curves (VCEG-M33),” in Proceedings of the VCEG Meeting, ITU-T SG 16 Q.6 Document, Austin, Tex, USA, April 2013.

Suggest Documents