Usage of Video Codec Based on Multichannel Wavelet Decomposition in Video Streaming Telecommunication Systems

Kirill Bystrov(B), Alexander Dvorkovich, Viktor Dvorkovich, and Gennady Gryzov

Moscow Institute of Physics and Technology, Institutsky lane 9, 141700 Dolgoprudny, Moscow region, Russia
https://mipt.ru/en/

Abstract. The amount of video content transmitted in modern telecommunication systems is rising sharply. This is especially true of streaming video. Consequently, requirements are growing not only for the compression ratio and the quality of the transmitted image, but also for video codec performance. A multichannel wavelet transform may increase the compression ratio while maintaining the same quality of reconstructed images. The aim of the study is therefore to implement a video codec based on multichannel wavelet decomposition and to use it in real-time mode. Within the research, a multichannel wavelet video codec was implemented using new filter banks specially designed for wavelet decomposition of images into a given number of channels. The efficiency evaluation presents video compression results for various numbers of decomposition channels.

Keywords: Discrete Wavelet Transform · Video codecs · Streaming video · Telecommunication systems · Video coding performance

1 Introduction

Block orthogonal transforms are widely used in intra-frame video compression algorithms, where they reduce the size of the transmitted video content. Historically, video codecs have relied on the discrete cosine transform (DCT) and its integer approximations. The discrete wavelet transform (DWT) may be considered an alternative to the DCT. The DWT avoids the “blocking” artifacts that the DCT produces at low bitrates and therefore offers a theoretical opportunity to achieve better quality of reconstructed images. A two-channel DWT scheme based on low-pass and high-pass FIR filters is usually used (for example, in the JPEG2000 standard [1]). Increasing the number of channels should improve the compression ratio while maintaining the same quality of the reconstructed images, owing to a more compact representation of the signal energy across the frequency subbands [2]. These considerations are supported by the experience of implementing a three-channel DWT in the Dirac video codec [3].

Therefore, the aim of the study is to implement a multichannel DWT video codec and to evaluate its efficiency. The compression ratio at a given level of distortion is used as the effectiveness criterion. Three- and five-channel banks of FIR filters were selected for testing the video codec based on the multichannel DWT. The novelty of the study is the implementation of a video codec that supports coding with various numbers of channels, including wavelet decomposition of images into five channels, which is implemented in a practical video codec for the first time. The practical value of these innovations lies both in the ability to verify the practical applicability of various filter banks in video coding and in the potential use of this codec for streaming video once the necessary modifications and optimizations are made.

© Springer International Publishing AG 2017. V.M. Vishnevskiy et al. (Eds.): DCCN 2017, CCIS 700, pp. 108–119, 2017. DOI: 10.1007/978-3-319-66836-9_10

2 Image Processing Using DWT

If the filter bank is chosen correctly, increasing the number of channels (i.e. the number of filters used) should concentrate more of the image energy in the low-frequency domain. Consider the multichannel discrete wavelet transform in the three-channel case. Transform coefficients are obtained at the analysis stage: the input signal is convolved with the impulse responses of the filters (low-pass filter H, middle-pass filter B, high-pass filter G; see Fig. 1), and the result is decimated according to the selected decimation scheme (see Fig. 2). The initial signal is reconstructed at the synthesis stage: the inverse wavelet transform convolves the subband transform coefficients, padded with zero taps according to the decimation scheme, with the corresponding reconstruction filters (low-pass filter $K_h$, middle-pass filter $K_b$, high-pass filter $K_g$).

The frequency responses of the filters are given by (1), where the H and G filters are symmetric, i.e. $h_n = h_{-n}$, $g_m = g_{-m}$, and the B filter is antisymmetric, i.e. $b_0 = 0$, $b_l = -b_{-l}$. The characteristics of the reconstruction filters $K_h$, $K_b$ and $K_g$ are given by (2). The condition for reconstructing the initial signal is given by (3), where $H^{(i)}$, $B^{(i)}$, $G^{(i)}$, $i = 1, 2, 3$, are the filter characteristics after the corresponding decimation (see Fig. 2).

$$
\begin{cases}
H(x) = h_0 + 2\sum\limits_{n=1}^{N} h_n \cos(\pi n x), \\
B(x) = 2j\sum\limits_{l=1}^{L} b_l \sin(\pi l x), \\
G(x) = g_0 + 2\sum\limits_{m=1}^{M} g_m \cos(\pi m x).
\end{cases}
\tag{1}
$$

Fig. 1. Structural scheme of the three-channel subband transform system

Fig. 2. LF (a), MF (b) and HF (c) channel responses to the unit impulse

$$
\begin{cases}
K_h(x) = kh_0 + 2\sum\limits_{n=1}^{N_1} kh_n \cos(\pi n x), \\
K_b(x) = 2j\sum\limits_{l=1}^{L_1} kb_l \sin(\pi l x), \\
K_g(x) = kg_0 + 2\sum\limits_{m=1}^{M_1} kg_m \cos(\pi m x).
\end{cases}
\tag{2}
$$

$$
H^{(i)}(x)K_h(x) + B^{(i)}(x)K_b(x) + G^{(i)}(x)K_g(x) = 1, \quad i = 1, 2, 3.
\tag{3}
$$
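As a rough numerical illustration of (1), the following Python sketch evaluates the frequency response of a symmetric or antisymmetric filter from its half-set of taps. It assumes NumPy is available; the function names are illustrative and are not taken from the codec. The taps are the h(n) column of Table 1, and the final check relies on the low-pass requirement H(1) = 0 listed among the filter design conditions in this section.

```python
import numpy as np

# Half-filter h(n), n = 0..11, copied from Table 1 (analysis low-pass filter H).
H_TAPS = [0.717215, 0.482119, 0.100785, -0.0572617, -0.0434293, -0.00257065,
          0.00762182, 0.0016123, 0.00091449, 0.000107057, -0.000411828,
          0.0000812236]


def symmetric_response(taps, x):
    """H(x) = h_0 + 2 * sum_{n>=1} h_n cos(pi n x) for a symmetric filter."""
    taps = np.asarray(taps, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    n = np.arange(1, len(taps))
    return taps[0] + 2.0 * (np.cos(np.pi * np.outer(x, n)) @ taps[1:])


def antisymmetric_response(taps, x):
    """B(x) = 2j * sum_{l>=1} b_l sin(pi l x) for an antisymmetric filter."""
    taps = np.asarray(taps, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    l = np.arange(1, len(taps))
    return 2j * (np.sin(np.pi * np.outer(x, l)) @ taps[1:])


# Evaluate the low-pass filter at the two ends of the frequency range.
h_at_0, h_at_1 = symmetric_response(H_TAPS, [0.0, 1.0])
print(f"H(0) = {h_at_0:.6f}, H(1) = {h_at_1:.2e}")
```

With the Table 1 coefficients, H(1) evaluates to a value close to zero, which is consistent with H being a low-pass filter. An antisymmetric response vanishes at both x = 0 and x = 1 by construction, matching the conditions B(0) = 0 and B(1) = 0.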

Computing all the unknown values $h_n$, $b_l$, $g_m$ requires $N + L + M + 2$ equations. The necessary conditions are that the determinant of the system is a constant and that H(0) = const, H(1) = 0 (i.e. H is a low-pass filter), B(0) = 0, B(1) = 0 (i.e. B is a middle-pass filter), G(0) = 0, G(1) = const (i.e. G is a high-pass filter). For the system to be fully determined, it must be supplemented by additional conditions, for example that the derivatives of the filter characteristics vanish at the boundaries of the range [4].

Expressions (4), (5) and (6) for the five-channel scheme are analogous to (1), (2) and (3), but involve more variables and a more complex system of equations [5]. Here the H and G filters are symmetric, i.e. $h_n = h_{-n}$, $g_k = g_{-k}$; the D and C filters are antisymmetric, i.e. $d_0 = 0$, $d_l = -d_{-l}$, $c_0 = 0$, $c_r = -c_{-r}$ (see Figs. 3 and 4). The B filter may be either antisymmetric or symmetric, which gives the two alternative forms of B(x) in (4).

$$
\begin{cases}
H(x) = h_0 + 2\sum\limits_{n=1}^{N} h_n \cos(\pi n x), \\
D(x) = 2j\sum\limits_{l=1}^{L} d_l \sin(\pi l x), \\
B(x) = 2j\sum\limits_{s=1}^{S} b_s \sin(\pi s x), \; b_0 \equiv 0, \quad \text{or} \quad B(x) = b_0 + 2\sum\limits_{s=1}^{S} b_s \cos(\pi s x), \\
C(x) = 2j\sum\limits_{r=1}^{R} c_r \sin(\pi r x), \\
G(x) = g_0 + 2\sum\limits_{k=1}^{K} g_k \cos(\pi k x).
\end{cases}
\tag{4}
$$


$$
\begin{cases}
K_h(x) = kh_0 + 2\sum\limits_{n=1}^{N_1} kh_n \cos(\pi n x), \\
K_d(x) = 2j\sum\limits_{l=1}^{L_1} kd_l \sin(\pi l x), \\
K_b(x) = 2j\sum\limits_{s=1}^{S_1} kb_s \sin(\pi s x), \; kb_0 \equiv 0, \quad \text{or} \quad K_b(x) = kb_0 + 2\sum\limits_{s=1}^{S_1} kb_s \cos(\pi s x), \\
K_c(x) = 2j\sum\limits_{r=1}^{R_1} kc_r \sin(\pi r x), \\
K_g(x) = kg_0 + 2\sum\limits_{k=1}^{K_1} kg_k \cos(\pi k x).
\end{cases}
\tag{5}
$$

Fig. 3. Structural scheme of the five-channel subband transform system


Fig. 4. Unit impulse (a) and the LF (b), LMF (c), MF (d), HMF (e) and HF (f) channel responses to it

$$
H^{(i)}(x)K_h(x) + D^{(i)}(x)K_d(x) + B^{(i)}(x)K_b(x) + C^{(i)}(x)K_c(x) + G^{(i)}(x)K_g(x) = 1, \quad i = 1, \dots, 5,
\tag{6}
$$

where, by analogy with (3), the superscript $(i)$ denotes the filter characteristics after the corresponding decimation.
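The analysis and synthesis stages described in this section can be sketched in Python for the five-channel case, using the 5/5/5/5/5 filter banks from Tables 3 and 4. This is only a one-dimensional illustration under stated assumptions, not the codec's implementation: all five channels are decimated at the same phase so that the 5-tap filters act on non-overlapping blocks, the signal length is a multiple of 5, and no border extension is applied. The decimation scheme actually used in the paper may differ, and the helper names are hypothetical.

```python
import numpy as np

# Half-filters (n = 0, 1, 2) copied from Table 3 (analysis) and Table 4
# (synthesis). Each entry is (taps, is_symmetric).
ANALYSIS = {
    "h": ([0.4472135955, 0.4472135955, 0.4472135955], True),
    "d": ([0.0, 0.36, 0.608604962188], False),
    "b": ([0.59628479442586, 0.22360679743054, -0.52174919464347], True),
    "c": ([0.0, 0.608604962188, -0.36], False),
    "g": ([0.666666666, -0.5, 0.166666667], True),
}
SYNTHESIS = {
    "h": ([0.4472135955, 0.4472135955, 0.4472135955], True),
    "d": ([0.0, -0.36, -0.608604962188], False),
    "b": ([0.59628479442586, 0.22360679743054, -0.52174919464347], True),
    "c": ([0.0, -0.608604962188, 0.36], False),
    "g": ([0.666666666, -0.5, 0.166666667], True),
}
M = 5  # number of channels, also used as the decimation factor (assumption)


def full_filter(taps, symmetric):
    """Expand taps f(0..2) to f(-2..2) using f(-n) = f(n) or f(-n) = -f(n)."""
    taps = np.asarray(taps, dtype=float)
    sign = 1.0 if symmetric else -1.0
    return np.concatenate([sign * taps[:0:-1], taps])


def analyze(x):
    """Convolve with each analysis filter, then keep every M-th sample."""
    return {name: np.convolve(x, full_filter(*spec), mode="valid")[::M]
            for name, spec in ANALYSIS.items()}


def synthesize(subbands, length):
    """Insert M - 1 zeros between coefficients, convolve, sum the channels."""
    out = np.zeros(length)
    for name, coeffs in subbands.items():
        upsampled = np.zeros(length)
        upsampled[::M] = coeffs
        out += np.convolve(upsampled, full_filter(*SYNTHESIS[name]))[:length]
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal(40)  # length is a multiple of M, no border extension
x_rec = synthesize(analyze(x), len(x))
max_err = float(np.max(np.abs(x - x_rec)))
print(f"max reconstruction error: {max_err:.2e}")
```

Under these assumptions the reconstruction error stays at the level of the rounding of the tabulated coefficients. The reason is that the five expanded analysis filters are mutually orthogonal with unit norm, and the synthesis filters in Table 4 are their time-reversed versions.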

3 Practical Implementation

Evaluating the practical effectiveness of the multichannel wavelet transform in video coding requires either developing a new video codec or implementing the transform in an existing one. The second approach was chosen, and the Schrodinger video codec [6] was selected for the task because its source code is open and free and because it uses a wavelet transform for video compression. The development of the multichannel wavelet video codec included:

1. analysis of Schrodinger’s structure;
2. identification of “implicit two-channeling” (implementations of functions and data structures that permit only a two-channel transform) and of the required changes;
3. removal of “implicit two-channeling” and modification of the source code to work with various numbers of wavelet transform channels;
4. computation of filter banks for the multichannel wavelet transform;
5. implementation of the multichannel wavelet transform;
6. testing of the codec with specially designed filter banks and evaluation of its effectiveness.

Three- and five-channel filter banks were used to test the efficiency of the developed video codec. The 23/23/23 analysis filter bank (see Table 1) and the 13/13/13 synthesis filter bank (see Table 2) were used for the three-channel wavelet transform; 5/5/5/5/5 analysis and synthesis filter banks (see Tables 3 and 4) were used for the five-channel transform. A Mallat pyramid [7] with 2 levels was used to evaluate the efficiency of these filter banks.

A direct comparison of this wavelet video codec with the x264 and x265 codecs is not entirely fair, because the latter are heavily optimized for quality (motion estimation and compensation, entropy coder, etc.). Their test results are nevertheless presented to demonstrate the potential of the wavelet codec under development, bearing in mind the appropriate further improvements.

The following video sequences of standard (704 × 576, 4cif), enhanced (1280 × 720, 720p) and high (1920 × 1080, 1080p) definition were selected for testing:

4cif: city, crew, harbour, ice, soccer [8];
720p: mobcal, parkrun, shields, stockholm [9];
1080p: blue sky, pedestrian area, rush hour, station2, sunflower, tractor [9].
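To make the two-level Mallat pyramid more concrete, the sketch below lists the subbands produced when only the low-frequency band is decomposed again at each level. It rests on assumptions that the paper does not state: the transform is applied separably (rows, then columns), so one level yields M × M subbands, and the frame is padded up to a multiple of M raised to the number of levels. The function name and the padding rule are hypothetical.

```python
def pyramid_layout(width, height, channels, levels):
    """List the subbands of a separable M-channel Mallat pyramid.

    Returns the padded frame size and a list of tuples
    (level, horizontal_band, vertical_band, band_width, band_height).
    """
    step = channels ** levels
    # Round each dimension up to a multiple of channels**levels (assumed padding).
    padded_w = -(-width // step) * step
    padded_h = -(-height // step) * step
    subbands = []
    w, h = padded_w, padded_h
    for level in range(1, levels + 1):
        w //= channels
        h //= channels
        for v in range(channels):
            for u in range(channels):
                # The low-frequency band (0, 0) is kept only at the deepest
                # level; at earlier levels it is decomposed again.
                if (u, v) != (0, 0) or level == levels:
                    subbands.append((level, u, v, w, h))
    return (padded_w, padded_h), subbands


for m in (2, 3, 5):
    size, bands = pyramid_layout(1920, 1080, channels=m, levels=2)
    print(f"{m} channels: padded to {size[0]}x{size[1]}, {len(bands)} subbands")
```

Under these assumptions, two levels give 7, 17 and 49 subbands for 2, 3 and 5 channels respectively, and a 1920 × 1080 frame needs padding for the three- and five-channel cases because 1920 is not divisible by 9 or 25.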

Table 1. Analysis filter bank 23/23/23 for three-channel wavelet transform

n    h(n)            b(n)             g(n)
0    0.717215        0                −0.685303
1    0.482119        −0.687905        −0.499947
2    0.100785        0.052845         0.146139
3    −0.0572617      0.192827         0.0322635
4    −0.0434293      −0.0239495       −0.0249847
5    −0.00257065     0.0337432        0.00924183
6    0.00762182      −0.00319313      −0.00460337
7    0.0016123       −0.00829397      −0.000219346
8    0.00091449      0.000815896      −0.000671062
9    0.000107057     −0.0000712195    −0.0000620485
10   −0.000411828    0.000273966      0.000238688
11   0.0000812236    −0.0000540336    −0.0000470757


Table 2. Synthesis filter bank 13/13/13 for three-channel wavelet transform

n    kh(n)          kb(n)          kg(n)
0    0.663679       0              0.662462
1    0.499683       0.688063       −0.509985
2    0.155318       0.0473184      0.0954987
3    −0.0494516     −0.153701      0.0783513
4    −0.0573942     −0.0260899     −0.069081
5    −0.00810589    −0.00475089    −0.00853267
6    0.0123629      0.00151849     0.0195878

Table 3. Analysis filter bank 5/5/5/5/5 for five-channel wavelet transform

n    h(n)            d(n)              b(n)                 c(n)              g(n)
0    0.4472135955    0                 0.59628479442586     0                 0.666666666
1    0.4472135955    0.36              0.22360679743054     0.608604962188    −0.5
2    0.4472135955    0.608604962188    −0.52174919464347    −0.36             0.166666667

Table 4. Synthesis filter bank 5/5/5/5/5 for five-channel wavelet transform

n    kh(n)           kd(n)              kb(n)                kc(n)              kg(n)
0    0.4472135955    0                  0.59628479442586     0                  0.666666666
1    0.4472135955    −0.36              0.22360679743054     −0.608604962188    −0.5
2    0.4472135955    −0.608604962188    −0.52174919464347    0.36               0.166666667

4 The Results

The dependence of the distortion level on bitrate is one of the criteria of video codec effectiveness. The PSNR of the reconstructed images was used as the distortion measure for the test video sequences. Video coding was performed in Intra mode because Schrodinger’s motion estimation and compensation algorithms are significantly inferior to their x264/x265 counterparts and have not yet been modified. Graphs for some of the tested video sequences are presented in the paper. For most video sequences, the three-channel filter bank results are comparable to those of x264 (see Figs. 5 and 6); for some 1080p test videos they are better than the x264 results and close to those obtained by x265 (see Figs. 7 and 8).
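For reference, PSNR can be computed from the mean squared error between the original and the reconstructed frame. The sketch below uses the standard definition and assumes 8-bit samples (peak value 255) and a single image plane; the paper does not specify which planes were measured or how per-frame values were averaged, so those choices are left to the caller.

```python
import numpy as np


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return float(10.0 * np.log10(peak ** 2 / mse))


# Toy example: every sample of a 4x4 frame is off by one grey level (MSE = 1).
frame = np.full((4, 4), 128, dtype=np.uint8)
print(f"{psnr(frame, frame.astype(np.int32) + 1):.2f} dB")
```

Plotting such PSNR values against the bitrate of the encoded stream gives rate-distortion curves of the kind shown in the figures of this section.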


Fig. 5. Parkrun

Fig. 6. City

Nevertheless, the three-channel wavelet transform results are significantly inferior to those of x264/x265 for some other test videos (see Fig. 9). This is attributed not to the efficiency of the transform but to the difference in entropy coding between x264/x265 and Schrodinger; the latter uses a simpler entropy coder implementation. Once entropy coding in the wavelet codec is improved, results comparable to x264/x265 are expected for such video sequences, and perhaps even better results for the remaining test videos.


Fig. 7. Sunflower

Fig. 8. Riverbed

It should be noted that the wavelet codec also achieves results close to those of x264 in motion prediction mode for individual video sequences for which motion estimation is inefficient (for example, the “riverbed” video; see Fig. 10). This suggests that improving the motion prediction algorithm in the wavelet video codec should raise its effectiveness to the x264/x265 level in inter-frame coding mode as well.


Fig. 9. Ice

Fig. 10. “Riverbed” with motion estimation

5 Conclusions

A video codec based on the multichannel wavelet transform was developed within the study. In Intra coding mode its compression effectiveness was close to that of x264 in the same mode for most test video sequences and close to that of x265 for some others. This result is notable given that x264 and x265 use intra-frame prediction algorithms, which considerably increase the compression ratio, whereas the implemented wavelet codec has no comparable counterpart. In motion prediction mode the wavelet video codec is inferior to x264/x265, which may be attributed to the lack of motion estimation and compensation algorithms and an entropy coder of comparable efficiency. Improving these modules in the wavelet codec should yield results comparable to those of x264/x265. It is therefore concluded that the practical applicability of a video codec based on wavelet decomposition is confirmed.

The computationally simple five-channel filter banks used in the research (see Tables 3 and 4) perform much worse than the three-channel wavelet transform, which uses well-designed filter banks (see Tables 1 and 2). This may be explained by the poor ability of this five-channel filter bank to separate the signal energy by frequency. Longer and more complex filters for the five-channel wavelet transform should give better results than the three-channel filter bank.

It should also be noted that computations in the multichannel wavelet transform module were not parallelized in this version of the video codec; the simplest implementation was used. Moving the computations to OpenCL and optimizing the wavelet video codec should significantly improve its performance, which would make the optimized wavelet codec usable for streaming video.

Acknowledgments. This work was supported by the Russian Ministry of Education and Science under Grant ID RFMEFI58115X0015.

References

1. Taubman, D.S., Marcellin, M.W.: JPEG2000: standard for interactive imaging. Proc. IEEE 90, 1336–1357 (2002)
2. Dvorkovich, V.P., Dvorkovich, A.V.: Digital Video Information Systems (Theory and Practise). Technosphere, Moscow (2012)
3. Prokhorov, I.B., Gryzov, G.Y.: Implementation of 3-band wavelet decomposition in Dirac video codec. In: Digital Signal Processing and its Applications: 17th International Conference Proceedings, Moscow, pp. 507–509 (2015)
4. Dvorkovich, A.V., Dvorkovich, V.P.: The methodology of multichannel wavelet filter banks computation for image decomposition within video compression. In: 3rd International Conference “Engineering & Telecommunication - En&T 2016”, pp. 19–21. Books of Abstracts, Moscow/Dolgoprudny (2016). (In Russian)
5. Dvorkovich, V.P., Dvorkovich, A.V.: Window Functions for Harmonic Analysis of Signals. Technosphere, Moscow (2016)
6. Schrodinger video codec. http://schrodinger.sourceforge.net/schrodinger_faq.php
7. Mallat, S.G.: A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans. Pattern Anal. Mach. Intell. 11, 674–693 (1989)
8. 4cif test video sequences. ftp://ftp.tnt.uni-hannover.de/pub/svc/testsequences/
9. 720p/1080p test video sequences. http://media.xiph.org/video/derf/
