Improving Perceptual Quality of Multiple Description Coding by Data Hiding

G. Boato (1), M. Carli (2), N. Conci (1), F. G. B. De Natale (1), and A. Neri (2)

(1) Dept. of Information and Communication Technology, University of Trento, Trento, Italy
(boato, conci)@dit.unitn.it, [email protected]

(2) Dept. of Applied Electronics, Università degli Studi "Roma TRE", Roma, Italy
(carli, neri)@uniroma3.it

Abstract: In this contribution, a data hiding-based method for increasing the quality of a video is presented. The proposed scheme fits well into the multiple description video coding framework, providing quality scalability with reduced overhead. Furthermore, the perceived quality of the reconstructed video is improved.

I. INTRODUCTION

The "multimedia era" is a big leap in the data communication scenario. The Internet, IEEE 802.11, IEEE 802.16, and third-generation mobile communication (UMTS/CDMA2000) are spreading the use of, and the demand for, multimedia data. Video streaming applications are becoming increasingly widespread for several concurrent reasons, such as the increasing diffusion of broadband networks (both wired and wireless), the reduction of connection costs, the growing technological awareness of users, and the industrial push into the market of new multimedia services and devices [1].

Enabling video and audio services to clients everywhere and by any means is a challenging problem, since mobile devices typically vary in their connection characteristics, processing capabilities, and display sizes. Furthermore, the volume of data to be transmitted and processed is very large when video streaming or downloading is considered. These constraints, together with the intrinsic time-varying and location-dependent characteristics of wireless links, are in contrast with the increasing demand for a higher perceived quality from the final user. Therefore, a fundamental task in video streaming technologies is the evaluation of performance in terms of robustness, signal quality, bandwidth requirements, reliability, accessibility, and, in a single expression, quality of experience.

A possible solution to low bandwidth constraints and error-prone environments is represented by Multiple Description Coding (MDC) systems [2]-[3], first applied to generic information sources and then extended to audio and video delivery. In MDC, the original video is split into a predefined number of sub-streams, which are individually decodable at lower quality and provide maximum quality when decoded together [4]-[5]. The more descriptions the end user receives, the higher the quality of the decoded video. When all the sub-streams are available, perfect reconstruction is obtained. Unfortunately, this operation is costly, and the target is to introduce in each descriptor the minimum amount of redundancy needed to obtain the best possible visual content.

The coupling of watermarking and MDC appears in the state of the art in very few cases: in [6] the image is coded into two descriptions and only one of them contains the mark, which can obviously be detected by comparison; in [7] and [8] the mark is embedded by exploiting MDC, thereby improving robustness.

Our strategy combines MDC and data hiding in a novel way, with the aim of achieving a higher quality while keeping the redundancy of partial reconstructions constant. The idea is based on two previous works of the authors. In [9]-[10], the use of watermarking for concealing errors in video transmission over IP-based networks has been investigated. Briefly, the key-frame of each video shot is embedded in the relevant video code in order to allow a better interpolation of the information missing due to possible transmission errors. In [11]-[12], an MDC scheme is proposed based on a JPEG-like syntax to encode the side information. It requires an extremely low computational burden, which makes it suitable for streaming and broadcasting applications, and it ensures a quasi-standard code, which makes it attractive for integration with current systems.

The main drawback of this approach lies in the non-negligible overhead associated with the side information. For this reason, in this work, we propose to avoid the usual transmission of the DC coefficient of each block
of the video frame and to embed such information in the descriptor itself, by using an ad hoc watermarking scheme. The DC coefficient usually has a strong impact on the video size, especially in intra-coded frames. Its removal makes the system more flexible when one or more descriptors are lost, and allows either increasing the overall quality of the stream (at a fixed amount of overhead), or reducing the bitrate of the whole sequence (at a given quantization parameter).

In this work, the authors have focused on the evaluation of the perceived quality while keeping the amount of overhead constant. Two frame-based quality measures have been used: the classical Peak Signal to Noise Ratio (PSNR) and the Weighted PSNR (WPSNR) function, which is based on the contrast sensitivity function. Experimental results show an improved video quality at the client side while allowing flexible support for video services to heterogeneous clients.

The structure of the paper is as follows: in Section II we describe the proposed approach, explaining how the multiple description video coding framework and the data hiding technique are combined in order to improve the perceived quality of the video. In Section III experimental results are shown, and in Section IV we present some concluding remarks and future directions of this work.

II. THE PROPOSED APPROACH

In the last decade, different strategies have been proposed to face the problem of achieving the desired quality level in video streaming applications. This is not a trivial task, due to the limited reliability of data networks, especially in wireless channels characterized by high bit-error rates, strong delay jitter, and packet losses.

A. Multiple Description Coding

MDC can be included among the receiver-based rate control methods [13]. As far as video coding strategies are concerned, two appropriate solutions are layered coding and MDC. Layered coding is a very efficient technique suitable for many applications, but it has a drawback: if the base layer is not received or contains errors, the additional information coming from the enhancement layers is almost useless. MDC aims at solving these problems by structuring the stream into two or more equivalent descriptors, which are individually decodable at lower quality, while providing the highest quality if suitably merged.

The problem of multiple description coding was first faced as an information theoretic problem [2]-[3], where complementary representations of the input are sent to the decoder over different paths. In [4] and [5] this approach is exploited for the first time for video transmission by data partitioning, even though the MDC concept has been adopted using many different techniques, such as scalar quantization theory [14]-[15] or correlating transforms [16]. In all approaches the main objective remains the same: subdivide the information into equal sub-streams, in order to make the reconstruction possible even when only part of the original data is available. The cost to pay consists of a certain amount of redundancy introduced in each descriptor, needed at the decoder side to recover the best possible visual content. Several valuable methods have been proposed for MDC (see [12] for a detailed overview), but their main common drawback is the computational complexity, which in many cases makes them useless for real-time applications.

The authors propose in [11] and [12] a new approach to MDC, suitable for DCT block transform coders, which are the basis of current video coding standards. The method is based on the alternate selection of the transform coefficients in the different descriptors, and on a sorting operation. The former halves the number of coefficients to be embedded in each descriptor, while the latter orders the coefficients by decreasing absolute value, and is intended to ease the interpolation of the missing terms at the decoder when a descriptor is lost.

The underlying idea is to provide the decoder with the information needed to reconstruct a sorted array of coefficients, where only half of the values are available, but the positions and signs of the unavailable ones are known. Since the sequence is sorted in descending order, the missing coefficients can be successfully recovered at the decoder by a simple monotonically decreasing interpolator, exploiting the typical exponential behavior of quantized DCT coefficients after the zig-zag scan. In order to keep the amount of redundancy low, the authors have noticed that, from a perceptual point of view, it could be sufficient to transmit only the information about a few missing DCT coefficients in the first part of the zig-zag scan. Those terms usually have a strong impact on the resulting quality of the video because they carry the low frequency information, and must therefore be considered carefully. Some information about the high frequencies can instead be discarded. At the receiver, the overhead is comparable with that of other MDC techniques, but the method ensures a very efficient computation suitable for real-time implementation. For the sake of simplicity, the algorithm has been developed considering only two descriptors. Each descriptor is built with one half of the DCT coefficients. In order to allow an efficient interpolation at the decoder, the DC component is always replicated, together with the
information concerning the most important coefficients in the zig-zag scan. It is worth noting that the transmission of the complete coefficient set would cause a higher overhead; therefore, exploiting the natural decay of the DCT coefficients after the zig-zag scan, only the relative position and the sign are encoded and transmitted. Furthermore, as shown in Fig. 1, the information has been encoded in a differential form, considering that the topological position in the zig-zag scan often coincides with the sorted one.
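The descriptor construction described above can be illustrated with a minimal Python sketch. It is not the codec of [11]-[12]: it assumes that the alternate selection is applied to the sorted ranks, that the side information (differential positions and signs) is kept for every AC coefficient rather than only for the first part of the zig-zag scan, and that the monotonic interpolator is a plain average of the two neighboring ranks. All function names are hypothetical.

```python
def build_descriptors(zigzag):
    """Split one block of zig-zag-scanned quantized DCT coefficients in two."""
    dc, ac = zigzag[0], zigzag[1:]
    # Sort AC indices by decreasing absolute value (stable: ties keep scan order).
    order = sorted(range(len(ac)), key=lambda i: -abs(ac[i]))
    # Differential position: zig-zag (topological) index minus sorted rank.
    diff_pos = [topo - rank for rank, topo in enumerate(order)]
    signs = [1 if ac[i] >= 0 else -1 for i in order]
    magnitudes = [abs(ac[i]) for i in order]
    # Assumption: even sorted ranks go to descriptor 0, odd ranks to descriptor 1.
    return [
        {"dc": dc,                          # DC replicated in both descriptors
         "magnitudes": magnitudes[parity::2],
         "diff_pos": diff_pos,              # side information
         "signs": signs}                    # side information
        for parity in (0, 1)
    ]


def interpolate_sorted(known, parity, n):
    """Fill a descending magnitude array from the ranks parity, parity+2, ..."""
    est = [None] * n
    est[parity::2] = known
    for r in range(n):
        if est[r] is None:
            # Neighbors of a missing rank always belong to the received half.
            left = est[r - 1] if r > 0 else est[r + 1]
            right = est[r + 1] if r + 1 < n else 0
            est[r] = (left + right) / 2     # stays between its neighbors
    return est


def rebuild_block(dc, magnitudes, diff_pos, signs):
    """Put sorted magnitudes back at their zig-zag positions with their signs."""
    ac = [0] * len(magnitudes)
    for rank, mag in enumerate(magnitudes):
        ac[rank + diff_pos[rank]] = signs[rank] * mag
    return [dc] + ac


def decode_both(d0, d1):
    """Exact reconstruction when both descriptors are received."""
    n = len(d0["signs"])
    mags = [0] * n
    mags[0::2] = d0["magnitudes"]
    mags[1::2] = d1["magnitudes"]
    return rebuild_block(d0["dc"], mags, d0["diff_pos"], d0["signs"])


def decode_single(desc, parity):
    """Approximate reconstruction from one descriptor only."""
    n = len(desc["signs"])
    mags = interpolate_sorted(desc["magnitudes"], parity, n)
    return rebuild_block(desc["dc"], mags, desc["diff_pos"], desc["signs"])


block = [50, -12, 9, 0, -7, 3, 1, 0, -1]    # illustrative values only
d0, d1 = build_descriptors(block)
full = decode_both(d0, d1)
approx = decode_single(d0, 0)
```

In this toy block most differential positions are small, which is the property the differential encoding relies on. With one descriptor, the received magnitudes are reproduced exactly and the missing ones are estimated between their sorted neighbors.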

Fig. 1. PDF of the differential position between the topological and sorted coefficient positions.

B. Data Hiding Scheme

Our goal is to exploit data hiding techniques in order to improve the perceived video quality. In particular, we propose to avoid the transmission of the DC coefficient of each block of the video frame (which is normally inserted into all descriptions) by embedding it into the other coefficients of the single descriptor with a proper robust watermarking scheme.

In this paper we use the binary Scalar Costa Scheme (SCS) [17], which belongs to the family of quantization-based methods and achieves rates always larger than those of the classical spread spectrum [18]. Once a quantization step size $\Delta$, the parameter $\alpha$ and the secret key $k_n \in [0, 1)$ are chosen, the sequence of elements $d_n \in \{0, 1\}$ can be embedded into the cover $x$ in the following way:

$$q_n = Q_\Delta\left\{ x_n - \Delta\left(\frac{d_n}{2} + k_n\right) \right\} - \left( x_n - \Delta\left(\frac{d_n}{2} + k_n\right) \right)$$

$$s = x + w = x + \alpha q$$

where $Q_\Delta\{\cdot\}$ denotes uniform scalar quantization with step size $\Delta$ and $s$ is now watermarked. Therefore, we consider the DC coefficient as a sequence of $d_n \in \{0, 1\}$ to be embedded and the remaining coefficients of the single description as cover $x$. Once they are defined, $\Delta$, $\alpha$ and $k_n$ are kept fixed for the whole MDC coding and SCS embedding procedure. At the receiver side, we can extract the DC coefficient (the watermark) in the following way:

$$y_n = Q_\Delta\{ s_n - k_n \Delta \} - ( s_n - k_n \Delta ), \qquad |y_n| \le \frac{\Delta}{2}.$$

Therefore, $y_n$ is close to zero if $d_n = 0$ is inserted and transmitted, and close to $\pm\Delta/2$ for $d_n = 1$.

Notice that, as a consequence of the DC coefficient hiding, the scheme provides some encryption of the video, which cannot be decoded without the knowledge of the key $k_n$. Moreover, we can exploit the data hiding technique either to reduce the bitrate of the whole sequence, or to increase the quality of the video while keeping the overhead constant, which is the focus of the paper. In the following, the details of the modified MDC scheme are presented.

C. Modified Multiple Description Coding

The procedure implemented to obtain the two descriptors can be seen in Fig. 2. The chosen encoder is a typical JPEG-like scheme. After the quantization, the two descriptors are created according to the alternate coefficient selection already mentioned in the previous sections. The DC coefficient is then extracted and replicated on both descriptors, in order to facilitate the reconstruction.

The proposed scheme aims at removing the introduced redundancy by embedding the DC into the available AC coefficients. Furthermore, this implies that the video is not correctly decodable if the user does not have the proper key used for encrypting the DC. At the decoder, only the owner of the key is able to extract the DC component and place it in its DCT position. After this stage, the decoding process is standard compliant and does not require further ad hoc processing.

III. EXPERIMENTAL RESULTS

In the following, some results of the performed experiments are reported. The data hiding subsystem has been tuned considering that the codewords should not affect the entropy encoder too strongly. For this reason we have applied a weak watermark, by setting $\alpha = 0.1$, $k_n = 0.01$, and $\Delta = 10$.
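The SCS embedding and extraction rules of Section II-B can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the cover values are made up, and the decoder is a per-sample hard decision that compares $|y_n|$ with $\Delta/4$. With that simple decoder and no channel coding, noiseless recovery is only guaranteed for $\alpha > 0.5$, so the example uses $\alpha = 0.9$ (an assumption of this sketch). How the weaker setting $\alpha = 0.1$ is decoded is not detailed in the text, and the sketch does not attempt to reproduce it.

```python
import math


def quantize(v, delta):
    """Uniform scalar quantizer Q_delta: nearest multiple of delta."""
    return delta * math.floor(v / delta + 0.5)


def scs_embed(x, d, delta, alpha, k):
    """Embed the bits d (0 or 1) into the cover samples x, one bit per sample."""
    s = []
    for xn, dn in zip(x, d):
        u = xn - delta * (dn / 2 + k)       # cover shifted by message and key
        q = quantize(u, delta) - u          # quantization error q_n
        s.append(xn + alpha * q)            # s = x + alpha * q
    return s


def scs_extract(s, delta, k):
    """Hard-decision extraction: y_n near 0 means bit 0, near +-delta/2 means bit 1."""
    bits = []
    for sn in s:
        v = sn - k * delta
        y = quantize(v, delta) - v          # |y_n| <= delta / 2
        bits.append(0 if abs(y) < delta / 4 else 1)
    return bits


# Illustrative values: delta and k follow the text, alpha = 0.9 is an assumption.
DELTA, ALPHA, KEY = 10.0, 0.9, 0.01
cover = [12.0, -7.5, 3.2, 40.1, -0.4, 18.9, 5.0, -22.3]   # stand-in AC coefficients
bits = [1, 0, 1, 1, 0, 0, 1, 0]                            # stand-in DC bits

marked = scs_embed(cover, bits, DELTA, ALPHA, KEY)
recovered = scs_extract(marked, DELTA, KEY)
max_change = max(abs(a - b) for a, b in zip(marked, cover))
```

Each sample moves by at most $\alpha\Delta/2$, so a smaller $\alpha$ or $\Delta$ disturbs the AC coefficients less, at the price of a less reliable extraction. Extracting with a different key shifts the quantization lattice and generally yields different bits, which is the encryption-like effect mentioned in Section II-B.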

Fig. 2. MDC scheme with DC embedding.

In fact, in this specific application framework, the principal objective is to reduce the redundancy, and not to recover the information in the single description in case of packet losses. The simulations have been carried out on a set of standard test sequences, using as quality evaluation measures both the PSNR and the WPSNR function, which is based on the contrast sensitivity function. Experimental results are reported in Tables I and II for the Foreman and Stefan video sequences (2 seconds each). The sequences have been compared with the single description coding (SDC) mode, which achieved a PSNR of 33.4 dB and 34.03 dB, respectively. As can be seen from Tables I and II, the quality improvement is on average around 1 dB (PSNR) or 0.5 dB (WPSNR). In fact, the DC embedding procedure allowed improving the MDC coding stage by increasing both the number of coefficients to be embedded in each description and the quantization parameter.

TABLE I
FOREMAN: FRAME-BASED QUALITY EVALUATION

N    Q     DC emb.   Red. (%)   PSNR (dB)   WPSNR (dB)
6    60    no        36.81      27.08       26.92
8    70    yes       35.51      27.90       27.32

TABLE II
STEFAN: FRAME-BASED QUALITY EVALUATION

N    Q     DC emb.   Red. (%)   PSNR (dB)   WPSNR (dB)
6    60    no        32.75      24.12       25.02
8    70    yes       34.49      24.91       25.63

Fig. 3. Sample from the sequence Foreman. SDC (left), MDC (right).

To consider also the temporal features of the video, video quality measurement (VQM) techniques [19] should also be used to evaluate the quality of the generated sequences. The general VQM model has been designed to track subjective quality judgments of video scenes that can span a very wide range of quality levels, and it presents objective parameters which measure the perceptual effects of a wide range of impairments (blurring, block distortion, jerky/unnatural motion, noise and error blocks). Since the VQM model considers motion as a key factor for quality evaluation, it would not be meaningful in our simulations because of the adopted video coding scheme (JPEG-like).

IV. CONCLUSIONS

Future expansion of multimedia-based applications strongly depends on the final users' experience. The perceived quality can be affected by several factors, such as the particular application, the environment, the background and the expectations of the users themselves. The performance of multimedia services will therefore be a compromise among several factors (bandwidth, quality, security, latency, etc.), which are often in contrast with each other.

In this work, we have proposed a data hiding-based method for increasing the quality of a video without requiring additional information. The scheme fits well into the multiple description video coding framework, providing an increase in perceived quality. In this approach the
authors concentrated on the DC coefficient embedding in order to reduce the overall redundancy of the MDC scheme. Several variations of the proposed scheme can be devised according to the particular application, but this simple example should serve as a proof of concept for underlining the advantages of data hiding techniques for improving MDC schemes. If high quality of the received video is required, then our system can be used to improve it while keeping the amount of overhead fixed. Instead, if the available bandwidth is the main constraint, the proposed scheme allows reducing the amount of data to be transmitted while preserving an acceptable subjective quality. The experimental results show a limited quality increase. The next step of this work is targeted at making the embedding strategy more efficient, in order to provide stronger quality enhancements for application to the most recent video coding standards, such as MPEG-4 and H.264.

REFERENCES

[1] V. K. Goyal, "Multiple description coding: compression meets the network," IEEE Signal Processing Magazine, vol. 18, no. 5, pp. 74-93, Sep. 2001.
[2] J. K. Wolf, A. D. Wyner, and J. Ziv, "Source coding for multiple descriptions," Bell Syst. Tech. Journal, vol. 59, no. 8, pp. 1417-1426, Oct. 1980.
[3] A. A. El Gamal and T. M. Cover, "Achievable rates for multiple descriptions," IEEE Transactions on Information Theory, vol. 28, no. 11, pp. 851-857, Nov. 1982.
[4] J. G. Apostolopoulos, "Error-resilient video compression through the use of multiple states," Proc. ICIP 2000, Vancouver, BC, Canada, Sep. 2000.
[5] J. G. Apostolopoulos, "Reliable video communication over lossy packet networks using multiple state encoding and path diversity," Proc. VCIP 2001, San Jose, USA, Jan. 2001.
[6] R. Chandramouli, B. M. Graubard, and C. R. Richmond, "A Multiple Description framework for Oblivious Watermarking," Proc. SPIE Security and Watermarking for Multimedia Contents 2001, San Jose, USA, Jan. 2001.
[7] Y.-F. Hsia, C.-Y. Chang, and J.-R. Liao, "Multiple-Description Coding for Robust Image Watermarking," Proc. ICIP 2004, Singapore, 2004.
[8] S.-C. Chu, Y.-C. Hsin, H.-C. Huang, K.-C. Huang, and J.-S. Pan, "Multiple Description Watermarking for Lossy Network," Proc. ISCAS 2005, Kobe, Japan, May 2005.
[9] C. B. Adsumilli, M. Carli, M. C. Q. de Farias, and S. K. Mitra, "A hybrid constrained unequal error protection and data hiding scheme for packet video transmission," Proc. ICASSP 2003, Hong Kong, Apr. 2003.
[10] C. B. Adsumilli, M. C. Q. Farias, S. K. Mitra, and M. Carli, "A robust error concealment technique using data hiding for image and video transmission over lossy channels," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 11, pp. 1394-1406, Nov. 2005.
[11] N. Conci and F. De Natale, "Multiple description video coding by coefficients ordering and interpolation," Proc. Second International Mobile Multimedia Communications Conference 2006, Alghero, Italy, Sep. 2006.
[12] N. Conci and F. De Natale, "Multiple description video coding using coefficients ordering and interpolation," Signal Processing: Image Communication, Special Issue on Mobile Video (to appear).
[13] J. Chakareski, S. Han, and B. Girod, "Layered coding vs. multiple descriptions for video streaming over multiple paths," Springer Multimedia Systems, vol. 10, no. 4, pp. 275-285, Apr. 2005.
[14] V. Vaishampayan, "Design of multiple description scalar quantizers," IEEE Transactions on Information Theory, vol. 39, no. 5, pp. 821-834, May 1993.
[15] F. Verdicchio, A. Munteanu, A. I. Gavrilescu, J. Cornelis, and P. Schelkens, "Embedded multiple description coding of video," IEEE Transactions on Image Processing, vol. 15, no. 10, pp. 3114-3130, Oct. 2006.
[16] M. T. Orchard, Y. Wang, V. Vaishampayan, and A. R. Reibman, "Redundancy rate-distortion analysis of multiple description coding using pairwise correlating transforms," Proc. ICIP 1997, Santa Barbara, CA, Oct. 1997.
[17] J. J. Eggers, R. Bäuml, R. Tzschoppe, and B. Girod, "Scalar Costa Scheme for Information Embedding," IEEE Transactions on Signal Processing, vol. 51, no. 4, pp. 1003-1019, Apr. 2003.
[18] L. Perez-Freire, F. Perez-Gonzalez, and S. Voloshynovskiy, "An accurate analysis of scalar quantization-based data hiding," IEEE Transactions on Information Forensics and Security, vol. 1, no. 1, pp. 80-86, Mar. 2006.
[19] S. Wolf and M. Pinson, "Video quality measurement techniques," NTIA Report 02-392, June 2002.