Improving Perceptual Quality of Multiple Description Coding by Data Hiding
G. Boato (1), M. Carli (2), N. Conci (1), F. G. B. De Natale (1), and A. Neri (2)
(1) Dept. of Information and Communication Technology, University of Trento, Trento, Italy, (boato, conci)@dit.unitn.it, [email protected]
(2) Dept. of Applied Electronics, Università degli Studi Roma TRE, Roma, Italy, (carli, neri)@uniroma3.it
Abstract: In this contribution a data hiding-based method for increasing the quality of a video is presented. The proposed scheme fits well into the multiple description video coding framework, providing quality scalability with reduced overhead. Furthermore, the perceived quality of the reconstructed video is improved.

I. INTRODUCTION

The multimedia era is a big leap in the data communication scenario. Internet, IEEE 802.11, IEEE 802.16, and the third generation of mobile communication (UMTS/CDMA2000) are spreading the use of, and the demand for, multimedia data. Video streaming applications are becoming increasingly widespread, for several concurrent reasons: the increasing diffusion of broadband networks (both wired and wireless), the reduction of connection costs, the growing awareness of users about technology, and the industrial push of new multimedia services and devices into the market [1].

Delivering video and audio services to clients everywhere, whatever their means of access, is a challenging problem, since mobile devices typically vary in their connection characteristics, processing capabilities, and display sizes. Furthermore, the volume of data to be transmitted and processed is very large when video streaming or downloading is considered. These constraints, together with the intrinsic time-varying and location-dependent characteristics of wireless links, are in contrast with the increasing demand for a higher perceived quality from the final user. Therefore, evaluating performance in terms of robustness, signal quality, bandwidth requirements, reliability, accessibility, and, in a single expression, quality of experience is a fundamental task in video streaming technologies.

A possible solution to low bandwidth constraints and error prone environments is represented by Multiple Description Coding (MDC) systems [2]-[3], first applied to generic information sources and then extended to audio and video delivery. In MDC, the original video is split into a pre-defined number of sub-streams, which are individually decodable at lower quality and provide maximum quality when decoded together [4]-[5]. The more descriptions the end user receives, the higher the quality of the decoded video. When all the sub-streams are available, perfect reconstruction is obtained. Unfortunately, this operation is costly, and the target is to introduce in each descriptor the minimum amount of redundancy needed to obtain the best possible visual content.

The coupling of watermarking and MDC appears in the state of the art in very few cases: in [6] the image is coded into two descriptions and only one of them contains the mark, which can obviously be detected by comparison; in [7] and [8] the mark is embedded exploiting MDC, thereby improving robustness.

Our strategy combines MDC and data hiding in a novel way, with the aim of achieving a higher quality while keeping the redundancy of partial reconstructions constant. The idea is based on two previous works of the authors. In [9]-[10], the use of watermarking for error concealment in video transmission over IP-based networks has been investigated. Briefly, the key-frame of each video shot is embedded in the relevant video code in order to allow a better interpolation of information that is missing due to possible transmission errors. In [11]-[12], an MDC scheme is proposed based on a JPEG-like syntax to encode the side information. It requires an extremely low computational burden, making it suitable for streaming and broadcasting applications, and ensures a quasi-standard code, which makes it attractive for integration with current systems.

The main drawback of this approach lies in the non-negligible overhead associated with the side information. For this reason, in this work, we propose to avoid the usual transmission of the DC coefficient of each block
of the video frame and to embed such information in the descriptor itself, by using an 'ad-hoc' watermarking scheme. The DC coefficient usually has a strong impact on the video size, especially in intra-coded frames. Its removal makes the system more flexible when one or more descriptors are lost, and allows either increasing the overall quality of the stream (with the amount of overhead kept fixed), or reducing the bitrate of the whole sequence (at a given quantization parameter).

In this work, the authors have focused on the evaluation of the perceived quality while keeping the amount of overhead constant. Two frame-based quality measures have been used: the classical Peak Signal to Noise Ratio (PSNR) and the Weighted PSNR (WPSNR), which is based on the contrast sensitivity function. Experimental results show an improved video quality at the client side while allowing flexible support for video services to heterogeneous clients.

The structure of the paper is as follows: in Section II we describe the proposed approach, explaining how the multiple description video coding framework and the data hiding technique are combined in order to improve the perceived quality of the video. In Section III experimental results are shown, and in Section IV we present some concluding remarks and future directions of this work.

II. THE PROPOSED APPROACH

In the last decade different strategies have been proposed to address the problem of achieving the desired quality level in video streaming applications. This is not a trivial task, due to the limited reliability of data networks, especially in wireless channels characterized by high bit-error rates, strong delay jitter, and packet losses.

A. Multiple Description Coding

MDC can be included among the receiver-based rate control methods [13]. As far as video coding strategies are concerned, two appropriate solutions are layered coding and MDC. Layered coding is a very efficient technique suitable for many applications, but with a drawback: if the base layer is not received or contains errors, the additional information coming from the enhancement layers is almost useless. MDC aims at solving this problem by structuring the stream into two or more equivalent descriptors, which are individually decodable at lower quality, while providing the highest quality if suitably merged.

The problem of multiple description coding (MDC) was first addressed as an information theoretic problem [2]-[3], where complementary representations of the input are sent to the decoder over different paths. In [4] and [5] this approach is exploited for the first time for video transmission by data partitioning, although the MDC concept has since been adopted using many different techniques, such as scalar quantization theory [14]-[15] or correlating transforms [16]. In all approaches the main objective remains the same: subdivide the information into equal sub-streams, in order to make the reconstruction possible even when only part of the original data is available. The cost to pay consists of a certain amount of redundancy introduced in each descriptor, needed at the decoder side to recover the best possible visual content. Several valuable methods have been proposed for MDC (see [12] for a detailed overview), but their main common drawback is the computational complexity, which in many cases makes them useless for real-time applications.

The authors propose in [11] and [12] a new approach to MDC, suitable for DCT block transform coders, which are the basis of current video coding standards. The method is based on the alternate selection of the transform coefficients in the different descriptors, and on a sorting operation. The former halves the number of coefficients to be embedded in each descriptor, while the latter orders the coefficients by decreasing absolute value, and is intended to ease the interpolation of the missing terms at the decoder when a descriptor is lost. The underlying idea is to provide the decoder with the information needed to reconstruct a sorted array of coefficients, where only half of the values are available, but the positions and signs of the unavailable ones are known. Since the sequence is sorted in descending order, the missing coefficients can be successfully recovered at the decoder by a simple monotonic decreasing interpolator, exploiting the typical exponential behavior of quantized DCT coefficients after zig-zag scan. In order to keep the amount of redundancy low, the authors have noticed that, from a perceptual point of view, it could be sufficient to transmit only the information about a few missing DCT coefficients in the first part of the zig-zag scan. Those terms usually have a strong impact on the resulting quality of the video because they carry the low frequency information, and must therefore be considered carefully. Some information about the high frequencies can instead be discarded. At the receiver, the overhead is comparable with that of other MDC techniques, but the scheme ensures a very efficient computation, suitable for real-time implementation. For the sake of simplicity, the algorithm has been developed considering only two descriptors. Each descriptor is built from one half of the DCT coefficients. In order to allow an efficient interpolation at the decoder, the DC component is always replicated, together with the information concerning the most important coefficients in the zig-zag scan. It is worth noting that transmitting the complete coefficient set would cause a higher overhead; therefore, exploiting the natural decay of the DCT coefficients after the zig-zag scan, only the relative position and the sign are encoded and transmitted. Furthermore, as shown in Fig. 1, the information has been encoded in a differential form, considering that the topological position in the zig-zag scan often coincides with the sorted one.
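The alternate selection and sorted interpolation described above can be illustrated with a minimal sketch. The function names are hypothetical, the input is assumed to be the quantized AC coefficients of one block in zig-zag order, and the midpoint rule is only one possible monotonic decreasing interpolator; the text does not specify which one is actually used. The rank order and the signs stand in for the side information.

```python
def split_sorted(coeffs):
    """Sort coefficients by decreasing magnitude and deal them alternately
    into two descriptors. Returns the rank order (original positions),
    the signs in rank order, and the two magnitude lists."""
    order = sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))
    mags = [abs(coeffs[i]) for i in order]
    signs = [1 if coeffs[i] >= 0 else -1 for i in order]
    d0 = mags[0::2]  # even ranks go to descriptor 0
    d1 = mags[1::2]  # odd ranks go to descriptor 1
    return order, signs, d0, d1


def reconstruct_from_d0(order, signs, d0, n):
    """Rebuild n coefficients when only descriptor 0 arrives.
    Assumption: each missing odd-rank magnitude is the midpoint of its
    two sorted neighbours (0 is used past the end of the array), which
    keeps the sorted sequence non-increasing."""
    mags = [0.0] * n
    mags[0::2] = d0
    for r in range(1, n, 2):
        right = mags[r + 1] if r + 1 < n else 0.0
        mags[r] = (mags[r - 1] + right) / 2.0
    out = [0.0] * n
    for rank, pos in enumerate(order):
        out[pos] = signs[rank] * mags[rank]  # restore sign and position
    return out


if __name__ == "__main__":
    block = [-26, 12, -9, 7, 0, 3, -2, 1]  # illustrative values only
    order, signs, d0, d1 = split_sorted(block)
    print(d0, d1)
    print(reconstruct_from_d0(order, signs, d0, len(block)))
```

For the illustrative block, descriptor 0 carries the magnitudes 26, 9, 3, 1, and the lost magnitude 12 is estimated as 17.5, the midpoint of 26 and 9. The received coefficients are restored exactly, while the interpolated ones are only approximations.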
Fig. 1. PDF of the differential position between the topological and sorted coefficient positions.

B. Data Hiding Scheme

Our goal is to exploit data hiding techniques in order to improve the perceived video quality. In particular, we propose to avoid the transmission of the DC coefficient of each block of the video frame (which is normally inserted into all descriptions) by embedding it into the other coefficients of the single descriptor with a proper robust watermarking scheme. In this paper we use the binary Scalar Costa Scheme (SCS) [17], which belongs to the family of quantization-based methods and achieves rates always larger than those of classical spread spectrum [18]. Once a quantization step size $\Delta$, the parameter $\alpha$ and the secret key $k_n \in [0, 1)$ are chosen, the sequence of elements $d_n \in \{0, 1\}$ can be embedded into the cover $x$ in the following way:

$$q_n = Q_\Delta\left\{ x_n - \Delta\left( \frac{d_n}{2} + k_n \right) \right\} - \left( x_n - \Delta\left( \frac{d_n}{2} + k_n \right) \right)$$

$$s = x + w = x + \alpha q$$

where $s$ is now watermarked. Therefore, we consider the DC coefficient as a sequence of $d_n \in \{0, 1\}$ to be embedded, and the remaining coefficients of the single description as cover $x$. Once they are defined, $\Delta$, $\alpha$ and $k_n$ are kept fixed for the whole MDC coding and SCS embedding procedure. At the receiver side, we can extract the DC coefficient (the watermark) in the following way:

$$y_n = Q_\Delta\{ s_n - k_n \Delta \} - ( s_n - k_n \Delta )$$

and $|y_n| \le \Delta/2$. Therefore, $y_n$ is close to zero if $d_n = 0$ was inserted and transmitted, and $|y_n|$ is close to $\Delta/2$ for $d_n = 1$.

Notice that, as a consequence of hiding the DC coefficient, the scheme provides some encryption of the video, which cannot be decoded without the knowledge of the key $k_n$. Moreover, we can exploit the data hiding technique to reduce the bitrate of the whole sequence, or to increase the quality of the video while keeping the overhead constant, which is the focus of the paper.

In the following, the details of the modified MDC scheme are presented.

C. Modified Multiple Description Coding

The procedure implemented to obtain the two descriptors is shown in Fig. 2. The chosen encoder is a typical JPEG-like scheme. After the quantization, the two descriptors are created according to the alternate coefficient selection already mentioned in the previous sections. In the original scheme, the DC coefficient is then extracted and replicated on both descriptors, in order to facilitate the reconstruction.

The proposed scheme aims at removing this redundancy by embedding the DC into the available AC coefficients. Furthermore, this implies that the video is not correctly decodable if the user does not have the proper key used for encrypting the DC. At the decoder, only the owner of the key is able to extract the DC component and place it in its DCT position. After this stage, the decoding process is standard compliant, and does not require further ad hoc processing.

III. EXPERIMENTAL RESULTS

In the following, some results of the performed experiments are reported. The data hiding subsystem has been tuned considering that the codewords should not affect the entropy encoder too strongly. For this reason we have applied a weak watermark, by setting $k_n = 0.01$, $\alpha = 0.1$, and $\Delta = 10$.
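The binary SCS embedding and extraction rules of Section II-B can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the function names are hypothetical, the hard decision threshold at $\Delta/4$ is an assumption, and how the DC value is mapped to the bits $d_n$ is not specified in the text, so the sketch simply takes a list of bits.

```python
import math


def quantize(v, delta):
    """Uniform scalar quantizer Q_delta: nearest multiple of delta."""
    return delta * math.floor(v / delta + 0.5)


def scs_embed(x, bits, delta, alpha, key):
    """Embed one bit d_n per cover sample x_n: s_n = x_n + alpha * q_n."""
    marked = []
    for xn, dn, kn in zip(x, bits, key):
        u = xn - delta * (dn / 2.0 + kn)
        qn = quantize(u, delta) - u       # quantization error, |q_n| <= delta/2
        marked.append(xn + alpha * qn)
    return marked


def scs_extract(s, delta, key):
    """Compute y_n and take a hard decision per sample.
    Assumption: |y_n| < delta/4 is read as 0, otherwise as 1."""
    decoded = []
    for sn, kn in zip(s, key):
        v = sn - kn * delta
        yn = quantize(v, delta) - v
        decoded.append(0 if abs(yn) < delta / 4.0 else 1)
    return decoded


if __name__ == "__main__":
    cover = [3.2, -7.9, 15.0, 0.4]        # illustrative AC values only
    bits = [1, 0, 1, 1]
    key = [0.01] * len(cover)
    marked = scs_embed(cover, bits, delta=10.0, alpha=1.0, key=key)
    print(scs_extract(marked, delta=10.0, key=key))
```

The usage example sets $\alpha = 1$ so that, without channel noise, $y_n$ is exactly $0$ or $\pm\Delta/2$ (up to rounding) and the hard decision recovers every bit. For $\alpha < 1$ the embedding distortion shrinks to at most $\alpha\Delta/2$ per sample, but a residual $(1 - \alpha) q_n$ remains in $y_n$, so a per-sample hard decision like the one above is no longer reliable for a weak watermark such as $\alpha = 0.1$.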
Fig. 2. MDC scheme with DC embedding.

In fact, in this specific application framework, the principal objective is to reduce the redundancy, and not to recover the information in the single description in case of packet losses. The simulations have been carried out considering a set of standard test sequences, using both the PSNR and the WPSNR function, based on the contrast sensitivity function, as quality evaluation measures. Experimental results are reported in Tables I and II for the Foreman and Stefan video sequences (2 seconds each). The sequences have been compared with the single description coding (SDC) mode, which achieved a PSNR of 33.4 dB and 34.03 dB respectively. As can be seen from Tables I and II, the quality improvement is on average around 1 dB (PSNR) or 0.5 dB (WPSNR). In fact, the DC embedding procedure allowed improving the MDC coding stage, by increasing both the number of coefficients to be embedded in each description and the quantization parameter.

TABLE I
FOREMAN FRAME-BASED QUALITY EVALUATION

N   Q    DC emb   Red %   PSNR dB   WPSNR dB
6   60   no       36.81   27.08     26.92
8   70   yes      35.51   27.90     27.32

TABLE II
STEFAN FRAME-BASED QUALITY EVALUATION

N   Q    DC emb   Red %   PSNR dB   WPSNR dB
6   60   no       32.75   24.12     25.02
8   70   yes      34.49   24.91     25.63

Fig. 3. Sample from the sequence Foreman. SDC (left), MDC (right).

To also consider the temporal features of the video, video quality measurement (VQM) techniques [19] should also be used to evaluate the quality of the generated sequences. The general VQM model has been designed to track subjective quality judgments of video scenes that can span a very wide range of quality levels, and presents objective parameters which measure the perceptual effects of a wide range of impairments (blurring, block distortion, jerky/unnatural motion, noise and error blocks). Since the VQM model considers motion as a key factor for quality evaluation, it would not be meaningful in our simulations, because of the adopted video coding scheme (JPEG-like).

IV. CONCLUSIONS

Future expansion of multimedia based applications strongly depends on the final users' experience. The perceived quality can be affected by several factors, such as the particular application, the environment, the background and the expectations of the users themselves. The performance of multimedia services will therefore be a compromise among several factors (bandwidth, quality, security, latency, etc.), which often conflict with one another.

In this work, we have proposed a data hiding-based method for increasing the quality of a video without requiring additional information. The scheme fits well into the multiple description video coding framework, providing an increase in perceived quality. In this approach the
authors concentrated on the DC coefficient embedding in order to reduce the overall redundancy of the MDC scheme. Several variations of the proposed scheme can be devised according to the particular application, but this simple example should serve as a proof-of-concept underlining the advantages of data hiding techniques for improving MDC schemes. If high quality of the received video is required, then our system can be used to improve it while keeping the amount of overhead fixed. If instead the available bandwidth is the main constraint, the proposed scheme allows reducing the amount of data to be transmitted while preserving an acceptable subjective quality. The experimental results show a limited quality increase. The next step of this work is targeted at making the embedding strategy more efficient, in order to provide stronger quality enhancements for application to the most recent video coding standards, such as MPEG-4 and H.264.

REFERENCES

[1] V. K. Goyal, Multiple description coding: compression meets the network, IEEE Signal Processing Magazine, vol. 18, no. 5, pp. 74-93, Sep. 2001.
[2] J. K. Wolf, A. D. Wyner, and J. Ziv, Source coding for multiple descriptions, Bell Syst. Tech. Journal, vol. 59, no. 8, pp. 1417-1426, Oct. 1980.
[3] A. A. El Gamal and T. M. Cover, Achievable rates for multiple descriptions, IEEE Transactions on Information Theory, vol. 28, no. 11, pp. 851-857, Nov. 1982.
[4] J. G. Apostolopoulos, Error-resilient video compression through the use of multiple states, Proc. ICIP 2000, Vancouver, BC, CA, Sep. 2000.
[5] J. G. Apostolopoulos, Reliable video communication over lossy packet networks using multiple state encoding and path diversity, Proc. VCIP 2001, San Jose, USA, Jan. 2001.
[6] R. Chandramouli, B. M. Graubard, and C. R. Richmond, A Multiple Description framework for Oblivious Watermarking, Proc. SPIE Security and Watermarking for Multimedia Contents 2001, San Jose, USA, Jan. 2001.
[7] Y.-F. Hsia, C.-Y. Chang, and J.-R. Liao, Multiple-Description Coding for Robust Image Watermarking, Proc. ICIP 2004, Singapore, 2004.
[8] S.-C. Chu, Y.-C. Hsin, H.-C. Huang, K.-C. Huang, and J.-S. Pan, Multiple Description Watermarking for Lossy Network, Proc. ISCAS 2005, Kobe, Japan, May 2005.
[9] C. B. Adsumilli, M. Carli, M. C. Q. de Farias, and S. K. Mitra, A hybrid constrained unequal error protection and data hiding scheme for packet video transmission, Proc. ICASSP 2003, Hong Kong, Apr. 2003.
[10] C. B. Adsumilli, M. C. Q. Farias, S. K. Mitra, and M. Carli, A robust error concealment technique using data hiding for image and video transmission over lossy channels, IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 11, pp. 1394-1406, Nov. 2005.
[11] N. Conci and F. De Natale, Multiple description video coding by coefficients ordering and interpolation, Proc. Second International Mobile Multimedia Communications Conference 2006, Alghero, Italy, Sep. 2006.
[12] N. Conci and F. De Natale, Multiple description video coding using coefficients ordering and interpolation, Signal Processing: Image Communication, Special Issue on Mobile Video (to appear).
[13] J. Chakareski, S. Han, and B. Girod, Layered coding vs. multiple descriptions for video streaming over multiple paths, Springer Multimedia Systems, vol. 10, no. 4, pp. 275-285, Apr. 2005.
[14] V. Vaishampayan, Design of multiple description scalar quantizers, IEEE Transactions on Information Theory, vol. 39, no. 5, pp. 821-834, May 1993.
[15] F. Verdicchio, A. Munteanu, A. I. Gavrilescu, J. Cornelis, and P. Schelkens, Embedded multiple description coding of video, IEEE Transactions on Image Processing, vol. 15, no. 10, pp. 3114-3130, Oct. 2006.
[16] M. T. Orchard, Y. Wang, V. Vaishampayan, and A. R. Reibman, Redundancy rate-distortion analysis of multiple description coding using pairwise correlating transforms, Proc. ICIP 1997, Santa Barbara, CA, Oct. 1997.
[17] J. J. Eggers, R. Bäuml, R. Tzschoppe, and B. Girod, Scalar Costa Scheme for Information Embedding, IEEE Transactions on Signal Processing, vol. 51, no. 4, pp. 1003-1019, Apr. 2003.
[18] L. Pérez-Freire, F. Pérez-González, and S. Voloshynovskiy, An accurate analysis of scalar quantization-based data hiding, IEEE Transactions on Information Forensics and Security, vol. 1, no. 1, pp. 80-86, Mar. 2006.
[19] S. Wolf and M. Pinson, Video quality measurement techniques, NTIA Report 02-392, June 2002.