PROCEEDINGS
DMS 2005 The 11th International Conference on Distributed Multimedia Systems
Sponsored by Knowledge Systems Institute, USA
Technical Program September 5-7, 2005 Fairmont Banff Springs Hotel, Banff, Alberta, Canada
Organized by Knowledge Systems Institute
Copyright © 2005 by Knowledge Systems Institute Graduate School
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written consent of the publisher.
ISBN 1-891706-17-9 (paper)
Additional Copies can be ordered from: Knowledge Systems Institute Graduate School 3420 Main Street Skokie, IL 60076 USA Tel:+1-847-679-3135 Fax:+1-847-679-3166 Email:
[email protected] http://www.ksi.edu
Printed in the United States of America
Organizers & Committee

Conference Chair
Shi-Kuo Chang, University of Pittsburgh, USA

Program Committee Co-Chairs
T. Arndt, Cleveland State University, USA
A. Guercio, Kent State University, USA

Program Committee
Arvind Bansal, Kent State University, USA
Athena Vakali, Aristotle University, Greece
Athula Ginige, University of Western Sydney, Australia
Augusto Celentano, University Ca' Foscari of Venice, Italy
Carsten Griwodz, University of Oslo, Norway
Chien-Tsai Liu, Taipei Medical College, Taiwan
David H. C. Du, Univ. of Minnesota, USA
Fadi P. Deek, New Jersey Institute of Technology, USA
Filomena Ferrucci, Univ. of Salerno, Italy
Fuhua Lin, Athabasca University, Canada
Genny Tortora, University of Salerno, Italy
Han-Chieh Chao, National Dong Hwa University, Taiwan
Ing-Ray Chen, Virginia Tech (VPI&SU), USA
James Kwok, Hong Kong University of Science and Technology, Hong Kong
Jean-Luc Dugelay, Institute EURECOM, France
Jonathan Liu, University of Florida, USA
Joseph E. Urban, Arizona State Univ., USA
Kai H. Chang, Auburn University, USA
Kang Zhang, The University of Texas at Dallas, USA
Larbi Esmahi, National Research Council of Canada, Canada
Makoto Takizawa, Tokyo Denki University, Japan
Mark Liao, Academia Sinica, Taiwan
Masahito Hirakawa, Shimane University, Japan
Maurizio Tucci, University of Salerno, Italy
Meng Chang Chen, Academia Sinica, Taiwan
Ming Ouhyoung, National Taiwan University, Taiwan
Mohamed Ally, Athabasca University, Canada
Nadia Berthouze, University of Aizu, Japan
Nikolay Mirenkov, The University of Aizu, Japan
Paolo Maresca, Univ. of Napoli, Italy
Peter Douglas Holt, Athabasca University, Canada
Rentaro Yoshioka, University of Aizu
Shih-Fu Chang, Columbia University, USA
Shi-Nine Yang, National Tsing Hua University, Taiwan
Shu-Ching Chen, Florida International University, USA
Son T. Vuong, Univ. of British Columbia, Canada
Steven L. Tanimoto, Univ. of Washington, USA
Stuart Goose, Siemens Corporate Research, USA
Syed M. Rahman, Minnesota State University, USA
Timothy K. Shih, Tamkang University, Taiwan
Vincent Oria, New Jersey Institute of Technology, USA
Wen-Syan Li, CCRL, NEC USA Inc.
William Grosky, University of Michigan - Dearborn, USA
Yonghee Choi, Seoul National Univ., Korea
Yoshitaka Shibata, Iwate Prefectural University, Japan
Zied Choukair, ENST Bretagne, France

Proceedings Cover Design
Gabriel Smith, Knowledge Systems Institute Graduate School, USA

Conference Secretariat
Judy Pan, Chair, Knowledge Systems Institute Graduate School, USA
Tony Gong, Knowledge Systems Institute Graduate School, USA
C. C. Huang, Knowledge Systems Institute Graduate School, USA
Rex Lee, Knowledge Systems Institute Graduate School, USA
Daniel Li, Knowledge Systems Institute Graduate School, USA
Table of Contents

Foreword
Conference Organization

Keynote
  Intelligent Decision Support for the Design of Distributed Multi-Media Systems
    Guenther Ruhe
  Digital Inpainting
    Timothy Shih

DMS Papers

Multimedia Content Analysis, Retrieval and Watermarking
  Adaptive Background Evaluation for Foreground Detection with Gaussian Distribution, a Fast Approach
    Gianluca Bailo, Paivi Ijas, Marco Raggio, Fabio Sguanci
  Composite Algorithms in Image Content Searches
    J. R. Parker and Brad Behm
  Rotation and Scale Invariant Color Image Retrieval Using Fuzzy Clustering
    Shan Li, M. C. Lee
  An Image Watermarking Procedure Based on XML Documents
    Franco Frattolillo, Salvatore D'Onofrio
  Low Complexity Scrambling Scheme for Compressed Audio Based on Human Auditory Perception Characteristics (S)
    Koichi Takagi, Shigeyuki Sakazawa, Yasuhiro Takishima

Visual and Multidimensional Languages for Multimedia Applications
  Synchronization in Multimedia Languages for Distributed Systems
    A. Guercio, A. Bansal, T. Arndt
  Uncertain topological relations for mobile point objects in terrain (S)
    Karin Silvervarg, Erland Jungert
  Visual Languages for Non Expert Instructional Designers: A Usability Study
    Gennaro Costagliola, Andrea De Lucia, Filomena Ferrucci, and Giuseppe Scanniello
  Towards a Multimedia Ontology System: an Approach Using TAO_XML
    Massimiliano Albanese, Paolo Maresca, Antonio Picariello and Antonio Maria Rinaldi
  The Image Stack Stream Model, Querying and Architecture
    Alfonso F. Cardenas, Raymond K. Pon, Bassam S. Islam

Invited Session on E-learning
  SSRI online First experiences in a three-years course degree offered in e-learning at the University of Milan (Italy)
    Ernesto Damiani, Antonella Esposito, Maurizio Mariotti, Pierangela Samarati, Daniela Scaccia, Nello Scarabottolo
  A Web-Based Architecture for Tracking Multimedia using SCORM
    P. Casillo, C. Cesalano, A. Chianese, V. Moscato
  A GQM Based E-Learning Platform Evaluation
    B. Fabini, P. Maresca, P. Prinetto, C. Sanghez, G. Santiano
  Models of pragmatics of man-machine interaction (Perspectives and problems in the Elementary Pragmatic Model experimentation)
    Sabina Bordiga, Luigi Colazzo, Francesco Magagnino, Daniela Malinverni, Andrea Molinari
  Ontologies for E-Learning
    F. Colace, M. De Santo

E-commerce, E-education and E-entertainment
  Time Management for Senior Citizen Care (S)
    Shi-Kuo Chang, Suresh Rangan and Yue Zhang
  Integrating e-Business and e-Learning Processes
    Andrea De Lucia, Rita Francese, Giuseppe Scanniello, Genoveffa Tortora
  A Personalized E-Learning System Based on User Profile Constructed Using Information Fusion
    Xin Li and Shi-Kuo Chang
  An Application of XML on Network Data Model for Data Conversion (S)
    J. K. Chen, C. J. Liu
  Protection of Virtual Property in Online Gaming
    Ronggong Song, Larry Korba, George Yee, and Ying-Chieh Chen

Web Servers and Services
  The Research and Implementation of Semantic Based RDF Tagging and Webpage Searching Web Service (S)
    Jason C. Hung, Schummi Yang, Mao-Shuen Chiu and Timothy K. Shih
  A Planning Approach to Media Adaptation within the Semantic Web
    Matthias De Geyter, Peter Soetens
  A Component based Multimedia Middleware for Content Production Factory (S)
    T. Martini, P. Nesi, D. Rogai, A. Vallotti
  Cost Estimation Modeling Techniques for Web Applications: An Empirical Study
    G. Costagliola, S. Di Martino, F. Ferrucci, C. Gravino, G. Tortora, G. Vititello
  ImagePickup: A Web Based Hybrid Image Search Engine
    Abolfazl Lakdashti, Mohammad Shahram Moin

Mobile Networks, Mobile Computing and Mobile Agents
  Using Agent Technology to Improve the Quality of Artificial Intelligence Instruction (S)
    Victor R. L. Shen
  Mobility in File Sharing
    Hojjat Sheikhattar, Abolfazl Haghighat, Neda Noroozi
  A binary SOAP-based messaging infrastructure for mobile multimedia sessions
    Bjorn Muylaert, Stijn Decneut, Matthias De Geyter
  HyperGuide: a context-aware semantically interoperable multimedia application for the fruition of cultural heritage
    Giorgio Ventre, Francesco Gragnani, Vincenzo Masucci, Claudio Nardi, Vincenzo Orabona

Multimedia HCI
  A 3D Interaction Metaphor for Remote Control of Smart Home Systems
    Gennaro Costagliola, Sergio Di Martino, Filomena Ferrucci, Genoveffa Tortora
  Rapidly Prototyping Multimedia Groupware
    Michael Boyle and Saul Greenberg
  Modelling complex user experiences in distributed interaction environments
    Fabio Pittarello and Daniela Fogli
  A Robotic Interface for Retrieval of Distributed Multimedia Content
    Ellen Lau, Ehud Sharlin, Ariel Shamir

Feature-Based Human Recognition and Tracking
  Handling the Occlutions in Fractal Face Recognition and Retrieval
    Michele Nappi, Daniel Riccio, Genny Tortora
  Beard Tolerant Face Recognition Based on 3D Geometry and Color Texture
    Andrea F. Abate, Stefano Ricciardi, Gabriele Sabatino, Maurizio Tucci
  Play With Video: An Integration of Human Motion Tracking and Interactive Video
    Timothy K. Shih, Hui-Huang Hsu, Chia-Ton Tan and Louis H. Lin
  A Design for Speaker Determination in Video Conferencing: An Application of Speaker Recognition
    L. Bateman, J. R. Parker

OS Support for Distributed Multimedia Systems
  pcVOD: Internet Peer-to-Peer Video-On-Demand with Storage Caching on Peers
    Lihang Ying, Anup Basu
  Disk Performance and VBR Admission Control for Media Servers (S)
    Dwight Makaroff, Jason Coutu, and Fujian Liu
  Assessment of Data Path Implementations for Download and Streaming
    Pal Halvorsen, Tom Anders Dalseng, Carsten Griwodz
  Application of Heuristic MMKP in Admission Control and QoS Adaptation for Distributed Video on Demand Service
    Md. Shamsul Alam, S. M. Kamrul Hasan, A. S. M. Sohail, Mahmudul Hasan, Boshir Ahmed
  Extensions of Ethernet for Multimedia Transmission
    Donald Molaro, J. R. Parker

Multimedia Communications and Network Architectures
  Adaptive Joint Source-Channel Coding for Real-Time Video Applications over Wireless IP Networks
    Qi Qu, Yong Pei, and James W. Modestino
  An Asynchronous Multi-source Streaming Protocol for Scalable and Reliable Multimedia Communication
    Satoshi Itaya, Naohiro Hayashibara, Tomoya Enokido, and Makoto Takizawa
  SMIP: Striping Multimedia Communication Protocol for Large Scale Hierarchical Group
    Yasutaka Nishimura, Naohiro Hayashibara, Tomoya Enokido, and Makoto Takizawa
  Modeling and Analysis of Multipath Video Transport over Lossy Networks Using Packet-Level FEC
    Xunqi Yu, James W. Modestino, and Ivan V. Bajic
  Dynamic Media Routing in Multi-User Home Entertainment Systems
    Marco Lohse, Michael Repplinger, and Philipp Slusallek

VLC Papers
  Tableaux for Diagrammatic Reasoning
    Octavian Patrascoiu, Simon Thompson, and Peter Rodgers
  A New Language for the Visualization of Logic and Reasoning
    Gem Stapleton, Simon Thompson, Andrew Fish, John Howse, John Taylor
  Transfer of Problem-Solving Strategy Using the Cognitive Visual Language
    Jim Davies, Ashok K. Goel, Nancy J. Nersessian
  Global and Vector Operations in a Rule-Based Visual Language (S)
    Joseph J. Pfeiffer, Jr.
  Syntax-directed Program Visualization
    Yoshihiro Adachi
  Staying Oriented with Software Terrain Maps
    Robert DeLine
  From the Concrete to the Abstract: Visual Representations of Program Execution
    Steven P. Reiss and Guy Eddon

An Image Watermarking Procedure Based on XML Documents
Franco Frattolillo, Salvatore D’Onofrio
Research Centre on Software Technology, Department of Engineering, University of Sannio, Benevento, Italy

Abstract
This paper presents a watermarking procedure for JPEG images based on the use of protected XML documents. The procedure enables the copyright owner to insert a distinct watermark code identifying the buyer within the distributed images. Furthermore, to increase the security level of the procedure, the watermark is repeatedly embedded into an image in the DCT domain at different frequencies and by exploiting both block classification techniques and perceptual analysis. The embedded watermark is then extracted from an image according to the information contained in a protected XML document that is associated with the image.

1. Introduction and Motivations
Digital watermarking can be considered one of the main technologies for implementing the copyright protection of digital content distributed on the Internet [1]. To this end, many watermarking procedures adopt “blind” insertion schemes and are based on fingerprinting techniques that enable the copyright owner to insert specific “anticollusion codes”, able to identify the buyer, within any copy of the content that is distributed [12]. The main aim is to make it possible to establish whether a user is illegally in possession of some content, as well as who initially bought it and then illegally shared it via, for example, peer-to-peer network applications [1]. However, “nonblind” watermarking schemes are typically considered more robust than blind ones [8, 12]. Unfortunately, unlike blind schemes, nonblind schemes need the original digital content in order to run the watermark extraction algorithms on the corresponding pirated copies. This is considered a drawback particularly for watermarking procedures intended for a web context, because such procedures force the distinct web entities involved in “identification and arbitration” protocols [8] to exchange unprotected, large digital content through the insecure communication channels that characterize the Internet [6].

This paper presents a web-oriented watermarking procedure for JPEG images based on the use of protected XML documents [5]. The procedure enables the copyright owner to insert a distinct code identifying the buyer within each copy of the distributed images. Furthermore, to increase the security and robustness level of the procedure, the watermark is repeatedly embedded into an image in the DCT domain at different frequencies and by exploiting both block classification techniques and perceptual analysis. The embedded watermark is then extracted from an image according to the information contained in a protected XML document that is associated with the image. Thus, the usual security and robustness levels that characterize nonblind watermarking schemes can be achieved without requiring unprotected, large images to be exchanged over the Internet whenever the watermark extraction has to be performed. In fact, keeping XML documents protected, or securely exchanging them in a web context, is nowadays much easier than securely transferring large images or having to manage their secure storage at distinct web entities. Moreover, using XML technology also makes it easier to automate document access in a web context, since XML is a standard technology well supported by the Java world, and standard document parsers, such as SAX and DOM parsers, are freely available.

The paper is organized as follows. Section 2 describes the proposed watermarking procedure. Section 3 shows how the information contained in the XML documents associated with the protected images can be exploited in the watermark extraction process described in Section 4. Section 5 reports some experimental results. Section 6 reports concluding remarks.

2. The Watermarking Procedure
The proposed procedure makes it possible to insert into a JPEG image a binary code, represented by a sequence $\mu \in \{0, 1\}$, that unambiguously identifies a user. The sequence $\mu$, whose length is denoted as $n_\mu$, is repeatedly embedded into the image in the DCT domain at different frequencies, denoted as $\gamma_1, \gamma_2, \ldots, \gamma_f$. In particular, since the coefficients in each 8×8 DCT block of an image have a frequency value associated with them, a $\gamma$ value identifies an entry in such blocks, and so it can range from 1 to $8^2 = 64$. Furthermore, to increase the security and robustness level of the procedure, the watermark insertion is assumed to be carried out at low, middle and high frequencies chosen on the basis of the image to watermark.

In principle, all the DCT coefficients at a given frequency could be modified by a value representing watermark information. However, in the proposed procedure, the “perceptual capacity” of the coefficients belonging to the luminance DCT blocks is preliminarily estimated by exploiting both block classification techniques and perceptual analysis. In fact, the block classification techniques [3, 9] are applied to indicate the best DCT coefficients that can be altered without reducing the visual quality. They classify each luminance DCT block with respect to its energy distribution by using four classification masks. The possible types of classified blocks are “flat”, “diagonal edge”, “horizontal edge”, “vertical edge” and “textured block”. The result of this procedure is a first selection of DCT coefficients whose modification has minimal or no impact on the perceptual quality of the image.

The perceptual analysis is then applied to calculate the “just noticeable difference” (jnd) values for the DCT coefficients [7, 10, 11]. Such values are the thresholds beyond which any change to the respective coefficient will most likely be visible. Therefore, let $X_{b_m}(\gamma)$ denote the DCT coefficient at the frequency $\gamma$ in the block $b_m$, and let $JND_{b_m}(\gamma)$ denote the jnd value calculated for the $X_{b_m}(\gamma)$ coefficient. $JND_{b_m}(\gamma)$ can be calculated as:

\[ JND_{b_m}(\gamma) \approx \max\left\{ C_{b_m}(\gamma),\ |C_{b_m}(\gamma)|\, E_{b_m}(\gamma)^{g} \right\} \tag{1} \]
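As a rough illustration, expression (1) can be sketched in Python together with the contrast-masking threshold (2), the entropy approximation (3) and the parameter values that the paper gives for them ($g = 0.5$, $h = 0.7$, $t(\gamma) \approx q(\gamma)/2$). The function name `jnd` and its argument names are hypothetical. Taking the magnitude of (3), so that the power $E^g$ stays real, is an assumption of this sketch and is not stated in the paper; the sketch also assumes positive DC coefficients.

```python
def jnd(x: float, x_dc_block: float, x_dc_mean: float, q: float,
        g: float = 0.5, h: float = 0.7) -> float:
    """Approximate jnd value of one DCT coefficient, after expressions (1)-(3).

    x          : coefficient X_bm(gamma)
    x_dc_block : DC coefficient X_bm(1) of the block (assumed positive)
    x_dc_mean  : DC coefficient X(1) for the mean luminance (assumed positive)
    q          : quantization-matrix entry q(gamma)
    """
    # t_bm(gamma) = t(gamma) * X_bm(1) / X(1), with t(gamma) approximated by q/2
    t = (q / 2.0) * (x_dc_block / x_dc_mean)
    # Contrast-masking threshold, expression (2)
    c = max(t, abs(x) ** h * t ** (1.0 - h))
    # Entropy approximation, expression (3); abs() is an assumption of this sketch.
    # Note that Python's round() rounds exact halves to the nearest even integer.
    e = abs(x - round(x / q) * q)
    # Expression (1)
    return max(c, abs(c) * e ** g)


# A zero coefficient in a block of average brightness gets the base threshold q/2.
print(jnd(0.0, 1024.0, 1024.0, 16.0))
```

With these definitions the threshold never drops below $t_{b_m}(\gamma)$, and it grows with both the block brightness and the coefficient magnitude.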
perceptual analysis. In particular, the “choice rule” states that two DCT coefficients are allowed to belong to the same pair only if they have similar values. Consequently, if $\mu$ is $n_\mu$ bits long, the process that selects the DCT coefficients at the frequency $\gamma$ has to choose at least $2 \cdot n_\mu$ coefficients. Moreover, if $f$ insertion frequencies are chosen, the total number of DCT coefficients to select is $2 \cdot n_\mu \cdot f$.

To insert the bits of a user sequence $\mu$ into an image, the “encoding function” $K$ has to be defined within the watermarking procedure. $K$ defines an encoding rule by which the bits 0 and 1 are translated to the symbols belonging to the alphabet $\{\nearrow, \searrow\}$, respectively called the up symbol and the down symbol. Thus, a user sequence $\mu \in \{0, 1\}$ is translated to a corresponding sequence of symbols $\sigma \in \{\nearrow, \searrow\}$ depending on the function $K$. For example, the user sequence $\{01101 \ldots\}$ is translated to the sequence of symbols $\{\nearrow\searrow\searrow\nearrow\searrow \ldots\}$ if the function $K$ associates the up symbol to 0 and the down symbol to 1.

Let $\mu$ be a user sequence, and let $\sigma$ be the corresponding sequence of symbols obtained by applying a $K$ function. Let $\gamma_1, \gamma_2, \ldots, \gamma_f$ be the insertion frequencies. Let $W_{b_m}(\gamma_i)$ denote the watermarked DCT coefficient at the frequency $\gamma_i$ in the block $b_m$. A symbol of $\sigma$ is inserted into a pair of DCT coefficients belonging to the blocks $b_m$ and $b_n$, at the frequency $\gamma_i$, by the following expressions:

\[ \text{to insert } \nearrow: \quad \begin{cases} W_{b_m}(\gamma_i) = X_{b_m}(\gamma_i) - JND_{b_m}(\gamma_i) \\ W_{b_n}(\gamma_i) = X_{b_n}(\gamma_i) + JND_{b_n}(\gamma_i) \end{cases} \]

\[ \text{to insert } \searrow: \quad \begin{cases} W_{b_m}(\gamma_i) = X_{b_m}(\gamma_i) + JND_{b_m}(\gamma_i) \\ W_{b_n}(\gamma_i) = X_{b_n}(\gamma_i) - JND_{b_n}(\gamma_i) \end{cases} \]

In fact, since the “choice rule” imposes that $X_{b_m}(\gamma_i) \approx X_{b_n}(\gamma_i)$ for each selected pair of DCT coefficients, the insertion process attempts to maximize the difference between the coefficients of the pair, in the direction specified by the insertion symbol and by an amount that should not compromise the final visual quality of the image.
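The encoding function and the pair-wise insertion described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `encode`, `embed_symbol`, `UP` and `DOWN` are hypothetical, and the coefficient and jnd values are made-up numbers chosen only so that each pair satisfies the choice rule.

```python
from typing import Dict, List, Tuple

UP, DOWN = "up", "down"  # stand-ins for the up symbol and the down symbol


def encode(bits: List[int], k: Dict[int, str]) -> List[str]:
    """Translate a user sequence into symbols through the encoding function K."""
    return [k[b] for b in bits]


def embed_symbol(x_m: float, x_n: float, jnd_m: float, jnd_n: float,
                 symbol: str) -> Tuple[float, float]:
    """Move the two coefficients of a pair apart by their jnd values."""
    if symbol == UP:
        return x_m - jnd_m, x_n + jnd_n
    return x_m + jnd_m, x_n - jnd_n


K = {0: UP, 1: DOWN}                  # the association used in the text's example
symbols = encode([0, 1, 1, 0, 1], K)  # up, down, down, up, down

# Hypothetical pairs (X_bm, X_bn, JND_bm, JND_bn) with similar coefficient values.
pairs = [(10.0, 10.5, 1.0, 1.25), (-4.0, -4.25, 0.5, 0.5), (7.0, 7.0, 0.75, 0.75),
         (3.0, 3.0, 0.5, 0.5), (-9.0, -8.75, 1.0, 1.0)]
watermarked = [embed_symbol(*p, s) for p, s in zip(pairs, symbols)]
```

After the insertion, the sign of the difference between the two watermarked coefficients of a pair carries the symbol: negative for the up symbol, positive for the down symbol.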
Therefore, the insertion process should be carried out according to the following rules:
where $C_{b_m}(\gamma)$ represents the perceptual threshold of the contrast masking and is expressed as:

\[ C_{b_m}(\gamma) = \max\left\{ t_{b_m}(\gamma),\ |X_{b_m}(\gamma)|^{h}\, t_{b_m}(\gamma)^{1-h} \right\} \tag{2} \]

$E_{b_m}(\gamma)$ is the entropy value calculated over the eight neighbors of the $X_{b_m}(\gamma)$ coefficient [7, 10] and can be approximated by the following expression:

\[ E_{b_m}(\gamma) \approx X_{b_m}(\gamma) - u_{b_m}(\gamma)\, q(\gamma) \tag{3} \]

1. The insertion frequencies should be evenly distributed among the low, middle and high frequencies, and should be chosen so that attacks characterized by a filtering behavior on an image would end up reducing its final visual quality. This can be achieved by selecting the frequencies characterized by high spectrum values, which, if filtered, can impair the image.
In (1), $g$ is assumed equal to 0.5, while in (2) $h$ is assumed equal to 0.7 and $t_{b_m}(\gamma)$ is equal to $t(\gamma)\,(X_{b_m}(1)/X(1))$, where $X(1)$ is a DC coefficient corresponding to the mean luminance of the display, while $X_{b_m}(1)$ is the DC coefficient of the block $b_m$. In fact, $t(\gamma)$ can be approximated by the value $q(\gamma)/2$, where $q(\gamma)$ represents the coefficient of the quantization matrix corresponding to the frequency $\gamma$ [10]. Finally, in (3) $u_{b_m}(\gamma)$ is equal to $\mathrm{round}(X_{b_m}(\gamma)/q(\gamma))$. The insertion procedure at a given frequency $\gamma$ assumes that each bit of the user sequence $\mu$ is inserted into a given image by altering a pair of DCT coefficients associated with the frequency $\gamma$ and chosen among the ones previously selected by applying the block classification techniques and
2. At each insertion frequency, the pairs of the selected DCT coefficients should belong to spatial regions that cannot be cropped without impairing the image. Once the symbols of the sequence σ have been inserted into the image at the chosen frequencies, in order to increase the security and robustness level of the watermarking procedure against collusion and averaging attacks, it is
$\gamma_1, \gamma_2, \ldots, \gamma_f$; (2) the sets $\Sigma(\gamma_i)$, $\forall i = 1 \ldots f$; (3) the encoding function $K$; (4) some information about the original image, which can concisely characterize the image and can be exploited to identify possible geometric modifications performed on it. Thus, when a pirated image is found on the market, a trusted third party (TTP) delegated to run the identification and arbitration protocols can retrieve the XML document associated with the image from the image’s copyright owner in a protected and ciphered form. Then, the TTP can extract the code identifying the original buyer of the image from the pirated copy. This way, the image’s copyright owner and the TTP are the sole web entities that are allowed to access the XML document associated with the image. This means that the security level achieved by the proposed procedure closely depends on the ability of both the copyright owners and the TTPs to keep the XML documents protected [4].
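To make the four kinds of content concrete, the sketch below builds such a document with Python's standard `xml.etree.ElementTree` module. The paper does not give a schema, so every element and attribute name here (`watermark-info`, `frequency`, `pair`, `map`, `feature-point`, and so on) is a hypothetical choice, as are the sample values. Protecting or ciphering the document is outside the scope of the sketch.

```python
import xml.etree.ElementTree as ET


def build_descriptor(frequencies, pair_sets, k_map, size, feature_points):
    """Serialize the data needed for extraction into an XML string.

    frequencies    : list of insertion frequencies gamma_i
    pair_sets      : dict gamma_i -> list of (b_m, b_n) block pairs, i.e. Sigma(gamma_i)
    k_map          : dict bit -> symbol name, i.e. the encoding function K
    size           : (dx, dy) dimensions of the original image
    feature_points : dict name -> (x, y)
    """
    root = ET.Element("watermark-info")          # hypothetical element names throughout
    freqs = ET.SubElement(root, "frequencies")
    for gamma in frequencies:
        f_el = ET.SubElement(freqs, "frequency", value=str(gamma))
        for b_m, b_n in pair_sets[gamma]:
            ET.SubElement(f_el, "pair", bm=str(b_m), bn=str(b_n))
    enc = ET.SubElement(root, "encoding")
    for bit, symbol in k_map.items():
        ET.SubElement(enc, "map", bit=str(bit), symbol=symbol)
    image = ET.SubElement(root, "image", dx=str(size[0]), dy=str(size[1]))
    for name, (x, y) in feature_points.items():
        ET.SubElement(image, "feature-point", name=name, x=str(x), y=str(y))
    return ET.tostring(root, encoding="unicode")


doc = build_descriptor(
    frequencies=[3, 20],
    pair_sets={3: [(0, 5)], 20: [(1, 7), (2, 9)]},
    k_map={0: "up", 1: "down"},
    size=(512, 512),
    feature_points={"left-eye": (250, 260)},     # made-up coordinates
)
parsed = ET.fromstring(doc)
```

Because the descriptor is plain XML, the party running the extraction can read it back with any standard parser, which is the automation benefit the paper attributes to using XML.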
necessary to hide the modifications made to the DCT coefficients of the image. In fact, let $\gamma_1, \gamma_2, \ldots, \gamma_f$ be the insertion frequencies chosen for the image, and let $\Sigma(\gamma_i)$ denote the sequences of the pairs of DCT entries $(b_m(\gamma_i), b_n(\gamma_i))$ that have been involved in the watermarking process for a given frequency $\gamma_i$, $\forall i = 1 \ldots f$. It is worth noting that both the set of the frequencies $\gamma_i$ and the sets $\Sigma(\gamma_i)$ are always the same for all the copies of a given image to protect. Consequently, the DCT coefficients modified at the different insertion frequencies remain the same for all the copies of the image. Therefore, in order to prevent malicious users from identifying the DCT coefficients modified by the insertion process, the jnd values modulated by a binary pseudo-noise sequence $\rho \in \{-1, 1\}$ have to be added to all the unmodified DCT coefficients of a watermarked image. This addition is carried out by the following expression:

\[ X_{b_k}(\gamma_i) = X_{b_k}(\gamma_i) + \alpha_k\, \rho_k\, JND_{b_k}(\gamma_i), \quad (i \neq 1 \ldots f) \ \text{or}\ (i = 1 \ldots f \ \text{and}\ b_k \notin \Sigma(\gamma_i)) \]
4. The Watermark Extraction
where $0 < \alpha_k < 0.5$ is a randomly varied amplitude factor.
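The hiding step can be sketched as below. This is an assumption-laden illustration: the function name `hide_modifications`, the dictionary layout keyed by `(block, frequency)` and the use of Python's `random` module as the pseudo-noise source are all choices made here, not details taken from the paper.

```python
import random
from typing import Dict, Set, Tuple

Entry = Tuple[int, int]  # (block index, frequency index); a hypothetical layout


def hide_modifications(coeffs: Dict[Entry, float], jnd: Dict[Entry, float],
                       watermarked: Set[Entry], seed: int = 0) -> Dict[Entry, float]:
    """Add jnd-scaled pseudo-noise to every coefficient that carries no symbol."""
    rng = random.Random(seed)
    out = dict(coeffs)
    for entry in coeffs:
        if entry in watermarked:
            continue                      # entries used by the insertion stay untouched
        alpha = 0.0
        while not 0.0 < alpha < 0.5:      # amplitude factor in the open interval (0, 0.5)
            alpha = rng.uniform(0.0, 0.5)
        rho = rng.choice((-1, 1))         # binary pseudo-noise value
        out[entry] = coeffs[entry] + alpha * rho * jnd[entry]
    return out


coeffs = {(0, 3): 10.0, (1, 3): -4.0, (2, 3): 7.0}   # made-up values
jnds = {(0, 3): 1.0, (1, 3): 2.0, (2, 3): 1.0}
hidden = hide_modifications(coeffs, jnds, watermarked={(0, 3)}, seed=1)
```

Since the perturbation stays below half of the jnd value, it should remain invisible while making the coefficients that host the watermark harder to tell apart by comparing copies.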
The first operation to perform before carrying out the watermark extraction from a given protected image is its geometric re-synchronization. Therefore, let Figures 1(a) and 1(b) be the watermarked and the attacked versions of the “Lena” image. To carry out the geometric re-synchronization, it is necessary to exploit the information stored in the XML document associated with Lena in order to build a reference picture (Figure 1(e)) whose dimensions coincide with those of the original image (Figure 1(a)). Then, the feature points connected by segments and specified by the XML document are to be reported on the picture (Figure 1(e)). These points have been originally determined on the watermarked Lena (Figure 1(c)) and are: the eyes of Lena, whose coordinates are $(x_2, y_2)$ and $(x_3, y_3)$; the tip of her hat, specified by $(x_4, y_4)$; the right end of her mouth, specified by $(x_1, y_1)$. The coordinates are referred to the X and Y axes, and the dimensions of the watermarked Lena are respectively $d_x$ and $d_y$ (Figures 1(c) and 1(e)). The next operation consists in reporting the feature points connected by segments on the attacked version of Lena (Figure 1(d)), which is a scaled and 45° rotated version of Lena (Figure 1(b)). To this end, it is worth noting that the feature points can be reported on the image solely starting from the textual description provided by the XML document associated with the image (Figures 1(d) and 1(f)). This entails a natural approximation in locating the feature points on the attacked image, which can cause errors in the geometric re-synchronization process involving Figures 1(e) and 1(f). However, the preliminary tests, conducted also on other images available on the web, have shown that the proposed procedure is robust with respect to such approximations. In fact, the procedure has been able to ensure a correct watermark extraction provided that
3. The XML Documents
The capability of both repeatedly embedding a user code at different frequencies and hiding the watermarked DCT coefficients can make the proposed procedure almost secure against the most common filtering, corrupting, removal, averaging and collusion attacks. However, the characteristics of the insertion process could make the procedure vulnerable to geometric attacks. Therefore, to increase the robustness level of the procedure against such attacks, the attacked images should be geometrically re-synchronized before carrying out the watermark extraction. To detect the most common geometric distortions applied to a watermarked image without having to use complex re-synchronization techniques, the proposed procedure makes use of some information about the image, which is assumed to be stored in a protected XML document associated with the image. This allows the information to be stored in both textual and quantitative form. The textual information can identify and describe some evident and significant “feature points” and boundary segments of the image. The quantitative information can provide the original dimensions of the image, the coordinates of the feature points and selected boundaries, some Fourier descriptors and statistical moments of K-point digital boundaries, as well as the eigenvectors and eigenvalues of some well-defined regions of the image. Thus, inverse geometric transformations can be performed on the image in order to restore it before the watermark extraction [9]. Therefore, the XML document associated with each image to protect has to include: (1) the insertion frequencies
Figure 1. The geometric re-synchronization of “Lena”. Panels (a) to (f); the panels drawn on the X and Y axes mark the feature points (x1, y1), (x2, y2), (x3, y3), (x4, y4) and the image dimensions dx and dy.

Figure 2. The watermarked “Lena” and its worst re-synchronized version.
the rotation degrees and scale factors are determined with approximations in the range of ±6%. However, these limits have never been exceeded in the practical tests conducted. To this end, Figure 2 shows the result of the worst re-synchronization performed on the scaled and 45° rotated version of the watermarked Lena, which nevertheless did not prevent a correct watermark extraction, as reported in Section 5. In fact, the imperfect re-building of Lena essentially affects the outer regions of the image, i.e. the regions that do not, and should not, host watermark information.

After the geometric re-synchronization of the attacked image, the watermark extraction can be carried out. In particular, for each insertion frequency $\gamma_i$, the pairs of coefficients specified by the DCT entries $(b_m(\gamma_i), b_n(\gamma_i))$ belonging to the set $\Sigma(\gamma_i)$, with $i = 1 \ldots f$, have to be examined. Therefore, let $\hat{W}_{b_m}(\gamma_i)$ and $\hat{W}_{b_n}(\gamma_i)$ be the two coefficients of a pair belonging to $\Sigma(\gamma_i)$. To extract the watermark symbol they host, the following expression has to be calculated:

\[ \begin{cases} \hat{W}_{b_m}(\gamma_i) - \hat{W}_{b_n}(\gamma_i) > 0 \implies \searrow \text{ (the down symbol) is extracted} \\ \hat{W}_{b_m}(\gamma_i) - \hat{W}_{b_n}(\gamma_i) < 0 \implies \nearrow \text{ (the up symbol) is extracted} \end{cases} \tag{4} \]
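Expression (4), followed by the translation of each symbol back to a bit through the encoding function $K$, can be sketched as follows. The names `extract_symbol` and `extract_bits` are hypothetical, and returning `None` for a zero difference is an assumption of this sketch, since (4) does not cover that case.

```python
from typing import Dict, List, Optional, Tuple

UP, DOWN = "up", "down"  # stand-ins for the up symbol and the down symbol


def extract_symbol(w_m: float, w_n: float) -> Optional[str]:
    """Apply expression (4) to one pair of (possibly attacked) coefficients."""
    diff = w_m - w_n
    if diff > 0:
        return DOWN
    if diff < 0:
        return UP
    return None  # a tie is not covered by (4); flagged here as undecided


def extract_bits(pairs: List[Tuple[float, float]],
                 k: Dict[int, str]) -> List[Optional[int]]:
    """Recover one user sequence from the pairs listed in Sigma(gamma_i)."""
    inverse = {symbol: bit for bit, symbol in k.items()}
    return [inverse.get(extract_symbol(w_m, w_n)) for w_m, w_n in pairs]


K = {0: UP, 1: DOWN}
recovered = extract_bits([(9.0, 11.75), (-3.5, -4.75), (2.0, 2.0)], K)
```

Only the sign of the difference matters, which is why moderate distortions of both coefficients of a pair can leave the extracted symbol unchanged.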
Attack          f    ber (%)   PSNR (dB)
JPEG 60         3    3.78      31.15
JPEG 60         6    3.61      29.96
JPEG 60         9    3.13      27.71
JPEG 40         3    5.77      26.51
JPEG 40         6    4.79      25.37
JPEG 40         9    4.53      24.92
Add Noise 5     3    1.95      25.79
Add Noise 5     6    1.67      23.44
Add Noise 5     9    1.2       22.13
Sharpening      3    1.86      33.33
Sharpening      6    1.47      31.02
Sharpening      9    1.35      30.92
Median 3        3    1.67      30.79
Median 3        6    1.34      30.12
Median 3        9    1.13      29.89
Median 9        3    1.99      24.33
Median 9        6    1.85      24.01
Median 9        9    1.63      23.88
Rotating 45°    3    3.85      23.45
Rotating 45°    6    3.12      22.91
Rotating 45°    9    2.61      22.7
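Table 1. The results of some Stirmark attacks.

Let $\mu(j)$ denote the j-th bit of the user sequence $\mu$, and let $\hat{\mu}_i$ be the sequence extracted at the insertion frequency $\gamma_i$. Then $\mu(j)$ can be derived from the sequences $\hat{\mu}_i$ by the expression:

\[
\mu(j) \equiv 1 \iff \frac{\sum_{i=1}^{f} \hat{\mu}_i(j)}{f} > 0.5, \qquad \forall j = 1 \ldots n_\mu
\]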
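The expression above is a bitwise majority vote over the f extracted sequences, which can be sketched as follows. The function name and the list-of-lists representation are assumptions made for the example; note that, with the strict inequality used in the expression, a tie (possible when f is even) resolves to 0.

```python
from typing import List, Sequence


def rebuild_user_sequence(extracted: Sequence[Sequence[int]]) -> List[int]:
    """Rebuild the user sequence by majority vote over the f extracted sequences."""
    if not extracted:
        raise ValueError("at least one extracted sequence is required")
    f = len(extracted)
    n_mu = len(extracted[0])
    if any(len(seq) != n_mu for seq in extracted):
        raise ValueError("all extracted sequences must have the same length")
    # Bit j is 1 only if strictly more than half of the sequences report a 1.
    return [1 if sum(seq[j] for seq in extracted) / f > 0.5 else 0
            for j in range(n_mu)]
```

With f = 3, a single bit error at a given position in one of the extracted sequences is outvoted by the other two, which is the kind of redundancy the repeated embedding is meant to provide.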
5. Experimental Results
To evaluate the security and robustness of the proposed watermarking procedure, some relevant attacks have been performed on many different images using the widely accepted tool “Stirmark”. For the sake of brevity, only the results obtained on Lena are reported. Since Stirmark does not perform “collusion attacks” [12], such attacks have been purposely implemented. Figure 3 shows the original Lena, whose size is 512×512 pixels, and its watermarked version. The PSNR calculated on these two images is 47.25 dB, where the PSNR is defined as $10 \log_{10}(255^2/\mathrm{MSE})$ and MSE is the “mean squared error” computed on Lena and its watermarked version. The user sequence $\mu$ is 128 bits long.
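The PSNR definition given above can be checked with a short sketch. It assumes 8-bit grayscale images of equal size represented as nested lists of pixel values; the function names are hypothetical and a real implementation would more likely operate on arrays.

```python
import math
from typing import Sequence


def mean_squared_error(a: Sequence[Sequence[float]],
                       b: Sequence[Sequence[float]]) -> float:
    """Mean squared error between two images of the same size."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += (pixel_a - pixel_b) ** 2
            count += 1
    return total / count


def psnr(a: Sequence[Sequence[float]],
         b: Sequence[Sequence[float]],
         peak: float = 255.0) -> float:
    """PSNR in dB, computed as 10 * log10(peak^2 / MSE)."""
    mse = mean_squared_error(a, b)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher values mean the two images are closer; identical images give an infinite PSNR, which the sketch returns explicitly to avoid a division by zero.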
In the extraction step, each extracted symbol is translated to a bit according to the encoding function K. After the watermark extraction, f user sequences $\hat{\mu}_i$ are thus rebuilt, one for each insertion frequency $\gamma_i$.
Figure 3. “Lena” and its watermarked version.

5.1. Stirmark Attacks
Table 1 reports three main parameters for each attack: the number f of insertion frequencies used to watermark the image; the ber, defined as $(\sum_{i=1}^{f} \mathrm{ber}_i)/f$, where $\mathrm{ber}_i$ is the number of bit errors reported in the watermark extraction carried out at the frequency $\gamma_i$; and the PSNR calculated on the watermarked and attacked image. Figure 4 shows some attacked images of Lena: a sharpened image (4(a)), a “median” filtered (factor 9) image (4(b)), an image corrupted by additive noise (factor 5) (4(c)), and a JPEG re-encoded (quality factor 60) image (4(d)). The attacks are named according to the Stirmark definitions.

The results reported in Table 1 show that the proposed procedure performs well against attacks that are considered able to prove a high level of robustness, without imposing any strong constraint on the length of the codes identifying users. In particular, the ber values are always very low and have never prevented the user sequence $\mu$ from being correctly rebuilt from the single sequences $\hat{\mu}_i$ extracted for each test at the different insertion frequencies. In fact, the redundancy assured by the insertion process enables the procedure to behave like other well-known watermarking procedures. This is also because the procedure allows the insertion frequencies, as well as the regions of the image in which the watermark is embedded, to be chosen. Moreover, the tests in Table 1 show that the value of f does not strongly influence the final PSNR values. As a consequence, f can also be increased to improve the security level of the procedure without causing a further reduction of the final visual quality of the image.

Figure 4. Some attacked images of “Lena”.

5.2. Collusion Attacks
For user codes or fingerprints to identify “colluders”, who exploit differently watermarked versions of the same image to produce a new version with no detectable watermark, the watermark embedding method has to withstand collusion attacks [12]. Therefore, some tests based on “linear” and “nonlinear” collusion have been purposely implemented. In the former case, k differently watermarked copies of the same image are linearly combined by averaging them with equal weights to produce a colluded version of the image. In the latter case, an attacked image is created in which each DCT coefficient is the minimum, maximum, or median, respectively, of the corresponding coefficients of the k watermarked copies of the same image [12]. Table 2 summarizes the results obtained in the collusion tests. In particular, the codes used in the colluding copies have been generated according to [2]. Furthermore, in order to correctly assess the behavior of the procedure independently of the adoption of anticollusion codes, the following conditions have been considered as errors in the watermark extraction:
1. all the colluding copies present a bit 0 in the i-th position of the embedded codes and a bit 1 is extracted from the colluded copy;
2. all the colluding copies present a bit 1 in the i-th position of the embedded codes and a bit 0 is extracted from the colluded copy;
3. the colluding copies present both a bit 0 and a bit 1 in the i-th position of the embedded codes and a bit 0 or a bit 1 is extracted from the colluded copy.

In fact, if the conditions reported above occur in the watermark extraction, the watermarking procedure prevents the anticollusion codes from correctly catching colluders according to the capabilities documented in [2]. Therefore, to reduce the number of errors determined by the above conditions, particularly condition 3, the expressions reported in (4) have been modified as follows:
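\[
\begin{cases}
\hat{W}_{b_m}(\gamma_i) - \hat{W}_{b_n}(\gamma_i) > \mathit{th} \implies \searrow \text{ is extracted} \\
\hat{W}_{b_m}(\gamma_i) - \hat{W}_{b_n}(\gamma_i) < -\mathit{th} \implies \nearrow \text{ is extracted}
\end{cases}
\tag{5}
\]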
where th is a threshold calculated as

\[
\mathit{th} = \omega \left( \mathrm{JND}_{b_m}(\gamma_i) + \mathrm{JND}_{b_n}(\gamma_i) \right), \qquad 0.45 < \omega < 0.55
\]

and $\omega$ is a factor depending on the characteristics of the image. Thus, whenever the expressions in (5) do not allow an up or a down symbol to be extracted from the i-th position of the code retrieved from the colluded copy, it can be established that the colluding copies present both a bit 0 and a bit 1 in the i-th position of the embedded codes, and this enables the anticollusion codes to catch the colluders according to the capabilities described in [2].

The results reported in Table 2 show that the proposed procedure also performs well against some relevant collusion attacks. In fact, the ber values are low and always allow the embedded codes to identify colluders according to the catching capability documented in [2]. Moreover, the behavior of the procedure is rather independent of the number of insertion frequencies. Finally, the PSNR values have not been reported in Table 2, since they are all about 42 dB.

Attack       f    k     ber (%)
Linear attacks
Averaging    3    5     1.95
Averaging    9    5     1.89
Averaging    3    10    2.62
Averaging    9    10    2.54
Averaging    3    15    3.97
Averaging    9    15    3.02
Nonlinear attacks
Minimum      3    5     1.09
Minimum      9    5     0.94
Minimum      3    10    2.1
Minimum      9    10    1.83
Minimum      3    15    3.07
Minimum      9    15    2.84
Median       3    5     1.27
Median       9    5     1.12
Median       3    10    2.35
Median       9    10    2.06
Median       3    15    3.42
Median       9    15    3.08
Maximum      3    5     1.14
Maximum      9    5     1.06
Maximum      3    10    2.28
Maximum      9    10    2.05
Maximum      3    15    3.26
Maximum      9    15    3.03

Table 2. The results of some collusion attacks.

6. Conclusions
This paper presents a watermarking procedure that acts directly on compressed JPEG images and exploits XML documents to store the information needed for the watermark extraction. The redundancy assured by the insertion process enables the procedure to achieve a good performance against the most common and dangerous attacks. Moreover, the robustness of the procedure can be improved by increasing the number of insertion frequencies. Finally, the use of XML documents enables the security and robustness levels that usually characterize nonblind watermarking schemes to be achieved without requiring the unprotected original images to be exchanged over the Internet whenever the watermark extraction has to be performed. In fact, keeping XML documents protected, or securely exchanging them over the Internet, is nowadays fairly easy. Furthermore, XML technology is well supported by the Java world, and this makes the procedure well suited to adoption in a web context.

References
[1] M. Barni and F. Bartolini. Data hiding for fighting piracy. IEEE Signal Processing Magazine, 21(2):28–39, 2004.
[2] D. Boneh and J. Shaw. Collusion-secure fingerprinting for digital data. IEEE Trans. on Information Theory, 44(9):1897–1905, 1998.
[3] T.-Y. Chung, M.-S. Hong, et al. Digital watermarking for copyright protection of MPEG2 compressed video. IEEE Trans. on Consumer Electronics, 44(3):895–901, 1998.
[4] F. Frattolillo and S. D’Onofrio. Applying web oriented technologies to implement an adaptive spread spectrum watermarking procedure and a flexible DRM platform. In P. Montague et al., editors, Procs of the 3rd Australasian Information Security Workshop, volume 44 of Conferences in Research and Practice in Information Technology, pages 159–167, Newcastle, Australia, Feb. 2005.
[5] F. Frattolillo and S. D’Onofrio. A video watermarking procedure based on XML documents. In Procs of the 13th Intl Conf. on Image Analysis and Processing, LNCS, Italy, 2005.
[6] S. Katzenbeisser. On the design of copyright protection protocols for multimedia distribution using symmetric and public-key watermarking. In Procs of the 12th Intl Workshop on Datab. and Expert Systems App., pages 815–819, 2001.
[7] S. W. Kim and S. Suthaharan. An entropy masking model for multimedia content watermarking. In Procs of the 37th Hawaii Intl Conf. on System Sciences. IEEE CS, 2004.
[8] C. L. Lei, P. L. Yu, et al. An efficient and anonymous buyer-seller watermarking protocol. IEEE Trans. on Image Processing, 13(12):1618–1626, 2004.
[9] Y. Wang, J. Ostermann, and Y. Zhang. Video Processing and Communications. Prentice Hall, 2002.
[10] A. B. Watson. DCT quantization matrices visually optimized for individual images. In J. P. Allebach and B. E. Rogowitz, editors, Human Vision, Visual Processing and Digital Display IV, volume 1913 of SPIE Procs, pages 202–216, San Jose, CA, USA, Feb. 1993.
[11] R. B. Wolfgang et al. Perceptual watermarks for digital images and video. Procs of the IEEE, 87(7):1108–1126, 1999.
[12] M. Wu et al. Collusion-resistant fingerprinting for multimedia. IEEE Signal Processing Magazine, 21(2):15–27, 2004.