Compressed-domain Encryption of Adapted H.264 Video

Razib Iqbal, Shervin Shirmohammadi, and Abdulmotaleb El Saddik
Distributed and Collaborative Virtual Environments Research Laboratory (DISCOVER Lab)
School of Information Technology and Engineering, University of Ottawa, Ottawa, Canada
[riqbal | shervin | elsaddik] @site.uottawa.ca

Abstract

Commercial service providers and secret services seek to use the available environment to convey their data securely. To encrypt video content, or to ensure its personalized security, at an intermediary node, the content structure must conform to an international standard. Moreover, pressure to satisfy user preferences and device requirements seamlessly is raising the need for content to be customized so that it provides the best possible experience. In this paper, we present a perceptual encryption scheme for video that is combined with a dynamic temporal adaptation technique for H.264 video conforming to ISO/IEC MPEG-21 Digital Item Adaptation. Encryption is performed on demand, directly on the adapted bitstream and its generic Bitstream Syntax Description (gBSD).

1. Introduction

In digital video encryption, the tradeoffs between security and speed, between encryption and compression, and with format compliance are of great importance in designing and implementing an encryption framework or algorithm. For commercial video encryption, if strict confidentiality must be maintained, the technique should ensure that content is not revealed unless the proper decryption key is used. In transparent encryption, on the other hand, only the full-quality version is restricted to the legitimate user. The simplest way to encrypt a video is perhaps to treat the whole stream as a 1-D array and encrypt this 1-D stream with the encryption key. Over time, however, video encryption mechanisms that take the sensitivity of digital video into account have been developed for layered video compression techniques. Partial or selective encryption algorithms [1] [2] are deployed on selected layers. The notion of perceptual encryption refers to partially degrading the visual data of the video content by encryption. Much research has already been performed on video encryption based on perceptual encryption [3-6], and researchers have successfully offered scalability-based perceptual encryption, perceptual encryption for wavelet-compressed images and videos, perceptual encryption of motion vectors in MPEG videos, etc.

Adaptation is a newly applied practice for digital content customization. A conventional cascaded adaptation approach is nothing more than a concatenation of a decoder, an adaptation module and an encoder, which requires either a lot of processing power or a long processing time. Performing the adaptation and encryption operations in the compressed domain, without decompression and re-compression, is therefore a significant improvement. These advantages have not gone unnoticed by the multimedia community, and the ISO MPEG-21 standard has recently introduced provisions and techniques for standard-based adaptation of multimedia in the compressed domain [7].

In this paper, we first present a brief discussion of MPEG-21 and H.264 in section 2. Section 3 consists of the literature review. Sections 4 and 5 respectively describe the framework for dynamic temporal adaptation directly from the compressed H.264 bitstream using the MPEG-21 gBSD, and the technique to encrypt the adapted bitstream based on the adapted gBSD. Quality evaluation and results are presented in section 6. Finally, we draw conclusions in section 7.

2. MPEG-21 and H.264

Figure 1. MPEG-21 DIA

Part 7 of the MPEG-21 framework [8] specifies the syntax and semantics of tools that may be used to assist adaptation of Digital Items. A Digital Item is denoted as a bitstream and all its relevant descriptions. The bitstream can be audio, video, or any other media. Figure 1 illustrates the concept of MPEG-21 DIA. From the figure, we see that a Digital Item is subject to a resource adaptation engine and a description adaptation engine, which together produce the adapted Digital Item. In the specification, DIA tools are clustered into several categories.

A Bitstream Syntax Description (BSD) describes the syntax, in most cases the high-level structure, of a binary media resource. To provide full interoperability, a new language based on W3C XML Schema, called Bitstream Syntax Description Language (BSDL), is standardized in the MPEG-21 framework. With this language, it is possible to design a specific Bitstream Syntax Schema (BS Schema) describing the syntax of a particular coding format. Since BSDL describes bitstream syntax with a codec-specific BS Schema, an adaptation engine needs to know that specific schema. If the adaptation takes place on intermediary nodes, however, a codec-independent schema is more appropriate, because it offers a format-independent adaptation procedure. Therefore, a generic Bitstream Syntax Schema (gBS Schema) is specified in the MPEG-21 framework. A description conforming to this schema is called a generic Bitstream Syntax Description (gBSD). The gBSD provides an abstract view of the structure of the bitstream that can be used in particular when the availability of a specific BS Schema is not ensured. For transformations on the gBSD, however, it is important to include coding-format-specific information in the attributes of the gBSD.

H.264, the latest coding and compression standard by ITU-T and ISO/IEC, published as International Standard MPEG-4 Part 10 Advanced Video Coding (AVC), is expected to dominate the field by offering a flexible architecture and a compression gain of up to 50% [9]. The advanced compression technique, improved perceptual quality, network friendliness and versatility of the codec [10][11] drive it to outperform all previous video coding standards. One of the major reasons for choosing H.264 for our framework is its support for multiple reference pictures for motion compensation. The H.264 design covers two main layers: a Video Coding Layer (VCL) and a Network Abstraction Layer (NAL). The Video Coding Layer represents the video content and includes the tools and methods to code a set of macroblocks of a picture into a slice partition or a data partition.

Figure 2. Flexible slice sizes in H.264 video frame

From Figure 2, we can see that a picture can be split into one or several slices, where slice sizes are flexible. Slices are self-contained and can be decoded without using data from other slices. The Network Abstraction Layer (NAL) formats the VCL representation of the video and provides header information in a manner appropriate for conveyance by particular transport layers.
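To make the role of the NAL header concrete, the following minimal Python sketch splits a byte stream into NAL units and reads the one-byte header of each. It assumes the Annex B byte stream format, in which each NAL unit follows a 0x000001 start code; the function name iter_nal_units and the sample bytes are illustrative and not part of the framework described in this paper. Emulation prevention bytes inside the payload are left untouched.

```python
from typing import Iterator, Tuple

START_CODE = b"\x00\x00\x01"

# A few nal_unit_type values defined by H.264; others are reported numerically.
NAL_TYPE_NAMES = {
    1: "non-IDR slice",
    5: "IDR slice",
    7: "sequence parameter set",
    8: "picture parameter set",
}


def iter_nal_units(stream: bytes) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (nal_ref_idc, nal_unit_type, payload) for each NAL unit.

    Assumes Annex B framing: every NAL unit is preceded by the start code
    0x000001 (possibly written with an extra leading zero byte).
    """
    starts = []
    pos = stream.find(START_CODE)
    while pos != -1:
        starts.append(pos + len(START_CODE))  # first byte after the start code
        pos = stream.find(START_CODE, pos + len(START_CODE))

    for n, begin in enumerate(starts):
        end = starts[n + 1] - len(START_CODE) if n + 1 < len(starts) else len(stream)
        # A NAL unit never ends in 0x00, so trailing zeros belong to the
        # framing (for example the extra zero of a 4-byte start code).
        unit = stream[begin:end].rstrip(b"\x00")
        if not unit:
            continue
        header = unit[0]
        nal_ref_idc = (header >> 5) & 0x03  # 2 bits: importance as a reference
        nal_unit_type = header & 0x1F       # 5 bits: kind of payload
        yield nal_ref_idc, nal_unit_type, unit[1:]


# Hypothetical two-unit stream: a parameter set followed by an IDR slice.
sample = b"\x00\x00\x00\x01\x67\x42" + b"\x00\x00\x01\x65\x88\x84"
for ref_idc, unit_type, payload in iter_nal_units(sample):
    name = NAL_TYPE_NAMES.get(unit_type, "type %d" % unit_type)
    print(ref_idc, name, len(payload))
```

Because the header alone identifies the kind of unit, a node can locate slice data without decoding it, which is the property that compressed-domain processing relies on.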

3. Literature Review

An overview of DIA, its use in multimedia applications, and a report on some of the ongoing activities in MPEG on extending DIA for use in rights-governed environments are available in the literature [12][13]. In [12], the authors mention several emerging research topics and open issues related to DIA, including permissible and secure adaptation, where a secure adaptation of possibly encrypted content needs to be performed. We have tried to address this open issue from another angle, in which content is encrypted after adaptation. Devillers et al. have proposed BSD-based adaptation in streaming and constrained environments [14]. In their framework, the authors emphasize BSD-based adaptation applying a BS Schema and BSDtoBin processors. In our case, we use gBSD, which provides an abstract view of the structure of the bitstream that can be used in particular when the availability of a specific BS Schema is not ensured, and a gBSDtoBin processor is used for the transformation. D. Mukherjee et al. [15] have utilized MPEG-21 DIA and introduced an Adaptation Decision Taking Engine, but the paper lacks detail on individual adaptation policies for different adaptation entities.

The Secure Scalable Streaming (SSS) framework discussed in [16] enables transcoding without decryption. It encodes video into secure scalable packets using jointly designed scalable coding and progressive encryption techniques: a video sequence is encoded into segments/layers, and security is added by encrypting small blocks sequentially and feeding the encrypted data of earlier blocks into the encryption of later blocks. This combination allows downstream transcoders to perform transcoding operations by truncating or discarding packets, without decrypting the data. The SSS framework was designed for scalable coders, but in [17] the authors show its applicability to non-scalable coders. The latter framework enables secure transcoding of H.264/MPEG-4 AVC by encoding media into secure scalable packets, thus allowing potentially untrusted nodes to perform transcoding. In both proposed methods, however, intermediary nodes need to know the content structure to apply adaptation and/or encryption.

In the literature, there are traces of promising work that incorporates MPEG-21 DIA and H.264, but we did not encounter any dynamic secure adaptation framework that exploits MPEG-21 DIA and applies it to H.264. Since we use gBSD, the adaptation/intermediary nodes need not be aware of the video codec and can perform adaptation and/or encryption operations in a format-independent manner.
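The idea of feeding the encrypted data of earlier blocks into the encryption of later blocks, so that a truncated stream remains decryptable, can be illustrated with a toy Python sketch. This is not the construction used in [16] or [17] and it is not a secure cipher; the block size, the SHA-256-based pad and all function names are assumptions chosen only to show why dropping trailing blocks does not disturb the remaining ones.

```python
import hashlib
from typing import List

BLOCK = 16  # bytes per block; an arbitrary choice for this sketch


def _pad(key: bytes, prev_cipher: bytes, n: int) -> bytes:
    """Derive n pad bytes (n <= 32) from the key and the previous ciphertext block."""
    return hashlib.sha256(key + prev_cipher).digest()[:n]


def encrypt_progressive(key: bytes, iv: bytes, data: bytes) -> List[bytes]:
    """Encrypt block by block, chaining each block to the ciphertext before it."""
    blocks, prev = [], iv
    for i in range(0, len(data), BLOCK):
        chunk = data[i:i + BLOCK]
        pad = _pad(key, prev, len(chunk))
        cipher = bytes(p ^ k for p, k in zip(chunk, pad))
        blocks.append(cipher)
        prev = cipher  # later blocks depend on earlier ciphertext only
    return blocks


def decrypt_progressive(key: bytes, iv: bytes, blocks: List[bytes]) -> bytes:
    """Decrypt whatever prefix of the block sequence was received."""
    out, prev = bytearray(), iv
    for cipher in blocks:
        pad = _pad(key, prev, len(cipher))
        out += bytes(c ^ k for c, k in zip(cipher, pad))
        prev = cipher
    return bytes(out)


key, iv = b"demo-key", b"demo-iv"
data = bytes(range(50))
blocks = encrypt_progressive(key, iv, data)

# A transcoder without the key discards the last two blocks ...
truncated = blocks[:2]
# ... and the receiver still recovers the corresponding prefix.
print(decrypt_progressive(key, iv, truncated) == data[:2 * BLOCK])
```

Since each block depends only on blocks that precede it, truncation from the end is safe, whereas removing a block from the middle would break decryption of the block that follows it.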

4. System Description of Digital Item generation and Adaptation module

A systematic procedure for designing a video adaptation framework involves identifying adequate entities for adaptation, as well as identifying a feasible adaptation technique, for example re-quantization or frame dropping. In our temporal adaptation framework, the I-frames and P-frames in each frameset are the entities, and dynamically skipping frames from the original compressed bitstream is the chosen adaptation technique. For implementation convenience, a ‘Frameset’ is assumed to represent a number of frames equal to the frame rate at which the raw video is being encoded (usually 30 fps).

4.1. Generation of Digital Item

Figure 3. Digital Item generation

Figure 3 shows the structure of the Digital Item generation module. In the design, the gBSD of each compressed video is generated while the video is being encoded inside the encoder. The resulting compressed video and its gBSD, in XML form, make up the Digital Item. This Digital Item serves as the original content for the resource server or for the content provider on the delivery path. We have enhanced the ITU-T reference software implementation [18] with gBSD generation functionality. The gBSD contains the frame number, frame start, frame type, length of each frame, importance level in the frame marker (to be used for encryption, as shown later), number of slices/macroblocks, and the start and length of each slice/macroblock for the entire bitstream.
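As an illustration of the kind of description listed above, the following Python sketch builds a small XML tree from per-frame records. The element and attribute names (gBSDUnit, syntacticalLabel, start, length, marker) are modeled on the gBSD vocabulary, but the label and marker values, the Frame record and the function build_description are assumptions made for this example; the output is not claimed to be schema-valid gBSD, and namespaces are omitted.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Frame:
    number: int
    frame_type: str   # "I" or "P"
    start: int        # byte offset of the frame in the bitstream
    length: int       # frame length in bytes
    importance: int   # hypothetical importance level carried in the marker
    slices: List[Tuple[int, int]] = field(default_factory=list)  # (start, length)


def build_description(frames: List[Frame]) -> ET.Element:
    """Build a simplified gBSD-like tree: one unit per frame, one child per slice."""
    root = ET.Element("gBSD")
    for f in frames:
        unit = ET.SubElement(root, "gBSDUnit", {
            "syntacticalLabel": "Frame_%s" % f.frame_type,           # assumed label
            "start": str(f.start),
            "length": str(f.length),
            "marker": "frame-%d importance-%d" % (f.number, f.importance),
        })
        for s_start, s_length in f.slices:
            ET.SubElement(unit, "gBSDUnit", {
                "syntacticalLabel": "Slice",                         # assumed label
                "start": str(s_start),
                "length": str(s_length),
            })
    return root


# Hypothetical offsets, as an encoder might record them while writing frames.
frames = [
    Frame(0, "I", 0, 1200, 3, [(0, 700), (700, 500)]),
    Frame(1, "P", 1200, 300, 1, [(1200, 300)]),
]
print(ET.tostring(build_description(frames), encoding="unicode"))
```

Recording byte offsets and lengths at encoding time is what later allows frames or slices to be removed or encrypted by address, without parsing the codec syntax again.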

4.2. Adaptation Module

Both the gBSD and the compressed video bitstream are subject to the adaptation process. For gBSD transformation, the adaptation characteristics (i.e. frame rate reduction) for the adapted gBSD are expressed in a generic style sheet, which is fed to the XSLT processor to transform the original gBSD. An XSL style sheet defines the template rules and describes how to display the resulting document. The template rules assembled in the style sheet filter the original gBSD based on the actual encoding frame rate and the target frame rate. For this purpose, the user preference or device requirement, in terms of bitrate, is considered an external input to the adaptation module. A frame skip pattern is implemented in the template based on the original frame rate and the required frame rate, and is matched against the nodes in the source tree of the gBSD. If the original encoding frame rate is 30 fps, the frame skip mechanism enables a consistent frame dropping pattern for any target frame rate between 1 and 29 fps. The pseudo-code of this strategy is shown below:

FrameSkip(newFrameRate,oldFrameRate) { a=1, b=1; temp = newFrameRate/oldFrameRate; if(newFrameRate
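A consistent frame dropping pattern of the kind described in this section can be sketched in Python as follows. This is not a reconstruction of the FrameSkip pseudo-code; it is one simple way, assumed here for illustration, to spread the kept frames evenly over a frameset so that exactly the target number of frames survives. The names keep_pattern and adapt are hypothetical, and the sketch ignores frame types and reference dependencies.

```python
from typing import List, Sequence


def keep_pattern(new_rate: int, old_rate: int) -> List[bool]:
    """Return one keep/skip flag per frame of a frameset of old_rate frames.

    Frame i is kept when the running quota floor(i * new_rate / old_rate)
    advances, so exactly new_rate flags are True and they are evenly spread.
    """
    if not 1 <= new_rate <= old_rate:
        raise ValueError("target rate must be between 1 and the original rate")
    return [
        ((i + 1) * new_rate) // old_rate > (i * new_rate) // old_rate
        for i in range(old_rate)
    ]


def adapt(frames: Sequence, new_rate: int, old_rate: int) -> list:
    """Apply the same pattern to every frameset in a sequence of frames."""
    pattern = keep_pattern(new_rate, old_rate)
    return [f for i, f in enumerate(frames) if pattern[i % old_rate]]


# Two framesets of 30 frames reduced to a target of 10 fps.
kept = adapt(list(range(60)), new_rate=10, old_rate=30)
print(len(kept), kept[:5])
```

Applying one fixed pattern per frameset keeps the output rate constant from second to second, and the same flags could drive both the filtering of gBSD nodes and the removal of the corresponding byte ranges from the bitstream.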
