Secure and Robust High Quality DWT Domain Audio Watermarking Algorithm with Binary Image
A.R.Elshazly, M.M.Fouad
M.E.Nasr
Electronics and Electrical Communication Dept. Zagazig University Zagazig, Egypt
[email protected]
Electronics and Electrical Communication Dept. Tanta University Tanta, Egypt
[email protected]
Abstract- To enhance the security and robustness of digital audio watermarking, this paper presents an algorithm based on mean quantization in the Discrete Wavelet Transform (DWT) domain. A binary image is used as the watermark and is encrypted by chaotic encryption with a secret key. The encrypted watermark is embedded in the low-frequency components using two wavelet functions, with adaptation of the frame size. The low-frequency components are chosen because their energy is high enough for the watermark to be embedded inaudibly; the watermark therefore does not alter the audible content and is not easy to remove. The algorithm offers good security because only authorized parties can detect the copyright information embedded in the host audio signal. The watermark can be extracted blindly, without knowledge of the original signal. To evaluate the performance of the presented method, objective quality tests are conducted: bit error rate (BER), normalized cross correlation (NCC), and peak signal-to-noise ratio (PSNR) for the watermark, and signal-to-noise ratio (SNR) for the audio signals. The test results show that the approach maintains high audio quality and yields a high recovery rate after attacks by commonly used audio data manipulations such as noise addition, amplitude modification, low-pass filtering, requantization, re-sampling, cropping, cutting, and compression. Simulation results show that the approach not only ensures robustness against common attacks but also improves system security and robustness against malicious attacks.
Keywords- Audio watermarking; Binary image; normalized cross correlation; Robustness; Security
I. INTRODUCTION
In recent years, due to the growth of networked multimedia systems and the widespread use of personal computers, people have easy access to vast amounts of copyrighted digital data. This data, which includes text, digital audio, images and video, offers various advantages [1]. It can be reproduced without loss of quality, shared by multiple users, distributed over networks, and managed for long periods of time without any damage. However, unauthorized copying and distribution of digital data are serious threats to the rights of content owners [2-3]. Digital data protection and copyright issues have therefore become more and more important in the face of today's technology.
As a solution to copyright protection issues, digital watermarking is gaining attention as a new method of protecting copyrights for digital data. Digital watermarking is a technique that embeds imperceptible and statistically undetectable information into digital data (e.g. video, images and audio signals). The embedded information (signature, logo, ID number, etc.) is uniquely related to the owner or distributor [4]. Digital data authors and distributors are thereby able to prove ownership of intellectual property rights without restricting other individuals from copying the contents of the digital data.
There are a number of desirable characteristics that a watermarking algorithm should satisfy. First, a watermark has to be statistically undetectable by unauthorized persons in order to prevent its removal. Second, the watermark needs to be robust enough to withstand intentional signal-processing attacks, and its removal should be impossible without perceptible signal alteration. Finally, to retain the quality of the watermarked data, the watermark insertion should be imperceptible, thereby preventing its visual or aural identification. Taken together, these requirements conflict with one another. Therefore, when designing a watermarking system, compromises and tradeoffs have to be made.
Many watermarking algorithms have been reported in the literature. Most of them were developed for digital images and video data [5]; interest and research in audio watermarking started slightly later. Compared to digital video and image watermarking, audio watermarking algorithms are not easy to develop because the human auditory system (HAS) is far more sensitive than the human visual system (HVS).
The HAS is also sensitive over a wide dynamic range of amplitudes and frequencies, so even a small amount of embedded noise can be detected by the human ear. Finally, audio clips are rather short compared with video clips in terms of time and file size. Thus, the amount of hidden information in audio clips is relatively large compared with that in images or video. Consequently, this information tends to degrade the audio quality. There is always a conflict between inaudibility and robustness in current audio watermarking methods. How well a method balances these two aspects is an important criterion for evaluating digital audio watermarking techniques.
Several techniques currently exist for embedding audio watermarks. They can be classified as time-domain, transform-domain, or dual-domain techniques. A time-domain watermark is relatively easy to implement and requires fewer computing resources than a transform-domain watermark. On the other hand, a time-domain watermark is weaker against signal-processing attacks than a transform-domain watermark. Transform-domain audio watermarking applies a frequency transform, such as the FFT [6], the DCT [7], the time-frequency transform (DWT) [8], the cepstrum, and others, to a data block of the audio signal and hides the watermark information in the transformed data block [9]. In audio watermarking, the embedding and detection stages cannot be assumed to share the locations of the data blocks to which the frequency transform is applied, because of the time-scale modification attack. Therefore, to be robust against the time-scale modification attack, spectral audio watermarking must use a fast algorithm that quickly finds the data block where the watermark bit is actually embedded.
Although most watermarking techniques work well for a relatively wide range of signals, they do not always adequately resolve the problems of inaudibility and robustness. In [10], a visually recognizable binary image is used as the watermark and embedded in audio signals in the cepstrum domain. In [11], a technique for embedding image data into the audio signal was presented, together with an additive audio watermarking algorithm that uses the SNR to determine a scaling parameter.
The scheme in [11] operates on the low-frequency DWT band of the audio, and the intensity of the watermark embedded in the original audio signal is adjusted by adaptively modulating the scaling parameter. In [12], an audio watermark embedding algorithm based on mean quantization in the wavelet domain is presented. It uses a planar binary image as the watermark and encrypts the binary image with a chaotic sequence; the audio signal is decomposed to 4 levels using the Daubechies-3 wavelet basis. In [13], in order to improve watermark imperceptibility and robustness simultaneously, an audio watermarking scheme with self-synchronization, based on a stable feature of the discrete wavelet transform, was proposed. The scheme adjusts and checks the relationship between the absolute average values of the low-frequency cA3 DWT coefficients in three audio frames. It also uses the relationship between the global absolute average value and the per-frame absolute averages to decide where watermark bits are embedded and extracted, which addresses the synchronization problem. In [14], the watermark data was embedded by quantizing the means of two selected bands of the wavelet transform of the original audio signal, one band in the lower frequency range and the other in the higher frequency range. Adaptive step sizes were used to achieve robustness and good transparency. The scheme in [14] was extended in [15] by using a two-stage multilayer perceptron (MLP) neural network in the decoding process.
Inaudibility, robustness and other practical considerations such as complexity have motivated us to look into other alternatives for audio watermarking in the DWT domain. In this paper, for the purpose of establishing watermarks applicable to audio signals, we present a method of digital audio watermarking that uses the low-frequency components in the DWT domain and is based on adaptation of the frame size, a secret key, chaotic encryption, and a binary image. It has lower complexity than existing methods in the same DWT domain while maintaining high quality. In this method, an audio signal is first divided into frames whose size is adapted to give minimum BER and maximum PSNR. The frames of the audio signal are then decomposed by a 3rd-level DWT. Since the energy of the low-frequency components (the approximation coefficients) is larger than that of the high-frequency components (the detail coefficients), the approximation coefficients are used for the watermark embedding process. The presented method uses an encrypted binary image to decide whether or not to embed the watermark signal into the original host audio signal. In order to evaluate the robustness and transparency of the proposed audio watermarking method, we conduct watermark embedding and detection experiments on test audio signals. Through these experiments, we show that the proposed method is robust to common signal-processing attacks including filtering, re-quantization, re-sampling and MPEG/audio layer III compression.
The outline of the paper is as follows: Section II introduces the theoretical background on chaotic encryption.
Section III presents an overview of the watermark insertion procedure of the proposed audio watermarking scheme. In Section IV, the detection process is thoroughly analyzed. Section V explains the objective measures used to evaluate the performance of the proposed method: bit error rate (BER), normalized cross correlation (NCC), peak signal-to-noise ratio (PSNR), and signal-to-noise ratio (SNR). In Section VI, the experimental results are discussed. Finally, Section VII gives a brief conclusion.
II. THEORETICAL BACKGROUND
The watermark image is encrypted using the logistic map [16]. The encryption process uses chaotic iteration to generate the encryption key sequence and then applies the XOR operation to the plaintext to change the values of the image pixels. The basic logistic map is formulated as:
$$X_{n+1} = a X_n (1 - X_n) \tag{1}$$
where $X_n$ and $a$ are the system variable and parameter, respectively, and $n$ is the number of iterations. The logistic map is chaotic for 3.569 ≤ a ≤ 4.0. The logistic map has only one parameter, and its chaotic range is narrower than that of other chaotic maps. The initial value is used as the secret key, so that the chaotic encryption and decryption provide a high level of security. Fig. 1 shows the original watermark and its encrypted version for a = 4.0, a secret key of 0.12345, and n = 1600 iterations.
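The logistic-map encryption described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `logistic_keystream` and `chaotic_xor`, the 0.5 threshold used to turn chaotic values into key bits, and the random 40×40 test image (chosen only because it has 1600 pixels, matching n = 1600) are all assumptions made for illustration. The paper states only that chaotic iteration generates the key and that XOR is applied to the pixels.

```python
import numpy as np

def logistic_keystream(x0, n, a=4.0):
    """Iterate X_{k+1} = a*X_k*(1 - X_k) n times and return n key bits.

    Thresholding at 0.5 is an assumed binarisation rule, not taken
    from the paper.
    """
    bits = np.empty(n, dtype=np.uint8)
    x = x0
    for k in range(n):
        x = a * x * (1.0 - x)
        bits[k] = 1 if x >= 0.5 else 0
    return bits

def chaotic_xor(image, x0, a=4.0):
    """XOR a binary image with the logistic keystream.

    XOR is its own inverse, so calling this twice with the same
    key (x0, a) recovers the original image.
    """
    image = np.asarray(image, dtype=np.uint8)
    ks = logistic_keystream(x0, image.size, a).reshape(image.shape)
    return image ^ ks

# Hypothetical 40x40 binary watermark (1600 pixels) for demonstration.
rng = np.random.default_rng(0)
w = rng.integers(0, 2, size=(40, 40), dtype=np.uint8)

we = chaotic_xor(w, x0=0.12345)        # encrypt with secret key 0.12345
w_back = chaotic_xor(we, x0=0.12345)   # decrypt with the same key
```

Because the same function encrypts and decrypts, the detector only needs the secret initial value and the parameter a to recover the watermark image.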
Fig. 1 (a) Original watermark; (b) encrypted version.
III. EMBEDDING ALGORITHM
The embedding process is shown in Fig. 2 and proceeds as follows:
[Fig. 2 Diagram of the audio watermarking embedding procedure. Blocks, audio path: original audio A; segmenting; selecting audio segments by rectangular window; DWT using db4 or Haar; select low-frequency coefficients CA; mean quantization; watermark embedding; end-of-watermark-data check; segment reconstructing; IDWT; watermarked audio A′. Blocks, watermark path: binary image W; blocking; encrypting with key; lowering dimension to W1.]
1st- Load the 2-D binary image watermark W(i,j). Let M×N be the size of the watermark and L = M·N the watermark length.
2nd- Encrypt the watermark image with a chaotic sequence generated from a secret key, K1, giving the 2-D encrypted watermark We(i,j).
3rd- Lower the dimension of the encrypted watermark to a 1-D vector so that it can be embedded in the one-dimensional audio signal. W1 is the sequence after lowering, i.e.:
$$W_1 = \{\, W_1(k) = W_e(i,j),\ 0 \le i < M,\ 0 \le j < N,\ k = i \times M + j \,\}$$
4th- Discrete wavelet transform: Decompose the original audio signal (A) in the wavelet domain at the 3rd level to obtain the decomposition vector V,
$$V = \{CA_3, CD_3, CD_2, CD_1\}$$
where CA3 holds the low-frequency coefficients of the audio signal and CD3 to CD1 hold the high-frequency coefficients.
5th- Select CA3 for watermark embedding and arrange it in an F × L matrix, where F is the frame size chosen to give minimum BER and maximum PSNR, then calculate the mean of each column:
$$\overline{CA_3}(n) = \frac{1}{F} \sum_{m=1}^{F} CA_3(m,n), \quad n = 1, \ldots, L$$
6th- Calculate the integer variable P(n):
$$P(n) = \left\lfloor \overline{CA_3}(n)/q + 1/2 \right\rfloor, \quad n = 1, \ldots, L$$
where $\lfloor \cdot \rfloor$ is the round-down (floor) operation and q is the quantization step size. The modified version of $\overline{CA_3}(n)$ due to watermark embedding is given by:
if $\mathrm{mod}(P(n),2) = W_1(n)$, do nothing;
if $\mathrm{mod}(P(n),2) \ne W_1(n)$ and $P(n) = \lfloor \overline{CA_3}(n)/q \rfloor$ and $P(n) \ge 0$, or $\mathrm{mod}(P(n),2) \ne W_1(n)$ and $P(n) \ne \lfloor \overline{CA_3}(n)/q \rfloor$ and $P(n)$