Int J Speech Technol (2011) 14:285–296 DOI 10.1007/s10772-011-9103-7
Chaotic encryption of speech signals Emad Mosa · Nagy W. Messiha · Osama Zahran · Fathi E. Abd El-Samie
Received: 20 December 2010 / Accepted: 21 July 2011 / Published online: 23 September 2011 © Springer Science+Business Media, LLC 2011
Abstract This paper introduces a speech encryption approach based on the permutation of speech segments with the chaotic Baker map and on substitution with masks in both the time and transform domains. Two parameters extracted from the main key are used to generate the mask. Either the Discrete Cosine Transform (DCT) or the Discrete Sine Transform (DST) can be used in the proposed cryptosystem to remove the residual intelligibility that remains after permutation and masking in the time domain. Substitution with masks fills the silent periods within a speech conversation and destroys formant and pitch information. Permutation with the chaotic Baker map maximizes the benefit of the permutation process by using large blocks, which allows more audio segments to be permuted. The proposed cryptosystem has low complexity, a small delay, and a high degree of security. Simulation results show that the proposed cryptosystem is robust to the presence of noise.
Keywords Chaotic Baker map · DCT and DST
E. Mosa · N.W. Messiha · O. Zahran · F.E. Abd El-Samie
Department of Electronics and Electrical Communications, Faculty of Electronic Engineering, Menofia University, Menouf, 32952, Egypt
1 Introduction
Speech encryption techniques have been widely used in the corporate and military sectors. Nowadays, it is very important to protect speech calls over wired and wireless communications with fast and secure cryptosystems. Generally, speech encryption techniques are categorized into two types: digital and analog. Digital speech encryption produces digital speech encrypted by modern digital cryptosystems such as the Advanced Encryption Standard (AES) (Advanced Encryption System 2001; Daemen and Rijndael 2001). Although the AES can attain a high degree of security, it is rarely used in existing voice communication systems due to the bandwidth expansion of the encrypted speech and the degradation of the Signal-to-Noise Ratio (SNR) performance (Tseng 2007). The main attraction of analog speech encryption is that it can be used with the existing analog telephone and narrow-band radio communication systems. Accordingly, analog speech scrambling has become one of the most popular encryption techniques in speech communications. Analog speech scramblers are based on a permutation of the speech components in the time domain (Beker and Piper 1985), the frequency domain (Lee et al. 1984), both the time and frequency domains (Milton 1989), an appropriate transform domain (Goldburg et al. 1993), the wavelet transform domain (Ma et al. 1996), the Hadamard transform domain (Wu and Ng 2002), or circulant transform domains (Manjunath and Anand 2002). There are also blind source separation based methods (Lin et al. 2006).
In general, a speech cryptosystem cannot provide sufficient security against eavesdroppers during speech communication if residual intelligibility remains in the scrambled speech. Residual intelligibility such as talk spurts and the original intonation makes it easy for any adept interceptor to deduce the contents of the scrambled speech (Tseng 2007). Figure 1(a) illustrates the original speech pattern of a female-male conversation in the time domain, Fig. 1(b) illustrates the encrypted speech pattern produced by the chaotic Baker map in the time domain, and Fig. 1(c) illustrates the encrypted speech pattern produced by the chaotic Baker map in a transform domain. Figure 2 shows the corresponding spectrograms. As Figs. 1 and 2 show, considerable residual intelligibility remains in the encrypted signal when only the chaotic Baker map is applied. Formant and pitch information are not obviously destroyed in the encrypted signal, so this process cannot withstand frequency-domain attacks. The encrypted speech signal still preserves the signal energy, talk spurts, and the original intonation, which can be easily detected by eavesdroppers and can even degrade the security of the audio cryptosystem.
To solve the problem of residual intelligibility, the speech scrambler must remove the traces of residual intelligibility in the scrambled speech and resist frequency-domain attacks. This paper presents a new speech scrambler based on permutations and substitutions of the speech segments, which yields a secure encrypted speech signal and a high-quality decrypted signal.
The rest of this paper is organized as follows. Section 2 presents the chaotic Baker map. Section 3 presents the proposed cryptosystem. Section 4 presents the experimental results. Finally, Sect. 5 presents the concluding remarks.
2 Chaotic Baker map
Chaotic encryption systems of the kind considered here depend on permutation only; they use maps to rearrange the elements within blocks. These systems are highly sensitive to their initial parameters: if different parameters are used, the system runs in different orbits, which are difficult to analyze and calculate. The output sequences of these systems have good randomness, low correlation, and low predictability. In this section, the chaotic Baker map is discussed (Fridrich 1998). The Baker map is a 2-D chaotic map that transfers each element of a square matrix to a new position in the matrix (Meng-En et al. 2009; Cheng 2001). The discretized Baker map is denoted $B_{(n_1, \ldots, n_k)}$, where the sequence of $k$ integers $n_1, n_2, \ldots, n_k$ is chosen such that each integer $n_i$ divides $N$ and $n_1 + \cdots + n_k = N$. Let $N_i = n_1 + \cdots + n_{i-1}$, with $N_1 = 0$. The element at the indices $(r, s)$, with $N_i \le r \le N_i + n_i - 1$ and $0 \le s \le N - 1$, is mapped to the position:
\[
B_{(n_1, \ldots, n_k)}(r, s) = \left( \frac{N}{n_i}(r - N_i) + s \bmod \frac{N}{n_i},\; \frac{n_i}{N}\left( s - s \bmod \frac{N}{n_i} \right) + N_i \right) \tag{1}
\]
The steps of chaotic permutation of an $N \times N$ square matrix are summarized as follows:
(i) The matrix is divided into $k$ vertical rectangles of height $N$ and width $n_i$.
(ii) Each vertical rectangle of dimensions $N \times n_i$ is divided into $n_i$ boxes, and every box contains $N$ points.
(iii) Each of these boxes is mapped to a row of elements by mapping column-by-column (the left one at the bottom and the right one at the top).
An example of the permutation of an 8 × 8 matrix is shown in Fig. 3. The secret key is (2, 4, 2); hence $N = 8$, $n_1 = 2$, $n_2 = 4$, and $n_3 = 2$. Figure 3(a) shows the generalized Baker map, while Fig. 3(b) shows the discretized Baker map. In this paper, the discretized Baker map is used to randomize speech segments in the permutation stage and to generate the mask.
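The mapping of Eq. (1) can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and it assumes that $(r, s)$ are used directly as the two array indices and that the sub-keys sum to $N$.

```python
import numpy as np

def baker_permute(block, key):
    """Apply the discretized Baker map of Eq. (1) to a square block."""
    N = block.shape[0]
    assert block.shape == (N, N), "block must be square"
    assert sum(key) == N and all(N % n == 0 for n in key), "invalid key"
    out = np.empty_like(block)
    Ni = 0                                  # offset N_i of the current strip
    for n in key:
        q = N // n                          # N / n_i
        for r in range(Ni, Ni + n):
            for s in range(N):
                # New position given by Eq. (1)
                out[q * (r - Ni) + s % q, (s - s % q) // q + Ni] = block[r, s]
        Ni += n
    return out

def baker_unpermute(block, key):
    """Inverse map: read each element back from where baker_permute put it."""
    N = block.shape[0]
    out = np.empty_like(block)
    Ni = 0
    for n in key:
        q = N // n
        for r in range(Ni, Ni + n):
            for s in range(N):
                out[r, s] = block[q * (r - Ni) + s % q, (s - s % q) // q + Ni]
        Ni += n
    return out

# The 8 x 8 example with the secret key (2, 4, 2)
block = np.arange(64).reshape(8, 8)
scrambled = baker_permute(block, (2, 4, 2))
restored = baker_unpermute(scrambled, (2, 4, 2))
```

Because the map only moves elements, the scrambled block holds exactly the same values as the input, and applying the inverse with the same key restores the original order.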
3 Proposed cryptosystem
The proposed cryptosystem permutes and masks speech segments in both the time and transform domains. The encryption steps are performed in rounds and can be summarized as follows:
1. Framing and reshaping into 2-D format.
2. Generation of a mask.
3. First round:
• Permutation with the chaotic Baker map.
• Addition of the mask.
4. Second round:
• Application of the DCT or the DST.
• Permutation with the chaotic Baker map.
• Addition of the mask.
• Application of the Inverse DCT (IDCT) or the Inverse DST (IDST).
5. Third round:
• Permutation with the chaotic Baker map.
6. Reshaping into 1-D format.
The decryption steps of the proposed cryptosystem can be summarized as follows:
1. Generation of the mask from the secret key.
2. Framing and reshaping into 2-D format.
3. First round:
• Inverse permutation with the chaotic Baker map.
4. Second round:
• Application of the DCT or the DST.
• Subtraction of the mask.
• Inverse permutation with the chaotic Baker map.
• Application of the IDCT or the IDST.
5. Third round:
• Subtraction of the mask.
• Inverse permutation with the chaotic Baker map.
Fig. 1 Chaotic encryption of a speech signal in time and transform domains. (a) Original speech. (b) Encrypted speech with chaotic Baker map. (c) Encrypted speech with chaotic Baker map in a transform domain
Fig. 2 Spectrograms of the speech signals. (a) Original speech. (b) Encrypted speech with chaotic Baker map. (c) Encrypted speech with chaotic Baker map in a transform domain
Fig. 3 The Baker map
3.1 Masking and the secret key
The mask is generated from the secret key. A specific number of ones are introduced into an empty block and are then permuted with the chaotic Baker map to generate a mask of zeros and ones. The resultant mask is added to each block before the permutation step. This step is necessary to hide the silent periods within a speech conversation and to counter known-plaintext attacks by changing the signal energy within silent periods. As an example, for the chaotic Baker map with the secret key {4, 2, 2, 2, 2}, the sum of the sub-keys gives a 12 × 12 block size. The number of sub-keys is 5 and the first sub-key is 4, so the following steps are performed:
(a) A number of ones equal to the number of sub-keys is introduced into each of the first $n_1$ rows of an empty matrix, where $n_1$ is the value of the first sub-key. Therefore, 5 ones are put in each of the first 4 rows of a 12 × 12 empty matrix, as shown in Fig. 4.
(b) This matrix is permuted with the chaotic Baker map.
(c) The resultant mask is added to each block of the speech to be encrypted, and the output is then circularly shifted between −1 and 1, as shown in Fig. 5.
3.2 Permutation
Permutation is the process by which the elements of each block of the speech signal are rearranged and mapped to new positions in the same block without changing their values. Permutation with the chaotic Baker map is considered a good permutation process due to its random behavior and its sensitivity to initial conditions. The proposed cryptosystem provides a variable key length, a relatively large block size, and a high encryption rate, which are the basic requirements of a good speech cryptosystem (Fridrich 1998).
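Steps (a) and (b) of the mask generation in Sect. 3.1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the text fixes how many ones go into each of the first $n_1$ rows but not which columns they occupy (the actual layout is given in Fig. 4), so placing them in the first columns is an assumption, and the names `baker_permute` and `make_mask` are hypothetical.

```python
import numpy as np

def baker_permute(block, key):
    """Discretized Baker map of Eq. (1); (r, s) are used as array indices."""
    N = block.shape[0]
    assert sum(key) == N and all(N % n == 0 for n in key), "invalid key"
    out = np.empty_like(block)
    Ni = 0
    for n in key:
        q = N // n
        for r in range(Ni, Ni + n):
            for s in range(N):
                out[q * (r - Ni) + s % q, (s - s % q) // q + Ni] = block[r, s]
        Ni += n
    return out

def make_mask(key):
    """Build the 0/1 mask from the secret key (steps (a) and (b))."""
    N = sum(key)          # block size is the sum of the sub-keys
    k = len(key)          # number of sub-keys = ones per row
    n1 = key[0]           # first sub-key = number of rows that receive ones
    seed = np.zeros((N, N), dtype=int)
    seed[:n1, :k] = 1     # ASSUMPTION: ones placed in the first k columns
    return baker_permute(seed, key)

# Example key from the text: 5 ones in each of the first 4 rows of a 12 x 12 block
mask = make_mask((4, 2, 2, 2, 2))
```

Since the Baker map only relocates elements, the mask keeps the same number of ones as the seed matrix, here 4 × 5 = 20, scattered over the block.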
3.3 Substitution
Substitution is the process by which the amplitudes of the elements in each block are changed to other values without changing their positions in the block. Permutation of speech segments in the time domain distorts the speech time envelope, which reduces the intelligibility of the speech. However, portions of the signal remain intact, which may allow a trained listener to directly interpret the scrambled speech. The purpose of the substitution step is to change the non-permuted portions of the speech and to change its power spectrum in order to resist cryptanalysis attacks. In the proposed cryptosystem, substitution is performed by chaotic Baker map permutation followed by masking in the Time Domain (TD), the DCT domain, and the DST domain, in order to determine the best domain for this task (Meng-En et al. 2009; Cheng 2001; Ahmed 1974; Murthy and Swamy 1991; Yip 1987).
3.4 Last permutation step
Each substituted block is then permuted once more by the chaotic map with another secret key. The first objective of this step is to increase the key space by using two different keys. The second objective is to prevent a cryptanalyst from discovering the secret key with known-plaintext attacks. Without the last permutation step, an attacker who knows the original signal and a recorded encrypted signal could deduce the secret key.
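The three encryption rounds and their inverses can be sketched end to end in Python. This is a simplified illustration, not the authors' implementation, and it rests on several assumptions: the transform is taken as a 2-D orthonormal DCT-II of each block, the first key drives the permutations of rounds one and two as well as the mask while the second key drives round three, the circular shift of step (c) is omitted, the ones of the mask seed occupy the first columns, and the signal length is a multiple of the block area. All function names are hypothetical.

```python
import numpy as np

def baker_indices(N, key):
    """Destination (row, column) of every element under Eq. (1)."""
    assert sum(key) == N and all(N % n == 0 for n in key), "invalid key"
    r, s = np.indices((N, N))
    dest_r = np.empty((N, N), dtype=int)
    dest_s = np.empty((N, N), dtype=int)
    Ni = 0
    for n in key:
        q = N // n
        rows = slice(Ni, Ni + n)            # rows of the current strip
        dest_r[rows] = q * (r[rows] - Ni) + s[rows] % q
        dest_s[rows] = s[rows] // q + Ni
        Ni += n
    return dest_r, dest_s

def permute(block, key):
    dr, ds = baker_indices(block.shape[0], key)
    out = np.empty_like(block)
    out[dr, ds] = block
    return out

def unpermute(block, key):
    dr, ds = baker_indices(block.shape[0], key)
    return block[dr, ds]

def dct_matrix(N):
    """Orthonormal DCT-II matrix C, so that C @ C.T is the identity."""
    k = np.arange(N)[:, None]
    n = np.arange(N)[None, :]
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    C[0, :] /= np.sqrt(2.0)
    return C

def make_mask(key):
    N = sum(key)
    seed = np.zeros((N, N))
    seed[:key[0], :len(key)] = 1.0          # ASSUMPTION: ones in first columns
    return permute(seed, key)

def encrypt(x, key1, key2):
    N = sum(key1)
    C, mask = dct_matrix(N), make_mask(key1)
    out = []
    for b in x.reshape(-1, N, N):           # framing into 2-D blocks
        b = permute(b, key1) + mask         # round 1 (time domain)
        t = permute(C @ b @ C.T, key1) + mask   # round 2 (DCT domain)
        b = C.T @ t @ C                     # inverse DCT
        out.append(permute(b, key2))        # round 3
    return np.concatenate([b.ravel() for b in out])

def decrypt(y, key1, key2):
    N = sum(key1)
    C, mask = dct_matrix(N), make_mask(key1)
    out = []
    for b in y.reshape(-1, N, N):
        b = unpermute(b, key2)              # undo round 3
        t = unpermute(C @ b @ C.T - mask, key1)  # undo round 2
        b = C.T @ t @ C
        out.append(unpermute(b - mask, key1))    # undo round 1
    return np.concatenate([b.ravel() for b in out])

rng = np.random.default_rng(0)
x = rng.uniform(-0.5, 0.5, 2 * 144)         # two 12 x 12 blocks of samples
key1, key2 = (4, 2, 2, 2, 2), (2, 2, 2, 2, 4)
y = encrypt(x, key1, key2)
z = decrypt(y, key1, key2)
```

Each decryption round applies the inverse operations of the matching encryption round in reverse order, so the decrypted samples agree with the input up to floating-point rounding.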
Fig. 4 Generation of a mask
Fig. 5 Addition of the mask
4 Experimental results
We are mainly concerned with the quality of the encrypted signals as well as that of the decrypted signals. A sub-section is devoted to each.
4.1 Quality of encrypted signal
4.1.1 Residual intelligibility
The speech conversation signal has been encrypted with the proposed cryptosystem in the time domain, using the proposed algorithm without the discrete transform steps in round two; the result is shown in Fig. 6(b). For simplicity, we will refer to this method as TD encryption. The signal is also encrypted with the proposed cryptosystem with either the DCT or the DST, and the results are shown in Figs. 6(c) and 6(d), respectively. For simplicity, we will use DCT encryption and DST encryption to refer to these two methods. The spectrograms of the original and encrypted signals are shown in Fig. 7. It is evident that the speech encrypted with the DCT and DST encryption resembles white noise, without any talk spurts. The original intonations have been removed, which indicates that no residual intelligibility is left for eavesdroppers on the communication channel.
Fig. 6 Encryption of the speech signals. (a) Original signal. (b) TD encryption. (c) DCT encryption. (d) DST encryption
Fig. 7 Spectrograms of the speech signals. (a) Original signal. (b) TD encryption. (c) DCT encryption. (d) DST encryption
4.1.2 Statistical analysis
Different kinds of ciphers can be analyzed statistically (Fridrich 1997; Mao et al. 2004; Shannon 1949). Statistical analysis has been performed on the proposed speech cryptosystem, demonstrating its superior confusion and diffusion properties, which strongly resist statistical attacks. This is illustrated by the histograms of the encrypted speech signals, the correlation coefficient between the encrypted signal and the original signal, and the Spectral Distortion (SD) of the encrypted signal compared to the original one.
I. Histogram of encrypted speech signal
A typical example of the histogram test is shown in Fig. 8. The histograms of the speech signals encrypted with the TD encryption, the DCT encryption, and the DST encryption show that the DCT and the DST encryption give the most uniform histograms, which means the best encryption results.
Fig. 8 Histogram of speech signal. (a) Original signal. (b) TD encryption. (c) DCT encryption. (d) DST encryption
II. Correlation
A useful measure to assess the encryption quality of any cryptosystem is the correlation coefficient between similar segments in the clear signal and the cipher signal. It can be calculated as follows:
\[
r_{xy} = \frac{c_v(x, y)}{\sqrt{D(x)}\,\sqrt{D(y)}} \tag{2}
\]
where $c_v(x, y)$ is the covariance between the original signal $x$ and the encrypted signal $y$, and $D(x)$ and $D(y)$ are the variances of the signals $x$ and $y$. In numerical computations, the following discrete formulas can be used (Behnia et al. 2006):
\[
E(x) = \frac{1}{N_s} \sum_{i=1}^{N_s} x(i) \tag{3}
\]
\[
D(x) = \frac{1}{N_s} \sum_{i=1}^{N_s} \left( x(i) - E(x) \right)^2 \tag{4}
\]
\[
c_v(x, y) = \frac{1}{N_s} \sum_{i=1}^{N_s} \left( x(i) - E(x) \right)\left( y(i) - E(y) \right) \tag{5}
\]
where $N_s$ is the number of speech samples involved in the calculations. A low value of the correlation coefficient $r_{xy}$ indicates a good encryption quality. The correlation coefficients for the speech signals encrypted with the chaotic Baker map in the time and transform domains, illustrated in Fig. 1, and for the speech signals encrypted with all proposed methods using three different main keys are tabulated in Table 1. These results show that all proposed cryptosystems produce encrypted speech with a low correlation between similar segments in the original speech and the encrypted speech, which means that all keys give good encryption results in the proposed cryptosystem.
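The discrete formulas of Eqs. (2) to (5) translate directly into a few lines of Python. The sketch below is illustrative only; the function name is hypothetical and the two signals are assumed to have the same length.

```python
import numpy as np

def correlation_coefficient(x, y):
    """Correlation coefficient r_xy of Eq. (2) from Eqs. (3)-(5)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ex, ey = x.mean(), y.mean()                 # Eq. (3)
    dx = np.mean((x - ex) ** 2)                 # Eq. (4)
    dy = np.mean((y - ey) ** 2)
    cv = np.mean((x - ex) * (y - ey))           # Eq. (5)
    return cv / (np.sqrt(dx) * np.sqrt(dy))     # Eq. (2)

# Illustrative signals: a tone and an unrelated noise sequence
rng = np.random.default_rng(1)
tone = np.sin(2 * np.pi * np.arange(4000) / 50.0)
noise = rng.standard_normal(4000)
r_same = correlation_coefficient(tone, tone)
r_noise = correlation_coefficient(tone, noise)
```

A signal compared with itself gives a coefficient of one, while a signal compared with unrelated noise gives a value close to zero, which is the behavior expected of a well-encrypted signal.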
III. Spectral distortion
The SD is a measure computed in the frequency domain from the spectra of the original and processed signals. It is expressed in dB and shows how far the spectrum of the processed signal is from that of the original signal. The SD can be calculated as follows (Kuo 1993):
\[
SD = \frac{1}{M} \sum_{m=0}^{M-1} \sum_{i=Nm}^{Nm+N-1} \left| V_x(i) - V_y(i) \right| \tag{6}
\]
where $V_x(i)$ is the spectrum of the original speech signal in dB for a certain segment in the time domain, $V_y(i)$ is the spectrum in dB of the distorted speech signal reproduced by a speech processing system over the same segment, $N$ is the segment length, and $M$ is the number of segments in the speech signal. The SD results are tabulated in Table 2. The closer the correlation coefficient is to zero and the higher the SD, the higher the quality of the encrypted signal.

Table 1 Correlation coefficients between the original and encrypted speech signals
Secret key   Chaotic Baker map                 Proposed cryptosystem
             Time domain   Transform domain    TD        DCT      DST
Key A        0.2500        0.1350              0.0100    0.0023   0.0013
Key B        0.1882        0.0112              0.0137    0.0024   0.0011
Key C        0.2100        0.2231              −0.0014   0.0027   −0.0012

Table 2 SD in dB of the encrypted signals with all methods
Secret key   Chaotic Baker map                 Proposed cryptosystem
             Time domain   Transform domain    TD        DCT       DST
Key A        10.9386       12.0250             15.1529   14.255    16.3053
Key B        10.9323       12.0112             15.5948   14.2192   15.3123
Key C        11.1342       12.8624             16.6368   14.0568   14.1667

4.1.3 Key-space analysis
A secure encryption algorithm should be sensitive to the cipher keys. For the proposed cryptosystem, the key-space analysis and sensitivity tests are summarized in the following subsections.
I. Exhaustive-key search
For a secure cryptosystem, the key space should be large enough to make a brute-force attack infeasible (Koduru and Chandrasekaran 2008; Abd El-Samie 2009). For the chaotic Baker map, the secret key depends on the size of the block to be encrypted. The number of possible keys with respect to the key size is tabulated in Table 3. These results suppose that the attacker knows the secret key length; in practice, the key length is unknown, which makes the search even less feasible. Computation times are calculated for a 1000 MIPS computer. For example, assuming one key trial per instruction, such a machine tests 10^9 keys per second, so 10^16 keys take 10^7 s, or about 115 days. In Table 3, results are tabulated considering two different secret keys. The first key is used in the first permutation process and in the generation of the mask, and the second key is used in the second permutation process.

Table 3 Exhaustive key search results
Secret key length   Number of possible keys   Maximum computation time by cryptanalysis
30                  1 × 10^16                 115 day
40                  1 × 10^20                 3.17 × 10^3 year
50                  1 × 10^22                 3.17 × 10^5 year
60                  1 × 10^30                 3.17 × 10^13
64                  1 × 10^34                 3.17 × 10^17
128                 1 × 10^62                 3.17 × 10^51
256                 1 × 10^126                3.17 × 10^113
512                 1 × 10^252                3.17 × 10^239

II. Key-sensitivity test
In this section, the key-sensitivity test is performed to evaluate the immunity of the proposed cryptosystem against brute-force attacks. Assume a secret key {8, 8, 4, 2, 2, 4, 2, 8, 8, 4, 2, 8, 4}, with 13 sub-keys whose sum is 64. To test the key sensitivity of the proposed cryptosystem, the encrypted signal is decrypted with three different keys, each generated by changing the position of two sub-keys in the original secret key. The correlation coefficient $r_{xy}$, the SD, and the Log-Likelihood Ratio (LLR) are estimated between each decrypted signal and the signal decrypted with the original key. The results are tabulated in Table 4. The low correlation values and the large SD and LLR values show the high key sensitivity of the proposed cryptosystem implementing the DCT or the DST. It is clear from the table that no correlation exists among the signals even though they have been produced using slightly different secret keys. The key-sensitivity test has been performed with a fixed key length and without changing the value of the first sub-key, which controls the masking step.
4.1.4 Known-plaintext attack
The known-plaintext attack is an attack model of cryptanalysis in which the attacker has samples of both the plaintext and its ciphertext and is at liberty to use them to reveal the secret key. In the proposed cryptosystem, a cryptanalyst who knows the original signal and its encrypted version must also know the block size to reconstruct the supposed permutation and masking processes. Trying a different block size gives completely wrong results. In modern cryptosystems that use standard block sizes, the permutation and substitution processes may be analyzed to discover the key, whereas in the proposed cryptosystem there is no standard block size. Therefore, knowledge of the plaintext without knowledge of the block size is useless, as it is very difficult to guess the key.

Table 4 Correlation coefficients, SD, and LLR between the decrypted signals with the different keys and the decrypted signal with the original key
Decryption key   rxy                         SD                      LLR
                 TD       DCT      DST       TD      DCT     DST     TD      DCT     DST
Key 1            0.01     0.001    0.011     15.52   14.18   16.90   0.905   0.549   0.450
Key 2            0.003    −0.018   0.0052    15.06   14.30   16.86   0.895   0.447   0.433
Key 3            0.0006   0.0019   0.0006    15.09   14.55   16.37   0.754   0.429   0.576

4.2 Quality of decrypted signal
There is a need for some metrics to assess the perceptual quality of the decrypted speech signals. Several approaches, based on subjective and objective metrics, have been adopted in the literature for this purpose (Abd El-Samie 2009). This paper concentrates on the objective metrics. Objective metrics are generally divided into intrusive and non-intrusive metrics. Intrusive metrics can be classified into three main groups. The first group includes time-domain metrics such as the traditional Signal-to-Noise Ratio (SNR) and the segmental Signal-to-Noise Ratio (SNRseg). The second group includes Linear Predictive Coefficients (LPCs) metrics, which are based on the LPCs of the speech signal and its derivative parameters, such as the Linear Reflection Coefficients (LRCs), the LLR, and the Cepstral Distance (CD). The third group includes the spectral-domain metrics, which are based on the comparison between the power spectrum of the original signal and that of the processed signal. An example of such metrics is the SD (Kuo 1993). The SNR is defined as follows (Abd El-Samie 2009):
\[
\mathrm{SNR} = 10 \log_{10} \frac{\sum_{i=1}^{N_s} x^2(i)}{\sum_{i=1}^{N_s} \left( x(i) - z(i) \right)^2} \tag{7}
\]
where $z(i)$ is the decrypted speech signal. The most popular of the time-domain metrics is the SNRseg, which is defined as the average of the SNR values of short segments of the output signal. It is a good estimator of speech signal quality. It is defined as follows (Abd El-Samie 2009):
\[
\mathrm{SNRseg} = \frac{10}{M} \sum_{m=0}^{M-1} \log_{10} \sum_{i=Km}^{Km+K-1} \left( \frac{x^2(i)}{\left( x(i) - z(i) \right)^2} \right) \tag{8}
\]
where $M$ is the number of segments in the output signal and $K$ is the length of each segment. The LLR metric for a speech segment is based on the assumption that the segment can be represented by a $p$-th order all-pole linear predictive coding model of the form (Abd El-Samie 2009):
\[
x(i) = \sum_{m=1}^{p} a_m x(i - m) + G_x u(i) \tag{9}
\]
where $x(i)$ is the $i$th speech sample, $a_m$ (for $m = 1, 2, \ldots, p$) are the coefficients of an all-pole filter, $G_x$ is the gain of the filter, and $u(i)$ is an appropriate excitation source for the filter. The speech signal is windowed to form frames of 15 to 30 ms length. The LLR metric is then defined as:
\[
\mathrm{LLR} = \log \left( \frac{\mathbf{a}_x \bar{R}_z \mathbf{a}_x^T}{\mathbf{a}_z \bar{R}_z \mathbf{a}_z^T} \right) \tag{10}
\]
where $\mathbf{a}_x$ is the LPC coefficient vector $[1, a_x(1), a_x(2), \ldots, a_x(p)]$ for the original speech signal $x(i)$, $\mathbf{a}_z$ is the LPC coefficient vector $[1, a_z(1), a_z(2), \ldots, a_z(p)]$ for the decrypted speech signal $z(i)$, and $\bar{R}_z$ is the autocorrelation matrix of the decrypted speech signal. The closer the LLR is to zero, the higher the quality of the decrypted signal.
In this paper, three metrics are used for the quality assessment of decrypted speech signals: the SD, the LLR, and the correlation coefficient with the original speech signal, $r_{xz}$. As the values of the SD and the LLR decrease and the value of $r_{xz}$ increases, the performance of the speech cryptosystem becomes better. Figure 9 shows the decrypted signals with all methods in the absence of noise. The numerical quality metrics values for these results are tabulated in Table 5. These results confirm the efficiency of the proposed speech cryptosystem in the absence of noise.

Table 5 Quality metrics values for the decrypted speech signals
Quality metrics   TD              DCT             DST
SD                0               9.31 × 10^−15   9.31 × 10^−15
LLR               7.78 × 10^−51   9.95 × 10^−17   8.91 × 10^−4
rxz               1               0.99            0.98

4.2.1 Effect of noise
An important issue that deserves consideration is the effect of noise on the efficiency of the proposed speech cryptosystem. Simulation experiments have been carried out for decryption in the presence of noise at different SNR values. The results of these experiments are shown in Figs. 10 to 12 for all encryption methods. These results show that the quality metrics values are better at high SNR values. Thus, the proposed cryptosystem can tolerate noise with low power.
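The SNR of Eq. (7), the SD of Eq. (6), and the LLR of Eq. (10) can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used for the reported results: the function names are hypothetical, the dB spectrum of each segment is taken as the FFT magnitude of that segment with a small constant added to avoid the logarithm of zero, the LPCs are obtained with the autocorrelation method, and the coefficient vectors are taken as the prediction-error filter $[1, -a_1, \ldots, -a_p]$ with $a_m$ as in Eq. (9).

```python
import numpy as np

def snr_db(x, z):
    """SNR of Eq. (7) between the original x and the decrypted z."""
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum((x - z) ** 2))

def spectral_distortion(x, y, N=256, eps=1e-12):
    """SD of Eq. (6): summed dB spectrum differences, averaged over segments."""
    M = min(len(x), len(y)) // N
    total = 0.0
    for m in range(M):
        seg = slice(N * m, N * m + N)
        # ASSUMPTION: V(i) is the FFT magnitude of the segment in dB
        Vx = 20.0 * np.log10(np.abs(np.fft.fft(x[seg])) + eps)
        Vy = 20.0 * np.log10(np.abs(np.fft.fft(y[seg])) + eps)
        total += np.sum(np.abs(Vx - Vy))
    return total / M

def lpc_error_filter(frame, p):
    """Return [1, -a_1, ..., -a_p] and the autocorrelation lags r[0..p]."""
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.linalg.solve(R, r[1:])            # normal equations of Eq. (9)
    return np.concatenate(([1.0], -a)), r

def llr(x_frame, z_frame, p=10):
    """LLR of Eq. (10) for one frame of original and decrypted speech."""
    ax, _ = lpc_error_filter(x_frame, p)
    az, rz = lpc_error_filter(z_frame, p)
    Rz = np.array([[rz[abs(i - j)] for j in range(p + 1)]
                   for i in range(p + 1)])   # autocorrelation matrix of z
    return np.log((ax @ Rz @ ax) / (az @ Rz @ az))

# Illustrative data: a random test signal and a slightly noisy copy
rng = np.random.default_rng(0)
x = rng.standard_normal(2048)
z = x + 0.01 * rng.standard_normal(2048)
print(snr_db(x, z), spectral_distortion(x, z), llr(x[:240], z[:240]))
```

For a perfectly recovered signal the SD and the LLR are both zero, which is consistent with the near-zero values reported for the noise-free case.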
Fig. 9 Waveforms of decrypted speech signals. (a) Original. (b) TD. (c) DCT. (d) DST
Fig. 10 Speech quality metrics for the TD encryption
Fig. 11 Speech quality metrics for the DCT encryption
Fig. 12 Speech quality metrics for the DST encryption
5 Conclusion
This paper presented an encryption approach to protect speech information. This approach is based on the chaotic Baker map and masking in different domains. The experimental study showed that the proposed speech cryptosystem is resistant to brute-force attacks, frequency-domain attacks, and statistical attacks. The objective of this approach is to destroy all aspects of the original signal while preserving the quality of the recovered speech signal at a satisfactory level. The proposed cryptosystem achieves both permutation and diffusion, and has shown good immunity to different attacks. It can be used in narrowband radio and telephone systems.
References
Abd El-Samie, F. E. (2009). An efficient singular value decomposition algorithm for digital audio watermarking. International Journal of Speech Technology, 12(1), 27–45.
Advanced Encryption System (2001). Federal Information Processing Standards Publication, 197.
Ahmed, N., Natarajan, T., & Rao, K. R. (1974). Discrete cosine transform. IEEE Transactions on Computers, C-23, 90–93.
Behnia, S., Akhshani, A., Mahmodi, H., & Akhavan, A. (2006). Novel algorithm for image encryption based on mixture of chaotic maps. Chaos, Solitons and Fractals.
Beker, H. J., & Piper, F. C. (1985). Secure speech communications. London: Academic Press.
Cheng, L. Z. (2001). On computing the two-dimensional (2-D) type IV discrete cosine transform (2-D DCT-IV). IEEE Signal Processing Letters, 8, 239–241.
Daemen, J., & Rijndael, V. R. (2001). The advanced encryption standard. Doctor Dobb’s Journal, 26(3), 137–139.
Fridrich, J. (1997). Secure image ciphering based on chaos (Final report).
Fridrich, J. (1998). Symmetric ciphers based on two-dimensional chaotic maps. International Journal of Bifurcation and Chaos, 8(6), 1259–1284.
Goldburg, B., Sridharan, S., & Dawson, E. (1993). Design and cryptanalysis of transform-based analog speech scramblers. IEEE Journal on Selected Areas in Communications, 11(5), 735–744.
Koduru, S. C., & Chandrasekaran, V. (2008). Integrated confusion-diffusion mechanisms for chaos based image encryption. In IEEE 8th international conference on computer and information technology workshops.
Kuo, C. J. (1993). Novel image encryption technique and its application in progressive transmission. Journal of Electronic Imaging, 2(4), 345–351.
Lee, L. S., Chou, G. C., & Chang, C. S. (1984). A new frequency domain speech scrambling system which does not require frame synchronization. IEEE Transactions on Communications, 32, 444–456.
Lin, Q. H., Yin, F. L., Mei, T. M., & Liang, H. (2006). A blind source separation based method for speech encryption. IEEE Transactions on Circuits and Systems. I, 53(6), 1320–1328.
Ma, F., Cheng, J., & Wang, Y. (1996). Wavelet transform-based analogue speech scrambling scheme. Electronics Letters, 32(8), 719–721.
Manjunath, G., & Anand, G. V. (2002). Speech encryption using circulant transformations. In Proc. IEEE, Int. conf. multimedia and exp. (Vol. 1, pp. 553–556).
Mao, Y. B., Chen, G., & Lian, S. G. (2004). A novel fast image encryption scheme based on the 3D chaotic baker map. International Journal of Bifurcation and Chaos in Applied Sciences and Engineering, 14(10), 3613–3624.
Meng-En, L., Chien-Feng, C., Tsung-Nan, L., & Chun-Nan, C. (2009). The application of discrete cosine transform (DCT) combined with the nonlinear regression routine on optical auto-focusing. In Digest of technical papers international conference on consumer electronics ICCE 2009 (pp. 1–2).
Milton, R. M. (1989). A time and frequency-domain speech scrambler. In COMSIG 1989 proceedings, Southern African conference (pp. 125–130).
Murthy, N. R., & Swamy, M. N. (1991). Efficient algorithms for the computation of running discrete cosine and sine transforms. IEEE International Symposium on Circuits and Systems, 1, 634–637.
Shannon, C. E. (1949). Communication theory of secrecy system. The Bell System Technical Journal, 28, 656–715.
Tseng (2007). An OFDM speech scrambler without residual intelligibility. In TENCON 2007 (pp. 1–4).
Wu, Y., & Ng, B. P. (2002). Speech scrambling with Hadamard transform in frequency domain. Proceedings of the 6th International Conference on Signal Processing, 2, 1560–1563.
Yip, P., & Rao, K. (1987). On the shift property of DCT’s and DST’s. Acoustics, Speech and Signal Processing, 35, 404–406.