A High-capacity Reversible Data-hiding Scheme for Medical Images

Journal of Medical and Biological Engineering, 30(5): 289-296


Shu-Chien Huang*

Ming-Shun Lin

Department of Computer Science, National Pingtung University of Education, Pingtung 900, Taiwan, ROC

Received 15 Jul 2009; Accepted 1 Feb 2010; doi: 10.5405/jmbe.30.5.04

Abstract

Reversible data hiding is well suited to image authentication and copyright protection: the secret data can be extracted, and the original image can then be recovered without distortion. Most data-hiding schemes distort the cover medium when inserting the secret data and cannot recover the host medium after the secret message is extracted. However, reversible recovery of the cover medium is essential for applications such as medical imaging, law enforcement, and fine artwork. In this paper, we present a simple and effective reversible data-hiding scheme for medical images. Experimental results show that the proposed method provides a higher pure payload without introducing noticeable distortion.

Keywords: Reversible data hiding, Lossless data embedding, Information hiding, Medical image authentication

1. Introduction

Data hiding embeds secret data in a cover image; the result is the so-called stego-image. Recently, reversible data hiding has attracted much attention. It is also called lossless data embedding [1,2], because the original image can be restored after the secret message is extracted. More and more medical images are now stored in digital form. At the same time, powerful image-processing software packages such as Photoshop allow anyone to modify such digital media easily and to create forgeries that are hard to notice. Preventing a medical image from being maliciously altered, that is, detecting the tampered parts, has therefore become an important issue. Image authentication schemes are the most widely used method for safeguarding digital images. The authentication codes are usually derived from the prominent features of the medical image and are embedded directly into the image. However, the embedding procedure distorts the image, and this distortion may make the modified medical image unusable for further diagnosis. The method must therefore be able to restore the original content after the authentication codes are extracted. Developing a reversible data-hiding scheme for medical images is thus an important challenge.

* Corresponding author: Shu-Chien Huang. Tel: +886-8-7226141 ext. 33557; Fax: +886-8-7215034; E-mail: [email protected]

In the past decade, many reversible data embedding methods have been proposed. Honsinger et al. [1] were the first to propose the concept. Their scheme was originally designed for lossless authentication, but it suffers seriously from disturbing salt-and-pepper noise. Ni et al. [3] proposed a reversible data-hiding algorithm based on shifting the image histogram. The maximum point of the histogram is selected to embed a message: the pixel value at the maximum point is altered by 1 if the message bit is '1' and left unchanged if it is '0'. Their idea is very simple and causes only a slight distortion with low complexity, but its hiding capacity is limited. Tseng et al. [4] also proposed a reversible hiding scheme based on image histograms. In their method, the peak point in the histogram remains unchanged so that the hidden data can be retrieved without additional side information; the secret data are embedded in the pixels neighboring the peak. Lin et al. [5] proposed a multilevel reversible data-hiding scheme based on modifying the difference histogram, using its peak point to hide messages. Their experimental results confirm that the scheme provides a higher hiding capacity while keeping distortion low. The difference expansion technique has also been employed for hiding data reversibly [6,7]. The differences between two pixels are expanded to embed messages, exploiting the redundancy in the digital content to achieve reversibility. However, a significant part of the embedding space is consumed by a large location map that indicates whether a pixel pair carries an embedded message. Lee et al. [8] proposed an adaptive lossless steganographic scheme based on centralized difference expansion. In their scheme, the original cover image is partitioned into a series of non-overlapping blocks, and the payload of each block depends on its block size and the image complexity. Because schemes based on difference expansion tend to damage the image quality seriously in edge areas, the peak signal-to-noise ratio (PSNR) of the stego-image is always less than 40 dB.

In addition, some methods for reversibly embedding secret data into binary images and palette-based images have been proposed. For example, Tsai et al. [9] introduced a reversible data-hiding mechanism based on pairwise logical computation to embed a message in a binary image. Pan et al. [10] proposed a reversible data-hiding method for error-diffused halftone images that employs statistical features of pixel block patterns to embed data. Lee and Wu [11] proposed a reversible data-hiding method based on an iterative approach for palette-based images.

In general, embedding capacity and stego-image quality are the major criteria used to evaluate reversible data embedding. Embedding capacity is the amount of secret information that can be embedded in an image. The quality of a stego-image is measured by the PSNR; a higher PSNR value indicates less distortion of the cover image.
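The paper does not spell out its PSNR formula. The following is a minimal sketch assuming the usual definition for 8-bit grayscale images, PSNR = 10 log10(255^2 / MSE); the function name `psnr` and the list-of-rows image representation are illustrative choices, not part of the original work.

```python
from math import log10

def psnr(cover, stego, peak=255):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    squared_error = 0
    count = 0
    for row_c, row_s in zip(cover, stego):
        for c, s in zip(row_c, row_s):
            squared_error += (c - s) ** 2
            count += 1
    if squared_error == 0:
        return float("inf")          # identical images
    mse = squared_error / count      # mean squared error
    return 10 * log10(peak ** 2 / mse)
```

Under this definition, an image in which every pixel differs by exactly 1 has an MSE of 1 and a PSNR of about 48.13 dB, which gives a sense of scale for the values around 48 to 50 dB reported later.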

2. Methods

The proposed method consists of preprocessing, embedding, extraction, and recovery procedures.

2.1 Preprocessing

The preprocessing step is applied to the original image except for the first and last columns. Its goal is to eliminate the pixels with a gray level of 0 or 255. First, the numbers of pixels with gray levels 0 to T are counted and denoted by h(0), …, h(T), and the minimum is found: h(T1) = min{h(0), …, h(T)}. Second, the numbers of pixels with gray levels 255 down to 255-T are counted and denoted by h(255), …, h(255-T), and the minimum is found: h(T2) = min{h(255), …, h(255-T)}. If T1 is 0, all pixel values of 0 are shifted to 1; otherwise, all pixel values from 0 to T1-1 are shifted toward T1 by 1. If T2 is 255, all pixel values of 255 are shifted to 254; otherwise, all pixel values from 255 down to T2+1 are shifted toward T2 by 1. The positions of the pixels with a value of T1 and of the pixels with a value of T2 are recorded as the overhead information O1. At the same time, the first 80 least significant bits (LSBs) of the pixels in the last column are concatenated as the overhead information O2. The value of T is set to 20 in our method.

After the preprocessing step, the cover image is obtained. Except for the first and last columns, the cover image contains no pixel with a gray level of 0 or 255. The overhead information O1 and O2 and the secret message SM are concatenated as the binary string EP for embedding.

2.2 The embedding algorithm

For more security, cryptographic techniques can be applied to encrypt the secret message prior to embedding. The embedding algorithm is as follows:

Step 1: The even rows of the cover image are scanned in raster-scan order, except for the pixels of the first and last columns, as shown in Fig. 1. Figure 2 shows the current pixel and its three neighboring pixels, which are used to predict the current pixel. The predictive value P'(x,y) of the current pixel P(x,y) is computed using Eq. (1), and the prediction error d is calculated using Eq. (2).

$$P'(x,y) = \begin{cases} \lfloor (P(x,y-1) + P(x+1,y))/2 \rfloor, & \text{if } P(x,y) \text{ is located in the first row} \\ \lfloor (P(x,y-1) + P(x-1,y))/2 \rfloor, & \text{if } P(x,y) \text{ is located in the last row} \\ \lfloor (P(x,y-1) + P(x-1,y) + P(x+1,y))/3 \rfloor, & \text{otherwise} \end{cases} \tag{1}$$

$$d = P(x,y) - P'(x,y) \tag{2}$$
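The prediction of Eq. (1) and the prediction error of Eq. (2) can be sketched as follows. This is an illustrative reading, not the authors' MATLAB code: the image is a list of rows, x is taken as the row index and y as the column index, rows are numbered from 0, and the hypothetical `parity` argument selects which rows are scanned (the paper's figures, not reproduced here, define which rows count as even or odd). At least two rows are assumed.

```python
def predict(img, x, y):
    """Eq. (1): floor of the mean of the left neighbour and the vertical neighbours."""
    left = img[x][y - 1]
    if x == 0:                       # first row: no pixel above
        return (left + img[x + 1][y]) // 2
    if x == len(img) - 1:            # last row: no pixel below
        return (left + img[x - 1][y]) // 2
    return (left + img[x - 1][y] + img[x + 1][y]) // 3

def prediction_errors(img, parity):
    """Eq. (2) for every scanned pixel in rows parity, parity + 2, ..."""
    width = len(img[0])
    return [img[x][y] - predict(img, x, y)
            for x in range(parity, len(img), 2)
            for y in range(1, width - 1)]   # first and last columns are skipped
```

For the 3 × 4 image [[10, 10, 10, 10], [10, 12, 11, 10], [10, 10, 10, 10]], scanning row 1 gives the errors [2, 1], and scanning rows 0 and 2 gives [-1, 0, -1, 0].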

Figure 1. The even row of the cover image is scanned in the raster-scan order except for the pixels of the first and last column.

Figure 2. The current pixel CP and its three neighboring pixels.

Step 2: The histogram h1 of the prediction errors is generated, and its two peaks M1 and M2, with M1 < M2, are found. A bit b is embedded whenever d equals M1 or M2. The current pixel P(x,y) is kept unchanged or modified as follows:

If d < M1 { P(x,y) = P(x,y)-1; }
If d > M2 { P(x,y) = P(x,y)+1; }
If d == M1 { P(x,y) = P(x,y)-b; }
If d == M2 { P(x,y) = P(x,y)+b; }

The histogram modification is illustrated in Fig. 3. Figure 3(a) shows the histogram of the prediction error d, with the two peaks M1 = -1 and M2 = 1. Consider a pixel of the cover image with gray-level value p. If d < M1 (or d > M2), its gray-level value is modified to p - 1 (or p + 1), and no secret data are embedded; the resulting histogram is shown in Fig. 3(b). If d == M1 (or d == M2), its gray-level value remains p when the embedded bit b is 0 and is modified to p - 1 (or p + 1) when b is 1. An example of the modified histogram is shown in Fig. 3(c).
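The peak search and the per-pixel rule of Step 2 can be sketched as below. The helper names `two_peaks` and `embed_pixel` are hypothetical, and taking the two most frequent error values as the peaks (with ties broken by first occurrence) is an assumption, since the paper does not say how ties are resolved. The list of errors is assumed to contain at least two distinct values.

```python
from collections import Counter

def two_peaks(errors):
    """Return the two most frequent prediction errors as (M1, M2) with M1 < M2."""
    (a, _), (b, _) = Counter(errors).most_common(2)
    return min(a, b), max(a, b)

def embed_pixel(p, d, m1, m2, bit):
    """Apply the Step 2 rule to one pixel; returns (new value, bit_was_used)."""
    if d < m1:
        return p - 1, False      # shifted left, carries no data
    if d > m2:
        return p + 1, False      # shifted right, carries no data
    if d == m1:
        return p - bit, True     # peak M1 carries one bit
    if d == m2:
        return p + bit, True     # peak M2 carries one bit
    return p, False              # M1 < d < M2: left untouched
```

With these rules, the number of bits one pass can carry equals the number of scanned pixels whose error is M1 or M2, which is the quantity h1(M1) + h1(M2) used later in the capacity formula.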


Figure 3. (a) The histogram of the prediction error; (b) and (c) the modified histogram.

Step 3: The odd rows of the cover image are scanned in raster-scan order, except for the pixels of the first and last columns, as shown in Fig. 4. The predictive value P'(x,y) of the current pixel P(x,y) is computed using Eq. (1), and the prediction error d is calculated using Eq. (2).

Figure 4. The odd row of the cover image is scanned in the raster-scan order except for the pixels of the first and last column.

Step 4: The histogram h2 of the prediction errors is generated, and its two peaks M3 and M4, with M3 < M4, are found. A bit b is embedded whenever d equals M3 or M4. The current pixel P(x,y) is kept unchanged or modified as follows:

If d < M3 { P(x,y) = P(x,y)-1; }
If d > M4 { P(x,y) = P(x,y)+1; }
If d == M3 { P(x,y) = P(x,y)-b; }
If d == M4 { P(x,y) = P(x,y)+b; }

Step 5: The values of T1, h(T1), T2, h(T2), M1, M2, M3, and M4 are converted to a binary string. Each of T1, T2, M1, M2, M3, and M4 is represented by 8 bits, and each of h(T1) and h(T2) by 16 bits, so the length of the binary string is 6 × 8 + 2 × 16 = 80 bits. The first 80 LSBs of the pixels in the last column are replaced by this binary string. Finally, the stego-image is generated.

2.3 The extraction and recovery algorithm

Step 1: The first 80 LSBs of the pixels in the last column are extracted to determine the values of T1, h(T1), T2, h(T2), M1, M2, M3, and M4.

Step 2: The odd rows of the stego-image are scanned in raster-scan order, except for the pixels of the first and last columns, as shown in Fig. 4. The predictive value P'(x,y) of the current pixel P(x,y) is computed using Eq. (1), and the prediction error d is calculated using Eq. (2). The secret bit b is extracted, and the current pixel P(x,y) is modified or kept unchanged as follows:

If (d == M3) or (d == M4) { b = 0; }
If (d == M3-1) or (d == M4+1) { b = 1; }
If d < M3 { P(x,y) = P(x,y)+1; }
If d > M4 { P(x,y) = P(x,y)-1; }

In this step, the extracted secret bits are concatenated as the binary string EP2.

Step 3: The even rows of the resulting image are scanned in raster-scan order, except for the pixels of the first and last columns, as shown in Fig. 1. The predictive value P'(x,y) of the current pixel P(x,y) is computed using Eq. (1), and the prediction error d is calculated using Eq. (2). The secret bit b is extracted, and the current pixel P(x,y) is modified or kept unchanged as follows:

If (d == M1) or (d == M2) { b = 0; }
If (d == M1-1) or (d == M2+1) { b = 1; }
If d < M1 { P(x,y) = P(x,y)+1; }
If d > M2 { P(x,y) = P(x,y)-1; }

In this step, the extracted secret bits are concatenated as the binary string EP1.

Step 4: The binary strings EP1 and EP2 are concatenated as EP. The overhead information O1 and O2 and the secret message SM are extracted from EP. According to the overhead information O1, the preprocessing shifts are undone, which restores the pixels with a gray level of 0 or 255. According to the overhead information O2, the first 80 LSBs of the last column are restored. Finally, the original image is obtained.
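The reversibility of one embedding pass and its matching extraction pass can be checked with a short sketch. It makes several assumptions that the paper leaves open: rows are indexed from 0 and selected by a hypothetical `parity` argument, all prediction errors of a pass are computed from the image as it was before that pass (the histogram is built before any pixel is modified), and unused capacity is padded with 0 bits. Preprocessing and the 80-bit header are omitted.

```python
def predict(img, x, y):
    """Eq. (1) with x as row index and y as column index."""
    left = img[x][y - 1]
    if x == 0:
        return (left + img[x + 1][y]) // 2
    if x == len(img) - 1:
        return (left + img[x - 1][y]) // 2
    return (left + img[x - 1][y] + img[x + 1][y]) // 3

def scan(img, parity):
    """Raster-scan coordinates of one row parity, skipping the border columns."""
    return [(x, y) for x in range(parity, len(img), 2)
            for y in range(1, len(img[0]) - 1)]

def embed_pass(img, bits, parity, m1, m2):
    """Embed bits into rows of one parity; img itself is never modified."""
    out = [row[:] for row in img]
    bits = iter(bits)
    for x, y in scan(img, parity):
        d = img[x][y] - predict(img, x, y)   # error from the pre-pass image
        if d < m1:
            out[x][y] -= 1
        elif d > m2:
            out[x][y] += 1
        elif d == m1:
            out[x][y] -= next(bits, 0)       # assumption: pad with 0 bits
        elif d == m2:
            out[x][y] += next(bits, 0)
    return out

def extract_pass(stego, parity, m1, m2):
    """Recover the bits and the pre-pass image for rows of one parity."""
    img = [row[:] for row in stego]
    bits = []
    for x, y in scan(img, parity):
        d = img[x][y] - predict(img, x, y)   # left neighbour is already restored
        if d in (m1, m2):
            bits.append(0)
        elif d in (m1 - 1, m2 + 1):
            bits.append(1)
        if d < m1:
            img[x][y] += 1
        elif d > m2:
            img[x][y] -= 1
    return bits, img
```

For the cover [[10, 10, 10, 10], [10, 12, 11, 10], [10, 10, 10, 10]] with parity 1, peaks (0, 1), and the single bit 1, row 1 of the stego-image becomes [10, 13, 12, 10]; `extract_pass` then returns the bit list [1] and the original cover. Because extraction restores each pixel before moving right, the left neighbour used in the prediction matches the one seen during embedding, which is why the two passes must be undone in the reverse order of embedding.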

Figure 5. (a) medical image 1; (b) original image histogram h; (c) the histogram h1; (d) the histogram h2; (e) stego-image; (f) the difference image.

Figure 6. (a) medical image 2; (b) original image histogram h; (c) the histogram h1; (d) the histogram h2; (e) stego-image; (f) the difference image.

3. Results and discussion

The proposed algorithm was implemented in MATLAB on a Pentium IV PC. Four grayscale medical images, of sizes 399 × 405, 424 × 508, 360 × 480, and 417 × 450, were used in the experiments; they are shown in Figs. 5(a)-8(a) and are CT, ultrasound, X-ray, and MR images. Ultrasound has been recognized as the most often used imaging tool in clinical environments [12,13]. Conventional two-dimensional ultrasound imaging has limitations in three-dimensional structural analysis, especially for volume quantification and tissue localization, which are necessary for assessing the progression of disease and uncovering the properties of human tissues. The applications of X-ray imaging in orthopaedic surgery are pervasive, both pre-operatively and intra-operatively. Pre-operative planning based on information about the patient anatomy provided by conventional X-ray imaging has been established for several years [14]. CT and MR findings of brain aspergillosis have been reported in a number of studies [15]. Because growing numbers of immunocompromised patients undergo extensive medical treatment such as bone marrow transplants (BMT), CT and MR are becoming more and more important in the early diagnosis of brain aspergillosis.

Figure 7. (a) medical image 3; (b) original image histogram h; (c) the histogram h1; (d) the histogram h2; (e) stego-image; (f) the difference image.

Figure 8. (a) medical image 4; (b) original image histogram h; (c) the histogram h1; (d) the histogram h2; (e) stego-image; (f) the difference image.

The histograms of these medical images are shown in Figs. 5(b)-8(b). For the prediction error, the histogram h1 is shown in Figs. 5(c)-8(c) and the histogram h2 in Figs. 5(d)-8(d). The distributions in h1 and h2 are more compact than those of the original image histograms. The stego-images are shown in Figs. 5(e)-8(e). For the original image O and the stego-image S, the value of |O(x,y) - S(x,y)| is 0, 1, or 2. In our experiments, the difference images |125 × (O - S)|, shown in Figs. 5(f)-8(f), visualize the difference between each of the four medical images before and after processing.



Table 1. Test results for the four medical images.

Images           T1   T2  h(T1)  h(T2)  M1  M2  M3  M4  h1(M1)  h1(M2)  h2(M3)  h2(M4)
Medical image 1   2  243     38     22   0   1   0   1  23,156  16,793  24,703  21,883
Medical image 2  19  255    968      0   0   1   0   1  45,314  27,142  41,918  41,085
Medical image 3  14  255    213      0   0   1   0   1  46,476  27,198  42,780  35,883
Medical image 4   3  255      1      0   0   1   0   1  46,493  26,661  41,306  36,920

For the four medical images, the values of T1, T2, h(T1), h(T2), M1, M2, M3, M4, h1(M1), h1(M2), h2(M3), and h2(M4) are listed in Table 1. The embedding capacity (EC) is calculated as follows:

$$EC = h_1(M_1) + h_1(M_2) + h_2(M_3) + h_2(M_4) \tag{3}$$

In our method, the amount of overhead information O1, denoted OV, is calculated as follows:

$$OV = (h(T_1) + h(T_2)) \times (\lceil \log_2 W \rceil + \lceil \log_2 H \rceil) \tag{4}$$

where W and H denote the width and height of the original image, respectively, and $\lceil \log_2 W \rceil$ and $\lceil \log_2 H \rceil$ are the numbers of bits required to represent the X and Y coordinates of a pixel in the image, respectively. The amount of overhead information O2 is 80 bits. Therefore, the pure payload, PP, is obtained by Eq. (5).

$$PP = EC - (h(T_1) + h(T_2)) \times (\lceil \log_2 W \rceil + \lceil \log_2 H \rceil) - 80 \tag{5}$$

Table 2 compares the experimental results with those of Ni et al.'s method [3]. The secret data are a set of randomly generated bits. The pure payload of the proposed method is much higher than that of Ni et al.'s method, by a factor of roughly 7 to 19 on these images. Ni et al.'s method embeds secret data into the cover image by modifying the original image histogram; hence, its hiding capacity is limited. For medical images 1 and 2, the PSNR values of Ni et al.'s method are slightly higher than those of the proposed method, whereas for medical images 3 and 4, the PSNR values of the proposed method are slightly higher. In our experiments, the PSNR values were higher than 48 dB in all cases; that is, it was difficult to distinguish the original image from the stego-image with the naked eye.

Table 2. Comparison between the proposed method and Ni et al.'s method.

                 Ni et al.'s method        Our proposed method
Images           Pure payload  PSNR (dB)   Pure payload  PSNR (dB)
Medical image 1         4,792      49.71         85,375      49.49
Medical image 2        19,287      49.06        137,955      48.71
Medical image 3         7,871      48.50        148,423      49.93
Medical image 4        16,591      49.23        151,282      50.39
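Eqs. (3)-(5) can be checked against the tabulated values with a few lines of code. The sketch below uses a hypothetical helper `pure_payload` and takes the bracketed logarithms as ceilings, the reading under which the Table 1 entries reproduce the pure payloads of the proposed method in Table 2.

```python
from math import ceil, log2

def pure_payload(width, height, h_t1, h_t2, peak_counts, header_bits=80):
    """Pure payload PP of Eq. (5) from the Table 1 quantities."""
    ec = sum(peak_counts)                                       # Eq. (3)
    bits_per_position = ceil(log2(width)) + ceil(log2(height))  # bits per (X, Y)
    ov = (h_t1 + h_t2) * bits_per_position                      # Eq. (4)
    return ec - ov - header_bits                                # Eq. (5)

# Medical image 1: both dimensions need 9 bits, so each recorded position
# costs 18 bits and OV = (38 + 22) * 18 = 1,080 bits.
print(pure_payload(399, 405, 38, 22, [23156, 16793, 24703, 21883]))  # 85375
```

The same call with the Table 1 rows for medical images 2, 3, and 4 yields 137,955, 148,423, and 151,282 bits, respectively. Since both dimensions of every test image need 9 bits, the result does not depend on which of the two listed sizes is taken as the width.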

Table 3 compares the experimental results with those of Lin et al.'s method [5]. The "Lena" image, 512 × 512 in size, is used for testing. In Table 3, "Hiding level" is the number of rounds of the hiding algorithm; for example, "Level-3 hiding" means that the hiding algorithm was performed for 3 rounds. The PSNR values of the proposed method were slightly higher than those of Lin et al.'s method. On average, the pure payload of the proposed method was higher than that of Lin et al.'s method by 16,440 bits. This is because we use three neighboring pixels to predict the current pixel, instead of the one neighboring pixel used in Lin et al.'s method; the distribution in the histogram of the prediction error is therefore more compact, which improves the embedding capacity.

Table 3. Comparison between the proposed method and Lin et al.'s method.

              Lin et al.'s method       Our proposed method
Hiding level  Pure payload  PSNR (dB)   Pure payload  PSNR (dB)
Level-1             65,349      48.67         81,274      48.87
Level-2            115,963      43.02        133,388      43.81
Level-3            158,815      39.64        174,785      40.31
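The average gain of 16,440 bits quoted for the comparison with Lin et al.'s method follows directly from the three pure-payload differences in Table 3, as this short worked calculation shows:

```latex
\frac{(81{,}274 - 65{,}349) + (133{,}388 - 115{,}963) + (174{,}785 - 158{,}815)}{3}
  = \frac{15{,}925 + 17{,}425 + 15{,}970}{3}
  = \frac{49{,}320}{3}
  = 16{,}440
```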

Table 4 shows the experimental results for T = 5, T = 10, T = 20, and T = 40 on the four test images. The pure payload increased or remained the same as the value of T increased. For medical image 2, for example, the value of T1 was 19 for T = 20 (Table 1), and the amounts of overhead information O1 were 53,640, 28,908, 17,424, and 17,424 bits for T = 5, 10, 20, and 40, respectively. Thus, the pure payload (T = 5) < the pure payload (T = 10) < the pure payload (T = 20) = the pure payload (T = 40). In addition, the PSNR decreased slightly or remained the same as the value of T increased. The results for T = 20 and T = 40 were identical in terms of both pure payload and PSNR. Therefore, the value of T is set to 20 in our method.

Table 4. Performance of our proposed method with different values of T applied.

                 T = 5                    T = 10                   T = 20                   T = 40
Images           Pure payload  PSNR (dB)  Pure payload  PSNR (dB)  Pure payload  PSNR (dB)  Pure payload  PSNR (dB)
Medical image 1        84,963      49.49        85,350      49.49        85,375      49.49        85,375      49.49
Medical image 2       101,995      49.11       126,652      48.92       137,955      48.71       137,955      48.71
Medical image 3       133,184      50.65       147,679      49.97       148,423      49.93       148,423      49.93
Medical image 4       151,282      50.39       151,282      50.39       151,282      50.39       151,282      50.39

4. Conclusion

This paper presents a reversible data-hiding scheme for medical images. Our method uses three neighboring pixels to predict the current pixel. Two histograms of the prediction error, h1 and h2, are generated; their distributions are more compact than that of the original image histogram. The proposed method embeds secret data into the cover image by modifying the two histograms h1 and h2 instead of the original image histogram. The experimental results show that the proposed method has the following advantages. First, the stego-images have good visual quality. Second, the proposed method has a higher pure payload than the methods of Ni et al. [3] and Lin et al. [5]. Third, the proposed method is simple and effective.

References

[1] C. W. Honsinger, P. W. Jones, M. Rabbani and J. C. Stoffel, “Lossless recovery of an original image containing embedded data,” U.S. Patent No. 6,278,791, 2001.
[2] J. Fridrich, M. Goljan and R. Du, “Lossless data embedding: new paradigm in digital watermarking,” EURASIP J. Appl. Signal Processing, 2002: 185-196, 2002.
[3] Z. Ni, Y. Q. Shi, N. Ansari and W. Su, “Reversible data hiding,” IEEE Trans. Circ. Syst. Video Technol., 16: 354-362, 2006.
[4] H. W. Tseng and C. P. Hsieh, “Reversible data hiding based on image histogram modification,” Imaging Sci. J., 56: 271-278, 2008.
[5] C. C. Lin, W. L. Tai and C. C. Chang, “Multilevel reversible data hiding based on histogram modification of difference images,” Pattern Recognit., 41: 3582-3591, 2008.
[6] J. Tian, “Reversible data embedding using a difference expansion,” IEEE Trans. Circ. Syst. Video Technol., 13: 890-896, 2003.
[7] A. M. Alattar, “Reversible watermark using the difference expansion of a generalized integer transform,” IEEE Trans. Image Process., 13: 1147-1156, 2004.
[8] C. C. Lee, H. C. Wu, C. S. Tsai and Y. P. Chu, “Adaptive lossless steganographic scheme with centralized difference expansion,” Pattern Recognit., 41: 2097-2106, 2008.
[9] C. L. Tsai, H. F. Chiang, K. C. Fan and C. D. Chung, “Reversible data hiding and lossless reconstruction of binary images using pair-wise logical computation mechanism,” Pattern Recognit., 38: 1993-2006, 2005.
[10] J. S. Pan, H. Luo and Z. M. Lu, “Look-up table reversible data hiding for error diffused halftone images,” Informatica, 18: 615-628, 2007.
[11] J. H. Lee and M. Y. Wu, “Reversible data-hiding method for palette-based images,” Opt. Eng., 47: 1-9, 2008.
[12] Q. Huang, Y. Zheng, M. Lu, T. Wang and S. Chen, “A new adaptive interpolation algorithm for 3D ultrasound imaging with speckle reduction and edge preservation,” Comput. Med. Imaging Graph., 33: 100-110, 2009.
[13] M. H. Horng and S. M. Chen, “Multi-class classification of ultrasonic supraspinatus images based on radial basis function neural network,” J. Med. Biol. Eng., 29: 242-250, 2009.
[14] G. Zheng, S. Gollmer, S. Schumann, X. Dong, T. Feilkas and M. A. G. Ballester, “A 2D/3D correspondence building method for reconstruction of a patient-specific 3D bone surface model using point distribution models and calibrated X-ray images,” Med. Image Anal., 13: 883-899, 2009.
[15] T. Okafuji, H. Yabuuchi, Y. Nagatoshi, Y. Hattanda and T. Fukuya, “CT and MR findings of brain aspergillosis,” Comput. Med. Imaging Graph., 27: 489-492, 2003.