RNS Application for Digital Image Processing
Wei Wang*, M.N.S. Swamy**, Fellow, IEEE, and M.O. Ahmad**, Fellow, IEEE
* Department of Electrical & Computer Engineering, The University of Western Ontario, London, Ontario, Canada N6G 1H1
** Department of Electrical & Computer Engineering, Concordia University, Montreal, Quebec, Canada H3G 1M8
ABSTRACT
In this paper, we study the application of the residue number system (RNS) to digital image processing and propose an RNS image coding scheme that offers a high-speed and low-power VLSI implementation for secure image processing. The proposed scheme is more efficient than the RNS image coding scheme of [3] in that it encrypts the entire image and requires no component other than a standard RNS system. Further, the proposed scheme is based on the modified Chinese remainder theorem (CRT) and its associated residue-to-binary conversion and moduli selection methods, and is more efficient than the scheme in [3] in terms of VLSI implementation. The design of an encoder and decoder pair for greyscale images is carried out using Matlab and VLSI design tools. The preliminary results of the Matlab simulation demonstrate the security capability of the proposed image coding scheme.
1. INTRODUCTION
Digital image processing has been extensively used in desktop publishing, medical imaging, military target analysis, manufacturing automation control, machine vision, geophysical imaging, graphic arts and multimedia [1]. Most of these applications depend on the availability of compact and inexpensive hardware that delivers the required high performance, so that very large scale integrated (VLSI) technology is of vital importance for digital image processing [2]. The VLSI implementation of digital image processing systems requires high-speed and low-power techniques, as well as a certain measure of security during transmission. One method of designing high-speed and low-power VLSI digital systems is to use the residue number system (RNS) [3]. The RNS has no carry chain between its channels and offers high-speed operations [4], [5]. The speed gained from the RNS parallelism can then be traded off for low power consumption [6], [7]. Further, the RNS can offer low- and medium-level security without introducing the overhead of general encryption systems [3]. Thus, RNS-based structures are promising for the VLSI implementation of digital image processing systems.
In this paper, we study the application of the RNS to digital image processing. Based on a modified version of the Chinese remainder theorem and on efficient residue-to-binary (R/B) conversion and moduli selection methods [8], [9], we propose an RNS image coding scheme. The proposed scheme offers security for digital image processing and is expected to yield a high-speed and low-power VLSI implementation.
2. BACKGROUND MATERIAL
Digital image processing covers many topics, including digital image transforms, filtering and enhancement, compression, edge detection, image segmentation, and shape description. These tasks involve the development and implementation of many signal processing algorithms. The VLSI implementation of digital image processing systems requires high-speed and low-power techniques. Further, secure transmission of the digital image over a computer network needs to be considered in some applications [3]. One method of achieving a secure, high-speed and low-power VLSI implementation for digital image processing is to use the residue number system.

A residue number system is defined in terms of a relatively prime moduli set $\{P_1, P_2, \ldots, P_m\}$, that is, $\mathrm{GCD}(P_i, P_j) = 1$ for $i \neq j$. A binary number $X$ can be represented as

\[ X = (x_1, x_2, \ldots, x_m), \]

where

\[ x_i = |X|_{P_i}, \qquad 0 \le x_i < P_i , \]

and $|X|_{P_i}$ denotes $X$ modulo $P_i$. Such a representation is unique for any integer $X \in [0, M-1]$, where $M = P_1 P_2 \cdots P_m$ is the dynamic range of the moduli set $\{P_1, P_2, \ldots, P_m\}$. The RNS is a carry-free system for addition, subtraction, and multiplication. Hence, a binary system with a large dynamic range can be partitioned into several small-wordlength channels that operate in parallel, which results in parallel, high-speed operation. In order to convert numbers from binary to residue form and vice versa, a binary-to-residue (B/R) converter is required at the front end of the system and a residue-to-binary (R/B) converter at the back end. The B/R converter consists of several modulo adders, whereas the R/B converter involves many modulo operations. Thus, the R/B converter is a crucial part of the RNS system. To perform the R/B conversion, that is, to convert the residue number $(x_1, x_2, \ldots, x_m)$ into the binary number $X$, the Chinese remainder theorem (CRT) is generally used [4], [5].

Proceedings of the 4th IEEE International Workshop on System-on-Chip for Real-Time Applications (IWSOC’04) 0-7695-2182-7/04 $ 20.00 IEEE

Recently, a software-based RNS image coding scheme has been introduced [3]. It has been shown in [3] that adding some RNS numbers in the encoding of a digital image can encrypt the image. Only a decoder whose R/B converter is designed with the correct moduli set can recover the RNS numbers back to the corresponding binary numbers using the CRT. As shown in Fig. 1, this encoder/decoder pair offers a certain measure of security during transmission. However, the approach in [3] encrypts only part of the image and requires several extra units beyond a standard RNS system. Further, the study in [3] is not detailed, and the scheme is not efficient for VLSI implementation.
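The RNS representation and its carry-free arithmetic described in this section can be illustrated with a minimal Python sketch. The function names (`to_residues`, `rns_add`, `rns_mul`) and the sample values are chosen here for illustration only and are not part of the original design; the moduli set {7, 8, 9} is the one used later in the paper for greyscale data.

```python
from math import gcd, prod

MODULI = (7, 8, 9)  # pairwise relatively prime; dynamic range M = 504


def to_residues(x, moduli=MODULI):
    """Binary-to-residue (B/R) conversion: x -> (x mod P1, ..., x mod Pm)."""
    return tuple(x % p for p in moduli)


def rns_add(a, b, moduli=MODULI):
    """Channel-wise modular addition; no carry crosses between channels."""
    return tuple((ai + bi) % p for ai, bi, p in zip(a, b, moduli))


def rns_mul(a, b, moduli=MODULI):
    """Channel-wise modular multiplication."""
    return tuple((ai * bi) % p for ai, bi, p in zip(a, b, moduli))


# The moduli must be pairwise relatively prime.
assert all(gcd(p, q) == 1 for i, p in enumerate(MODULI) for q in MODULI[i + 1:])

M = prod(MODULI)          # 504
x, y = 200, 2             # x + y and x * y both stay inside [0, M - 1]
print(to_residues(x))     # (4, 0, 2)
print(rns_add(to_residues(x), to_residues(y)) == to_residues(x + y))  # True
print(rns_mul(to_residues(x), to_residues(y)) == to_residues(x * y))  # True
```

Each channel works only on its own small residue, which is why the channels can run in parallel; the results remain meaningful as long as the true sum or product stays within the dynamic range M.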
3. PROPOSED IMAGE CODING SCHEME
In order to achieve a high-speed and low-power VLSI implementation for secure digital image processing, we propose a new RNS image coding scheme.
3.1 Proposed scheme
The proposed scheme consists of an encoder and decoder pair that is simpler than that of [3]. For a moduli set $\{P_1, P_2, \ldots, P_m\}$, the encoder includes m B/R converters and m RNS image processors of small wordlength. The small-wordlength parallel outputs of these RNS image processors are arranged into an encrypted bitstream in a certain order. The decoder is an R/B converter that recovers the processed image data from the encrypted bitstream, read in the corresponding order.
3.2 High-speed and low-power design
Compared to binary image processors, the proposed RNS image processors can achieve a high-speed and low-power VLSI implementation for many image processing operations, such as digital image transforms and filtering. The proposed scheme is based on a modified version of the Chinese remainder theorem and its associated R/B conversion and moduli selection methods [8], [9].
Fig. 1. RNS image encoder and decoder system [3]. (Block diagram; block labels: Demux, RNS image processor, Binary image processor, Mux or B/R, Encrypted bitstream, R/B or pass.)
Fig. 2. Proposed RNS image encoder and decoder system. (Block diagram; block labels: B/R, RNS image processor, Encrypted bitstream, R/B.)

Chinese Remainder Theorem: Given the moduli set $\{P_1, P_2, \ldots, P_m\}$ and the dynamic range $M = P_1 P_2 \cdots P_m$, the residue number $(x_1, x_2, \ldots, x_m)$ is converted into the binary number $X$ by

\[ X = \left| \sum_{i=1}^{m} N_i \left| N_i^{-1} \right|_{P_i} x_i \right|_{M}, \qquad (1) \]

where $m > 1$, $N_i = M / P_i$, and $|N_i^{-1}|_{P_i}$ is the multiplicative inverse of $N_i$ modulo $P_i$, defined by $\left| \, |N_i^{-1}|_{P_i} N_i \right|_{P_i} = 1$.

The CRT requires a (large-valued) modulo-$M$ operation and is therefore not efficient to implement. Thus, we apply the following modified form of the CRT, which reduces the modulo operation from modulo $M = P_1 P_2 \cdots P_m$ to modulo $P_2 \cdots P_m$ [8], [9].

Theorem 1: Given the moduli set $\{P_1, P_2, \ldots, P_m\}$, the residue number $(x_1, x_2, \ldots, x_m)$ is converted into the binary number $X$ by

\[ X = x_1 + P_1 \left| \sum_{i=1}^{m} w_i x'_i \right|_{P_2 \cdots P_m}, \qquad (2) \]

where $m > 1$,

\[ w_1 = \frac{N_1 \left| N_1^{-1} \right|_{P_1} - 1}{P_1}, \qquad w_i = \frac{N_i}{P_1} \ \text{ for } i = 2, 3, \ldots, m, \]

\[ x'_1 = x_1, \qquad x'_i = \left| \, |N_i^{-1}|_{P_i} x_i \right|_{P_i} \ \text{ for } i = 2, 3, \ldots, m. \]

Since $P_1$ is generally chosen as $2^n$, the last step in (2), that is, the multiplication by $P_1$ and the binary addition of $x_1$, can be reduced to a simple concatenation. Based on the new formulation, we have proposed efficient residue-to-binary conversion algorithms and designed several efficient R/B converters [8], [9]. Thus, the proposed image decoder (R/B converter) can be designed very efficiently using VLSI technology.
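As a software check of the conventional CRT of (1) against the modified form of Theorem 1, the following Python sketch implements both and compares them over the whole dynamic range. It is an illustration under stated assumptions, not the VLSI converter of [8], [9]: the function names are hypothetical, the moduli are ordered as (8, 7, 9) so that $P_1 = 2^3$, and the modular inverse uses Python's built-in three-argument `pow` with exponent -1 (Python 3.8 or later).

```python
from math import prod


def crt(residues, moduli):
    """Conventional CRT, Eq. (1): one reduction modulo M = P1*...*Pm."""
    M = prod(moduli)
    total = 0
    for x_i, p_i in zip(residues, moduli):
        n_i = M // p_i
        inv = pow(n_i, -1, p_i)            # multiplicative inverse of N_i modulo P_i
        total += n_i * inv * x_i
    return total % M


def modified_crt(residues, moduli):
    """Modified CRT, Eq. (2): the only large reduction is modulo P2*...*Pm."""
    M = prod(moduli)
    p1 = moduli[0]
    rest = M // p1                          # P2 * ... * Pm, which is also N_1
    w1 = (rest * pow(rest, -1, p1) - 1) // p1   # exact, since N_1 * inverse = 1 mod P1
    acc = w1 * residues[0]                  # x'_1 = x_1
    for x_i, p_i in zip(residues[1:], moduli[1:]):
        n_i = M // p_i
        x_prime = (pow(n_i, -1, p_i) * x_i) % p_i
        acc += (n_i // p1) * x_prime        # w_i = N_i / P1
    return residues[0] + p1 * (acc % rest)


MODULI = (8, 7, 9)                          # P1 = 2**3 is placed first
for x in range(prod(MODULI)):
    r = tuple(x % p for p in MODULI)
    y = (modified_crt(r, MODULI) - r[0]) // MODULI[0]
    assert crt(r, MODULI) == x == modified_crt(r, MODULI)
    assert x == (y << 3) | r[0]             # with P1 = 2**3 the last step is a concatenation
```

The final assertion shows why choosing $P_1 = 2^n$ is convenient: the result is the reduced sum shifted left by n bits with $x_1$ placed in the low-order bits, so no adder or multiplier is needed for that step.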
The new formulation of the CRT can also be used to resolve the important issue of moduli selection in RNS design. To represent binary numbers in a certain range, there are many different choices of moduli sets. Moduli selection is important because the dynamic range, the speed, and the VLSI implementation of an RNS system depend on the form as well as the number of the moduli chosen. We have proposed a moduli selection scheme that determines the low-cost moduli sets for different dynamic ranges. We show that for a medium dynamic range (less than 22 bits), the three-moduli set $\{2^n, 2^n + 1, 2^n - 1\}$ is the most efficient one, whereas for large dynamic ranges (equal to or larger than 22 bits), the general moduli set of the form $\{2^n, 2^n + 1, 2^n - 1, 2^n \pm 1, \ldots, 2^n \pm 1\}$, with a length greater than three, is the most efficient one. These sets consist only of low-cost moduli of the form $2^n$, $2^n - 1$ and $2^n + 1$, which allow simplified modulo adders and multipliers [10]. Also, these sets can be used to obtain very efficient R/B converters. Thus, in the proposed encoder, the moduli set is selected according to this method. For example, for greyscale image data in the range [0, 255], we can choose the moduli set {7, 8, 9} (n = 3, M = 504) for the encoder design. Thus, the encoder consists of three channels of RNS image processors, each at most 4 bits wide (3 bits for the moduli 7 and 8, and 4 bits for the modulus 9, whose residues range from 0 to 8). The B/R converters and the three RNS image processors are based on the three low-cost moduli 7, 8 and 9, and can be efficiently implemented using VLSI technology. Therefore, based on the modified version of the CRT, the proposed image encoder and decoder can be designed very efficiently and can offer a high-speed and low-power VLSI implementation.
3.3 Security
The outputs of the RNS image encoder in the proposed scheme (Fig. 2) are of small wordlength and are arranged into an encrypted bitstream in a certain order. An unauthorized listener knows neither the moduli set nor the order of these parallel outputs. Only a receiver with the correct R/B converter can decode this bitstream back to the processed image data according to the order of the bitstream. Thus, a certain measure of security can be achieved by the proposed structure. This security is suitable only for systems requiring low- and medium-level security. Since the proposed structure is actually a standard RNS system, in which the B/R converters and RNS subsystems constitute the encoder and the R/B converter is the decoder, the proposed scheme does not require additional components to achieve the security. Thus, the proposed structure is very attractive from the point of view of VLSI implementation.
3.4 Comparison
Compared to the RNS image coding scheme in [3], the proposed scheme is more efficient for VLSI implementation. As shown in Table I, the proposed scheme encrypts the entire image, while the approach in [3] encrypts only part of the image. The proposed scheme does not require any additional component, while that of [3] requires several extra units in addition to a standard RNS and a binary processor. Further, the proposed scheme is based on the modified CRT and on the efficient R/B converter design and moduli selection method, which makes it more efficient than the scheme in [3], which is based on the conventional CRT.
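The encoder and decoder pair of Section 3.1 and the ordering-based security of Section 3.3 can be sketched in a few lines of Python. This is a simplified illustration under several assumptions that are not in the paper: the RNS image processors are taken to be the identity (no filtering), the "bitstream" is a flat list of residues rather than packed bits, and the names `encode`, `decode` and the secret `order` tuple are hypothetical.

```python
from math import prod


def encode(pixels, moduli, order):
    """B/R-convert each pixel and emit its residues in the secret channel order."""
    stream = []
    for x in pixels:
        residues = [x % p for p in moduli]
        stream.extend(residues[k] for k in order)
    return stream


def decode(stream, moduli, order):
    """R/B-convert each group of residues, assuming the given channel order."""
    m, M = len(moduli), prod(moduli)
    pixels = []
    for start in range(0, len(stream), m):
        x = 0
        for r, k in zip(stream[start:start + m], order):
            p = moduli[k]
            n = M // p
            x += n * pow(n, -1, p) * r      # CRT term for channel k
        pixels.append(x % M)
    return pixels


MODULI = (7, 8, 9)
SECRET_ORDER = (2, 0, 1)                    # channel order shared by encoder and decoder
pixels = [0, 17, 128, 255]

stream = encode(pixels, MODULI, SECRET_ORDER)
print(decode(stream, MODULI, SECRET_ORDER) == pixels)   # True: correct order recovers the data
print(decode(stream, MODULI, (0, 1, 2)) == pixels)      # False: wrong order yields other values
```

A receiver that assumes a different channel order, or a different moduli set, reconstructs values unrelated to the original pixels, which is the effect the proposed structure relies on.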
4. IMPLEMENTATION STRATEGY
The design flow of the VLSI implementation of the proposed image coding scheme is as follows. First, for the given image and image processing operation, we choose an efficient low-cost moduli set and use Matlab to design the coding scheme at the algorithmic level. Then, all the building blocks and the architectures of the encoder and decoder from the Matlab design are coded in VHDL. Next, the code is simulated at the register transfer level (RTL) to verify the correctness of the designs. Logic synthesis is then carried out to optimize the designs, and a gate-level simulation is performed. Finally, placement and routing are carried out automatically to generate the layout. A performance evaluation in terms of area, delay and power consumption is carried out at the layout level. The software packages to be used include the Cadence tools, Synopsys V3.4b, the CMC Generic Environment for Cadence, the CMC CMOS35 Design Kit for Cadence and the CMC CMOS35 Design Kit for Synopsys. Following the above design flow, we start with the design of an encoder and decoder pair for greyscale images. The VLSI implementation and its results will be reported in future papers. Here, we show the results of the Matlab simulation. The image processing operation is a 2-D Gaussian filter with sigma = 3. We choose a noisy “lena” image as the input to the encoder, as shown in Fig. 3. The output of the decoder is the “lena” image after noise cancellation, as shown in Fig. 4. The encrypted bitstream is expected to be an unreadable image.
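The kind of simulation described above, filtering in parallel residue channels followed by R/B decoding, can be sketched in Python as follows. This is not the authors' Matlab code. It rests on assumptions made here for illustration: a 3x3 integer binomial kernel stands in for the sigma = 3 Gaussian filter, only the valid (interior) region is computed, and the moduli set (32, 31, 33) is used instead of {7, 8, 9} because the weighted sum can reach 255 x 16 = 4080, which must stay below the dynamic range M.

```python
from math import prod

MODULI = (32, 31, 33)                        # assumed set {2^n, 2^n - 1, 2^n + 1}, n = 5
KERNEL = ((1, 2, 1), (2, 4, 2), (1, 2, 1))   # integer binomial kernel, weights sum to 16


def filter_channel(img, p):
    """Valid-region 3x3 weighted sum of one residue channel, all arithmetic modulo p."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(1, h - 1):
        row = []
        for c in range(1, w - 1):
            s = 0
            for dr in range(3):
                for dc in range(3):
                    # img[...] % p is the B/R conversion of the pixel for this channel
                    s += KERNEL[dr][dc] * (img[r + dr - 1][c + dc - 1] % p)
            row.append(s % p)
        out.append(row)
    return out


def crt(residues, moduli):
    """R/B conversion by the conventional CRT."""
    M = prod(moduli)
    return sum((M // p) * pow(M // p, -1, p) * x for x, p in zip(residues, moduli)) % M


def rns_smooth(img):
    """Filter each residue channel independently, decode, then normalize by 16."""
    channels = [filter_channel(img, p) for p in MODULI]
    h, w = len(channels[0]), len(channels[0][0])
    return [[crt([ch[r][c] for ch in channels], MODULI) // 16 for c in range(w)]
            for r in range(h)]


img = [[10, 20, 30],
       [40, 50, 60],
       [70, 80, 255]]
print(rns_smooth(img))   # [[60]], since the weighted sum is 965 and 965 // 16 = 60
```

The normalization by the kernel sum is done after decoding because division is not a carry-free RNS operation; the channel processors themselves only add and multiply modulo their own modulus.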
TABLE I
COMPARISON OF THE RNS IMAGE CODING SCHEMES

                      Scheme in [3]                         Proposed scheme
Encryption range      Part of the picture                   Whole picture
Coding component      Several in both encoder and decoder   -
VLSI implementation   Not suitable                          Efficient, high-speed, low-power
RNS algorithm         CRT                                   Modified CRT [8],[9]
Fig. 3. Original noisy image “lena” (input to the encoder)
Fig. 4. Noise-cancelled image “lena” (output of the decoder)
5. CONCLUSION
In this paper, we have focused on the application of the RNS to digital image processing and proposed an RNS image coding scheme that offers a high-speed and low-power VLSI implementation for secure image processing. The proposed scheme is more efficient than the RNS image coding scheme of [3] in that it encrypts the entire image and requires no component other than a standard RNS system. Further, the proposed scheme is based on the modified CRT and its associated R/B conversion and moduli selection methods, and is more efficient than the scheme of [3] in terms of VLSI implementation. The design of an encoder and decoder pair for greyscale images has been carried out using Matlab and VLSI design tools. The preliminary results of the Matlab simulation have demonstrated the security capability of the proposed image coding scheme.
REFERENCES
[1] K. Konstantinides and V. Bhaskaran, “Monolithic architectures for image processing and compression,” IEEE Computer Graphics & Applications, pp. 75-86, Nov. 1992.
[2] P. Pirsch and H-J. Stolberg, “VLSI implementations of image and video multimedia processing systems,” IEEE Trans. on Circuits and Systems for Video Technology, vol. 8, no. 7, pp. 878-891, Nov. 1998.
[3] A. Ammar, A. Al Kabbany, M. Youssef and A. Emam, “A secure image coding scheme using residue number system,” in Proceedings of the 18th National Radio Science Conference, Egypt, pp. 339405, March 27-29, 2001.
[4] M. A. Soderstrand, W. K. Jenkins, G. A. Jullien, and F. J. Taylor, Residue Number System Arithmetic: Modern Applications in Digital Signal Processing. New York: IEEE Press, 1986.
[5] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs. Oxford: Oxford University Press, 2000.
[6] W. L. Freking and K. K. Parhi, “Low-power FIR digital filters using residue arithmetic,” in Proceedings IEEE International Symposium on Circuits and Systems, pp. 739-743, 1998.
[7] M. Bhardwaj and A. Blaram, “Low-power signal processing architectures using residue arithmetic,” in Proceedings IEEE International Symposium on Circuits and Systems, pp. 3017-3120, 1998.
[8] Wei Wang, M.N.S. Swamy, M.O. Ahmad, and Yuke Wang, “A study of residue-to-binary converters for three-moduli sets,” to appear in IEEE Trans. on Circuits and Systems I.
[9] Wei Wang, M.N.S. Swamy, M.O. Ahmad, and Yuke Wang, “A high-speed residue-to-binary converter for three-moduli RNS and a scheme of its VLSI implementation,” IEEE Trans. on Circuits and Systems II, vol. 47, pp. 1576-1581, Dec. 2000.
[10] A.A. Hiasat, “High-speed and reduced-area modular adder structures for RNS,” IEEE Trans. on Computers, vol. 51, pp. 84-89, Jan. 2002.