A Novel Watermark Technique for Relational Databases
Hazem El-Bakry and Mohamed Hamada
Mansoura Univ., Egypt
[email protected]
Aizu University, Japan
[email protected]
Abstract. In this paper, a new approach for protecting the ownership of relational databases is presented. The approach protects both textual and numerical data by adding only one hidden record computed with a secret function. For each attribute, the value of this function depends on the data stored in all other records. Therefore, the technique is more robust against attacks or modifications such as deleting or updating cell values. Furthermore, problems associated with earlier work in the literature are solved. For example, there is no need for the additional storage area required when adding columns, which matters especially for large databases. In addition, when data is protected by adding columns, the number of added columns must equal the number of data types to be protected; here, only one record is sufficient to protect all types of data. Moreover, a different function can be used for each field, which results in more robustness. Finally, the proposed technique imposes no other requirements or restrictions on either database design or database administration.
Keywords: Relational Database, Copyright Protection, Digital Watermarking.
F.L. Wang et al. (Eds.): AICI 2010, Part II, LNAI 6320, pp. 226–232, 2010. © Springer-Verlag Berlin Heidelberg 2010

1 Introduction
Copyright protection inserts evidence into digital objects without loss of quality. Whenever the copyright of a digital object is in question, this information is extracted to identify the rightful owner. Digital watermarking is the technique of embedding such information in multimedia data, and many techniques are used to protect copyrights [18]. Digital content in the form of text documents, still images, motion pictures, music, etc. is widely used in everyday life. The rapid growth in the number of internet users has boosted transaction rates (file sharing, distribution, or modification), and the trend continues to grow every day because access is convenient and easy. Hence, copyright protection has become a greater concern for all content owners [1-2]. Watermarking is an open problem with one goal: how to insert an error, mark, data, formula, evidence, and so on, associated with a secret key known only by the data owner, in order to prove ownership of the data without loss of quality. To evaluate any watermark system, the following requirements are generally considered: (i) Readability: a watermark should convey as much information as possible and be statistically detectable, enough to identify ownership and copyright unambiguously; (ii) Security: only authorized users can access the watermark data; (iii) Imperceptibility: the embedding process should not introduce any perceptible artifacts into the original image or degrade its perceived quality; and (iv) Robustness: the watermark should withstand various attacks while remaining detectable in the extraction process.

In general, a watermark is a set of small, hidden perturbations in the database used as evidence of its origin. A mark is inserted into the original data to demonstrate ownership. The watermark should not significantly affect the quality of the original data and should not be easy to destroy. The goal is to identify pirated copies of the original data. Watermarking does not prevent copying, but it deters illegal copying by providing a means of establishing the ownership of a redistributed copy. Many approaches and algorithms are available for image, audio, and video; the open question is how to introduce an approach that serves relational databases. Agrawal et al. introduced a watermarking technique for numerical data [7]. This technique depends on a secret key, uses markers to locate the tuples in which watermark bits are hidden, and hides the watermark bits in the least significant bits. Sion et al. also introduced a watermarking technique for numerical data [13]. This technique depends on a secret key, uses the most significant bits of the normalized data set instead of the primary key, divides the data set into partitions using markers, and varies the partition statistics to hide watermark bits. Relational databases were selected because they are common and long established. Changes made by watermarking to the values of selected attributes in the tuples of a relational database must be small enough to be tolerated [3,4].
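The idea of hiding key-dependent bits in the least significant bits of selected tuples can be illustrated with a short Python sketch. It is a simplified illustration only, not the algorithm published by Agrawal et al.: the use of HMAC-SHA256 as the keyed hash, the function names, and the parameters `gamma` (roughly one tuple in `gamma` is marked) and `num_lsb` (how many low bits may be touched) are all assumptions made for this example.

```python
import hashlib
import hmac


def keyed_hash(secret_key: bytes, primary_key: int) -> int:
    """Keyed hash of a tuple's primary key as a non-negative integer."""
    digest = hmac.new(secret_key, str(primary_key).encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big")


def plan(secret_key, primary_key, gamma, num_lsb):
    """Return (bit_index, bit_value) if the tuple is selected, else None."""
    h = keyed_hash(secret_key, primary_key)
    if h % gamma != 0:          # the keyed hash plays the role of the marker
        return None
    return (h // gamma) % num_lsb, (h // (gamma * num_lsb)) % 2


def mark(rows, secret_key, gamma=2, num_lsb=2):
    """Return a marked copy of rows, a dict {primary_key: non-negative int}."""
    marked = dict(rows)
    for pk, value in rows.items():
        choice = plan(secret_key, pk, gamma, num_lsb)
        if choice is not None:
            bit_index, bit_value = choice
            # Overwrite one low-order bit with the key-dependent bit.
            marked[pk] = (value & ~(1 << bit_index)) | (bit_value << bit_index)
    return marked


def detect(rows, secret_key, gamma=2, num_lsb=2):
    """Return (matching, selected): how many selected tuples carry the expected bit."""
    matching = selected = 0
    for pk, value in rows.items():
        choice = plan(secret_key, pk, gamma, num_lsb)
        if choice is not None:
            bit_index, bit_value = choice
            selected += 1
            matching += ((value >> bit_index) & 1) == bit_value
    return matching, selected
```

Without the secret key, an observer cannot tell which tuples or bits were selected, while the owner can recount the matches; each marked value differs from the original by less than `2 ** num_lsb`.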
This paper is organized as follows: Section 2 describes the problem statement, Section 3 presents the proposed technique and discusses its evaluation, and Section 4 concludes the paper.
2 Watermarking for Databases
Watermarking of relational databases is an important topic for researchers, because free databases available on internet websites are published without copyright protection, and the problem will grow in the future. If a database contains very important data, the problem is how to add a watermark to the numerical or textual data in the relational database without affecting the usefulness and quality of the data. The goal is to insert an intended error, mark, data, formula, or evidence associated with a secret key known only by the data owner, in order to prove ownership of the data without loss of quality [5,6]. Fig. 1 shows a typical watermark model for any relational database. A watermark W is embedded into the relational database I with a secret key k. The watermarked relational database IW later passes through a distribution channel (computer network, internet, etc.), which is simulated under several kinds of common attacks. The watermark is then extracted from the attacked watermarked database with the same secret key, in order to recover the original watermark data W [4-10].
Fig. 1. Typical Watermark System Model
3 The Proposed Technique
Generally, the proposed technique relies on changing the database schema, which is the model of the database contents: the structure of the data is changed by adding a new record (altering the table) that relies on the original data in each field of the relational database. The function used to construct the new record, as well as the secret key, is known only by the data owner. In general, the function used to protect the relational database is locked via a predefined secret key. The proposed technique can be summarized in the following steps:
1. Get the relational table from the desired database; its values must be numeric.
2. For each field, compute the value of a new record from the data stored in the other records, using a secret function f(.).
3. Generate the secret function f(.), which depends on the numeric values of the other cells in the current field in an encrypted structure.
4. Apply this function to the remaining fields in the table; thus an extra record is created and added to the original database table.
5. Protect the calculated record from attack with a protection key known only to the data owner.
6. The added record may be hidden from malicious users.
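Steps 1 to 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the table is modelled as a dictionary from field name to a list of numeric cells, and the names `add_watermark_record`, `f_price`, and `f_qty`, as well as the two example functions themselves, are hypothetical. Locking and hiding the record (steps 5 and 6) depend on the database system and are not shown.

```python
def add_watermark_record(table, secret_functions):
    """Return a copy of `table` with one extra record appended.

    `table` maps each field name to its list of numeric cells (step 1).
    `secret_functions` maps each field name to the owner's secret
    function f(.) for that field (step 3).
    """
    if len({len(cells) for cells in table.values()}) != 1:
        raise ValueError("all fields must have the same number of records")
    watermarked = {}
    for field, cells in table.items():
        f = secret_functions[field]
        # Steps 2 and 4: the new cell depends only on the existing cells.
        watermarked[field] = list(cells) + [f(cells)]
    return watermarked


# Hypothetical secret functions, one per field; a real owner keeps them private.
def f_price(cells):
    return sum(cells) % 97 + 1


def f_qty(cells):
    return max(cells) - min(cells) + 3


table = {"price": [10, 20, 30], "qty": [1, 5, 9]}
watermarked = add_watermark_record(table, {"price": f_price, "qty": f_qty})
```

Passing one function per field mirrors the option of using a different secret function for each field.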
In general, the proposed technique can be used to protect the ownership of a relational database that contains only numeric values. It adds only one hidden record computed with a secret function, and it also locks this calculated row against attacks or changes such as deleting or updating. The advantages of the proposed technique are:
1. The proposed technique is applicable to any relational database.
2. No delay and no additional time are required beyond the normal calculation.
3. Updates such as adding rows and changing column values are allowed.
4. The hidden record cannot be deleted, because it is locked with a secret key known only by the data owner.
5. The values in the hidden record are known only by the data owner [14-16].
6. Furthermore, a different function can be used for each field, which results in more robustness.
7. Moreover, there is no need for the additional storage area required when adding columns, as described in [18].
The relational database in Table 1 is the Northwind database, which is used in many applications because it is widely published on the internet and common in different Microsoft applications. Table 2 presents the watermarked relational database. In practice, the algorithm can be summarized as follows: (i) select any numerical table such as Table 1; (ii) add a new record whose values rely on the data stored in the other records through secret functions, for example:

\[ \mathrm{Key} = \mathrm{STD}(\mathrm{Cells}) + \mathrm{Max}(\mathrm{Cells}) - \mathrm{Min}(\mathrm{Cells}) \pm Q \qquad (1) \]

where STD is the standard deviation and Q is a constant value; (iii) apply the function to all columns, as shown in Table 2.
Table 1. The original relational database
Each line lists one column of the table; values appear in the row order given by Stock No.
Stock No.: 125970, 212569, 389123, 400314, 400339, 400345, 400455, 400876, 400999, 888652
Jan.: 1400, 2400, 1800, 3000, 4300, 5000, 1200, 3000, 3000, 1234
Feb.: 1100, 1721, 1200, 2400, 3500, 900, 2400, 1500, 900 [only nine values appear in the source]
Mar.: 981, 1414, 890, 1800, 2600, 2800, 800, 1500, 1000, 821
Apr.: 882, 1191, 670, 1500, 1800, 2300, 500, 1500, 900, 701
May: 794, 983, 550, 1200, 1600, 1700, 399, 1300, 750, 689
June: 752, 825, 450, 900, 1550, 1400, 345, 1100, 700, 621
July: 654, 731, 400, 700, 895, 1000, 300, 900, 400, 545
Aug.: 773, 653, 410, 650, 700, 900, 175, 867, 350, 421
Sep.: 809, 723, 402, 1670, 750, 1600, 760, 923, 500, 495
Oct.: 980, 790, 450, 2500, 900, 3300, 1500, 1100, 1100, 550
Nov.: 3045, 1400, 1200, 6000, 8000, 12000, 5500, 4000, 3000, 4200
Dec.: 19000, 5000, 16000, 15000, 24000, 20000, 17000, 32000, 12000, 12000
Table 2. The watermarked relational database
Each line lists one column of the table; values appear in the row order given by Stock No., and the last value in each line belongs to the added record.
Stock No.: 125970, 212569, 389123, 400314, 400339, 400345, 400455, 400876, 400999, 888652, 564646
Jan.: 1400, 2400, 1800, 3000, 4300, 5000, 1200, 3000, 3000, 1234, 3433
Feb.: 1100, 1721, 1200, 2400, 3500, 900, 2400, 1500, 900, 2062 [only nine original values appear in the source]
Mar.: 981, 1414, 890, 1800, 2600, 2800, 800, 1500, 1000, 821, 1340
Apr.: 882, 1191, 670, 1500, 1800, 2300, 500, 1500, 900, 701, 994
May: 794, 983, 550, 1200, 1600, 1700, 399, 1300, 750, 689, 1298
June: 752, 825, 450, 900, 1550, 1400, 345, 1100, 700, 621, 1362
July: 654, 731, 400, 700, 895, 1000, 300, 900, 400, 545, 553
Aug.: 773, 653, 410, 650, 700, 900, 175, 867, 350, 421, 715
Sep.: 809, 723, 402, 1670, 750, 1600, 760, 923, 500, 495, 1714
Oct.: 980, 790, 450, 2500, 900, 3300, 1500, 1100, 1100, 550, 2167
Nov.: 3045, 1400, 1200, 6000, 8000, 12000, 5500, 4000, 3000, 4200, 5235
Dec.: 19000, 5000, 16000, 15000, 24000, 20000, 17000, 32000, 12000, 12000, 14200
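As an illustration of Eq. (1), the following Python sketch computes the key of a single column and rechecks it later. Several details are assumptions because the paper does not fix them: the population standard deviation (`statistics.pstdev`) is used for STD, the "+" branch of ±Q is taken, and the value of Q is a hypothetical constant. Since the authors' Q is secret, the sketch does not attempt to reproduce the keys in Table 2; the function names are also hypothetical.

```python
from statistics import pstdev


def column_key(cells, q):
    """Eq. (1) with the '+' sign: STD(cells) + Max(cells) - Min(cells) + Q."""
    return pstdev(cells) + max(cells) - min(cells) + q


def column_is_intact(cells, stored_key, q, tolerance=1e-9):
    """Recompute the key from the visible cells and compare it with the stored one."""
    return abs(column_key(cells, q) - stored_key) <= tolerance


cells = [2, 4, 4, 4, 5, 5, 7, 9]      # toy column; population STD is 2
key = column_key(cells, q=10)         # 2 + 9 - 2 + 10 = 19
tampered = cells[:-1] + [10]          # one cell value changed
print(column_is_intact(cells, key, q=10), column_is_intact(tampered, key, q=10))
```

Because the key depends on every cell of the column, changing a cell generally changes the recomputed key, which is what allows the owner to notice the modification.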
(iv) Hide the calculated record and export the table with the newly added record; (v) lock the entire table with a protection key known only to the data owner, which deters copying and changing the values of cells. Another example is listed in Table 3. It combines different types of data. The same principles are applied to the numerical data. The final result is shown in Table 4. A code is assigned to each character, as listed in Table 5. The secret formula is calculated as follows:

\[ \beta = \frac{\sum_{i=1}^{n} \alpha_i \sum_{j=1}^{\alpha_i} \rho_j}{n} \qquad (2) \]

where α is the number of characters per word, ρ is the character code, n is the number of words, and β is the secret key.
Table 3. The original relational database
Emp_ID   Emp_Name   Address    Birth Date   Salary
2324     Ahmed      Mansoura   17/11/1987   2320
4547     Nagi       Tanta      22/02/1989   1344
6549     Sameh      Cairo      12/12/1987   2456
7653     Kamel      Sudan      10/08/1986   1233
8975     Alaa       Cairo      04/10/1981   2356
Table 4. The watermarked relational database
Emp_ID   Emp_Name   Address    Birth Date   Salary
2324     Ahmed      Mansoura   17/11/1987   2320
4547     Nagi       Tanta      22/02/1989   1344
6549     Sameh      Cairo      12/12/1987   2456
7653     Kamel      Sudan      10/08/1986   1233
8975     Alaa       Cairo      04/10/1981   2356
5661     Tamer      Banha      01/19/1994   2164
Table 5. Alphabetic Character Coding
Character A B C D E F G H I J K L M
Code (ρ) 1 2 3 4 5 6 7 8 9 10 11 12 13
Character N O P Q R S T U V W X Y Z
Code (ρ) 14 15 16 17 18 19 20 21 22 23 24 25 26
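The secret key for a textual column can be computed with a short Python sketch, reading Eq. (2) as β = (1/n) Σ_i α_i Σ_j ρ_j, where the inner sum runs over the characters of word i. That reading, the function name `secret_key_beta`, and the choice to ignore case and non-alphabetic characters are assumptions of this example.

```python
import string

# Table 5: A -> 1, B -> 2, ..., Z -> 26.
CHAR_CODE = {ch: code for code, ch in enumerate(string.ascii_uppercase, start=1)}


def secret_key_beta(words):
    """beta = (1/n) * sum over words of alpha_i * (sum of character codes)."""
    total = 0
    for word in words:
        letters = [ch for ch in word.upper() if ch in CHAR_CODE]
        alpha = len(letters)                      # number of characters in the word
        total += alpha * sum(CHAR_CODE[ch] for ch in letters)
    return total / len(words)                     # n is the number of words


emp_names = ["Ahmed", "Nagi", "Sameh", "Kamel", "Alaa"]   # Emp_Name column of Table 3
print(secret_key_beta(emp_names))  # 155.8
```

For the Emp_Name column the five terms are 155, 124, 230, 210, and 60, so β = 779 / 5 = 155.8. Under this reading the value falls in the 151:200 band of Table 6, which lists Tamer, the name that appears in the added record of Table 4.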
The resulting Emp_Name and Address values can be determined from the secret key as shown in Table 6.
Table 6. The computed secret key and its corresponding Emp_Name and Address
Secret key (β)   Emp_Name   Address
1:50             Mohamed    Sinai
51:100           Ali        Talkha
101:150          Hassan     Sandoub
151:200          Tamer      Banha
201:250          Shaker     El-Baramoon
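The mapping in Table 6 from a secret key to the text stored in the hidden record can be written as a small lookup in Python. This is a sketch under assumptions: the names `UPPER_BOUNDS`, `RECORDS`, and `lookup_hidden_text` are hypothetical, a non-integer key that lies between two bands (for example 50.5) is assigned to the upper band, and keys outside 1 to 250 return `None` because the table does not cover them.

```python
from bisect import bisect_left

# Upper bound of each band in Table 6 and the text assigned to it.
UPPER_BOUNDS = [50, 100, 150, 200, 250]
RECORDS = [
    ("Mohamed", "Sinai"),
    ("Ali", "Talkha"),
    ("Hassan", "Sandoub"),
    ("Tamer", "Banha"),
    ("Shaker", "El-Baramoon"),
]


def lookup_hidden_text(beta):
    """Return the (Emp_Name, Address) pair for the band containing beta, or None."""
    if beta < 1 or beta > UPPER_BOUNDS[-1]:
        return None
    # bisect_left finds the first band whose upper bound is >= beta.
    return RECORDS[bisect_left(UPPER_BOUNDS, beta)]


print(lookup_hidden_text(155.8))  # ('Tamer', 'Banha')
```

A sorted list of upper bounds keeps the lookup a single binary search and makes it easy to add further bands.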
4 Conclusions
A novel digital watermarking technique for relational databases has been presented. The proposed technique provides a high degree of reliability and protection for relational databases with the aid of a user-predefined function, which inserts an additional hidden record into the available relational database. This technique has several advantages over existing techniques. First, it is applicable to any relational database. Second, it does not require any additional time, because the calculations required for the new record are done offline. Third, the hidden record cannot be deleted, because it is locked with a secret key known only by the data owner. The values in the hidden record are also known only by the data owner. Furthermore, problems associated with earlier work in the literature are solved. For example, there is no need for the additional storage area required when adding columns, which matters especially for large databases. In addition, when data is protected by adding columns, the number of added columns must equal the number of data types to be protected; here, one record is sufficient to protect all types of data. Moreover, a different function can be used for each field, which results in more robustness. Finally, the proposed technique imposes no other requirements or restrictions on either database design or database administration.
References
[1] Temi, C., Somsak, C., Lasakul, A.: A Robust Image Watermarking Using Multiresolution Analysis of Wavelet. In: Proceedings of ISCIT (2000)
[2] Collberg, C.S., Thomborson, C.: Watermarking, Tamper-Proofing, and Obfuscation - Tools for Software Protection. Technical Report 200003, University of Arizona (February 2000)
[3] Gross-Amblard, D.: Query-Preserving Watermarking of Relational Databases and XML Documents. In: PODS 2003: Proceedings of the 22nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 191–201. ACM Press, New York (2003)
[4] Digital Signatures in Relational Database Applications. Gradkell Systems Inc. (2007), http://www.gradkell.com
[5] Cox, I.J., Miller, M.L.: A review of watermarking and the importance of perceptual modeling. In: Proc. of Electronic Imaging (February 1997)
[6] Cox, I., Bloom, J., Miller, M.: Digital Watermarking. Morgan Kaufmann, San Francisco (2001)
[7] Kiernan, J., Agrawal, R.: Watermarking Relational Databases. In: Proc. 28th International Conference on Very Large Databases, VLDB (2002)
[8] Boney, L., Tewfik, A.H., Hamdy, K.N.: Digital watermarks for audio signals. In: International Conference on Multimedia Computing and Systems, Hiroshima, Japan (June 1996)
[9] Atallah, M., Wagstaff, S.: Watermarking with quadratic residues. In: Proc. of IS&T/SPIE Conference on Security and Watermarking of Multimedia Contents (January 1999)
[10] Atallah, M., Raskin, V., Hempelman, C., Karahan, M., Sion, R., Triezenberg, K., Topkara, U.: Natural Language Watermarking and Tamper-proofing. In: The Fifth International Information Hiding Workshop, Florida, USA (2002)
[11] Hsieh, M.-S., Tseng, D.-C., Huang, Y.H.: Hiding Digital Watermarking Using Multiresolution Wavelet Transform. IEEE Trans. on Industrial Electronics 48(5) (October 2001)
[12] Shehab, M., Bertino, E., Ghafoor, A.: Watermarking Relational Databases Using Optimization Based Techniques. CERIAS Tech Report- (2006)
[13] Sion, R., Atallah, M., Prabhakar, S.: Rights Protection for Relational Data. IEEE Trans. on Knowledge and Data Engineering 16(6) (June 2004)
[14] Sion, R., Atallah, M., Prabhakar, S.: Rights Protection for Relational Data. IEEE Trans. on Knowledge and Data Engineering 16(6) (June 2004)
[15] Benjamin, S., Schwartz, B., Cole, R.: Accuracy of ACARS wind and temperature observations determined by collocation. Weather and Forecasting 14, 1032–1038 (1999)
[16] Bender, W., Gruhl, D., Morimoto, N.: Techniques for data hiding. In: Proc. of the SPIE, Storage and Retrieval for Image and Video Databases III, vol. 2420, pp. 164–173 (1995)
[17] Li, Y., Swarup, V., Jajodia, S.: Fingerprinting Relational Databases: Schemes and Specialties. IEEE Trans. on Dependable and Secure Computing 2(1), 34–45 (2005)
[18] Gamal, G.H., Rashad, M.Z., Mohamed, M.A.: A Simple Watermark Technique for Relational Database. Mansoura Journal for Computer Science and Information Systems 4(4) (January 2008)