Practical Searching over Encrypted Data by Private Information Retrieval

This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE Globecom 2010 proceedings.

Rei Yoshida∗, Yang Cui†, Tomohiro Sekino∗, Rie Shigetomi†, Akira Otsuka† and Hideki Imai∗†

∗ Department of Electrical, Electronic & Communication Engineering, Chuo University, 1-13-27 Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan
† Research Center for Information Security (RCIS), National Institute of Advanced Industrial Science & Technology (AIST), 1-18-13 Sotokanda, Chiyoda-ku, Tokyo 101-0021, Japan

Abstract—Explosive progress in networking and outsourced storage has increased the use of information retrieval technologies on massive datasets. Nowadays, a variety of storage-providers are available through the internet, such as e-mail accounts and public databases, which make it convenient to store and exchange electronic files and media. Typically, the storage-provider offers users the capability to collect, retrieve and search; however, privacy issues are rarely considered at the same time. For example, it is unknown how to prevent a curious storage-provider from learning private information of the user, such as search criteria and access patterns, as well as contents. In CRYPTO’07, Boneh et al. put forward a privacy-preserving solution to this problem with the help of public key cryptography. In their work, the authors made use of PIR (Private Information Retrieval) and several combinatorial techniques, which are theoretically interesting and likely to be the best approach in the literature. In this paper, however, we show that their proposal seems unlikely to be implementable with the latest technology, due to the large computation cost involved. We then provide an improved method that makes the keyword search more practical: it not only avoids the expensive computation cost caused by public key encryption operations, but also enables privacy-preserving information retrieval.

I. INTRODUCTION

With more and more storage-providers on the internet supplying very convenient outsourcing services, privacy has become a crucial problem. People who use services such as e-mail maintained by storage-providers like Gmail and Hotmail enjoy the ease of collecting, retrieving and searching electronic data. However, one crucial issue should not be forgotten: none of those storage-providers furnishes complete protection of the users’ privacy, including the contents of the e-mail, the search criteria, or the access pattern of the users. In other words, those storage-providers are able to learn or collect partial information about users, and whether they do so depends only on their own discretion. For example, data mining techniques could be used by storage-providers or database managers to find useful but (possibly) private information or life patterns of certain users. This situation should be changed as soon as possible, and efficient solutions are desirable to protect users’ privacy without sacrificing service performance.

In a useful and typical scenario of outsourcing services, suppose that there are a Sender and a User (Receiver) who want to communicate mainly via the “honest-but-curious” database¹. Sender is only permitted to send a couple of keywords to User, but not the whole data, which is commonly a relatively large file such as a video or photo, so that User can search the database to find the target confidential data. It is desired that User can efficiently search and retrieve the information that Sender submitted to the database, while the database side learns no partial information about the submitted data, such as the contents of the data, User’s identity, his search criteria or his access pattern. Clearly, an efficient symmetric or public key encryption scheme can be applied to protect the contents of the submitted information. The search criteria and the access pattern, however, are much harder to secure. In particular, due to the wide use and great convenience of Public Key Infrastructures (PKI), it is desirable to have a solution for search over data encrypted by public key encryption, so that Sender does not need to share a key with the Receiver beforehand. Note that this was raised as an open problem in EUROCRYPT’04 [2], prior to the paper of [4].

The problem was hard to solve because User has only a few advantages. Compared with the honest-but-curious server, User holds only two advantages: 1) some keywords that Sender delivered to him, which help him recover the address of the target information, and 2) his secret key, which helps him decrypt the contents. Thus, it was unknown how to non-trivially achieve this total protection of privacy until Boneh et al.’s seminal paper in CRYPTO’07 [4], where the server is required to be equipped with a publicly accessible buffer that anyone can write and read.

¹ “Honest-but-curious” means that, on the one hand, the party follows every operation of the protocol exactly, but on the other hand, it is curious and will try to decrypt or find confidential information if no appropriate protection is in place. This definition is close to the practical situation and is widely used to model an appropriate adversary in the design of secure protocols and privacy-preserving data mining [6], [12].

Our Contribution. Since privacy-preserving techniques are of significant importance in information retrieval, in this paper we investigate the CRYPTO’07 scheme [4] and show that it is not practical, even with the latest technology, due to the expensive computation cost involved. According to our experimental results, Boneh et al.’s protocol [4] seems to perform no better

978-1-4244-5637-6/10/$26.00 ©2010 IEEE


than a trivial solution, which undermines the usability of the system proposed in [4]. Thus, we further provide our proposal, which is much more efficient than [4], by using an elegant, distributed private information retrieval scheme.

Related Work. Although PIR [5] has been studied for years and many PIR protocols have been put forward, none of them can be used alone to solve the underlying problem. Roughly speaking, this is because efficient searching over encrypted data may leak sensitive information, such as the keywords being searched, which is outside the scope of PIR techniques. Therefore, the protocol in [4] takes advantage of several cryptographic and combinatorial methods, together with PIR, to provide the first non-trivial solution to the open problem. It appears that their approach achieves the theoretical bound, and might get the best performance in their setting, from a communication point of view. However, too many PIR operations are involved, which brings an unaffordable computation cost in practice. Up to now, most PIR protocols have been evaluated by the order of their communication cost, no matter how much computation cost is incurred. Unfortunately, according to Sion and Carbunar [11] in NDSS’07, this measure is inadequate: most single-database PIR protocols are impractical, since computation cost rather than communication cost appears to be the bottleneck (even when implemented with the latest technology) in the current internet environment. Consequently, our proposal partially makes use of a non-single-database PIR protocol, which fulfills what was required in [4] but is more efficient from a computation viewpoint.

II. PREPARATION

Boneh et al. [4] proposed a scheme for keyword search over a DB using public key encryption that allows PIR queries. In their scheme, the PIR technique aims to retrieve the target data as usual, but before that, it is necessary to know the location of the encrypted data in the database.
Several techniques have been employed to circumvent this hurdle, such as the Bloom filter [1], the color survival game [8] and modifying encrypted data [3]. In this scheme, the storage space is divided into the main database and the Buffer (BF). The main database is used to store all metadata or messages, such as e-mail and media files, in a confidential way; the Buffer is used only as intermediate storage for the information on the addresses of data in the main database. Note that the Buffer can be written by users and is not controlled by the manager of DB all the time. Obviously, address information should also be encrypted under users’ public keys; otherwise, from a plaintext in the Buffer, anyone could find the corresponding location in the main database. Because their scheme needs to execute PIR many times on the main database and the Buffer, the size of the Buffer should be set as small as possible to decrease the computation cost (the size of the main database is usually decided independently). When the sender intends to write the encrypted address of the confidential message to the Buffer, the sender picks the keyword of the message and inputs it to the Bloom filter,

[Fig. 1. Overview of keyword search scheme in CRYPTO’07. Diagram labels: Sender, Keyword, Main Database, Buffer, User (pk, sk).]

which is similar to a hash function but allows false positives. The Bloom filter outputs a certain bit array according to the keyword, which the sender uses as the storing address(es) in the Buffer. Some addresses might correspond to multiple keywords². If the user searches with the same keyword, the user can obtain, by using the same Bloom filter, the corresponding address(es) in the Buffer, which record where the encrypted message has been stored in the main database. When the sender intends to write the addresses of the message to the Buffer, the sender first uses the technique of modifying encrypted data [4] to avoid leaking where in the Buffer the address has been written, and then uses the color survival game methodology [8] to prevent the addresses in the Buffer from being lost to overwriting caused by the Bloom filter. Figure 1 shows the flow of their scheme.

The user R has a key pair (pk, sk). Sender S has a message M, a keyword W for searching, and the pk of R. When DB stores at most N messages, the Buffer is sized to store m data items accordingly. They use a (k, m)-Bloom filter, which has a binary array of m bits. Let $\{B_j\}_{j=1}^{m}$ be the address information and $\{\tilde{B}_j\}_{j=1}^{m}$ be the encryption of $\{B_j\}_{j=1}^{m}$ under the User’s public key pk. The underlying scheme [4] requires that the Buffer is not completely controlled by the server, because any user can take a temporary token to access the Buffer and write m copies to it. Our proposal below adopts the same requirement, and further distributes the single Buffer over several.

III. PRIVATE INFORMATION RETRIEVAL

Private information retrieval [5] is a very important cryptographic primitive, applicable in a variety of fields in information security and cryptography, such as privacy-preserving information retrieval, oblivious transfer, and collision-resistant hash functions. This section introduces background on several kinds of PIR that are useful in different scenarios.
PIR is a scheme that does not reveal to the database manager the user’s information, such as which data the user has downloaded, the target that the user is searching for, and the access pattern of the user. For example, assume that DB has a dataset $\{x_i\}_{i=1}^{n}$ and the user wants to get $x_i$ from it. The user sends DB a query i to specify the data, and DB replies with the data $x_i$ corresponding to the query i. This is normal information retrieval and cannot be called private information retrieval (PIR), because the manager knows which data in the database has been retrieved by which user.

² To deal with this problem, a couple of copies of the encrypted addresses should be written to the Buffer; thanks to the color survival game principle [8], at least one copy will survive with high probability.

A. IPIR

The scheme in which the user downloads the whole dataset is the simplest, trivial PIR. In the pioneering paper on PIR [5], it is named IPIR. The user wants to get $x_i$ from DB. He sends DB a query for the whole dataset, and DB replies with the entire dataset. The user receives the whole dataset and retrieves $x_i$ from it, while DB cannot know which data the user needed. IPIR has information-theoretic security. The communication cost of IPIR is equal to the database size n.

B. Block PIR

PIR protocols divide into 1DBPIR and xDBPIR. 1DBPIR (single-database PIR) uses only one database, and xDBPIR uses multiple databases. For 1DBPIR with information-theoretic security, it has been proved that the lower bound of the communication cost is n. This means that an information-theoretically secure 1DBPIR cannot reduce the communication cost below that of IPIR. On the other hand, xDBPIR (or BlockPIR) can achieve a communication cost lower than n while keeping information-theoretic security, under the additional assumption that all databases are synchronized and that different DBs do not collude. As an example, we explain Block PIR for two databases, i.e., x = 2. Denote the databases by DB₁ and DB₂, with the same size N, each consisting of m blocks written as $\{B_i\}_{i=1}^{m}$. A user u who intends to retrieve the information with index i, but does not want it known that he has queried about i, can use the following BlockPIR protocol.
1) u generates a random vector $\sigma \in \{0,1\}^m$ and the vector $\sigma' = \sigma \oplus e_i$ obtained by flipping the i-th bit of σ (here $e_i$ is the m-bit vector whose only 1 is at position i). u sends σ to DB₁ and σ′ to DB₂.
2) DB₁ uses σ to compute $\eta = \bigoplus_{\sigma(j)=1} B_j$, and DB₂ uses σ′ to compute $\eta' = \bigoplus_{\sigma'(j)=1} B_j$. They return η and η′ to u.
3) u retrieves $B_i$ as $\eta \oplus \eta'$.

The BlockPIR scheme is secure if the DBs do not collude. Another important property is that BlockPIR requires databases built in a distributed manner and does not need complicated calculation, which is favorable in specific applications, as we show in Section IV.

C. Computational PIR

IPIR has information-theoretic security, but its communication cost is bounded below by n. CPIR has only computational security, but in exchange achieves a communication cost lower than n. In this paper, we explain one CPIR scheme [9], which is based on the Paillier cryptosystem [10]. Note that some 1DBPIR protocols, such as [7], achieve log-squared communication cost at the price of a highly expensive computation cost. However, since, as explained above, the computation cost rather than the communication cost is the bottleneck of the operation time here, we use the widely used Paillier CPIR to evaluate the performance. See also [11] for the trade-off between the two aspects.

The Paillier cryptosystem is a public key encryption scheme with a homomorphic property. Let M be a plaintext and (pk, sk) be a pair of public and secret keys. The homomorphic property is as follows:

$$E_{pk}(M_1)\,E_{pk}(M_2) = E_{pk}(M_1 + M_2)$$

More precisely, the Paillier cryptosystem works as follows (pk is implicitly included):

1) Choose two large primes p and q. Compute $n = pq$ and $\lambda = \mathrm{lcm}(p-1, q-1)$. Select random $g \in \mathbb{Z}^*_{n^2}$ and $r \in \mathbb{Z}^*_n$.
2) Generate a ciphertext c from a plaintext m: $c = g^m r^n \bmod n^2$.
3) Compute m from c:

$$m = \frac{L(c^{\lambda} \bmod n^2)}{L(g^{\lambda} \bmod n^2)} \bmod n, \qquad L(u) = \frac{u-1}{n}$$
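The three Paillier steps above can be tried out with a small sketch. This is only an illustration under stated assumptions, not the implementation evaluated in this paper: the primes are toy-sized and insecure, the function names (`keygen`, `encrypt`, `decrypt`, `L`) are hypothetical, and g is fixed to n + 1 (a valid element of $\mathbb{Z}^*_{n^2}$) rather than drawn at random as in step 1. The three-argument `pow` with exponent -1 needs Python 3.8 or later.

```python
import math
import secrets


def L(u, n):
    """The function L(u) = (u - 1) / n from the decryption formula."""
    return (u - 1) // n


def keygen(p, q):
    """Build a toy key pair from two primes p and q (far too small to be secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lambda = lcm(p-1, q-1)
    g = n + 1  # assumption: fixed g instead of a random element of Z*_{n^2}
    return (n, g), lam


def encrypt(pk, m):
    """Step 2: c = g^m * r^n mod n^2 with a fresh random r in Z*_n."""
    n, g = pk
    n2 = n * n
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return pow(g, m, n2) * pow(r, n, n2) % n2


def decrypt(pk, lam, c):
    """Step 3: m = L(c^lambda mod n^2) / L(g^lambda mod n^2) mod n."""
    n, g = pk
    n2 = n * n
    numerator = L(pow(c, lam, n2), n)
    denominator = L(pow(g, lam, n2), n)
    # Division modulo n is multiplication by the modular inverse.
    return numerator * pow(denominator, -1, n) % n
```

With `pk, lam = keygen(1009, 1013)`, multiplying two ciphertexts modulo n² and decrypting the product yields the sum of the two plaintexts, which is the homomorphic property stated above.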

IV. OUR PROPOSAL AND PERFORMANCE ANALYSIS

Recent work [11] has shown that computation cost, rather than communication cost, sometimes becomes the bottleneck of information retrieval protocols. This drives us to test the practical performance of keyword search using public key encryption that allows PIR [4], in order to build an implementable scheme, notwithstanding its theoretical significance. Through the following investigation, we find that it is better to employ computationally simple protocols, because current networking technology has made communication much cheaper than before. On the other hand, the huge computation cost caused by CPIR seems unaffordable even when implemented with the latest technology. As a consequence, it is necessary to reconsider the trade-off between these costs to obtain a practical scheme.

A. Proposed Scheme

We first define the variables used in this scheme. Assume that DB has a size of N bits and can store at most n messages, and that the Buffer on the DB has a size of M bits (wlog, we assume M is a square). R has a key pair of a public key encryption system whose ciphertext length is κ. S uses a keyword of w words when S sends a message with a keyword. The (k, m)-Bloom filter has k hash functions, and each hash function outputs a value of length m bits. The m-bit value denotes the index in the Buffer. In this paper, the time required for PIR is the sum of the time required for computation and communication. Let α be the time required to compute one modular exponentiation, and β be the communication speed (bandwidth)


TABLE I
VARIABLES OF PARAMETERS

Variable   Description
N          size of main database
n          number of main database’s index
M          size of buffer
m          length of buffer’s index
k          number of hash functions
P          probability that game fails
κ          length of cipher text
α          time required for modulo exponentiation
β          communication speed

between R and DB. Note that in PIR research, communication cost is what is mostly considered, while computation cost has not been seriously taken into account. However, recent work [11] shows that computation cost sometimes dominates the time of information retrieval and thus cannot be omitted in practice. We examine this issue here as well, under the assumption that modular exponentiation is run with the latest technology (as fast as possible).

In the following, we describe our improved proposal, which uses multiple Buffers. Wlog, we simply assume there are two buffers, Buffer₁ and Buffer₂.

1) Sender S associates keyword W with the message M, encrypts the message as E(M) and sends it to DB.
2) DB stores E(M) in the main database and returns the corresponding address ρ at which E(M) is stored.
3) Sender S inputs W to the Bloom filter to get the k outputs as addresses of Buffer₁ and Buffer₂.
4) Sender S then encrypts γ copies of ρ as $\{\tilde{B}_j\}$ and writes them into γ addresses of Buffer₁ and Buffer₂, respectively. The γ addresses are randomly chosen from the k outputs.
5) S modifies the encrypted data as in [4].

When R intends to search DB for the message associated with the keyword W:

1) R inputs W to the Bloom filter and gets the k addresses H(W) of the Buffer.
2) R executes BlockPIR on these addresses k times and gets the k outputs $\{\tilde{B}_{H(W)}\}$.
   a) To retrieve the i-th element of $\{\tilde{B}_j\}$, R generates a random vector $\sigma \in \{0,1\}^m$ and $\sigma' = \sigma \oplus e_i$ (σ with its i-th bit flipped), and sends the queries σ and σ′ to Buffer₁ and Buffer₂, respectively.
   b) Buffer₁ computes $\eta = \bigoplus_{\sigma(j)=1} \tilde{B}_j$ with σ, and Buffer₂ computes $\eta' = \bigoplus_{\sigma'(j)=1} \tilde{B}_j$ with σ′. They return η and η′ to R.
   c) R obtains the i-th element of $\{\tilde{B}_{H(W)}\}$ as $\eta \oplus \eta'$.
   d) Repeat k times to recover $\{\tilde{B}_{H(W)}\}$.
3) R decrypts $\{\tilde{B}_{H(W)}\}$ with his secret key and gets $\{B_{H(W)}\}$.
4) R executes CPIR³ on the address ρ of DB and gets the M associated with W.

³ Or uses BlockPIR to retrieve the encrypted message, but with multiple main databases.
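The BlockPIR exchange in step 2 of the search procedure can be sketched as follows. This is a minimal illustration, assuming both buffers hold identical lists of equal-length byte blocks and using 0-based indices; the function names (`make_queries`, `answer`, `retrieve`) are hypothetical and do not come from the scheme's description.

```python
import secrets


def xor_bytes(a, b):
    """Bit-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_queries(m, i):
    """Receiver side: a random m-bit vector sigma and a copy with bit i flipped."""
    sigma = [secrets.randbits(1) for _ in range(m)]
    sigma_prime = list(sigma)
    sigma_prime[i] ^= 1
    return sigma, sigma_prime


def answer(blocks, query):
    """Buffer side: XOR of all blocks whose query bit is 1."""
    acc = bytes(len(blocks[0]))  # all-zero block of the right length
    for block, bit in zip(blocks, query):
        if bit:
            acc = xor_bytes(acc, block)
    return acc


def retrieve(eta, eta_prime):
    """Receiver side: the two answers differ exactly by the wanted block."""
    return xor_bytes(eta, eta_prime)
```

For example, with `blocks = [bytes([j]) * 4 for j in range(8)]` and `sigma, sigma_prime = make_queries(8, 5)`, the call `retrieve(answer(blocks, sigma), answer(blocks, sigma_prime))` gives back `blocks[5]`. Each buffer sees only a uniformly random bit vector, which is why the privacy argument depends on the two buffers not colluding.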

As explained in Section III, BlockPIR involves much simpler calculation than CPIR. Therefore, we can drastically reduce the computation cost by using BlockPIR with multiple buffers instead of CPIR.

B. Performance of Previous Scheme

Boneh et al. [4] made use of computationally secure PIR (CPIR) to get the corresponding addresses of encrypted data from the Buffer. However, as we have pointed out, the computation cost should be fairly included from a practical point of view. Here, we adopt the widely used Paillier-based CPIR to evaluate the performance of the proposal from CRYPTO’07 [4]. The Paillier scheme [10] is a public key encryption scheme with a homomorphic property. Note that Paillier CPIR involves relatively simple computation, so we can expect it to consume less computation time than other CPIR schemes.

The flow of using CPIR to retrieve an encrypted address (which denotes the position of the target data in the main database) from the Buffer is as follows. R calculates $\sqrt{M}$ modular exponentiations to generate $\sqrt{M}$ queries Q. R sends the $\sqrt{M}$ queries, all of which are encrypted by the Paillier cryptosystem. Then, DB calculates M multiplications and $\sqrt{M}$ additions to generate the answers. DB sends $\sqrt{M}$ answers, each being the sum for one line. Finally, R calculates one modular exponentiation to decrypt the target answer. This is repeated κ times, where κ is the length of a ciphertext in the Buffer, and k times, where k is the number of addresses specified by the Bloom filter. Note that one modular exponentiation typically takes 100 times as long as one multiplication [11]. Therefore, the time required for CPIR is as follows:

$$k\kappa\left[\left(\sqrt{M} + \frac{M}{100}\right)\alpha + \frac{2\kappa\sqrt{M}}{\beta}\right] \qquad (1)$$

C. Implementation with IPIR - For Comparison

We use an IPIR protocol to get the data from the Buffer, only for comparison. The flow of using IPIR to get data from the Buffer is as follows. R sends a query for all data; R needs no calculation for this. Then, DB sends all the data in the Buffer, which has a size of M.
DB needs no calculation, either. Finally, R calculates modular exponentiations to decrypt the target data k times, where k is the number of addresses specified by the Bloom filter. Therefore, the time required for IPIR is as follows:

$$k\alpha + \frac{M+1}{\beta} \qquad (2)$$
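To get a feel for the magnitudes, the time estimates can be evaluated numerically. The sketch below is an illustration under stated assumptions rather than the evaluation code of this paper: eq. (1) is read as kκ[(√M + M/100)α + 2κ√M/β], the BlockPIR estimate of Section IV-D is read as k(α + (m + κ)/β), the parameter values are those adopted in Section V, and all function and constant names are hypothetical. Times are in seconds, sizes in bits, and β in bits per second.

```python
import math


def cpir_time(M, k, kappa, alpha, beta):
    """Eq. (1), assumed grouping: k*kappa*[(sqrt(M) + M/100)*alpha + 2*kappa*sqrt(M)/beta]."""
    root = math.sqrt(M)
    return k * kappa * ((root + M / 100) * alpha + 2 * kappa * root / beta)


def ipir_time(M, k, alpha, beta):
    """Eq. (2): k decryptions plus the transfer of the whole Buffer."""
    return k * alpha + (M + 1) / beta


def blockpir_time(m, k, kappa, alpha, beta):
    """BlockPIR estimate of Section IV-D, assumed grouping: k*(alpha + (m + kappa)/beta)."""
    return k * (alpha + (m + kappa) / beta)


# Parameter values taken from Section V.
K = 26            # number of hash functions in the Bloom filter
KAPPA = 2048      # ciphertext length in bits
M_INDEX = 3.4e5   # number of Buffer indices m
M_BITS = M_INDEX * KAPPA  # Buffer size M in bits
ALPHA = 1.8e-5    # seconds per 2048-bit modular exponentiation
BETA = 1e6        # communication speed in bits per second

if __name__ == "__main__":
    print("CPIR     :", cpir_time(M_BITS, K, KAPPA, ALPHA, BETA))
    print("IPIR     :", ipir_time(M_BITS, K, ALPHA, BETA))
    print("BlockPIR :", blockpir_time(M_INDEX, K, KAPPA, ALPHA, BETA))
```

With these inputs the three functions evaluate to roughly 1.2 × 10⁷ s, 7 × 10² s and 8.9 s, in line with the totals reported in Table II.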

D. Performance of Our Proposal

Regarding the performance of our proposal, it is easy to see that using BlockPIR reduces the computation cost considerably, and the communication cost is also acceptable considering current networking technology. More precisely, we use a distributed Buffer, say two Buffers, which do not collude to find what information has been retrieved from the Buffer. The query of R is a random m-bit sequence, and DB₁ and DB₂ each calculate the bit-wise XOR over the positions of the requested bit sequence in which “1” appears. Consequently, this consumes at most m − 1 XOR calculations. Then DB₁ and DB₂ return their answers, which are κ-bit sequences, to R. R further calculates the bit-wise XOR of these two bit sequences and runs one modular exponentiation to recover one piece of information. Since a bit-wise XOR operation can be computed within a few CPU clock cycles and is thus negligible compared to modular exponentiation, we only need to account for the following, for the k outputs from the Bloom filter:

$$k\left(\alpha + \frac{m+\kappa}{\beta}\right) \qquad (3)$$

V. COMPARISON

To compare these schemes based on distinct PIR protocols in a clear way, we apply typical, concrete values to each parameter. Let the security parameter be $2^{80}$; then the ciphertext length κ of the Paillier cryptosystem is 2048 bits. When the size of the main database is assumed to be 5 GBytes (similar to Gmail services), the number of data items that can be stored in DB is $5 \times 10^4$, assuming an average size of 100 KBytes per item. The size of the Buffer M and the number of hash functions k in the Bloom filter are calculated from these values. The Bloom filter is efficient to build by using a probabilistic method [1], and the Buffer needs to be large enough to store all addresses of the main database. For the CRYPTO’07 scheme, it is proved that, thanks to the Color Survival Game [8], the addresses of the main database survive with high probability even after multiple overwrites on the Buffer. The probability P that the game fails is the probability that the address at which the main database stores the encrypted message is deleted during overwriting. It is given by $P = n/2^k$. When P is 0.1%, the number of hash functions of the Bloom filter is $k = \log_2(5 \times 10^4 \times 10^3) \approx 26$. Suppose that the Bloom filter has a false positive probability $C_b = 0.1\%$; then the number of indices is $m = 1.4 \times 10^5$. However, an appropriate m could be chosen as m = 2nk. Hence, $m = 3.4 \times 10^5$, and $M = m \times \kappa = 83.2$ MB. Sion et al.
remarked that a modular exponentiation on 1024 bits takes 3700 ns [11], where they use an Intel(R) Pentium(R) 4 CPU running at 3.60 GHz with 1 GByte of RAM and the GNU Multiple Precision Arithmetic Library 4.2.1 [13]. By their estimate [11], the calculation time (α) of a modular exponentiation on 2048 bits is $1.8 \times 10^4$ ns. Let the time of a multiplication be 180 ns, the time of an XOR operation be negligible, and the communication speed (β) between DB and R be 1 Mbps⁴. Table II shows the comparison of keyword search schemes based on different PIRs. For example, in the typical setting described above, the instances of CPIR and IPIR take $1.2 \times 10^7$ s from eq. (1) and $6.9 \times 10^2$ s from eq. (2), respectively. However, with distributed Buffers, we achieve a more practical result of 8.9 s from eq. (3).

⁴ It is reasonable to assume this, since this value is quite practical and common on the current internet; see also the discussion in [11].

TABLE II
A TYPICAL SETTING FOR SCHEMES BASED ON VARIOUS PIRS

                  CPIR (Paillier)    IPIR            BlockPIR (Ours)
Buffer            single             single          multiple (distributed)
Comm. overhead    2κ√M bits          M + 1 bits      2(m + κ) bits
Total time        1.2 × 10^7 s       6.9 × 10^2 s    8.9 s

Note that even the

trivial method (IPIR) is much more efficient than what the CRYPTO’07 paper proposed, because computation costs far more than one might have thought.

In the following (the x-axis represents the buffer size in bytes, and the y-axis the time consumed in seconds), we compare these schemes in different settings. Our result (see full version) shows the total time consumed by the three schemes when β = 1 Mbps, for α = 18 μs and α = 18 ns respectively. It shows that even when computation speed is increased, the improvement has little impact, since computation cost is dominant in the time consumed. And when α = 18 μs, for β = 100 Mbps and β = 128 kbps respectively, our proposal using BlockPIR is also the most efficient across the different bandwidths.

VI. CONCLUSION

In this paper, we have proposed a practical keyword search scheme that performs better than the previous work, which is theoretically interesting but impractical because the computation of private information retrieval costs much more than was expected. We propose a simple but effective modification to overcome this problem, which greatly enhances the performance and furthermore enables privacy-preserving outsourcing techniques.

REFERENCES

[1] B. Bloom, “Space/time trade-offs in hash coding with allowable errors”, Communications of the ACM, 13(7):422-426, July 1970.
[2] D. Boneh, G. D. Crescenzo, R. Ostrovsky and G. Persiano, “Public Key Encryption with Keyword Search”, Proceedings of EUROCRYPT 2004: 506-522.
[3] D. Boneh, E. Goh and K. Nissim, “Evaluating 2-DNF Formulas on Ciphertexts”, Proceedings of TCC’05, pp.325-341, 2005.
[4] D. Boneh, E. Kushilevitz, R. Ostrovsky and W. Skeith, “Public Key Encryption that Allows PIR Queries”, Proceedings of CRYPTO 2007: 19-23, 2007.
[5] B. Chor, O. Goldreich, E. Kushilevitz and M. Sudan, “Private Information Retrieval”, IEEE FOCS, pp.41-50, 1995.
[6] Y. Lindell and B. Pinkas, “Privacy preserving data mining”, Proceedings of CRYPTO 2000: 36-54, 2000.
[7] H. Lipmaa, “An Oblivious Transfer Protocol with Log-Squared Communication”, The 8th Information Security Conference (ISC), LNCS 3650, pp.314-328, 2005.
[8] R. Ostrovsky and W. Skeith, “Private Searching on Streaming Data”, Proceedings of CRYPTO 2005: 223-240, 2005.
[9] R. Ostrovsky and W. Skeith, “A survey of single database PIR: Techniques and applications”, Cryptology ePrint Archive, Report 2007/059, 2007.
[10] P. Paillier, “Public-key cryptosystems based on composite degree residuosity classes”, Proceedings of EUROCRYPT’99, LNCS 1592, pp.223-238, 1999.
[11] R. Sion and B. Carbunar, “On the Computational Practicality of Private Information Retrieval”, Proceedings of the 14th Annual Network and Distributed System Security Symposium (NDSS 2007).
[12] V. S. Verykios, E. Bertino, I. N. Fovino, L. P. Provenza, Y. Saygin and Y. Theodoridis, “State-of-the-art in privacy preserving data mining”, SIGMOD Record 33(1): 50-57, 2004.
[13] GMP: GNU Multiple Precision Arithmetic Library. Online at http://www.swox.com/gmp/, 2005.
