First Asian Himalayas International Conference on Internet AH-ICI2009 Hyatt Regency Kathmandu, KATHMUNDU, NEPAL 3- 5th of November, 2009
SMC Protocol for Privacy Preserving in Banking Computations Along with Security Analysis
Rohit Pathak (Acropolis Institute of Technology & Research, Indore, M.P., India)
Satyadhar Joshi (Shri Vaishnav Institute of Technology & Science, Indore, M.P., India)
Abstract- The expansion of the Internet has taken banking to a new level and created many opportunities for joint transactions in which multiple banks cooperatively carry out a computation. Such computations use confidential data of the participating banks. Because this data is private to the organization that owns it, its security is a prime concern, and privacy must be preserved because no party can be trusted enough to know all the inputs of the computation. In this paper we propose a scalable and efficient protocol for performing secure multi-party computations on encrypted data. The data is encrypted in a manner that does not affect the result of the computation. Each organization creates virtual parties and distributes its encrypted data among them. Modifier tokens are generated during encryption, assigned to the virtual parties, and finally used in the computation. The computation function uses the received data and modifier tokens to compute the result. Because the data involved in the computation is encrypted, the correct result can be computed without revealing the data, and the privacy of the parties is maintained. The protocol is well suited to banking computations. We analyze the security and complexity of the protocol and show how zero hacking security, i.e. a probability of hacking close to zero, can be achieved. We also analyze the performance through various tests.
Keywords- Secure Multi-Party Computation, Communication and Information Security
I. INTRODUCTION
With the advent of the 21st century, the expansion of the Internet has taken banking to a new level. The World Wide Web becomes more reliable with each passing day, and a large share of the population relies on it for daily work. The Internet has penetrated deeply into the banking industry, and nearly every desired transaction can be performed online: fund transfers, payments, shopping and many other types of transaction are possible with just a few clicks. Communication between different banks over the Internet has given rise to a new banking structure. In this new banking era there are numerous occasions on which two or more parties or banks need to perform a joint transaction. There may be a need to perform surveys involving confidential data from different banks and other organizations, and banks may outsource their work to BPO organizations. Security and privacy are the two major issues the banking industry must address to increase the number of customers banking online. Another important issue is how information can be shared with other banks and organizations so as to
ensure that the data and vital information of the customer or the bank remain secured and protected. What does security actually mean in the context of banking, and how is it different from privacy? Is it even different? Are security concerns any different for a multinational bank than for a national bank, and is the customer or the online service provider ultimately held responsible in the event of failure? One thing is certain: as organizations and the general public continue to adopt online banking for their transactions and other processes, concerns around security take on an entirely different dimension. Today's online banking systems have amplified and broadened security needs to the extent that security concerns now overarch all other IT-enabled sectors. Security was a banking concern long before online banking made processes and capabilities available to customers online. It is not merely a consequence of the proliferation of information technology. Prior to IT-enabled banking, the definition of security was more passive: the state of being safe or secure. The security questions typically discussed in that context were who physically altered which accounts, where the accounts or money were stored, and how they were transported there. The absence of proper data security and cyber laws is encumbering banking and its business prospects. There is also tremendous hype and a lack of understanding of the issues surrounding security. The most significant security issues revolve around the protection of data in one manner or another.
Some of the information security and data privacy challenges that banks face include the lack of stringent data protection laws, the use of portable devices such as laptops by employees to store confidential information, rising data security costs due to increased employee background checks, the training of employees in maintaining data security, ensuring compliance with the security policies implemented in the company, and the systematic plugging of loopholes through employee activity monitoring procedures. To ensure that the confidentiality of a customer's or bank's information is maintained, data security measures must be implemented; these can be classified into measures taken at the recruitment level and measures taken at the operational level.
II. RECENT WORKS
978-1-4244-4570-7/09/$25.00 ©2009 IEEE
Kindly cite the paper, presentation or any other work you refer; this work is indexed in IEEE Xplore DL. Say no to Plagiarism.
Yao described the millionaires' problem, gave a solution using deterministic computations, and introduced a view of secure computation [1]. A collaborative benchmarking problem has been studied, with a proposed solution in which the private shares are altered in such a way that their sum remains the same [2]. Atallah et al. provided privacy-preserving solutions to collaborative forecasting and benchmarking that can be used to increase the reliability of local forecasts and data correlations, and to evaluate local performance against global trends [3]. Du and Zhan proposed a new paradigm for developing practical solutions to SMC problems, based on an acceptable security model that allows partial information disclosure [4]. Null and Wong presented a unified approach to multilevel database security based on two ideas: a trusted filter and an inference engine [5]. Du and Atallah proposed the privacy-preserving cooperative linear system of equations problem and the privacy-preserving cooperative linear least-squares problem [6]. Canetti et al. studied adaptively secure multi-party computation, including the case where even uncorrupted parties may deviate from the protocol by keeping a record of all past configurations [7]. Atallah et al. gave a protocol for sequence comparisons in which neither party reveals anything about its private sequence to the other party [8]. Atallah et al. also proposed Secure Supply-Chain Collaboration (SSCC) protocols that enable supply-chain partners to cooperatively achieve desired system-wide goals without revealing the private information of any party, even though the jointly computed decisions require the information of all the parties [9].
Maurer discussed the problem of defining and achieving security in a context where the database is not fully trusted, i.e., when the users must be protected against a potentially malicious database [10]. Agrawal and Srikant built a decision-tree classifier from training data in which the values of individual records had been perturbed, and described a reconstruction procedure that accurately estimates the distribution of the original data values [11]. The Anonypro protocol introduced a sound concept for making incoming data anonymous [12-14], but it assumed that the connection between the party and the anonymizer is secure.
When a calculation includes data from many organizations, the safety of each organization's data is the prime concern. Suppose a statistical calculation is to be performed among several organizations. The calculation includes information about persons related to the organizations, whether employees working for them or their customers, such as the customers of a bank. In this case the information of every person must be kept secure so as to preserve the privacy of every individual.
Communication between different banks over the Internet has given rise to a new banking structure, with numerous occasions on which two or more parties or banks need to perform a joint transaction. There may be a need to perform surveys involving confidential data from different banks and other organizations, and banks may outsource their work to BPO organizations. In such joint transactions multiple banks cooperatively conduct a computation that uses their confidential data. Because this data is private to the organization that owns it, its security is a prime concern, and privacy must be preserved because no party can be trusted enough to know all the inputs of the computation.
In this paper we show how the scalable and efficient protocol proposed in [15] can be used to perform secure multi-party computations on encrypted data for banking computations. In our previous work the protocol has already been applied to military [16], business process outsourcing [17] and statistical computations [18]. The data is encrypted in a manner that does not affect the result of the computation. Each organization creates virtual parties and distributes its encrypted data among them. Modifier tokens are generated during encryption, assigned to the virtual parties, and finally used in the computation. The computation function uses the received data and modifier tokens to compute the result. Because the data involved in the computation is encrypted, the correct result can be computed without revealing the data, and the privacy of the parties is maintained. We analyze the security and complexity of the protocol and show how zero hacking security can be achieved. We also analyze the performance through various tests.
III. METHOD PROPOSED
A. VPP (Virtual Party Protocol) Algorithm
Informal Description
We have to compute the function f(a1, a2, a3…, an), where the number of arguments depends on the number of data items sent by the organizations. There are n banks B1, B2, B3…, Bn. Each bank Bi has data Xi1, Xi2, Xi3…, Xim. Each bank Bi has some trusted anonymizers Ai1, Ai2, Ai3…, Aix. There are z un-trusted anonymizers A1, A2, A3…, Az.
For every bank Bi we create some fake trivial data entries Fi1, Fi2, Fi3…, Fiq, where q is the total number of fake entries. The number q may differ from bank to bank, but for simplicity of explanation it is kept the same for every bank. The fake data is generated in such a way that it does not affect the overall result. We group this data with the original data entries Xi1, Xi2, Xi3…, Xim, which gives a new group of m+q data items Di1, Di2, Di3…, Di(m+q). The value of each data item Di1, Di2, Di3…, Di(m+q) is encrypted to obtain the encrypted data Ei1, Ei2, Ei3…, Ei(m+q).
For every bank Bi we create k virtual identity banks Bi1, Bi2, Bi3…, Bik. The encrypted data Ei1, Ei2, Ei3…, Ei(m+q) is distributed randomly among the virtual bank parties Bi1, Bi2, Bi3…, Bik. Modifier tokens Ti1, Ti2, Ti3…, Tik are generated for every bank Bi. These modifier tokens are randomly distributed among the virtual bank parties Bi1, Bi2, Bi3…, Bik such that every virtual bank party gets one modifier token. Encryption of data and generation of modifier tokens is explained in later sections.
The virtual bank parties Bi1, Bi2, Bi3…, Bik then distribute their data and modifier tokens randomly among the trusted anonymizers Ai1, Ai2, Ai3…, Aix. The trusted anonymizers distribute their data randomly among the un-trusted anonymizers A1, A2, A3…, Az. Anonymizers can take data from multiple parties. The data of the un-trusted anonymizers is sent to the third party. It is assumed that the anonymizers only redirect the data and cannot store it.
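The preparation steps described above can be illustrated for one concrete case. The following Python sketch assumes that the joint computation f is a plain sum. The additive masking used as "encryption", the zero-sum fake entries, and the helper names `make_fake_entries`, `prepare_bank` and `h` are all assumptions made for illustration; this section does not specify the actual encryption or token-generation scheme.

```python
import random

def make_fake_entries(q, rng):
    """Fake entries F_i1..F_iq for a SUM computation: values that add up to
    zero, so they do not change the overall result (assumed construction)."""
    fakes = [rng.randint(-1000, 1000) for _ in range(q - 1)]
    fakes.append(-sum(fakes))
    return fakes

def prepare_bank(values, q, k, rng):
    """Turn one bank's data X_i1..X_im into k virtual parties, each holding
    some encrypted items and exactly one modifier token."""
    data = values + make_fake_entries(q, rng)            # D_i1..D_i(m+q)
    rng.shuffle(data)
    masks = [rng.randint(1, 10**6) for _ in data]
    encrypted = [d + r for d, r in zip(data, masks)]     # E_ij = D_ij + r_ij (assumed)

    virtual = [{"items": [], "token": 0} for _ in range(k)]
    for e in encrypted:                                   # random distribution of E_ij
        rng.choice(virtual)["items"].append(e)

    # Modifier tokens T_i1..T_ik: random values that together cancel the masks.
    tokens = [rng.randint(-10**6, 10**6) for _ in range(k - 1)]
    tokens.append(-sum(masks) - sum(tokens))
    rng.shuffle(tokens)
    for vp, t in zip(virtual, tokens):
        vp["token"] = t
    return virtual

def h(virtual_parties):
    """Third-party function for the SUM case: encrypted items plus tokens."""
    return sum(sum(vp["items"]) + vp["token"] for vp in virtual_parties)
```

In this sketch no single virtual party holds enough information to recover an original value, yet `h` applied to the pooled virtual parties of all banks returns the same value as summing the original data directly.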
The function h() uses the encrypted data and the modifier tokens to compute the correct result. The function h() varies with the type of computation and depends closely on f(). The third party computes the value of h(E11, E12, E13…, E1(m+q)…, En1, En2, En3…, En(m+q), T11, T12, T13…, T1k…, Tn1, Tn2, Tn3…, Tnk), which is the desired result, the same as the result computed by the function f(X11, X12…, X1m, X21, X22…, X2m, X31, X32…, X3m…, Xn1, Xn2…, Xnm), and this result is declared publicly. The whole scenario can be seen in Fig. 1.

Fig. 1. Data flow in VPP with a five-layer structure consisting of, from start to end, the party layer (B1…Bn), the virtual party layer (Bi1…Bik), the trusted anonymizer layer (Ai1…Aix), the untrusted anonymizer layer (A1…Az) and the computation layer (TTP).

Formal Description
VPP Algorithm
Identifier List:
Bi – Bank or participant, where i ranges from 1 to n
Xij – Data of party Bi, where j ranges from 1 to m
Fij – Fake data of party Bi, where j ranges from 1 to q
Dij – Total data including the fake and the original data
Bij – Virtual party of party Bi, where j ranges from 1 to k
Eij – Encrypted data associated with party Bi, where j ranges from 1 to m+q
Aij – Trusted anonymizer of party Bi, where j ranges from 1 to x
Ay – Untrusted anonymizer, where y ranges from 1 to z
TP – Third party
Start VPP
• Create k virtual banking parties Bij for every party Bi
• Create fake data Fij for every party Bi
• Group fake data Fij with original data Xij to get Dij
• Encrypt data Dij to get Eij
• Create modifier tokens Tij for every virtual bank party Bij
• Distribute the encrypted data Eij among the virtual bank parties Bij
• Send the data and modifier tokens from virtual bank party Bij to trusted anonymizer Aij
• Send the data and modifier tokens from trusted anonymizer Aij to un-trusted anonymizer Ay
• Send the data from un-trusted anonymizer Ay to TP
• Calculate the result at TP using the encrypted data and the modifier tokens
• The result is announced by TP
End of Algorithm

IV. SECURITY ANALYSIS
The anonymizers hide the identity of the bank. Suppose there is one layer of anonymizers, consisting of z anonymizers A1, A2, A3…, Az. Then the probability of revealing the source of the data at the TTP is inversely proportional to the number of parties sending data, so security is higher when there is a large number of participants. To further increase the security we use two layers of anonymizers.
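The two anonymizer layers can be pictured with a small routing sketch. It assumes a very simple model in which an anonymizer forwards each payload under its own name and keeps nothing; the message fields `bank`, `payload` and `sender`, the anonymizer naming, and the functions `forward` and `route_to_third_party` are hypothetical and only illustrate the data flow, not a real network implementation.

```python
import random

def forward(messages, anonymizers, rng):
    """One anonymizer layer: every message is handed to a randomly chosen
    anonymizer, which passes the payload on under its own name only."""
    out = [{"payload": m["payload"], "sender": rng.choice(anonymizers)}
           for m in messages]
    rng.shuffle(out)  # arrival order should not hint at the origin
    return out

def route_to_third_party(messages, x, z, rng):
    """Send messages through the trusted layer (x anonymizers per bank),
    then through the shared untrusted layer (z anonymizers)."""
    after_trusted = []
    for bank in sorted({m["bank"] for m in messages}):
        trusted = [f"A{bank}_{j}" for j in range(1, x + 1)]   # A_i1..A_ix
        own = [m for m in messages if m["bank"] == bank]
        after_trusted.extend(forward(own, trusted, rng))
    untrusted = [f"A{y}" for y in range(1, z + 1)]            # A_1..A_z
    return forward(after_trusted, untrusted, rng)
```

In this model the third party receives every payload exactly once, but each one is labelled only with the name of an untrusted anonymizer, so the originating bank and virtual party are not visible in what arrives.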
When bank Bi has ki virtual bank parties, the probability of hacking the data of any virtual bank party of bank Br is

$$P(VB_r) = \frac{k_r}{\sum_{i=1}^{n} k_i} \qquad (2)$$
Even if the data of a virtual bank party is hacked, security is not breached, because this data is encrypted. The probability of hacking the data of any party r is calculated as

$$P(P_r) = \frac{k_r}{\sum_{i=1}^{n} k_i} \times \frac{k_r - 1}{\sum_{i=1}^{n} k_i - 1} \times \cdots \times \frac{1}{\sum_{i=1}^{n} k_i - k_r + 1} \qquad (3)$$
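To get a feel for the magnitudes involved, the two probabilities can be evaluated exactly. The sketch below reads equation (3) as a product of k_r factors (k_r − j)/(Σk_i − j) for j = 0…k_r − 1, which is an interpretation of the formula; the function names `p_virtual_party` and `p_party` and the example values (four banks with equal numbers of virtual parties) are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

def p_virtual_party(k, r):
    """Equation (2): k_r divided by the total number of virtual parties.
    k is the list [k_1, ..., k_n]; r is a zero-based index into it."""
    return Fraction(k[r], sum(k))

def p_party(k, r):
    """Equation (3), read as the product of (k_r - j) / (sum(k) - j)
    for j = 0 .. k_r - 1."""
    total = sum(k)
    p = Fraction(1)
    for j in range(k[r]):
        p *= Fraction(k[r] - j, total - j)
    return p

# Illustrative values: n = 4 banks, each with the same number of virtual parties.
table = {kv: p_party([kv] * 4, 0) for kv in range(1, 9)}
```

Under this reading the product collapses to 1 / C(Σk_i, k_r), which can be checked against `math.comb`. The value therefore falls off very quickly as the number of virtual parties per bank grows.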
Fig. 2. Graph between probability of hacking and number of participants
Now consider the two-layer architecture. Each participant bank has x trusted anonymizers Ai1, Ai2, Ai3…, Aix, and there are z un-trusted anonymizers A1, A2, A3…, Az. Because there are two layers, security increases by a factor equal to the total number of anonymizers in the trusted layer.
Fig. 4. Graph between number of Virtual Parties (x axis) vs Probability of hacking (y axis).
Fig. 3. Graph between probability of hacking and number of participants
If the TTP is malicious, it may try to reveal the identity of the source of the data. The two layers of anonymizers preserve the privacy of the source: a set of anonymizers makes the source of the data anonymous and so preserves the privacy of the individual. The larger the number of anonymizers in the anonymizer layer, the lower the possibility of compromising the privacy of the data. Each virtual bank party reaches the TTP on its own, appearing as an individual bank, and the TTP does not know which actual bank created the virtual bank party. The probability of hacking the data of virtual bank party Bir is

$$P(VB_{ir}) = \frac{1}{\sum_{i=1}^{n} k_i} \qquad (1)$$
Fig. 5. Graph between number of Parties (x axis) vs Probability of hacking(y axis).
The graph of the number of virtual parties k against the probability of hacking P(Pr) for n=4 is shown in Fig. 4. It shows that the probability of hacking is nearly zero when the number of virtual parties is three or more. The graph of the number of parties against the probability of hacking for k=8 is shown in Fig. 5. With eight virtual parties the probability of hacking is of the order of 10^-5, i.e. nearly zero. If the number of virtual parties is multiplied, security improves significantly, so the number of virtual parties should be increased in multiples to increase the security. Even if the data of all virtual parties of a particular party is hacked, it will
not breach the security. The data is encrypted and can only be used for computation; the exact values can never be obtained from it.
CONCLUSION
We have shown another application of the VPP protocol, this time in banking. With the advent of the 21st century, the expansion of the World Wide Web has taken banking to a new level. Tremendous opportunities for joint transactions have arisen in which multiple banks cooperatively conduct a computation. Security and privacy preservation are major issues in such cases because confidential data is involved. We proposed a secure protocol for multi-party computations in which the privacy of the individual is preserved. We have shown that zero hacking security can be achieved by creating fake data, distributing it among the generated virtual parties, and sending this data along with modifier tokens so that the computation is carried out on encrypted data using an improvised computation method. We have also substantiated that the identity of the parties can be hidden using anonymizers. The protocol and algorithm are highly scalable and optimized for computations in surveys, banking, business and similar settings. Encryption methods have been built for certain common functions, and the process of generating modifier tokens for a collective method has been shown. SMC is used for many big surveys and large-scale statistical calculations. With VPP, most statistical calculations and other computations can be performed without revealing the data to other parties or even to the third party. Using this protocol and algorithm, a wide variety of computations can be performed with enhanced security and privacy.
REFERENCES
[1] Yao A.C., "Protocols for secure computations," Proc. of 23rd Annual Symposium on Foundations of Computer Science, pp. 160-164.
[2] Atallah M., Bykova M., Li J., Frikken K., Topkara M., "Private collaborative forecasting and benchmarking," Proc. of the 2004 ACM Workshop on Privacy in the Electronic Society, 2004.
[3] Atallah M., Bykova M., Li J., Frikken K., Topkara M., "Private collaborative forecasting and benchmarking," Proc. of the 2004 ACM Workshop on Privacy in the Electronic Society, pp. 103-114, 2004.
[4] Du W., Zhan Z., "A practical approach to solve secure multiparty computation problems," Proc. of the New Security Paradigms Workshop, 2002.
[5] Null L.M., Wong J., "A unified approach for multilevel database security based on inference engines," Transaction of ACM New York, Vol. 21, Issue 1, Feb. 1989.
[6] Du W., Atallah M.J., "Privacy-preserving cooperative scientific computations," Proc. of 14th IEEE Computer Security Foundations Workshop, pp. 273-282, 11-13 Jun. 2001.
[7] Canetti R., Feige U., Goldreich O., Naor M., "Adaptively secure multi-party computation," Proc. of the 28th Annual ACM Symposium on Theory of Computing.
[8] Atallah M.J., "Secure and Private Sequence Comparisons," Proc. of the 2003 ACM Workshop on Privacy in the Electronic Society, 2003.
[9] Atallah M.J., Elmongui H.G., Deshpande V., Schwarz L.B., "Secure supply-chain protocols," Proc. of IEEE International Conference, E-Commerce, 2003.
[10] Maurer U., "The role of cryptography in database security," Proc. of the 2004 ACM SIGMOD International Conference on Management of Data, 2004.
[11] Agrawal R., Srikant R., "Privacy-Preserving Data Mining," Proc. of the ACM SIGMOD Conference on Management of Data, 2000.
[12] Mishra D.K., Chandwani M., "Extended protocol for secure multi-party computation using ambiguous identity," WSEAS Transactions on Computer Research, Greece, Vol. 2, No. 2, pp. 227-233, Feb. 2007.
[13] Mishra D.K., Chandwani M., "Arithmetic cryptography protocol for secure multi-party computation," Proc. of IEEE SoutheastCon 2007: The International Conference on Engineering – Linking future with past, Richmond, Virginia, USA, pp. 22-24, 22-25 Mar. 2007.
[14] Mishra D.K., Chandwani M., "Anonymity enabled secure multi-party computation for Indian BPO," Proc. of the IEEE Tencon 2007: International Conference on Intelligent Information Communication Technologies for Better Human Life, Taipei, Taiwan, pp. 52-56, 29 Oct. - 02 Nov. 2007.
[15] Pathak R., Joshi S., "Secure Multi-party Computation Using Virtual Parties for Computation on Encrypted Data," Advances in Information Security and Assurance, Springer Lecture Notes on Computer Science, Springer Berlin / Heidelberg, Vol. 5576/2009, Jun. 2009, DOI=10.1007/978-3-642-02617-1_42.
[16] Pathak R., Joshi S., "Secure Multi-party Computation Protocol for Defense Applications in Military Operations using Virtual Cryptography," Contemporary Computing, Communications in Computer and Information Science, Springer Berlin Heidelberg, Vol. 40, 17-19 Aug. 2009, DOI=10.1007/978-3-642-03547-0_37.
[17] Pathak R., Joshi S., "Secured Communication for Business Process Outsourcing using Optimized Arithmetic Cryptography Protocol based on Virtual Parties," Contemporary Computing, Communications in Computer and Information Science, Springer Berlin Heidelberg, Vol. 40, 17-19 Aug. 2009, DOI=10.1007/978-3-642-03547-0_20.
[18] Pathak R., Joshi S., "Secure Multi-Party Computation Protocol for Statistical Computation on Encrypted," Proc. of 2009 International Conference on Software Technology and Engineering (ICSTE 2009), 24-26 Jul. 2009.