SDPS-2010. Printed in the United States of America, June 2010. © 2010 Society for Design and Process Science

A FRAMEWORK FOR ADAPTIVE SECURE INFORMATION SYSTEM

Liangliang Xiao, Yunqi Ye
Department of Computer Science, University of Texas at Dallas
{xll052000, yxy078000}@utdallas.edu

Raymond Paul
Department of Defense
[email protected]

I-Ling Yen, Farokh Bastani
Department of Computer Science, University of Texas at Dallas
{ilyen, bastani}@utdallas.edu

ABSTRACT

We are facing increasing threats and vulnerabilities on the Internet. Conventional intrusion detection techniques are not sufficient, especially against new attacks. Thus, it is necessary to adopt intrusion tolerance techniques. However, sophisticated intrusion tolerance techniques may provide better security but also incur performance penalties. In this paper, we propose an adaptive intrusion tolerance framework to balance the tradeoff between security protection and performance overhead. We use intrusion detection systems to collect and evaluate the threat situation. Then, we develop multiple levels of protection mechanisms and adaptive protocols that can switch protection mechanisms on-the-fly according to the threat level.

1 INTRODUCTION

The growth of the Internet is dramatically revolutionizing many facets of modern society. A large amount of information in a broad range of knowledge domains is readily available at our fingertips. Emerging Internet-based business paradigms are transforming everything from the way products are designed and manufactured to the way they are marketed and sold to consumers. The rapid evolution of the Internet is also changing the operations of government and educational institutions, with more and more activities being shifted to on-line access. These changes are improving the productivity and quality of our lives. However, these benefits are threatened by our increased vulnerability to cyber attacks. A determined enemy can remotely launch well-coordinated attacks on critical information infrastructures, such as electric power distribution, financial institutions, transportation systems, etc., which can seriously disrupt critical operations and lead to catastrophic loss of lives and property.

In response to the potential threats due to cyber attacks, intrusion detection systems (IDSs) have been actively investigated (Kemmerer, et al., 2002), (Sherif, et al., 2002), (Amoroso, 1999), (Lindqvist, et al., 2001). However, IDSs are not a sufficient defense against intruders and hackers since attack strategies have continued to evolve to sidestep IDS defenses (Cholter, et al., 2002). As new IDSs are developed, new attacks are invented to exploit the weaknesses of the newly developed defense systems. The so-called “red teams” invariably manage to penetrate systems that are protected using the most sophisticated firewalls and IDSs (Kewley, et al., 2001), (Stavridou, et al., 2001). Consequently, many security experts recommend additional defenses such as intrusion prevention and tolerance (Cholter, et al., 2002), (Okena).

Intrusion tolerance techniques have been actively investigated over the past two decades (Anjum, et al., 2000), (Deswarte, et al., 1991), (Goseva-Popstojanova, et al., 2001). A common intrusion tolerance approach is to use (t, n) threshold secret sharing (TSS) schemes, which partition critical data objects into secret shares and scatter them among distributed processors (Shamir, 1979), (Rabin, 1989), (Krawczyk, 1994). Thus, intruders cannot compromise the data even if they successfully penetrate a subset of the processors (fewer than t). Also, (t, n) threshold schemes can achieve high availability since the data can be reconstructed as long as the users can access t out of the n shares. Moreover, with proper placement, users can access nearby shares to achieve good access performance.

Although data partitioning technologies can provide intrusion tolerance, they also make data access more complicated. For read accesses, the client must retrieve a sufficient number of consistent shares; otherwise, the reconstructed data may not be valid (not any prior version of the data). For update accesses, if partial updates and concurrent updates are not properly handled, they may lead to serious share inconsistency in the system. Then not only is the read performance affected, but data loss may also occur. There has been considerable work on access protocols for partitioned data. Some protocols (Cachin, et al., 2006), (Hendricks, et al., 2007) keep one version of each share on servers and use an atomic commitment protocol to prevent partial or malicious updates, which makes their update protocols very expensive. Others (Zhang, et al., 2002), (Wylie, 2005) keep multiple versions of each share on servers so that atomic updates are not necessary. But their read protocols are costly because the clients need to request shares from all the servers and may take many rounds of share retrieval to find a sufficient number of consistent shares. Moreover, they raise the issue of stale version removal (SVR) to prevent unlimited storage growth, but do not consider how to avoid the potential impact of SVR on read requests. Also, most existing works assume that each server keeps a share of every data object, so there is no directory management issue regarding where to find the data. When there are a large number of servers, this assumption is too restrictive; thus, the directory management problem has to be considered.

Intrusion tolerance techniques still have their limitations. They protect data at storage time but cannot protect them when the server needs to perform computation on them. With the increasing use of cloud storage and third-party hosting, this limitation becomes a more significant problem. Thus, it is imperative to provide an ultimate defense, i.e., keeping critical information encrypted all the time. Then, even if the system is compromised, secret data are still well protected. However, the problem is that in current practical systems, the encrypted data need to be decrypted in order to perform operations on them. Thus, the system needs to have the decryption key and critical data in their original (clear) form during the computation. Hence, attackers (external intruders or insiders) can monitor the system memory and CPU to compromise critical data and keys. Secure computation provides a solution to this problem and allows the system to operate on encrypted data without needing to decrypt them. Thus, all critical information can stay encrypted throughout its lifecycle, including communication, storage, and computation time. Only when a human client with the appropriate access rights needs to see it will the data be decrypted for display. With this technique, critical information remains well protected even if the system is compromised. Secure computation protocols have been investigated since the early 1980s (Beaver, 1989), (Cramer, 1999). However, these algorithms have excessive performance overhead.

In this paper, we propose an adaptive framework to address the tradeoff between security protection and performance overhead. First, we use IDSs to collect and evaluate the threat situation in the system. Based on the threat information provided by the IDSs, the data in the system are protected by the corresponding mechanisms. If the threat level is low, we use protection mechanisms that achieve better performance. When the threat level becomes high, we adapt the mechanisms to offer better security protection. We develop adaptation protocols to support this approach and allow on-the-fly transformation of the protection mechanism. Access to partitioned data is more complex. To achieve efficient data access, we design a DHT (distributed hash table) scheme to enable nearby share retrieval. Then, we develop the corresponding access protocols. Our protocol is based on storing multiple versions of data and provides very low access latency.

2 BASIC INFORMATION SYSTEM MODEL

The architecture of the information system we consider is shown in Figure 1. The system consists of N servers that handle client transactions accessing critical data. Let S = {S1, S2, …, SN} denote the group of N servers. Each server Si has a unique server ID denoted as IDi, 1 ≤ i ≤ N. The servers host a large number of data objects. Each data object d has a data ID denoted as IDd and a security class indicating the criticality of d. The system also has a coordinator server K which coordinates the operations of the servers in S during computation on data objects.

Our secure information system design provides two layers of defense, the IDS layer and the intrusion tolerance layer. In the IDS layer, the IDS units collaborate with each other to achieve global collaborative intrusion detection. System administrators can set the sensitivity of the IDS units. When an IDS unit detects a potential threat that exceeds a certain tolerable threshold, it reports the threat level and the server adapts its configuration and its secure computation protocols correspondingly. We consider four threat levels, i.e., “safe”, “suspicious”, “attacked”, and “compromised”. The threat level “safe” indicates that there are no attack activities in the system, and the threat level “compromised” indicates that some servers have been compromised. The threat becomes more severe as the level changes from “safe” to “suspicious”, from “suspicious” to “attacked”, and from “attacked” to “compromised”.

In the intrusion tolerance layer, we protect the data by multiple mechanisms according to the threat level as well as the security class of the data. We consider three

security classes for each data object, i.e., “public”, “secure”, and “classified”. The security class “public” indicates that the data is accessible by any client and hence does not need to be protected. The security class “secure” indicates that the data should be protected; however, since it is not at the top security level, we use adaptive protection mechanisms to achieve a tradeoff between security and performance. Specifically, if the threat level is low, data with the security class “secure” will be protected by a mechanism that places more stress on performance than security; if the threat level is high, it will be protected by a mechanism that places more stress on security than performance. The security class “classified” indicates that the data is top secret and hence should be well protected under any circumstances. In summary, more secure mechanisms are adopted when the threat level becomes more severe or the data has a higher security class, as illustrated in Figure 2. On the other hand, if the threat level decreases, the system shifts to a less secure mechanism to provide better performance.

Based on the adaptive intrusion tolerance concept, we design a framework that incorporates different protection mechanisms providing security and performance tradeoffs (as shown in Figure 3). The framework consists of four layers. The bottom layer contains the data communication services and the IDS module. The communication service can provide unprotected as well as secure and/or reliable communications. The IDSs provide the first layer of defense as well as the threat level information. Above the communication layer is the secure storage layer, which provides secure storage services, including the directory service, which guides the placement and location of data objects; the security coding module, which provides mechanisms for protecting data confidentiality; the integrity coding module, which includes integrity protection schemes; and the access protocols, which control the read and update accesses to the data. Frequently, information systems provide various data processing capabilities, such as data mining, data analysis and reporting, and many other functions. The computation is performed on the encoded data. The data may be decoded before computation, or secure computation mechanisms can be used to operate directly on the encoded data to ensure the highest security. Various data processing services can be composed together to achieve more complex processing capabilities. The service composition layer provides composition services to compose existing data processing services to achieve challenging data processing tasks.

In the following sections, we discuss the detailed mechanisms for implementing the adaptive information system framework. In Section 3, we develop the security coding scheme in the secure storage services layer and the corresponding adaptation protocol to adapt the coding scheme for different threat levels. Based on the security coding scheme, we further develop the secure computation mechanism in the data processing services layer to support computation on data protected by the security coding scheme (also discussed in Section 3). In Section 4, we discuss the DHT (distributed hash table) scheme which is used to implement the directory service, along with the corresponding access protocols. In Section 5, we conduct experiments to study the security and performance tradeoffs in the adaptive security coding mechanism. Section 6 presents the conclusion of the paper.

3 STORAGE PROTECTION AND SECURE COMPUTATION

We develop the storage protection mechanisms in the secure storage services of the framework and the secure computation protocols in the data processing services of the framework in Subsections 3.1 and 3.2, respectively. Then we design the adaptation protocols for the protection mechanisms and the secure computation protocols in Subsections 3.3 and 3.4, respectively.

3.1 Storage Service Protection

In the storage services, each data object is protected using a (t, n) threshold secret sharing (TSS) scheme, in which a data object d is split into n shares that are distributed to n servers. The detailed sharing process is shown in Figure 4: first, a random polynomial f(x) with degree less than t and constant term equal to d is selected; then the share f(IDi) is computed and distributed to server IDi, 1 ≤ i ≤ n. The TSS scheme has the property that any t shares can be used to reconstruct d, while any t' shares, t' < t, reveal no information about d. It is worth noting that when t = 1, the TSS scheme reduces to a replication scheme, and when t > 1, it is a general threshold secret sharing scheme. Based on this property, a larger t implies stronger protection of the data and a smaller t implies weaker protection. However, a larger t also means that the client needs to fetch more shares to reconstruct the data, and hence implies worse performance. Therefore, we can adjust the protection level of the data by adjusting the value of t to realize the tradeoff between security and performance. For “public” data we set t = 1, i.e., we adopt the replication scheme. For “secure” and “classified” data we set t > 1 and increase the value of t as the threat level increases. Figure 5 shows the adaptation of t values for different threat levels and security classes.

3.2 Computation Based on the Storage Service

We develop computation protocols on data based on the storage services.
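The sharing and reconstruction steps of Figure 4 can be sketched as follows. This is a minimal illustration of Shamir-style secret sharing over a prime field, not the paper's implementation; the prime P and the helper names are our own choices.

```python
import random

P = 2**31 - 1  # a Mersenne prime; any prime larger than the secret values works

def share(d, t, n):
    """Split secret d into n shares so that any t of them reconstruct d."""
    # random polynomial of degree < t with constant term d (Figure 4's f(x))
    coeffs = [d] + [random.randrange(P) for _ in range(t - 1)]
    # the share for the server with ID i is f(i)
    return {i: sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)}

def reconstruct(shares):
    """Lagrange interpolation at x = 0 from any t shares {server_id: f(id)}."""
    d = 0
    for i, yi in shares.items():
        num, den = 1, 1
        for j in shares:
            if j != i:
                num = num * -j % P
                den = den * (i - j) % P
        d = (d + yi * num * pow(den, P - 2, P)) % P
    return d
```

Note that with t = 1 the polynomial is the constant d, so every share equals d and the scheme degenerates to replication, matching the adaptation rule above.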
For “public” data, since they are replicated in plain form on servers, the host servers can execute computation directly on the data. When computation is required on “classified” data, since the security level of the data requires that they


adaptation protocol accordingly. If the threat level drops to “safe”, the servers send their shares to the coordinator server K, which reconstructs the data and performs the computation. If the threat level rises, the coordinator server K shares the (intermediate) computation result using the (t, n) TSS scheme, sends the shares to the corresponding servers, and deletes the (intermediate) computation result. Then the servers execute the computation using the MPC protocol. We summarize the adaptation protocol in Figure 8.
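The switching rule can be sketched as below. The mapping of the four threat levels to the two computation modes, in particular the boundary between “suspicious” and “attacked”, is our assumption for illustration; the paper's Figure 8 defines the actual protocol.

```python
# A sketch of the threat-level-driven switch for "secure" data; the exact
# level-to-mode mapping is our assumption, not taken from Figure 8.
THREAT_LEVELS = ["safe", "suspicious", "attacked", "compromised"]

def computation_mode(threat_level):
    """Low threat: coordinator K reconstructs and computes (better performance).
    High threat: the servers run the MPC protocol on shares (better security)."""
    if THREAT_LEVELS.index(threat_level) <= THREAT_LEVELS.index("suspicious"):
        return "coordinator"
    return "mpc"
```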

should be well protected under any circumstances, it is not secure to reconstruct the data and then perform the computation on them. In other words, the computation should be performed directly on the shares. Therefore, the host servers execute multiparty computation (MPC) for a high security guarantee. In (Gennaro, 1995), Gennaro introduced an MPC protocol based on the (t, n) TSS scheme. Without loss of generality, we consider the addition and multiplication operations on two data objects, d1 and d2, that are secret shared using polynomials f1(x) and f2(x). Note that both f1(x) and f2(x) are of degree at most t−1 and their coefficients can be different. Let d1,1, …, d1,n and d2,1, …, d2,n denote the n shares of d1 and d2, respectively, where d1,i and d2,i, for all i, are assigned to host Si. To obtain the sum or product of d1 and d2, each host simply adds or multiplies d1,i and d2,i, and the underlying polynomial becomes f1(x)+f2(x) or f1(x)⋅f2(x). In the multiplication case, the resulting polynomial f1(x)⋅f2(x) is of degree at most 2(t−1). Let si = d1,i⋅d2,i denote the result share of the operation d1⋅d2. To convert f1(x)⋅f2(x) back to a polynomial of degree at most t−1, each si is re-shared into shares sij, and sij is sent to host Sj. A set of predetermined coefficients ri, derived from the IDi values, is used to combine the new shares, i.e., the final share is sj' = Σi ri⋅sij. This MPC protocol has O(n2) communication overhead; we summarize it in Figure 6.
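The share-wise addition and the multiplication-with-degree-reduction described above can be sketched as follows. This is an illustrative toy version over a prime field (the field and helper names are our assumptions, and it omits the verifiability that Gennaro's protocol provides); it assumes n ≥ 2t − 1 so that the coefficients ri interpolate the degree-2(t−1) product polynomial at x = 0.

```python
import random

P = 2**31 - 1  # toy prime field (our choice)

def poly_eval(coeffs, x):
    return sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P

def share(d, t, ids):
    """(t, n) TSS: random degree-(t-1) polynomial with constant term d."""
    coeffs = [d] + [random.randrange(P) for _ in range(t - 1)]
    return {i: poly_eval(coeffs, i) for i in ids}

def lagrange0(ids):
    """The predetermined coefficients r_i: Lagrange weights for x = 0."""
    r = {}
    for i in ids:
        num, den = 1, 1
        for j in ids:
            if j != i:
                num = num * -j % P
                den = den * (i - j) % P
        r[i] = num * pow(den, P - 2, P) % P
    return r

def mpc_multiply(sh1, sh2, t):
    """Degree reduction: each server re-shares its product share s_i = d1_i*d2_i
    and the new share is s'_j = sum_i r_i * s_ij (requires n >= 2t - 1)."""
    ids = list(sh1)
    r = lagrange0(ids)
    resh = {i: share(sh1[i] * sh2[i] % P, t, ids) for i in ids}
    return {j: sum(r[i] * resh[i][j] for i in ids) % P for j in ids}
```

Here `lagrange0` computes exactly the predetermined ri of the text: the interpolation coefficients for evaluating a shared polynomial at x = 0 from the points IDi. Addition needs no such step, since the servers can just add their shares locally.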

4 ACCESS PROTOCOLS

Though TSS can assure data confidentiality and availability, it also raises some data access issues. First, the clients must be able to locate all the data shares, since update accesses should update all n shares while read accesses must retrieve at least t shares. We present a DHT (distributed hash table) scheme that resolves the data location problem in Subsection 4.1. Second, since at least t shares of some version are required to reconstruct the data, update accesses must ensure that there are still at least t shares of some version after they make changes to the system. At the same time, read accesses should be able to retrieve t consistent shares efficiently regardless of the concurrency level. We discuss the issues of share-based accesses in detail in Subsection 4.2 and give protocols to address them in Subsection 4.3.

3.3 Adaptive TSS

4.1 DHT

In our storage services, the data is protected by a (t, n) TSS scheme. As discussed in Subsection 3.1, the value of t can be adapted according to the threat level. Suppose that a data object d is protected by a (t1, n) TSS scheme and the threat level changes so that the protection of d should be adapted to a (t2, n) TSS scheme accordingly. We design a protocol to adapt the protection of d from the (t1, n) scheme to the (t2, n) scheme. It is not secure to reconstruct the data d during the adaptation protocol. Therefore, instead of reconstructing d and then sharing it again, we use the re-sharing technique (Gennaro, 1995) to adapt the TSS schemes. First, each share of the (t1, n) scheme is re-shared by the (t2, n) scheme and the resulting sub-shares are distributed to the n servers; after that, each share of the (t1, n) scheme is deleted and each server combines the sub-shares it received into its share of the (t2, n) scheme. The details are illustrated in Figure 7.
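The t1 → t2 re-sharing step can be sketched as follows. This is a toy illustration over a prime field (the field and helper names are our assumptions); it requires n ≥ t1 so that the old shares determine d.

```python
import random

P = 2**31 - 1  # toy prime field (our choice)

def share(d, t, ids):
    coeffs = [d] + [random.randrange(P) for _ in range(t - 1)]
    return {i: sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P
            for i in ids}

def lagrange0(ids):
    """Recombination coefficients r_i for interpolation at x = 0."""
    r = {}
    for i in ids:
        num, den = 1, 1
        for j in ids:
            if j != i:
                num, den = num * -j % P, den * (i - j) % P
        r[i] = num * pow(den, P - 2, P) % P
    return r

def adapt_shares(old_shares, t2):
    """Adapt (t1, n) shares to (t2, n) without ever reconstructing d:
    each old share is re-shared with (t2, n), and server j combines the
    sub-shares it receives as s'_j = sum_i r_i * s_ij."""
    ids = list(old_shares)
    r = lagrange0(ids)
    resh = {i: share(old_shares[i], t2, ids) for i in ids}
    return {j: sum(r[i] * resh[i][j] for i in ids) % P for j in ids}
```

The key point, as in the text, is that d only ever exists as shares: the combination step produces shares of a fresh degree-(t2−1) polynomial whose constant term is d.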

Directory service is an important part of distributed storage, enabling data location and retrieval. There have been various approaches for providing directory services. Centralized directory services are widely adopted, but the centralized approaches may suffer from a single point of failure and cannot scale well as the number of nodes in the system continues to increase. Distributed directory services such as (Douceur, et al., 2006) generally have a very high maintenance cost to keep the directory data consistent, correct, and up to date. Peer-to-Peer (P2P) systems can eliminate the requirement of a directory service. In unstructured P2P systems such as Gnutella (Risson, et al., 2006), data search is done by flooding queries or random walks, which are costly and cannot guarantee that a data object will be found even if it exists. Structured P2P systems such as PAST (Druschel, et al., 2001) use DHT schemes to maintain a structured overlay network and implement efficient data searches while guaranteeing that a data object can always be found as long as it exists in the system. In a (t, n) TSS scheme, each read request involves accesses to at least t data shares. DHT schemes such as Chord (Stoica, et al., 2001) and Pastry (Rowstron, et al., 2001) require O(logN) hops for each share lookup. Thus, accessing t shares may incur high access traffic. So we adopt the one-hop lookup scheme (Gupta, et al., 2003) for efficient share lookup. In this scheme, each server keeps

3.4 Adaptive Secure Computation

As discussed in Subsection 3.2, the computation protocols for the “classified” and “public” data are non-adaptive. However, the computation protocols for the “secure” class are adaptive to the threat level of the system. If the threat level is low, we use a coordinator server K to perform the computation to improve performance. If the threat level is high, the servers execute the MPC protocol to guarantee the security of the data. Therefore, we need to design the


multiple versions of each share to address these problems (different versions of the same share form a share history). In this way, partial updates and concurrent update-read accesses cannot affect the data availability on nearby servers, since the client can manage to read consistent shares of previous versions. However, if the protocol is not carefully designed, the read access may take considerable effort to find the latest available version. Consider an example using a (3, 5) TSS scheme. Assume that when a client request arrives at the nearest 3 servers, the three share histories, in share arrival order, contain the versions v1, v2, and v3 in three different orders, with a different latest version on each server. If the client tries to find the latest available version before retrieving a specific version of the shares, two rounds of message exchanges between the client and the servers are always required. If the client requests the latest version and then previous versions one by one, the read access may complete in one round of share retrieval when the latest versions of the shares on different servers are consistent; but in this example, it requires three rounds of retrieval to get 3 consistent shares. We try to improve the access performance in two aspects. First, we use totally ordered version numbers to define the update order (including concurrent updates) and maintain the multiple versions of one share in version number order. In this way, the probability of retrieving consistent shares in the first round of share retrieval is increased significantly. For example, all three servers in the above example would contain the share history ⟨v1, v2, v3⟩ (assuming v1 < v2 < v3), and one round of share retrieval can complete the read. Second, in the first round of retrieval, each server also returns all version numbers in the share history to the client. Thus, if the client does not get t consistent shares, it can analyze the version numbers received, find the latest available version, and request that specific version of the shares in the second round.
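The client-side rule for the second round — pick the highest version number that appears in at least t of the returned version-number lists — can be sketched as follows (the function name is ours):

```python
from collections import Counter

def latest_available_version(version_lists, t):
    """Return v_h: the highest version number that appears in at least t of the
    servers' version-number lists (the target of the second retrieval round),
    or None if no version is held by t servers."""
    counts = Counter(v for vl in version_lists for v in set(vl))
    candidates = [v for v, c in counts.items() if c >= t]
    return max(candidates) if candidates else None
```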
Thus, the read access can complete in at most two rounds of retrieval. But when storing multiple versions, it is necessary to remove stale versions to avoid indefinite storage space growth. First of all, stale version removal (SVR) protocols must guarantee that there are no data losses. But even if there are no data losses, a read access may still be affected due to the race condition between concurrent SVR and read requests. For example, the client may try to retrieve a specific version that has a sufficient number of shares, but this version may be removed by SVR because a new complete version became available before the retrieval request arrived at the server. Such an aggressive removal problem generally happens due to the conflict between SVR and the second round of share retrieval (with a specific version). So the SVR protocol should prevent SVR from removing a version that an incomplete read may still request, to guarantee wait-free (Herlihy, 1991) reads.

the full routing table of the system so that the search can complete in one hop. For a read request, any t of the n shares can be used for data reconstruction. Since the communication costs between a client and different servers may differ significantly, it is desirable to retrieve shares only from nearby servers. To support such nearby share retrieval, we assign location information (e.g., longitude and latitude) to each server and also store it in the routing table. Thus, the geographically closest share-holding servers can be computed conveniently. Although geographic distances do not correlate perfectly with the real communication costs, they can be used to establish a lower bound for the latencies (Zhang, et al., 2005). Thus, accessing the geographically closest shares can yield good performance.

4.2 Issues in Data Share Based Access Protocols

When using a (t, n) TSS scheme, the client must reconstruct data from consistent shares of the same version; otherwise, the reconstructed data is invalid. To help the client identify consistent shares, we assign a version number to each update access and integrate a check vector with the version number. The check vector is a vector of hash values of the shares that can be used by servers to verify share integrity and by clients to assure correct data reconstruction. Since update accesses make changes to the system, they must guarantee that after the change the data are still available, i.e., there are no data losses. However, when keeping only one version of the shares on each server, partial updates (due to client failures during the update process) may lead to data losses. Consider an example using a (3, 5) TSS scheme with two partial updates u and w on a data object d. Assume that u only manages to update the first two shares while w only writes the next two shares to the system. Then after u and w complete, we cannot find any 3 shares of the same version for data reconstruction.
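The check vector described above can be sketched as below. The concrete hash function (SHA-256) and the function names are our assumptions for illustration; the paper does not specify them.

```python
import hashlib

def check_vector(shares):
    """One hash per share, stored with the version number; the hash choice
    (SHA-256) is our assumption for illustration."""
    return [hashlib.sha256(s).hexdigest() for s in shares]

def verify_share(share_bytes, index, vector):
    """A server (or client) checks a received share against the vector."""
    return hashlib.sha256(share_bytes).hexdigest() == vector[index]
```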
Even if there are no data losses, the client may not find t consistent shares among the nearby servers and may have to request additional shares from servers that are farther away. This can degrade performance significantly. Even if there are no partial updates, concurrent update-read accesses may also cause the same problem. Consider an update u on data object d. Assume that the version number of u is vu and the previous version number of d is vp. If some servers have updated their shares to version vu and the others have not received u yet, then, during this period of time, a read request may get some shares of version vu and other shares of version vp from nearby servers. Thus the benefits of nearby share retrieval may be lost. Using an atomic commitment protocol can address the partial update problem, but cannot handle the problem caused by concurrent update-reads. We consider storing

duration can be adjusted to balance the communication overhead for SVR against the storage space required for keeping the share histories. Also, to avoid aggressive removal, we must identify the potential version for the second round of retrieval of an incomplete read access. The incomplete read access list Si.irl on each server Si records the version number of the share retrieved in the first round of each incomplete read access (these read accesses are concurrent with the SVR). Thus, for the same RID, if we select the minimal version number vm over all Si.irl (1 ≤ i ≤ n) and a stable version vs that is less than vm, then the specific version requested in the second round of retrieval will be no less than vs. Therefore, removing versions less than vs will not affect the read access RID. So the SVR servers also collect the smallest version numbers from all Si.irl and determine a global minimal version number vgm among them. Then, given a stable version vgs that is less than vgm, removing versions less than vgs will not affect any incomplete read accesses. The SVR servers' determination results are sent to each server, and correspondingly, each server performs the stale version removal based on the majority determination from the SVR servers. Figure 11 gives the pseudo code of the SVR protocol. Since our SVR protocol does not affect any concurrent read accesses, we can guarantee wait-free reads. Furthermore, the SVR protocol can also be executed on demand when the system is idle, without affecting the access performance.
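The two determinations made by the SVR servers — which versions are stable, and the bound vgs below which removal is safe — can be sketched as follows (function names are ours; Figure 11 defines the actual protocol):

```python
from collections import Counter

def stable_versions(update_histories, n, f):
    """Versions appearing in at least n - f of the n servers' update histories."""
    counts = Counter(v for h in update_histories for v in set(h))
    return sorted(v for v, c in counts.items() if c >= n - f)

def safe_removal_bound(update_histories, first_round_versions, n, f):
    """Return v_gs: the largest stable version below v_gm, the global minimum
    of the versions retrieved by incomplete reads; versions below v_gs can be
    removed without affecting any incomplete read."""
    stable = stable_versions(update_histories, n, f)
    if not stable:
        return None
    if not first_round_versions:
        return stable[-1]
    vgm = min(first_round_versions)
    below = [v for v in stable if v < vgm]
    return below[-1] if below else None
```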

4.3 Protocols

Based on the above discussion, we present a group of protocols for update, read, and SVR to support efficient data access and safe stale version removal.

Update protocol. When a client updates a data object d, it first discovers the n servers that are assigned to host the shares of d using the DHT algorithm. Then it contacts all these servers to get their current timestamps. After receiving the responses from all servers (a timeout on a failed server is treated as a special response), the client finds maxTS, the highest timestamp among the responses. Then the client can create a new version number accordingly. Thus, the version number order exactly reflects the real update order. If two concurrent updates get the same maxTS value, the client ID can be used to ensure a total order. The client then constructs the n shares and sends the update request to the servers. Each server maintains an ordered share history for each data object to store the received shares. The pseudo code for the update protocol is given in Figure 9.

Read protocol. When a client requests a data object d, it generates a unique read access ID (RID) and contacts all servers in Sr = {the nearest t+2f servers hosting shares of d} to get their latest versions of the shares. To assure that the read can complete within two rounds of share retrieval, the client also retrieves the version number lists from the servers together with the latest versions of the shares. Each server Si maintains an “incomplete” read access list Si.irl = ⟨RID, vt⟩, where vt is the latest version number of d when Si responds to read RID; this list is used by the stale version removal protocol. If the client cannot get t consistent shares, it tries to find a version number vh, the highest version number that appears in at least t version number lists, so that it is available on at least t servers. Then the client requests shares with version number vh. After retrieving at least t consistent shares, the client can reconstruct the data.
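The version-number construction in the update protocol can be sketched as below. The exact encoding (maxTS + 1 paired with the client ID as a tiebreaker) is our assumption for illustration; Figure 9 defines the actual protocol.

```python
def new_version_number(server_timestamps, client_id):
    """Totally ordered version numbers for updates: (maxTS + 1, client_id).
    maxTS is the highest timestamp returned by the contacted servers
    (None marks a timed-out server), and the client ID breaks ties between
    concurrent updates that observed the same maxTS."""
    max_ts = max(ts for ts in server_timestamps if ts is not None)
    return (max_ts + 1, client_id)
```

Because Python compares tuples lexicographically, these version numbers form the total order that the share histories are sorted by.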
In addition, the client needs to notify all servers in Sr that this read access is completed; correspondingly, each server Si removes the item with this RID from Si.irl. The pseudo code of the read protocol is given in Figure 10. Here the client does not wait for confirmations of the read-complete notification, so the read access latency is not affected.

SVR protocol. To perform safe SVR, it is necessary to identify the versions that contain a sufficient number of shares to ensure data availability. We use the term stable version to refer to such versions. Stable version identification requires knowledge of the update histories of all n servers and cannot be done by one server independently. So for each data object d, we select 2f+1 SVR servers to determine the stable versions. They periodically collect update histories from the servers and analyze them. If a version appears in at least n−f update histories, it is identified as a stable version. The period

5 EXPERIMENTAL STUDIES

In the experiments, we study the tradeoffs of the t settings of the (t, n) TSS scheme with respect to security, availability, access performance, and storage cost. We randomly generate 3500 domains on a sphere with radius 3959 miles to simulate nodes on the globe; each domain consists of 500 nodes, and we select 2000 nodes out of the 3500×500 nodes as the storage servers. We generate the shares of 20000 data objects of size 100KB each and distribute them to the servers. The topology of the servers is generated using Inet (Winich, et al., 2002). For the security experiment, we consider the weaknesses (CWE), the attack patterns (NVD), and the relation between attack patterns and weaknesses (i.e., for each attack pattern, the weaknesses that it can exploit) provided by (NVD). We assign the weaknesses to the servers and assume that the adversary masters some attack patterns following a Zipf distribution. We assume that the adversary first randomly selects some domains and then selects the most vulnerable servers in these domains to attack. Let τ, 0 < τ ≤ 1, be the attack probability, i.e., the probability that a domain is attacked by the adversary. A server selected by the adversary is compromised if the adversary masters an attack pattern that can exploit one of the server's weaknesses. Since it will be easier for the adversary to find and attack the other servers in a domain in which one or more servers are compromised, we

6

technique”, IEEE Wireless Communications and Networking Confernce, Vol. 3, 2000, pp. 1101-1106. D. Beaver, S. Goldwasser, “Multiparty computation with faulty majority”, IEEE Sympo. Foundations of Computer Science, Oct-Nov 1989, pp. 468-473. C. Cachin and S. Tessaro, “Optimal resilience for erasure-coded Byzantine distributed storage”, DSN′06, 2006. W. La Cholter, P. Narasimhan, D. Sterne, R. Balupari, K. Djahandari, A. Mani, and S. Murphy, “IBAN: Intrusion blocker based on active networks”, Proc. DARPA Active NEtworks Conference and Exposition, 2002, pp. 182-192. R. Cramer, I. Damgård, S. Dziembowski, “On the complexity of verifiable secret sharing and multiparty computation”, ACM Symposium on Theory of computing, May 1999, pp. 325-334. Common Weakness Enumeration (CWE), http://cwe.mitre.org/. Y. Deswarte, L. Blain, and J.-C. Fabre, “Intrusion tolerance in distributed computing systems”, IEEE Symposium on Research in Security and Privacy, 1991, pp. 110-121. John R. Douceur and Jon Howell, “Distributed Directory Service in the Farsite File System”, OSDI′06, 2006. P. Druschel, A. Rowstron, “PAST: A large-scale, persistent peer-to-peer storage utility”, Proc. of the 8th Workshop on Hot Topics in Operating Systems. 2001. R. Gennaro, “Theory and practice of verifiable secret sharing”, Ph.D-thesis, MIT, 1995. K. Goseva-Popstojanova, Feiyi Wang, Rong Wang, Fengmin Gong, K. Vaidyanathan, K. Trivedi, and B. Muthusamy, “Characterizing intrusion tolerant systems using a state transition model”, DARPA Information Survivability Conference & Exposition II, 2001, pp. 211221. A. Gupta, B. Liskov and R. Rodrigues, “One hop lookups for peer-to-peer overlays”, HotOS′03, 2003. J. Hendricks, G. R. Ganger and M. K. Reiter, “Lowoverhead byzantine fault-tolerant storage”, Proceedings of 21st ACM SIGOPS symposium on Operating systems principles, 2007. M. Herlihy, “Wait-free synchronization”, TOPLAS′91, 1991. R.A. Kemmerer and G. 
Vigna, “Intrusion detection: a brief history and overview”, IEEE Computer, Vol. 35, No. 4, April 2002, pp. 27-30. D.L. Kewley and J.F. Bouchard, “DARPA Information Assurance Program dynamic defense experiment summary”, IEEE Trans. Systems, Man and Cybernetics, Vol. 31, No. 4, July 2001, pp. 331-336. H. Krawczyk, “Secret sharing made short”, CRYPTO '93", D. Stinson Ed., Lecture Notes in Computer Science, Vol. 773, Springer, Verlag, Berlin, 1994. U. Lindqvist and P.A. Porras, “eXpert-BSM: a hostbased intrusion detection solution for Sun Solaris”, Proc.

consider a higher attack probability for the situation. Let σ, τ < σ ≤ 1, be the attack probability of the servers in a domain x by an adversary y if at least one server in x is compromised by y. The security value is calculated as the ratio of the number of uncompromised data to the total number of data. For availability experiment, we assume that the nodes within a domain are all connected and the internal network does not fail. However any of the nodes may fail with probability pnf and the links on the generated topology may fail with probability pef. The availability value is calculated as the ratio of the average number of clients (represented by servers) who can reconstruct d, for all d, to the total number of clients. For access cost experiment, each client (represented by a server) issue z read requests where z follows uniform distribution in [0, 200] and the data accessed follow a zipf distribution. The read access cost is calculated as the average latency of all queries. Finally the storage cost is simply the total size of shares stored on all servers. We show the experiment results in Figure 12. As shown in Figure 12, (a) the security value increases when the t value increases, (b) the availability value does not change when the t value changes, (c) the access cost increases when the t value increases, and (d) the storage cost does not change when the t value changes. As a conclusion, increasing t value will benifit security but will loss access cost. Hence it is tradeoffs to set t value of (t, n) TSS scheme for security and performance. 6 CONCLUSION In this paper we develop the adaptive framework for secure information systems. The adaptation is based on the threat level of the system at current time and the security classes of the data to be protected. To avoid overengineering and provide optimal performance for specific security requirements, we use the (t, n)-TSS scheme to achieve adaptive data storage dependability and security. 
We design protocols to adapt the t values on the fly. When computation on the protected data is required, we also design adaptive computation schemes (i.e. computation can be either executed by a coordinator server or by n servers using MPC) to achieve best computation performance with the desired security protection. In the data storage, the directory management scheme and access protocols are also important elements that should be considered. We develop a DHT scheme to facilitate accesses to nearby shares. Also, we design the corresponding access protocols for efficient data update and read accesses without having the data loss problems. REFERENCES Edward Amoroso, “Intrusion Detection”, Intrusion.net Books, New Jersey, 1999. F. Anjum and A. Umar, “Agent based intrusion tolerance using fragmentation-redundancy-scattering 7

17th Annual Computer Security Applications Conference, 2001, pp. 240-251. National Vulnerability Database (NVD), http://nvd.nist.gov/. Okena, “The Okena StormSystem difference,” http://www.okena.com/okena_difference.html Michael O. Rabin, “Efficient dispersal of information for security, load balancing, and fault tolerance”, Journal of ACM, Vol. 36, No. 2, April 1989, pp. 335-348. John Risson and Tim Moors, “Survey of research towards robust peer-to-peer networks: Search methods”, University of New South Wales, Australia. February 2006. Antony Rowstron1, Peter Druschel, “Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems”, ACM International Conference on Distributed Systems Platforms. 2001. A. Shamir, “How to share a secret”, Communication of the ACM, Vol. 22, 1979, pp. 612-613. J.S. Sherif and T.G. Dearmond, “Intrusion detection: systems and models”, IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises, 2002, pp. 115-133. V. Stavridou, B. Dutertre, R.A. Riemenschneider, and H. Saidi, “Intrusion tolerant software architectures”, Proc. DARPA Information Survivability Conference & Exposition II, 2001, pp. 230-241. Ion Stoica, Robert Morris, David Karger, et al., “Chord: A scalable peer-to-peer lookup service for internet applications”, ACM SIGCOMM. 2001. J. Winick, J. Sugih, “Inet-3.0: internet topology generator”, Technical Report CSE-TR-456-02, EECS Department, University of Michigan, 2002. J. J. Wylie, “A read/write protocol family for versatile storage infrastructures”, Doctoral Thesis in Carnegie Mellon University, 2005. Z Zhang and Q. Lian, “Reperasure: Replication Protocol Using Erasure-Code in Peer-to-Peer Storage Network”, SRDS′02, 2002. H. Zhang, A. Goel and R. Govindan, “An Empirical Evaluation of Internet Latency Expansion”, ACM SIGCOMM Computer Communication Review, 2005.

FIGURES AND TABLES

Fig. 1 Information System Model

Fig. 2 Adaptive Protections

Fig. 3 Adaptive Framework for Secure Information Systems

Input: d ∈ Zp, t, and n
Share construction: Let f(x) = d + Σ_{1 ≤ i < t} ri⋅x^i, where each ri is randomly selected from Zp, 1 ≤ i < t. Share dj = f(IDj), where IDj is the ID of server Sj, 1 ≤ j ≤ n.
Share distribution: dj is distributed to server Sj, 1 ≤ j ≤ n.
Fig. 4 Data Sharing Scheme

Fig. 5 Adaptive Storage Service Protection Mechanisms


Initiation: d1 and d2 are shared into d1,1, …, d1,n and d2,1, …, d2,n by the (t, n) TSS scheme, respectively. The shares d1,i and d2,i are allocated to host server Si, 1 ≤ i ≤ n.
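The (t, n) TSS scheme of Fig. 4, used above to share d1 and d2, can be sketched in a few lines. The prime p and the server IDs 1..n are illustrative assumptions for the example:

```python
import random

P = 2**31 - 1  # illustrative prime field; the paper only requires d ∈ Zp

def share(d, t, n):
    """Split secret d into n shares; any t of them reconstruct it (Fig. 4)."""
    # f(x) = d + r1*x + ... + r_{t-1}*x^{t-1}, with random coefficients in Zp
    coeffs = [d] + [random.randrange(P) for _ in range(t - 1)]
    # the share for the server with ID j is f(j)
    return {j: sum(c * pow(j, i, P) for i, c in enumerate(coeffs)) % P
            for j in range(1, n + 1)}

def reconstruct(shares):
    """Lagrange-interpolate f(0) = d from at least t shares {ID: value}."""
    secret = 0
    for j, dj in shares.items():
        lj = 1
        for v in shares:
            if v != j:
                lj = lj * (-v) * pow(j - v, -1, P) % P
        secret = (secret + dj * lj) % P
    return secret

shares = share(1234, t=3, n=5)
subset = {j: shares[j] for j in (1, 3, 5)}  # any 3 of the 5 shares suffice
print(reconstruct(subset))  # 1234
```

Fewer than t shares reveal nothing about d, which is what lets the framework trade t off against access cost.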

Update(d):
1. Discover the n servers hosting d with the DHT
2. Retrieve the current timestamps from the servers
   Find maxTS
   Generate n shares of d
   Generate the check vector and the new version number v
3. Send each share to the corresponding server together with v

Addition: for every i, server Si computes d1,i + d2,i.
Multiplication: for every i, server Si computes si = d1,i⋅d2,i, reshares si into si,1, …, si,n by the TSS scheme, and sends si,j to Sj, 1 ≤ j ≤ n. Then, for every j, server Sj computes its final share sj′ = Σ_{1 ≤ i ≤ n} ri⋅si,j, where ri = Π_{v≠i} (−IDv)/(IDi − IDv), 1 ≤ i ≤ n.
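The addition step above needs no communication at all: each server adds its two local shares, and the sums are themselves a (t, n) sharing of d1 + d2. A minimal sketch with Shamir-style shares (the prime and server IDs 1..n are illustrative assumptions):

```python
import random

P = 2**31 - 1  # illustrative prime; any p larger than the secrets works

def share(d, t, n):
    coeffs = [d] + [random.randrange(P) for _ in range(t - 1)]
    return {j: sum(c * pow(j, i, P) for i, c in enumerate(coeffs)) % P
            for j in range(1, n + 1)}

def reconstruct(shares):
    s = 0
    for j, dj in shares.items():
        lj = 1
        for v in shares:
            if v != j:
                lj = lj * (-v) * pow(j - v, -1, P) % P
        s = (s + dj * lj) % P
    return s

t, n = 3, 5
d1_shares = share(100, t, n)
d2_shares = share(23, t, n)
# Addition (Fig. 6): each server S_i locally computes d1_i + d2_i;
# the sums lie on the polynomial f1 + f2, whose constant term is d1 + d2.
sum_shares = {j: (d1_shares[j] + d2_shares[j]) % P for j in d1_shares}
print(reconstruct({j: sum_shares[j] for j in (2, 4, 5)}))  # 123
```

Multiplication is costlier because the product polynomial has degree 2(t−1), which is why the protocol reshares the products before combining them.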

Fig. 9 Update Protocol

Read(d):
1. Generate RID
   Discover Sr = {the t+2f nearest servers hosting d} with the DHT
2. Retrieve the latest share version and the version number list from each server in Sr
   Each server adds (RID, vt) to its incomplete read access list
   If the client gets t consistent shares, then
     Notify the servers to remove (RID, vt)
     Return the reconstructed data
   Otherwise,
3. Find vh from the version number lists received
   Retrieve the shares with version number vh
   Notify the servers to remove (RID, vt)
   Return the reconstructed data

Fig. 6 MPC Protocol

Initiation: The ith share di of the (t1, n) TSS scheme is stored on server Si, 1 ≤ i ≤ n.
For 1 ≤ i ≤ n, server Si shares di into di,1, …, di,n using the (t2, n) TSS scheme and sends di,j to server Sj, 1 ≤ j ≤ n. After that, server Si deletes share di.
For 1 ≤ i ≤ n, server Si re-assigns its share as di = Σ_{1 ≤ j ≤ n} rj⋅dj,i, where rj = Π_{v≠j} (−IDv)/(IDj − IDv), 1 ≤ j ≤ n.
Fig. 7 Adaptation Protocol for TSS Schemes

Threat level lowers to "safe":
(1) The servers send their shares to the coordinator server K
(2) K reconstructs the data and performs the computation on the data
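The adaptation in Fig. 7 changes the threshold from t1 to t2 without any single site ever holding d: each server reshares its share, and every server combines the sub-shares it receives with the Lagrange coefficients rj. A self-contained sketch (illustrative prime and server IDs 1..n):

```python
import random

P = 2**31 - 1  # illustrative prime field

def share(d, t, n):
    coeffs = [d] + [random.randrange(P) for _ in range(t - 1)]
    return {j: sum(c * pow(j, i, P) for i, c in enumerate(coeffs)) % P
            for j in range(1, n + 1)}

def reconstruct(shares):
    s = 0
    for j, dj in shares.items():
        lj = 1
        for v in shares:
            if v != j:
                lj = lj * (-v) * pow(j - v, -1, P) % P
        s = (s + dj * lj) % P
    return s

def adapt_t(old_shares, t2):
    """Fig. 7: turn a (t1, n) sharing of d into a (t2, n) sharing of d."""
    ids = sorted(old_shares)
    n = len(ids)
    # Step 1: each server S_j reshares its share d_j with a (t2, n) scheme.
    resharings = {j: share(old_shares[j], t2, n) for j in ids}
    # Lagrange coefficients r_j for interpolating at x = 0 over all n IDs.
    r = {}
    for j in ids:
        lj = 1
        for v in ids:
            if v != j:
                lj = lj * (-v) * pow(j - v, -1, P) % P
        r[j] = lj
    # Step 2: server S_i combines the sub-shares d_{j,i} it received;
    # the results lie on a degree-(t2-1) polynomial with constant term d.
    return {i: sum(r[j] * resharings[j][i] for j in ids) % P for i in ids}

old = share(777, t=2, n=5)   # (2, 5) sharing: any 2 shares reconstruct d
new = adapt_t(old, t2=4)     # now a (4, 5) sharing of the same d
print(reconstruct({i: new[i] for i in (1, 2, 3, 4)}))  # 777
```

Raising t this way hardens the data when the threat level rises, at the cost of needing more shares per read.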

Fig. 10 Read Protocol

SVR(d): At the end of each period, each SVR server:
  Collects the update histories and the smallest version numbers in the incomplete read access lists from all n servers
  Determines the stable version list
  Determines vgm and the maximum vgs
  Sends vgs to each server
Each server:
  Selects the vgs reported by the majority of the SVR servers
  Removes the share versions less than vgs

Threat level rises:
(1) K shares the (intermediate) computation results by the TSS scheme and sends the shares to the corresponding servers
(2) K deletes all the (intermediate) computation results
(3) The servers resume the computation based on the MPC protocol
Fig. 8 Adaptation Protocol for Secure Computation Protocols

Fig. 11 SVR protocol


Fig. 12 (a) Security of the TSS Schemes Under Variant t (n=40, τ=0.01, σ=2⋅τ).

Fig. 12 (c) Access Cost of the TSS Schemes Under Variant t and n.

Fig. 12 (b) Availability of the TSS Schemes Under Variant t (n=40, pnf=0.01, pef=0.0001).

Fig. 12 (d) Storage Cost of the TSS Schemes Under Variant t and n.

