Proxy Restrictions for Grid Usage ´ Joni Hahkala1 , John White1 , and Akos Frohner2 1
2
Helsinki Institute of Physics, FIN-00014 Helsingin Yliopisto, Finland {Joni.Hahkala,John.White}@cern.ch CERN European Organization for Nuclear Research, CH-1211 Geneve 23, Switzerland
[email protected]
Abstract. The scale and power of Grid infrastructures makes them an inviting target for attack. Even if the Grid software is secure the Grid infrastructure is vulnerable via operating system vulnerabilities and misconfiguration. One of the worst results of the exploit of these vulnerabilities is user proxy credential compromise. This paper describes a pragmatic and simple way, using proxy certificate extensions, to mitigate the damage in case of credential compromise. The potential damage is limited by restricting the range of hosts that the credentials can be used to open connections to and be accepted from. This paper also describes a way to help investigate credential delegation problems.
1 1.1
Background Certificates
A user on a Grid must be trusted on all services and resources of the infrastructure that he accesses or uses. Therefore some form of credential must be presented to the Grid services in order for it to authenticate and eventually authorize a user to perform an action. Many of today’s Grid middleware systems authenticate users with Public Key Infrastructure (PKI) certificates that follow the X.509 [7] format. This X.509 certificate is used to establish the chain of trust that is needed for a user to access a remote Grid resource or service. In this identification scheme a user obtains an X.509 certificate from a recognized certificate authority (CA). In order to do this the user simultaneously generates a public and private key pair and puts the public key into a certificate request. The signed certificate request is sent to the CA and the identity of the requester is verified using an out of band method such as a telephone call to the requester’s employer. After the verification, the CA signs the request creating a certificate that can be used by the user in conjunction with their previously generated private key to prove that he is the person who requested the certificate from the CA. At this stage a trust relationship has been established between the Grid user and the CA. The Grid user has declared his identity to the CA, the CA has verified this and it has issued a credential asserting this fact. The services and N. Abdennadher and D. Petcu (Eds.): GPC 2009, LNCS 5529, pp. 48–56, 2009. c Springer-Verlag Berlin Heidelberg 2009
Proxy Restrictions for Grid Usage
49
users that trust the CA in question can now trust the owner of the certificate if he can prove he has the private key corresponding to the public key in the certificate. 1.2
Proxy Certificates
In order for a user to allow Grid services to act on his behalf, which is usually needed, the services must present the user’s credentials to a resource they need to access in order to be authenticated and authorized as the user. To first order, this can be accomplished by sending his CA-signed certificate and private key to the service giving it the power to authenticate as the user. This has an obvious security flaw in that anyone in possession, through interception, of this pair could pretend to be the user. This security problem is avoided by allowing the user to delegate their security rights to Grid services by using a different type of certificate. This is the proxy certificate which, as the name indicates, is an authorized credential derived from the original credential or intermediate proxies that is allowed to act on behalf of the Grid user. In order to generate a proxy certificate a new public/private key pair is generated and used to create a request for a proxy certificate. This request is subsequently signed using the private key of the user’s original CAissued certificate or by an intermediate proxy. Therefore, the proxy certificate is a credential that contains the identity and provenance of the owner. These certificates usually have a short validity time, usually 24 or 12 hours as opposed to CA-issued credentials that have a lifetime of months. As the proxy certificate is generated locally by the owner of the credential and the private key it is based on and not by a CA, the trust is provided by the original user and possible intermediate proxy credentials. As the private key of previous credentials is used for the signing, the chain of trust is extended from the CA to this proxy credential. In theory this proxy chain can be extended infinitely, but in practice software implementations and size constraints limit the number of proxies in a chain. In this document extensions to the basic proxy certificate, as can be currently used in the EGEE gLite middleware, are discussed.
2
Problem Definition
The main usage of proxy credentials in a Grid is the authorization of the user’s rights through the job submission system to the worker node (WN) of the cluster where the user’s actual computation takes place, and the data access from the WN. To achieve this without transferring the full credentials with private key over the network, thus exposing it to attacks, the credentials have to be delegated through the job submission system. Each move between services entails at least one delegation and thus one more proxy certificate in the certificate chain. As the services need to use the credentials for authentication on behalf of the user, the private key in the proxy credentials has to be in plain text, unencrypted
50
´ Frohner J. Hahkala, J. White, and A.
and thus unprotected. This leads to the vulnerability that if any machine or service in the job submission chain is compromised, the attacker can harvest these proxies that pass through or are stored on that host. The vulnerability is especially apparent on the WN where code submitted by the user is run or at least a program is run with input data provided by the user. A rogue Grid user has more possibilities to exploit vulnerabilities on that WN host and access the credentials of other jobs running on that host. Furthermore, if a shared file system is used, possibly all credentials in the shared file system and thus in the site can be compromised. Another possibility is to start processes on the WN that harvest incoming credentials, unless rigorous cleaning of processes is done after the job ends. After the attacker has obtained credentials, he can impersonate the owner of the credentials and e.g. access the owner’s files and run jobs under the owner’s identity. These problems are already mitigated through some limitations on proxies. Firstly, the proxies should have a short lifetime, typically they are limited to 24 hours, but for various reasons this limitation is sometimes ignored and longer proxies are used e.g. a week or 10 days. Another limitation is to use a limited proxy. When a proxy is delegated to the WN with the user’s job it is marked in the process as limited proxy. Subsequent job submissions using limited proxies are refused at the workload management system (WMS) and computing elements (CEs). Thus proxies harvested from a WN cannot be used to submit more jobs, but they can still be used to access data and other services. Limited proxies can still be used by an attacker by submitting jobs to another site using their own credentials and then within those jobs use the compromised credentials to access data and do other damage.
3
Proxy Restrictions
To mitigate the possibilities for causing damage with compromised proxies, the amount of damage that can be caused by compromised proxies and to control the compromised proxies two proxy restrictions will be described. 3.1
Source Restriction
This restriction is used to limit the source network address space from which the proxy can be used for authentication when opening connections. If a proxy is limited using this extension to an address space, Grid services will only accept the proxy as authentication credentials when the connection is opened from that specific address space. When the address spaces where the proxy will be delegated to in the future are known this restriction can be used to limit the credentials and the further delegations derived from these credentials to only work from those address spaces. As the restriction is inherited, meaning that if there is this extension a proxy certificate in the chain the further proxies derived from this proxy are also restricted. Therefore, the restriction cannot be bypassed by generating a new sub-proxy.
Proxy Restrictions for Grid Usage
51
This extension should, at least in the beginning be non-critical to allow gradual migration even on the production services. After enough systems support it, it should be marked critical to make sure the restriction is honored. Two examples of usage of this type of restriction: 1. A job is submitted by the WMS into a CE in a site. If it is known that the job runs only in the CE site without needing to delegate rights to any service to act on behalf if it and that the job will stay in the site, the proxy can be restricted to the CE site during the delegation of the credentials from the WMS to the CE. If the CE site is compromised, the credentials harvested from these hosts can only be used on that site. Thus, if there is a compromise in a site, isolating the site from the rest of the Grid will ensure that the problem is isolated. After waiting the maximum allowed proxy lifetime and after the centre has fixed the damage and cleaned the machines, the site can be taken back to the Grid with knowledge that the compromised credentials have expired and are thus unusable. 2. A job is submitted by a CE into the WN. Again, if it is known that the job only operates from that WN and does not need to delegate rights elsewhere, this restriction can be put into the credentials during the delegation from the CE to the WN thus tying the credentials to that specific WN. This way if the WN is compromised, it can simply be taken offline stopping possibility of abuse of the credentials. After the vulnerability has been fixed and after waiting that the compromised credentials have expired, the WN can be safely returned to use. Without this restriction, compromised credentials may be used anywhere in the Grid and thus the only way to recover from the incident is to ban all the users that had credentials in the compromised machines, which can be hard to determine. Thus, this restriction greatly eases the recovery from an incident and allows the users to continue working using other sites. 3.2
Target Restriction
This restriction is used to limit the target network address space where the proxy can be used to authenticate connections to. If a proxy is limited using this extension to an address space, the services will only accept the proxy as an authorized credential when the service resides in that specific address space. As above, when the address spaces where the proxy will be delegated to and what services need to be contacted using these credentials are known, this restriction can be used to limit the further delegations derived from these credentials to only work as credentials in those address spaces. Also as above the same considerations for criticality of this extension has to be taken into account. An example of the usage of the extension: If a CE knows the storage elements that a user’s job will contact to get the data it needs and the possible storage elements (SEs) where the job will store the results, it can, during the delegation to the WN, limit the proxy to be acceptable only when contacting those services.
52
´ Frohner J. Hahkala, J. White, and A.
This way if the credentials are compromised, they can only be used to contact those specific storage elements. The other Grid storage elements and services outside of the address spaces are protected and the attacker cannot cause damage to them using these credentials. Thus the possibility for damage is greatly reduced. 3.3
Identifiers
The object identifiers (OID) for these restrictions have been requested from the International Grid Trust Federation and await approval. The requested OIDs are iGTFProxyRestrictSource (1.2.840.113612.5.5.1.1.2.1) for the source restriction and iGTFProxyRestrictDestination (1.2.840.113612.5.5.1.1.2.2) for the target restriction. 3.4
Data Structure
Both of the extensions need to define a list of 0 to n namespaces containing a single internet protocol (IP) address or IP address range. RFC 5280 [2] already defines a data structure with similar properties, the NameConstraints extension. Instead of defining a new data structure and implementing it the NameContraints data structure is used and the software implementations for it are reused. The data structure contains a list of permitted subtrees and a list of excluded subtrees. The lists consist of GeneralName structures that each define a single address space. Use of three possible fields of the GeneralName have been considered: the dNSName, iPAddress and uniformResourceIdentifier. The iPAddress is a strongly preferred solution as that avoids the interaction with the DNS system when verifying the IP address with the extension. This way speed is increased by avoiding the calls to the DNS system, the reliability is increased as a crashed or slow DNS server does not affect the verification. Also the possibility to use DNS spoofing attacks is removed. For example when the WMS submits a computing job into a CE in a site, it knows that the job will run in that site and that the proxy will not need to be delegated into another site. Therefore, as the WMS knows that the network addresses of the site are in the address space e.g. 137.138.0.0/16, it can add the source restriction using iPAddress field in the GeneralNames structure and this address space. When a service is contacted and a proxy certificate chain containing this restriction is presented as credentials, the service can find this restriction and reject the authentication attempt if the connection was opened from any other network address than the one allowed in the restriction.
4
Proxy Tracing
The proxy delegation tracing extensions [4] are used to add to the proxies the information where they were delegated from and where they were delegated to. Currently, when a proxy is examined, there is no way to tell where the proxy
Proxy Restrictions for Grid Usage
53
has been and how it got here, except looking at all the delegation service logs and comparing the timestamps of the delegation logs. These extensions make the tracking of the delegations through the delegation chain a trivial task and thus help to resolve the problems. Each time there is a delegation, a new proxy is created. During this process, the source and target service unified resource identifiers (URIs) [1] are added to the proxy using these extensions. This way, whenever a proxy is looked at, the path it has traveled via delegation can be seen. This helps in debugging possible problems with credentials. It can also be used informally when investigating compromises. But it is trivial to fake long additional delegation chains that do not reflect the true delegation path and thus implicate services that have not been compromised. This can simply be done by generating new proxies locally and putting the URIs of those services as the source or target service. Thus, the extensions are not fully useful for auditing and compromise investigations as the information in them is not trustworthy. The tracing extensions can be used to reject certificates that have passed through a compromised site. When a proxy is delegated into a site, the new delegated proxy contains a trace that was put there by the previous service that delegated it, and the services in the site that receives it does not have means to remove it. Thus, assuming a situation where the previous service is not compromised and the site is, the proxy that passed through the site contains the trace that it has been in the site and can thus be rejected as possibly compromised. There is, though, the proxy renewal service [8], that can defeat this tracing unless the renewal service carefully re-applies the trace extensions to the new proxy. The OIDs for these extensions have been approved by the IGTF. The OIDs are iGTFProxyTracingIssuerName (1.2.840.113612.5.5.1.1.1.1) that determines where the delegation came from and the iGTFProxyTracingSubjectName (1.2.840.113612.5.5.1.1.1.2) that defines where the proxy was delegated to. The data structure of the extensions is the GeneralNames ASN.1 structure as used in e.g. RFC 5280 [2]. While iPAddress and dNSName can be used, the URI field is the most useful of the choices. As there can be several services running in a host, even in the same TCP port, each with their own delegation system, the URI can identify the exact service that performed the delegation and thus it is the only choice that provides the exact and correct information. For the user client to the service delegation step a special URI format has been proposed. The URI is formed by using the client program name as the scheme and, as normally for HTTP URLs, the machine DNS name as the authority. The URI can then still be further completed by using the user account as the path. But the user account must be omitted e.g. in case a pseudonymity service [6] is used. A delegation from a WMS client to a WMS service would add both the iGTFProxyTracingIssuerName and iGTFProxyTracingSubjectName extensions to the proxy being generated to identify the client and the service. The iGTFProxyTracingIssuerName and iGTFProxyTracingSubjectName could have the URIs, for example, of “WMS-client://clienthost/johndoe” and “WMS-server://wmshost:8443/examplewms” respectively.
54
´ Frohner J. Hahkala, J. White, and A.
To avoid the problem of the possibility to fake delegations locally a solution has been proposed. This solution would add a signature to this extension made by the host or service credentials. Adding this signature would prove that the proxy actually passed by that host. This way the tracing extension could be used for auditing and investigating compromises. But, so far the increase of proxy size caused by the additional signature and the performance hit caused by the signature calculations have been considered too big compared to the advantages brought by adding it. Using the signature algorithm and signature structure in RFC 5280 [2], the signature with 2048 bit RSA key and SHA1 would add 260 bytes without counting the additional data structures needed to include it, while the actual extension without signature is 27 bytes for the data structure overhead and OID in addition to the bytes the URI string takes to encode (39 bytes in the WMS client example above).
5
Discussion
In the current Grids the proxies are limited when they come to the WN. Thus, the proxies that get compromised in the WN, which is the most vulnerable place for the proxies, cannot be used to submit jobs. That means that the biggest possibility of launching a massive number of jobs, which use compromised credentials to attack some services, is already prevented. But if a CE or WMS is compromised, the bigger number of credentials in these services can be used to launch attacks limited only by the capacity of the system. Even if the proxies are limited as in the WN, they are still usable to access the SEs, information systems and other services and thus the potential for damage is great. The most valuable of these vulnerable services are the data management (DM) systems as the data in the infrastructure is the most valuable part. Thus, the implementation of these limitations should be started from the DM services giving the best return for the effort and biggest additional protection. The RFC 3820 [10] defines an proxyInfoExtension with the possibility to define own policy languages. But this cannot be used for this work for two reasons. First, many infrastructures like EGEE still use legacy proxies and do not support the RFC 3820 proxies yet. Thus they do not understand this extension and adding this extension, which has to be critical by definition, would make the proxies fail as they fail the critical extension checks. Second, defining a policy language and using it would mean that the default inherit-all and limited policies could not be used. But these policies are the only ones implemented and supported in the current systems. Thus, using a policy language would break the interoperability with existing systems. UNICORE [5] jobs, including input data, are signed by the client’s private key providing protection from changes in the job description by intermediate gateways and avoiding the delegation of a proxy certificate at all. In respect of protection against hijacked credentials UNICORE is clearly in advantage.
Proxy Restrictions for Grid Usage
55
The SAML-based authorization token system [9] preserves the pre-defined nature of the UNICORE system (the user has to pre-scribe which resources can be accessed and which actions can be invoked). However it already empowers the submitted job to act on the user’s behalf, for example to access remote files. The other end of the palette is the Globus style delegation model [3] with full identity delegation through proxies. It allows the usage of highly dynamic jobs, where for example the input and output data does not have to be submitted with the job but can be dynamically accessed on demand from any convenient location using the proxy credential. Our approach tries to find a middle-ground among these models by balancing security with the ease of usability. It provides protection against the most obvious credential hijacking problem of full delegation. Furthermore, these restrictions would be automatically introduced by the job submission and delegation chain, with no explicit action from the user, but still with almost the same level of protection that a SAML-based system can provide.
6
Conclusion
The extensions in this paper greatly reduce the damage caused by a compromised machine, a virtual machine or a site. The extensions also help in management and recovery from a compromise providing a way to avoid user and Virtual Organization (VO) banning and thus makes the Grid more resilient to interruptions. The tracing extension is useful for debugging and a way to make it valid for auditing and compromise investigation is described.
References 1. Berners-Lee, T., et al.: Uniform Resource Identifier (URI): Generic Syntax. IETF RFC (January 2005), http://www.ietf.org/rfc/rfc3986.txt 2. Cooper, D., et al.: RFC 5280 Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile, IETF RFC (May 2008), http://www.ietf.org/rfc/rfc5280.txt 3. Demchenko, Y., Mulmo, O., Gommans, L., de Laat, C., Wan, A.: Dynamic security context management in Grid-based application. Future Generation Computer Systems 24(5) (May 2008) 4. Groep, D.: OID for Proxy Delegation Tracing, International Grid Trust Federation OID registry (February 28, 2008), http://www.eugridpma.org/documentation/OIDProxyDelegationTracing.pdf 5. Goss-Walter, T., Letz, R., Kentemich, T., Hoppe, H.-C., Wieder, P.: An Analysis of the UNICORE Security Model, Open Grid Forum, Grid Final Document (July 18, 2003), http://www.ogf.org/documents/GFD.18.pdf 6. Hahkala, J., Mikkonen, H., Silander, M., White, J.: Requirements and Initial Design of a Grid Pseudonymity System. In: Proceedings of the 2008 High Performance Computing & Simulation Conference (HPCS 2008), Nicosia, Cyprus, June 3-6 (2008)
56
´ Frohner J. Hahkala, J. White, and A.
7. ITU-T: X.509 Information Technology - Open Systems Interconnection - The Directory: Public-key and attribute certificate frameworks (August 2005), http://www.itu.int/rec/T-REC-X.509-200508-I 8. Kouril, D., Basney, J.: A Credential Renewal Service for Long-Running Jobs. In: Proceedings of the 6th IEEE/ACM International Workshop on Grid Computing, November 13-14, 2005, pp. 63–68 (2005) 9. Snelling, D., van den Berge, S., Li, V.: Explicit Trust Delegation: Security for Dynamic Grids. Fujitsu Scientific & Technical Journal (FSTJ) - Special Issue on Grid Computing 40(2) (December 2004) 10. Tuecke, S., et al.: RFC 3820 Internet X.509 Public Key Infrastructure (PKI) Proxy Certificate Profile, IETF RFC (June 2004), http://www.ietf.org/rfc/rfc3820.txt