This is not a peer-reviewed article. Computers in Agriculture and Natural Resources, 4th World Congress Conference, Proceedings of the 24-26 July 2006 (Orlando, Florida USA) Publication Date 24 July 2006 ASABE Publication Number 701P0606. Eds. F. Zazueta, J. Kin, S. Ninomiya and G. Schiefer
A security architecture for sharing distributed biodiversity databases Marcelo Succi de Jesus Ferreira1, Pedro Luiz Pizzigatti Corrêa1, Antônio Mauro Saraiva1
1 Agricultural Automation Laboratory - Escola Politécnica – Universidade de São Paulo (USP) Av. Prof. Luciano Gualberto, trav.3, n.158 - sala C2-56 São Paulo-SP Brazil 05508-900 {marcelo.succi, pedro.correa, antonio.saraiva}@poli.usp.br
Abstract. The main initiative to integrate biodiversity databases worldwide is currently led by the Global Biodiversity Information Facility (GBIF). Its solution is based on DiGIR (Distributed Generic Information Retrieval), which builds on a set of protocols and standards such as HTTP, XML, and UDDI. Because biodiversity data has economic and scientific value, and the DiGIR approach does not answer questions like "who is using the data" or even "which data is being used", new database providers could feel inhibited from joining the GBIF system. This paper proposes a web services based architecture for shared biodiversity databases that provides security services such as authentication and confidentiality, and considers other functional requirements such as access control. The architecture is discussed and an implementation is suggested.
Keywords. Biodiversity Information Systems, Security Architecture, GBIF, web services
Introduction
The Global Biodiversity Information Facility (GBIF) is an independent organization whose mission is making the “world’s biodiversity data freely and universally available via the Internet” (GBIF, 2003). This objective is pursued through several programs, each with a specific goal, e.g., the digitization of biodiversity collections or the establishment of data standards for the exchange of biodiversity content. A data portal has been built, consolidating data offered by several biodiversity data providers. As of May 2006, about 100 million records from 174 different providers were available through this portal. Any Internet user, after accepting a Data Use Agreement (DUA), can retrieve these data using simple queries or, if necessary, more advanced searches.
Providers offer these data via DiGIR (DiGIR 2005), a set of protocols which allows several different databases to be integrated as one virtual searchable collection. DiGIR, however, does not by itself supply basic security services such as authentication. Thus, in a system of portal and providers connected via DiGIR, biodiversity data providers are not able to say who is using the data they collected and organized. They also have no way of knowing which data are being searched.
Biodiversity data clearly has commercial importance. GBIF (2003) illustrates this with the case of a pharmaceutical chemist who uses biodiversity data to find species that produce a “promising drug compound”. Its scientific value is also easily verifiable. Moreover, biodiversity data are often sensitive, e.g., the locality data of endangered species (Brook 2000), and should therefore be shared in a controlled manner. Nevertheless, providers can rely only on the DUA accepted by the portal visitor to protect their intellectual property rights and to guarantee the correct use of the retrieved data.
These reasons lead some countries and organizations to be reluctant to share their holdings, which could compromise GBIF’s objectives. This paper presents mechanisms that can provide security services to a web services based system, and suggests an architecture based on web services standards and protocols for sharing biodiversity databases while providing services such as authentication, confidentiality and access control.
4TH WORLD CONGRESS ON COMPUTERS IN AGRICULTURE
The rest of this paper is organized as follows. Section 2 presents the web services security model and standards. Section 3 reviews related work on securing distributed biodiversity databases, mainly in the GBIF framework. Section 4 presents the suggested architecture and a possible implementation, and Section 5 concludes the paper and outlines possible future work.
Web Services Security
International Telecommunication Union - Telecommunication Standardization Sector (ITU-T) Recommendation X.800 lists basic security services, among them (ITU-T 1991):
• Confidentiality, which guarantees that an exchanged message is not disclosed to non-authorized parties;
• Authentication, the confirmation that a party in an association is the one claimed;
• Integrity, the guarantee that data in a message has not been modified in any way;
• Non-repudiation, which prevents the sender of a message from denying having sent it;
• Authorization, which grants rights to access a specific resource.
The first four services are commonly implemented using a combination of public key cryptography techniques and other processes; authorization, in turn, usually needs an Access Control List that relates entities (a person, machine, device, etc.), resources and permissions.
Web Services are a set of protocols, including XML, SOAP, UDDI and WSDL (Curbera, 2002), that allow different applications to be integrated in a standardized manner. In recent years, web services have attracted much interest because of their flexibility. Security, nevertheless, has been an afterthought in Web Services development: only after the basic standards cited above matured did web services security receive attention. Some web services security approaches are based on XML extensions, such as:
• XML Signature – a standard to apply a digital signature operation to XML data. It assures data integrity and supports non-repudiation and authentication services. Unlike other digital signature processes, XML Signature allows either a whole document or only parts of a document to be signed;
• XML Encryption – a standard that offers mechanisms to encrypt parts of XML documents exchanged in a transaction, providing confidentiality;
• XACML – eXtensible Access Control Markup Language, a specification that describes a general policy language to define access to XML documents, based on the object to be accessed, the subject who is requesting access and the nature of the access action (read or write, for example).
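The three inputs on which an XACML decision is based (subject, object and action) can be illustrated with a small sketch. It is plain Python rather than XACML syntax, and the `Rule` class, the `decide` function, the deny-by-default behaviour and the sample role and resource names are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    subject_role: str  # who is requesting access (hypothetical role name)
    resource: str      # object to be accessed
    action: str        # nature of the access, e.g. "read" or "write"
    effect: str        # "Permit" or "Deny"


def decide(rules, subject_role, resource, action):
    """Return the effect of the first matching rule; deny if none matches."""
    for rule in rules:
        if (rule.subject_role, rule.resource, rule.action) == (
            subject_role, resource, action
        ):
            return rule.effect
    return "Deny"  # assumption: requests without a matching rule are refused


# Hypothetical policy for a biodiversity provider.
RULES = [
    Rule("researcher", "occurrence-records", "read", "Permit"),
    Rule("curator", "occurrence-records", "write", "Permit"),
]
```

With this policy, a researcher may read the occurrence records but a write request from the same role falls through to the default and is denied.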
Other initiatives are also present. WS-Security, a standard initially proposed by companies such as Microsoft and IBM, has been adopted by OASIS (Organization for the Advancement of Structured Information Standards), a not-for-profit consortium which drives the development of some web services standards. WS-Security 1.1 was approved as a formal OASIS standard in February 2006. It includes (OASIS 2006) support for the Security Assertion Markup Language (SAML), a framework for exchanging authentication and authorization information (“assertions”) between systems. The next two sections explain how some of these standards could help in implementing the security services cited above.
Related Works
Hatala, Eap, Shah (2006) describe a solution to allow “users from the trusted organization to access learning objects based in their attributes in their home organizations”, and consider XACML to control access policies and SAML to exchange authentication and authorization attributes between these organizations.
Tolksdorf, Suhrbier, Langer (2005) present a concept for adding security services to a biodiversity network system. In their study, security components, still to be developed, will be integrated into an architecture used in the BioCASE project (BioCASE 2006). BioCASE protocols are based on XML, so the XML security standards XACML, XML Signature and XML Encryption are considered. The XACML components Policy Enforcement Point (PEP) and Policy Decision Point (PDP) are used: requests are processed first by the PEP, and the decision to grant access or not is made by the PDP (Tolksdorf, Suhrbier, Langer 2005).
The work described in (Corrêa et al 2005) shows a system where a web portal is used to allow access to biodiversity databases kept by providers. Portal users are authenticated against an LDAP database which includes user profiles. According to the profile and the action requested, write or delete for example, access to a specific record in a biodiversity database is denied or allowed.
All these works offer a basis for building a security solution to share biodiversity databases using web services. The suggested architecture extends the studies in (Corrêa et al 2005) by including some elements of web services security, as shown in Tolksdorf, Suhrbier, Langer (2005), seeking to address the difficulties that providers face in a biodiversity network system, as described in the Introduction.
As in the work presented in Hatala, Eap, Shah (2006), the proposed solution allows access control based on attributes of the user requesting access in a system interconnected by web services; however, the suggested architecture covers some specific needs of the biodiversity domain, for instance access control with the desired granularity and access logging by the biodiversity data provider.
Proposed Architecture
In the suggested structure (figure 1), users interact with the system through a portal. Biodiversity data is offered by providers, each of which has the following components:
- coordinator: forwards external requests to the other provider components. It also sends the results of internal processing to the portal or to other providers;
- authenticator: provides the authentication service for system users registered locally;
- access control: verifies user credentials and grants or denies the execution of a transaction;
- transaction manager: executes the transaction in the biodiversity database;
- logger: logs data related to requests to, and results from, the provider.
There must be a provider with all its components for each separate biodiversity database. Access to the biodiversity database is always done via the coordinator component.
The model used to control access is RBAC (Role Based Access Control), more specifically RBAC0, described in (Sandhu et al. 1996). In this model, the basic elements are “users (U), roles (R), permissions (P) and sessions (S)”, as follows:
- user: a system user;
- role or profile: a classification assigned to a user which represents the authority and responsibility of the user associated with this role;
- permission: the way in which system objects or resources can be accessed;
- session: the time period in which the system is being used continuously by a given user.
A user can have more than one role, and a role can have more than one permission. It is the user who effectively exercises a permission; roles are just a way to ease access control administration. A formal definition of the relationship between users, roles, permissions and sessions is given in (Sandhu et al. 1996):
"PA: permission assignment, a many-to-many permission-to-role assignment relation, PA ⊆ P × R;
UA: user assignment, a many-to-many user-to-role assignment relation, UA ⊆ U × R;
user: S → U, a function mapping each session s_i to the single user user(s_i) (constant for the session's lifetime)"
A function that maps a session to a set of roles is also defined, roles: S → 2^R. In the suggested structure, PA is handled by the Authorization (access control) component and its local Access Control List (ACL). Likewise, UA is handled by the Authentication component and the corresponding authentication database.
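The RBAC0 relations can be made concrete with a short sketch. The sets below are hypothetical sample data, and the `Session` class is an illustrative name rather than part of the proposed architecture. The check that a session activates only roles assigned to its user follows the RBAC0 definition in Sandhu et al. (1996), which is not quoted in the text above.

```python
# Hypothetical sample data; all names are illustrative only.
U = {"ana", "bruno"}
R = {"student", "researcher", "curator"}
P = {"read_public", "read_locality", "update_records"}

# PA is a subset of P x R: many-to-many permission-to-role assignment.
PA = {
    ("read_public", "student"),
    ("read_public", "researcher"),
    ("read_locality", "researcher"),
    ("update_records", "curator"),
}

# UA is a subset of U x R: many-to-many user-to-role assignment.
UA = {("ana", "researcher"), ("ana", "curator"), ("bruno", "student")}


class Session:
    """One session: a single user plus the subset of roles it activates."""

    def __init__(self, user, active_roles):
        assigned = {r for (u, r) in UA if u == user}
        if not set(active_roles) <= assigned:
            raise ValueError("a session may only activate roles assigned to its user")
        self.user = user                      # user(s): fixed for the session
        self.roles = frozenset(active_roles)  # roles(s): an element of 2^R

    def permissions(self):
        """Permissions reachable through the roles active in this session."""
        return {p for (p, r) in PA if r in self.roles}
```

For instance, `Session("ana", {"researcher"}).permissions()` yields the two read permissions, while `Session("bruno", {"curator"})` is rejected because that role is not assigned to the user in UA.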
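[Figure 1: the Portal connected to a PROVIDER, which contains the Coordinator, Authentication (with the Local Authentication Database), Authorization (with the Local ACL), DBMS (with the Local Biodiversity Database) and Logger (with the Log Database).]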
Fig. 1 - Biodiversity databases sharing architecture
A portal user can be registered with one provider while wishing to access content located at a different provider, as shown in figure 2. Authentication information, e.g. username and password, is passed within a SOAP (Curbera 2002) message to provider A, where the user is registered (1). Once the user is positively authenticated, security assertions are sent to provider B, which holds the requested biodiversity data, (2) and (3), using SAML and SOAP. The authorization credentials received by provider B are compared to the local PA matrix. If they are correct, access to the local biodiversity database is granted and the requested data is transmitted. Provider B can log who requested the data and what data was requested, besides finely controlling how data is made available, because the PA matrix is locally administered. Thus, sensitive data, at least from the provider's point of view, is delivered only to people with the right attributes. For example, a registered researcher could have much more data available than a regular student; a database curator could update all the data, while a researcher could modify only data obtained by him or his group.
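The exchange between the two providers can be sketched as follows. This is a simplified simulation, not the paper's implementation: a JSON body with an HMAC tag stands in for a SAML assertion carried over SOAP, and the shared key, class names, user record and data values are all hypothetical.

```python
import hashlib
import hmac
import json

# Assumption: providers A and B have agreed on a key out of band.
SHARED_KEY = b"demo-key-shared-by-A-and-B"


class ProviderA:
    """Provider where the user is registered (authentication side)."""

    def __init__(self):
        # Demo only: a real system would not store plain-text passwords.
        self._users = {"maria": {"password": "s3cret", "roles": ["researcher"]}}

    def authenticate(self, username, password):
        record = self._users.get(username)
        if record is None or not hmac.compare_digest(
            record["password"].encode(), password.encode()
        ):
            return None
        body = json.dumps({"user": username, "roles": record["roles"]}, sort_keys=True)
        tag = hmac.new(SHARED_KEY, body.encode(), hashlib.sha256).hexdigest()
        return {"body": body, "tag": tag}  # stand-in for a SAML assertion


class ProviderB:
    """Provider holding the requested data (authorization and logging side)."""

    def __init__(self):
        self._pa = {"researcher": {"read_locality"}, "student": {"read_public"}}
        self._data = {"read_locality": "locality records", "read_public": "species list"}
        self.log = []

    def handle(self, assertion, permission):
        expected = hmac.new(
            SHARED_KEY, assertion["body"].encode(), hashlib.sha256
        ).hexdigest()
        if not hmac.compare_digest(expected, assertion["tag"]):
            self.log.append(("unknown", permission, "rejected"))
            return None
        claims = json.loads(assertion["body"])
        granted = any(permission in self._pa.get(role, set()) for role in claims["roles"])
        self.log.append((claims["user"], permission, "granted" if granted else "denied"))
        return self._data.get(permission) if granted else None
```

Because the permission table and the log both live in `ProviderB`, the data owner decides what each role may see and keeps its own record of who asked for what, which is the point of the design described above.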
[Figure 2: the Portal and three PROVIDER instances, two of them labeled A and B, each with the components of figure 1 (Coordinator, Authentication, Authorization, DBMS, Logger and their local databases); labels (1) to (4) mark the messages of the transaction.]
Fig. 2 – A three-part transaction.
XML Encryption is used, if necessary, to avoid revealing valuable data to unauthorized parties. Message 4 (figure 2) can have selected parts encrypted, so that only the portal can use these data. Similarly, XML Signature might be used when sending biodiversity data to the portal, so that integrity and source authenticity are assured to the portal user. XACML would be necessary only for non-local access rules to biodiversity data. If the rules are local, the permissions associated with roles can be handled internally by the biodiversity provider.
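The idea of protecting only selected parts of a message can be sketched with the Python standard library. This is not XML Signature: a keyed HMAC over the canonical form of one element stands in for a digital signature, so it shows integrity of the chosen part but gives no non-repudiation. The key, the function names and the sample record are assumptions.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

KEY = b"demo-key"  # assumption: key shared by provider and portal


def sign_part(root, tag):
    """Return an HMAC-SHA256 tag over the canonical form of the first <tag> child."""
    part = root.find(tag)
    canonical = ET.canonicalize(ET.tostring(part, encoding="unicode"))
    return hmac.new(KEY, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_part(root, tag, signature):
    """Recompute the tag for <tag> and compare it in constant time."""
    return hmac.compare_digest(sign_part(root, tag), signature)


# Hypothetical record; only the sensitive <locality> element is protected.
record = ET.fromstring(
    "<record><species>example species</species>"
    "<locality>example locality</locality></record>"
)
locality_signature = sign_part(record, "locality")
```

Changing the `<species>` element leaves `locality_signature` valid, while any change to `<locality>` makes verification fail, mirroring the partial coverage that the XML standards allow.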
Conclusion and Future Works
This paper presents a security architecture for sharing biodiversity databases which considers requirements from the point of view of a biodiversity database provider. The provider can record which data is being retrieved and who is retrieving it. Additionally, it is up to the provider how its data is shared, which could make it easier to join networks like GBIF’s. After reviewing some web services security initiatives, we have suggested an architecture that uses these protocols and frameworks. SAML is used to send identity and role information from one provider to another. XML Encryption and XML Signature can be employed, respectively, to avoid unwanted data disclosure and to assure data integrity and source authenticity. While focusing on biodiversity database sharing, the presented structure is flexible enough to be adapted to other uses. A use of this architecture in a system that has weblabs (web laboratories) and a portal structure to allow remote experiments is foreseen.
References
BioCASE. 2006. The Biological Collection Access Service for Europe. Available at: http://www.biocase.org. Accessed 11 May 2006.
Brook, M. de L. 2000. Why museums matter. Trends in Ecology & Evolution, Volume 15, Issue 4: 136-137.
Corrêa, P. L. P. et al. 2005. A Service Oriented Information System to Manage Bromelia Distributed Database. Proceedings of EFITA/WCCA 2005, Vila Real, Portugal.
Curbera, F. et al. 2002. Unraveling the Web services web: an introduction to SOAP, WSDL, and UDDI. IEEE Internet Computing, Volume 6, Issue 2: 86-93.
DiGIR. 2005. Distributed Generic Information Retrieval (DiGIR). Available at: digir.sourceforge.net. Accessed 05 April 2006.
GBIF. 2003. Global Biodiversity Information Facility Strategic Plan. Available at: www.gbif.org/GBIF_org/GBIF_Documents/strategic_plan. Accessed 03 April 2006.
Hatala, M., Eap, T.M., Shah, A. 2006. Unlocking Repositories: Federated Security Solution for Attribute and Policy based Access to Repositories via Web Services. Proceedings of the First International Conference on Availability, Reliability and Security (ARES06).
ITU-T. 1991. Security Architecture for Open Systems Interconnection for CCITT applications. Recommendation X.800. Geneva, 1991.
OASIS. 2006. OASIS Web Services Security. Available at: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wss. Accessed 11 May 2006.
Sandhu, R.S. et al. 1996. Role-based access control models. Computer, v. 29, n. 2, Feb. 1996, p. 38-47.
Tolksdorf, R., Suhrbier, L., Langer, E. 2005. Designing XML Security Services for Biodiversity Network. DACH Security 2005. Available at: http://www.ag-nbi.de/research/GBIF-D/downloads/publications/dach2005synthesys.pdf. Accessed 11 May 2006.