Sharing @ The Edge: Secure Information Sharing

Tony White¹, Dwight Deugo¹, Steve Gutz²
¹ School of Computer Science, Carleton University
² Texar Corporation
{[email protected], [email protected], [email protected]}

Abstract

Peer-to-peer (P2P) computing promises to return the World Wide Web to an environment where the edge of the network is as important as the Web servers that have dominated its early years. This paper describes issues associated with P2P computing that must be solved before widespread deployment is possible. The paper presents a P2P solution where services and content are equally secured and can be dynamically added to and removed from the network.

Keywords: peer-to-peer, security, information sharing, internet computing

1. Introduction

This paper concerns itself with the issues surrounding the creation of P2P “Sharing @ The Edge”, in which information and services are shared directly from desktop machines and services are created and reconfigured dynamically. It describes concepts that facilitate the creation of communities and explains how the information and users within these communities are protected and secured. The motivations for this are clear: current sharing mechanisms slow down business. Users often copy content in order to avoid security processes, or work with a local copy of information that is frequently out of date. Collaborative commerce, or c-commerce, is difficult in the current sharing model, as the creation, maintenance and teardown of business-to-business (B2B) relationships is mediated by a third party. The Internet is also used for transport, thereby complicating security.

This paper continues with a discussion of the issues and requirements that motivate our architecture and solution. Key concepts are then introduced, followed by a description of our architecture.

2. Issues and Requirements

P2P computing, and the creation of information networks, presents a number of serious challenges. These challenges relate to interoperability, discovery or search, identity and security. In order for information networks to flourish, standard communication protocols are required. Currently, many proprietary protocols exist, but the P2P working group is working on standards. Sun’s JXTA [1] project represents a community-based approach to standardization that is based upon HTTP and XML technologies. However, it lacks key authorization concepts that are addressed in this paper. The following paragraphs highlight several key issues.

Anonymity is unacceptable. After all, would I give access to my bank account to a stranger? In effect, with Gnutella file sharing this is what I do, as I do not have fine-grained control over what is shared, and with whom. When participating in an information network a user should be represented by a digital identity, which can be authenticated in a variety of standard ways, both weak and strong. It should be possible to associate a level of trust with an individual, as is the case in many social interactions. This level of trust, in turn, should be used to determine what is shared and with whom. The Poblano [2] trust framework represents a promising approach here.

For information to be shared, it must be possible to tell who has asked for it and with whom it has been shared, and to control distribution and access beyond the delivery of the information. Information, and the intellectual property that it represents, requires protection.

Information exchange should be secure. It should be possible to detect tampering with information that has been transmitted between peers. It should not be possible to intercept information in transit or modify it. Connections between peers should be encrypted.

Information networks should support policy-based authorization. Information represents the intellectual property of an organization and has considerable value. Policies associated with information allow for fine-grained control of information sharing. Policies should go beyond simple access controls and be capable of monitoring, controlling and reacting to information access. This implies that policies need to have an executable specification, much like a programming language. Security should be separate from the resource being secured: authorization services should be present in the network.

Information networks should facilitate more effective search for interesting content. Information sharing is only effective if content of interest can be found. There are two challenges to resolve here. First, a protocol for routing search queries effectively throughout the network is required. Second, finding information in networks of peers containing large stores of documents requires that a semantic layer be provided for the data itself. Metadata should be easy to manipulate by a search engine.

Information networks should be easy to use and set up. The widespread use of web browsers compared to usage of Gopher and FTP clearly demonstrates the desire for ease of use. In other words, P2P networks should self-organize and deal with all of the issues surrounding information routing. It should not be necessary for a user to understand low-level communications concepts in order to obtain value from P2P networks.

Information networks should scale. Today’s Gnutella networks represent a flat structure that is exhaustively searched for interesting content when a user generates a query. Information networks need to support communities in order that queries can be restricted to relevant portions of the network. Communities should be easy to manage, and remain under the control of the creator or individuals to whom management has been delegated.

We believe that the concepts and architecture described in the next two sections meet these requirements.

[Figure: example identity document; XML markup lost in extraction. Surviving text: 98765 Chief Weenie Tony White [email protected] 123456A99CEBF19930809B101995 IPAddress 192.168.0.1]

3. Key Concepts

3.1 Identity

The challenges presented by the requirements section are considerable. Key to the security-related issues of the previous section is the need for identity. Identity is typically implemented by use of authentication domains [3]. However, this presents serious difficulties when enterprise-to-enterprise sharing is involved, as authentication domains typically do not cross enterprise boundaries. It is possible to use Certificate Authorities (CA), using X.509 [4] certificates and relying upon the CA to authenticate a user. Unfortunately, not all enterprises provide certificates for their employees, and even in certificate-enabled enterprises obtaining a certificate can take some time. We have provided support for certificate-based authentication within our system; in fact, the authentication subsystem relies upon the Java Authentication and Authorization Service (JAAS).

When a user installs the software, an identity is created that contains a public and private key pair. A user’s identity is protected by a pass phrase chosen by the user. The pass phrase is used, along with key information, to encrypt the identity stored on disk. The metadata associated with an identity allows further information about the user to be stored; e.g. preferences, location, IP address and phone information. This information may optionally be used by an authentication system and is also used for index creation by a repository service.

While no central authority is implied by this definition, trust in the individual claiming the identity can be generated by an out-of-band interaction. In our framework, the exported identity can be sent via e-mail to someone with whom collaboration is to occur. Once received, the identity can be checked to verify that it contains the e-mail address of the e-mail sender. The identity may then be incorporated by double-clicking on the e-mail attachment.

While this provides for a simple verification of identity, it says nothing about the trust that we should associate with that identity. It is possible for the user to poll a community to determine the trust level associated with the new identity. Other community members provide responses according to linguistic values. The responses are averaged and the user requesting the character reference can accept the community’s highest, lowest or average rating. Naturally, they may also select a rating of their own choosing. A community member may notify other community members of changes in ratings. Finally, the ID element within the identity allows a policy to be associated with the user, the purpose being explained in the section on policies.
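The trust poll described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' implementation: the linguistic scale (`TRUST_SCALE`), the function names and the rounding of the average to the nearest label are all assumptions, since the paper does not list the actual linguistic values or the averaging rule.

```python
from statistics import mean

# Hypothetical linguistic scale; the paper does not enumerate the real values.
TRUST_SCALE = {"none": 0, "low": 1, "medium": 2, "high": 3, "complete": 4}
LABELS = {score: label for label, score in TRUST_SCALE.items()}


def summarize_poll(responses):
    """Return the lowest, highest and average rating given by a community."""
    if not responses:
        raise ValueError("no community member responded to the poll")
    scores = [TRUST_SCALE[response] for response in responses]
    # Assumption: the average is mapped back to the nearest linguistic label.
    average = round(mean(scores))
    return {
        "lowest": LABELS[min(scores)],
        "highest": LABELS[max(scores)],
        "average": LABELS[average],
    }


def choose_rating(responses, strategy="average", override=None):
    """Accept one of the community's ratings, or a rating of the user's own."""
    if override is not None:
        if override not in TRUST_SCALE:
            raise ValueError(f"unknown linguistic value: {override}")
        return override
    return summarize_poll(responses)[strategy]


poll = ["low", "high", "high", "medium"]
print(summarize_poll(poll))
print(choose_rating(poll, strategy="lowest"))
print(choose_rating(poll, override="complete"))
```

Keeping the override separate from the poll summary mirrors the text: the community only supplies a character reference, and the final rating remains the requesting user's decision.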

3.2 Community

Communities are the fundamental building block for information organization and sharing. Communities define the boundaries for authentication and authorization and are designed to bring together individuals with similar interests or the need to collaborate. Communities do not define resource access beyond the basic services available. Information sharing, and the sharing of community member services, rely upon the privilege settings of individual users. A community also provides a structure that allows trust between its members to be developed, with no single member being completely responsible for the assessment of trust of other members. This is particularly important, in that allowing a single member to dominate the trust associated with a member leaves the system open to a “denial of trust” attack.

[Figure: example community document; XML markup lost in extraction. Surviving text: Bob’s P2P Community 234567 Bob White 12:00:00 GMT 11-Nov-2001 12:00:00 GMT 11-Nov-2001 A Sharing information on P2P GLOBAL 197EB2467 Membership 197EB2467 Authorization 197EB2467 NoteTaker IPAddress 192.168.0.1]

When a community is created, the creator associates a description and metadata with the community. From a security perspective, the key elements of the community structure are the membership service and authorization service elements. These two elements contain definitions for the advertisements of the membership and authorization services used by the community. Prospective and existing members who wish to modify their membership status use the membership service. It should be noted that the community document does not contain explicit membership or privilege information; this is stored elsewhere and manipulated by the aforementioned services.

A special comment should be made regarding the management of the community itself. Authorization of community changes is performed by the authorization service. This ensures that security is coherently enforced for the community and the services that it supports. Another way of stating this is that, by becoming a member of a community, a member agrees to abide by the access rules defined for that community. This does not mean that the member loses control over their shared information or services, but it does mean that he or she agrees to the policies for sharing operating within the community. This is clearly an attractive feature for an enterprise where information technology groups need to define policies for sharing, but cannot reasonably expect to “own” user-defined services or information. Separation of policy and access control is clearly an important consideration for the implementation of P2P communities.

The ID element within the document allows a policy to be associated with the community, the purpose being explained in the section on policies. The description element of the community document is analyzed and indexed in order to facilitate the location of communities of interest. Finally, a further element of the community document allows for hierarchical structuring of communities. If it is defined, the membership and authorization service specifications need not be provided.
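One way to read the hierarchical structuring described above is that a community which omits its membership or authorization service falls back to the one declared by its parent. The Python sketch below illustrates that reading only; the paper does not state the lookup rule, and the `Community` class, its field names and the `resolve_service` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Community:
    """Hypothetical in-memory view of a community document."""
    name: str
    parent: Optional["Community"] = None
    membership_service: Optional[str] = None
    authorization_service: Optional[str] = None


def resolve_service(community: Community, kind: str) -> str:
    """Find the 'membership' or 'authorization' service for a community.

    Assumption: a community without its own service specification uses
    the nearest ancestor that provides one.
    """
    attribute = f"{kind}_service"
    visited = set()
    node: Optional[Community] = community
    while node is not None:
        if id(node) in visited:
            raise ValueError("community hierarchy contains a cycle")
        visited.add(id(node))
        service = getattr(node, attribute)
        if service is not None:
            return service
        node = node.parent
    raise LookupError(f"no {kind} service defined for {community.name}")


root = Community("GLOBAL", membership_service="Membership",
                 authorization_service="Authorization")
child = Community("Bob's P2P Community", parent=root)
print(resolve_service(child, "membership"))
```

Note that, as in the paper, the community object carries no membership or privilege data itself; it only names the services that manage them.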

3.3 Encryption and XML

XML documents wrapped using the HTTPS protocol are the primary mechanism for message exchange, although inter-peer messaging is possible using serialized Java objects or raw byte streams. Messages sent from one peer to another are digitally signed using the identity of the sending peer. Community and individual keys may also be used to encrypt the body of messages. XML-based messaging has become the norm for P2P systems, with Groove, Jabber and JXTA using it. The emergence of standards based upon XML, such as XML-RPC and SOAP, as message-passing frameworks has strengthened its appeal for messaging. Standards such as XKMS [5], SAML [6] and ebXML [7] are specifically designed to support global marketplace initiatives.
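The tamper-detection property that signing provides can be illustrated with a short Python sketch. Note the substitution: the paper signs messages with the sender's identity (a public/private key pair), whereas this example uses an HMAC-SHA256 tag computed with a shared key, which is available in the standard library and plays the role of a community key here. The function names and the sample message are assumptions.

```python
import hashlib
import hmac


def sign_message(body: str, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the XML message body."""
    return hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_message(body: str, tag: str, key: bytes) -> bool:
    """Return True only if the body is unchanged since it was tagged."""
    expected = sign_message(body, key)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, tag)


community_key = b"hypothetical shared community key"
message = "<message><to>Bob</to><text>notes attached</text></message>"
tag = sign_message(message, community_key)

print(verify_message(message, tag, community_key))
print(verify_message(message.replace("Bob", "Eve"), tag, community_key))
```

A shared-key tag proves only that some holder of the key produced the message; attributing a message to one specific peer, as the paper requires, needs a signature made with that peer's private key.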

3.4 Services

The modeling of services is crucial to the effective implementation of a P2P infrastructure. Dynamic creation and ease of discovery are fundamentally important in our framework. Services are specified in XML documents, which are stored in a repository. Peers invoke the services of other peers, with authorization for service usage being controlled by the peer offering that service. Search is an example of a service, as are Membership and Authorization introduced earlier.

The guiding principle behind a service is that it should be possible to create services dynamically and have the output of one component of a service fed into the next component or service in the chain. Services and components receive two inputs: a structured document from the output of the previous component or service, and an environment in which execution occurs. This is similar to JXTA, where pipes provide data flow between peers. Services can be built up from other services and components. Components have a MIME type that is used to determine the environment in which the component executes. Execution environments are also modeled as services. For example, a component created in Java is a simple Java class that conforms to the Component interface. It requires a JVM, whereas an executable file is instantiated by forking a process.
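The chaining principle above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the framework's Component interface (which the paper describes as a Java interface without giving its signature): here a component is any function that takes a document and an environment and returns a new document, and `make_service`, `keep_matching_lines` and `to_upper` are hypothetical names.

```python
from typing import Callable, Dict, Sequence

# A component receives the previous output and the shared environment.
Component = Callable[[str, Dict[str, str]], str]


def make_service(components: Sequence[Component]) -> Component:
    """Compose components into a service that is itself a component."""
    def service(document: str, environment: Dict[str, str]) -> str:
        for component in components:
            # The output of each step becomes the input of the next.
            document = component(document, environment)
        return document
    return service


def keep_matching_lines(document: str, environment: Dict[str, str]) -> str:
    """Keep only lines containing the keyword found in the environment."""
    keyword = environment.get("keyword", "")
    return "\n".join(line for line in document.splitlines() if keyword in line)


def to_upper(document: str, environment: Dict[str, str]) -> str:
    return document.upper()


# Services can be built from other services as well as from components.
filter_service = make_service([keep_matching_lines])
shouting_filter = make_service([filter_service, to_upper])

print(shouting_filter("p2p rocks\nhello\nmore p2p", {"keyword": "p2p"}))
```

Because a composed service has the same shape as a single component, nesting requires no special case, which is the property that makes dynamic service creation straightforward.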

[Figure: example service document; XML markup lost in extraction. Surviving text: Filter 123456ABC Bob 12:00:00 GMT 11-Nov-2001 12:00:00 GMT 11-Nov-2002 This service is used as an example example file://ex-exe Another example Source http://www.example.com]

The metadata associated with a service defines an environment in which the service always runs. This environment is available to all components or services that run as part of the service execution. In the example above, the service always has a source of information to access. The expiry date information essentially defines a lease for the service. In this way, periodic updates for the service can be obtained from a known and trusted repository or through searching the P2P network. The creator of the service is a mandatory element, as it is used to determine the level of trust in the service. The definition section is made up of the set of components and services that are to be executed for the service to run. MIME types have a registry that defines an association between the type and the execution environment required to process the component. The output of the final component in a service is returned to the requestor. Services defined in this way are easy to find, as they are indexed in the same way as documents, and the component representation allows for flexible combination and service definition.

Given the references to XML and HTML processing, it will not come as a surprise to the reader that we have implemented services for screen scraping. We have also implemented services to interface to various instant messaging services and the Gnutella file sharing network, and have wrapped the FastTrack™ protocol to access media distribution networks that support it. The XML representation of a service facilitates transfer of services from one peer to another, and service creation is a straightforward matter of composing a new document. The ID element within the document allows a policy to be associated with the service, the purpose being explained in the next section.

3.5 Policies

While communities define a boundary, this is insufficient to define the security associated with services or information available within the community. Communities represent coarse-grained access control. Users must retain control over their own services and information. Association of a policy with a service or information resource provides this fine-grained capability. However, the community may define the policies available to the user; in essence, the community defines the rules for sharing. In our system, all service or information access is considered abstractly as answering the question, “Can X perform action A on Y?” Typically, X is a user, A is an operation such as “read” and Y is a file or a service. Referring to the service definition shown in the previous section, another example might ask the question, “Can Alice execute the Filter service?” Clearly, the question may be answered by appealing to a policy associated with the user or the resource.

[Figure: example AuthorizationEntity; XML markup lost in extraction. Surviving text: 123456 Name of policy]

The framework allows for the association of policies with information or services and defines the AuthorizationEntity concept. AuthorizationEntities are objects that have an associated authorization policy. AuthorizationEntities provide a mapping from a service or information resource, represented by its identity (ID), to the name of a policy. Policies are evaluated by a policy service, the specification of which follows the definition provided in the section on services. Policies may be associated with groups of services or information resources; a discussion of mechanisms for achieving this is beyond the scope of this paper. Policies are implemented in a language called Idyllic [8], which is based upon the functional programming paradigm. Policies written in Idyllic can react to the access attempt; for example, they can compose and send an email or page a security administrator. Policies can make use of information stored in corporate databases or LDAP directories in order to enforce the business policies for a particular enterprise. Simple Role Based Access Controls (RBAC) are also possible.
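The question “Can X perform action A on Y?” and the mapping from an entity ID to a named policy can be sketched as below. This is a Python stand-in, not Idyllic, whose syntax the paper does not show. The `PolicyService` class, the `role_policy` helper and the policy name `filter-users` are hypothetical, and appending to a list stands in for the reaction (e-mail or page) that a real policy might trigger. Denying access to entities with no associated policy is also an assumption.

```python
from typing import Callable, Dict, List, Set, Tuple

# A policy answers: may `subject` perform `action` on entity `entity_id`?
Policy = Callable[[str, str, str], bool]


def role_policy(grants: Dict[str, Set[str]]) -> Policy:
    """Build a simple access-control policy from subject -> allowed actions."""
    def policy(subject: str, action: str, entity_id: str) -> bool:
        return action in grants.get(subject, set())
    return policy


class PolicyService:
    def __init__(self) -> None:
        self.policies: Dict[str, Policy] = {}
        self.entities: Dict[str, str] = {}  # entity ID -> policy name
        self.notifications: List[Tuple[str, str, str]] = []

    def register_policy(self, name: str, policy: Policy) -> None:
        self.policies[name] = policy

    def associate(self, entity_id: str, policy_name: str) -> None:
        if policy_name not in self.policies:
            raise KeyError(policy_name)
        self.entities[entity_id] = policy_name

    def can(self, subject: str, action: str, entity_id: str) -> bool:
        policy_name = self.entities.get(entity_id)
        # Assumption: an entity without a policy is not accessible.
        allowed = (policy_name is not None
                   and self.policies[policy_name](subject, action, entity_id))
        if not allowed:
            # Stand-in for reacting to the attempt (e.g. alerting an admin).
            self.notifications.append((subject, action, entity_id))
        return allowed


service = PolicyService()
service.register_policy("filter-users", role_policy({"Alice": {"execute"}}))
service.associate("123456ABC", "filter-users")  # ID of the Filter service
print(service.can("Alice", "execute", "123456ABC"))
print(service.can("Mallory", "execute", "123456ABC"))
```

Because the resource only carries an ID and the policy lives in the policy service, the rules can be changed without touching the service or document being protected.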

4. Architecture

Using the ideas of the previous section, we have implemented a P2P solution using a broker-based architecture. A servlet runner executes on the user’s platform (we use Tomcat) and a user interface is provided via a browser. The clean separation of presentation and application logic means that a user can access their peer remotely; e.g. from PDAs, remote PCs and browser-enabled phones. Communication between the servlet runner and the broker is through the use of messaging middleware; in this case the Java Message Service (JMS). Encryption of the message stream between the peer and the broker is supported; we currently use 1024-bit encryption. Integration with the JXTA protocols has also been prototyped.

The broker, which sits within the enterprise DMZ, performs a number of security-related functions, which include the validation of the IP address of the communicating peer and the authentication of the peer user. Because authentication is based on the JAAS standard, the self-registration mechanism currently implemented can easily be replaced by plugging in an alternative compliant implementation. The broker and peer mutually authenticate upon session initialization.

[Figure: system architecture. Labels surviving extraction: Broker; Tomcat Servlet Runner; Browser; Tomcat Servlet Runner; Browser]
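The broker's gatekeeping role described above can be sketched as follows: a peer is admitted only if its IP address is acceptable and its user has authenticated, after which the broker can forward messages to a single peer or to every member of a community. This Python sketch is an assumption-laden illustration: the real system uses JMS and JAAS, whereas here an in-memory inbox stands in for the message queue, a boolean stands in for the authentication result, and the `Broker` class and its method names are hypothetical.

```python
import ipaddress
from typing import Dict, List, Set, Tuple


class Broker:
    def __init__(self, allowed_network: str) -> None:
        # Assumption: IP validation means membership of a configured network.
        self.allowed = ipaddress.ip_network(allowed_network)
        self.inboxes: Dict[str, List[Tuple[str, str]]] = {}
        self.communities: Dict[str, Set[str]] = {}

    def connect(self, peer_id: str, ip: str, authenticated: bool) -> None:
        """Admit a peer after checking its address and authentication."""
        if ipaddress.ip_address(ip) not in self.allowed:
            raise PermissionError(f"address {ip} is not permitted")
        if not authenticated:
            raise PermissionError(f"peer {peer_id} failed authentication")
        self.inboxes[peer_id] = []

    def join(self, community: str, peer_id: str) -> None:
        if peer_id not in self.inboxes:
            raise PermissionError(f"peer {peer_id} has no session")
        self.communities.setdefault(community, set()).add(peer_id)

    def send(self, sender: str, recipient: str, message: str) -> None:
        """Route a message to one connected peer."""
        if sender not in self.inboxes or recipient not in self.inboxes:
            raise PermissionError("both peers must have a session")
        self.inboxes[recipient].append((sender, message))

    def multicast(self, sender: str, community: str, message: str) -> int:
        """Route a message to every other member; return the delivery count."""
        members = self.communities.get(community, set())
        if sender not in members:
            raise PermissionError(f"{sender} is not a member of {community}")
        recipients = sorted(members - {sender})
        for recipient in recipients:
            self.inboxes[recipient].append((sender, message))
        return len(recipients)


broker = Broker("192.168.0.0/24")
broker.connect("alice", "192.168.0.1", authenticated=True)
broker.connect("bob", "192.168.0.7", authenticated=True)
broker.join("p2p", "alice")
broker.join("p2p", "bob")
print(broker.multicast("alice", "p2p", "search: trust models"))
```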

The primary service of the broker is to route messages to the appropriate peer or peers, which then run the service requested by the originating peer. Multicast messages are possible; for example, the entire community is addressed when a search is performed. Content returned from running the service on the peer is routed back to the requesting peer, where the skin and rendering services generate HTML for presentation in the browser.

Several system services are implemented on every peer. These services provide timing, security, event management, file access, indexing, plug-in management and rendering. Security services allow the peer user to associate access controls and sophisticated policies, at a fine-grained level, with communities and the services and information that they contain. Event management services interpret notifications from other peers, although strict notification is not supported. The file access service isolates the peer from the details of a file system. The plug-in service is used to provide an execution environment for services composed of scripts, using a sandbox model to enforce a high degree of security. The rendering service is used in conjunction with the skin service to create HTML (or potentially XML) representations of the data received from other peers. The indexing service provides the ability for content such as notes and files to be made searchable via the distributed search service.

Several application services that focus on collaboration have been implemented. These are: distributed search, file sharing, notes, outliner and the whiteboard service. Communities are created by peer users, and application services are associated with them. For example, a peer user may choose to have multiple file sharing services within a community, each one sharing a different directory. Members may be given different roles within the community, potentially being able to read, write, or delete content.
Management may also be delegated, an administrator role being assignable. Access rights are inherited; that is, if the user is assigned a reviewer role within the community, he or she will adopt that role for each service assigned to the community. Particular users may become friends, and be given a default role that becomes active when they are added to a community created by the owner of the friend list. The assignment of default roles is also extended to services and content, inheritance being used to determine a role if one is not explicitly assigned.

[Figure: peer user interface. Labels surviving extraction: Community service and membership management; Communities: Local and Remote; Community Application Services]

Being browser-based, the interface is available to a peer user on a wide range of platforms; in fact, the peer can reside on one machine while the user accesses it from another. We have demonstrated the utility of this separation by accessing the peer from a PDA and plan to implement an interface for next-generation cell phones in the coming months. The figure above shows an example of the peer interface.
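The role inheritance rules described in this section can be sketched as a small lookup. This Python example is illustrative only: the `RoleResolver` class, its method names and the precedence order (an explicit service role first, then the community role, with a friend's default role applied when the friend is added to a community) are assumptions drawn from the prose, not the authors' code.

```python
from typing import Dict, Optional, Tuple


class RoleResolver:
    def __init__(self) -> None:
        self.friend_defaults: Dict[str, str] = {}
        self.community_roles: Dict[Tuple[str, str], str] = {}
        self.service_roles: Dict[Tuple[str, str, str], str] = {}

    def add_friend(self, user: str, default_role: str) -> None:
        self.friend_defaults[user] = default_role

    def add_member(self, community: str, user: str,
                   role: Optional[str] = None) -> None:
        """Add a member; a friend's default role applies if none is given."""
        if role is None:
            role = self.friend_defaults.get(user)
        if role is None:
            raise ValueError(f"no role given and {user} has no default role")
        self.community_roles[(community, user)] = role

    def assign_service_role(self, community: str, service: str,
                            user: str, role: str) -> None:
        self.service_roles[(community, service, user)] = role

    def role_for(self, community: str, user: str,
                 service: Optional[str] = None) -> Optional[str]:
        """Return the explicit service role, else the inherited community role."""
        if service is not None:
            explicit = self.service_roles.get((community, service, user))
            if explicit is not None:
                return explicit
        return self.community_roles.get((community, user))


roles = RoleResolver()
roles.add_friend("alice", "reviewer")
roles.add_member("research", "alice")           # picks up the friend default
roles.add_member("research", "bob", "reader")
roles.assign_service_role("research", "notes", "bob", "administrator")

print(roles.role_for("research", "alice", "file-sharing"))
print(roles.role_for("research", "bob", "notes"))
```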

5. Conclusions

This paper presents a set of important concepts that facilitate the creation of well-defined P2P communities. Ownership of the information and services remains with the authors, and corporations can be assured that the members of the community can be bound by policies and procedures that define what are deemed to be proper uses. We have implemented these concepts within a brokered P2P network. The distributed use of identity, along with familiar social mechanisms for identity validation, aligned with structured communities, has enabled the creation of secure P2P collaboration environments into which services are easily introduced. We are currently evaluating these ideas and the framework with users in the research, health and financial communities.

Modelling users, services, information and the community itself as authorization entities has reduced the access control problem to answering the simple question, “Can one entity perform a specific action on another?” In short, security has been reduced to the execution of policies operating on abstract relationships between authorization entities. The architecture described provides facilities that enable pervasive computing, the separation of security from services and the browser-based interface being two examples. The implementation is firmly based upon standards, with the use of HTTP, JMS and JAAS, making it attractive as an advanced P2P platform for enterprise deployment. We continue to work on the platform and intend to add remote object execution and code mobility services in the near future.

References

[1] www.jxta.org
[2] Chen R. and Yeager W., Poblano: A Distributed Trust Model for Peer-to-Peer Networks, available at http://www.jxta.org/project/www/docs/trust.pdf
[3] http://web.mit.edu/kerberos/www/
[4] http://www.ietf.org/rfc/rfc2559.txt
[5] http://www.w3.org/Submission/2001/08/
[6] See links to SAML on http://www.oasis-open.org/committees/security/
[7] http://www.ebxml.org/specs/index.htm
[8] Bacic E., The Generic Policy Engine, M.C.S. diss., Carleton University, 1998.