IEEE DISTRIBUTED SYSTEMS ONLINE 1541-4922 © 2006 Published by the IEEE Computer Society Vol. 7, No. 5; May 2006

Semantic Management of Distributed Web Applications

Daniel Oberle, SAP Research
Steffen Staab, University of Koblenz-Landau
Andreas Eberhart, HP Germany

An ontology-based approach facilitates the management and administration of distributed Web applications developed using application servers and Web services.

Application servers and Web services offer many possibilities for developing Web applications, but they also present new challenges. For instance, managing component dependencies, versions, and licenses is a typical problem in an ever-growing repository of programming libraries. Similarly, Web services let developers reap advantages roughly similar to those of application servers. However, developers and administrators must cope with local management issues as well as incorporate external, possibly varying dynamic information about third-party services.

Specifying issues such as transactions, session management, and user rights in an application-independent way facilitates managing application servers and Web services. Developers achieve this by configuring generic software with the help of administration tools and corresponding XML configuration files. This lets you flexibly develop and administer a distributed application; however, because configuration files lack a coherent formal model, they don't provide a high level of abstraction to facilitate management, even if they're more or less human-readable XML. So, it's difficult or often impossible to query a system for conclusions that come from integrating several descriptions.

We remedy the problem by applying semantic technology (that is, ontologies and inference engines) in the middleware solutions. We contribute a state-of-the-art management ontology and propose a design to integrate the ontology infrastructure in existing application servers. We implemented a prototype of our proposed scheme in KAON SERVER, an amalgamation of JBoss and the KAON ontology tool suite.

Our motivation: Management problems

Of the several use cases we've identified in previous work [1, 2], we use two in this section to show the difficulties of managing application servers and Web services. We don't intend to show how to solve these specific use cases. Rather, we demonstrate how seemingly trivial problems that involve handling the source code and configuration files of different software components and Web services lead to complex situations that are hard for software developers or administrators to understand.

Application servers

The first use case commonly occurs when you must link legacy components. It deals with indirect permissions due to context switches in application servers (see figure 1). Suppose a customer, identified by the user account dob, logs into a Web shop via HTTP basic authentication. The script on this page (say, a servlet) might connect to the CustomerEntityBean, an Enterprise JavaBean, which in turn accesses the Customer table in a legacy database. The legacy database defines its own set of user accounts, which differs from the user accounts in the J2EE (Java 2 Platform, Enterprise Edition) realm. We assume that only the dbuser (typically the administrator) can access the Customer table. So, the EJB must perform an explicit context switch (frequently called the run-as paradigm [3]). The call succeeds because dbuser's credentials are propagated.

Figure 1. An example of indirect permission.

J2EE application servers define context switches independently of applications. This means that the responsibility shifts from coding to deployment. Although reducing the amount of source code you must write is always a good idea, deployment can be tricky. The J2EE specification describes the structure of XML deployment metadata (deployment descriptors). In our example, the developer would have to analyze two different deployment descriptors as well as the source code to configure the context switch. First, the configuration file (web.xml) of the servlet container states that only authenticated users can access the WebShopServlet (see figure 2).


Figure 2. A relevant snippet of the servlet container's web.xml deployment descriptor.

Second, the WebShopServlet accesses the CustomerEntityBean (see figure 3). The servlet's doGet() method serves the incoming HTTP requests. In our case, it queries user account information from the Customer table by means of the bean to display it to the user. After retrieving a handle to the bean via the Home interface, the servlet invokes the bean's getCustomerName() method.

Figure 3. A relevant snippet of WebShopServlet.java showing the invocation of the CustomerEntityBean.

Third, the CustomerEntityBean's deployment descriptor, called ejb-jar.xml, states that the bean performs a context switch via the tag. It thus accesses the Customer table with dbuser's credentials (see figure 4).

Figure 4. A relevant snippet of the ejb-jar.xml deployment descriptor showing the tags for the context switch.

This case isn't a bug but a common way of integrating legacy components. The developer must align two disjoint sets of user accounts via deployment descriptors. J2EE implementations such as JBoss provide tools to help developers generate such deployment descriptors. However, the tools act merely as an input mask, which generates the specific XML syntax for the developer. That is convenient, but the developer must still fully understand the complicated concepts behind the options for the context switch. The deployment tools don't help avoid or even repair configurations that might cause harmful system behavior. Even worse, the problem can multiply because a plethora of deployment descriptors exists for different kinds of components (servlets, EJBs, and managed beans) and aspects (security, transactions, and so on).
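The manual analysis described in this use case can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tooling the article proposes: the two XML fragments are simplified, hypothetical stand-ins for the descriptors in figures 2 and 4 (the role names `customer` and `dbuser` and the URL pattern are assumptions), and the servlet-to-bean call from figure 3 is entered as a hand-written fact because it lives in source code rather than in a descriptor.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the servlet container's web.xml.
WEB_XML = """
<web-app>
  <security-constraint>
    <web-resource-collection>
      <web-resource-name>WebShopServlet</web-resource-name>
      <url-pattern>/shop/*</url-pattern>
    </web-resource-collection>
    <auth-constraint>
      <role-name>customer</role-name>
    </auth-constraint>
  </security-constraint>
</web-app>
"""

# Hypothetical, simplified stand-in for the bean's ejb-jar.xml.
EJB_JAR_XML = """
<ejb-jar>
  <enterprise-beans>
    <entity>
      <ejb-name>CustomerEntityBean</ejb-name>
      <security-identity>
        <run-as>
          <role-name>dbuser</role-name>
        </run-as>
      </security-identity>
    </entity>
  </enterprise-beans>
</ejb-jar>
"""

# Hand-written fact read off the servlet source code (assumption).
INVOKES = {"WebShopServlet": ["CustomerEntityBean"]}


def required_roles(web_xml):
    """Map each protected web resource name to the roles allowed to call it."""
    roles = {}
    for constraint in ET.fromstring(web_xml).iter("security-constraint"):
        allowed = [role.text for role in constraint.iter("role-name")]
        for name in constraint.iter("web-resource-name"):
            roles[name.text] = allowed
    return roles


def run_as_roles(ejb_jar_xml):
    """Map each bean name to the identity it switches to, if it declares one."""
    switches = {}
    for bean in ET.fromstring(ejb_jar_xml).iter("entity"):
        role = bean.find("security-identity/run-as/role-name")
        if role is not None:
            switches[bean.findtext("ejb-name")] = role.text
    return switches


def indirect_identities(web_xml, ejb_jar_xml, invokes):
    """List (caller role, servlet, bean, effective identity) tuples."""
    switches = run_as_roles(ejb_jar_xml)
    findings = []
    for servlet, roles in required_roles(web_xml).items():
        for bean in invokes.get(servlet, []):
            if bean in switches:
                for role in roles:
                    findings.append((role, servlet, bean, switches[bean]))
    return findings


FINDINGS = indirect_identities(WEB_XML, EJB_JAR_XML, INVOKES)
```

Each tuple in `FINDINGS` records that a caller in one role ends up acting under a different identity through a bean. The conclusion only appears once facts from both descriptors and the source code are combined, which is exactly what the individual files do not make visible.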

Web services

Similar to deployment descriptors in application servers, WS* descriptions manage orthogonal aspects in an application-independent way. By WS*, we mean Web service specifications, such as WSDL (Web Services Description Language), WS-Security, or WS-Policy (see www-128.ibm.com/developerworks/views/webservices/libraryview.jsp?type_by=Standards). WS* descriptions are XML files that declaratively describe how developers should deploy and configure Web services. So, WS* descriptions are exchangeable, and developers might use different implementations for the same Web service description.

WS* descriptions' disadvantages, however, are also visible. Although the different standards are complementary, you might produce models composed of different WS* descriptions that are inconsistent but don't easily reveal their inconsistencies. This happens because no coherent formal model of WS* descriptions exists, so it's hard to query the system for conclusions that come from integrating several WS* descriptions.

As an example of a trivial conclusion derived from both a WS-BPEL (Web Services Business Process Execution Language) and WS-Policy description, consider the following case. Let's return to the Web shop example and assume we've realized it with internal and external Web services composed and managed by a WS-BPEL engine. After a customer submits an order, we must check the customer's credit card for validity, depending on the credit card type (VISA, MasterCard, and so on). We assume that credit card providers offer this functionality via Web services. The corresponding WS-BPEL process checkAccount invokes the provider's Web services.

Figure 5 shows a snippet of the WS-BPEL process definition.

Figure 5. Snippet of the WS-BPEL document showing the checkAccount process.


Suppose that a credit card provider's Web service accepts only authenticated invocations conforming to Kerberos or X.509. It states such policies in a corresponding WS-Policy document, such as the one in figure 6. The invocation will fail unless the developer ensures that the policies are met. The developer must check the policies manually at development time or implement functionality that reacts to policies at runtime, assuming that no policy-matching engine is in place.

Figure 6. The MasterCard service's WS-Policy document.

Several tools are available to define WS-Security and WS-Policy descriptors. However, as with deployment descriptors, they act as an input mask that generates the specific XML syntax for the developer.
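The check that the developer must otherwise perform by hand can be sketched as follows. This Python fragment is only an illustration under stated assumptions: it does not parse real WS-Policy or WS-BPEL documents, the policy is modeled as a list of alternatives (each a set of required assertions), and the labels `Kerberos`, `X509`, `UsernameToken`, and `MasterCardService` are hypothetical names chosen to mirror the example.

```python
# A policy is a list of alternatives; each alternative is the set of
# assertions a caller must satisfy. All labels are illustrative assumptions.
MASTERCARD_POLICY = [
    frozenset({"Kerberos"}),
    frozenset({"X509"}),
]


def satisfies(policy, capabilities):
    """Return True if the capabilities cover at least one policy alternative."""
    return any(alternative <= capabilities for alternative in policy)


def check_invocations(invocations, policies):
    """Return the services whose stated policy a planned invocation violates.

    `invocations` maps a service name to the security mechanisms the calling
    process is configured to use; `policies` maps a service name to its policy.
    """
    violations = []
    for service, capabilities in invocations.items():
        policy = policies.get(service)
        if policy is not None and not satisfies(policy, set(capabilities)):
            violations.append(service)
    return violations
```

For instance, `check_invocations({"MasterCardService": ["UsernameToken"]}, {"MasterCardService": MASTERCARD_POLICY})` reports the service as violated, whereas an invocation configured with `X509` passes. The point is that the conclusion requires facts from both the process definition and the policy document.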

Summary

Both deployment descriptors in application servers and WS* descriptors for Web services provide convenience and flexibility, but management remains cumbersome. The conceptual models underlying the configurations are implicitly encoded in different XML DTDs or schemas. So, software developers must manually discover (that is, by reading and analyzing the descriptor files) the conclusions that derive from integrating such descriptions, with little to no help from formal machinery. The use cases we presented would be easy to solve in isolation using dedicated tools. However, a generic, ontology-based semantic-management approach would let us solve a plethora of use cases in a common framework.


Ontology

As a solution to the missing coherent formal model, we propose an ontology covering aspects from the heterogeneous deployment and WS* descriptors. An ontology is a conceptual model with formal logic-based semantics. Ontologies formalize concepts and concept relationships (associations) similar to conceptual database schemas or UML class diagrams [4]. However, ontologies differ from existing methods and technologies in several ways:

Ontologies aim to enable agreement on the meaning of specific vocabulary terms and so facilitate information integration across applications.

Ontologies are formalized in logic-based representation languages. Their semantics are thus specified unambiguously.

Ontology representation languages come with executable calculi enabling querying and reasoning.

A concept hierarchy, or taxonomy, forms an ontology's backbone. Associations define relationships between concepts and can be instantiated accordingly.
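The two structural ingredients named here, a concept hierarchy and associations between concepts, can be illustrated with a small Python sketch. It is a toy model under stated assumptions: the concept names and the `accesses` association are invented for illustration and are not taken from the authors' ontology, and a real ontology language offers far richer semantics than this lookup.

```python
# Taxonomy: each concept maps to its direct superconcept (None for the root).
# All concept names are illustrative assumptions.
SUPERCONCEPT = {
    "Entity": None,
    "Software": "Entity",
    "Servlet": "Software",
    "EnterpriseBean": "Software",
    "User": "Entity",
}

# Each association is declared with a domain concept and a range concept.
ASSOCIATIONS = {"accesses": ("User", "Software")}


def is_subconcept(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in the taxonomy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUPERCONCEPT[concept]
    return False


def can_instantiate(association, subject_concept, object_concept):
    """True if instances of the two concepts may be linked by the association."""
    domain, range_ = ASSOCIATIONS[association]
    return (is_subconcept(subject_concept, domain)
            and is_subconcept(object_concept, range_))
```

Because `Servlet` lies below `Software` in the taxonomy, `can_instantiate("accesses", "User", "Servlet")` holds without being stated explicitly. This is a very small instance of the kind of conclusion a reasoner derives from the formal model.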

Design

Semantic management of software components in application servers and of Web services comprises two layers. First, you formally specify an ontology of components and services; that is, you formalize what components and services are made of. You need only build the ontology once, although the ontology for the application domain (for example, shopping for books on the Web) might vary. Second, you formally specify a concrete set of components or services and their properties by providing semantic metadata (also called instances in common literature) aligned to the ontologies from the first layer. The semantic metadata formalizes a particular instantiation of a distributed application.

For the first layer, we used a foundational ontology as a starting point. Foundational ontologies capture insights from philosophy, logics, and software engineering to prescribe good ontology engineering practice at an upper level of the ontology. For example, our ontologies distinguish carefully between subconcepts and functional roles; for instance, data might play the role of Web service input or output, but the concept "data" is a subconcept of neither.

The foundational ontology provides the basis for modeling relevant aspects of components and services. Because modeling is usually time consuming, we generally strive to reuse existing ontologies. We could have reused several recently proposed Web service ontologies (such as OWL-S [5] or WSMO [6]). However, these ontologies have several shortcomings [7] that lead to conceptual ambiguity and inferior design. So, we decided against reuse and created our own ontologies for components and services.


Correspondingly, we've pursued a modularized, layered approach that adds ontological commitment in a piecewise manner to maximize ontology reuse at all layers. Figure 7 depicts our ontology modules (see http://cos.ontoware.org).

Figure 7. Reused and contributed ontology modules as a UML package diagram. Packages represent ontology modules; dashed lines represent dependencies between modules. An ontology module A depends on B if it specializes concepts of B, has associations with domains and ranges to B, or reuses its axioms.

Reused ontology modules. Dolce (Descriptive Ontology for Linguistic and Cognitive Engineering) [8] is a typical foundational ontology with a rich axiomatization of generic (domain-independent) concepts, explicit construction principles, and careful reference to interdisciplinary literature. Several additional theories exist for Dolce that come in the form of ontology modules. For example, Descriptions & Situations (D&S) can be considered an ontology design pattern for structuring (or restructuring) application ontologies that require contextualization. The domain we want to model, namely that of software components and Web services, requires an ontological formalization of context. The most prominent examples of the need for context modeling are the different views that can exist on data. Data can play the role of both input and output, depending on the context. Aldo Gangemi and Peter Mika describe D&S in detail [9].

Another requirement is the possibility to model workflow information between software components or between Web services. One Dolce module, the Ontology of Plans, generically formalizes a theory of plans [10] that you can use to model workflow information. The Ontology of Plans applies the D&S ontology design pattern to characterize planning concepts. The module's intended use is to specify plans at an abstract level independent from existing calculi. Finally, the Ontology of Information Objects provides a semiotic ontology design pattern centered around information objects [10]. It lets us concisely model the relationship between entities in an information system and the real world. This is required for modeling Web services later on.

Contributed ontology modules. We've applied the reused ontology modules to formalize a Core Software Ontology, a Core Ontology of Software Components, and a Core Ontology of Services. Core ontologies define concepts that are generic across a set of domains; they're situated between the extremes of generic and domain ontologies. Their goal is to facilitate reuse in specific settings and platforms. The borderline between generic and core ontologies isn't clearly defined because there's no exhaustive enumeration of fields and their conceptualizations. However, the distinction offers intuitively meaningful and useful information for building libraries.

The Core Software Ontology formalizes the most fundamental concepts for modeling both software components and Web services, such as software, data, users, access rights, or interfaces. It formalizes such concepts by reusing the modeling basis of Dolce, D&S, the Ontology of Plans, and the Ontology of Information Objects. We've defined the fundamental concepts in the Core Software Ontology, separately from the Core Ontology of Software Components, to facilitate reuse.

The Core Ontology of Software Components is based on the Core Software Ontology to formalize our understanding of the term "software component." This term requires special attention because you can interpret it differently, leading to ambiguity. We also put libraries and licenses in this core ontology. The ComponentProfile is the core concept and aggregates a component's relevant aspects. It allows for browsing and querying the various component properties.

The Core Ontology of Services is also based on the Core Software Ontology. It formalizes our understanding of the term "Web service" and introduces the notion of ServiceProfiles. You can access more information on all our core ontologies elsewhere [11].

Domain ontology modules. Domain ontologies enrich the core ontologies with additional domain-dependent knowledge. We specialize the concepts and associations of both core ontologies for the particular needs of the infrastructure in our prototype. Several idiosyncrasies of the underlying platform come into play and are formalized accordingly.
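The layering of modules can be made concrete by treating the dependencies as a graph and computing a valid load order. The Python sketch below is an illustration only: the edges are read off the prose in this section rather than from figure 7, the edge from the Ontology of Information Objects to Dolce is an assumption, and domain ontology modules are left out because the text doesn't name them.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each module maps to the modules it depends on (edges read from the prose).
DEPENDS_ON = {
    "Dolce": set(),
    "Descriptions & Situations": {"Dolce"},
    "Ontology of Plans": {"Descriptions & Situations"},
    "Ontology of Information Objects": {"Dolce"},  # assumption
    "Core Software Ontology": {
        "Dolce",
        "Descriptions & Situations",
        "Ontology of Plans",
        "Ontology of Information Objects",
    },
    "Core Ontology of Software Components": {"Core Software Ontology"},
    "Core Ontology of Services": {"Core Software Ontology"},
}


def load_order(depends_on):
    """Return the modules in an order where dependencies always come first."""
    return list(TopologicalSorter(depends_on).static_order())


def transitive_dependencies(module, depends_on):
    """Return every module that `module` depends on, directly or indirectly."""
    seen = set()
    stack = [module]
    while stack:
        for dependency in depends_on[stack.pop()]:
            if dependency not in seen:
                seen.add(dependency)
                stack.append(dependency)
    return seen
```

Calling `transitive_dependencies("Core Ontology of Services", DEPENDS_ON)` returns the Core Software Ontology together with all four reused modules, which reflects how each layer adds ontological commitment on top of the layers below it.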


Examples

Our ontologies can formalize the use cases we discussed in the "Application servers" and "Web services" sections.

Application server example. Returning to our indirect-permission example, we now show how using our ontology can support developers. The Core Ontology of Software Components introduces concepts such as Resource, User, RequestContext, or AccessRight. Associations hold between concepts: grantedForUser between AccessRight and User, definedOnResource between AccessRight and Resource, invokes between one Resource and another, and so on. The association below, namely permission, holds when a User is directly granted an AccessRight on a Resource. It's defined intensionally, that is, in the form of a rule (the notation resembles Prolog syntax):

(1) permission(u,r)