Towards Very Large Knowledge Bases, IOS Press, pp. 25–32.
Reuse, CORBA, and Knowledge-Based Systems
John H. Gennari, Heyning Cheng, Russ B. Altman, and Mark A. Musen Stanford Medical Informatics Stanford University, Stanford, CA 94305-5479
Abstract
By applying recent advances in the standards for distributed computing, we have developed an architecture for a CORBA implementation of a library of platform-independent, sharable problem-solving methods and knowledge bases. The aim of this library is to allow developers to reuse these components across different tasks and domains. Reuse should be cost effective; therefore, the library will include standard problem-solving methods whose semantics are well understood and are described with a language for stating the requirements and capabilities of a component. In addition, when a developer needs to adapt a component to a new task, the adaptation costs should be minimal. Thus, we advocate the use of separate mediating components that isolate these adaptations from the original component. We demonstrate our approach with an example: an implementation of a problem-solving method, a knowledge-base server, and mediating components that adapt the method to different knowledge bases and tasks.
1. Cost-Effective Reuse
Researchers in both knowledge-based systems and in software engineering have looked to reuse as a methodology for reducing the high cost of software development and maintenance. With a reuse approach to software construction, developers adapt existing software components at a fraction of the cost of developing a system from scratch. Reuse can save development costs when (1) the overhead cost of building a component for reuse is low, (2) the frequency with which developers reuse components is high, and (3) the cost of finding and adapting a component is low. Unfortunately, there are a number of obstacles to overcome before component reuse is cost effective. Reuse is not cost-effective when the cost of building, finding, and adapting a library component is greater than the cost of building a solution from scratch. Reuse is cost effective only when the developer can find and understand a component quickly, and when the component solves a significant problem: one that would be expensive to solve with software built and debugged from scratch.
Development and maintenance of knowledge-based systems without reuse is known to be expensive. Originally, first-generation knowledge-based systems consisted of an inference engine and a knowledge base that included both facts about the domain and rules that controlled the processing of those facts. Unfortunately, these systems did not scale well to large knowledge bases, as the set of rules and facts quickly became unwieldy and difficult to maintain (Bachant & McDermott, 1984). In response to this problem, researchers isolated knowledge about the process used to solve some problem from knowledge of specific facts about a particular domain. Thus, second-generation knowledge-based systems are composed of two large-grained components: (1) a problem-solving method and (2) a knowledge base used by the method (David, Krivine, & Simmons, 1993). Although the design of second-generation knowledge-based systems allows developers to build systems that scale to very large problems, significant reuse of components has yet to be demonstrated.

One of the obstacles to reuse for knowledge-based systems is the inability to share components across development environments. Typically, different environments have idiosyncratic ways of specifying components, and thus, developers building a reuse library for one architecture cannot use components developed in a different environment. To address this problem, we advocate the use of the Common Object Request Broker Architecture (CORBA) standard for platform-independent communication and component definition (Orfali, Harkey & Edwards, 1996). Our vision for the development of knowledge-based systems is that developers will be able to build systems at reduced costs by retrieving and adapting components from a distributed and platform-independent reuse library.
We believe (1) that components such as problem-solving methods are particularly appropriate for cost-effective reuse, (2) that an effective strategy for minimizing component adaptation costs is to construct separate mediating components that filter and transform information among components, and (3) that a reuse library built with a standard for cross-platform communication and distributed computing such as CORBA, maximizes the potential reuse frequency for components and amortizes the cost of developing a large reuse library. While the first claim has been made by many in the field (e.g., Chandrasekaran, 1986; Schreiber, et al., 1994; Motta, et al., 1996; Breuker, 1997), and a few have been actively pursuing the second claim (e.g., Fensel & Groenboom, 1997), we are not aware of any other experiments with knowledge base component reuse and CORBA. In this paper, we present an initial example in support of our vision, using a set of example tasks and knowledge bases from the field of molecular biology, and a well-known problem-solving method, propose-and-revise (Marcus, Stout & McDermott, 1988). We have constructed a simple knowledge-base server using the ideas of the Open Knowledge Base Connectivity Protocol (Chaudhri, et al., 1998; see also related work of Karp, et al., 1995) and have built mediating components that connect the method to the appropriate knowledge bases in the server. Our example demonstrates the reuse of a problem-solving method across multiple tasks, where the adaptations of this method are isolated in mediating components, and where all components (method, mediator, and knowledge-base server) can be implemented as distinct services available
over the network. Although a single example cannot be used to prove claims about effort saved via reuse, or measure the ease with which components can be retrieved and adapted from the reuse library, our work is a first step toward such studies. Over time, reuse cases must be built and components made available so that developers of knowledge-based systems can assess the value of an entire reuse library of components. Before presenting our reuse example, we describe the construction of second-generation knowledge-based systems, including our architecture for building systems from reusable knowledge-base components.
2. A Reuse Architecture for Knowledge-Based Systems
Our work to develop a reuse architecture for knowledge-based systems is part of Protégé: a long-term project to build a toolset and methodology for the construction of domain-specific knowledge-acquisition tools and knowledge-based systems from reusable components (Puerta, Egar, Tu, & Musen, 1992; Gennari, Tu, Rothenfluh, & Musen, 1994; Eriksson, et al., 1995). Protégé is one of several environments for the construction of second-generation knowledge-based systems—other examples include VITAL (Shadbolt, Motta, & Rouge, 1993; Motta, et al., 1996) and CommonKADS (Schreiber, et al., 1994). These environments are designed to help systems developers build knowledge-based systems from reusable components: problem-solving methods and knowledge bases. As we describe below, the Protégé methodology helps developers design these components for reuse: developers can apply the same problem-solving method to different knowledge bases, and developers may use different methods over the same knowledge base.

2.1. Methods, Knowledge Bases, and Ontologies
A problem-solving method captures knowledge about how to accomplish some class of tasks, whereas a knowledge base provides the data or information about the domain that is necessary for some problem-solving method to operate. Examples of problem-solving methods include constraint satisfaction, reactive planning, or the temporal abstraction of data (Tu, et al., 1995). These methods are algorithmic procedures that can be at least somewhat domain-independent: they can be applied to different sets of data to solve problems in different domains. A knowledge base captures information about a domain that is more static, relative to the method or methods that use the knowledge base.
Although knowledge bases are typically designed as input data for some problem-solving method, a knowledge base may be designed for several methods, and it may capture information about the domain that is independent of the problem at hand. Thus, knowledge bases can be at least somewhat method-independent: they can serve as sources of information for more than one problem-solving method (Musen & Schreiber, 1995). To allow developers to understand and reuse both methods and knowledge bases, both of these components are accompanied by ontologies that more formally describe the terms and relations used by that component (Gruber, 1993; Guarino & Giaretta, 1995). In our environment, these ontologies describe at a more abstract level the set of objects used by the component. Thus, an
ontology for a constraint-satisfaction method defines the abstract notion of a constraint as an object with particular attributes or slots, and with particular inheritance relationships to other objects in the ontology. When the method is invoked, this abstract notion of constraint must be instantiated with a particular set of constraints that the method then attempts to satisfy. Similarly, an ontology for a knowledge base describes the abstract classes, their relationships and attributes, and these classes are instantiated by ground-level facts in the knowledge base. These ontologies are essential for component reuse, because they allow developers some insight into the semantics of a method or a knowledge base. Thus, the ontology for a knowledge base specifies the vocabulary for that domain, and allows method developers to formulate queries of the knowledge base in the terms of that ontology. If the knowledge base is designed for reuse, it should be able to respond to a range of queries. Furthermore, as we describe in Section 5, if a set of knowledge bases share a common representation language, then they can be grouped together and made available by a single knowledge-base server. The ontology for a problem-solving method specifies the classes of inputs and outputs used by the method. As with the knowledge-base ontology, this method ontology simplifies the reuse of a method. For other developers to use our problem-solving methods, we must provide application-programming interfaces (APIs) for each method, and we can derive these interfaces from method ontologies that describe inputs and outputs. As we describe in Section 6, these method ontologies are part of a richer method-description language that would specify method semantics such as goals, competence, and decomposition into sub-methods.

2.2. Reusing Methods and Knowledge Bases with Mediators
In general, it is cost-effective for developers to reuse both problem-solving methods and knowledge bases when building knowledge-based systems. For example, a developer assembling a knowledge-based system for a computer hardware domain should be able to retrieve and reuse either a building-block method, such as a constraint-satisfaction algorithm, or a knowledge-base component, such as a repository of information about standard pieces of computer hardware. However, such a scenario raises the issue of component adaptation costs: the costs of modifying and adapting existing components to fit a new task. Unless the method and the knowledge base are matched perfectly, the developer must either change the knowledge base to match the input and output specifications of the method, or modify the method to use exactly the terms specified by the knowledge-base ontology. Either of these solutions modifies existing components, and will therefore inhibit the ease with which those components might be reused by other developers. Therefore, we advocate the use of separate mediating components that isolate any customizations required to use a particular problem-solving method with a particular knowledge base. Figure 1 shows our view of these three types of components (methods, knowledge bases, and mediators) that developers can configure to solve knowledge-based problems. Unlike problem-solving methods and knowledge bases, the mediating components are specific to a particular task—they
encapsulate information about how to connect and adapt a specific method to a specific knowledge base to solve a particular problem.

Figure 1. CORBA support for a reuse architecture.

Figure 1 shows all components interconnected via the Common Object Request Broker Architecture (CORBA), a platform-independent standard that supports object-oriented distributed computing. This software standard allows objects and associated methods to operate across a network in a hardware- and language-independent manner (Orfali, Harkey & Edwards, 1996). All CORBA components include an interface definition—a stub declaration of an object’s attributes and methods specified in the Interface Definition Language (IDL). Any developer who wishes to use a CORBA component would compile that component’s IDL specification into the machine-specific form for inclusion into a local application. For knowledge-base components, this specification is related to the ontology for that component. However, IDL specifies only a syntactic description of a component’s interface elements; it does not specify any semantic information about the component. Nonetheless, once an IDL specification is defined for a component, the CORBA standard makes it easy to reconfigure that component to allow connections to different clients or servers.

Our aim is to support component interoperation—to allow developers in different knowledge-base environments, or using different development and representation languages, to share components. Without sharing across environments, each group of developers would need to build up their own library of components, rather than using components shared from other architectures. CORBA supports this goal by establishing a standard of communication. Whether this standard is CORBA or an alternative, we argue that the high cost of developing a large reuse library should be distributed across developers by the use of standards for defining and describing software component interfaces.
In this paper we describe a distributed implementation of a knowledge-based system, built as a set of reusable components as shown in Figure 1. These include a legacy problem-solving
method, a knowledge-base server and mediating components that solve two configuration problems. This example demonstrates reuse of a problem-solving method, and shows how the CORBA standard facilitates communication among components.
3. Configuration Problems in Molecular Biology
In molecular biology, transfer RNA (tRNA) and the ribosomal macromolecule are part of the cellular machinery responsible for translation of RNA to protein in living organisms. Researchers are interested in determining the three-dimensional shape of such structures, since such information may be essential if we are to understand the mechanisms of protein synthesis. In other words, the task for the researcher is to determine the three-dimensional position or configuration of the ribosome or the tRNA macromolecule, given experimental evidence about the sequence and known structure of these macromolecules. To build a knowledge-based system to solve this problem, we need (1) a specification of a knowledge base of experimental molecular biology information, and (2) a method that can solve the configuration problem of determining three-dimensional structure. Fortunately, neither of these components needs to be built from scratch: researchers in this area have already begun to share knowledge bases about ribosome sequence, and a number of legacy computational methods are available (Chen, Felciano & Altman, 1997; Altman, Abernathy & Chen, 1997).

We describe two variations of this configuration problem: (1) for the ribosome itself, and (2) for transfer RNA. In both cases, the general problem can be solved by constraint satisfaction: given experimental information about the molecule that includes constraints among components, find a configuration of the molecule such that no constraint is violated. Our solutions to these two problems require two different knowledge bases: one for the tRNA structure, and one for the ribosome. However, as we show, the two problems are sufficiently similar that they can be solved by a single problem-solving method.

3.1. The 30S Subunit Configuration Problem
The 30S ribosome subunit is made of a single chain of RNA bases and a set of 21 unconnected proteins.
Current experimental techniques provide four types of information. First, they provide the location in three dimensions of the 21 proteins. Second, they provide the primary sequence of RNA bases. Third, they provide the location in the primary sequence of geometric components known as secondary structures, such as double helices and coils, that have a known, regular structure. Fourth, they provide distance constraints between the components and the fixed proteins, as well as among the components themselves. Thus, given this experimental information, the configuration task is to find sets of locations and orientations for each component such that no distance constraint is violated. The secondary structure for this ribosomal knowledge base specifies 10 helices; Figure 2 shows these helices in a solution position, where all distance constraints are satisfied.
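The distance-constraint check at the heart of this task can be sketched directly. The component names and the constraint structure below are illustrative assumptions, not the paper's actual representation:

```python
from math import dist  # Euclidean distance (Python 3.8+)

def violations(placements, constraints):
    """Return the constraints violated by a candidate configuration.

    `placements` maps a component name to its (x, y, z) position;
    each constraint bounds the distance between two named components.
    (Names and structure here are illustrative, not the authors' code.)
    """
    bad = []
    for c in constraints:
        d = dist(placements[c["obj1"]], placements[c["obj2"]])
        if not (c["lower"] <= d <= c["upper"]):
            bad.append(c)
    return bad

# A toy two-helix example: one satisfied and one violated constraint.
placements = {"H1": (0.0, 0.0, 0.0), "H2": (10.0, 0.0, 0.0)}
constraints = [
    {"obj1": "H1", "obj2": "H2", "lower": 5.0, "upper": 15.0},  # satisfied
    {"obj1": "H1", "obj2": "H2", "lower": 0.0, "upper": 8.0},   # violated
]
print(len(violations(placements, constraints)))  # → 1
```

A real configuration task would also carry orientations and helix geometry, but the same pass/fail test over all constraints defines a solution position.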
Figure 2. A configuration of the 30S subunit of the ribosome. Cylinders represent helices; ellipsoids represent proteins.

The knowledge base required to solve this problem can be described by its ontology; in our framework, the ontology is specified with the classes and attributes shown in Figure 3. The Object class specifies the 10 helices that make up the ribosomal secondary structure; the Representation class describes the size of each helix; the Location-file is an explicit list of possible locations in three-space for that helix; and the Constraint class specifies the distance constraints between helices. This ontology is a simpler, more task-specific version of a general-purpose ontology and knowledge base that we are building for a wider variety of tasks (Chen, Felciano, & Altman, 1997).

Our goal is to use the same problem-solving method to solve different but related problems in molecular biology. Thus, it is important to note differences and similarities in the ontologies of the two problems. For example, in order to reduce the total number of possible locations (and thus, the computational burden on the constraint-satisfaction algorithm), this knowledge base includes information from a preprocessing step that computes an explicit list of candidate helix locations. As we will see, this information and the corresponding ontology elements are missing from the tRNA configuration problem.

[Figure 3 content: 30S subunit ontology with classes Object (Name, Obj-type, Representation, Best-loc-file); Representation (Top, Bottom, Radius); Location-file (Name, Ref-obj, Date-created, locFound, List-of-locs (x, y, z, ω, ϕ)); Constraint (Obj1, Obj1-pos, Obj2, Obj2-pos, Lower-bound, Upper-bound).]
Figure 3. An ontology for the ribosomal 30S subunit configuration task.

3.2. The tRNA Configuration Problem
Transfer RNA (tRNA) is a critical component of the molecular machinery that translates the DNA genetic code into proteins. It interacts with the ribosome to provide individual protein components that are specifically matched to segments of the DNA sequence. The three-dimensional structure of tRNA is one of only a few RNA structures that is known at high resolution from x-ray crystallographic studies. Thus, this problem has a known gold-standard solution, unlike the configuration problem for the ribosome 30S subunit. As with the ribosome configuration problem, our aim is to find a position in three-space for all elements of tRNA such that no distance constraints are violated. However, unlike the knowledge base for the ribosome problem, this knowledge base does not include an explicit list of positions for each helix; there is no pre-processing step that produces a list of possible locations. Instead, each helix has an associated sampling step for each dimension. As we shall show, this information allows the knowledge-based system to produce dynamically a list of possible helix locations for use by the problem-solving method. Figure 4 shows a schematic view of the ontology for the tRNA knowledge base, including a Sampling_rate class.

[Figure 4 content: tRNA ontology with classes Sampling_rate (Max, Min, Step); Constraint (Helix1, Helix1_pos, Helix2, Helix2_pos, Lower_bound, Upper_bound); Helix (Name, Representation, X_rate, Y_rate, Z_rate, Omega_rate, Theta_rate, Phi_rate); Helix_rep (Top, Bottom, Radius).]
Figure 4. An ontology for the tRNA configuration task.

In the Protégé framework, all ontologies (such as those shown in Figures 3 and 4) are represented with a simple frame-based formalism, and viewed and manipulated within the Protégé Ontology Editor.

Reuse of components for related tasks minimizes adaptation costs.
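The per-dimension sampling steps stored in the tRNA knowledge base can be expanded on demand into an explicit list of candidate positions, which is the kind of dynamic query our earlier architecture could not support. A minimal sketch, with an assumed (min, max, step) attribute layout:

```python
from itertools import product

def candidate_positions(rates):
    """Expand per-dimension sampling rates into explicit candidate positions.

    `rates` maps a dimension name to a (min, max, step) triple; this layout
    is an illustrative stand-in for the tRNA ontology's Sampling_rate class.
    """
    def axis(lo, hi, step):
        vals, v = [], lo
        while v <= hi + 1e-9:          # tolerate float rounding at the bound
            vals.append(round(v, 6))
            v += step
        return vals

    dims = sorted(rates)
    grids = [axis(*rates[d]) for d in dims]
    return [dict(zip(dims, point)) for point in product(*grids)]

# Two sampled values per axis over x and y → 4 candidate positions.
rates = {"x": (0.0, 1.0, 1.0), "y": (0.0, 1.0, 1.0)}
print(len(candidate_positions(rates)))  # → 4
```

In the full task the grid would span all six positional and angular dimensions, which is why coarse sampling rates matter for keeping the search tractable.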
In the example that we describe here, we reused the propose-and-revise method for two configuration tasks. Although the tasks are clearly similar, all of the differences between the ontologies of Figures 3 and 4 must be accommodated by the reuse developer. In addition to conceptual differences, such as how sampling rates are represented, even simple differences in terminology must be addressed. For example, in the ribosome ontology, the constraint class has attributes named Obj1, Obj2, and so on, while in the matching constraint class in the tRNA ontology, the corresponding attributes are labeled Helix1, Helix2, and so on. In Section 5, we show how our mediating components address these differences between ontologies.
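One way a mediator can isolate such terminology differences is a declarative slot-name mapping from each domain ontology into the method ontology. The domain-side slot names below follow Figures 3 and 4; the method-side names are assumptions for illustration:

```python
# Declarative slot-name mappings from each domain ontology to the method
# ontology. Domain slot names follow Figures 3 and 4; the method-side
# names ("var1", "lower", ...) are hypothetical.
RIBOSOME_TO_METHOD = {"Obj1": "var1", "Obj1-pos": "var1-pos",
                      "Obj2": "var2", "Obj2-pos": "var2-pos",
                      "Lower-bound": "lower", "Upper-bound": "upper"}
TRNA_TO_METHOD = {"Helix1": "var1", "Helix1_pos": "var1-pos",
                  "Helix2": "var2", "Helix2_pos": "var2-pos",
                  "Lower_bound": "lower", "Upper_bound": "upper"}

def translate(frame, mapping):
    """Rename the slots of a domain frame into method-ontology terms."""
    return {mapping.get(slot, slot): value for slot, value in frame.items()}

tRNA_constraint = {"Helix1": "H1", "Helix2": "H2",
                   "Lower_bound": 0.0, "Upper_bound": 8.0}
print(translate(tRNA_constraint, TRNA_TO_METHOD)["var1"])  # → H1
```

Because only the mapping table changes per task, neither the method nor the knowledge base needs to be edited when a new domain is connected.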
4. A Legacy Method: Propose-and-Revise
An important feature of a reuse architecture is the ability to accommodate legacy methods and algorithms. In the molecular biology domain, there are many legacy algorithms that researchers would like to apply to new data sets. For example, there are analysis tools that simply determine whether a set of constraints is consistent with a set of helix positions in three-space. In our example, there is an algorithm for predicting location information given a set of constraints to be satisfied. The molecular biologist wishes to apply such analysis tools to many different data sets. As more and more researchers collaborate, they will create many different data sets reflecting different experimental techniques as well as different macromolecules. Thus, for our reuse approach to be effective, we must support the reuse and distribution of legacy methods.

In this section, we present the problem-solving method known as “propose-and-revise” (Marcus et al., 1988). This method is a simple backtracking search through the space of candidate solutions that repeatedly modifies the design as constraint violations are flagged. The algorithm proceeds as follows:
1) Propose an initial design.
2) Check for any constraint violations.
3) If there are no constraint violations, succeed; otherwise, either (a) choose the best fix for a violated constraint, or (b) if no fixes are available, backtrack to the most recent choice point.
4) Revise the design by applying the selected fix.
5) Go to step 3.

Propose-and-revise is a legacy system from the knowledge-based systems literature. Our ability to adapt this algorithm to problems from the field of molecular biology demonstrates the flexibility of this generic problem-solving method. The algorithm was originally designed by Marcus for an engineering task: configuring elevator components to be consistent with customer specifications and with safety requirements (Marcus, Stout, & McDermott, 1988).
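The five steps of the algorithm can be sketched as a small backtracking loop. This is an illustrative reconstruction under simplifying assumptions (dict-valued designs, predicate constraints, fix functions), not the authors' CLIPS implementation:

```python
def propose_and_revise(initial, constraints, fixes):
    """A minimal propose-and-revise sketch (names are illustrative).

    `initial` is a proposed design (dict of variable -> value);
    `constraints` is a list of predicates over a design;
    `fixes` maps a constraint index to a list of revision functions,
    each returning a new design. Returns a design with no violations,
    or None if the fix space is exhausted.
    """
    stack = [initial]                      # choice points for backtracking
    while stack:
        design = stack.pop()
        violated = [i for i, c in enumerate(constraints) if not c(design)]
        if not violated:
            return design                  # step 3: no violations, succeed
        # Step 3a/3b: push every applicable fix for the first violation;
        # popping one applies it, and the rest remain as backtrack points.
        for fix in fixes.get(violated[0], []):
            stack.append(fix(design))      # step 4: revise, then re-check
    return None                            # no fixes left anywhere: fail

# Toy task: x must be >= 10; the only fix upgrades x by 5.
design = propose_and_revise(
    {"x": 3},
    [lambda d: d["x"] >= 10],
    {0: [lambda d: {**d, "x": d["x"] + 5}]},
)
print(design)  # → {'x': 13}
```

The stack makes step 3b explicit: when a revised design leads to a dead end, the loop simply resumes from the most recent unexplored fix.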
Our first implementation of this method was for use with a knowledge base of elevator components, as part of the Sisyphus-2 benchmarking project in 1992 (Schreiber & Birmingham, 1996). This project compared the abilities of different environments by testing them against a common knowledge-based problem, as detailed in the elevator design task specification (Yost & Rothenfluh, 1996). Although our implementation of propose-and-revise was designed for the elevator configuration
problem, we wrote the method to be reusable, with a method ontology of generic search and constraint-satisfaction terminology, rather than with an ontology of elevator-specific concepts. Our method ontology is a simple specification of the requirements of propose-and-revise, with classes for constraints, fixes, and the state variables that participate in constraint and fix specifications. As we describe in Section 5, this generic design has allowed us to reuse the module for other problems in other domains.
5. Reuse of Propose-and-Revise
Over the past 5 years, we have adapted and reused the legacy code for propose-and-revise for a series of different tasks. The way in which we implemented the reuse of this method for new tasks reflects our progression toward a more general-purpose architecture for reuse. A review of our different uses of propose-and-revise will help explain our current approach to reuse.

After building the method for elevator configuration, we initially adapted the method for reuse in a similar, artificial test problem for UHaul configuration (Gennari, et al., 1994). In this domain, the system is given information about various truck sizes and rental costs and then selects the appropriate truck and total cost, based on customer information about the volume of goods to be shipped. This domain was engineered to be similar to that of elevator configuration; for example, the customer’s selection is upgraded to a larger truck if a capacity constraint is violated, much as an elevator cable might be upgraded to a stronger cable if a safety constraint were violated. Although the UHaul task is merely a simplified version of the elevator-configuration task, there are still adaptation costs for reusing the propose-and-revise problem-solving method. For example, the knowledge bases for UHaul equipment and for elevators have different terminologies. Information in these knowledge bases must be translated into the terminology of the propose-and-revise method ontology, thereby adapting the knowledge bases to the requirements of the problem-solving method (Gennari, et al., 1994). We isolated these adaptations as a set of declarative mappings that translated the domain-specific knowledge base to a method-specific knowledge base. Using this same approach, we reported on the adaptation of propose-and-revise to the ribosome-configuration problem (Gennari, Altman & Musen, 1995).
The ribosome knowledge base was designed independently (without knowledge of the propose-and-revise method) and originally used with the Protean problem-solving method (Altman, Weiser & Noller, 1994). Unlike our initial demonstration of reuse across elevator and UHaul problems, the adaptation of propose-and-revise to the ribosome configuration problem demonstrated our ability to reuse method code with real-world problems that were designed to be solved by other problem-solving methods. However, in all of our work to date, all code was written within a single software environment: the CLIPS production-system language that supports both an object-oriented language for knowledge bases, and a rule-based system for problem-solving methods. The requirement that all users of our components build systems within CLIPS greatly constrained our ability to reuse components. Furthermore, our implementations used CLIPS to load the complete knowledge base as part of the run-time system, and this approach will not scale to large knowledge bases. Finally, our declarative mappings could not support dynamic queries to the domain knowledge base. That is, our architecture required that all domain knowledge be translated as a pre-processing step before invoking the problem-solving method. This requirement is awkward and especially problematic in the tRNA configuration task, where the knowledge base does not include an explicit list of possible helix locations.

These problems led us to design the reuse architecture introduced in Figure 1: a distributed architecture in which methods, knowledge bases, and mediators are independent components that communicate via the CORBA standard. In Figure 5, we show an instantiation of this architecture with specific examples of method, mediator, and knowledge-base components. Our architecture is compatible with any development environment that complies with the CORBA standard. The use of a knowledge-base server allows problem-solving methods to retrieve knowledge-base information at run-time, without loading or pre-processing an entire knowledge base. In Sections 5.1 through 5.3, we describe each component in Figure 5 in greater detail.

Figure 5. The propose-and-revise method with two mediators and knowledge bases.

5.1. The Propose-and-Revise Component
As described in Section 4, our implementation of propose-and-revise is built from legacy code written 5 years ago in CLIPS. To wrap this code and turn the method into a CORBA service, we built a C++ component that invokes CLIPS. This component communicates both with client components that invoke the method, and with the knowledge base of constraints and variables required as input to propose-and-revise.
Thus, this component is both a service for clients that may invoke the method, and also a client to knowledge-base servers that provide information for the method. Clients invoke the problem-solving method upon some knowledge base and then
supply the method with any additional run-time inputs from the user. The knowledge-base server responds to queries from propose-and-revise (via mediating components), allowing the method to retrieve information from the knowledge base. Figure 5 shows the client, method, and knowledge-base server in left-to-right order. As described earlier, any CORBA component includes an IDL specification that is a formal specification of its methods and objects. As a service, the propose-and-revise component includes an IDL specification for the following methods: (1) “connect”, which allows a client to inform the method as to which knowledge base the method will use; (2) “getInputs”, which retrieves from the knowledge base the list of required run-time inputs; and (3) “solve”, which invokes propose-and-revise, passing in the run-time inputs as parameters, and returning the set of output variables and their associated values. For the molecular-biology tasks described in Section 3, there are no run-time inputs, since all inputs for the method are included in the molecular biology knowledge base, whereas for the UHaul and the elevator-configuration problems, some information must be elicited from the client component at run-time. The invocation of the legacy CLIPS code is within the implementation of “solve”, and input and output parameters are transformed between CLIPS and C++ representations. As shown in Figure 5, the method component connects to a particular knowledge base as a client, and retrieves required information for processing. For the propose-and-revise method, all instances of variables, constraints, and fixes must be available before inference can begin. The propose-and-revise component may also make queries of the knowledge base during inference, as it tries to apply certain types of fixes to resolve a constraint violation.
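The three operations in the component's IDL specification can be mirrored as a plain interface sketch. The Python rendering, signatures, and stub bodies below are assumptions for illustration, not the authors' actual IDL:

```python
# A sketch of the service interface the paper's IDL specification
# describes; signatures and types are assumptions, not the actual IDL.
class ProposeAndReviseService:
    def connect(self, kb_name: str) -> None:
        """Tell the method which knowledge base (via its mediator) to use."""
        self.kb_name = kb_name

    def get_inputs(self) -> list:
        """Return the run-time inputs the connected knowledge base requires."""
        # The molecular-biology tasks need none; the UHaul and elevator
        # tasks would list client-supplied parameters here.
        return []

    def solve(self, inputs: dict) -> dict:
        """Invoke the legacy method and return output variables and values."""
        # The real component marshals `inputs` into CLIPS, runs the legacy
        # propose-and-revise code, and converts the results back into
        # CORBA-transportable types; this stub returns an empty solution.
        return {}

svc = ProposeAndReviseService()
svc.connect("tRNA")
print(svc.get_inputs())  # → []
```

A client would call these operations in order: connect to a knowledge base, ask which run-time inputs are needed, gather them, then call solve.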
However, whether before or during inference, all queries are formed in terms of the method ontology: propose-and-revise does not include any knowledge about elevators, molecular biology, or UHaul equipment. Thus, we need mediators that translate between problem-solving methods and the knowledge bases that they are querying. Before we describe mediators, we present the design of a method-independent knowledge-base server component.

5.2. The Knowledge-Base Server Component
As seen in both Figures 1 and 5, an important component of our architecture is the notion of reusable knowledge bases that are available as independent resources. Thus, a knowledge base developed within one environment could be accessed and used in another environment, in spite of differences between the corresponding knowledge-representation languages. In support of this type of knowledge-base interoperation, researchers have developed a generic frame-based protocol for communication among knowledge-representation systems known as the Open Knowledge Base Connectivity (OKBC) protocol (Chaudhri, et al., 1998; see also related work of Karp, et al., 1995). This protocol formally defines the terminology and axioms for a frame-based system, with concepts such as class, slot, and instance, and then specifies a set of queries and functions that comprise an interface to any frame-based knowledge-representation system.
As mentioned in Section 2, our reuse research is within the context of the Protégé environment. Thus, our ontologies and knowledge bases are built and manipulated with Protégé-generated knowledge-editing tools, and all knowledge bases built with these tools share a simple, frame-based knowledge-representation language based on CLIPS. Because this language is frame-based, we can build a Protégé knowledge-base server that uses a subset of OKBC as its interface specification. Consistent with the rest of our architecture, this server is a CORBA component, and OKBC calls into a particular knowledge base are serviced via the CORBA standard for communication. The use of CORBA and OKBC allows for the broadest range of access by developers into any of our Protégé knowledge bases. Developers can make queries of frame-like knowledge bases with these commands; for example, they can retrieve classes via get-class-all-subs, slots via get-frame-slots, or instances of classes via get-class-all-instances. Currently, we have implemented only a subset of the OKBC accessor functions, in part because the semantics of the knowledge model assumed by OKBC differ somewhat from those of the knowledge model used by Protégé and CLIPS (Grosso, et al., 1998). Before we can build a complete OKBC server, we must resolve these differences, so that the semantics of arbitrary OKBC queries are interpreted correctly by the Protégé knowledge-base server. Although OKBC was not designed specifically for CORBA, it was designed as a communications layer, and therefore has been easy to convert into IDL. OKBC provides the semantic layer for accessing our knowledge bases; without OKBC, IDL provides only syntactic communication information. With OKBC, any component that understands the protocol and the assumptions of the OKBC knowledge model can access knowledge bases stored on our server.
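As a loose illustration of these accessors, the following sketch implements the three queries named above over a toy frame store. Real OKBC defines a much richer knowledge model and many more operations; the Python class and its storage scheme are entirely hypothetical.

```python
# Minimal frame store answering the three OKBC accessors named in the text.
class FrameKB:
    def __init__(self):
        self.subclasses = {}   # class name -> direct subclasses
        self.slots = {}        # class name -> slot names
        self.instances = {}    # class name -> direct instances

    def add_class(self, name, parent=None, slots=()):
        self.subclasses.setdefault(name, [])
        self.slots[name] = list(slots)
        self.instances.setdefault(name, [])
        if parent:
            self.subclasses.setdefault(parent, []).append(name)

    def add_instance(self, cls, inst):
        self.instances[cls].append(inst)

    def get_class_all_subs(self, cls):
        # Transitive closure over direct subclasses.
        subs = []
        for direct in self.subclasses.get(cls, []):
            subs.append(direct)
            subs.extend(self.get_class_all_subs(direct))
        return subs

    def get_frame_slots(self, cls):
        return list(self.slots.get(cls, []))

    def get_class_all_instances(self, cls):
        # Instances of the class and of all of its subclasses.
        insts = list(self.instances.get(cls, []))
        for sub in self.get_class_all_subs(cls):
            insts.extend(self.instances.get(sub, []))
        return insts
```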
As we describe in Section 5.3, we have implemented mediating components that use our knowledge-base server to retrieve information from knowledge bases for use by the propose-and-revise problem-solving method.

5.3. Mediating Components

For the propose-and-revise method to solve the two tasks described in Section 3, it must access two separate knowledge bases. Rather than custom-tailor the method to adapt to these separate knowledge bases, we isolate all adaptation of the method in mediating components. As shown in Figure 5, there is a separate mediator for each of the two tasks. In addition, we built a third mediator that allows the method to solve the UHaul task, and that accesses a third knowledge base that contains UHaul information. All mediators stand between the method and the knowledge-base server, as shown in Figure 6. On the method side, mediators provide a service for the problem-solving component, responding to method-specific queries for information; on the knowledge-base side, mediators are clients to the general-purpose knowledge-base server, sending queries to particular knowledge bases via OKBC. There may be several mediators that retrieve information from a single knowledge base for different methods, or, as in our example, several
mediators that service a single problem-solving method and that retrieve information from different knowledge bases.

Figure 6. A method, mediator, and knowledge-base server.

If mediators were large and expensive to build, our reuse architecture would not be cost-effective. If developers needed to connect N problem-solving methods with N knowledge bases, they would have to build N² mediators. Fortunately, it is rarely appropriate to connect all knowledge bases to all methods, since some methods simply cannot be applied to some knowledge bases. Furthermore, as seen in Figure 6, mediators all share the same structure, especially if they use either the same knowledge base or the same method. Building a set of mediators for the same component is therefore inexpensive. Because all three of our mediators service queries from the propose-and-revise method, they all share the same server-side IDL specification; they all receive the same requests for information from the method. In general, a mediator implements a method service via some set of OKBC queries to the appropriate knowledge base. The responses to these queries (received from the knowledge-base server) use the classes defined in the domain ontology for that knowledge base. The mediator must then translate or map this domain-specific data into the terms of the method ontology before returning it in response to queries from the propose-and-revise server. For example, propose-and-revise must receive the set of all constraints before starting the “solve” process that searches for solutions. The method component expects constraints to be in a generic format that includes an expression attribute that can be evaluated to true or false. To match this data-structure requirement to different knowledge bases, specific mediators must transform portions of the knowledge base to create constraints that match this format.
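One way a mediator might perform such a transformation, sketched under the assumption that the domain knowledge base stores separate bound slots; the function, slot names, and expression representation are illustrative, not the actual CLIPS/C++ implementation.

```python
# The method expects each constraint as a single boolean-valued expression;
# a mediator builds that expression from whatever slots the domain KB uses.
def make_constraint(var, lower=None, upper=None):
    """Combine separate bound slots into one expression attribute."""
    parts = []
    if lower is not None:
        parts.append(lambda state, lo=lower: state[var] >= lo)
    if upper is not None:
        parts.append(lambda state, hi=upper: state[var] <= hi)
    # AND the individual bound tests into one expression the method
    # can evaluate to true or false against a candidate state.
    return {"name": f"{var}-range",
            "expression": lambda state: all(p(state) for p in parts)}
```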
Thus, for the tRNA mediator, the upper-bound and lower-bound attributes in the tRNA knowledge base must be converted into a single expression attribute (an expression that ANDs together the upper-bound and lower-bound values). From the method component’s point of view, the mediator acts as a virtual knowledge base, conforming to the method ontology and responding to method-specific queries. However, each mediator may implement method services in different ways, translating and adapting information from the knowledge-base server to be compatible with the method. The transformation of constraint information described above is an example of a simple static mapping. In the next subsections, we describe in greater detail both this type of static mediation and a more dynamic, run-time mediation, as implemented by the three mediators we constructed for the propose-and-revise method.

5.3.1 Static Mediation
For any mediator for the propose-and-revise method, the initial request for information from the method is to initialize and load the knowledge base. For propose-and-revise, a large amount of data must be available before the problem solver can proceed: all information about constraints, fixes for those constraints, and state variables. Thus, as soon as any of our mediators receives a “LoadKB” request from the method, in addition to connecting to the correct knowledge base, the mediator immediately begins requesting and transforming information from the knowledge-base server for use by the method. A successful completion of the “LoadKB” request indicates both that the knowledge base has been successfully loaded from files into the Protégé knowledge-base server, and that all static mapping of the initial information has been completed by the mediator. Thus, the mediator copies and converts information from the server component into the mediator component. When the method component next makes requests, such as “get all constraints” or “get all fixes”, the mediator can supply this information directly, without additional requests from the knowledge-base server. For both molecular-biology problems, it is necessary to augment the problem-solving method with domain-specific procedural information—in particular, with knowledge about how to compute distances between helices. For the tRNA problem, this distance is computed between the tops and bottoms of the helices, based on the dimensions of each helix. The ribosome problem requires a somewhat different distance function: one that determines the distance between two points expressed relative to the local coordinate systems of the two helices. These local systems are related to the global coordinate system by translations and rotations. 
In both cases, these functions are not specified in the domain knowledge bases, but are instead stored in their respective mediating components, and are passed to the method as a set of CLIPS “domain functions” used to adapt the behavior of the method for specific knowledge bases. Because the entire body of these functions is passed to the method, the run-time invocation of the helix-distance computation can remain within the method component. The notion of domain-specific functional or procedural knowledge is not unique to propose-and-revise. In fact, we believe that our ability to specify this type of knowledge in mediating components will provide the developer with a powerful mechanism for adapting a method to a particular domain and knowledge base. Our aim is to support easy method adaptation, yet to provide method implementations that have no domain-specific knowledge.
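The static-mediation behavior described in this subsection (cache everything at “LoadKB” time, and hold domain procedural knowledge in the mediator rather than in the knowledge base) might be sketched as follows. The server interface, the query vocabulary, and the stand-in Euclidean distance function are all hypothetical.

```python
class InMemoryServer:
    """Stand-in for the OKBC knowledge-base server (illustrative only)."""
    def __init__(self, tables):
        self.tables = tables  # kb name -> {kind -> frames}

    def query(self, kb_name, kind):
        return self.tables[kb_name][kind]

class StaticMediator:
    def __init__(self, kb_server):
        self.kb_server = kb_server
        self.cache = {}

    def load_kb(self, kb_name):
        # One round of server queries up front; later method requests
        # are answered from the cache without touching the server.
        for kind in ("constraints", "fixes", "variables"):
            self.cache[kind] = self.kb_server.query(kb_name, kind)
        # Domain procedural knowledge lives in the mediator, not the KB,
        # and is handed to the method as a "domain function".
        self.cache["domain_functions"] = {"distance": self.distance}
        return True

    def get_all(self, kind):
        return self.cache[kind]

    @staticmethod
    def distance(a, b):
        # Stand-in for the helix-distance computation described in the text.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```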
5.3.2 Dynamic Mediation
Although propose-and-revise could be implemented to solve our configuration tasks with static mediation alone, this type of mediation is not sufficient for many problem-solving methods, and does not scale to applications with large knowledge bases. For some methods, the requests for information from a knowledge base may depend entirely on user-provided run-time inputs. Such methods need to be able to query the knowledge-base server at run time, and therefore need mediators that can process their requests dynamically. The choice between static and dynamic mediation depends on the problem-solving method and on the knowledge base. In general, dynamic mediation is appropriate for any application that needs only a small portion of a large knowledge base, whereas static mediation could be used for applications that need all or most of the knowledge base. For example, in the ribosome-configuration problem, the knowledge base included lists of possible helix locations based on the preprocessing sampling procedure. Some helices have several hundred possible locations. Since the propose-and-revise method halts as soon as a single solution is found, and since the search space is densely populated with solutions, it is likely that most of these possible locations will never be examined by the method. Therefore, our mediator allows the method to query for locations as needed at run time. At initialization time, our mediator sends only the initial location of each helix to the problem-solving method. Then, as the method discovers constraint violations and wishes to try another configuration of a helix, it makes a run-time query to receive the next location for a given helix. Unfortunately, knowledge about how to request a new helix position is clearly domain-specific information. If this knowledge is encoded into the problem-solving method, then that component is no longer as generic, and it will have fewer opportunities for reuse.
To keep the method domain-independent, we encode information about how to request a new helix location in the mediating component. Thus, the mediator adapts the generic problem-solving request of “fix a constraint violation” to the domain-specific request for a new helix location from a particular knowledge base. In particular, we have implemented the capability for run-time knowledge-base access via a flexible messaging protocol between the method and mediating components. This protocol allows the mediator to send a set of messages to the method component at initialization time; then, at run time, the method invokes a particular message at some point during processing. In our examples, the mediator informs the method at initialization time that “requesting a new helix location” is the appropriate message to use when fixing a constraint violation. At run time, the method sends this message, parameterized with a particular constraint violation, back to the mediator, which responds by querying the knowledge-base server for the particular helix location. All three of our mediators for propose-and-revise include some form of dynamic mediation. For the tRNA mediator, the same “get-object-location” message is declared. However, since the knowledge base for the tRNA task specifies sampling rates instead of explicitly listing helix locations, the mediator implementation of “get-object-location” does not make any queries to the knowledge-base server. Instead, helix-location information can be generated within the mediator from a logical location number plus the sampling rate. In this way, the mediator behaves as a virtual knowledge base, answering run-time queries from the method without accessing the knowledge-base server. For the UHaul mediator, the run-time query analogous to “get-object-location” is “get-upgrade”: a function that retrieves information from the knowledge base about how to upgrade a piece of equipment. For both the UHaul and elevator-configuration tasks, the knowledge base includes explicit information about how to upgrade parts along implicit dimensions of size, strength, or cost. For example, in the UHaul domain, there is a constraint violation if the vehicle storage capacity is insufficient for the customer’s needs. In this case, the equipment must be upgraded according to a sequence stored in the knowledge base. Thus, the “get-upgrade” message includes three parameters: the name of the class indicating the type of equipment, the slot name within that class containing the upgrade-sequence information, and the name of the old model. Our ability to build three different application systems with this dynamic messaging mechanism and with the same problem-solving method is a first demonstration of the versatility of our approach. The only requirement of messaging is that the method server include a mechanism for passing the message call back to the mediator. Developers are then free to implement arbitrary functionality in the mediating component for a given message call. This approach allows developers to adapt method components to new tasks, thereby encouraging more frequent method reuse in a wide range of applications.
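A minimal sketch of this messaging protocol, using the tRNA-style virtual knowledge base as the example. The message names follow the text; the sampling rate, handler bodies, and registration mechanism are invented for illustration.

```python
# Sketch of the run-time messaging protocol: at initialization the mediator
# tells the method which message handles constraint-violation fixes; at run
# time the method calls back with that message.
class DynamicMediator:
    def __init__(self, sampling_rate=2.5):
        self.handlers = {}
        # tRNA-style virtual KB: locations generated from a sampling rate,
        # with no query to the knowledge-base server at all.
        self.sampling_rate = sampling_rate
        self.next_index = {}

    def register(self, message, handler):
        self.handlers[message] = handler

    def initialization_messages(self):
        # Tells the method which message to invoke when fixing a violation.
        return {"fix-violation": "get-object-location"}

    def handle(self, message, *args):
        return self.handlers[message](*args)

    def get_object_location(self, helix):
        # Generate the next candidate location from the logical index.
        i = self.next_index.get(helix, 0)
        self.next_index[helix] = i + 1
        return i * self.sampling_rate

mediator = DynamicMediator()
mediator.register("get-object-location", mediator.get_object_location)
```

A UHaul-style mediator would register a “get-upgrade” handler in the same slot instead, without any change to the method component.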
6. Discussion and Future Work

The implementation that we presented in Section 5 is a proof-of-concept example of our approach for the construction of knowledge-based systems from reusable components. Our goal is to reduce the development time and cost of building large knowledge-based systems. With only a single example of reuse, we can neither make claims about the amount of work saved via reuse, nor measure the ease with which components can be found, retrieved, and adapted from the reuse library. However, our work is the first step needed to carry out these studies: over time, reuse cases must be built and components made available so that developers of knowledge-based systems can assess the value of an entire reuse library of components. We have argued that CORBA provides a useful communications standard that helps make components in a reuse library more accessible. The inability to share components across different development environments is a significant obstacle to cost-effective reuse. Until reuse libraries become large enough to help solve a wide variety of problems, it may not be cost-effective for developers to use such libraries: the costs of searching for, understanding, and adapting components are often higher than the benefit gained by reuse. The CORBA standard allows developers in different environments to contribute to a large library of components. Thus, we argue that this
software engineering standard should be embraced by developers of reuse libraries for knowledge-based systems. To achieve cost-effective reuse, we must increase the frequency with which components are reused, reduce the cost of finding and adapting components, and reduce the cost of adding new components to a reuse library. To meet these goals, we are extending our work, as we describe in the next three subsections. First, we are developing a language for describing and indexing problem-solving methods. Such a method-description language will reduce the cost of finding and understanding a legacy component from a library. Second, we are expanding our set of CORBA implementations of problem-solving methods, including methods that are decomposable into sub-methods. Third, we are investigating ways in which to reduce the cost of constructing the mediators that adapt components to new tasks.

6.1. A Method-Description Language

If a developer wishes to reuse an existing component to solve a new task, it is necessary for her to understand that component. If components are small and the semantics of components are well known, then it is easy for developers to reuse components. For example, any developer who understands trigonometry can reuse a component that computes cosine(x). However, for knowledge-based systems development, where components are typically large and complex, it is unreasonable to expect either that users will understand a component a priori, or that users can read and understand the software code that specifies a component’s behavior. Therefore, components in such a reuse library must be accompanied by a more abstract description of what the component does and how it does it. Developers can use this description to understand a component, and this understanding should help them determine whether or not a component is appropriate for their task.
Furthermore, if this description includes formal specifications, it would be possible to build tools that help developers select appropriate methods from a reuse library, and that validate whether or not a method (with the adaptations made by some mediator) is appropriate for a given problem and knowledge base. In the implementation described in Section 5, the only component descriptions that we used were the IDL specifications for each component. Although this supports component interoperability, it does not guarantee that developers share and understand the semantics of components. That is, IDL specifies the syntactic definition of the inputs and outputs, but it does not specify higher-level semantics such as the goals of the method, the assumptions the method makes about its inputs, or the type of problem the method is designed to solve. The need for method specifications to help organize and index a library of problem-solving methods has been recognized by many researchers (Angele, et al., 1996; Fensel & Groenboom, 1997; Motta & Zdrahal, 1998). One way to organize methods in a library is by the abstract task or problem type that each method addresses (Breuker & van de Velde, 1994). Example tasks might be planning, diagnosis, or prediction. Additionally, problem-solving methods can be indexed by
their goals and result types (Gil & Melz, 1996). Thus, the goal of propose-and-revise might be to “find a state where all constraints are satisfied,” whereas the result type might be a configuration of all state variables. However, while these organizations are useful, goals and task descriptions are not by themselves sufficient for describing software components in a reuse library. For our needs, we envision a method-description language designed to help developers select, understand, and adapt methods from the reuse library (Gennari, Grosso, & Musen, 1998). In addition to the syntactic interface information provided by IDL, this language should include the following elements:

• It would specify the goals or capabilities of the method. This might include a description of the problem type that the method is designed to solve.

• It would specify the constraints across inputs and outputs. This specifies what the method accomplishes, assuming it runs correctly, and is known as the competence of the problem-solving method (Akkermans, Wielinga, & Schreiber, 1994). For example, if propose-and-revise runs successfully, every output variable will have an assigned value.

• It would specify constraints about inputs and outputs. This captures the formal assumptions that the method makes about its inputs and outputs. For example, all input constraints for propose-and-revise must be computable; that is, they must be expressed in a language that allows an algorithm to evaluate them as true or false.

• It would include a description of the control flow within the method. As we describe in Section 6.2, problem-solving methods are often decomposed into a set of sub-methods. Therefore, the method-description language must include information about these sub-methods, including a specification of the control flow among them.

• It would include a history of usage of the problem-solving method. Although a history is more descriptive than analytic, it would provide examples of use from which developers could build.
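One possible concrete shape for such a description is a simple record with one field per element above; the field names and the propose-and-revise values are illustrative, not a proposed standard.

```python
from dataclasses import dataclass, field

# Hypothetical record mirroring the five elements of the envisioned
# method-description language.
@dataclass
class MethodDescription:
    name: str
    goals: list                 # capabilities / problem type
    competence: list            # constraints across inputs and outputs
    input_assumptions: list     # constraints about inputs and outputs
    control_flow: list = field(default_factory=list)    # sub-method ordering
    usage_history: list = field(default_factory=list)   # known uses

pr_description = MethodDescription(
    name="propose-and-revise",
    goals=["find a state where all constraints are satisfied"],
    competence=["every output variable has an assigned value"],
    input_assumptions=["all input constraints are computable"],
    control_flow=["Goalp", "Revise", "Transition"],
    usage_history=["elevator configuration", "tRNA modeling"],
)
```

A library index could then filter on `goals` and `input_assumptions` when a developer searches for a method.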
Our initial goal for a method-description language is to support the construction of mediators. A formal description of components and their requirements could help developers in three ways: (1) the specification could help developers find and select appropriate problem-solving methods, (2) the formal descriptions of constraints about inputs and outputs could be used to semi-automatically construct mediators, and (3) the specification (plus a proof engine) could be used to verify that a mediator is correct and complete with respect to a domain knowledge base. In some ways, a method-description language is similar to the specification language for Design Patterns (Gamma, Helm, Johnson, & Vlissides, 1995). Although design patterns differ substantially from problem-solving methods, a library of patterns and a library of methods both require means for developers to search for and understand appropriate patterns or methods. Thus, patterns are described in a structured manner, including features such as a “known uses” section that would be appropriate for either problem-solving methods or design patterns. Both sorts of reuse libraries are trying to specify information about the semantics of their elements: information beyond what is captured by an IDL specification, and at a more abstract level than what is communicated by source code. If the capabilities of problem-solving-method components can be specified, then these components can become agents that communicate and interact via the Knowledge Query and Manipulation Language, or KQML (Finin, McKay, Fritzson, & McEntire, 1994). If a set of agents share a common language, then KQML is the communication protocol for making and processing queries in this language. Ideally, one agent could broadcast information about the type of problem that the developer wants solved, another agent about the knowledge available in a knowledge-base server, and other agents about the capabilities of problem-solving methods. Thus, the KQML protocol would allow agents to associate appropriate problem-solving methods and knowledge bases with a particular problem, thereby obviating the need for a developer to search a reuse library for appropriate components. However, KQML provides only the query and response mechanism among agents, and does not specify the language used to communicate information about component capabilities or semantics. Thus, before we could use KQML, we must first complete the specification of our method-description language. The development of a method-description language is one of the aims of the KARL and New KARL work at the University of Karlsruhe (Angele, Decker, Perkuhn, & Studer, 1996; Fensel, Angele, & Studer, 1997). These languages provide formal and executable specifications of problem-solving methods. In fact, one of the aims of New KARL is to capture and differentiate knowledge about (1) inputs and outputs, (2) method decomposition into sub-methods, (3) control-flow information, (4) inference structure, and (5) pre- and post-conditions (Angele, et al., 1996).
This list of types of knowledge about a method is very similar to the sorts of information we need to capture in our method-description language. More recently, this line of research has been extended to include “adapters” that provide mappings between a domain model and the requirements and goals of a problem-solving method (Fensel & Groenboom, 1997). As we develop our language, we will continue to work with these researchers, sharing work and ideas where appropriate.

6.2. Decomposable Problem-Solving Methods

We believe that developers will be able to reuse problem-solving methods more easily if the methods are broken into sub-methods; that is, methods should be decomposable (Chandrasekaran, 1986; Steels, 1990; Musen, et al., 1996). For example, Figure 7 shows a decomposition of propose-and-revise into three sub-methods: Goalp, Revise, and Transition. This decomposition is not the only one possible; it is simply one that we have implemented for the elevator-configuration task (Rothenfluh, et al., 1996).

Solve: Find a state with all constraints satisfied. Inputs: constraints, variables with initial values. Output: a complete, consistent state (or failure).
Goalp: Check the state against all constraints; terminate if successful. Inputs: state, constraints. Output: success, or a state with violated constraints and proposed fixes.
Revise: Apply fixes and suggest new states. Inputs: a state with proposed fixes. Output: a set of possible states.
Transition: Propose a new state; backtrack if there is no new state. Input: a set of states. Output: the preferred state (if a dead end is reached, this state may be retrieved by backtracking).

Figure 7. The propose-and-revise method decomposed into three sub-methods.

Although our implementation included this decomposition of the method, in the reuse example of Section 5 the problem-solving method was wrapped as a single CORBA component. Instead, if each sub-method in Figure 7 were available as a separate CORBA component, developers could modify a method simply by interchanging one sub-method for another. For example, one developer may need an efficient implementation of the Revise sub-method that uses a dependency network to recalculate state variables when applying a fix. With a different task, such an implementation might not be desirable: efficient revision is unimportant for tasks with few state variables, and the requirement of a dependency network could be an unnecessary burden. If these two implementations of Revise (one with a dependency network and one without) could be wrapped as CORBA components with similar (or identical) IDL specifications, then the problem-solving method could be configured to use one or the other version of Revise, depending on the task. Thus, if methods are designed to be decomposable when they are added to the reuse library, developers can reuse large portions of the components, yet remain free to replace sub-methods that may not be appropriate for their task. This reuse scenario is just one that would be possible with a CORBA library of decomposable problem-solving methods. Another example that we are pursuing is the reuse of the temporal-abstraction problem-solving method (Shahar, 1997). This method takes time-stamped data as input and produces abstractions of those data as output; in the medical domain, it might take red-blood-cell counts over time, and infer that a patient is anemic as a resulting abstraction of those data. This method can be used alone for some tasks (as embodied in the Résumé system; Shahar & Musen, 1993), or it can be reused as a sub-method of a larger problem-solving method, to solve problems such as planning for medical care (Tu, et al., 1995). To date, we have reused temporal abstraction as a sub-method in a number of different configurations, but always within the CLIPS programming environment.
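To make the sub-method interchange of Figure 7 concrete, here is a hedged sketch of the control loop with Revise as a pluggable component. The state representation, the naive revision strategy, and the backtracking stack are invented for illustration and are much simpler than a real implementation.

```python
# Figure 7 decomposition as three pluggable callables.
def goalp(state, constraints):
    """Check the state against all constraints; empty result means success."""
    return [c for c in constraints if not c(state)]

def simple_revise(state, violated):
    # Naive fix: propose raising each variable by one for each violation.
    # A dependency-network version could be swapped in without touching
    # the rest of the loop.
    return [{**state, var: state[var] + 1} for c in violated for var in state]

def transition(states, stack):
    """Pick the preferred new state; keep alternatives for backtracking."""
    stack.extend(states[1:])
    return states[0] if states else (stack.pop() if stack else None)

def solve(state, constraints, revise=simple_revise, limit=100):
    stack = []
    for _ in range(limit):
        violated = goalp(state, constraints)
        if not violated:
            return state          # complete, consistent state
        state = transition(revise(state, violated), stack)
        if state is None:
            return None           # dead end with no alternatives
    return None                   # give up after the iteration limit
```

Swapping in a different `revise` implementation requires no change to `goalp`, `transition`, or the loop itself.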
We are working to wrap both temporal abstraction and our planning methods as CORBA components to be added to our reuse library. As we said earlier, if methods are decomposable, the method-description language must include information about how sub-methods are organized within the larger method. For example, the description of propose-and-revise would have to indicate the control flow among the three sub-methods. We are currently exploring different possible languages for expressing control-flow knowledge. Gil and Melz (1996) have implemented propose-and-revise as a method within their LOOM environment; their implementation includes both a decomposition of the method into a set of components and a language for describing the control flow among these components. However, just as our previous implementations are sharable only within the CLIPS system, their implementation is sharable only within the LOOM environment. We are currently collaborating with them to combine our CORBA implementations with a method-description language that incorporates elements from their implementation.

6.3. Component Adaptation

For every instance of reuse, there is some adaptation cost: the cost of adapting or reconfiguring the component to the task at hand. For example, when developers replace a sub-method (as in Section 6.2) or build a mediating component (as in Section 5) to apply a problem-solving method to a new knowledge base and task, they are adapting the component. We want to make reuse cost effective; therefore, our approach includes strategies for minimizing these adaptation costs. Some adaptation costs are minimized by our use of the CORBA standard, which allows developers to wrap legacy code with a standard API so that they do not have to work at the inter-process communications level when integrating and adapting legacy components. We have demonstrated our ability to wrap legacy code as a CORBA component with the propose-and-revise method. We are using this same approach to wrap the temporal-abstraction method and add that method to our library of reusable CORBA components. We believe that our use of mediating components also minimizes adaptation costs.
As shown in Figure 6, all our mediating components share a canonical form: an implementation of the set of services required by the problem-solving method, expressed in terms of a set of OKBC calls to a knowledge-base server. Thus, all mediating components for a given method component must offer the same set of services. From an implementation perspective, once a developer has built one mediator, successive adaptations to new domains become easier and easier, because all the mediators for these adaptations share so much structure. Thus far, our experience has been that the adaptation costs decrease as the developer gains more experience with the method and with our architecture. Our long-term goal is to reduce the adaptation cost by building tools that perform semi-automatic mediator construction. Unlike automatic programming, mediator construction should be a tractable problem, because the mediators share structure. For example, given the IDL of a problem-solving method, we can easily generate a skeletal mediator that provides stub definitions, based on the method IDL, for all queries that the method may make of the mediator. Better yet, given a formal method-description-language specification of the requirements of a method, we can generate actual mediating functions that map information in a knowledge base to those requirements.
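A skeleton generator of this kind might look roughly as follows; the parsed query signatures stand in for what an IDL parser would produce, and all names are hypothetical.

```python
# Hypothetical skeletal-mediator generator: from a list of (query, params)
# signatures, emit a mediator class with one stub per method query for the
# developer to fill in with OKBC calls.
def generate_skeletal_mediator(method_name, query_signatures):
    lines = [f"class {method_name}Mediator:"]
    for query, params in query_signatures:
        args = ", ".join(["self"] + list(params))
        lines.append(f"    def {query}({args}):")
        lines.append(
            f"        raise NotImplementedError('map {query} to OKBC calls')")
    return "\n".join(lines)

skeleton = generate_skeletal_mediator(
    "ProposeAndRevise",
    [("get_all_constraints", []), ("get_object_location", ["helix"])],
)
```

The emitted text is itself valid Python, so the developer's only remaining work is to replace each stub body with the appropriate knowledge-base queries.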
As a specific example, for queries where the method ontology matches the knowledge base fairly well, we may be able to provide an implementation for that query using knowledge of OKBC and the ontology for the knowledge base. A sophisticated tool might ask developers for corresponding terms in the knowledge-base ontology for particular method queries, and then might construct the appropriate OKBC requests of the knowledge-base server. The purpose of such a mediator-constructing tool is to decrease the adaptation costs associated with component construction. To understand more about how to decrease these costs, we must know what sorts of adaptation are typically required. Thus, we must build a library of reusable components, make them available to a variety of knowledge-based-system developers, and collect a set of reuse examples from which we can learn about typical component adaptations. The example of method reuse that we present in this paper represents our initial attempt to learn about reuse in our architecture for the construction of knowledge-based systems. Our architecture emphasizes the development of separate mediating components and the use of the CORBA standard for component communication. Both of these ideas support the use of legacy code, and allow developers to isolate their component adaptations. We presented only a single case of method reuse here; however, we are expanding both our library of reusable components and the set of example cases of component adaptation and reuse. By growing this set of reuse cases, we will eventually be able to measure the cost savings due to reuse, and to learn how to build additional tools that assist developers in reusing components in a cost-effective manner.

Acknowledgements

This work has been supported in part by the Defense Advanced Research Projects Agency, and by the National Science Foundation (#IRI-9257578). We would like to thank all members of the knowledge modeling group for their contributions to our ideas.
We would also like to thank three anonymous reviewers for constructive suggestions and Lyn Dupré for editorial assistance.
References

Akkermans, H., Wielinga, B., and Schreiber, G. (1994). Steps in constructing problem-solving methods. Proceedings of the Eighth Banff Knowledge Acquisition for Knowledge-Based Systems Workshop (pp. 29.1–29.21), Banff, Canada.

Altman, R. B., Abernathy, N. F., and Chen, R. O. (1997). Standardizing representations of the literature: Combining diverse sources of ribosomal data. Proceedings of the Fifth International Conference on Intelligent Systems for Molecular Biology (pp. 15–24), Halkidiki, Greece.

Altman, R. B., Weiser, B., and Noller, H. F. (1994). Constraint satisfaction techniques for modeling large complexes: Application to the central domain of the 16S ribosomal subunit. Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology (pp. 10–18), Stanford, CA.

Angele, J., Decker, S., Perkuhn, R., and Studer, R. (1996). Modeling problem-solving methods in New KARL. Proceedings of the Tenth Banff Knowledge Acquisition for Knowledge-Based Systems Workshop (pp. 1.1–1.18), Banff, Canada.

Bachant, J., and McDermott, J. (1984). R1 revisited: Four years in the trenches. AI Magazine, 5, 21–32.

Breuker, J. A., and van de Velde, W., Eds. (1994). The CommonKADS Library for Expertise Modeling. Amsterdam: IOS Press.

Breuker, J. (1997). Problems in indexing problem-solving methods. Workshop on Problem-Solving Methods for Knowledge-Based Systems (pp. 19–36), Nagoya, Japan.

Chandrasekaran, B. (1986). Generic tasks for knowledge-based reasoning: High-level building blocks for expert system design. IEEE Expert, 1(3), 23–30.

Chaudhri, V., Farquhar, A., Fikes, R., Karp, P., and Rice, J. (1998). The Open Knowledge Base Connectivity Protocol 2.02. [Online] Available: http://www.ai.sri.com/~okbc/spec.html, February, 1998.

Chen, R. O., Felciano, R., and Altman, R. B. (1997). RIBOWEB: Linking structural computations to a knowledge base of published experimental data. Proceedings of the Fifth International Conference on Intelligent Systems for Molecular Biology (pp. 84–87), Halkidiki, Greece.

David, J., Krivine, J., and Simmons, R., Eds. (1993). Second Generation Expert Systems. Berlin: Springer-Verlag.

Eriksson, H., Shahar, Y., Tu, S. W., Puerta, A. R., and Musen, M. A. (1995). Task modeling with reusable problem-solving methods. Artificial Intelligence, 79(2), 293–326.

Fensel, D., and Groenboom, R. (1997). Specifying knowledge-based systems with reusable components. Proceedings of the 9th International Conference on Software Engineering & Knowledge Engineering (SEKE-97) (pp. 349–357), Madrid, Spain.

Fensel, D., Angele, J., and Studer, R. (1997). The knowledge acquisition and representation language KARL. IEEE Transactions on Knowledge and Data Engineering.

Finin, T., McKay, D., Fritzson, R., and McEntire, R. (1994). KQML—A language and protocol for knowledge and information exchange. In K. Fuchi and T. Yokoi (Eds.), Knowledge Building and Knowledge Sharing. Ohmsha and IOS Press.

Gamma, E., Helm, R., Johnson, R., and Vlissides, J. (1995). Design Patterns: Elements of Reusable Object-Oriented Software. New York: Addison-Wesley.
Gennari, J. H., Altman, R. B., and Musen, M. A. (1995). Reuse with PROTÉGÉ-II: From elevators to ribosomes. Proceedings of the Symposium on Software Reuse (pp. 72–80), Seattle, WA.

Gennari, J. H., Grosso, W., and Musen, M. (1998). A method-description language: An initial ontology with examples. Proceedings of the Eleventh Banff Workshop on Knowledge Acquisition, Modeling and Management, Banff, Canada.

Gennari, J. H., Tu, S. W., Rothenfluh, T. E., and Musen, M. A. (1994). Mapping domains to methods in support of reuse. International Journal of Human-Computer Studies, 41, 399–424.

Gil, Y., and Melz, E. (1996). Explicit representations of problem-solving strategies to support knowledge acquisition. Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96) (pp. 469–476), Portland, OR.

Grosso, W., Gennari, J., Fergerson, R., and Musen, M. (1998). When knowledge models collide (how it happens and what to do). Proceedings of the Eleventh Banff Workshop on Knowledge Acquisition, Modeling and Management, Banff, Canada.

Gruber, T. R. (1993). A translation approach to portable ontology specifications. Knowledge Acquisition, 5, 199–220.

Guarino, N., and Giaretta, P. (1995). Ontologies and knowledge bases: Toward a terminological clarification. In N. J. I. Mars (Ed.), Towards Very Large Knowledge Bases (pp. 25–32). IOS Press.

Karp, P., Myers, K., and Gruber, T. (1995). The generic frame protocol. Proceedings of the 1995 International Joint Conference on Artificial Intelligence (pp. 768–774).

Marcus, S., Stout, J., and McDermott, J. (1988). VT: An expert elevator designer that uses knowledge-based backtracking. AI Magazine, 9(1), 95–112.

Motta, E., Stutt, A., Zdrahal, Z., O'Hara, K., and Shadbolt, N. (1996). Solving VT in Vital: A study in model construction and knowledge reuse. International Journal of Human-Computer Studies, 44, 333–401.

Motta, E., and Zdrahal, Z. (1998). A library of problem-solving components based on the integration of the search paradigm with task and method ontologies. International Journal of Human-Computer Studies. (same issue—please fill in volume and page numbers)

Musen, M., and Schreiber, A. T. (1995). Architectures for intelligent systems based on reusable components. Artificial Intelligence in Medicine, 6, 189–199.

Musen, M. A., Tu, S. W., Das, A. K., and Shahar, Y. (1996). EON: A component-based approach to automation of protocol-directed therapy. Journal of the American Medical Informatics Association, 3, 367–388.

Orfali, R., Harkey, D., and Edwards, J. (1996). The Essential Distributed Objects Survival Guide. New York: John Wiley & Sons.

Puerta, A. R., Egar, J. W., Tu, S. W., and Musen, M. A. (1992). A multiple-method knowledge-acquisition shell for the automatic generation of knowledge-acquisition tools. Knowledge Acquisition, 4, 171–196.

Rothenfluh, T. E., Gennari, J. H., Eriksson, H., Puerta, A. R., Tu, S. W., and Musen, M. A. (1996). Reusable ontologies, knowledge-acquisition tools, and performance systems: PROTÉGÉ-II solutions to Sisyphus-2. International Journal of Human-Computer Studies, 44, 303–332.

Schreiber, A. Th., Wielinga, B., Akkermans, J. M., van de Velde, W., and de Hoog, R. (1994). CommonKADS: A comprehensive methodology for KBS development. IEEE Expert, 9, 28–37.

Schreiber, A. Th., and Birmingham, W. P., Eds. (1996). Special issue on the Sisyphus-VT initiative. International Journal of Human-Computer Studies, 44, 275–568.
Shadbolt, N., Motta, E., and Rouge, A. (1993). Constructing knowledge-based systems. IEEE Software, 10, 34–38.

Shahar, Y. (1997). A framework for knowledge-based temporal abstraction. Artificial Intelligence, 90, 79–133.

Shahar, Y., and Musen, M. A. (1993). Résumé: A temporal-abstraction system for patient monitoring. Computers and Biomedical Research, 26, 255–273.

Steels, L. (1990). Components of expertise. AI Magazine, 11, 30–49.

Tu, S. W., Eriksson, H., Gennari, J. H., Shahar, Y., and Musen, M. A. (1995). Ontology-based configuration of problem-solving methods and generation of knowledge-acquisition tools: Applications of PROTÉGÉ-II to protocol-based decision support. Artificial Intelligence in Medicine, 7, 257–289.

Yost, G. R., and Rothenfluh, T. R. (1996). Configuring elevator systems. International Journal of Human-Computer Studies, 44, 521–568.