A Graph Oriented Approach to Enhance Reusability in *-bases
Martin Hitz, Hannes Werthner
Institute of Statistics and Computer Science, University of Vienna
Liebiggasse 4/3-4, A-1010 Vienna, Austria
Tel: (+431) 40103-2795, fax: (+431) 430197
Email:
[email protected],
[email protected]
Abstract
We present a graph oriented structure for a model base, together with a retrieval component for queries and navigation in this structure. It is designed to enhance reusability in an already implemented environment for simulation model prototyping, execution and optimization. While the approach exploits specific characteristics of its domain, generalizations to the retrieval of other object types (e.g. software components) are envisaged, hence the wild card in the title.
Keywords: Graph browsers, model bases, navigation in software bases, semantic networks, simulation environments
Workshop Goals: Definition of further research, exchange of experiences with similar approaches, learning.
Working Groups: Reuse and OO-methods, useful and collectible metrics, reuse terminology standards, design guidelines for reuse - C++.
Hitz- 1
1 Background
In [GHW88, GHW89] the authors present several versions of an integrated modelling, simulation and optimization environment which has been implemented at the Politecnico di Milano and the University of Vienna. One of the nuclei of this environment is a so-called model base populated by numerous simulation models. While the reuse of given components has already been solved in this environment by means of an intelligent aggregation mechanism, efficient retrieval of certain model classes and model instances now becomes an issue. To address this, a navigational/retrieval component has been designed and is currently being implemented. The augmented modelling environment will thus support four crucial aspects of reusability in the simulation field:
- Storage of models and model classes,
- Retrieval of models which satisfy the modeller's needs (as described in this paper),
- Modification of models as necessary,
- Composition of new aggregate models from existing parts, where semantic constraints are observed as far as possible.
2 Position
Regarding structural and semantic aspects, our approach exploits the specific characteristics of the simulation domain. However, we feel that its main principle can easily be adapted to software bases in general, namely, to use structural and semantic knowledge to reduce the set of possible reuse candidates as far as possible and to support associational navigation in the remaining subset of the inventory. This will be explained in some more detail in the following subsections.
While we consider the designed retrieval mechanism a rather useful tool to support reusability of elements stored in the model base (or, in general, in a "*-base"), it is unclear whether the retrieval power gained will in fact pay for the additional administrative overhead to maintain the base. Put differently, we believe that the support for retrieving existing components is powerful enough to discourage the frequently encountered approach to "re-do everything from scratch because it is cheaper (simpler, quicker) than looking for existing parts", provided the *-base is appropriately structured (i.e. all possible relationships are in fact stored where applicable and all non-mandatory properties/attributes of the nodes are well defined). However, we do not know whether the base will in fact be maintained accordingly. It is obvious that if the *-base converges toward the state of a flat (non-structured) repository because "orphan" elements forming isolated subgraphs are simply inserted, the benefit of the tool practically disappears. In order to avoid this situation, a set of measures must be defined and obeyed, which might include:
- forcing the user to specify as many aspects as reasonable when inserting elements,
- keeping maintenance as independent from an institutionalized administrator as possible,
- elaborating rules guiding the inference of virtual arcs (see subsection 2.1.2) to move the burden of inserting relationships from the user to the inference engine,
- rewarding the user for each insertion, for example, by displaying some metrics about the augmented repository, and
- defining obligatory instructions for the development process of a new entity (e.g. include the printout of the new node's environment into the documentation requirements).
In the remainder of this section, we first describe some structural aspects of the model base as defined in [GHW88, GHW89] and then present the basic principles of the retrieval component.
2.1 The Model Base
A model base serves as a repository for both simulation models and model classes. The variety of possible models is restricted, in the following, to structured input/output models of discrete, continuous or hybrid dynamic systems described by differential or difference equations [Zei76]. Such models can be envisaged as software particles mapping input time series to output time series. The operations defined on the model base are
- definition of a new model class (either from scratch or via "copy and edit"),
- deletion of a model class/model,
- editing of a model class/model definition, and
- instantiation of a model, i.e. creating a distinct executable simulation model.
Given these operations, retrieval becomes an issue when the model base is populated by several hundred classes and instances. The model bases implemented for the integrated environments described in [GHW88, GHW89] simply represent more or less unstructured sets of models and model classes. In order to introduce the retrieval components, we redefine the model base as a multigraph G(V,A), where the node set V contains both model classes and instances thereof, while the arc multiset A represents various relationships between models and/or model classes (see also [ACP89]). In the following, we describe the two sets in some more detail.
2.1.1 Models and Model Classes
The set of models stored is partitioned with respect to three types: basic (atomic) models, static (memoryless) models and compound (aggregate) models. Basic models have internal states and are responsible for the dynamic properties of the system under study, whereas static models are auxiliary components which provide a time-independent input-output mapping. Within this framework a model is defined by
- a type indicator (basic, static, compound × discrete, continuous, hybrid),
- a (possibly empty) set of input variables (each described by name, textual information, unit of measurement and a logical proposition defining the set of permissible values (called "value range" in what follows)),
- a set of output variables (name, textual description, unit of measurement, value range and the function which computes its value),
- a (possibly empty) set of parameters (name, textual information and default value).
Basic and compound models contain in addition to the above:
- a set of state variables (each described by name, definition of the state transition function, textual information, value range and default initial value),
- a default simulation method, i.e. a numerical procedure for calculating the trajectory of the state variables,
- a time step and unit for running the simulation.
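The attribute lists above map naturally onto plain record types. The following is only a minimal sketch of that idea in Python, not the data model of the implemented environment: the class and field names (`Variable`, `StateVariable`, `Model`, `in_range`, and so on) are hypothetical, value ranges are encoded as predicates by assumption, and aggregation information for compound models is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Variable:
    """Input or output variable; `compute` is only meaningful for outputs."""
    name: str
    description: str
    unit: str
    in_range: Callable[[float], bool]          # value range as a predicate (assumption)
    compute: Optional[Callable[..., float]] = None


@dataclass
class Parameter:
    name: str
    description: str
    default: float


@dataclass
class StateVariable:
    name: str
    transition: Callable[..., float]           # state transition function
    description: str
    in_range: Callable[[float], bool]
    initial: float                             # default initial value


@dataclass
class Model:
    name: str
    kind: str                                  # "basic", "static" or "compound"
    dynamics: str                              # "discrete", "continuous" or "hybrid"
    inputs: List[Variable] = field(default_factory=list)
    outputs: List[Variable] = field(default_factory=list)
    parameters: List[Parameter] = field(default_factory=list)
    # The remaining attributes apply to basic and compound models only.
    states: List[StateVariable] = field(default_factory=list)
    method: Optional[str] = None               # default simulation method
    time_step: Optional[float] = None
    time_unit: Optional[str] = None

    def validate(self) -> None:
        """Reject records that contradict the type indicator."""
        if self.kind not in ("basic", "static", "compound"):
            raise ValueError(f"unknown model type: {self.kind}")
        if self.dynamics not in ("discrete", "continuous", "hybrid"):
            raise ValueError(f"unknown dynamics: {self.dynamics}")
        if self.kind == "static" and (self.states or self.method or self.time_step):
            raise ValueError("static models are memoryless")


# Hypothetical example: a basic continuous model of a water tank.
tank = Model(
    name="tank", kind="basic", dynamics="continuous",
    inputs=[Variable("inflow", "water inflow", "m3/s", lambda v: v >= 0)],
    outputs=[Variable("level", "water level", "m", lambda v: v >= 0,
                      compute=lambda volume, area: volume / area)],
    parameters=[Parameter("area", "base area in m2", 2.0)],
    states=[StateVariable("volume", lambda volume, inflow: inflow,
                          "stored volume", lambda v: v >= 0, 0.0)],
    method="euler", time_step=1.0, time_unit="s",
)
tank.validate()
```

Keeping units and value ranges on each variable is what would allow a coupling mechanism to compare a connected output and input before accepting the link.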
Moreover, compound models possess aggregation information reflecting their internal structure. Aggregation is achieved by connecting output to input variables of appropriate models of any type. However, compatibility checks are performed on the connected variables with respect to their units of measurement and value ranges. Similar checks are performed on the time dimensions of two coupled models. While basic and compound models look similar within this framework, they differ in that the above attributes are given explicitly for basic models, whereas for compound models they are automatically deduced from the attributes of their components.
In addition to the essential features of models described above, we add several descriptors as suggested by [GW88] in the form of a thesaurus-guided keywording scheme, such as purpose and objective of the model, the domain of its application, etc. Together with the structural attributes defined above, these descriptors form a faceted classification scheme [PDF87, PD89].
As usual, a class is defined as a collection of instances with similar structure. That is, for a model class, all structural attributes (like number and dimensions of variables, the state transition function, etc.) are defined, while all value-oriented properties (values of parameters, initial conditions, etc.) are added only upon instantiation. For model classes, derivation with property inheritance is defined. A subclass inherits all structural information from its superclass, but may add new features and redefine output and state transition functions.
Whereas the previously described instances and classes deal with quantitative simulation models, the model base also contains qualitative simulation models (for a detailed description see [GNRW89], but also [KB84, For84, Kui84]). Such models can be seen as a qualitative abstraction with respect to structure, functions and value ranges of variables. One such model may be the qualitative abstraction of more than one numerical model.
2.1.2 Relationships
Any two elements of the model base may be connected by arcs of the following types, thus establishing the corresponding relationships:
instance-of: Arcs of this type connect models with their respective classes. These links may be traversed interactively and automatically. Automatic traversal takes place when traversal along another dimension is requested starting from a model instance. In this case, the branches offered to the user contain both explicit links and virtual links which are inferred by the browser via this relationship. For example, consider the situation of Figure 1. Classes A and B are related via the similar-to relationship (see below), while nothing is known about their instances a1 and b1. However, when looking for models similar to a1, the dashed arc is automatically inferred and presented to the user. The instance-of relationship is also used for maintenance purposes: When a model class is changed, the changes are automatically propagated to its instances, unless the instance-of arcs are flagged to inhibit this propagation.
Figure 1: Inferring virtual relationships
part-of: This represents the classical aggregation relationship. In the modelling domain, it is used to form compound models of arbitrarily deep nesting levels by connecting input/output ports of submodels. The part-of relationship may be expressed on the class level as well as on the instance level.
connected-to: While part-of relates objects of different aggregation levels, connected-to describes the linkage information between two component models (on the same level of aggregation). Connected-to is defined only for model instances, but also establishes a corresponding relationship on the class level (friend-of).
is-a: This is the classical generalization/specialization relationship. It holds if the participating model classes represent two descriptions of one and the same physical situation, one model being more detailed than the other, i.e. having more variables and parameters. Is-a features property inheritance, which may be exploited by both the query mechanism and the browser, which may again infer "virtual" arcs in a similar way as with the instance-of relationship.
friend-of: This is a virtual relationship. It holds between two classes A and B iff there is at least one pair of instances, a of A and b of B, for which the connected-to relation holds.
similar-to: This relationship establishes a semantic "affinity" metric: From the user's point of view, certain models may show a similar behavior. This relationship is weighted, i.e. a positive distance is given on each arc. Transitive closures are possible as discussed in [PT89, PDF87].
qualitative-abstraction-of: This relationship links qualitative models to corresponding quantitative models. It may again be used for the inference of virtual arcs.
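The inference of virtual arcs from stored ones can be illustrated with a small sketch. It is an illustration under stated assumptions rather than the browser's actual inference engine: the `ModelGraph` class and its method names are hypothetical, arcs are stored as (source, type, target) triples so that several arc types may connect the same pair of nodes, and similar-to is treated as symmetric, which the text does not state. The first part of the example reproduces the situation of Figure 1; the node names `Pump`, `Tank`, `p1` and `t1` are invented for the friend-of case.

```python
from typing import Dict, Optional, Set, Tuple

Arc = Tuple[str, str, str]  # (source, arc type, target)


class ModelGraph:
    """Typed arcs between models and model classes, with optional weights."""

    def __init__(self) -> None:
        self.arcs: Dict[Arc, Optional[float]] = {}

    def add(self, src: str, arc_type: str, dst: str,
            weight: Optional[float] = None) -> None:
        self.arcs[(src, arc_type, dst)] = weight

    def out(self, src: str, arc_type: str) -> Set[str]:
        """Targets of arcs of the given type leaving `src`."""
        return {d for (s, t, d) in self.arcs if s == src and t == arc_type}

    def into(self, dst: str, arc_type: str) -> Set[str]:
        """Sources of arcs of the given type entering `dst`."""
        return {s for (s, t, d) in self.arcs if d == dst and t == arc_type}

    def similar(self, node: str) -> Set[str]:
        # Assumption: similar-to may be followed in either direction.
        return self.out(node, "similar-to") | self.into(node, "similar-to")

    def virtual_similar(self, instance: str) -> Set[str]:
        """Instances reachable via instance-of, similar-to, instance-of."""
        found: Set[str] = set()
        for cls in self.out(instance, "instance-of"):
            for other in self.similar(cls):
                found |= self.into(other, "instance-of")
        return found - {instance}

    def friend_of(self) -> Set[Tuple[str, str]]:
        """Class pairs having at least one connected-to pair of instances."""
        pairs: Set[Tuple[str, str]] = set()
        for (a, arc_type, b) in self.arcs:
            if arc_type != "connected-to":
                continue
            for class_a in self.out(a, "instance-of"):
                for class_b in self.out(b, "instance-of"):
                    pairs.add((class_a, class_b))
        return pairs


g = ModelGraph()
# Situation of Figure 1: only the classes are known to be similar.
g.add("a1", "instance-of", "A")
g.add("b1", "instance-of", "B")
g.add("A", "similar-to", "B", weight=0.3)
# Hypothetical coupled instances for the friend-of case.
g.add("p1", "instance-of", "Pump")
g.add("t1", "instance-of", "Tank")
g.add("p1", "connected-to", "t1")

print(g.virtual_similar("a1"))  # the inferred (dashed) arc leads to b1
print(g.friend_of())            # Pump and Tank become friends
```

Because virtual arcs are computed on demand, nothing has to be stored for them, which fits the goal of moving the burden of inserting relationships away from the user.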
2.2 Model Retrieval
We distinguish two different retrieval requirements: Exact queries ("Give me the model/class with property X=a") and fuzzy queries ("I'd need something like..."). To cope with both types of queries, we design two distinct interfaces, which may collaborate with each other. The first interface supports "classical" queries regarding properties of models and model classes, while the second interface provides the user with an associative navigational tool to browse through the highly structured model base. Both interfaces operate either on the whole model base or on subsets thereof, stemming from previous retrieval steps.
2.2.1 Query Interface
The query interface is form-based with a similar "look and feel" as the model (class) definition/editing interface. It supports more or less arbitrarily complex query expressions over the structural and semantic attributes as defined in subsection 2.1.1, thus enabling the user to retrieve entities (models or model classes) satisfying the constraints imposed by a specific problem at hand. The result of a query, a set of entities together with the relationships holding between them (which are invisible at this stage), may be refined subsequently in the following manner: The set may be modified by the results of another query, combining the two results with standard set operations, or it may be explored interactively via the navigational interface as described below.
2.2.2 Navigational Interface
The navigational interface features an interactive graph browser. It inspects one "current" node at a time. This node is displayed together with its "environment", which is defined as the subgraph induced by all paths of a (parameterized) maximum length, starting from the current node and using arcs from a user-defined subset of arc types. The arcs used to establish this environment may either be explicitly stored in the model base or derived as sketched above ("virtual arcs"). The user may interact with the multigraph in the following way:
- Selection of a new "current" node and re-display of the environment. This is the main browsing activity.
- Including/excluding certain arc types for the subsequent interaction(s).
- Changing the diameter (the maximum path length) of the environment.
- Inspecting/editing node information (and/or arc information, where applicable).
- Combination with results of the query interface by highlighting nodes which are members of the respective result set or restricting the environment to the subgraph induced by this set.
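The environment of a current node and its combination with a query result can be sketched as a bounded breadth-first search. This is a minimal sketch under stated assumptions, not the browser's implementation: the function names `environment` and `restrict` are hypothetical, arcs are (source, type, target) triples, arcs are followed in both directions, and query results are represented simply as sets of node names.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Arc = Tuple[str, str, str]  # (source, arc type, target)


def environment(arcs: Iterable[Arc], current: str, max_len: int,
                arc_types: Set[str]) -> Tuple[Set[str], Set[Arc]]:
    """Nodes and arcs on paths of at most `max_len` arcs from `current`,
    using only arcs whose type is in `arc_types`."""
    incident: Dict[str, List[Arc]] = {}
    for arc in arcs:
        src, arc_type, dst = arc
        if arc_type in arc_types:
            # Assumption: an arc may be traversed from either end.
            incident.setdefault(src, []).append(arc)
            incident.setdefault(dst, []).append(arc)

    dist = {current: 0}
    used: Set[Arc] = set()
    queue = deque([current])
    while queue:
        node = queue.popleft()
        if dist[node] == max_len:
            continue  # paths may not be extended beyond the diameter
        for arc in incident.get(node, []):
            src, _, dst = arc
            other = dst if src == node else src
            used.add(arc)
            if other not in dist:
                dist[other] = dist[node] + 1
                queue.append(other)
    return set(dist), used


def restrict(nodes: Set[str], arcs: Set[Arc],
             result_set: Set[str]) -> Tuple[Set[str], Set[Arc]]:
    """Subgraph of the environment induced by a query result set."""
    kept = nodes & result_set
    return kept, {a for a in arcs if a[0] in kept and a[2] in kept}


arcs = [
    ("a1", "instance-of", "A"),
    ("A", "similar-to", "B"),
    ("b1", "instance-of", "B"),
    ("B", "is-a", "C"),
]
nodes, used = environment(arcs, "a1", 2, {"instance-of", "similar-to"})

# Two hypothetical query results combined with a standard set operation.
q1 = {"a1", "A", "b1"}
q2 = {"A", "b1", "C"}
result = q1 | q2

highlighted = nodes & result                  # nodes to highlight
sub_nodes, sub_arcs = restrict(nodes, used, result)
```

Selecting a new current node, changing the diameter, or toggling arc types then amounts to calling `environment` again with different arguments.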
3 Comparison
An approach put forth by Falkenhainer and Forbus [FF88, FF90, FF91] focuses on an aggregation process to build composite models out of model fragments. This process is guided by specific rules, namely the sets I (identification of individuals), A (simplifying assumptions), O (operating assumptions) and R (relations imposed by the fragment). These sets can be matched against a query Q to put together those components necessary to satisfy Q. However, the repository as a whole has a flat structure; consequently, no interactive navigational interface is provided.
Zeigler's multifacetted modeling approach [Zei84, Zei89] incorporates the knowledge about the domain in little pieces of information (composable model fragments), classified in a strictly hierarchical manner by two types of relationships (part-of and kind-of). Nodes may be either entities, specialisations or decompositional aspects. The pieces of knowledge which can finally be put together to create the new model are located at the leaves of the tree. Although the model fragments are better organized than in [FF91], the proposal covers only compositional aspects and lacks a retrieval component.
The same seems to be true for systems proposed in the classical areas of software reuse: Prieto-Diaz and Freeman do not describe a navigational interface in their well-known paper [PDF87, PD89]; instead, the metrics imposed on the conceptual graph are used to rank the items found by a query. Still, the user is presented with a flat list of candidates at the end of the session. Similarly, the Software Knowledge Base described by Meyer [Mey85], which also employs multiple link types in the graph organizing the search space, lacks the possibility to freely browse the graph.
To summarize, the major strength of our approach lies in its staged nature: In a first step, it enables the user to eliminate significant parts of the search space by defining a "filter" based on the structural and semantic properties of the entity looked for, as has been proposed by many authors. In a second step, however, the remaining part of the repository (i.e. the result of the query) may be inspected interactively whenever it is too large to be grasped at once, thus enabling the user to explore all semantic relationships defined until a suitable item is found.
References
[ACP89] S. Addanki, R. Cremonini, and J.S. Penberthy. Reasoning about Assumptions in Graphs of Models. In Proceedings of 11th IJCAI, Detroit, Mich., pages 1432-1438, 1989.
[FF88] B. Falkenhainer and K.D. Forbus. Setting up Large-Scale Qualitative Models. In Proceedings of AAAI-88, Saint Paul, Minn., pages 301-306, 1988.
[FF90] B. Falkenhainer and K.D. Forbus. Compositional Modeling of Physical Systems. In Proceedings of 4th Int. Workshop on Qualitative Physics, Lugano, CH, pages 1-15, 1990.
[FF91] B. Falkenhainer and K.D. Forbus. Compositional Modeling: Finding the Right Model for the Job. Artificial Intelligence, 51(1):95-143, 1991.
[For84] K.D. Forbus. Qualitative process theory. Artificial Intelligence, 24(1-3):85-168, 1984.
[GHW88] G. Guariso, M. Hitz, and H. Werthner. A knowledge based simulation environment for fast prototyping. In Proc. of the European Simulation Multiconference, Nice, June 1988, pages 187-192. SCS, 1988.
[GHW89] G. Guariso, M. Hitz, and H. Werthner. An intelligent simulation model generator. Simulation, 53(2):57-66, 1989.
[GNRW89] G. Guariso, L. Nisoli, A. Rizzoli, and H. Werthner. QUALSIM: An Object-Oriented Software Package for Qualitative Modelling and Simulation. In Proc. of the European Simulation Multiconference, Rome, June 1989, pages 175-180. SCS, 1989.
[GW88] G. Guariso and H. Werthner. A software base for environmental studies. Computer Journal, 31(6):550-553, 1988.
[KB84] J. De Kleer and J.S. Brown. A qualitative physics based on confluences. Artificial Intelligence, 24(1-3):7-83, 1984.
[Kui84] B. Kuipers. Commonsense reasoning about causality: Deriving behaviour from structure. Artificial Intelligence, 24(1-3):169-203, 1984.
[Mey85] B. Meyer. The Software Knowledge Base. pages 158-165, 1985.
[PD89] R. Prieto-Diaz. Classification of Reusable Modules. In T.J. Biggerstaff and A.J. Perlis, editors, Software Reusability, Volume I, Concepts and Models. ACM Press, 1989.
[PDF87] R. Prieto-Diaz and P. Freeman. Classifying software for reusability. IEEE Software, pages 6-16, 1987.
[PT89] X. Pintado and D. Tsichritzis. SaTellite: A Navigation Tool for Hypermedia. Technical report, Centre Universitaire d'Informatique, Université de Genève, 1989.
[Zei76] B.P. Zeigler. Theory of Modelling and Simulation. John Wiley, 1976.
[Zei84] B.P. Zeigler. Multi-Facetted Modelling and Discrete Event Simulation. Academic Press, 1984.
[Zei89] B.P. Zeigler. Object-Oriented Simulation with Hierarchical, Modular Models: Intelligent Agents and Endomorphic Systems. Academic Press, 1989.
4 Biography
Martin Hitz is Assistant Professor at the Institute of Statistics and Computer Science, University of Vienna. He received a Master's degree and Ph.D. in Computer Science from the Technical University of Vienna. His interests include information systems design, reusability, and OO programming and databases. He is an observing member of the ANSI X3J16 committee for the standardization of C++ and a member of the IEEE Software Reuse Working Group of the Software Engineering Standards Subcommittee.
Hannes Werthner is Assistant Professor at the Institute of Statistics and Computer Science, University of Vienna. He received a Master's degree and Ph.D. in Computer Science, both from the Technical University of Vienna. His main research interests include simulation environments, OO design, information systems and multimedia. For two years, he was also the principal software engineer in a private company managing the development of a distributed information system.