
Visualization of Reusable Software Assets

Omar Alonso
Oracle Corp., 500 Oracle Pkwy, Redwood Shores, CA.

William B. Frakes
Computer Science Dept., Virginia Tech, 7054 Haycock Rd, Falls Church, VA.

Abstract This paper presents methods for helping users understand reusable software assets. We present a model and architecture for visualizing reusable software assets. We describe visualization techniques, based on design principles, for helping the user understand and compare reusable components.

Keywords: Representation methods, Software assets, Information visualization, 3Cs, XML.

1. Introduction This paper presents methods for helping users understand reusable software components. This is important because if software engineers cannot understand components, they will not be able to reuse them [Frakes and Fox]. Current methods for representing reusable components are inadequate. A study of four common representation methods for reusable software components showed that none of the methods worked very well for helping users understand the components [Frakes and Pole]. Our approach for helping potential users understand reusable software components is to use visualization techniques. We describe a model and system for storing, retrieving, and visualizing components in a software repository. We argue that having a tool to visualize components in different ways can help users understand and integrate them into applications.

2. Related Work Research on visualization is quite active. Much of the work focuses on visualization of scientific data from the physical sciences. Research on software visualization often concerns algorithms and code. Algorithm representations and animations, for example, are commonly used in teaching and research [Stasko et al.]. Aspects of code such as function call and include relationships can be visually represented using tools such as CIA and CIA++ [Chen et al.]. There has also been considerable work on the development of notations to support various software design methods. [Baecker and Marcus] applied human factors and typography techniques to source code. They proposed a new software engineering approach called program visualization, emphasizing the importance of enhancing program representation, presentation, and appearance. They provided seventeen design principles and several design variations, and developed a graphic design manual for C along with its graphic parser. Unfortunately, they did not explore other program metrics that could go beyond improving program presentation. SeeSoft [Eick et al.], [Ball and Eick] allows the analysis of statistical data from large systems. SeeSoft introduced new techniques for visualization and analysis of source code that can be summarized in four ideas: reduced representation, coloring by statistic, direct manipulation, and capability to read actual code. Eick’s group is part of Visual Insights¹, which provides ADVIZOR, a set of components for interactive data visualization.

3. Find and Understand Users typically search for components by submitting queries to systems that retrieve assets. The user evaluates the output and, if necessary, refines the query. The reuse search and retrieval problem is well understood and there are several summaries of approaches [Frakes and Gandel], [Mili et al.]. Our approach is to emphasize the understanding process, assuming a known search method.

As a basis of understanding we use the 3Cs model [Latour et al.]. The 3Cs model of reuse design provides a high-level framework that has been found useful in the design of reusable assets. The model indicates three aspects of a reusable component: its concept, its content, and its context. The concept specifies the abstract semantics of the component, the content specifies its implementation, and the context specifies the environment necessary to use the component. For a software component, the concept might correspond to an abstract data type (ADT), whose implementation might be a C program. This component's context might require a workstation running UNIX, and a GNU C compiler.

Figure 1 shows a scenario in which a user needs assets. Using a search system, the user will query the repository and get, if found, a list of assets that could answer their needs. At this point the user must understand the assets to perform a good evaluation of them. Using visualizations as a representation of the attributes and values of the assets, including concept, content, and context, the user can have a better understanding of the components.

For example, let’s assume that the user is looking for string searching components. After performing the query, the search system returns a list of all the matching string searching assets found in the repository. Each asset has 3Cs (concept, content, and context), which are ordered: concept precedes content and content precedes context. If the user can’t understand the concept of the component, there’s no interest in its content or its context. If the user’s concept of a string searching component differs from the concepts in the string searching assets, the user should reformulate the query. If the asset concept is clear to the user, the next step is to understand its content (the type of algorithm the code implements, type of text target, etc.). Finally, if the content suits the initial requirement, the user will consider the context (the version of compiler, operating system, expected running time, etc.). The identification of concept, content, and context is not always easy, and sometimes there are no clear boundaries between them. For example, an executable specification may be considered as both a concept and content.
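The ordered evaluation described above (concept first, then content, then context) can be illustrated with a minimal Python sketch. This is not the system described in this paper; the `Asset` class, the `first_mismatch` function, the acceptance predicates, and all example values are hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Asset:
    """A reusable asset described by the 3Cs (field values are illustrative)."""
    name: str
    concept: str   # abstract semantics of the component
    content: str   # its implementation
    context: str   # environment necessary to use it


# The 3Cs are examined in this fixed order.
ORDER = ("concept", "content", "context")


def first_mismatch(asset: Asset,
                   accepts: Dict[str, Callable[[str], bool]]) -> Optional[str]:
    """Return the first of the 3Cs the user rejects, or None if all pass.

    Later aspects are not examined once an earlier one fails, mirroring the
    rule that concept precedes content and content precedes context.
    """
    for aspect in ORDER:
        if not accepts[aspect](getattr(asset, aspect)):
            return aspect
    return None


# Hypothetical asset and user requirements.
kmp = Asset("KMP.c", "exact string searching",
            "Knuth-Morris-Pratt in C", "UNIX, GNU C compiler")
user = {
    "concept": lambda c: "string searching" in c,
    "content": lambda c: c.endswith("in C"),
    "context": lambda c: "UNIX" in c,
}
print(first_mismatch(kmp, user))
```

A result of `"concept"` would correspond to the case in which the user should reformulate the query, while `None` means the asset passed all three checks.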

¹ http://www.visualinsights.com

[Figure 1 (diagram). Node labels: User, Information Need, Query (e.g. "string search"), Search System, Assets, Attributes and Values, Visualizations, Understand and Evaluate, Concept, Content, Context, Algorithm Spec, KMP.c. Edge labels: has, submits, entered, returns, have, represented by, presented to, AKO.]

Figure 1. Reuse understanding scenario.

4. Visualization Reference Model [Card et al.] define visualization and information visualization as follows. Visualization is the use of computer-supported, interactive, visual representations of data to amplify cognition. Information visualization is the use of computer-supported, interactive, visual representations of abstract data to amplify cognition. Visualization is the process of creating a visual representation for data and information that is not inherently spatial. In the rest of this section we describe the reference model for visualization proposed by [Card et al.]. This reference model is useful because it is simple and supports comparison of different information visualization systems. Figure 2 shows the reference model. Arrows flow from Raw Data to the human, indicating a series of data transformations. Each arrow might indicate multiple chained transformations. Arrows flow from the human at the right into the transformations themselves, indicating the adjustment of these transformations by the user. Data Transformations map Raw Data, that is, data in some specific format, into Data Tables, relational descriptions of data extended to include metadata. Visual Mappings transform Data Tables into Visual Structures, structures that combine spatial substrates, marks, and graphical properties. Finally, View Transformations create Views of the Visual Structures by specifying graphical parameters such as position, scaling, and clipping. The core of the reference model is the mapping of a Data Table to a Visual Structure.
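The chain of transformations in the reference model can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an implementation from [Card et al.] or from this paper: the raw input format, the `loc` column, the file name `BM.c`, the text-bar rendering, and all function and class names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Raw Data: data in some specific format (here, hypothetical "name,loc" lines).
RAW = ["KMP.c,120", "BM.c,200"]


@dataclass
class DataTable:
    """Relational rows plus metadata that describes the columns."""
    columns: Tuple[str, str]
    rows: List[Tuple[str, int]]
    metadata: Dict[str, str]


@dataclass
class Mark:
    """One element of a Visual Structure: a labeled bar with a length."""
    label: str
    length: int


def data_transformation(raw: List[str]) -> DataTable:
    """Raw Data -> Data Table."""
    rows = []
    for line in raw:
        name, loc = line.split(",")
        rows.append((name, int(loc)))
    return DataTable(("name", "loc"), rows,
                     {"name": "nominal", "loc": "quantitative"})


def visual_mapping(table: DataTable) -> List[Mark]:
    """Data Table -> Visual Structure (one bar per row)."""
    return [Mark(name, loc) for name, loc in table.rows]


def view_transformation(marks: List[Mark], units_per_char: int = 10) -> List[str]:
    """Visual Structure -> View; `units_per_char` is a user-adjustable scale."""
    return [f"{m.label:8}{'#' * (m.length // units_per_char)}" for m in marks]


view = view_transformation(visual_mapping(data_transformation(RAW)))
print("\n".join(view))
```

Changing `units_per_char` plays the role of the arrows that flow from the human back into the transformations: the user adjusts a parameter and the view is recomputed from the same visual structure.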

[Figure 2 (diagram). Stages: Raw Data, Data Table, Visual Structures, Views. Transformations: Data Transformations, Visual Mappings, View Transformations. Group labels: Data, Visual Form.]

Figure 2. The reference model for visualization.

The main goal behind transforming raw data into a data table is that it is easier to map a data table into a visual structure. A data table combines relational data with metadata that describes them. For example, a relation: { ,