The Software Concordance: A User Interface for Advanced Software Documents

Satish Chandra Gupta, Tien N. Nguyen, and Ethan V. Munson
Department of EECS, University of Wisconsin-Milwaukee

This research was supported by the U. S. Department of Defense and by NSF CAREER award CCR-9734102.

ABSTRACT
The Software Concordance is a hypermedia software development environment exploring how document technology and versioned hypermedia can improve software document management. The Software Concordance's central tool is a document editor that integrates program analysis and hypermedia services for both source code and multimedia documentation in XML. The editor allows developers to embed inline multimedia documentation, including images and audio clips, into their program sources and to bind them to any program fragment. Web-style hyperlinks are also supported. Developers can move seamlessly between source code and the documentation that describes its motivation, design, correctness and use without disrupting the integrated program analysis services, which include lexing, parsing and type checking. This paper motivates the need for environments like the Software Concordance, describes the key features and architecture of its document editor and discusses future work to improve its functionality.

KEY WORDS
software engineering environments, versioned hypermedia

1 Introduction

The software development life cycle consists of many phases: requirement analysis, high-level design, low-level design, development, testing, deployment, and maintenance. A software system primarily consists of source code and formal specifications created during the development phase. Other phases generate a variety of documents such as requirement specifications, design documents, test cases, bug reports, user manuals, and deployment and transition procedures. These documents are used to develop the source code, test its correctness, and maintain it. Since they have implicit or explicit structure, they can be represented as structured documents. In a broader sense, source code is also a structured document that specifies a sequence of instructions to be executed by a computer to perform a task. Similarly, other formal specifications are also structured documents.

Structured documents can be divided into two categories [14]: formal and informal. Formal documents include program source code and formal specifications. Informal documents include all other documents. A formal document is written using a formal language, and formal languages have precisely defined semantics. Therefore formal documents can be understood and analyzed by tools such as compilers. An informal document is written using a natural language, which does not have precisely defined semantics. Therefore, the process of understanding informal documents cannot be automated, and human intervention is required.

Modern software systems are highly complex. A software engineer spends a significant amount of time reading and understanding informal documents in order to develop and maintain formal documents. In practice these two sets of documents are maintained separately. Informal documents are usually rich in formatting, and are managed using a word processor such as Microsoft Word or FrameMaker. On the other hand, formal documents, usually stored as plain ASCII text, are maintained by a text editor.

A typical scenario is that while performing a test case, a bug in a particular functionality is found, and a bug report is generated. In order to fix this bug, an engineer needs to read the bug report, test case and requirements specification in order to find the corresponding design document. Then, the engineer analyzes the appropriate piece of source code, makes modifications, recompiles, and tests the system. Since informal and formal documents are maintained separately by different editors, the process has inefficiencies. These inefficiencies can be removed by using a structured document editor, which unifies the navigation and editing of formal and informal documents and can use hypermedia technology to maintain the relationships between these documents. Moreover, the power of multimedia can be exploited to enrich the documentation of the source code.
As a program source editor, it should allow developers to embed multimedia documentation and hyperlinks in their source code, while still supporting program analysis services such as lexing, parsing, and type checking. Such an editor requires a uniform document model for informal and formal documents. The model should support editing and program analysis services, and should allow access to any fragment of the document to create hyperlinks.

In programming language tools, the Abstract Syntax Tree (AST) is often used as a standard representation for formal documents. With the proliferation of the Web, XML [23] is in widespread use to store data and informal documents. Since XML also has a tree structure, a tree model is suitable for both formal and informal documents.

A tree model might be suitable for storing and analyzing documents, but humans are more adept at understanding, navigating, and editing a textual presentation of a document. To further assist a user, various style techniques such as use of different fonts and colors, indentation, and elision of information can be used while creating a presentation. Style information can also be used to provide several different presentations of the same document simultaneously. This can help the user by providing alternate views of the document suited to particular tasks. For example, while editing a presentation showing the full body of a Java class, it may be helpful to have a separate presentation that shows only the class interface and hides method implementations. The interface-only view can be used both to show the full class interface and as a navigational tool that causes the full-class view to jump to the selected method, similar to the navigational frames on many web sites.

In order to support multiple presentations of a document, style information has to be kept separate from the document. This separate representation is called a style sheet. The document editor can apply a style sheet to a document tree to build a presentation. When the user edits a presentation, the editor in turn manipulates the underlying document. Multiple presentations can be achieved by having multiple style sheets. If there are multiple presentations of a document, they need to be synchronized. That is, when the user edits one presentation, the changes should be reflected in all other presentations as well.
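The idea of applying different style sheets to one document tree can be illustrated with a minimal Java sketch. Everything in it is hypothetical and chosen only for illustration: the names DocNode, StyleSheet, and Presenter, the node kinds, and the notion that a style sheet is just a set of hidden node kinds plus an indentation unit. It is not the Software Concordance's actual style sheet format.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical document node: a kind (e.g. "class"), some text, and children. */
final class DocNode {
    final String kind;
    final String text;
    final List<DocNode> children = new ArrayList<>();

    DocNode(String kind, String text) {
        this.kind = kind;
        this.text = text;
    }

    DocNode add(DocNode child) {
        children.add(child);
        return this;
    }
}

/** Hypothetical style sheet: presentation rules kept apart from the document. */
final class StyleSheet {
    final Set<String> hiddenKinds; // node kinds elided from the presentation
    final String indentUnit;       // indentation added per tree level

    StyleSheet(Set<String> hiddenKinds, String indentUnit) {
        this.hiddenKinds = hiddenKinds;
        this.indentUnit = indentUnit;
    }
}

/** Builds a textual presentation by walking the tree under a given style sheet. */
final class Presenter {
    static String present(DocNode root, StyleSheet style) {
        StringBuilder out = new StringBuilder();
        render(root, style, 0, out);
        return out.toString();
    }

    private static void render(DocNode node, StyleSheet style, int depth, StringBuilder out) {
        if (style.hiddenKinds.contains(node.kind)) {
            return; // elide this node and its whole subtree
        }
        for (int i = 0; i < depth; i++) {
            out.append(style.indentUnit);
        }
        out.append(node.text).append('\n');
        for (DocNode child : node.children) {
            render(child, style, depth + 1, out);
        }
    }
}

final class StyleSheetDemo {
    static DocNode sampleClass() {
        return new DocNode("class", "class AVLTree")
            .add(new DocNode("method", "Node simpleLeft(Node n)")
                .add(new DocNode("body", "{ /* rotation code */ }")));
    }

    public static void main(String[] args) {
        DocNode doc = sampleClass();
        StyleSheet full = new StyleSheet(Collections.<String>emptySet(), "  ");
        StyleSheet interfaceOnly = new StyleSheet(Collections.singleton("body"), "  ");
        System.out.print(Presenter.present(doc, full));          // full-class view
        System.out.print(Presenter.present(doc, interfaceOnly)); // method bodies hidden
    }
}
```

Because both views are computed from the same tree, the interface-only view is simply the full view minus the elided subtrees, which is why a single document can back several presentations at once.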
In the case of formal documents, even if a tree document model is used, program analysis services and editing services have conflicting requirements for the internal representation of the document tree. Program analysis needs the tree to be concise to do the analysis efficiently and therefore an AST is the best representation. On the other hand, editing services need all lexemes to be present in the document tree in order to efficiently build a high quality presentation of a program. Most program analysis systems use the AST as the internal representation of a program. In order to use such an existing infrastructure, the AST must be used as the document model for formal documents, and a facade needs to be built over the AST to support textual editing. The facade can have essentially the same tree structure as the AST, but all the lexemes need to be present. Such a facade is similar to a parse tree.

The Software Concordance is a software development environment designed to explore the use of versioned hypermedia technology in improving software document management tools. It includes a structured document editor for both Java source code and XML documents. The next section describes the architecture of the SC editor and its key features. Then, related work on integrated development environments (IDEs) is presented. The paper closes with a discussion of future work and a brief conclusion.
2 Software Concordance System

The Software Concordance (SC) project is motivated by the observation that software engineers have difficulty maintaining semantic consistency among their documents. Our research is focusing on two particular sources of difficulty.

First, formal documents interoperate poorly with the informal documents. Formal documents are written and managed as simple text documents using specialized text editors, often in IDEs that include interactive program analysis. Informal documents are produced using a variety of tools such as word processors, graphics editors, and specialized environments for project planning or graphical design languages such as UML [19]. In general, the informal documents and the tools used to produce them interoperate well. However, these tools do not interoperate with the IDEs used to edit formal documents.

Second, the relationships among software documents are often implicit and thus cannot be browsed, navigated, queried, or analyzed systematically. This is particularly true for formal documents, whose lexical and syntactic rules hinder the use of hyperlinks, the most natural means for representing relationships among documents.

To address these problems, we seek to demonstrate that it is possible to create a new kind of software development environment in which:

• All software documents, including source code, are Web-compatible and support hyperlinks and embedded multimedia elements as documentation.
• Some software documents (such as source code) will be suitable for specialized analyses (such as compilation), but these analyses will not hinder interoperability.
• A variety of document views will be available in order to support the variety of tasks that a programmer performs. For example, programmers may want to hide distracting multimedia documentation or may benefit from novel "fish-eye" presentations of source code.
• The environment will provide analyses and visualizations of the graph of hyperlinks among the software documents so that developers can answer important questions related to such issues as requirements and bug tracing.
• Programmers will continue to have high-quality, interactive program analysis built in to their editors and will not be forced to use unnatural top-down (or syntax-directed) editing techniques.

This vision for software development is similar to Knuth's idea of literate programming [11]. Where Knuth envisioned a single document encompassing both the implementation and documentation of a system, the SC pictures a large collection of documents, linked with hypermedia technology into a literate whole.
Figure 1. Screen shot of the Software Concordance System
2.1 The Software Concordance editor

The project's first application is an editor for Java programs and XML documents. The Software Concordance editor is implemented in Java with the Swing user interface library, and like most Java programs, runs in a variety of environments. Program analysis (lexing, parsing, and type checking) is provided by the Fluid program analysis system [5].

A screen dump from an editing session is shown in Figure 1. This session's rightmost window shows part of the implementation of an AVLTree class, specifically the simpleLeft method. The source code has been pretty-printed automatically using information from syntax analysis. The details of how the code is pretty-printed are controlled by a simple style sheet. Appearing above the method's source code is some multimedia documentation, including a diagram of the simple left rotation operation, a textual description (but not a comment), and buttons that control an audio clip of the implementor explaining details of the implementation.

On the left, the developer has another window open on the same source code document. This window shows the entire class, but using a style sheet that hides the method bodies and multimedia documentation so that only the class interface is visible. This is simply another view of the same information shown in the rightmost window. The two views are different only because they are presented using different style sheets. The developer can edit the document through either window and any changes will appear in both.
The center window shows a design document for the AVLTree class. This window was brought up because the developer clicked on the word “simpleLeft” in one of the other windows. This word happens to be a hyperlink that points to the “Simple Left Rotation” heading in the design document. This heading is itself a hyperlink that can be followed back to the simpleLeft method in the source code. The close integration between source code and design documents seen in this example is possible because the Software Concordance editor uses a uniform document model for all software documents. Both Java and XML documents are represented using the AST data structures of Fluid, but the editor accesses these structures through a facade that hides most of the complexities of program analysis. Gupta [7] describes the document model and its implementation in detail. The SC editor is the first editing system with integrated analysis services to support inline multimedia documentation and hyperlinks in source code.
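The role of a facade over a concise AST can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the AstNode and LexemeFacade classes, the three operator names, and the token rules are all hypothetical, and none of this is the Fluid API or the document model described by Gupta [7]. The sketch shows an AST that stores no keywords or punctuation, and a facade that supplies those lexemes when a textual view is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical concise AST node: an operator name, an optional value, and children. */
final class AstNode {
    final String op;              // "return", "binary", or "name" in this sketch
    final String value;           // identifier or operator symbol; empty when unused
    final List<AstNode> children;

    AstNode(String op, String value, AstNode... children) {
        this.op = op;
        this.value = value;
        this.children = Arrays.asList(children);
    }
}

/** Hypothetical facade: follows the AST's shape but makes every lexeme visible. */
final class LexemeFacade {
    /** Returns the full lexeme sequence for a node, including tokens the AST omits. */
    static List<String> lexemes(AstNode node) {
        List<String> out = new ArrayList<>();
        collect(node, out);
        return out;
    }

    private static void collect(AstNode node, List<String> out) {
        switch (node.op) {
            case "name":
                out.add(node.value);
                break;
            case "binary":
                collect(node.children.get(0), out);
                out.add(node.value);           // operator symbol between operands
                collect(node.children.get(1), out);
                break;
            case "return":
                out.add("return");             // keyword is not stored in the AST
                collect(node.children.get(0), out);
                out.add(";");                  // neither is the terminator
                break;
            default:
                throw new IllegalArgumentException("unknown operator: " + node.op);
        }
    }

    /** Unparses a node by joining its lexemes with single spaces. */
    static String unparse(AstNode node) {
        return String.join(" ", lexemes(node));
    }
}

final class FacadeDemo {
    public static void main(String[] args) {
        AstNode stmt = new AstNode("return", "",
            new AstNode("binary", "+", new AstNode("name", "a"), new AstNode("name", "b")));
        System.out.println(LexemeFacade.lexemes(stmt));
        System.out.println(LexemeFacade.unparse(stmt));
    }
}
```

The analysis side keeps working on the three-node AST, while the editing side sees five lexemes; a real facade would also have to carry layout and documentation, which this sketch leaves out.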
2.2 The Software Concordance architecture

2.2.1 General architecture

The Software Concordance supports two basic file representations, XML and Java. XML documents that do not represent Java programs are stored normally. Java programs are stored in a special XML format that is compatible with our program analysis and hypermedia services. Java source code files in traditional raw ASCII text can also be imported into the Software Concordance system, and Java sources can be exported in either our XML-based format or ASCII. A style sheet file contains the style information needed to determine a document's visual appearance (i.e., a presentation).

To support fine-grained document access, the notion of a structural unit is introduced. A structural unit is a fragment of a document that is used to encode a concept such as a paragraph or a section in XML documents, or a statement, declaration or class in programs. It is represented by a Software Concordance document node.

Figure 2 shows the architecture of the Software Concordance system. The Software Concordance consists of the following sub-systems: the Fluid system API, the document API, the style sheet system, the presentation system, the editing system and the user interface (UI). The document API implements the document model on top of the Fluid system API and provides document navigation and manipulation services to the rest of the system. The style sheet system maintains style information, which is used by the presentation system to build a presentation for a document. The editing system provides document editing services. A user interacts with the system through the UI.
[Architecture diagram. Components: User Interface; Style Sheet System; Presentation System; Editing System; Document API, with its Document Event Management, Structural Navigation, Document Manipulation, and I/O interfaces; Fluid System API. Labeled flows: Screen Layout, User Events, Editor Invocation Request, Scroll Request, Style I/O Request, Style Sheet, Presentation Request, Presentation, Style Info, Edit Request, Screen Location, Node, Event Notification Request, Listener Registration Request, Document Change Request, Document Tree, Document, Document I/O Request, AST Request, AST.]
Figure 2. Software Concordance's architecture

2.2.2 Using the Software Concordance editor

A user interacts with the system using menus, a tool bar, and contextual pop-up menus. Suppose the user opens a document. The document API parses the file, constructs a document tree, and reports parse errors, if any, to the UI. To display a document, the user must select a style sheet file. The UI sends a style I/O request to the style sheet system to parse the style sheet. After the style sheet database has been built, the UI sends a presentation request to the presentation system and passes it the document and the style sheet. The presentation system uses the document tree and the style information to build the presentation. It also registers the presentation with the document as a document listener. The UI displays each of these presentations.

To edit a document, the user moves the mouse and selects any structural unit of the document that needs to be edited. Then, via the commands in the pop-up menu, she can choose to edit the content of the document node presented in the selected portion of the presentation, or to edit the documentation associated with that node. The UI sends an edit request to the editing system. The editing system uses the screen location where the request was initiated to get the corresponding document node from the presentation system.

If the user wants to edit the content of the node, then the editing system invokes the node editor. The node editor is a simple text editor. It unparses the node and displays the resulting textual representation of the node to be edited. The user edits the text and commits the changes. The system incrementally parses the modified text, creates new nodes and attaches them to the document tree. To edit the textual documentation associated with a node, the user invokes a comment editing dialog. To associate image or audio documentation with a node, the user invokes an image or audio selection dialog. The user can create and edit a hyperlink or an anchor using steps similar to the ones needed to edit the documentation. To support a contiguous selection for anchoring, an approach similar to the Document Object Model Range [17] will be used.

The system can be extended to incorporate editing services for multimedia documentation as well. For example, an editor for a new medium that follows the SC protocol can be easily integrated into the system. When such an editor terminates, it sends another edit request to the editing system to replace the old nodes or documentation with new nodes or documentation.

Upon receiving the request to replace nodes or documentation, the editing system sends a document change request to the document API. In turn, the document API modifies the document and sends a document event notification to all presentations registered as document event listeners for the document. The presentations update themselves to reflect the changes in the document and the UI changes the screen layout accordingly.
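The change-notification flow described in this section, in which a document change request leads to an event notification and every registered presentation updates itself, is an instance of the listener pattern. The following Java sketch shows the shape of that flow under simplifying assumptions: the DocumentListener, EditableDocument, and TextPresentation names are hypothetical, structural units are reduced to plain strings, and a presentation simply re-renders the whole document on every change rather than updating incrementally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical listener interface: presentations implement it to hear about edits. */
interface DocumentListener {
    void documentChanged(EditableDocument doc);
}

/** Hypothetical document: an ordered list of structural units, stored here as text. */
final class EditableDocument {
    private final List<String> units = new ArrayList<>();
    private final List<DocumentListener> listeners = new ArrayList<>();

    EditableDocument(List<String> initialUnits) {
        units.addAll(initialUnits);
    }

    void addListener(DocumentListener listener) {
        listeners.add(listener);
    }

    List<String> units() {
        return new ArrayList<>(units); // copy, so callers cannot bypass replaceUnit
    }

    /** Stands in for a document change request: modify, then notify every listener. */
    void replaceUnit(int index, String newText) {
        units.set(index, newText);
        for (DocumentListener listener : listeners) {
            listener.documentChanged(this);
        }
    }
}

/** Hypothetical presentation: one textual view that re-renders on each notification. */
final class TextPresentation implements DocumentListener {
    private final String prefix; // stands in for style information
    private String rendered = "";

    private TextPresentation(String prefix) {
        this.prefix = prefix;
    }

    /** Creates a presentation, registers it as a listener, and builds the first view. */
    static TextPresentation attach(EditableDocument doc, String prefix) {
        TextPresentation presentation = new TextPresentation(prefix);
        doc.addListener(presentation);
        presentation.documentChanged(doc);
        return presentation;
    }

    @Override
    public void documentChanged(EditableDocument doc) {
        StringBuilder sb = new StringBuilder();
        for (String unit : doc.units()) {
            sb.append(prefix).append(unit).append('\n');
        }
        rendered = sb.toString();
    }

    String rendered() {
        return rendered;
    }
}

final class SyncDemo {
    public static void main(String[] args) {
        EditableDocument doc = new EditableDocument(Arrays.asList("int x;", "int y;"));
        TextPresentation left = TextPresentation.attach(doc, "");
        TextPresentation right = TextPresentation.attach(doc, "> ");
        doc.replaceUnit(1, "long y;"); // one edit, both views refresh
        System.out.print(left.rendered());
        System.out.print(right.rendered());
    }
}
```

Routing every edit through the document, never through a presentation's own state, is what keeps the views consistent: no presentation needs to know that any other presentation exists.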
3 Related Work

Related research work on improving software development environments can be divided into three categories: programming environments, software management environments and alternate program representations.

Programming environments are systems that integrate program analysis into the editing environment. The Cornell Program Synthesizer (CPS) [21] introduced the syntax-directed editing approach. Ensemble [13], and its predecessor Pan [22], sought to provide novel presentation and user interface features. Ensemble added support for informal multimedia documents and style sheets, but lacked a uniform document model encompassing source code. Neither Ensemble nor Pan could support fine-grained hypermedia links because their storage format for programs was raw source code. Mjølner [10] is a grammar-driven programming environment that supports program editing through both textual and window hierarchy views.

Software management environments maintain the relationships between documents from different phases of the software development life cycle. The Software Documentation Support (SODOS) system [8, 9] stored a pre-defined set of software document types in a relational database management system (RDBMS). The Documents Integration Facility (DIF) [6] used hypertext services to integrate and manage the documents. The HyperPro [16] system extended the hypertext model to serve as a common internal representation. No program analysis was provided. Desert [18] is an IDE that integrates a set of tools through broadcast messages. It includes an editor based on FrameMaker that supplements program editing with additional editing tools for design diagrams and user interfaces. However, Desert's design deliberately avoids a uniform document model and thus its level of integration falls short of our goals. The Customizable Hyperlink Insertion and Maintenance Engine (CHIME) [4], which is similar to Javadoc [20], inserts HTML tags and links in sources by querying a database produced by existing static analysis tools.
The Variorum [3] is a program documentation tool in which a programmer can record the process of "walking through" the source code using multimedia technology such as text, audio and digital pen drawings. CHIME and the Variorum are pure browsers and do not support editing or interactive program analysis.

JavaML [1] and SrcML [12] are alternate program representations. Both are XML-based representations of Java programs. JavaML is intended to allow XML-based tools to analyze Java programs. SrcML maintains the formatting choices made by the programmer, such as indentation and comments, and is intended for use in program comprehension and maintenance tools.

The systems described have many interesting features, but none of them provides the complete solution envisioned in this research. Like most programming environments, CPS, Pan and Mjølner have no support for informal documents or for multiple presentations. Ensemble supports these features, but programs were treated in a way that limited their interoperability. Software document systems have evolved from the rigid structures of SODOS to the flexible, Web-compatible approaches in HyperPro and CHIME, but none of them supports program analysis. Desert treats sources as a special case of a more general concept of document, but does not take the step of adopting a uniform document representation. The alternate program representations provide a model for representing programs that is compatible with current models for informal documents. The Software Concordance brings these approaches together to provide interoperability and integrated program analysis in a single system.
4 Future Work

Logical relationships among software documents are critical to the software development process. In large-scale software systems, both documents and relationships evolve over time. The lack of powerful tools for relationship management hinders developers from having a full understanding of the system and reduces their effectiveness. To address this problem in IDEs, the current phase of the Software Concordance project involves the creation of a versioned hypermedia framework for representing and analyzing the software document relationships. Versioned hypermedia technology provides a natural mechanism for managing documents and their relationships.

In addition, the fact that software documents and their relationships are constantly changing can lead to conformance problems in which logical connections are broken and the documents are not in agreement with each other. A scheme for using hypermedia versioning and timestamps to automate detection of possible semantic non-conformance among documents has been developed [15] and a fine-grained version control model has been designed to support versioning of both documents and relationships. In the future, a set of tools to analyze, visualize, retrieve and maintain these relationships will be needed. The resulting system will allow users to construct queries for types of relationships and levels of conformance. For example, a user might want to find all source code documents that may not be in conformance with the design documents. Further research in this area will examine how to automatically generate links among documentation and source code.

Several tasks need to be done to improve the Software Concordance editor. Full direct manipulation editing support is desirable, and specialized media editors for graphics, images, and UML are also needed. The Software Concordance architecture is extensible, so media editors can be easily integrated as long as they conform to the plugin protocol. The current incremental parsing algorithm is still not general for all structural units of a program. The contextual information of a structural unit needs to be considered in the incremental parsing process. The style sheet and presentation systems are very simple. A more powerful style sheet system such as Proteus [13] or Constraint CSS [2] would be preferable so that users can have more control over the presentation of programs and XML documents. Finally, a usability study is necessary to evaluate this new environment systematically.
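To make the conformance query mentioned in this section more concrete, here is a minimal Java sketch of one plausible timestamp rule. It is an assumption made for illustration and not the scheme of [15]: each link records a logical time at which its two endpoints were last confirmed to agree, and a link is flagged as possibly non-conformant when either endpoint has a newer version than that time. The names VersionedDoc, DocLink, and ConformanceChecker are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical versioned document: a name and the logical time of its latest version. */
final class VersionedDoc {
    final String name;
    final long lastModified;

    VersionedDoc(String name, long lastModified) {
        this.name = name;
        this.lastModified = lastModified;
    }
}

/** Hypothetical link that remembers when its endpoints were last confirmed to agree. */
final class DocLink {
    final VersionedDoc source;
    final VersionedDoc target;
    final long validatedAt;

    DocLink(VersionedDoc source, VersionedDoc target, long validatedAt) {
        this.source = source;
        this.target = target;
        this.validatedAt = validatedAt;
    }

    /** Assumed rule: an endpoint changed after validation, so agreement is in doubt. */
    boolean possiblyNonConformant() {
        return source.lastModified > validatedAt || target.lastModified > validatedAt;
    }
}

final class ConformanceChecker {
    /** Returns the links whose endpoints may no longer agree with each other. */
    static List<DocLink> suspects(List<DocLink> links) {
        List<DocLink> result = new ArrayList<>();
        for (DocLink link : links) {
            if (link.possiblyNonConformant()) {
                result.add(link);
            }
        }
        return result;
    }
}

final class ConformanceDemo {
    public static void main(String[] args) {
        VersionedDoc design = new VersionedDoc("AVLTree design", 5L);
        VersionedDoc code = new VersionedDoc("AVLTree.java", 3L);
        DocLink stale = new DocLink(design, code, 4L); // design changed after validation
        DocLink fresh = new DocLink(design, code, 6L); // validated after both changes
        for (DocLink link : ConformanceChecker.suspects(Arrays.asList(stale, fresh))) {
            System.out.println(link.source.name + " -> " + link.target.name);
        }
    }
}
```

A check of this kind can only report that non-conformance is possible; deciding whether the documents actually disagree still requires a person to read them.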
5 Conclusion

The Software Concordance is an IDE exploring the use of versioned hypermedia and document technology to improve software document management. The project's first application is an integration of a syntax-recognizing Java program editor and an XML document editor. The editor allows developers to embed multimedia documentation in their program sources. Hyperlinks are supported and can be bound to any fragment of documentation or source code.
References

[1] G. Badros. JavaML: A markup language for Java source code. In WWW9: 9th International World Wide Web Conference, pages 159–177, May 2000.
[2] G. J. Badros, A. Borning, K. Marriott, and P. Stuckey. Constraint cascading style sheets for the web. In Proceedings of the 12th ACM symposium on User Interface Software and Technology, 1999.
[3] T.-c. Chiueh, W. Wu, and L.-C. Lam. Variorum: A multimedia-based program documentation system. In IEEE Multimedia 2000, July 2000.
[4] P. Devanbu, Y.-F. Chen, E. Gansner, H. Müller, and J. Martin. Chime: customizable hyperlink insertion and maintenance engine for software engineering environments. In Proceedings of the International Conference on Software Engineering, 1999.
[5] The Fluid project. http://aromatic.fluid.cs.cmu.edu/public/.
[6] P. K. Garg and W. Scacchi. A hypertext system to manage software life-cycle documents. IEEE Software, 7(3):90–98, May 1990.
[7] S. C. Gupta. Bringing multimedia to source code: A document interface to program analysis services. Master's thesis, University of Wisconsin – Milwaukee, December 2001.
[8] E. Horowitz and R. Williamson. SODOS: a software documentation support environment—its definition. IEEE Transactions on Software Engineering, SE-12(8):849–859, August 1986.
[9] E. Horowitz and R. Williamson. SODOS: a software documentation support environment—its use. IEEE Transactions on Software Engineering, SE-12(11):1076–1087, November 1986.
[10] J. L. Knudsen, M. Löfgren, O. Lehrmann-Madsen, and B. Magnusson. Object Oriented Environments: The Mjølner Approach. The Object-Oriented Series. Prentice Hall, 1993.
[11] D. Knuth. Literate Programming, volume 27 of Center for the Study of Language and Information — Lecture Notes. CSLI Publications, January 2001.
[12] J. I. Maletic, M. L. Collard, and A. Marcus. Source code files as structured documents. In Proceedings of the 10th International Workshop on Program Comprehension, Paris, June 2002. To appear.
[13] E. V. Munson. Proteus: An Adaptable Presentation System for a Software Development and Multimedia Document Environment. PhD thesis, University of California – Berkeley, 1994.
[14] E. V. Munson. The Software Concordance: Bringing hypermedia to the software development process. In SBMIDIA Anais, V Simpósio Brasileiro de Sistemas Multimídia e Hipermídia, Goiânia, Brazil, June 1999.
[15] T. N. Nguyen, S. Gupta, and E. V. Munson. Versioned hypermedia can improve software development management. In Proceedings of the thirteenth ACM international conference on Hypertext, 2002.
[16] K. Nørmark and K. Østerbye. Representing programs as hypertext. In Proceedings of the Nordic Workshop on Programming Environment Research, pages 11–24, June 1994.
[17] Document Object Model Range. http://www.w3.org/TR/DOM-Level-2-TraversalRange/ranges.html.
[18] S. P. Reiss. Simplifying data integration: The design of the Desert software development environment. In Proceedings of the 18th International Conference on Software Engineering, pages 398–407, 1996.
[19] J. Rumbaugh, I. Jacobson, and G. Booch. The Unified Modeling Language Reference Manual. Object Technology Series. Addison-Wesley, 1998.
[20] Javadoc tool home page. http://java.sun.com/j2se/javadoc/.
[21] T. Teitelbaum and T. W. Reps. The Cornell Program Synthesizer: A syntax-directed programming environment. Communications of the ACM, 24(9):563–573, September 1981.
[22] M. L. Van de Vanter. Practical language-based editing for software engineers, volume 896 of Lecture Notes in Computer Science, pages 251–267. Springer-Verlag, 1995.
[23] Extensible Markup Language (XML). http://www.w3.org/XML/.