Computation Scrapbooks for Software Evolution

Richard Potter
PRESTO, Japan Science and Technology Corporation

Masami Hagiya
Department of Information Science, University of Tokyo


ABSTRACT Previous uses of persistent computation state have focused on fault tolerant computing, continuations of interactive sessions, process migration, and debugging. Dramatic increases in computer power and storage have made additional uses practical. SBDebug is a Computation Scrapbook system that explores how multiple 100% complete snapshots of runtime state can help support other aspects of software development. Snapshots stored in a Computation Scrapbook can document source code, initialize and evaluate the testing of code, and provide criteria for automatic code generation. In this way, Computation Scrapbooks show potential for use in many phases in the evolution of software, including the reading, writing, testing, transforming, debugging, and documenting of programs.

Categories and Subject Descriptors D.2.6 [Software Engineering]: Programming Environments – interactive environments; D.2.5 [Software Engineering]: Testing and Debugging – debugging aids, testing tools.

General Terms Documentation, Experimentation, Human Factors.

Keywords Computation Scrapbooks, Programming by Demonstration, Software Evolution, End-user Programming.

ACM COPYRIGHT NOTICE. Copyright © 2002 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or [email protected]. In Proceedings of the 5th International Workshop on Principles of Software Evolution, IWPSE 2002 (Orlando FL, May 2002), ACM Press, 143-147.

1. INTRODUCTION Computer scientists have found practical reasons to save and restore intermediate computation. For example, checkpointing a computation or a database can help applications be more fault tolerant by enabling graceful recovery from hardware and software failures [4,10]. Researchers in distributed computing save and restore computation snapshots to transfer partially completed computations to computers with free resources without wasting already completed parts of the computations [1]. Some interactive systems can save entire states, so that a user can continue work at a different time or computer without losing window positions and other aspects of the work environment. Smalltalk’s images [7] have long provided this feature, and now virtual computing platforms such as VMWare and Virtual PC make this possible for entire operating systems. Finally, Unix core dumps and Windows’ Dr. Watson allow details of crashed programs to be debugged at a later time when debugging tools and programmer attention are available.

These uses hint at some desirable aspects of the saving and restoring of computation state. One is that it is conceptually simple. It is easy to understand the essence of saving and restoring computation state, even though the contents of computation state can be extremely complex. Another is that computation state can be a meaningful whole that can facilitate communication. It contains both important details and the context necessary for a computer to interpret the details. Since snapshots span the range from the simplicity desired by users to the complexity required by machines, might snapshots be a useful abstraction for human-computer communication in programming tools? This seems a timely question because increased computer power and the popularity of virtual machines such as Java make such state increasingly practical to capture, store, and process.

The goal of this research is to find more places where these aspects can help the software development process. The original motivation was to find techniques that could benefit end-user programmers who need a better understanding of the invisible computation state their programs create and manipulate. The saving and restoring of computation state showed potential because it could give end-users a way to interact with computation state. Interaction is important because active experimentation and experience are important for in-depth understanding that is otherwise difficult to obtain.
The idea of a Computation Scrapbook that allowed the saving, restoring, and organizing of complete copies of computation state seemed a good place to start, as it would give even novice programmers a way to interact with poorly understood computation state in a way that is as conceptually easy as photocopying a sheet of paper. It might also scale to complex programs and thus potentially be a useful technique for Gentle-Slope Systems [3], systems that allow users to acquire a full range of programming skills in an incremental yet continually useful way. This potential has been explored with SBDebug, a prototype Computation Scrapbook system that provides the infrastructure for exploring specific solutions that allow productive interaction with runtime state. The solutions so far demonstrate how Computation Scrapbooks can be used for documentation, test cases, programming by demonstration, and debugging. In this way, Computation Scrapbooks show potential for many phases of the evolution of software, including the reading, writing, testing, transforming, debugging, and documenting of programs.

2. SYSTEM ARCHITECTURE Figure 1 shows the core system architecture of a Computation Scrapbook. Source code is compiled and executed on a machine. The machine has a debugger that can control the machine’s execution like any typical debugger. In addition, the debugger can extract the entire computation state of the machine. Multiple copies of these states can be saved in a Computation Scrapbook. The debugger can later restore any of the snapshots, from which the machine can continue execution. The goal is to use this infrastructure to support new solutions that use multiple snapshots to support software development.

Figure 1 uses an iceberg to represent the machine to suggest something that can only partially be visualized, but is still understood to have a reasonably well-defined boundary. Each snapshot contains all the state of the machine. This helps simplify the architecture because the internal complexity of the machine can be ignored, at least for a high-level description. However, since the machine can be defined in various ways, it is desirable to define it so that the simplicity of the core architecture is preserved. First, the machine must be large enough that snapshots save enough information and context to support useful solutions. Second, issues such as snapshot size, operating system specifics, or hardware configuration must not prevent capture of the machine state from being practical. Third, the bounds of the machine should be defined simply enough that the user can easily judge when the machine definition is sufficient for their current programming project. It is this third item that can complicate the core architecture. For example, defining a machine to be all non-native parts of some Java virtual machine would be difficult to understand without detailed knowledge of how the virtual machine is implemented.

Fortunately, many machine definitions are simple and (potentially) practical. A small machine definition that would work well for many simple algorithms would be all the information referenced from the execution stack frames of a single thread. A single Java virtual machine would make an attractive mid-sized machine definition, at least for situations when external state such as the file system can be ignored. A large machine definition that would be intuitively simple and useful for most types of programs would be an entire personal computer. VMWare and Virtual PC plus 100 gigabyte hard drives suggest that even this ambitious definition is quickly becoming practical. For SBDebug, the default machine is defined as all the Emacs Lisp global variables and functions defined in a single Emacs buffer. This definition was chosen to allow quick development of meaningful demonstrations on a widely available computing environment.

Of course, beneficial solutions that use this architecture must exist. The following sections will briefly describe solutions that show potential for being useful without adding too much complexity to the core system.
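As an illustration only, the save, restore, and continue cycle of the core architecture can be sketched in Python. The `Machine` and `Scrapbook` classes, their method names, and the four-step program are assumptions made for this sketch and are not SBDebug's implementation, which works on Emacs Lisp state. The machine here is deliberately tiny: a program counter plus a dictionary of variable bindings.

```python
import copy


class Machine:
    """Toy machine (an assumption): a program counter plus variable bindings."""

    def __init__(self, program):
        self.program = program  # list of steps; each step mutates env
        self.pc = 0
        self.env = {}

    def run(self, until=None):
        """Execute steps until the program counter reaches `until`."""
        stop = len(self.program) if until is None else until
        while self.pc < stop:
            self.program[self.pc](self.env)
            self.pc += 1


class Scrapbook:
    """Holds named, complete copies of machine state."""

    def __init__(self):
        self._snapshots = {}

    def save(self, name, machine):
        # Deep copy so later execution cannot alter the stored snapshot.
        self._snapshots[name] = (machine.pc, copy.deepcopy(machine.env))

    def restore(self, name, machine):
        pc, env = self._snapshots[name]
        machine.pc = pc
        machine.env = copy.deepcopy(env)


program = [
    lambda env: env.update(total=0),
    lambda env: env.update(total=env["total"] + 1),
    lambda env: env.update(total=env["total"] + 2),
    lambda env: env.update(total=env["total"] * 10),
]

machine = Machine(program)
book = Scrapbook()
machine.run(until=2)                    # stop partway through the program
book.save("after-first-add", machine)   # snapshot: pc == 2, total == 1
machine.run()                           # finish: (1 + 2) * 10
first_result = machine.env["total"]

book.restore("after-first-add", machine)
machine.env["total"] = 5                # edit the restored state, then continue
machine.run()                           # finish: (5 + 2) * 10
second_result = machine.env["total"]
```

Because each snapshot is a complete copy, the stored state is unaffected by whatever the machine does after `save`, so the same snapshot can be restored any number of times.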

[Figure 1 diagram labels: Solution 1, Solution 2, Solution 3, ...]

Figure 1. At a high level, the system architecture is simple since the internal complexity of the machine can be ignored. The iceberg icon signifies that while much of computation state is hidden, it still has reasonably well-defined bounds.

3. DOCUMENTATION One solution supported by Computation Scrapbooks is to facilitate very detailed documentation of code. Instead of struggling to write detailed comments, a programmer can concentrate on high-level descriptions and provide links to snapshots in a Computation Scrapbook that illustrate how the code works in detail. Figure 2 shows a comment that includes links to four snapshots. In SBDebug, links to snapshots appear as text inside of angle brackets. Clicking on the link will take the user directly to a debugger view of the snapshot. Figure 3 shows the debugger view of the fourth snapshot, with the top window showing variable bindings and the bottom window showing source code. The user can now use the debugger to explore the computation state in as much detail as required.

Figure 2. Clicking any of the links in the comment will take the user to the debugger executing an appropriate example for illustrating the code in detail.

Figure 3. The top half of the debugger view shows variable bindings (e.g. i=24). The bottom half shows source code.

For a user who is learning advanced programming skills, such easy access to appropriate computation state can help confirm conjectures about how the code works. Experimentation is possible by editing snapshots and continuing execution to see the effect. The programmer who created the comments and snapshots can ensure that the snapshots are simple enough to be understandable but complicated enough to illustrate the code. The programmer can choose examples that are easy to understand using the limited capabilities provided by whatever debugger or software visualization is available. A Computation Scrapbook facility can increase opportunities for this interaction because users do not have to re-create illustrative runtime states by starting the program and setting debugger breakpoints. Clicking a link to a snapshot is much easier and can even take users to difficult-to-recreate states involving partially completed loops or nondeterministic events.

For programmers documenting code, Computation Scrapbooks let them concentrate on high-level issues and let the snapshots fill in the details. For programmers evaluating code for possible reuse, finding comments with prepared snapshots would allow easy experimentation and testing of special cases for the code’s potential new use.

4. TEST CASES Another way to use a Computation Scrapbook is to support the testing of arbitrary segments of code. Test cases can be created for various levels of code and be directed at important special cases. These could have applications in regression testing of evolving software [6] and test suites for inferring program invariants [5]. For a simple example of snapshot-based test cases, consider the code in Figure 4. It retrieves the temperature from a web site, converts it to Fahrenheit, and outputs the value. Using snapshots for initial and goal conditions makes it possible to test only the centigrade-to-Fahrenheit conversion segment because a snapshot can initialize all state including local variables.

In the current version of SBDebug, test cases appear as special comments in the code with textual links to snapshots that are stored in the scrapbook, as shown in Figure 4. The first link gives the initial condition and the second link gives the goal condition. A program counter in each snapshot defines the bounds of the code segment. When each test is run, the first snapshot is restored, the program is run to where the second snapshot was taken, and the computation state is compared to the second snapshot. The result (passed or failed) is appended to the end of the special test case comment line.
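The comment-based bookkeeping that SBDebug uses for test cases (links to snapshots written inside angle brackets, with the result appended to the end of the comment line) could be sketched roughly as follows. The comment layout `;; test: <initial> <goal>`, the `(passed)`/`(failed)` suffix, the snapshot names, and the `annotate` function are all hypothetical; the paper does not give SBDebug's exact syntax. The test itself is delegated to a caller-supplied `run_test` function.

```python
import re

# A snapshot link is any text inside angle brackets (format assumed).
LINK = re.compile(r"<([^<>]+)>")
# A result left at the end of the line by an earlier run (format assumed).
OLD_RESULT = re.compile(r"\s*\((passed|failed)\)\s*$")


def annotate(comment, run_test):
    """Run the test named by the first two links and append the outcome.

    `run_test(initial, goal)` is supplied by the caller and returns True
    when the state reached from `initial` matches the `goal` snapshot.
    """
    links = LINK.findall(comment)
    if len(links) < 2:
        raise ValueError("a test comment needs an initial and a goal link")
    initial, goal = links[0], links[1]
    base = OLD_RESULT.sub("", comment)  # drop a stale result, if any
    outcome = "passed" if run_test(initial, goal) else "failed"
    return f"{base} ({outcome})"


line = ";; test: <c-is-0> <f-is-32>"
after_first_run = annotate(line, lambda initial, goal: True)
after_second_run = annotate(after_first_run, lambda initial, goal: False)
```

Stripping any earlier result before appending the new one keeps the comment line from accumulating outcomes when the same test is rerun after a code change.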

Figure 4. An arbitrary section of code such as the highlighted expression here can be tested using Computation Scrapbooks.

The snapshots used for test cases can be created by a combination of capture from running programs and by editing the machine state via debugger commands. Creating test cases this way can therefore be easily interwoven with other debugging tasks when insights into important special cases might occur. For the centigrade to Fahrenheit code segment, the user can start out tracing through the code with a debugger and then decide to make a test case. Pausing the program just before the code segment executes, the user could look at the value of temp-in-c and decide that its value (reflecting the actual value returned from the web site, perhaps 20 degrees) is not useful for determining if the segment works because the user does not know the corresponding Fahrenheit value. The snapshot can be edited to make temp-in-c be 0-C, because the user knows the correct result will be 32-F. At this point SBDebug would look as in Figure 5. Then the user can save a snapshot using a simple command in the menu. The user can trace to the end of the segment, confirm that the result is as expected (Figure 6), and then save the second snapshot. Then the first test case can be created based on the two recently created snapshots using a simple SBDebug command.

Figure 5. Machine state for the first snapshot: temp-in-c is 0, and program counter is at beginning of the code segment.

Figure 6. Machine state for the second snapshot: result is 32, and program counter is at end of the code segment.

The second test case checks conversion from 100-C to 212-F. The user can quickly create the initial test case by restoring the 0-C snapshot, editing temp-in-c to be 100 (Figure 7), and then creating a new snapshot. However, after the user traces through the code segment, the result shown is not 212, but rather 87.5, because the code segment has a bug. Nevertheless, the user can manually edit the value to be 212 (Figure 8), save a snapshot, and then use it as the goal condition. When the user later debugs and corrects the program code, both these tests can be automatically applied with one command to verify the corrections.

Figure 7. Machine state for third snapshot: temp-in-c is 100, and program counter is at beginning of code segment.

Figure 8. Machine state for the fourth snapshot: result is 212, and program counter is at end of the code segment.

Being able to test arbitrary segments of code could be especially beneficial to intermediate-level programmers. They could test partially completed code. No separate driver code would be necessary. Testing early and often can make the code better and build the user’s confidence. The increased interaction with runtime state can increase the user’s understanding of the programming environment. For advanced programmers, this reduces the amount of planning necessary to create formal tests for code. Tests can be created on a whim while stepping through a debugger. Tests can be realistic in ways that might be difficult with test drivers. As a byproduct, snapshot test cases can serve as annotations that record special cases that were being considered when the code was written and debugged. In this way, they can serve as documentation for programmers who later read the code. Snapshot test cases could also be useful for verifying later code refactorings and other software evolution events.
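The two snapshot-based test cases for the centigrade to Fahrenheit segment (0 to 32, and 100 to 212) can be replayed in a small Python sketch. Several simplifications are assumed: a snapshot is just a dictionary of variable bindings, the code segment is passed as a function instead of being delimited by program counters, and the buggy expression is a hypothetical bug chosen for illustration, since the paper does not show the faulty code. The names `run_test`, `c_to_f_buggy`, and `c_to_f_fixed` are likewise invented here.

```python
import copy


def c_to_f_buggy(env):
    # Hypothetical bug (not taken from the paper): 9 and 5 are swapped.
    env["result"] = env["temp_in_c"] * 5 / 9 + 32


def c_to_f_fixed(env):
    env["result"] = env["temp_in_c"] * 9 / 5 + 32


def run_test(segment, initial, goal):
    """Restore `initial`, run the segment, and compare the state to `goal`."""
    env = copy.deepcopy(initial)  # "restoring" means working on a fresh copy
    segment(env)
    return "passed" if env == goal else "failed"


# First pair: captured state edited so that temp_in_c is 0; goal is the
# state observed after tracing through the segment.
snap1 = {"temp_in_c": 0}
snap2 = {"temp_in_c": 0, "result": 32}

# Second pair: snap1 restored and edited to 100; the traced (wrong) result
# is overwritten by hand with the value the user knows to be correct.
snap3 = copy.deepcopy(snap1)
snap3["temp_in_c"] = 100
snap4 = {"temp_in_c": 100, "result": 212}

tests = [(snap1, snap2), (snap3, snap4)]
before_fix = [run_test(c_to_f_buggy, a, b) for a, b in tests]
after_fix = [run_test(c_to_f_fixed, a, b) for a, b in tests]
```

With this particular bug the 0-degree case still passes, so only the second test case exposes the problem; after the fix both tests pass when rerun together.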

5. PROGRAMMING BY DEMONSTRATION Computation Scrapbooks can potentially support Programming by Demonstration (PBD). In PBD, the user demonstrates what the program should do using specific examples, and the programming system automatically generates the program. Use of this technique has mostly been directed to automating tasks in user interfaces [2,8]. Computation Scrapbooks can provide the infrastructure for applying PBD to general purpose programming.

A simple example of this can be demonstrated in SBDebug by using a block of test cases as the PBD specific examples. SBDebug can then attempt to automatically generate code that passes all the test cases. Of course, SBDebug must have some constraints to limit the search space. Therefore, SBDebug requires the user to explicitly list tokens that might appear in the expression. For the centigrade to Fahrenheit example above, assume the user only remembers the components of the expression for converting centigrade to Fahrenheit but has forgotten how to compose them. SBDebug allows the user to list these on a specially formatted comment. Then after specifying which test cases to use and issuing the PBD command, SBDebug attempts to find a suitable expression. For the centigrade to Fahrenheit code segment using the test cases from Section 4, the correct expression is found after quickly testing 120 possibilities. The special comment and the final result can be seen in Figure 9.
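A rough idea of how test cases plus a token list can drive a search for an expression is sketched below in Python. This is a plain brute-force enumeration written for illustration; it is not SBDebug's search procedure, the token set (one variable, three constants, three operators) is an assumption, and the number of candidates it tries will not match the 120 reported above. Expressions are nested tuples of the form `(operator, left, right)`, and every operand is used exactly once.

```python
import itertools
import math

OPS = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def trees(leaves):
    """Yield every binary expression tree over `leaves`, kept in order."""
    if len(leaves) == 1:
        yield leaves[0]
        return
    for i in range(1, len(leaves)):
        for left in trees(leaves[:i]):
            for right in trees(leaves[i:]):
                for op in OPS:
                    yield (op, left, right)


def evaluate(expr, env):
    if isinstance(expr, tuple):      # interior node: (op, left, right)
        op, left, right = expr
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(expr, str):        # variable name
        return env[expr]
    return expr                      # numeric constant


def passes(expr, cases):
    """True when `expr` produces the goal value for every test case."""
    for env, goal in cases:
        try:
            value = evaluate(expr, env)
        except ZeroDivisionError:
            return False
        if not math.isclose(value, goal, abs_tol=1e-9):
            return False
    return True


def search(operands, cases):
    """Return the first expression that passes all cases, and the tries used."""
    tried = 0
    for order in itertools.permutations(operands):
        for expr in trees(order):
            tried += 1
            if passes(expr, cases):
                return expr, tried
    return None, tried


# Test cases as (initial bindings, goal result) pairs, from Section 4.
cases = [({"temp_in_c": 0.0}, 32.0), ({"temp_in_c": 100.0}, 212.0)]
found, tried = search(["temp_in_c", 9.0, 5.0, 32.0], cases)
```

Even for four operands the search space holds a few thousand candidates, which is consistent with the observation that this style of generation is only tractable for very small code segments.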

Figure 9. Using the test cases as specific examples and a list of 7 tokens to constrain the search, SBDebug can find the correct expression after testing 120 possibilities.

It is hard to imagine this working for more than a few lines of code, but for a novice programmer even generating a few tokens of code can be useful when the syntax or spelling of the language has not been mastered. Computation Scrapbooks make it possible to apply PBD to general purpose programming by providing enough context to test very small segments of code where PBD can be tractable. For some domain-specific applications, longer segments can be made possible by using domain knowledge in the PBD inferencing system.

For advanced programmers, one could imagine related forms of automatic code generation helping with integrating unfamiliar code modules. The programmer could provide example test cases and the module could include domain information to help a PBD system automatically generate glue code. Another type of automatic code generation might be possible for debugging. If an evolving software system is annotated with adequate test cases, one could imagine some situations where bug fixes that matched a certain pattern could be applied and tested automatically. For example, the repetitive changes necessitated by upgrading a module or changing some system-wide aspect might become easier to automate.

6. DEBUGGING Computation Scrapbooks can support some typical and advanced debugging techniques. By saving a snapshot before stepping through a section of code, a user can easily restore the snapshot to step through it multiple times, perhaps noting different aspects or stepping a different way by choosing different functions to “step into”. Complex data visualization techniques could benefit from the opportunity to restart a debugging session with the visualization parameters chosen differently.

Computation Scrapbooks also provide an alternative way of implementing backward stepping [9]. SBDebug always saves the initial snapshot of a program so that it can step a program backward by restarting the program and stepping forward (n-1) steps. This technique is effective for several thousand steps, but one could imagine intermediate snapshots making this practical for an arbitrary number of steps.

Multi-threaded debugging creates other opportunities for snapshot-based debugger enhancements. Given a snapshot for a starting point, a debugger could automatically trace multiple permutations of thread schedulings to help debug deadlock and other synchronization problems. We have done some preliminary implementation work for using this technique with Java.

7. CONCLUSIONS Some applications of Computation Scrapbooks and their implementation in SBDebug have been presented. Of interest to software evolution research is the way that these applications support various parts of the software life cycle. A programmer could use snapshot-based documentation to study the reuse potential of some code. Snapshot-based PBD could help integrate the code. Test cases could help confirm proper integration and, as a by-product, help document the code for programmers who wish to revise it later. More research is necessary to address the many technical challenges. However, even straightforward implementations of Computation Scrapbooks show potential for some practical benefits.

8. REFERENCES [1] Bouchenak, S. Making Java applications mobile or persistent. in Proceedings of 6th USENIX Conference on Object-Oriented Technologies and Systems (January 2001).

[2] Cypher, A. (ed.) Watch What I Do: Programming by Demonstration. MIT Press, 1993.

[3] Dertouzos, M. Creating the people's computer. Technology Review (April 1997), MIT, Cambridge, MA, pp. 20-28.

[4] Elnozahy, E.N., Johnson, D.B., and Zwaenepoel, W. The performance of consistent checkpointing. in Proceedings of the Eleventh Symposium on Reliable Distributed Systems (October 1992), 39-47.

[5] Ernst, M., Cockrell, J., Griswold, W., and Notkin, D. Dynamically discovering likely program invariants to support program evolution. IEEE Transactions on Software Engineering (February 2001), vol. 27, number 2, pp. 173-181.

[6] Harrold, M. Testing evolving software. Journal of Systems and Software (July 1999), vol. 47, number 2-3, pp. 173-181.

[7] Goldberg, A. and Robson, D. Smalltalk-80: The Language and Its Implementation. Addison-Wesley, Reading, MA, 1983.

[8] Lieberman, H. (ed.) Your Wish Is My Command: Programming by Example. Morgan Kaufmann, 2001.

[9] Lieberman, H. and Fry, C. Bridging the gulf between code and behavior in programming. in Proceedings of CHI '95 (Denver CO, May 1995), ACM Press, 480-486.

[10] Wang, Y.M., Chung, E., Huang, Y., and Elnozahy, E.N. Integrating checkpointing with transaction processing. in Proceedings of the Twenty Seventh International Symposium on Fault-Tolerant Computing (June 1997), 304-308.
