February, 1999
Understanding Embedded Software Through Instrumentation:
Preliminary Results from a Survey of Techniques

Norman Wilde
Department of Computer Science
University of West Florida
Pensacola, FL 32514
[email protected]

Dean Knudson
Northrop Grumman ESD
600 Hicks Road
Rolling Meadows, IL 60008
[email protected]
EXECUTIVE SUMMARY

Embedded systems may offer the most challenging problems in software testing and understanding that software engineers face. They usually present a combination of real-time constraints, multi-tasking design, and an embedded environment with limited access to code and data. Accordingly, experienced engineers have developed techniques for instrumenting such systems to facilitate testing and comprehension. Unfortunately, each engineer seems to have his own bag of tricks, so developed systems show little consistency in the level or type of instrumentation that they contain.

This paper reports on an interview survey of embedded systems software engineers working in several different companies. The study has attempted to identify instrumentation techniques that the engineers have used and found to be effective, and that seem to be applicable in other similar systems. A case study is presented to illustrate each technique. The study's goal is to accumulate experience in instrumenting embedded systems for program comprehension, as a step towards more systematic use of such techniques in new systems. The techniques presented are, of course, a preliminary list, and we would welcome a chance to extend our interviews to other systems known to the reader.

This report may be cited as SERC-TR-85-F, Software Engineering Research Center, Purdue University, 1398 Dept. of Computer Science, West Lafayette, IN 47906, December 1998.

This research was supported, in part, by grant EEC-9418762 from the National Science Foundation to the Software Engineering Research Center, an NSF Industry/University Cooperative Research Center with sites at Purdue University, the University of Florida, the University of Oregon, and the University of West Virginia.
I. Introduction

It is commonly stated that Software Engineering is currently more a craft than a true engineering discipline. Instead of an organized body of theory codified in engineering handbooks, successful projects depend on experienced practitioners, each with his own bag of knowledge and tricks learned on previous jobs. This picture is probably nowhere more true than in the field of embedded and real-time systems:

"... implementation of real-time systems is largely performed through a process of 'black magic.' Through human wizardry and experimentation, developers find the right combination of parameters that minimizes cost while maximizing performance and compliance with real-time constraints." [NILS.98]
Part of the 'black magic' used by experienced embedded systems software engineers is instrumentation inserted into the embedded system to help understand what it is doing. By instrumentation, we mean output statements added to the code whose main purpose is to help the engineer understand or debug the program. The classic example of instrumentation is the journeyman programmer sticking in a few print statements to help find a bug, but the embedded systems engineer usually needs much more ingenuity.

For embedded systems it is important to understand whether instrumentation is intrusive or non-intrusive. Print statements are highly intrusive in that they greatly affect both system timing and memory. Intrusive instrumentation may be tolerable in early stages of debugging but not for final testing of the whole system, since it may radically change system behavior. Ideal instrumentation for embedded systems is non-intrusive, requiring at most a small fraction of memory and processor time. It can then be left in the fielded system, and lab behavior and field behavior will be the same.

This project was motivated by the observation that different embedded systems engineers use a wide range of instrumentation techniques. Some systems are developed with extensive instrumentation, while others are delivered with little or none and are a source of much frustration to their maintainers. Instrumentation is often added only ad hoc, after engineers encounter severe problems in integration or testing.

As a step on the road from the craft stage to the engineering-handbook stage, we have been interviewing embedded systems software engineers in a range of companies to identify the instrumentation techniques that they use, and to collect their impressions about what works and what does not. We asked each engineer to describe one or more systems he had worked on and the instrumentation methods he used. Most of the interviews were conducted at industrial affiliates of the Software Engineering Research Center (SERC), which also provided financial support for the study.
Table 1 lists the main systems that were described in these interviews. However, programmers often commented on other similar software they had worked on, and these comments, where relevant, are also incorporated into this report.

System Type | Description | Size | Language | Year
Infrared countermeasures | Airborne system for detection and countermeasures against infrared guided missiles | 60-100 KLOC | Ada | mid 90's
Guided bomb | Global Positioning System (GPS)-based bomb tailkit | ~10 KLOC | C & assembler | late 90's
Electronic countermeasures | Airborne system for radar jamming | 20 KLOC | Ada | early 90's
Frame relay driver | Unix frame-relay protocol driver for mobile phone system | ~10 KLOC | C | mid 90's
Flight control analog/digital | F-14 fighter operational flight programs | ~10K bytes | Assembler | mid 70's
Flight control digital | F-14 fighter operational flight programs | 100-200 K words | High level languages | early 80's
Data link board | High-speed data link telecommunications board | 2-3 KLOC | Assembler | late 80's
Missile test bed | Real-time hardware-in-the-loop testing of missile components | ~20 KLOC | C & Fortran | mid 90's
Data switch | Large telecommunications data switch | ~3 MLOC | C, C++ & Assembler | mid 90's
Cell phone base station | Radio unit incorporated in a cell phone base station | 500 KLOC | C | late 90's

Table 1 Embedded Systems Used in This Study
II. The Problem of Understanding Embedded Systems

A typical embedded system might look something like Figure 1.
[Figure omitted: block diagram of a typical embedded system, with real-time inputs flowing into a set of tasks inside the embedded system and real-time control outputs flowing out.]
Figure 1 A Typical Embedded System
The embedded system's software runs on one or more processors mounted on special purpose boards inside some electronic device. It processes inputs from the outside world in real time and provides control or other output signals which are likewise required in real time. The software is typically designed as a series of tasks which together provide the required functionality. The embedded system must meet defined performance constraints for response time or throughput. To meet these constraints, the designers often estimate how much time each task will need to do its job, and schedule the tasks in a repetitive frame of execution (Figure 2). Scheduling may be quite complex since it needs to take into account task priorities, interrupt handling, average and worst cases, etc.
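In code terms, a frame like this is often driven by a simple "cyclic executive". The following minimal C sketch illustrates the idea; the timer function, tick constant, and task names are all hypothetical, invented for this example rather than taken from any surveyed system.

    #include <stdint.h>

    #define FRAME_TICKS 100u              /* frame length in timer ticks (assumed) */

    extern uint32_t timer_now(void);      /* free-running hardware timer (assumed) */
    extern void task1(void);
    extern void task2(void);
    extern void task3(void);

    void frame_loop(void)
    {
        uint32_t next_frame = timer_now() + FRAME_TICKS;
        for (;;) {
            task1();                      /* tasks run in a fixed, scheduled order */
            task2();
            task3();
            while ((int32_t)(timer_now() - next_frame) < 0)
                ;                         /* idle until the next frame boundary    */
            next_frame += FRAME_TICKS;    /* advance to the following frame        */
        }
    }

A real scheduler must also handle priorities, interrupts, and overrun cases, which is where much of the complexity mentioned above comes from.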
[Figure omitted: timeline showing Task 1, Task 2, and Task 3 executing in sequence within a single frame.]
Figure 2 Scheduling Tasks in a Repetitive Frame

Understanding embedded systems requires understanding not just what the code does, but also when it is doing it. An error in system output may occur not just because some task executed incorrectly, but also because a task took too long and thus made some other task miss its deadlines.

However, the main problem in understanding embedded systems is that they are, well, embedded. The software is inside something, and that something is usually not a developer-friendly environment with monitors, printers, compilers, and debuggers. One of our interviewees stated that systematic instrumentation was added to his system precisely because debuggers were found to be ineffective. Typical use of a debugger is to insert a breakpoint somewhere near the problem code, run to that point, and then single-step forward looking for the problem. However, in laboratory debugging the inputs to the system come from real-time devices operating at high speed. As soon as the program stops at the breakpoint it begins to lose inputs, and its behavior changes irrevocably.
As will be seen from the examples in the following sections, the engineer's biggest problem is figuring out how to get instrumentation data out of the embedded system, especially without introducing delays that will change performance.

The software engineer does have some tools to aid in testing and instrumentation. Some real-time operating systems provide debugger capability, but with the problems mentioned above. Logic analyzers are passive hardware devices that are used to monitor systems. Data is collected and analyzed so the engineer can determine control flow, display traces, etc., and some real-time display of data is possible. Logic analyzers are usually quite expensive; $50K and up is not uncommon. An in-circuit emulator allows the engineer to debug on the target platform much as he might do at his desk: he can stop the processor and single-step through the code. However, this is very intrusive.

Many systems use several processors, which adds a new dimension to the tool problem. For example, it is not realistic to single-step through one processor while the others keep running. On the other hand, you cannot lock-step all the processors, since the test would not be repeatable due to race conditions on data passing over busses during these steps.

We found that software engineers had a wide range of opinions about the use of analyzers and emulators. Some projects used them heavily, but others looked at them as a last resort. For the data switch project, setting up a logic analyzer was too time consuming: the engineer has to go physically to the lab instead of working from his desk, and there is usually no space inside the box for the analyzer, so the cards need to be reconfigured, which may cause timing distortions. Also, the logic analyzer may not give a very good image of what a modern processor is doing; for example, if the processor is accessing data from its onboard cache, the logic analyzer will not see the operation by monitoring the bus. These problems can be overcome by using a special version of the processor, but that may again introduce additional distortions.
III. Deciding What To Instrument

We asked interviewees to identify the main purpose of the instrumentation in their software. By an overwhelming margin the main answer was "debugging", broadly defined as the identification and diagnosis of faults in the embedded system. The faults could be either in product functionality ("We are not tracking the target accurately enough.") or in performance ("The card's data throughput is not sufficient.").

However, since debugging problems can be so diverse, engineers instrumented many different things to understand different problems. One useful concept may be Brooks's distinction between understanding in the problem domain as opposed to the program domain [BROO.83]. At the problem domain level, instrumentation can help determine if the system is meeting its requirements. This instrumentation tends to focus on capturing the main inputs, processing states, and outputs of the software. For example:

• Instrumentation in the infrared countermeasures system captured the infrared images from its camera so engineers could see how well it was tracking approaching missiles.
• Instrumentation in the guided bomb wrote key control variables each frame to a telemetry system, allowing the engineers to tune the control laws for the bomb's flight.
• Instrumentation in the frame relay driver captured the stream of inputs and the most recent N transitions in the finite state machine that was the heart of the program.
However, as an engineer from the guided bomb project pointed out, problem domain instrumentation is not sufficient to resolve many debugging problems. To track down many defects, engineers need information about task starts and stops, interrupts, subroutine execution, and important internal data values at different times. They needed to be able to relate this information to inputs and outputs, that is, to cross-reference code level events with problem domain events [VONM.96].

In general, interviewees were not interested in just knowing whether a particular event occurred, but rather in the sequence of events and in event timing. Knowledge of how long a particular routine executed, or of the time interval between interrupts, is essential for diagnosing performance problems.
IV. "2-bit" Instrumentation Our first case shows what can be done to instrument an embedded system whose internals are almost completely hidden. One of our interviewees described methods he had used that required from 2 to 8 bits of output. With just 2 bits, quite a lot can be accomplished! The data-link board mentioned in Table 1 was contained in a Data General MV7800 computer, but had its own 80186 processor. The board's purpose was to read and reformat data packets from a 1 Mbit/sec data line before passing them on to the MV7800. There was no emulator or analyzer available and thus no way to see what was going on inside the processor. However the board did have 8 light emitting diodes (led's) which could be set by putting 8 bits into a specific register. In initial debugging the board was simply locking up on each test. The software engineer put instrumentation into the code which said, essentially, "display this number". When the board locked up, the led's would give a rough pointer to the code being executed just before the problem occurred. In later testing the main issues were performance and timing and a different "2-bit" instrumentation strategy was used. Instrumentation was put into the code to set or flip one of the led's. An oscilloscope probe was put on to the led to display the status of the bit in real time as the system executed. One typical procedure was to set one led on the start of each frame. This signal would be used to trigger the oscilloscope's horizontal scan so each complete scan corresponds to one frame. Then another led would be turned on at the start of a task and off at its end. This signal would be applied to the vertical axis of the oscilloscope. The resulting oscilloscope trace showed directly what fraction of the frame was being consumed by this particular task. Another alternative was sometimes used if several routines always executed in sequence, as in "round-robin" scheduling. Instrumentation in each routine would "toggle" the output bit as each was entered, thus giving a display that showed how much of the frame was being taken up by each routine (Figure 3).
[Figure omitted: oscilloscope-style trace of the toggled bit across one frame, with level changes marking routine 1, routine 2, and routine 3 running in turn.]
Figure 3 Tracking Three Routines with "2-Bit" Instrumentation

A key benefit of this method is that it makes the dynamics (that is, the timing behavior) of the embedded system visible, without the overhead associated with reading the system clock and outputting time stamps. The technique is only very mildly intrusive; toggling a bit in a register typically takes only 2-3 assembler instructions. The instrumentation can be left in the delivered code if necessary.

The interviewee has since used this same kind of instrumentation on half a dozen later systems and has found it has many uses. On one recent system, the instrumentation was used to diagnose problems with a real-time kernel. The kernel occasionally "popped up" to take some time from regular processing. Preliminary analysis had indicated that this time loss should not be significant. However, using "2-bit" instrumentation the software engineers could observe that, every few seconds, the kernel's event caused other events to shift substantially to the right on the oscilloscope trace. The kernel's event was causing cascading delays, resulting in badly missed time deadlines.
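To make the technique concrete, the following C sketch shows what such instrumentation might look like. The register address, bit assignments, and routine names are assumptions for this example, not details of the interviewee's actual board.

    #include <stdint.h>

    #define LED_REG (*(volatile uint8_t *)0x4000u)  /* hypothetical LED register */

    static void frame_mark(void)
    {
        LED_REG |= 0x01u;               /* pulse bit 0: scope trigger */
        LED_REG &= (uint8_t)~0x01u;
    }

    static void routine_toggle(void)
    {
        LED_REG ^= 0x02u;               /* flip bit 1 */
    }

    void frame_body(void)
    {
        frame_mark();          /* triggers the oscilloscope's horizontal sweep */
        routine_toggle();      /* bit 1 flips here ...                         */
        /* routine_1(); */
        routine_toggle();      /* ... here ...                                 */
        /* routine_2(); */
        routine_toggle();      /* ... and here, so the trace shows each        */
        /* routine_3(); */     /* routine's share of the frame (Figure 3)      */
    }

Because only a register bit is touched, the overhead is just the few instructions mentioned above, and the oscilloscope does all the display work.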
This kind of instrumentation can be applied in just about any circumstance where two unused output lines are available. It can be used most easily if the system's processing is repetitive. (The oscilloscope trace would probably not be as easy to interpret if processing varies substantially from frame to frame.) The only major difficulty with the technique was encountered on multi-processor systems if only one output port was available. The different processors interfered to create a picture which was difficult to interpret.
V. Exploiting a Logic Analyzer

Our next case describes a project that made excellent use of a logic analyzer.

The infrared countermeasures system was intended to detect infrared guided missiles attacking an aircraft, track them using an infrared camera, and take countermeasures where appropriate. Frustration with the debugging process led to ad-hoc instrumentation being added, which was found to be so useful that a project standard was established requiring all software engineers to instrument their code in the same way. Software engineers were required to instrument interrupt handlers at their start and stop, and tasks at start and exit. This instrumentation provided enough data to let testers understand the basic flow of control. Additionally, instrumentation was introduced at programmer option to cover subroutine entry and exit, key intermediate data values, and certain inputs and outputs.

A combined hardware/software instrumentation strategy was used. The system was tested in the lab, with the processor connected to a logic analyzer. Programmers inserted source code at the points listed above to write to fictitious memory addresses. Since the memory was not present, no actual write took place, but the analyzer was programmed to trigger on access to this range of memory. The analyzer then stored the time of the event, the address, and the data value. Each of the fictitious addresses was given a meaningful symbolic name according to a standard naming scheme. Using these names, the analyzer produced a trace containing time, symbolic name, and data value.

The instrumentation was left in the deployed system, but in a block guarded by a conditional. By changing the value of a constant, the conditional can be made to evaluate to false, so the Ada compiler could optimize away the whole block.
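Rendered in C, the scheme might look like the sketch below. The addresses, event codes, and guard constant are invented for illustration (the actual system was written in Ada); note that the time stamps come from the analyzer, not from the target processor.

    #include <stdint.h>

    #define TRACE_ENABLED 1    /* set to 0 for delivery; guarded writes vanish */

    /* Fictitious, unpopulated addresses; the analyzer triggers on this range. */
    #define TRACE_TASK_START ((volatile uint16_t *)0xF0000u)
    #define TRACE_TASK_END   ((volatile uint16_t *)0xF0002u)

    static void trace(volatile uint16_t *slot, uint16_t value)
    {
        if (TRACE_ENABLED)     /* constant condition: dead code when disabled  */
            *slot = value;     /* no memory is here; only the analyzer sees it */
    }

    void sample_task(void)
    {
        trace(TRACE_TASK_START, 42u /* task id, assumed */);
        /* ... task body ... */
        trace(TRACE_TASK_END, 42u);
    }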
This simple instrumentation scheme has several benefits:

1.) Impact on performance is low. The extra instructions added are few, and the actual capture and storage of the trace data is handled by analyzer hardware with no performance impact on the main processor. In practice, the software engineers did not report any significant problem with performance.

2.) The analyzer provides a time stamp for each of the trigger events. As previously mentioned, software engineers emphasize the importance of time information for tracking down performance problems.

3.) While it is necessary to keep a list of the trigger memory locations used, this list does not change frequently, as might the actual memory locations of data and code. So the mapping is easy to maintain.

The engineers reported that the effort required to introduce the instrumentation was very small, perhaps one to two weeks. All the software engineers interviewed agreed that the benefits were huge and vastly outweighed the cost. Quotes include:

"It is now used to resolve essentially ALL problems encountered during integration testing. We cannot do much without it."

"Without it this project would have been torched long ago!"
VI. Using Pre-Positioned Hooks for Instrumentation

One unusual strategy for instrumentation involves placing "hooks" at key places in the code to which instrumentation can be attached.

The data switch is a very large system and uses many instrumentation techniques, some project-specific and others that rely on its real-time operating system. The operating system allows code to be dynamically linked, loaded, and unloaded on the fly, even in fielded production systems. This allows some kinds of instrumentation to be inserted and removed as needed to attack specific problems. Special "hooks" have been placed at key places in the code which allow debug code to be attached or detached dynamically. For example, to solve one customer's problem, the hooks in the message dispatch algorithms were used. These hooks are essentially function pointers which can point to debug code to be executed before or after message dispatch. If the pointer is null, it has no effect.

To solve the specific customer problem, a routine was inserted that captured event and time data, filtered it looking for important events, and wrote it into a circular buffer. A separate task wrote the buffer to disk, thus minimizing the performance impact on the running switch. If the buffer overfilled, trace data could be lost, but the switch would not be affected. About 500 MB of data was generated in 10 minutes and then analyzed off line. A separate tool was written to parse the output further and load it into a spreadsheet for analysis.
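A minimal C sketch of such a hook might look as follows; the type and function names are hypothetical, and the real switch's dispatch path is of course far more elaborate.

    #include <stddef.h>

    typedef struct message message_t;            /* opaque switch message      */
    typedef void (*dispatch_hook_t)(const message_t *msg);

    /* Hook slots, normally NULL; set and cleared on the fly by debug code
     * that has been dynamically loaded into the running system.              */
    volatile dispatch_hook_t pre_dispatch_hook  = NULL;
    volatile dispatch_hook_t post_dispatch_hook = NULL;

    extern void dispatch(const message_t *msg);  /* the real dispatch path     */

    void dispatch_with_hooks(const message_t *msg)
    {
        dispatch_hook_t h = pre_dispatch_hook;   /* snapshot: hook may detach  */
        if (h) h(msg);
        dispatch(msg);
        h = post_dispatch_hook;
        if (h) h(msg);
    }

When no debug code is attached, the cost is just two null tests per message. In the episode described above, the attached routine filtered events into a circular buffer that a separate task flushed to disk, keeping the dispatch path itself fast.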
The combination of the dynamic load capability plus the presence of the hooks in the code allowed a very complex debug session to take place without rebooting the running switch. As may be imagined, this was very important for the customer!
VII. Instrumentation Packages

Some really experienced embedded system engineers have recognized that they can never anticipate all the questions they will need to ask about an evolving system, and have thus opted to develop flexible instrumentation packages that are part of the delivered system.

The flight control analog/digital system mentioned in Table 1 was the F-14 fighter Operational Flight Program (OFP). It contained an instrumentation package called Flycatcher which evolved through several versions, as did the OFP itself. Flycatcher provided a facility embedded in the OFP that could be used to extract data from the running system. It was used not only in development, but was also part of the fielded software. It could, however, be turned on or off as required.

Flycatcher captured data values from memory in real time. It was an additional task in the OFP, which remained dormant until turned on. When enabled, it was instructed to monitor the value of a specific memory location at a certain periodicity. The results could be shown on a display in the cockpit or, during lab testing, be written to an output bus for recording and subsequent off-line analysis.

Flycatcher shows the advantages of designing in instrumentation early in system design, so that software engineers can take best advantage of it. For example, engineers placed statements in the code to count key events (e.g., the number of times a function was invoked within a frame) precisely because Flycatcher could provide access to the resulting values during debugging. Many applications of Flycatcher were found, especially for fault isolation and performance analysis.

Interestingly, pilots flying the F-14 also found applications for Flycatcher. The F-14 software monitored many sensors, not all of which were normally displayed. As experience with the OFP evolved, it was sometimes found that some sensor value, such as engine temperature at a particular location, was important to flight operations. Instead of waiting for the value to be added to the display at the next OFP release (perhaps 12 to 18 months away), pilots would carry on their knee pads hexadecimal addresses that could be keyed in to Flycatcher to get the data they needed. Thus Flycatcher evolved from a lab and testing tool to provide useful operational functionality.
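The core mechanism can be suggested with a rough C sketch: a dormant monitor, run once per frame, that samples a chosen address at a set periodicity. All names and the output mechanism here are assumptions; the original was an additional task inside the OFP.

    #include <stdint.h>

    struct monitor {
        volatile uint16_t *addr;     /* memory location keyed in by the user */
        uint16_t           period;   /* sample every `period` frames         */
        uint16_t           count;
        uint8_t            enabled;  /* dormant unless turned on             */
    };

    extern void output_value(uint16_t v);  /* cockpit display or lab bus (assumed) */

    /* Called once per frame by the scheduler. */
    void monitor_tick(struct monitor *m)
    {
        if (!m->enabled)
            return;
        if (++m->count >= m->period) {
            m->count = 0;
            output_value(*m->addr);
        }
    }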
The Flycatcher package was found to be so useful that an enhanced version was included in later versions of the F-14 in the early 1980's. The hardware now consisted of components connected by buses, which provided a point for monitoring interactions. Flycatcher could now capture "packets" of data, and logical conditions for data capture could be set, for example to start capturing values when time had reached a certain point. A telemetry system was also now available, so that during flight tests Flycatcher could be turned on from the ground and the outputs transmitted for recording on the ground.

An instrumentation package can become somewhat easier to develop if the embedded system is on a board inside a more conventional computer, because the host's memory or file system may be available to store instrumentation output. However, the embedded system will still have its own processor, and instrumentation will still be required to get the information out. Our next case describes such a system:

The missile test bed was a real-time hardware-in-the-loop simulation system for missile components developed around 1994. It contained several processor boards and could generate radio frequency (RF) signals to the missile to simulate radar, while simultaneously maneuvering the missile platform to simulate flight.
[Figure omitted: block diagram of the missile test bed, showing a computer connected to RF generation and kinematics control components.]
Figure 4 Missile Test Bed

After initial difficulties in debugging and using the test bed, a generic data capture package of C functions was developed that would allow instrumentation to be introduced in a very flexible way. The package allowed values of specified memory locations to be captured at predetermined times and written to memory. At the end of a test run the values were written to disk for off-line analysis.

A major consideration in the development of the package was the need for flexibility. The test bed was used by many different analysts with different needs. It was not practical to reinstrument and recompile each time an analyst wanted different data. Accordingly, the package was file driven. For each run, the user gave a setup file naming the variables that should be captured in each of the different processors. Capture normally took place at the end of each frame, or possibly every N frames, where N could be set for each processor. At the beginning of each run, one package function was called to build a table of all interesting variable names and the corresponding memory addresses. Then at the end of each frame another function was called to capture only the variables actually specified in the setup file for this run.

The flexibility to choose any data value allowed the data capture package to serve its different users. For the engineers, often the control outputs were most important. For example, if the RF generation hardware did not seem to be working correctly, the output values from the package could be checked to see whether the problem was with the control signals being sent to the hardware or with the hardware itself. For programmers, it could be more significant to count interrupts, calls to particular routines, etc.
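The following C sketch suggests the shape such a file-driven package might take. The function names, setup-file format, and fixed-size tables are assumptions for this example; the real package also had to deal with multiple processors and data types.

    #include <stdio.h>
    #include <string.h>

    #define MAX_VARS 64

    struct sym { const char *name; const double *addr; };
    extern const struct sym symtab[];    /* name-to-address table (assumed) */
    extern const int        symtab_len;

    static const double *selected[MAX_VARS];
    static int           n_selected;

    /* Called once per run with the analyst's setup file of variable names. */
    void capture_init(const char *setup_file)
    {
        char name[64];
        FILE *f = fopen(setup_file, "r");
        if (!f) return;
        while (n_selected < MAX_VARS && fscanf(f, "%63s", name) == 1) {
            for (int i = 0; i < symtab_len; i++) {
                if (strcmp(symtab[i].name, name) == 0) {
                    selected[n_selected++] = symtab[i].addr;
                    break;
                }
            }
        }
        fclose(f);
    }

    /* Called at the end of each frame (or every N frames). */
    void capture_frame(FILE *log)
    {
        for (int i = 0; i < n_selected; i++)
            fwrite(selected[i], sizeof(double), 1, log);
    }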
A key concept in both of these instrumentation packages was flexibility: in one case, to handle unforeseen user needs; in the other, to serve a diverse user population. However, flexibility has a cost. Flycatcher was estimated to have been about 5% of the overall OFP, while the data capture package took most of the time of one analyst for about 6 1/2 months. The benefits of flexibility are hard to quantify ahead of time, but they seem to be great. For example, the data capture package was generic enough that it is now in use on five different simulation systems at two different companies.
VIII. Moving Off-Platform

There is an alternative to investing in instrumentation to test embedded systems in the laboratory: invest in simulators that allow much of the testing to be done in a developer-friendly environment "off-platform". Our next case made use of this approach:

The electronic countermeasures system provided pilots with a jamming and "situational awareness" capability. As much testing as possible was done off target in a VAX environment emulating the environment of the final system. The development organization had a small group of programmers devoted to building VAX-based tools that simulated each part of the final target environment. A script language was developed for each of the external systems so that a driving script could be written for each test. The script language allowed conditionals, so that inputs could be based on responses from the system under test. Since the VAX and the target machines had different word layouts ("big-endian" versus "little-endian"), for some parts of the system two packages had to be developed, and code was linked with one or the other depending on the platform.

This strategy was effective in substantially reducing the amount of on-target debugging. It was estimated that 95-98% of the bugs were eliminated on the VAX before going to the target platform. The software engineer we interviewed said that this was the best planned testing he had seen in his career.
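The word-layout issue can be illustrated with a generic C sketch; the build flag and function names here are hypothetical and are not the project's actual packages. A 32-bit word written by a big-endian target reads back scrambled on a little-endian host unless its bytes are swapped, so one of two variants is linked in depending on the platform.

    #include <stdint.h>

    static uint32_t swap32(uint32_t x)
    {
        return (x >> 24) | ((x >> 8) & 0x0000FF00u)
             | ((x << 8) & 0x00FF0000u) | (x << 24);
    }

    #if defined(SIM_HOST_LITTLE_ENDIAN)            /* hypothetical build flag */
    uint32_t wire_to_host(uint32_t w) { return swap32(w); }
    #else
    uint32_t wire_to_host(uint32_t w) { return w; }
    #endif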
The off-platform strategy has considerable benefits, including better reproducibility of tests and far better tools for debugging. However, we found that this strategy is controversial. Some software engineers felt that off-platform testing is too costly, since simulators and extra code are needed, and that the resulting tests may not accurately represent what will happen when the code is moved to the operational platform. They also stated that simulators are generally easier to use if the devices being simulated are digital rather than analog, and so might not be appropriate for all systems. The simulators also create a maintenance burden. After the system is fielded, its hardware and software may change. Even though testing is by then mainly in the lab, the simulators should be kept up to date, and this task is often omitted. When testing off-platform again becomes necessary, it may be too late to update the simulators.
IX. Some General Themes

Several themes emerged from the interviews that would seem to be applicable to a wide range of projects. Perhaps the most important was the need to plan for instrumentation from the beginning of the project. A frequent theme was that "We only introduced the instrumentation after we got in trouble. It would have been much more effective up front". Here are some examples of suggestions:

• Instrumentation should be part of the requirements or a "design item". Memory, processor time, and development time should be allocated to it from the beginning.
• If instrumentation is designed-in, then it can be left in the system as fielded. Thus you don't have the problem of different behavior between the fielded and test bed systems.
• If purchasing software, make sure the contract gives you access to whatever instrumentation it contains.
• In the hardware design, always provide a connector for an analyzer.
• Standardize instrumentation so it is used consistently in code written by all programmers. Then the integration testers can rely upon its presence to do their job.
• In a large system, instrument the interfaces between components so that you can test each one independently.
• Design the instrumentation so that you can do analysis post-run and outside the lab. Lab time is expensive and scarce in most projects.
X. Related Work

While we know of no other work that surveys the actual instrumentation practices of embedded systems engineers, there is a considerable literature on the problems of embedded, real-time, and distributed systems. An extensive survey of the debugging problems is given by McDowell and Helmbold [MCDO.89]. Schutz also gives an extensive review of the problems, emphasizes the problem of observability, and comments on the difficulties of using conventional debuggers [SCHU.94]; he also describes several monitoring systems. Many other authors discuss the theory of monitoring methods and/or describe specific monitoring or debugging systems, for example [MARI.90, TSAI.90, SCHM.94, SIDE.94]. Yan, Sarukkai and Mehra describe some experience with a very extensive set of monitoring and visualization tools from NASA's Ames Research Center [YAN.95]. Hollingsworth, Miller, Goncalves, Naim, Xu and Zheng describe a technique they call "dynamic program instrumentation" in which the running program is modified periodically to collect information about its execution [HOLL.97]. Waheed, Rover and Hollingsworth give a detailed analysis of some of the design alternatives in monitoring distributed systems, such as the trade-off between sending event data immediately or batching it for efficiency [WAHE.98]. Heath and Etheridge describe methods for graphically displaying the results of monitoring parallel systems, a topic that is very important for effective program comprehension [HEAT.91].

If our sample of embedded systems is at all representative, it would appear that the tools actually available to practicing software engineers are much less sophisticated than what appears in the literature.
XI. Conclusions

We believe that the benefits of instrumenting embedded systems are fairly obvious from the cases we have studied. Instrumentation is a fairly cheap strategy for improving software testability and maintainability, and it can provide a significant immediate economic return. For example, one of our interviewees described a system that was instrumented so well that data from flight tests could be used to generate test scripts for use in the lab. The maintenance organization had agreed that, if any bug could not be found within two days, an engineer would fly out to the field site to fix it. Using the generated test scripts, only one trip was needed in two years of operation, a considerable saving over the usual experience!
But instrumentation techniques are far from universal. Several times in the course of the interviews for this study, software engineers described their techniques and then said something like: "Doesn't everybody do it this way?" Well, they don't! Almost every interview contributed a new idea or technique. We hope that this paper is a step towards getting simple and effective instrumentation methods into more widespread use. We recognize that this report is only a first stage, and we would encourage our readers to get in touch with us if they know of examples of effective instrumentation that we could include in future versions.
Acknowledgments

We would like to thank deeply the software engineers who contributed their time and ideas to this study:

Joe Androjna, Northrop Grumman
Duane Braford, CSA, Inc.
Kevin Burr, Nortel
Cuong Dang, Nortel
C. R. (Dick) Hotz, Northrop Grumman
Bary Juras, Northrop Grumman
Dave Knight, Northrop Grumman
Tom Marler, CSA, Inc.
Walter Miner, Northrop Grumman
Gary Mussar, Nortel
Brian Stinton, Motorola
References

[BROO.83] R. Brooks, "Towards a Theory of the Comprehension of Computer Programs", Int. J. Man-Machine Studies, Vol. 18, (1983), pp. 543-554.

[HEAT.91] M. Heath and J. Etheridge, "Visualizing the Performance of Parallel Programs", IEEE Software, Vol. 8, No. 5, (September 1991), pp. 29-39.

[HOLL.97] J. K. Hollingsworth, B. P. Miller, M. J. R. Goncalves, O. Naim, Z. Xu, and L. Zheng, "MDL: A Language and Compiler for Dynamic Program Instrumentation", PACT '97 - 1997 International Conference on Parallel Architectures and Compilation Techniques, November 11-15, 1997, San Francisco.

[MARI.90] D. Marinescu, J. Lumpp, T. Casavant, and H. J. Siegel, "Models for Monitoring and Debugging Tools for Parallel and Distributed Software", Journal of Parallel and Distributed Computing, Vol. 9, No. 2, (June 1990), pp. 171-183.

[MCDO.89] C. McDowell and D. Helmbold, "Debugging Concurrent Programs", ACM Computing Surveys, Vol. 21, No. 4, (December 1989), pp. 593-622.

[NILS.98] K. Nilsen, "Adding Real-Time Capabilities to Java", Communications of the ACM, Vol. 41, No. 6, (June 1998), pp. 49-56.

[SCHM.94] U. Schmid, "Monitoring Distributed Real-Time Systems", Real-Time Systems, Vol. 7, No. 1, (July 1994), pp. 33-56.

[SCHU.94] W. Schutz, "Fundamental Issues in Testing Distributed Real-Time Systems", Real-Time Systems, Vol. 7, No. 2, (September 1994), pp. 129-157.

[SIDE.94] R. S. Side and G. C. Shoja, "A Debugger for Distributed Programs", Software - Practice and Experience, Vol. 24, No. 5, (May 1994), pp. 507-525.

[TSAI.90] J. Tsai, K. Fang, H. Chen, and Y. Bi, "A Noninterference Monitoring and Replay Mechanism for Real-Time Software Testing and Debugging", IEEE Trans. on Software Engineering, Vol. 16, No. 8, (August 1990), pp. 897-916.

[VONM.96] A. von Mayrhauser and A. M. Vans, "Identification of Dynamic Comprehension Processes During Large Scale Maintenance", IEEE Trans. on Software Engineering, Vol. 22, No. 6, (June 1996), pp. 424-437.

[WAHE.98] A. Waheed, D. Rover, and J. Hollingsworth, "Modeling and Evaluating Design Alternatives for an On-Line Instrumentation System: A Case Study", IEEE Trans. on Software Engineering, Vol. 24, No. 6, (June 1998), pp. 451-470.

[YAN.95] J. Yan, S. Sarukkai, and P. Mehra, "Performance Measurement, Visualization and Modeling of Parallel and Distributed Programs Using the AIMS Toolkit", Software - Practice and Experience, Vol. 25, No. 4, (April 1995), pp. 429-461.