A Lean Process Model using In-House Source Code Libraries for Efficient Development of Visualization Applications

Sebastian Grottel∗, Christoph Müller∗, Guido Reina∗, Thomas Ertl∗

Visualization Research Center, Universität Stuttgart (VISUS)

∗e-mail: {grottel, mueller, reina, ertl}@visus.uni-stuttgart.de

1 Introduction

For many people in industry and academia, the idea of Software Engineering is rather vague and frequently associated with the allegation of making easy tasks complicated. The IEEE Computer Society, however, gives a quite clear definition: the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software; that is, the application of engineering to software [4]. Processes that have proved to be efficient and effective in engineering should also be used in software development. These processes rest on a few basic principles: the work is meant to solve a problem; evaluation is based on practical success and costs; the quality of the final product is consciously tracked; and one thinks in terms of standards and modules.

When developing software in academia, not least in fields of computer science like visualization, almost all of these principles are constantly ignored: the problems solved often have no direct connection to real-world applications; practical success is replaced by acceptance for publication; quality awareness and coding standards factually do not exist; and thinking in modules boils down to thinking in C++ classes at best. To some degree this is understandable: programs are seen as prototypical proof-of-concept implementations of a proposed new method, and the publication is the product. Following a software development process model [7] does, indeed, introduce overhead. Beyond that, many models are too inflexible for the fuzzy requirements definitions common in academia, and newer, more agile processes usually rely on an external customer who defines and constantly checks the requirements.

So why abandon the usual way of tinkering with software [2]? Prototypes are only meant to give a first impression of a future product. In the research context, programs are mostly used for performance measurements and screenshots; often, they cannot even serve any other purpose, and sometimes loading data sets other than the ones used for the publication already fails. However, research work is incremental most of the time [3], and even totally new approaches must be evaluated against existing methods, preferably against optimized implementations. Benchmarking a hand-tailored GPU implementation against a quickly written, single-threaded CPU version, for example, is neither fair nor useful, but it is common practice.

For visualization research, the shift in focus from merely making (GPU-accelerated) rendering possible to solving problems in an application domain via visualization further increases the need for high-quality software, whereas current programs are often so poorly written that no domain scientist can use them on their own. The difficulty thus lies in finding a process that is flexible enough for the research context and minimizes the software engineering overhead on the one hand, while ensuring sufficient software quality on the other hand, so that programs can be used in real applications and as much source code as possible can be reused.

2 Process Model

What we need is just the right dose of software engineering. Based on our practical experience, we suggest a simple process that is built around a central code repository for an academic group and applies typical software engineering tasks, such as detailed class design and documentation, only where necessary. The reuse of source code is encouraged, which in turn reduces development time and enhances the quality of programs [9]. Class libraries are the most common way of reusing source code and are widely accepted as beneficial for reducing development time and increasing software quality. However, public class libraries like Boost must aim at maximum generality, which often comes at the cost of slightly lower performance. For interactive visualization applications, runtime performance is of utmost importance, which results in speed-optimized implementations of classes whose generic counterparts could also be found in common libraries. While these implementations are not generic enough to be widely used, they are often valuable assets for the academic group they originate from, as there is usually a strong coherence in the working areas of the group members. This is where the proposed repository comes into play: such algorithms must not be lost in a single prototype program, but should be collected in coherent libraries.

Several conditions must be met for this process to work. First, when working on a new visualization application, even if it is just a small prototype, the developer must be aware of the possibility of reusing code and of creating reusable code from the start. Second, parts that have been identified for possible reuse are to be collected in the repository and therefore require some additional work: it is crucial that they are encapsulated behind a cleanly defined interface following the concept of information hiding [10], which is inherently supported by most modern programming languages. Defining that interface requires some mental work, and the resulting implementation is usually slightly larger than one written only for the application at hand. Third, all classes collected for reuse must follow some coding standards and, most importantly, must be reasonably documented. As it is the very nature of research groups to have a high fluctuation of personnel, it is essential that the source code is readable, self-contained, and works out of the box, to guarantee efficient reuse and to make the work reproducible [11]. For the same reason, a complete API documentation is indispensable, in particular one that highlights assumptions made for performance reasons; using tools like doxygen is reasonable in this context. Fourth, platform independence is desirable whenever possible. Hiding different implementations behind interfaces greatly eases porting applications, which we have seen to be a major problem once a platform-dependent, quickly developed application has reached a certain size. In particular, implementing the same functionality for all platforms at once helps to ensure the same behavior on all platforms.

Figure 1 illustrates the whole approach. Increased development effort (in the gray box) is only required for identifying reusable parts and, especially, for writing the reusable code. In practice, it proved sensible to organize the repository in more than one library, grouping classes for different usage scenarios. The basic library at the center mostly deals with data structures that define the means of data exchange.
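As a concrete illustration of the conditions above, consider the following C++ header. It is a minimal sketch, not taken from our actual library; the namespace, class, and method names are invented for this example. It shows a cleanly defined, doxygen-documented interface that hides the platform-dependent part entirely in the implementation file.

```cpp
// high_res_timer.h -- hypothetical example of a reusable repository class.
#ifndef VISLIB_HIGH_RES_TIMER_H
#define VISLIB_HIGH_RES_TIMER_H

#include <cstdint>

namespace vislib {

/**
 * High-resolution timer for performance measurements.
 *
 * The public interface is identical on all platforms; the implementation
 * file selects, e.g., QueryPerformanceCounter on Windows and clock_gettime
 * on POSIX systems, so client code never needs #ifdef blocks.
 */
class HighResTimer {
public:
    /** Creates a timer; the reference point is unset until Start() is called. */
    HighResTimer();

    /** Stores the current point in time as the reference point. */
    void Start();

    /**
     * Answers the time elapsed since the last call to Start() in
     * milliseconds. Calling this without a prior Start() is undefined;
     * for speed, this assumption is documented rather than checked.
     */
    double ElapsedMillis() const;

private:
    std::uint64_t startTicks_; ///< platform-specific tick count at Start()
};

} // namespace vislib

#endif // VISLIB_HIGH_RES_TIMER_H
```

Client code written against such a header compiles unchanged on all supported platforms; only the single implementation file in the repository contains platform-specific code.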

[Figure 1: a schematic diagram with the elements Scientific Work, Application-Specific Code, Define Reusable Interfaces, Reusable Classes, (Prototype) Application, and Class Repository.]

Figure 1: Schematic illustration of the proposed development process: instead of a monolithic prototype implementation, reusable classes are identified and integrated into a local repository. Only these classes require the additional effort that guarantees high quality.

That allows the source code collection to be organized in tiers around this core, which are all interoperable but become more and more specialized towards specific research tasks. Thus, the whole library is organized into modules of high cohesion which can be used independently. This process proved practical in the everyday work of a research group, as the reduced development time increases the acceptance of some limited software engineering overhead. The effort also decreases over time, since the number of classes still to be added to the repository, which require the extra software engineering, normally shrinks. With a central storage point and documented classes, a self-organizing knowledge management is established that persists source code assets over time.

3 Example and Results

We have implemented such a source code library in our visualization group, starting by collecting valuable code pieces from graduate students of the group. After the interfaces had been made coherent and comprehensively documented, these classes formed the inner core of the library. Among them are very basic ones, e.g. a camera class, which at first glance seems to be nothing more than a gluLookAt replacement. But this class illustrates the development process described above. Programs based on gluLookAt tend to end up with extremely monolithic rendering code, making extensions like multi-head rendering for tiled displays very difficult, if not impossible. Furthermore, user interaction and rendering are usually far more interdependent than desirable. Our basic camera class first and foremost defines the interface for accessing the properties of a camera (position, direction, field of view, etc.). Behind this interface there are two implementations, one for OpenGL and one for Direct3D, which share common functions like adjusting the near and far clipping planes based on a scene bounding box. The camera can also serialize and deserialize its parameters, which allows for an easy implementation of parallel, multi-head rendering on multiple machines. User interaction is moved from the camera into manipulator classes, as the implementation of rotation, panning, and zooming based on mouse or keyboard input only needs the camera interface. For ease of use, there are also adapters that directly connect the well-known GLUT callback functions to the camera manipulators. The implementation of these classes was, of course, more costly than just using gluLookAt. However, it is now easy to write a small visualization program that gets advanced features like different types of stereo projection or rendering for tiled displays nearly for free.
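The camera design is described above only in prose; the following C++ sketch illustrates the separation of property interface, API-specific implementations, and manipulators. All class and method names here are hypothetical, chosen for this illustration rather than taken from the actual library.

```cpp
#include <array>
#include <istream>
#include <ostream>

// Abstract camera: exposes only the view properties, no rendering and no
// input handling. This is the interface manipulators are written against.
class Camera {
public:
    virtual ~Camera() = default;

    // Property access shared by all implementations.
    virtual std::array<float, 3> Position() const = 0;
    virtual void SetPosition(const std::array<float, 3>& pos) = 0;
    virtual std::array<float, 3> ViewDirection() const = 0;
    virtual void SetViewDirection(const std::array<float, 3>& dir) = 0;
    virtual float ApertureAngle() const = 0;          // field of view, degrees
    virtual void SetApertureAngle(float degrees) = 0;

    // (De)serializing the parameters lets the render nodes of a tiled
    // display reproduce the same view, enabling multi-head rendering.
    virtual void Serialize(std::ostream& out) const = 0;
    virtual void Deserialize(std::istream& in) = 0;
};

// API-specific subclasses translate the shared parameters into API state
// (OpenGL matrices or Direct3D view matrices); common logic such as fitting
// the near/far clipping planes to a bounding box lives in a shared base.
class OpenGLCamera;   // declaration only in this sketch
class Direct3DCamera; // declaration only in this sketch

// Manipulators implement user interaction purely against the Camera
// interface, so rotation, panning, and zooming stay independent of the
// graphics API. A thin adapter can forward GLUT callbacks to them.
class RotateManipulator {
public:
    explicit RotateManipulator(Camera& cam) : cam_(cam) {}

    // Called, e.g., from a mouse-motion callback with the drag delta.
    void OnMouseDrag(int dx, int dy) {
        // Rotate cam_ around its look-at point proportionally to the drag;
        // body omitted, as only the Camera property interface is required.
        (void)dx; (void)dy;
    }

private:
    Camera& cam_;
};
```

In such a design, switching a prototype from a desktop window to a tiled display mainly means exchanging the camera implementation and distributing the serialized parameters, while the interaction code remains untouched.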

In the meantime, we have collected many more classes, some of them only indirectly connected to visualization applications, such as classes that allow accessing data set files larger than 4 GB platform-independently via memory mapping.

We also used this library, and the whole interface-driven development process, for MegaMol [1], a framework for visualizing large data sets from molecular dynamics (MD) simulations. Originally, our group had a program for point-based rendering of galaxy data sets [6], which was later extended to handle MD data [5]. As this program had an extremely monolithic structure and had nevertheless been extended several times, it eventually reached a state in which it was no longer maintainable. MegaMol, as a replacement for this tool, is based on clean interface definitions and source code reuse. Data storage and rendering are separated and communicate via interface contracts, which allows replacing and adding implementations on either side without touching the other. By defining an interface for each highly specialized type of data, we do not lose the potential for optimization, which is often crucial for interactive visualization applications. MegaMol has evolved and is now used by seven different developers for visualizing MD data from physics, thermodynamics, and biology [8], volume data, and grid-based simulation data. Thanks to the extensive use of classes from our source code library, adjustments and extensions to the application are made quickly and easily; for example, the original desktop-focused visualization could be ported to a tiled stereo rear-projection system in less than one day of work.

4 Conclusion

Based on the experience in our research group, we propose to bring a higher degree of software engineering to scientific software development. If a research group is able to forge a coherent local source code library that collects polished versions of classes from daily use, the effort for software engineering tasks remains limited. The source code collection itself implements some knowledge management, which is especially important in the academic environment. Most importantly, highly valuable implementations of new methods are preserved; they can be reused for variations and follow-up work, or serve as ground truth. The key issue is that all collected classes have a clearly defined and documented interface, are compatible with each other and with common external libraries, and that the implementations are robust and ideally platform-independent. While this process is not specific to visualization research, this area can benefit from it in particular, as visualization methods must prove their effectiveness in the application domain, which requires programs that domain scientists can use. The proposed development approach can increase quality through software reuse while adding just enough software engineering to benefit from it. Large development projects might require further steps, but for everyday work in academia this approach has proven to be a sufficient, effective, and efficient compromise.

References

[1] SFB 716, Subproject D3: Visualization of systems with large numbers of particles. http://www.sfb716.uni-stuttgart.de/index.php?id=184.
[2] F. L. Bauer. Software Engineering - wie es begann. Informatik-Spektrum, pages 259–260, 1993.
[3] P. Dubois. Maintaining correctness in scientific programs. Computing in Science & Engineering, 7(3):80–85, May-June 2005.
[4] Institute of Electrical and Electronics Engineers (IEEE). IEEE Standard Glossary of Software Engineering Terminology, IEEE Std 610.12-1990, 1990.
[5] S. Grottel, G. Reina, J. Vrabec, and T. Ertl. Visual verification and analysis of cluster detection for molecular dynamics. In Proceedings of IEEE VIS ’07, pages 1624–1631, 2007.
[6] M. Hopf and T. Ertl. Hierarchical splatting of scattered data. In Proceedings of IEEE VIS ’03. IEEE, 2003.
[7] P. Jalote. An Integrated Approach to Software Engineering. Springer, New York, 2nd edition, 1997.
[8] M. Krone, K. Bidmon, and T. Ertl. Interactive visualization of molecular surface dynamics. In Proceedings of IEEE VIS ’09, to appear.
[9] B. Meyer. Reusable Software: The Base Object-Oriented Component Libraries. Prentice-Hall, Upper Saddle River, NJ, USA, 1994.
[10] D. L. Parnas. On the criteria to be used in decomposing systems into modules. Communications of the ACM, 15(12):1053–1058, 1972.
[11] G. Wilson. Software carpentry: Getting scientists to write better code by making them more productive. Computing in Science & Engineering, 8(6):66–69, Nov.-Dec. 2006.
