Program Slicing in Understanding of Large Programs

Bogdan Korel, Jurgen Rilling
Department of Computer Science, Illinois Institute of Technology, Chicago, IL 60616, USA

Abstract

Program slicing transforms a large program into a smaller one that contains only the statements relevant to the computation of a given function. It has been shown that program slicing can be useful in program understanding. Traditionally, program slices are represented in textual form. Although slicing reduces the amount of code to be examined, the textual representation of a slice provides little guidance in understanding large programs. In this paper we present program slicing concepts on the module level that allow for a better understanding of program slices of large programs and of their executions. These concepts have been developed for static and dynamic program slicing and are combined with different methods of visualization to guide programmers in the process of program understanding. The presented concepts have been implemented in a slicing tool that is used to investigate their usefulness in understanding large programs.

1. Introduction

A program slice consists of all statements in a program that may affect the value of a variable v at some program point. There are two types of program slices: static slices and dynamic slices. A static slice [16] preserves the program's behavior with respect to variable v for all program inputs. A dynamic slice [10], on the other hand, preserves the program's behavior with respect to variable v for a particular program input. Several methods for computing static slices have been proposed in the literature [4, 8, 16]. Similarly, several algorithms for computing dynamic slices have been proposed [1, 6, 9, 10, 12, 13].

* This research has been partially supported by the NSF Research Initiation Award grant CCR-9308895.

Program slicing was originally proposed to guide programmers during debugging [2, 11, 15], but it can also be used to understand programs during software maintenance and testing [5, 7, 14, 17]. One aid to program understanding is to reduce the amount of detail a programmer sees. Typically, a program performs several functions, but programmers are frequently interested in understanding the computation of only a selected function (or a selected output) rather than the computation of all functions. We assume that each program function (or program output) can be represented by a variable or a set of variables at a certain program point. Program slicing transforms a large program into a smaller one that contains only statements relevant to the computation of the function of interest.

Static slicing [16] may be used to identify those parts of the program that potentially contribute to the computation of the selected function for all possible program inputs. It helps the programmer gain a general understanding of those parts. Although static slicing has many advantages in program understanding, static slices are frequently still large subprograms because they are computed imprecisely. In addition, static slices cannot be used to understand a particular program execution. Dynamic slicing [10] identifies those parts of the program that contribute to the computation of the selected function for a given program execution (program input), and so may narrow down the part of the program that must be examined for that input. Dynamic slices are frequently much smaller than static slices. Moreover, dynamic slicing may be used to understand program execution [14].
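The difference between the two kinds of slice can be illustrated with a small sketch. It is not one of the algorithms from [10] or [16]; it simply treats a slice as backward reachability over dependence edges. The nine-statement toy program, its dependence tables, and the name `backward_slice` are assumptions made for illustration, and the dynamic case is simplified to statements rather than statement instances.

```python
# Toy program (hypothetical), statement numbers on the left:
#   1: read(n)      4: y = n * 2       7: else: x = b
#   2: a = 1        5: if n > 0:       8: print(x)
#   3: b = 2        6:     x = a       9: print(y)

def backward_slice(deps, criterion):
    """Return every statement reachable from `criterion` via dependence edges."""
    in_slice, work = set(), [criterion]
    while work:
        stmt = work.pop()
        if stmt not in in_slice:
            in_slice.add(stmt)
            work.extend(deps.get(stmt, ()))
    return in_slice

# Data and control dependences that hold for *some* input (static view).
STATIC_DEPS = {4: {1}, 5: {1}, 6: {2, 5}, 7: {3, 5}, 8: {6, 7}, 9: {4}}
# Dependences actually exercised when the input is n = 5 (dynamic view):
# the else branch (statement 7) is never executed.
DYNAMIC_DEPS_N5 = {4: {1}, 5: {1}, 6: {2, 5}, 8: {6}, 9: {4}}

static_slice = backward_slice(STATIC_DEPS, 8)       # x at statement 8, all inputs
dynamic_slice = backward_slice(DYNAMIC_DEPS_N5, 8)  # x at statement 8, n = 5
```

Under these assumptions the dynamic slice is a proper subset of the static slice, because statements 3 and 7 matter only for inputs that take the else branch.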

Although slicing reduces the amount of code to be examined, a slice may still be very large and hard to comprehend, and programmers may still have difficulty understanding the program and its behavior. Current slicing tools, e.g., [2, 3, 14], provide limited support for understanding large programs and their executions. Therefore, it is important to devise methods that support the understanding of large software systems. One aid is to use a higher level of abstraction to represent a slice. In this paper we present program slicing concepts on the module level that better support the understanding of large programs and their executions. These concepts have been developed for static and dynamic program slicing and are combined with different methods of visualizing program slices to guide programmers in understanding large programs and their executions. The presented concepts have been implemented in a slicing tool that is used to evaluate them.

2. Visualization of slices for large programs

Program slicing transforms a large program into a smaller one that contains only statements relevant to the computation of a given function. However, current slicing tools, e.g., [2, 3, 14], offer only limited help in understanding large programs. A program slice is represented in textual form, i.e., it is displayed either as highlighted statements in the original program or as a subprogram obtained by removing from the original program all statements that do not belong to the slice. A sample slice representation is shown in Figure 1. To display slices of programs whose source code is spread over many files, a reduced textual representation [3] of a slice has recently been proposed. Although slicing reduces the amount of code to be examined, it is still up to the programmer to analyze the slice and its behavior. For large programs, a slice may still be very large and hard to comprehend. For example, a reduction of 60% (which may be considered significant) for a 30,000-line program still leaves the programmer with a 12,000-line slice to analyze. It is therefore important to develop new slicing-related techniques that improve the comprehension of large programs. In what follows we present techniques, implemented in our slicing tool, that improve the comprehension of programs for both static and dynamic slices.

Call-Graph Slicing

One aid to understanding large programs is to reduce the amount of detail a programmer sees by using a higher level of abstraction to represent a program. A commonly used high-level abstraction is the call-graph of a program. On the call-graph level a program is represented by a set of modules (procedures) and a set of call relationships between modules, where each module is drawn as a rectangle and each call relationship as a line connecting two modules. A program slice may therefore be represented not only on the source code level but also on the call-graph level, as a call-graph slice. A call-graph slice is the part of the call-graph that influences the function of interest. Once a program slice has been computed, the call-graph slice is easily constructed. A module is included in the call-graph slice if at least one statement of the module is in the program slice. Similarly, the call relationship between two modules M1 and M2 (where M1 calls M2) is included if at least one "call M2" statement inside M1 belongs to the program slice. One way to display a call-graph slice is to highlight the modules and call relationships that belong to it in the original call-graph of the program. Another is to display the call subgraph obtained by removing from the original call-graph all modules and call relationships that do not belong to the call-graph slice. An example of a call-graph slice is shown in Figure 2.
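The two inclusion rules for a call-graph slice can be sketched in a few lines. The data layout (a statement-to-module map and a map from call statements to the called module), the module names, and the function name `call_graph_slice` are hypothetical; the paper does not describe how its tool stores this information.

```python
def call_graph_slice(module_of, calls, program_slice):
    """Derive a call-graph slice from a statement-level program slice.

    module_of: statement id -> name of the module containing it
    calls:     statement id -> called module, for call statements only
    """
    # A module is in the call-graph slice if at least one of its statements is.
    modules = {module_of[s] for s in program_slice}
    # A call edge M1 -> M2 is included if some "call M2" statement in M1 is in the slice.
    edges = {(module_of[s], calls[s]) for s in program_slice if s in calls}
    return modules, edges

# Hypothetical program: main calls read_input, compute and report.
MODULE_OF = {1: "main", 2: "main", 3: "main",
             4: "read_input", 5: "compute", 6: "compute", 7: "report"}
CALLS = {1: "read_input", 2: "compute", 3: "report"}

slice_modules, slice_edges = call_graph_slice(MODULE_OF, CALLS, {1, 2, 4, 5})
```

Here `report` and the edge from `main` to `report` drop out, because neither statement 3 nor statement 7 is in the program slice.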

a. Highlighting a slice. b. A slice as a subprogram.
Figure 1. Displaying program slices on the source code level.

Our slicing tool supports program slicing on the source code level and on the call-graph level. The call-graph representation gives a programmer a general understanding of those parts of a software system that affect the function of interest. The source code representation, on the other hand, is especially useful for analyzing selected modules (i.e., modules that belong to the slice). The tool also supports zooming in and out on the call-graph level, which lets the programmer customize the amount of detail he/she wants to observe. Zooming makes it possible either to gain a general overview of a large program with many modules or to focus on specific parts of the call-graph without having to scroll the window.

a. Highlighting a call-graph slice. b. A call-graph slice as a subgraph.
Figure 2. Displaying call-graph slices.

The degree of influence

As described earlier, a module belongs to a call-graph slice if at least one statement inside the module belongs to the program slice. In some modules all statements belong to the program slice (i.e., they affect the function of interest), but in many others only a small number of statements do. Programmers may therefore be interested not only in which modules are part of the slice but also in the extent to which each module affects the function of interest. This is captured by the degree of influence of a module, measured as the percentage of the statements inside the module that are part of the computed program slice. The degree of influence is indicated by a bar inside the module's rectangle in the call-graph slice, where the height of the bar corresponds to the number of statements in the module that belong to the slice. If the bar fills the whole rectangle, all statements in the module influence the function of interest. This gives the programmer a simple indication of the degree of influence of each module without having to analyze the module's source code. The display may be further enhanced with information about module size: modules of different sizes may be drawn as rectangles of different sizes in the call-graph. This enables a programmer not only to gain a general understanding of how the statements involved in the computation of a slice are distributed but also to compare the sizes of the modules involved. A sample call-graph with degrees of influence is shown in Figure 3.

Figure 3. A call-graph slice with degrees of influence.
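As a minimal sketch, the degree of influence can be computed as the percentage of a module's statements that fall inside the program slice. The input layout, the module names, and the function name `degree_of_influence` are assumptions made for illustration.

```python
def degree_of_influence(module_statements, program_slice):
    """Percentage of each module's statements that belong to the program slice.

    module_statements: module name -> set of statement ids in that module
    Modules with no statement in the slice are not part of the call-graph
    slice and are omitted from the result.
    """
    result = {}
    for module, stmts in module_statements.items():
        in_slice = len(stmts & program_slice)
        if in_slice:
            result[module] = 100.0 * in_slice / len(stmts)
    return result

# Hypothetical modules and slice.
MODULES = {"main": {1, 2, 3, 4}, "compute": {5, 6}, "report": {7}}
influence = degree_of_influence(MODULES, {1, 2, 5, 6})
```

A value of 100.0 corresponds to a bar that fills the whole rectangle of the module.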

Distribution of influencing statements

After a slice has been computed and the call-graph slice displayed, a programmer frequently wants to analyze in more detail the modules that belong to the slice. To see which statements of a module belong to the slice, the programmer selects the module in the call-graph; its source code is then displayed in a separate window with all statements that belong to the slice highlighted. For relatively small modules this approach works well. However, for modules containing several hundred lines of code the whole source code may not fit into one window. In addition, statements that belong to the slice may be scattered throughout the module. The programmer then has to scroll back and forth to find the highlighted statements, losing some of the context between the statements that belong to the slice and the rest of the module's source code. To improve the textual representation for large modules, we have implemented a feature in our slicing tool that automatically opens an overview pane in the top part of the window, showing the whole module in a line representation. Each line of source code is reduced to a single row of pixels that preserves the coloring (highlighting) of the statements that belong to the slice (this technique is similar to the reduced textual representation of a slice described in [3]). A sample overview pane is shown in Figure 4. One benefit of this line representation is that a programmer can immediately see the distribution (positions) of the statements that belong to the slice. Another is easy navigation through large modules: when a programmer clicks on a highlighted line in the overview pane, the corresponding part of the source code is automatically displayed in the source code window. As a result, the programmer can easily move to those parts of the source code that belong to the slice.
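The overview pane can be modeled roughly as follows. The sketch assumes one pixel row per source line, as described above; the centering policy in `scroll_target` and both function names are illustrative assumptions, not the behavior of the authors' tool.

```python
def overview_rows(num_lines, slice_lines):
    """One entry per source line; True means the pixel row is drawn highlighted."""
    return [line in slice_lines for line in range(1, num_lines + 1)]

def scroll_target(clicked_row, num_lines, window_lines):
    """First source line to show after a click on 0-based pixel row `clicked_row`.

    The clicked line is centered where possible and the window is clamped
    so that it never runs past either end of the module.
    """
    line = clicked_row + 1  # pixel row i represents source line i + 1
    first = line - window_lines // 2
    return max(1, min(first, num_lines - window_lines + 1))

rows = overview_rows(6, {2, 5})
middle = scroll_target(149, num_lines=300, window_lines=40)
top = scroll_target(0, num_lines=300, window_lines=40)
bottom = scroll_target(299, num_lines=300, window_lines=40)
```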

3. Understanding of Program Execution

Traditionally, to understand a program execution, a programmer uses a conventional debugger that supports breakpoints and step-wise execution. Breakpoints allow a programmer to specify places in a program where execution should be suspended. When a breakpoint is reached, the programmer can examine various components of the program state and check the values of program variables. Programmers may also execute a program step by step in order to observe its execution. Conventional debuggers, however, provide no means of identifying the parts of a program and its execution that contribute to the computation of the function of interest. Using a debugger is therefore an inefficient and time-consuming way to understand a program execution, especially when the programmer is interested only in those parts of the execution that relate to a particular program function. The programmer may observe a large amount of unrelated computation, and it is frequently almost impossible for him/her to distinguish related from unrelated computations. To make program understanding more efficient, it is important to reduce the amount of information programmers see and to focus their attention on the related computations. Dynamic slicing provides a means of pruning away unrelated computation, and it may help to narrow down the part of a program that contributes to the computation of a function of interest for a particular program input.

Figure 4. Distribution of influencing statements in a module.

During dynamic slice computation, different types of information are computed, for example, the contributing executed statements. This information is usually discarded after the slice has been computed. We have developed a tool [14] that takes advantage of this already computed information and uses it to support the understanding of program execution on the source code level. In this slicing tool we have developed several slicing-related concepts (e.g., executable slices, partial slices, and influencing variables) that support the understanding of programs and their executions. Our experience with these concepts was very positive for relatively small programs. For large programs, however, a programmer may be overwhelmed by the amount of information, mainly because all of these dynamic features are source code oriented. Since program slices are represented only in textual form, programmers may still have difficulty understanding the program and its behavior. Therefore, it is important to devise methods that support the understanding of the execution of large programs.

3.1 Program execution on the module level

One approach to improving the understanding of program execution is to reduce the amount of detail a programmer has to focus on. In this section we describe several features of our tool that provide support on the module level, i.e., the smallest execution unit that a programmer sees is a particular execution of a module. The tool lets a programmer switch at any time from the module level to the source code level and explore a specific execution of a module, i.e., the sequence of statements executed during that module's execution, using traditional methods or the source code oriented dynamic slicing features described in [14]. A program execution on the module level can be presented on the call-graph level, as an execution tree, or as a module trace. Presenting a program execution on the call-graph level and as an execution tree are techniques already supported by many tools. The module trace, on the other hand, is a new concept.

Execution on the call-graph level

To observe the program execution, programmers may execute a program step by step on the module level, where each currently executed module is highlighted. This approach is very valuable for short executions or for exploring short parts of a program execution. Other features include the ability to set or remove breakpoints on the module level, continuous execution from one breakpoint to another, and "slow-motion" execution. Our tool also provides a "summary" of a particular program execution on the call-graph level: either the executed modules are highlighted in the call-graph, or a call subgraph containing only the executed modules is displayed. This representation lets the programmer easily distinguish between executed and non-executed modules.
The visual representation of a program execution on the call-graph level distinguishes only between executed and non-executed modules; it provides no information about the sequence in which modules were executed during a particular program execution.

Execution on the execution tree level

On the execution tree level a particular program execution is displayed as a tree that captures the execution hierarchy. Each executed module is represented by a rectangle with the module name, and a module call is represented by a line connecting the calling and the called module. The execution tree visually captures not only the sequence of module executions but also the call hierarchy. This approach works well for relatively short executions. Capturing the sequence of the program execution, however, requires that each of the multiple executions of a module, together with its associated calls, be included in the execution tree. These multiple occurrences can very quickly lead to an execution tree so large and complex that it is almost impossible to display, especially for large programs and long executions with frequent module calls inside loops. Several techniques can reduce the size and complexity of the display and improve the visualization of the execution tree. One technique, which reduces the number of modules to be displayed, is to annotate a sequence of module calls with a number representing how often that sequence is repeated. This is especially beneficial for loops and recursion. Other techniques are zooming in and out on the whole execution tree or on parts of it, and collapsing parts of the execution tree that are not of interest to the programmer.

Execution on the module trace level

The module trace can be seen as a variation of the execution tree that captures the sequence of module executions without the calling hierarchy. Each module execution is shown as a bar in the module trace window. All bars have the same width, which is determined by the total number of module executions so that the whole trace fits into one viewing window. As a result, the bars are wide for relatively short executions and narrow for long ones. The height of each bar represents the length of the corresponding module execution, measured by the number of statements executed. The module trace reduces the amount of detail presented to the programmer by omitting the calls between modules, and therefore does not visualize the call hierarchy.
However, the major advantage of this concept is that very long executions can be represented in one viewing window. A sample module trace is shown in Figure 5.
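The geometry of a module trace can be sketched as follows. The text fixes only that all bars share one width derived from the number of module executions and that height reflects the number of executed statements; scaling heights relative to the longest module execution, the dictionary layout, and the name `module_trace_bars` are assumptions.

```python
def module_trace_bars(trace, window_width, window_height):
    """Lay out one bar per module execution so the whole trace fits the window.

    trace: list of (module name, number of statements executed), in execution order
    """
    if not trace:
        return []
    width = window_width / len(trace)      # same width for every bar
    longest = max(n for _, n in trace)     # assumed: tallest bar fills the window
    return [
        {"module": module, "x": i * width, "width": width,
         "height": window_height * n / longest}
        for i, (module, n) in enumerate(trace)
    ]

# Hypothetical execution: compute is executed twice.
TRACE = [("main", 10), ("read_input", 40), ("compute", 20),
         ("compute", 20), ("report", 10)]
bars = module_trace_bars(TRACE, window_width=500, window_height=100)
```

Doubling the number of module executions halves the bar width, which is why long executions produce narrow bars.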

Figure 5. A module trace overview.

This linear representation of a program execution lets the programmer navigate through the execution very easily. The current execution position (module) is highlighted in the module trace and, simultaneously, in the call-graph. A sample module trace together with the call-graph is shown in Figure 6. Additional information is shown in the status bar of the module trace window, including the name of the current module, its execution length, and the percentage of its execution length with respect to the total program execution length. Our implementation also lets the programmer zoom in and out on the module trace to control the level of detail he/she wants to analyze.

Figure 6. A module trace with a call-graph.

The concept of the module trace can be applied not only on the module level but also on the source code level. Our slicing tool also lets the programmer select a specific module and investigate its occurrences in the module trace. Clicking one of the module bars highlights all occurrences of that module in the module trace window (see Figure 7). Frequently a programmer may be interested only in a specific module. The tool can therefore display only the executions of the selected module, hiding the executions of all other modules, so that the programmer can focus on that module alone.

Figure 7. All occurrences of the same module.

Functionality common to the presented concepts for visualizing program execution includes step-wise execution, setting or removing breakpoints, and continuous execution from one breakpoint to another. Note that our slicing tool supports a synchronized, simultaneous display of all of the presented concepts, giving the programmer the flexibility to visualize a program execution on different abstraction levels.

3.2. Dynamic slicing in understanding of program execution

Programmers are frequently interested in understanding the computation of a selected function (or a selected output) for a particular program input rather than for all inputs. Understanding the execution of large programs and/or long program executions can be a very challenging task. In what follows we present concepts related to dynamic program slicing that improve the comprehension of program execution. These concepts were originally introduced for dynamic slicing on the source code level in [14]. In this paper we extend them to the module level. Our tool supports the presented concepts on the source code, call-graph, execution tree, and module trace levels.

Contributing module executions

Certain parts of an execution do not contribute to the computation of the selected function for which a dynamic slice was derived. In many situations a programmer may be interested in stepping only through those parts of the program execution that do contribute. Note that a module may be executed multiple times during a program execution, and some of these module executions may contribute to the computation of the function of interest while others do not. Without tool support, it is almost impossible for a programmer to distinguish contributing from non-contributing module executions at any level of abstraction. We have therefore developed techniques based on dynamic slicing that give programmers a better understanding of a program execution by identifying the contributing module executions.

On the call-graph level, contributing module executions are highlighted during step-wise execution. A module is highlighted if at least one statement executed inside the module contributes during that specific module execution. In this way, contributing and non-contributing module executions are easily shown to the programmer. The major limitation of this approach is that the programmer has to step through almost the whole execution of the program to observe them, which is very time-consuming and probably practical only for relatively short executions. Another approach is to show contributing module executions on the module trace level (or the execution tree level) by highlighting the bars that correspond to contributing module executions in the module trace window. The major advantage of this approach is that the programmer can immediately see all contributing module executions, for all modules or for a selected module.

Degree of contribution

Frequently a programmer may be interested not only in whether a module execution contributes to the computation of a selected function but also in its degree of contribution, i.e., the portion of the module execution that contributes to the computation of the function of interest (the number of executed contributing statements in the module divided by the number of all executed statements in the module). On the call-graph level, contributing module executions are highlighted during step-wise execution, and the degree of contribution is indicated by a bar inside the module's rectangle whose height indicates the degree of contribution. If the bar fills the rectangle, all executed statements in the module contribute to the function of interest. This gives the programmer a simple indication of the degree of contribution of each module without having to analyze the module's source code. On the module trace level (or the execution tree level), the degree of contribution of each module execution is indicated by shading the bar, where the height of the shading indicates the degree of contribution. Figure 8 shows a module trace with the degree of contribution for each executed module. This representation gives the programmer an almost immediate overview of how the degrees of contribution are distributed over the module trace. In many situations a programmer may also be interested in the degrees of contribution for all executions of a selected module. This information can be used to identify abnormalities and unexpected variations in program behavior across the executions of that module. To this end, all executions of the selected module can be grouped together and shown with their degrees of contribution. Figure 9 shows all executions of the same module with their degrees of contribution.
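The highlighting rule and the degree of contribution can be sketched per module execution. The trace layout (module name, executed statements, executed contributing statements), the module names, and both function names are hypothetical; the counts are assumed to come from a dynamic slicing tool, and every module execution is assumed to execute at least one statement.

```python
def contribution_per_execution(trace):
    """Annotate each module execution with (module, contributes, degree).

    trace: list of (module, executed statements, executed contributing statements)
    A module execution contributes if at least one executed statement does;
    its degree is contributing statements divided by executed statements.
    """
    return [(module, n_contrib > 0, n_contrib / n_exec)
            for module, n_exec, n_contrib in trace]

def executions_of(annotated, module):
    """Degrees of contribution for all executions of one selected module."""
    return [degree for name, _, degree in annotated if name == module]

# Hypothetical execution: compute runs twice with different degrees.
TRACE = [("main", 10, 5), ("compute", 20, 20), ("log", 8, 0), ("compute", 20, 5)]
annotated = contribution_per_execution(TRACE)
compute_degrees = executions_of(annotated, "compute")
```

Grouping the degrees for one module, as `executions_of` does, is the kind of view in which an unexpected variation between executions would stand out.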

Figure 8. Module trace with degree of contribution.

Figure 9. All executions of a selected module with degrees of contribution.

Figure 10. Summary of the degrees of contribution on the module level.

Frequently a programmer may be interested in the total degree of contribution over all executions of each module that belongs to the slice. Figure 10 presents such a summary, where each bar represents a module and the height of the shading in each bar indicates the total degree of contribution over all executions of that module. This lets a programmer identify, and focus his/her attention on, those modules that contain the largest number of contributing statements.
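One plausible reading of the total degree of contribution is the ratio of all contributing executed statements to all executed statements, summed over every execution of a module. The paper does not spell out the formula, so this aggregation, the trace layout, and the name `total_contribution` are assumptions.

```python
from collections import defaultdict

def total_contribution(trace):
    """Total degree of contribution per module over all of its executions.

    trace: list of (module, executed statements, executed contributing statements)
    Modules with no contributing statement are not in the slice and are omitted.
    """
    executed = defaultdict(int)
    contributing = defaultdict(int)
    for module, n_exec, n_contrib in trace:
        executed[module] += n_exec
        contributing[module] += n_contrib
    return {m: contributing[m] / executed[m]
            for m in executed if contributing[m] > 0}

# Hypothetical execution: compute runs twice.
TRACE = [("main", 10, 5), ("compute", 20, 20), ("log", 8, 0), ("compute", 20, 5)]
summary = total_contribution(TRACE)
ranked = sorted(summary, key=summary.get, reverse=True)
```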

Partial Program Slicing

The concept of a partial slice was originally introduced for dynamic slicing on the source code level in [14]. A partial slice is the part of a dynamic slice that affects the computation of the function of interest within a selected subtrace. Partial slicing allows a programmer to observe the "dynamics" of a program execution on selected subtraces, e.g., the execution of a module or a loop iteration. In our tool the concept has been extended to the module level, where a partial slice consists of all modules that contribute to the function of interest in the selected subtrace. Partial slicing is very useful for analyzing the effects on the function of interest of a long module execution that includes many calls to other modules. To compute a partial slice, the programmer indicates the beginning and the end of the subtrace of interest. On the module level the partial slice can be shown as a call subgraph or by highlighting the modules that belong to it.

Influencing Variables

During program comprehension an important question is which variables should be observed in order to understand the program’s behavior. The concept of influencing variables identifies those variables at a specific program position that influence the function of interest. Influencing variables were introduced on the source code level in [14]. In our tool the concept has been extended to the call-graph level (notice that on this level only global variables and modules' parameters are "visible"). Showing influencing global variables and module parameters may be useful for large programs: by focusing attention only on those variables that gained or lost influence inside a module, a programmer can easily identify the “existence” or “lack” of influence inside that module.
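The module-level partial slice described above can be sketched as a filter over a selected subtrace. The trace layout (one entry per executed statement, each tagged with its module and a flag from the dynamic slice), the 0-based inclusive bounds, and the name `partial_module_slice` are assumptions made for illustration.

```python
def partial_module_slice(trace, begin, end):
    """Modules that contribute to the function of interest within a subtrace.

    trace: list of (module, contributes) per executed statement, in execution order
    begin, end: 0-based, inclusive positions delimiting the subtrace of interest
    """
    return {module for module, contributes in trace[begin:end + 1] if contributes}

# Hypothetical execution trace with contribution flags from a dynamic slice.
TRACE = [("main", True), ("init", False), ("compute", True),
         ("compute", False), ("log", False), ("report", True)]

inner = partial_module_slice(TRACE, 1, 4)    # a subtrace in the middle
whole = partial_module_slice(TRACE, 0, 5)    # the whole execution
```

Choosing the whole execution as the subtrace yields the set of modules in the full dynamic slice.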

4. Conclusions

In this paper we have presented dynamic slicing features that support program understanding and the understanding of program executions on the module level. Several static and dynamic program slicing features have been proposed on the call-graph level, the execution tree level, and the module trace level. These features have evolved during experimentation with our dynamic slicing tool, which can be used for program understanding during software maintenance. We have performed experiments with programs of up to several thousand lines and with executions of up to 100,000 executed statements. Our preliminary experience has shown that these features can be of great help to programmers during program understanding. However, more research and experimentation are needed in order to better understand the advantages and limitations of these features.

References

[1] H. Agrawal, J. Horgan, "Dynamic program slicing," SIGPLAN Notices, No. 6, 1990, pp. 246-256.
[2] H. Agrawal, R. DeMillo, E. Spafford, "Debugging with dynamic slicing and backtracking," Software Practice & Exp., vol. 23, No. 6, 1993, pp. 589-616.
[3] T. Ball, S. Eick, "Software Visualization in the Large," Computer, vol. 29, No. 4, April 1996.
[4] D. Binkley, K. Gallagher, "Program Slicing," Advances in Computers, vol. 43, Academic Press, 1996, pp. 1-52.
[5] K. Gallagher, J. Lyle, "Using program slicing in software maintenance," IEEE Trans. on Software Eng., vol. 17, No. 8, 1991, pp. 751-761.
[6] R. Gopal, "Dynamic program slicing based on dependence relations," Proc. of the Conf. on Software Maintenance, 1991, pp. 191-200.
[7] R. Gupta, M. Harrold, M. Soffa, "An approach to regression testing using slicing," Conference on Software Maintenance, 1992, pp. 299-308.
[8] S. Horwitz, T. Reps, D. Binkley, "Interprocedural slicing using dependence graphs," Trans. on Progr. Lang. and Systems, vol. 12, No. 1, 1990, pp. 26-60.
[9] M. Kamkar, Interprocedural Dynamic Slicing with Applications to Debugging and Testing, Ph.D. Thesis, Linkoping University, 1993.
[10] B. Korel, J. Laski, "Dynamic program slicing," Information Processing Letters, vol. 29, No. 3, 1988, pp. 155-163.
[11] B. Korel, "PELAS - Program Error Locating Assistant System," IEEE Trans. on Software Eng., vol. SE-14, No. 9, 1988, pp. 1253-1260.
[12] B. Korel, S. Yalamanchili, "Forward Derivation of Dynamic Slices," Proc. of the Intern. Symposium on Software Testing and Analysis, 1994, pp. 66-79.
[13] B. Korel, "Computation of dynamic program slices for unstructured programs," IEEE Trans. on Software Eng., vol. 23, No. 1, 1997, pp. 17-34.
[14] B. Korel, J. Rilling, "Dynamic Program Slicing in Understanding of Program Execution," The 5th Intern. Workshop on Progr. Comprehension, 1997, pp. 80-90.
[15] J. Lyle, M. Weiser, "Experiments on slicing-based debugging tools," Proc. of the 1st Conf. on Empirical Studies of Programming, 1986, pp. 187-197.
[16] M. Weiser, "Program slicing," IEEE Trans. on Software Eng., vol. SE-10, No. 4, 1984, pp. 352-357.
[17] L. White, H. Leung, "Regression testability," IEEE Micro, April 1992, pp. 81-85.