Consideration of Optimizing Compilers in the Context of WCET Analysis 

Raimund Kirner

Institut für Technische Informatik
Technische Universität Wien
A-1040 Wien, Austria

Supervisor: Prof. Dr. Peter Puschner
Type of work: Dissertation
GI division: 2

Abstract

This paper presents a WCET analysis concept based on the analysis of source code written in a high-level programming language. Additional information about the timing behaviour of the program is given as annotations. The compiler transforms these annotations down to the assembly code level, and this transformation also works when the compiler performs code optimizations. Experiments show examples of effects that can arise and be handled by this WCET analysis concept.

Introduction

Worst-case execution time (WCET) analysis deals with the static calculation of the maximum runtime of a program. The design of systems with safety-critical requirements depends heavily on knowledge of the WCET of their software components. Therefore, in recent years WCET analysis has become an acknowledged part of the design of real-time systems. There are several approaches to calculating the WCET of a program. The input for WCET analysis can be the assembly/object code of a program as well as the high-level program source code [3, 4, 5]. For the latter approach, the WCET analysis is in general still done at object code level to derive accurate results. More comfortable concepts allow the additional timing information required for WCET analysis to be given at source code level. Methods are then necessary to transform the information about the program structure and the timing behaviour from the source code to the object code level. This can be done by integrating the transformation process into the compiler, or by tools that try to transform this information outside the compiler, in parallel to the compilation process.

High-level WCET Analysis

The WCET calculation method evaluated in this paper is based on the static analysis of source code written in a high-level programming language. The syntax of the programming language is derived from ANSI C, with extensions to express additional timing information about the program inside the language [1]. The additional timing information is given as loop bounds, markers and scopes, which are described in [2, 3]. The WCET analysis itself is done at assembly level to enable accurate results. The annotated timing information is extracted from the source code and transformed in parallel to the compilation process by the compiler, which is based on the GNU C compiler GCC. The output of the compiler is annotated assembly code, which is processed by a WCET tool [1, 4]. The WCET tool is based on integer linear programming and calculates the worst-case execution time for every assembly statement/source line. The transformation of the program structure and timing behaviour is straightforward when the compiler performs no code optimizations. With code optimizations it can be very difficult to keep track of the binding between the annotated information and the generated assembly code. Therefore this transformation was integrated into the compiler. This also enables automatic WCET calculation with no additional user interaction.

This work has been supported by the IST research project "Systems Engineering for Time-Triggered Architectures (SETTA)" under contract IST-10043.
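The integer linear programming step of the WCET tool can be illustrated with a generic formulation. This is only a sketch in the style of implicit path enumeration, not necessarily the exact model used by the tool in [1, 4]. The symbols are assumptions introduced here for illustration: $c_i$ is the cycle cost of code block $i$, $x_i$ its execution count, $f_e$ the traversal count of control-flow edge $e$, and $n_L$ the annotated iteration bound of loop $L$.

```latex
\begin{align*}
\text{WCET} \;=\; \max \; & \sum_{i} c_i \, x_i \\
\text{subject to} \quad
  & x_i = \sum_{e \in \mathrm{in}(i)} f_e = \sum_{e \in \mathrm{out}(i)} f_e
    && \text{(flow conservation for every block } i\text{)} \\
  & x_{\mathrm{entry}} = 1
    && \text{(the program is entered once)} \\
  & x_{\mathrm{body}(L)} \le n_L \cdot f_{\mathrm{enter}(L)}
    && \text{(annotated bound of every loop } L\text{)} \\
  & x_i,\; f_e \in \mathbb{N}_0
\end{align*}
```

In this reading, the program entry and exit are assumed to carry one virtual edge each so that flow conservation also holds there. A loop annotation such as `maximum (N_EL - 1) iterations` with N_EL = 10 would supply $n_L = 9$.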

Experiments

A prototype of a WCET tool chain was implemented at our department to visualize the effects of using optimizing compilers to transform timing information in parallel to the code generation process. The chosen target hardware for the WCET calculation in this work is the 16/32-bit CPU M68000 from Motorola. From the point of view of WCET analysis it is categorized as 'simple' hardware, which simplifies the calculation of tight upper bounds of the maximum execution time. It does not have performance-enhancing features like caches, register windows, instruction prefetching, asynchronous instruction execution order or pipelining. The absence of a pipeline also rules out branch prediction. The WCET of each instruction depends on the type of the input parameters and can be taken directly from a data sheet. For instructions whose execution time depends on the value of the input parameters (e.g. multiplication or division), the maximum execution time is taken for the WCET calculation. This hardware is a good base for WCET calculation and for showing the correctness of the method.

/* processor: m68000 */
/* memory wait states (r/w): 0/ 0 */

----- CYCLES(bubble) = 47034 -----
 1|---------------------------#define N_EL 10
 2|---------------------------
 3|---------------------------
 4|---------------------------/* Sort an array of 10 elements with bubble-sort */
 5|---------------------------void bubble (int arr[])
 6| 1|   16,    0(   1,   0) -{
 7|---------------------------  /* Definition of local variables */
 8|---------------------------  int i, j, temp;
 9|---------------------------
10|---------------------------  /* Main body */
11| 3|   24,    0(   1,   0) -  for (i=N_EL;
12| 4|  228,  100(  10,   9) -       i > 1;
13| 2|  216,   90(   9,   9) -       i--)
14|---------------------------       maximum (N_EL - 1) iterations
15|---------------------------  {
16| 2|  180,    0(   9,   0) -    for (j = 2;
17| 4| 3132,  900(  90,  81) -         j [...] arr[j])
22|---------------------------      {
23| 9| 7614,    0(  81,   0) -        temp = arr[j-1];
24|14|11988,    0(  81,   0) -        arr[j-1] = arr[j];
25| 6| 6642,    0(  81,   0) -        arr[j] = temp;
26|---------------------------      }
27|---------------------------    }
28|---------------------------  }
29| 2|   28,    0(   1,   0) -}

Figure 1: Back-Annotation of Bubble-Sort Algorithm (O0)

The results of the WCET calculation are visualized by back-annotating them into the source code. Each back-annotated line has the following format:

AAA|BB|CCCCC,DDDDD(EEEE,FFFF) -

The meaning of the elements is as follows:

AAA is the line number in the corresponding source program.
BB is the number of assembly instructions generated for the corresponding source line.
CCCCC is the number of CPU cycles for the assembly instructions with a sequential successor instruction (a sequential successor instruction is the next instruction following in the assembly source).
DDDDD is the number of CPU cycles for the assembly instructions with a non-sequential successor instruction.
EEEE,FFFF are counters for the maximum sequential and non-sequential iteration count of the assembly instructions generated for the source line.

A simple algorithm, bubble-sort, was used to analyze the effects of different compiler optimization levels. Bubble-sort is used to sort an array of data values.
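The back-annotation format maps naturally onto a small record. The following C sketch reads one row into such a record. The type and function names (`annotation_row`, `parse_annotation_row`, `row_cycles`) are hypothetical and not part of the tool chain, and treating the two cycle columns as additive is an assumption made here for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* One back-annotated row: AAA|BB|CCCCC,DDDDD(EEEE,FFFF) */
typedef struct {
    int  line;          /* AAA: source line number */
    int  instructions;  /* BB: assembly instructions generated for the line */
    long seq_cycles;    /* CCCCC: cycles with a sequential successor */
    long nonseq_cycles; /* DDDDD: cycles with a non-sequential successor */
    long seq_count;     /* EEEE: maximum sequential iteration count */
    long nonseq_count;  /* FFFF: maximum non-sequential iteration count */
} annotation_row;

/* Parse one row. Returns 1 if the row carries timing data and 0 for a
   dashed row (no instructions generated) or any other mismatch. On a 0
   result the contents of *row are only partially filled in. */
int parse_annotation_row(const char *text, annotation_row *row)
{
    assert(text != NULL && row != NULL);
    int fields = sscanf(text, " %d | %d | %ld , %ld ( %ld , %ld )",
                        &row->line, &row->instructions,
                        &row->seq_cycles, &row->nonseq_cycles,
                        &row->seq_count, &row->nonseq_count);
    return fields == 6;
}

/* Assumption: the cycles attributed to a row are the sum of both columns. */
long row_cycles(const annotation_row *row)
{
    return row->seq_cycles + row->nonseq_cycles;
}
```

Source text following the numeric fields is ignored by the parser, so a complete row from a back-annotated listing can be passed in unchanged.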

/* processor: m68000 */
/* memory wait states (r/w): 0/ 0 */

----- CYCLES(bubble) = 8417 -----
 1|---------------------------#define N_EL 10
 2|---------------------------
 3|---------------------------
 4|---------------------------/* Sort an array of 10 elements with bubble-sort */
 5|---------------------------void bubble (int arr[])
 6| 4|   56,    0(   1,   0) -{
 7|---------------------------  /* Definition of local variables */
 8|---------------------------  int i, j, temp;
 9|---------------------------
10|---------------------------  /* Main body */
11| 1|    4,    0(   1,   0) -  for (i=N_EL;
12| 3|   98,   80(   9,   8) -       i > 1;
13| 1|   36,    0(   9,   0) -       i--)
14|---------------------------       maximum (N_EL - 1) iterations
15|---------------------------  {
16|---------------------------    for (j = 2;
17|10|  486,  720(   9,  72) -         j [...] arr[j])
22|---------------------------      {
23|---------------------------        temp = arr[j-1];
24| 1| 1296,    0(  81,   0) -        arr[j-1] = arr[j];
25| 1|  972,    0(  81,   0) -        arr[j] = temp;
26|---------------------------      }
27|---------------------------    }
28|---------------------------  }
29| 4|   52,    0(   1,   0) -}

Figure 2: Back-Annotation of Bubble-Sort Algorithm (O3)

Optimization level        WCET (CPU cycles)
none (O0)                 47034
full (O3)                 8417
full (O3), with marker    4997

Table 1: Calculated WCET for Bubble-Sort Algorithm

In this work, bubble-sort is implemented to sort the elements in ascending order. The main structure of this algorithm consists of two nested loops. The outer loop is executed $(\mathrm{N\_EL} - 1)$ times ($\mathrm{N\_EL}$ is the number of data entries). The inner loop is executed at most $(\mathrm{N\_EL} - 1)$ times. In a first approach, the resulting complexity class of this algorithm is $O((\mathrm{N\_EL} - 1)^2)$. The overall WCET of the algorithm for both optimization levels is given in Table 1. This result makes it clear that, for optimizing compilers, WCET analysis has to be done at assembly language level to provide tight upper bounds of the WCET.

The back-annotations for the non-optimizing compiler are given in Figure 1. Line 12 shows that the loop header is executed $\mathrm{N\_EL}$ times. This is because the loop condition has to be tested for every loop iteration and also for the normal exit of the loop. The test of the inner loop (line 18, Figure 1) is done 90 times ($(\mathrm{N\_EL} - 1) \cdot \mathrm{N\_EL}$), but the resulting branching is done only 81 times ($(\mathrm{N\_EL} - 1) \cdot (\mathrm{N\_EL} - 1)$). Line 19 shows that no instructions are generated for loop annotations.

The back-annotations for the optimizing compiler are given in Figure 2. The iteration bound of the inner loop depends on the value of a variable which is changed inside the outer loop. This causes a pessimistic estimation of the WCET for both examples. The quality of the calculated WCET can be improved by extending the timing model for the bubble-sort algorithm. When the change of the inner loop variable is taken into account, the resulting complexity class of this algorithm is $O\left(\frac{(\mathrm{N\_EL} - 1) \cdot \mathrm{N\_EL}}{2}\right)$. This behaviour can be modelled with the static information provided by markers and scopes. This modification leads to a significantly better result.
The overall WCET of the algorithm at full optimization level using markers and scopes is also given in Table 1. The back-annotations for the optimizing compiler when using markers and scopes are given in Figure 3. The result is nearly the same as without markers and scopes, except for the statements inside the inner loop of the algorithm. This example shows the importance of building a precise timing model of the program by annotations.
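The difference between the two timing models comes down to how often the inner loop body is counted. The following C sketch compares both counts. The function names are hypothetical, and the inner loop condition `j <= i` is an assumption, since the condition is not legible in the listings.

```c
#include <assert.h>

/* Inner-loop body executions when every outer iteration is charged the
   inner loop's own bound of (n_el - 1) iterations. */
long rectangular_count(int n_el)
{
    assert(n_el >= 1);
    return (long)(n_el - 1) * (n_el - 1);
}

/* Inner-loop body executions when the inner bound follows the outer loop
   variable. Assumption: the inner loop is `for (j = 2; j <= i; j++)`. */
long triangular_count(int n_el)
{
    assert(n_el >= 1);
    long count = 0;
    for (int i = n_el; i > 1; i--) {
        for (int j = 2; j <= i; j++) {
            count++;
        }
    }
    return count;
}
```

For N_EL = 10 the two functions give 81 and 45, which are the iteration counts shown for the statements inside the inner loop without and with markers and scopes.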

/* processor: m68000 */
/* memory wait states (r/w): 0/ 0 */

----- CYCLES(bubble) = 4997 -----
 1|---------------------------#define N_EL 10
 2|---------------------------
 3|---------------------------
 4|---------------------------/* Sort an array of 10 elements with bubble-sort */
 5|---------------------------void bubble (int arr[])
 6| 4|   56,    0(   1,   0) -{
 7|---------------------------  /* Definition of local variables */
 8|---------------------------  int i, j, temp;
 9|---------------------------
10|---------------------------  /* Main body */
11|---------------------------  scope BS
12|---------------------------  {
13| 1|    4,    0(   1,   0) -    for (i=N_EL;
14| 3|   98,   80(   9,   8) -         i > 1;
15| 1|   36,    0(   9,   0) -         i--)
16|---------------------------         maximum (N_EL - 1) iterations
17|---------------------------    {
18|---------------------------      for (j = 2;
19|10|  486,  360(   9,  36) -           j [...] arr[j])
25|---------------------------        {
26|---------------------------          temp = arr[j-1];
27| 1|  720,    0(  45,   0) -          arr[j-1] = arr[j];
28| 1|  540,    0(  45,   0) -          arr[j] = temp;
29|---------------------------        }
30|---------------------------      }
31|---------------------------    }
32|---------------------------    restriction M
