Session 2

SEM4HPC ’17, June 26, 2017, Washington, DC, USA.

PRESGen: A Fully Automatic Equivalence Checker for Validating Optimizing and Parallelizing Transformations

Soumyadip Bandyopadhyay
Birla Institute of Technology and Science, Pilani, K K Birla Goa Campus, India
Dept. of Computer Science and Information Systems
[email protected]

Kunal Banerjee
Indian Institute of Technology, India
Dept. of Computer Science and Engineering
[email protected]

to carry information [28–30]. This modeling formalism has well-defined semantics for the precise representation of systems.

Code optimizations are applied at the preprocessing stage of embedded system synthesis [31–38]; if carried out by untrusted compilers, these code transformations can result in software bugs. Hence, it is important to verify whether the implemented code faithfully represents the intended functionality. Translation validation, whereby each individual translation is followed by a validation phase to establish the behavioural equivalence of the source code and the target code, was introduced by Pnueli et al. in [39] and was demonstrated to handle various code optimizations by Necula in [40] and Rinard et al. in [41]. This method was further enhanced by Kundu et al. [42] to verify the high-level synthesis tool SPARK, which can capture parallel execution of statements. The work in [43] reports a method for validating some parallelizing transformations. A major limitation of these methods [39–43] is that they can verify only structure-preserving transformations and invariably fail for schedulers that alter the control structure of a program [44]. To alleviate this shortcoming, a path-based equivalence checker for FSMD models (which are essentially sequential control and data flow graphs (CDFGs)) was proposed in [45, 46]; it was later extended to handle more sophisticated uniform and non-uniform code motions in [47] and code motions across loops in [48]. These checkers, however, cannot handle thread-level parallelizing transformations, mainly because the FSMD, being a sequential model of computation (MoC), cannot capture parallel behaviours directly.
The work described in [49] proposed a translation algorithm from a PRES+ model to an FSMD model and then used the existing FSMD equivalence checker of [45] to establish equivalence between the initial and the optimized versions of a program, both of which were originally represented using PRES+ models. However, in this reported method, the construction of the PRES+ models from the original programs was carried out manually. The authors of [50] reported a method for automated construction of Petri net models from a high-level language program, where the source program is converted into an intermediate representation (IR), such as an abstract syntax tree, for the various modules. Subsequently, for each module, the method constructs the corresponding sub-net; after processing all the modules, the method merges the sub-nets to produce the final Petri net. Their method captures only the control structure; no data analysis is carried out. In our work, we present a technique for automated construction of PRES+ models from high-level language programs, specifically from C. Having obtained the PRES+ models from the original and the transformed programs using the method presented here, we employ the PRES+ to FSMD translator of [49, 51] to obtain the corresponding FSMDs and finally

ACM Reference format: Soumyadip Bandyopadhyay and Kunal Banerjee. 2017. PRESGen: A Fully Automatic Equivalence Checker for Validating Optimizing and Parallelizing Transformations. In Proceedings of SEM4HPC’17, Washington, DC, USA, June 26, 2017, 8 pages. https://doi.org/10.1145/3085158.3086158

1 INTRODUCTION

With the advent of autonomous driving, virtual reality, specialized medicine manufacturing and so on, it has become a necessity to empower embedded systems with high performance computing (HPC). These systems are embedded in a diverse range of devices, from personal assistant systems to automobiles. Typically, they are reactive, application-specific, and have real-time constraints. Hence, efficiency and dependability are key concerns for these systems. Considerable effort has been made to incorporate concurrent applications in embedded systems [1–9] and to support their programming [10–22]. However, to exhibit high performance, such programs have to be highly sophisticated while being largely aware of the underlying architecture. These requirements often lead to intricacies which are difficult to analyze. Hence, there is a growing need to enhance the current methods for designing and validating embedded systems targeted for HPC.

A lot of investigation has been carried out in modeling and formally verifying embedded systems in the last two decades. A comprehensive list of models proposed to represent embedded systems and their validation can be found in [23–27]. These models encompass a broad range of styles, characteristics, and application domains and include extensions of finite state machines, data flow graphs, communicating processes and Petri nets. Petri nets are especially suited for modeling concurrent behaviours. The Petri net based Representation for Embedded Systems (PRES+) model enhances the classical Petri net model to capture computation over integers, reals and general data structures; it captures concurrency and timing behaviour of embedded systems by allowing the tokens

K. Banerjee is presently a Research Scientist at Intel Parallel Computing Lab, Bangalore, India.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. SEM4HPC’17, June 26, 2017, Washington, DC, USA © 2017 Association for Computing Machinery. ACM ISBN 978-1-4503-5000-6/17/06...$15.00 https://doi.org/10.1145/3085158.3086158


employ the FSMD equivalence checker of [48] to determine the equivalence of the original and the transformed programs. Note that the authors of [49] used the FSMD equivalence checker reported in [45], whereas we use the one described in [48], since the latter can handle a larger class of transformations, such as code motions across loops. The whole equivalence checking technique has been implemented in our tool, called PRESGen. The experimental results demonstrate the efficiency of the tool.

The rest of the paper is organized as follows. Section 2 presents an overview of the tool. The utility of the PRES+ model is discussed in section 3. The automated construction of PRES+ models from high-level language programs is described in section 4. The results obtained when the procedure was tested on some benchmarks are given in section 5. Section 6 concludes the paper.

as loop reordering (as shown in Figure 2) are carried out by some compiler, then FSMD equivalence checkers cannot handle these transformations in a straightforward manner. Since the data flow is captured more explicitly in a PRES+ model, the PRES+ models of the original behaviour and the transformed behaviour are identical. However, the path construction method for PRES+ models [54] is very costly, so path-based equivalence checking between two PRES+ models is also very time consuming. It is to be noted that after execution of the translation algorithm in [49], the FSMDs obtained for the original behaviour and the transformed behaviour are also the same. Accordingly, the FSMD equivalence checker is able to handle this transformation. For the FSMD model, path construction is very simple; therefore, path-based equivalence checking between two FSMDs is much faster than path-based equivalence checking between two PRES+ models.

Figure 2(a) and Figure 2(b) show the original and the transformed behaviours, respectively, for the loop reordering transformation. In this transformation, the loop variables do not depend on each other. Figure 2(c) shows the PRES+ model, which is the same for both the original and the transformed behaviour. It is to be noted that a transition having ft = + with more than two pre-places indicates application of the function an appropriate number of times.

2 TOOL OVERVIEW

Figure 1 shows the structure of the present tool, PRESGen. The tool is used for translation validation of optimizing and/or parallelizing compilers. For this purpose, a C program is given as input to the first module (the model constructor), which constructs a PRES+ model. After code optimization by the compiler, we get the transformed C code. Another PRES+ model is obtained from this transformed code by again using the first module. The PRES+ to FSMD translator [49] (the second module of the tool) translates both the original and the optimized/parallelized PRES+ models into the corresponding FSMD models. These two FSMD models are then fed as inputs to the equivalence checker [48], i.e., the third module of the tool.

In this paper, we describe only the first module of the tool, the model constructor. From the C code, it first generates three-address code using existing technology (flex and bison). Thereafter, the basic blocks of the three-address code are formed. Subsequently, for each basic block, data dependency analysis is carried out. From the data dependency graphs, the sub-nets of the PRES+ model are constructed; each sub-net corresponds to one basic block. During construction of a sub-net, the maximum level of parallelization is exploited. Finally, the sub-nets of the PRES+ model are merged according to the control flow between the basic blocks.
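The basic block formation step mentioned above can be illustrated with the textbook "leader" rule. The following is a minimal sketch, not the PRESGen implementation: the textual three-address format (labels written as `L1:`, jumps written as `goto L1` at the end of an instruction) and the function name `split_basic_blocks` are assumptions made only for this illustration.

```python
import re


def split_basic_blocks(tac):
    """Partition a list of three-address instructions into basic blocks.

    Assumed (hypothetical) format: an instruction may start with a label
    such as "L1:" and may end with "goto <label>".
    """
    # Map each label to the index of the instruction that carries it.
    label_at = {}
    for idx, ins in enumerate(tac):
        m = re.match(r"(\w+):", ins)
        if m:
            label_at[m.group(1)] = idx

    # Leaders: the first instruction, every jump target, and every
    # instruction that immediately follows a jump.
    leaders = {0} if tac else set()
    for idx, ins in enumerate(tac):
        m = re.search(r"\bgoto (\w+)$", ins)
        if m:
            leaders.add(label_at[m.group(1)])
            if idx + 1 < len(tac):
                leaders.add(idx + 1)

    # Each block runs from one leader up to (not including) the next.
    starts = sorted(leaders)
    return [tac[s:e] for s, e in zip(starts, starts[1:] + [len(tac)])]


tac = [
    "i = 0",
    "L1: if i >= n goto L2",
    "t1 = a + b",
    "s = s + t1",
    "i = i + 1",
    "goto L1",
    "L2: r = s",
]
blocks = split_basic_blocks(tac)
for number, block in enumerate(blocks):
    print(number, block)
```

In this example the loop header, the loop body and the exit statement end up in separate blocks, which is the granularity at which the sub-nets are later built and merged.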

4 AUTOMATED PRES+ CONSTRUCTION METHOD

As mentioned in section 2, the model constructor first generates three-address code from the C code using existing technology (flex and bison). Thereafter, the basic blocks of the three-address code are formed. Subsequently, for each basic block, data dependency analysis is carried out using the model construction algorithm. The model construction algorithm consists of four modules. The central module of the construction method takes the set of basic blocks as input and produces the PRES+ model as output. For each basic block, it checks whether the block is a normal basic block, a conditional basic block, or a basic block containing a loop, and constructs the sub-net accordingly. The following function modules are involved in automated PRES+ model construction.

processNormalBasicBlock: The function takes a basic block and a PRES+ model as inputs and returns a sub-net corresponding to that normal basic block. The function first constructs the data dependency graph (DDG) from the input basic block. It then performs reachability analysis on the DDG to identify the parallel statements within that basic block, and constructs a PRES+ sub-net corresponding to each of the parallel statements. The out-places of the sub-net are designated as synchronization places, so that the control flow is preserved through those places when the next basic block is processed.

processConditionalBasicBlock: The function takes a basic block and a PRES+ model as inputs and updates the sub-net corresponding to this basic block in the PRES+ model. First, it obtains the condition of execution from the three-address code. Then the function identifies the operator used in the condition. After that, it creates a place. Then it creates two transitions, one guarded by the condition of execution and the other by its negation. Finally, it creates output places for the two transitions and returns the sub-net of the PRES+ model.
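The two function modules described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: statements are given as hypothetical `(dest, operands)` pairs, the DDG here records read-after-write, write-after-read and write-after-write dependences (the paper does not say which dependences it tracks), two statements are treated as parallel when neither reaches the other in the DDG, and all names (`build_ddg`, `parallel_pairs`, `conditional_subnet`, `p_in`, `t_true`, and so on) are invented for this example.

```python
from itertools import combinations


def build_ddg(block):
    """block: list of (dest, operands) pairs in program order.

    Returns adjacency sets; an edge i -> j means statement j depends on i.
    """
    edges = {i: set() for i in range(len(block))}
    for i, j in combinations(range(len(block)), 2):
        dest_i, uses_i = block[i]
        dest_j, uses_j = block[j]
        # Assumed dependence kinds: RAW, WAR and WAW.
        if dest_i in uses_j or dest_j in uses_i or dest_i == dest_j:
            edges[i].add(j)
    return edges


def reachable(edges, src):
    """Set of nodes reachable from src by one or more DDG edges."""
    seen, stack = set(), [src]
    while stack:
        for nxt in edges[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def parallel_pairs(block):
    """Pairs (i, j), i < j, with no dependence path from i to j.

    Edges only point forward in program order, so j can never reach i.
    """
    edges = build_ddg(block)
    reach = {i: reachable(edges, i) for i in edges}
    return [(i, j) for i, j in combinations(range(len(block)), 2)
            if j not in reach[i]]


def conditional_subnet(cond):
    """cond: (lhs, op, rhs) taken from a three-address conditional jump.

    Builds one input place, two guarded transitions (the condition and
    its negation) and one output place per transition.
    """
    lhs, op, rhs = cond
    negate = {"<": ">=", ">=": "<", ">": "<=", "<=": ">",
              "==": "!=", "!=": "=="}
    return {
        "places": ["p_in", "p_true", "p_false"],
        "transitions": {
            "t_true": {"pre": ["p_in"], "guard": f"{lhs} {op} {rhs}",
                       "post": ["p_true"]},
            "t_false": {"pre": ["p_in"],
                        "guard": f"{lhs} {negate[op]} {rhs}",
                        "post": ["p_false"]},
        },
    }


# t1 = a + b; t2 = c * d; t3 = t1 + t2; e = a - c
block = [("t1", ("a", "b")), ("t2", ("c", "d")),
         ("t3", ("t1", "t2")), ("e", ("a", "c"))]
pairs = parallel_pairs(block)
subnet = conditional_subnet(("i", "<", "n"))
print(pairs)
print(subnet["transitions"]["t_false"]["guard"])
```

In the example block, the first two statements are independent and can be mapped to concurrently enabled transitions, while the third must wait for both. A full constructor would additionally create the places and transitions for each statement and mark the out-places as synchronization places, which this sketch omits.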

3 UTILITY OF PRES+ MODEL FOR TRANSLATION VALIDATION

While FSMDs are suitable for capturing sequential behaviours, PRES+ models are suitable for representing parallel behaviours. Many equivalence checkers for FSMD models already exist [47, 48, 52, 53]. Although PRES+ models can express timing information, which is not supported by FSMDs, timing constraints are inconsequential for demonstrating data transformation equivalence between behaviours; this fact allows us to perform equivalence checking using FSMDs. The work in [49] formulates an algorithm to translate a PRES+ model into an FSMD model. This translation mechanism is very beneficial because direct equivalence checking between two FSMDs needs path extension, which is very costly. However, since the translation algorithm relies only on symbolic execution, the equivalence checking phase between the two constructed FSMDs does not need the path extension method. Moreover, if transformations such


[Figure 1: Tool overview. Blocks shown: C prog, Compiler, Trans C prog, Model constructor (1), PRES+, Translator (2), FSMD eqv checker (3).]

[Figure 2 residue, truncated in the source: int i=0, j=0, k; while (i]
