
A Model-Driven Automatically-Retargetable Debug Tool for Embedded Systems

Max R. de O. Schultz, Alexandre K.I. Mendonça, Felipe G. Carvalho, Olinto J.V. Furtado, and Luiz C.V. Santos

Federal University of Santa Catarina, Computer Science Department, Florianópolis, SC, Brazil
{max, mendonca, fgcarval, olinto, santos}@inf.ufsc.br

Abstract. Contemporary SoC designs ask for system-level debugging tools suitable for heterogeneous platforms. Such tools will have to rely on some low-level model-driven debugging engine that must be retargetable, since embedded code may run on distinct processors within the same platform. This paper describes a technique for automatically retargeting debugging tools for embedded code inspection. The technique relies on two key ideas: automatic extraction of machine-dependent information from a formal model of the processor and reuse of a conventional binary utility package as implementation infrastructure. The retargetability of the technique was experimentally validated for the MIPS, SPARC, PowerPC and i8051 targets.

1 Introduction

Modern embedded systems are often implemented as systems-on-chip (SoCs) whose optimization requires design space exploration. Alternative CPUs may be explored so as to minimize code size and power consumption, while ensuring enough performance to fulfill real-time constraints. Therefore, design space exploration requires the generation, inspection and evaluation of embedded code for distinct target processors.

Besides, contemporary SoC designs ask for system-level debugging tools suitable for heterogeneous platforms. Such tools will have to rely on some low-level model-driven debugging engine that must be retargetable, since embedded code may run on distinct processors within the same platform. As manual retargeting is unacceptable under time-to-market pressure, automatically retargetable tools are mandatory.

Retargetable tools [1] automatically extract machine-dependent information from a processor model, usually written in some architecture description language (ADL). To prevent the tools from being tied to a given ADL, an abstract processor model could be envisaged. To be practical, such a model should be synthesizable from a description written in some ADL.

Figure 1 describes a typical model-driven tool chain. It summarizes distinct classes of information flow (tool generation, code generation, code inspection and code evaluation). Exploration consists of four major steps, as follows. First, given a target processor model, code generation tools (compiler backend, assembler and link editor), code inspection tools (disassembler and debugger) and an instruction-set simulator are automatically generated.

S. Vassiliadis et al. (Eds.): SAMOS 2007, LNCS 4599, pp. 13–23, 2007. © Springer-Verlag Berlin Heidelberg 2007


Fig. 1. Model-driven tool flows

Then, the application source code can be compiled, assembled and linked, resulting in executable code. In a third step, the executable code can be run on the instruction-set simulator and its functionality can be observed with the help of disassembling and debugging tools. These tools allow the code to be executed incrementally (step) or to be stopped at certain code locations (breakpoints) so as to monitor program values (watchpoints). Finally, as soon as proper functionality is guaranteed by removing existing bugs, continuous execution on the simulator allows the evaluation of code quality with respect to design requirements. If some requirement is not met, an alternative instruction-set architecture (ISA) may be envisaged to induce a new solution. If the current processor is an application-specific instruction-set processor (ASIP), its ISA may deserve further customization. Otherwise, a new candidate processor may be selected.

This paper focuses on a technique for generating debugging tools from an arbitrary processor model. The technique relies on two key ideas. First, ISA-dependent information is automatically extracted from the model of the target processor. Second, the well-known GNU Binutils [2] and GNU debugger [3] packages are employed as implementation infrastructure: ISA-independent libraries are reused, while target-specific libraries are automatically generated.


The remainder of this paper is organized as follows. Section 2 briefly reviews related work. Section 3 formalizes the processor model that drives tool retargeting. Section 4 discusses implementation aspects. Experimental results are provided in Section 5. In Section 6, we draw our conclusions and comment on future work.

2 Related Work

2.1 Manually Retargetable Tools

Manually retargetable binary utilities are available within the popular GNU Binutils package [2]: assembler (gas), linker (ld), debugger (gdb) [3] and disassembler (objdump). Essentially, the Binutils package consists of an invariant ISA-independent core library and a few ISA-dependent libraries that must be rewritten for each new target CPU. Among the ISA-dependent libraries, two main ones require retargeting, namely Opcodes and BFD. The Opcodes library describes the ISA of a CPU (instruction encoding, register encoding, assembly syntax). Unfortunately, there is no standard for ISA description within this library. The BFD library provides a format-independent (ELF, COFF, A.OUT, etc.) object file manipulation interface. It is split into two blocks: a front-end, which is the library’s abstract interface with the application, and a back-end, which implements that abstract interface for distinct object file formats.

2.2 Automatically Retargetable Tools

Many contemporary retargetable tools rely on automatic generation from a CPU model written in some ADL, such as nML [4], ISDL [5], and LISA [6]. Although a disassembler and a debugger are available for most ADLs, it is unclear to which extent they are automatically generated or simply hand-retargeted. For instance, once a simulator is generated in the LISA tool chain, it can be linked to a debugging graphical user interface, but there is no clue on how the underlying mechanism actually works.

It has been acknowledged that novel assembly-level optimization approaches, like SALTO [7] and PROPAN [8], deserve further investigation [1]. Such techniques allow conventional compiler infrastructure to be reused by enabling post-compiling machine-dependent optimizations to further improve code quality. Although such post-compiling optimizations are promising, they may inadvertently introduce flaws. Code inspection tools could lose track of breakpoints and watchpoints due to optimizations not connected to the source code (in the face of new locations and distinct register usage). Therefore, conventional debuggers are likely to overlook flaws introduced by post-compiling optimizations.

A technique for retargeting assemblers and linkers to the GNU package was presented in [9]. It relies on a formal notation to describe both the target ISA and its relocation information. Although the formalism is solid, experimental results are scarce. Besides, it is not possible to foresee whether the proposed framework is able to address retargetable debugging tools.


Two facts motivated the work described in this paper: first, the lack of information on how code inspection tools are made retargetable and to which extent this is performed automatically; second, the scant experimental results providing evidence of proper retargetability. Although we pragmatically reuse a conventional binary-utility package as implementation infrastructure (as in [9]), we rely on an ADL-independent processor model.

3 Processor Model

This section formalizes the ISA aspects of the processor model in the well-known BNF notation. To ease its interpretation, an example is also provided.

Figure 2 specifies the formal structure for the information typically available in processor manuals, which relies on the notions of instruction, operand and modifier. A modifier is a function that transforms the value of a given operand. It is written in C and has four pre-defined variables to specify the transformation: input is the original operand value, address represents the instruction location, parm is a parameter that may contain an auxiliary value (such as required for evaluating the target address for PC-relative branches), and output returns the transformed operand value. An operand type oper-type specifies the nature of an instruction field and is tied to a binary value encoded within a given field. Examples of operand types are imm for immediate values, addr for symbolic addresses and exp for expressions involving immediate values and symbols.

Figure 3 shows an illustrative example of the processor model, according to the specified syntax. Lines 1 to 5 describe the mapping for the operand reg, where the symbols $0, $1, ..., $90 are mapped to the values 0, 1, ..., 90. Note that many-to-one mappings are allowed. For instance, the symbols $sp, $fp, $pc and $ra are mapped to values already mapped in line 1. Lines 7 to 8 define the modifier R, which defines a function to be applied for PC-relative transformations. The modifier’s result (output) is evaluated by adding the current location (address) to the operand value (input) and to an offset (parm). Lines 10 to 15 define the instruction beq. Line 11 defines its instruction format as a list of fields and their associated bit sizes. Line 12 defines its assembly syntax: reg, reg and exp are tied to instruction fields rs, rt and imm (beq is the instruction mnemonic).
The modifier R (whose offset is 2) is applied to operand type imm, thereby specifying that the resulting value is PC-relative and shifted 2 bits to the left. Finally, in line 14, the constant value 0x04 is assigned to the instruction’s op field.

From the processor model, a table of instructions is generated as a starting point for the retargeting algorithms. Each table entry is a tuple defined as follows:

table-entry = (mnemonic, opinfo, image, mask, pseudo, format-id)
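
As a rough illustration, modifier R from Figure 3 can be rendered as an ordinary C function. This is only a sketch: the function name modifier_R, the 32-bit unsigned types and the passing of the pre-defined variables as arguments are assumptions made here, not part of the model. The formula itself is the one shown in Figure 3.

```c
#include <stdint.h>

/* Hypothetical C rendering of modifier R (Figure 3).
   The parameter names mirror the model's pre-defined variables;
   the function name and the 32-bit types are assumptions. */
uint32_t modifier_R(uint32_t input, uint32_t address, uint32_t parm)
{
    /* output = input + address + parm, as written in the model */
    uint32_t output = input + address + parm;
    return output;
}
```

For example, with the offset parm = 2 used for beq, an operand value of 0x10 at location 0x00400000 yields 0x00400012.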

Let’s illustrate the meaning of its elements by means of an example. From the model in Figure 3, the following table entry would be generated for the instruction beq:

{"beq", "%reg:1:,%reg:2:,%exp:3:", 0x10000000, 0xFC000000, 0, Type_I}
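
The image and mask values in this entry can be reproduced from the beq format (op:6, rs:5, rt:5, imm:16) and the opcode assignment op = 0x04. The C sketch below shows one way to do so. It assumes a 32-bit instruction word with fields listed from the most significant bit downwards. All type and function names (struct field, struct table_entry, set_fixed_field, make_beq_entry, entry_matches) and the TYPE_I placeholder are hypothetical and are not taken from the generated libraries. The matching test at the end is the usual mask-and-compare idiom, included here as an assumption about how such an entry would be consulted.

```c
#include <stdint.h>

/* Hypothetical field descriptor: name and width in bits,
   listed from the most significant field downwards. */
struct field { const char *name; unsigned bits; };

/* Hypothetical C rendering of the table-entry tuple. */
struct table_entry {
    const char *mnemonic;
    const char *opinfo;
    uint32_t    image;      /* fixed bits of the encoding        */
    uint32_t    mask;       /* which bits of the word are fixed  */
    int         pseudo;
    int         format_id;
};

enum { TYPE_I = 1 };        /* placeholder for the format identifier */

/* Record a constant assigned to field idx of a 32-bit instruction word. */
void set_fixed_field(const struct field *fmt, unsigned nfields, unsigned idx,
                     uint32_t value, struct table_entry *e)
{
    if (idx >= nfields)
        return;
    unsigned shift = 32;
    for (unsigned i = 0; i <= idx; i++)
        shift -= fmt[i].bits;           /* bit position of the field's LSB */
    uint32_t ones = fmt[idx].bits >= 32 ? 0xFFFFFFFFu
                                        : ((1u << fmt[idx].bits) - 1u);
    e->image |= (value & ones) << shift;
    e->mask  |= ones << shift;
}

/* Build the beq entry from its format and the assignment op = 0x04. */
struct table_entry make_beq_entry(void)
{
    static const struct field type_i[] = {
        { "op", 6 }, { "rs", 5 }, { "rt", 5 }, { "imm", 16 }
    };
    struct table_entry e = { "beq", "%reg:1:,%reg:2:,%exp:3:", 0, 0, 0, TYPE_I };
    set_fixed_field(type_i, 4, 0, 0x04, &e);
    return e;
}

/* Assumed lookup test: a word matches when its fixed bits equal the image. */
int entry_matches(const struct table_entry *e, uint32_t word)
{
    return (word & e->mask) == e->image;
}
```

With these definitions, make_beq_entry() produces image 0x10000000 and mask 0xFC000000, and entry_matches() accepts the word 0x10220003 because its six most significant bits hold 0x04.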

The first element is the instruction’s mnemonic (beq). The second stores information like type (reg, reg, exp) and instruction field location (1, 2, 3). The third element stores the partial binary image of the instruction (0x10000000). The fourth element stores a


::=
::= |
<operand-def> ::= operand oper-id { "mapping definition" }
::= | empty
<modifier-def> ::= modifier modifier-id { "modifier code" }
::= <instruction-def> | <instruction-def>
<instruction-def> ::= instruction insn-id { <format-desc> ; ( <syntax-desc> ) : ( <operand-decoding> ) ; <opcode-decoding> }
<format-desc> ::= field-id : constant , <format-desc> | field-id : constant
<syntax-desc> ::= mnemonic-id
::= , |
::= oper-id | imm | addr | exp <modifier>
<modifier> ::=
<operand-decoding> ::= field-id , <operand-decoding> | field-id
<opcode-decoding> ::= field-id = constant , <opcode-decoding> | field-id = constant
<qualifier> ::= # | $ | empty

Fig. 2. Processor model specification

1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13.

operand reg { $ [0..90] = [0..90]; $sp = 29; $fp = 30; $ra = 31; $pc = 37; } modifier R { output = input + address + parm ; } instruction beq { op :6 , rs :5 , rt :5 , imm :16 , ( beq reg , reg , exp