A Programming Language for Processor Based Embedded Systems

Akihiko Inoue, Hiroyuki Tomiyama, Eko Fajar Nurprasetyo, Hiroto Yasuura
Department of Computer Science and Communication Engineering, Kyushu University
6–1 Kasuga-koen, Kasuga, Fukuoka 816-8580 Japan
inoino, tomiyama, eko, [email protected]

Hiroyuki Kanbara
Advanced Software Technology & Mechatronics Research Institute of KYOTO (ASTEM RI)
17 Minamichou, Chudouji, Shimogyou-ku, Kyoto 600-8813 Japan
[email protected]
Abstract

Since embedded software is becoming ever more complex, software reuse is an important issue in processor-based embedded system design. However, many embedded programs cannot be reused across different architectures because their correctness strongly depends on both the compiler and the processor architecture. To overcome this limited reusability, we have proposed an embedded programming language called Valen-C. In Valen-C, programmers can explicitly specify the required bit length of each variable in a program. Valen-C makes programs independent of processor architectures and expands the opportunity for reusing them. In this paper, we discuss the effectiveness of Valen-C by clarifying how it differs from existing languages, and we describe its syntax and semantics. We also describe the structure of the Valen-C retargetable compiler that we have developed and how it preserves the correctness of programs.
1. Introduction

In embedded system design, processor-based systems have become popular because they make a design highly flexible to modify. System designers can change the functionality of a system by rewriting software only, which leads to rapid implementation. Under the great pressure of time-to-market, the processor-based approach will become even more important for shortening the design period.

Software reuse, like hardware reuse, is a key technology for designing embedded systems in a short period, because software design is often a great burden on system designers. To reuse software efficiently, programming languages need to be independent of target processor architectures, and the correctness of programs should be preserved across various kinds of architectures. However, many existing programming languages do not support such reusability. For example, C programs written for a processor with a 32-bit datapath may not run correctly on 16-bit processors.

A similar problem arises when generating ASICs from software. Some parts of the software are processed in ASICs rather than in a processor to enhance the performance of the intended system. ASICs generated directly from software often have redundant area and power consumption in their combinational logic and storage units. The reason is that many software programming languages have limited word-length support. For example, only a few data sizes (8 bits, 16 bits, and 32 bits) are supported in C [1]. If the size of the type short is 16 bits, a variable whose value is in [0, 2000]¹ may be declared as short although 11 bits are enough to hold it. In this case, the upper 5 bits carry no information. Such redundancy in ASICs may not be acceptable in cost-efficient embedded systems.

¹ x is said to be in [n, m] if n ≤ x ≤ m.

To overcome the limitation on software reuse and to reduce the redundancy in ASICs, we have developed the Valen-C (Variable Length C) language and its retargetable compiler [2]. Valen-C is a language that incorporates the concept of computational accuracy into its semantics. Accuracy is introduced by having programmers explicitly specify the required bit length of each data type. The retargetable compiler translates a Valen-C program into assembly code for the target processor while preserving computational correctness. Using Valen-C, programmers can write programs without assuming a particular processor, and as a result the opportunity for software reuse is improved. Furthermore, ASICs can be generated efficiently since the minimum required size of each variable is specified explicitly.

A design methodology for embedded systems using Valen-C is presented in [2]. In this paper, the syntax, semantics, and effectiveness of Valen-C are discussed in detail. The structure of the Valen-C retargetable compiler and the way it preserves the correctness of Valen-C programs are also described.

This paper is organized as follows. Section 2 discusses some commonly used programming languages. Section 3 describes the syntax and semantics of the Valen-C language and presents how the retargetable compiler preserves the correctness of Valen-C programs. Section 4 provides a system design example with Valen-C. An experimental result is shown in Section 5. We conclude the paper in Section 6.
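The word-length redundancy described in this section can be made concrete with a few lines of plain C. The following is only an illustrative sketch: the function names bits_needed and redundant_bits are hypothetical and are not part of Valen-C or its compiler. It computes how many bits an unsigned variable with range [0, max_value] requires and how many bits are wasted when it is stored in a wider container.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to hold every value in [0, max_value] as an
 * unsigned integer. The range [0, 0] is counted as needing one bit. */
unsigned bits_needed(uint32_t max_value)
{
    unsigned bits = 1;
    while ((max_value >>= 1) != 0) {
        bits++;
    }
    return bits;
}

/* Bits that carry no information when such a variable is stored in a
 * container of container_bits bits (for example a 16-bit short). */
unsigned redundant_bits(uint32_t max_value, unsigned container_bits)
{
    unsigned need = bits_needed(max_value);
    assert(container_bits >= need);   /* the container must be wide enough */
    return container_bits - need;
}

/* Self-check using the numbers quoted in the text. */
void check_range_example(void)
{
    assert(bits_needed(2000u) == 11u);        /* [0, 2000] fits in 11 bits */
    assert(redundant_bits(2000u, 16u) == 5u); /* 5 upper bits of a short   */
}
```

For the range [0, 2000] and a 16-bit short this reproduces the figures given above: 11 required bits and 5 redundant bits.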
2. Embedded Programming Language

In embedded software design, the programming language should be selected carefully so that embedded programs satisfy requirements such as high portability, high performance, and high memory efficiency. In this section, the advantages and drawbacks of some existing languages are discussed.

2.1 Existing Programming Languages
A. C and C++

C is a high-level language widely used in embedded system design. Owing to good compilers, C programs can satisfy the requirements of both high memory efficiency and high performance. However, Paulin et al. pointed out that C has some limitations as an embedded programming language [1]. One of them is the limited word-length support: only a few data sizes (8 bits, 16 bits, and 32 bits) are supported in C. This is insufficient for many applications, such as audio processing, in which 24-bit data is typical. In such cases 32-bit data types may be used instead, so memory cost may increase. A more serious problem is that the portability of C programs is not high, because the semantics of a C program depends on both the compiler and the processor architecture.

C++ is a language that supports object-oriented software design. While the portability of programs is improved, extra information that must be carried at run time, such as virtual tables, causes overhead in performance and memory area. These drawbacks are often not acceptable in real-time system design.
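To make the portability concern raised for C concrete, the sketch below models the same unsigned addition on two hypothetical targets, one whose int is 32 bits wide and one whose int is 16 bits wide. The fixed-width types from <stdint.h> stand in for the two machines, and the function names are illustrative assumptions rather than anything taken from the cited work.

```c
#include <assert.h>
#include <stdint.h>

/* The sum as a target with a 32-bit unsigned int would compute it. */
uint32_t add_on_32bit(uint32_t a, uint32_t b)
{
    return a + b;                 /* wraps modulo 2^32 */
}

/* The same sum as a target with a 16-bit unsigned int would compute it. */
uint16_t add_on_16bit(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);     /* result reduced modulo 2^16 */
}

/* Self-check: identical source-level operation, different results. */
void check_portability_example(void)
{
    assert(add_on_32bit(40000u, 40000u) == 80000u);
    assert(add_on_16bit(40000u, 40000u) == 14464u);   /* 80000 - 65536 */
}
```

The operands fit in 16 bits, but their sum does not, so a program that silently relies on the wider int produces a different value when moved to the narrower target.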
B. Java

Java has been receiving attention as an embedded programming language because of its high portability. Java programs are first mapped into a machine-independent instruction set, called bytecode, which is then interpreted on each machine. Therefore, Java programs can run on any machine with a Java interpreter [3]. Rosenstiel et al. identified several drawbacks of Java as an embedded programming language [4]: Java cannot access hardware resources directly; interpreting a Java program is slower than executing compiler-generated native code for a processor; and a large amount of memory is required for a complete Java execution environment. A solution to these problems is also discussed in [4]. Unfortunately, to the best of our knowledge, no work has quantitatively evaluated the memory size and the computation time required to execute Java programs.

C. Assembly Language

Assembly language is the most widely used language in embedded system design. It satisfies the requirements of both good memory efficiency and high performance. However, its productivity is quite low. Furthermore, assembly programs can be reused only on the same architecture, since they are completely machine dependent. This becomes a serious problem when designing hardware and software concurrently.

2.2 Common Drawbacks in Existing Languages
The semantics of C programs depend on both processor architectures and compilers. The size of each data type is fixed by the compiler, so the portability of C programs is limited. For example, C programs written for a 32-bit processor may not run correctly on 16-bit processors. The size of each data type in Java is fixed by the language specification. For example, the integer types byte, short, int, and long have sizes of 8 bits, 16 bits, 32 bits, and 64 bits, respectively. It is therefore difficult to run Java programs on 24-bit processors. For these reasons, it is difficult to reuse programs written in an existing language while preserving the correctness of computation.

2.3 Valen-C
The bit length of each variable in an application program is inherently independent of the datapath width. In existing languages, however, the relationship between the two is determined implicitly. Let us clarify who decides this relationship. In C and C++, compiler designers define it: a variable in a C program is declared using one of the data types whose sizes are fixed for a given datapath width.
Therefore, the variable size changes when the datapath width changes. In Java, language designers define the relationship as part of the language specification: since Java assumes that the size of byte is 8 bits, the relationship is implicitly fixed by the specification. In assembly languages, the processor architecture dictates the length of variables. In Valen-C, the relationship is defined by the system designers who write the application program and specify the processor architecture (see Figure 1). Valen-C enables system designers to explicitly specify the required bit length of each variable in a program. Even if system designers customize the datapath width for their application, the Valen-C compiler preserves the semantics of the program. Therefore, Valen-C programs can be reused on processors with various datapath widths. Valen-C is one solution to the problem of word-length support in C. In the following section, the syntax and semantics of Valen-C and the structure of the Valen-C retargetable compiler are described.

Valen-C : System Designer (Programmer and/or HW Designer)
Java : Language Designer
C, C++ : Compiler Designer
Assembly : HW Designer

Figure 1. Who decides the relationship between the datapath width and the bit length of each variable in programs?

3. Valen-C and The Retargetable Compiler

3.1 The Valen-C Programming Language

Valen-C is an extension of the C language. As mentioned before, in Valen-C, programmers can specify the required bit length of each variable in a program. The control structures in Valen-C, such as “if” and “while” statements, are the same as in C. C provides three integer sizes, declared using the keywords short, int, and long. The sizes of these integer types are determined by the compiler designer. On many processors, the size of short is 16 bits, int is 16 or 32 bits, and long is 32 bits. In Valen-C, on the other hand, programmers can use more kinds of data types. For example, if a variable x needs a precision of 11 bits, x is declared as “int11 x”. As in C, the signed and unsigned qualifiers can be specified in Valen-C. The char type also exists in Valen-C, and it is assumed to have a length larger than 8 bits. The struct and array types are available as well. A floating point variable with a precision of a 5-bit exponent and a 10-bit mantissa is declared as “float5.10 x”.
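Since Valen-C itself cannot be compiled with an ordinary C compiler, the sketch below only approximates its declarations in standard C. It uses the "least" integer types from <stdint.h>, which guarantee at least the stated number of bits. The typedef names, the STORAGE_BITS macro, and the assignment of int14 to x and y and int20 to z and w are assumptions made for illustration; they are not Valen-C syntax.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Valen-C declaration quoted in the text:   int11 x;
 * Plain-C stand-ins: each typedef names a standard type that is
 * guaranteed to provide at least the requested number of bits. */
typedef int_least16_t int11_approx;   /* at least 11 bits */
typedef int_least16_t int14_approx;   /* at least 14 bits */
typedef int_least32_t int20_approx;   /* at least 20 bits */

/* Storage width of a type in bits (any padding bits are counted too). */
#define STORAGE_BITS(type) (sizeof(type) * CHAR_BIT)

/* z = x + y when flag is 1, otherwise z = w. The operand widths are an
 * assumed reading of a program that uses int14 and int20 variables. */
int20_approx select_sum(unsigned flag, int14_approx x, int14_approx y,
                        int20_approx w)
{
    assert(flag <= 1u);               /* flag models a one-bit variable */
    if (flag == 1u) {
        return (int20_approx)((int20_approx)x + y);
    }
    return w;
}

/* Self-check: every stand-in is at least as wide as requested. */
void check_declaration_example(void)
{
    assert(STORAGE_BITS(int11_approx) >= 11);
    assert(STORAGE_BITS(int14_approx) >= 14);
    assert(STORAGE_BITS(int20_approx) >= 20);
}
```

The difference from Valen-C is that these stand-ins still round every width up to a size the host compiler happens to offer, whereas a Valen-C declaration states only the precision the program needs.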
3.2 Retargetable Valen-C Compiler

In this section, the retargetable compiler², which translates a Valen-C program into assembly code for a target machine, is described. The Valen-C compiler uses the SUIF (Stanford University Intermediate Format) library [5]. SUIF is an intermediate format for programs, and the SUIF library is a set of functions and classes that provide an interface to SUIF.

The Valen-C compiler preserves the correctness of programs in the following manner. If a variable has n-bit precision, the Valen-C compiler allocates storage of not less than n bits for the variable. If an operation in a Valen-C program requires n-bit precision, the operation is performed with a precision of not less than n bits. For example, an addition of two 13-bit variables may be calculated with a precision of 20 bits on a 20-bit processor. When the precision of an operation is larger than the datapath width, the operation is performed with several machine instructions. For example, an addition with 20-bit precision is performed on a 10-bit processor with two addition instructions, one for the lower 10 bits and one for the upper 10 bits. Floating point data types are not yet supported.

The Valen-C compiler is retargeted by modifying the machine description, which includes the datapath width, the number of registers, the instruction set, the sizes and alignments of the program and data memories, the minimum addressable size of the data memory, and so on. The current implementation assumes RISC architectures as the target. Figure 2 shows the compilation flow of the Valen-C compiler, which consists of five stages. Details of each stage are described below.
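The 20-bit addition on a 10-bit processor mentioned above can be sketched in plain C as follows. This is a model of the idea only, not the code the Valen-C compiler emits: the word20 type and the function names are hypothetical, and the carry is handled explicitly where a real target would presumably use an add-with-carry instruction.

```c
#include <assert.h>
#include <stdint.h>

#define DATAPATH_BITS 10u
#define WORD_MASK ((1u << DATAPATH_BITS) - 1u)   /* 0x3FF */

/* A 20-bit value held in two 10-bit machine words. */
typedef struct {
    uint16_t lo;   /* bits 0..9   */
    uint16_t hi;   /* bits 10..19 */
} word20;

/* Split the low 20 bits of v into two 10-bit words. */
word20 word20_from_u32(uint32_t v)
{
    word20 w;
    w.lo = (uint16_t)(v & WORD_MASK);
    w.hi = (uint16_t)((v >> DATAPATH_BITS) & WORD_MASK);
    return w;
}

/* Reassemble the 20-bit value from its two words. */
uint32_t word20_to_u32(word20 w)
{
    return ((uint32_t)w.hi << DATAPATH_BITS) | w.lo;
}

/* 20-bit addition as two 10-bit additions: the lower words first,
 * then the upper words plus the carry out of the lower addition. */
word20 word20_add(word20 a, word20 b)
{
    assert(a.lo <= WORD_MASK && a.hi <= WORD_MASK);
    assert(b.lo <= WORD_MASK && b.hi <= WORD_MASK);

    unsigned lo_sum = (unsigned)a.lo + b.lo;          /* at most 11 bits */
    unsigned carry  = lo_sum >> DATAPATH_BITS;        /* 0 or 1 */
    unsigned hi_sum = (unsigned)a.hi + b.hi + carry;  /* carry out of bit 19 is dropped */

    word20 r;
    r.lo = (uint16_t)(lo_sum & WORD_MASK);
    r.hi = (uint16_t)(hi_sum & WORD_MASK);
    return r;
}
```

Adding 1023 and 1 shows the carry at work: the lower words sum to 1024, which leaves 0 in the lower word and propagates 1 into the upper word, giving 1024 after reassembly.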
A. Valen-C to C Translation

In the first phase, a Valen-C program is translated into a C program by assigning each data type in the Valen-C program to one of the four data types short, int, long, and long long³. The sizes of short, int, long, and long long must be defined in the machine description file, and the int type must have the same size as the datapath width. For most processors, short has half the size or the same size as int, long has double the size of int, and long long has triple or quadruple the size of int. For example, a Valen-C data type that is larger than int but not larger than long is assigned to long. If long long is four times the size of int, whose width is n bits, data types of at most 4n bits can be used in Valen-C programs. The Valen-C compiler that we have developed accepts both Valen-C and C programs. If a C program is given to the compiler, this phase is skipped.

² The Valen-C compiler is now available via http://kasuga.csce.kyushu-u.ac.jp/~codesign.
³ The long long data type has double the size of long. The long long data type is not defined in the C language; however, many C compilers support it.

Valen-C Program → Valen-C to C Translation → C Program → C to SUIF Translation → SUIF → Machine Independent Optimization → SUIF → Precision Translation → SUIF → Register Allocation and Code Generation → Assembly Code

Figure 2. Compilation Flow of the Valen-C Compiler

B. C to SUIF Translation

After the Valen-C program has been translated into a C program, syntax analysis is performed. At this stage, a parser in the SUIF library package is used.

C. Machine Independent Optimization

Machine independent optimizations such as dead-code elimination and copy propagation are performed using a tool in the SUIF package.

D. Precision Translation

Multi-precision operations are divided into several machine instructions. If the processor has no multi-precision multipliers and dividers, multi-precision multiplication and division operations are replaced by function calls to avoid an excessive increase in code size. The function libraries are assumed to be designed by the machine description designers. Automatic generation of the function libraries remains future work.

E. Register Allocation and Code Generation

In the final phase, each variable is allocated to registers. If a variable is larger than the datapath width, more than one register is allocated to it. After register allocation, operations are mapped to machine instructions by tree pattern matching.

4. System Design with Valen-C

4.1 Design Flow
Using Valen-C, system designers can easily design a dedicated single-chip system consisting of a core processor, instruction and data memories, and some ASICs. System design is performed iteratively. First, a designer writes a Valen-C program that expresses the intended algorithm. The program is compiled for a processor, i.e., as a full software implementation. Then the area, performance, and power consumption of the system, which has no ASICs, are evaluated. If the design does not meet the design constraints, it is redesigned.

There are two ways to redesign. One is to modify the core processor. In our design environment, designers can use parameterized core processors for easy modification [2]. A dedicated processor can be obtained by choosing the parameters appropriately. Among the parameters, the datapath width has an especially strong effect on the area and performance of a system [6]. Since Valen-C is independent of the datapath width, designers can freely change it without modifying the program. The other way is to use ASICs, each of which performs a part of the program. When ASICs are generated from a Valen-C program, much redundancy can be eliminated because the required bit length is specified in the program. Valen-C thus plays a significant role in our design flow.

4.2 Design Example
This section provides a design example to demonstrate the effectiveness of Valen-C and the retargetable compiler. Figure 3 shows two ways to implement a Valen-C program. In the software implementation, we assume a 10-bit processor. The Valen-C compiler maps each data type in the Valen-C program to one of the four data types in the machine description. The int1 type is replaced by the short data type, whose size is 5 bits. The other two data types, int14 and int20, correspond to the long data type. Code generation is then performed to obtain the assembly code for the 10-bit processor. In the assembly code, long variables are divided into two words, the lower 10 bits and the upper 10 bits. If a designer changes the datapath width from 10 bits to 32 bits, the areas of both the processor and the data memory increase because redundancy is introduced in the size of variables. Conversely, since variables that are treated as two words on the 10-bit processor become single-word variables on the 32-bit processor, the area of the program memory decreases and performance improves. Shackleford et al. have discussed this trade-off in detail [6]. The Valen-C program can also be implemented in hardware. Figure 3 also shows an example of the hardware implementation. The Valen-C program is translated into a VHDL
[Figure 3, partially recovered. Panel labels: Valen-C to C Translation, C Program, Code Generation, Assembly Code, Software implementation, VHDL Program. Hardware datapath labels: 14 bit registers, w, a, b, 14 bit ALU, MUX, flag, z, 20 bit register (two instances).]

Source fragment: if (flag == 1) { z = x + y; }else{ z = w; } ...

Machine description for the software implementation: Datapath width = 10 bits; short : 5 bits; int : 10 bits; long : 20 bits; long long : 30 bits

C Program: main(){ unsigned short flag; long x, y; long z, w; if (flag == 1) { z = x + y; }else{ z = w; } ...

VHDL Program (fragment):
ARCHITECTURE rtl OF adder IS BEGIN PROCESS VARIABLE a: SIGNED(x’range); VARIABLE b: SIGNED(y’range); VARIABLE c: SIGNED(w’range); BEGIN WAIT UNTIL clk’STABLE and clk = ’1’; IF flag = ’1’ THEN z