
MetaCore: An Application Specific DSP Development System

Jin-Hyuk Yang, Byoung-Woon Kim, Sang-Jun Nam, Jang-Ho Cho, Sung-Won Seo, Chang-Ho Ryu, Young-Su Kwon, Dae-Hyun Lee, Jong-Yeol Lee, Jong-Sun Kim, Hyun-Dhong Yoon, Jae-Yeol Kim, Kun-Moo Lee, Chan-Soo Hwang, In-Hyung Kim, Jun-Sung Kim, Kwang-Il Park, Kyu-Ho Park, Yong-Hoon Lee, Seung-Ho Hwang, In-Cheol Park, and Chong-Min Kyung

Department of Electrical Engineering, KAIST, Taejon, 305-701, Korea

Abstract

This paper describes the MetaCore system, an ASIP (Application-Specific Instruction set Processor) development system targeted at DSP applications. The goal of the MetaCore system is to offer an efficient design methodology that meets specifications given as a combination of performance, cost, and design turnaround time. The MetaCore system consists of two major design stages: design exploration and design generation. In the design exploration stage, the MetaCore system accepts a set of benchmark programs and a formal specification of the ISA (Instruction Set Architecture), and estimates the hardware cost and performance for each hardware configuration being explored. Once a hardware configuration is chosen, the system helps generate a VLSI processor design in the form of HDL, along with application program development tools such as a C compiler, an assembler, and an instruction set simulator.

1 Introduction

Each DSP application requires different arithmetic functions and bit-rates according to the algorithm to be executed.[1] The characteristics of DSP algorithms must be fully reflected in the design of an application-specific processor architecture.[2] The instruction set of an ASIP (Application-Specific Instruction set Processor) is designed to maximize performance on specific applications, while that of a general-purpose microprocessor is designed to provide acceptable performance on a wide variety of applications. The most important issue in the design of an ASIP is the determination of the micro-architecture and the instruction set, which closely affect each other. In other words, the design of the instruction set cannot be viewed as a process independent of the design of the micro-architecture. Generally, instruction sets designed without sufficient knowledge of the underlying micro-architecture are not efficient in terms of execution time. The matching between the micro-architecture and the instruction set can only be achieved through a broad exploration of the design space and an understanding of the interaction between the micro-architecture (hardware) and the algorithm of the application (software).

The design turnaround time of an ASIP depends on how efficiently the system specification is transferred to a lower-level design to be implemented as a chip layout and as system software. The traditional design approach, in which the design entity is described using an HDL or schematics and verified by simulation, needs a significant number of iterations between the higher-level specification and the lower-level implementation, and usually results in an increase of design turnaround time.

In this paper, we present a systematic approach called MetaCore to develop an ASIP for DSP applications. The MetaCore system consists of two major design stages: design exploration and design generation. The MetaCore system accepts a set of benchmark programs and a formal specification of the ISA (Instruction Set Architecture) as its inputs, and produces information on the design such as the gate count, the usage record of each instruction, and so on. The MetaCore system is then used to generate an ASIP design in the form of HDL, along with a set of application program development tools such as a C compiler, an assembler, and an instruction set simulator.

2 Design Philosophy and Framework

Major issues in the design of the MetaCore system are:

1. how to design a programmable yet efficient micro-architecture for DSP applications;
2. how to extract features (such as the instruction set and special functional blocks) from statistical data (such as instruction usage and block usage);
3. how to automatically generate a chip equipped with the extracted features.

To solve these problems, we have developed a reconfigurable micro-architecture and a formal representation language. The micro-architecture is designed to meet the general characteristics of various DSP applications. The reconfigurability of the micro-architecture offers the high degree of performance optimization required for each specific application. Recently, various reconfigurable micro-architectures have been introduced, such as EPICS[3] and PEAS-I[4]. Those systems obtain their reconfigurability by changing system parameters such as the bit-width of hardware functional blocks, and by selecting instructions from a predefined superset. However, instruction customization that only selects a subset of instructions from a superset cannot fully satisfy the demands of the various algorithms in DSP applications.

The formal approach adopted in the MetaCore system enables the designer to construct an original instruction set for a given application. It starts from a specification of the ISA and generates performance/cost parameters to support the designer in choosing among alternatives. The performance/cost parameters are generated by analyzing the relationship between the behavior of the benchmark programs and the hardware cost. New design knowledge, namely the user-defined instructions and, if necessary, the macro-blocks supporting those instructions, is simply incorporated into the automatic design environment by specifying the ISA and updating the system library.

35th Design Automation Conference ® Copyright ©1998 ACM 1-58113-049-x-98/0006/$3.50 DAC98 - 06/98 San Francisco, CA USA
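The exploration step described above, in which benchmark behavior is weighed against hardware cost for each candidate configuration, can be illustrated with a minimal Python sketch. Everything numeric here is an assumption made for illustration: the gate counts, usage counts, emulation cycle counts, and the function names `evaluate` and `explore` do not come from the paper, and MetaCore itself reports such parameters to the designer rather than necessarily choosing a configuration automatically.

```python
from itertools import combinations

# Hypothetical figures for illustration only; none come from the paper.
BASE_GATES = 20000                                        # assumed cost of the primitive core
OPTIONAL_GATES = {"mac": 6000, "mpy": 4000, "clip": 500}  # assumed gate cost per instruction
USAGE = {"mac": 1200, "mpy": 300, "clip": 50}             # assumed dynamic usage counts
EMULATION_CYCLES = {"mac": 20, "mpy": 16, "clip": 4}      # assumed cycles when emulated
OTHER_CYCLES = 10000                                      # assumed cycles in primitive instructions


def evaluate(selected):
    """Return (gate_count, cycle_count) for one set of optional instructions."""
    gates = BASE_GATES + sum(OPTIONAL_GATES[i] for i in selected)
    cycles = OTHER_CYCLES
    for inst, count in USAGE.items():
        # A selected instruction is assumed to take one cycle;
        # an omitted one is emulated with primitive instructions.
        cycles += count * (1 if inst in selected else EMULATION_CYCLES[inst])
    return gates, cycles


def explore(max_gates):
    """Enumerate every subset and return the fastest one within the gate budget."""
    names = sorted(OPTIONAL_GATES)
    best = None
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            gates, cycles = evaluate(set(subset))
            if gates <= max_gates and (best is None or cycles < best[2]):
                best = (subset, gates, cycles)
    return best
```

With these assumed numbers, `explore(26500)` selects `mac` and `clip`, since adding `mpy` would exceed the gate budget. Exhaustive enumeration is used only because the example has three candidates; a realistic design space would need a more selective search.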

2.1 Micro-architecture

DSP applications are generally characterized as computationally intensive, with large data sets, loop-dominant control flow behavior, and accumulation-based operation. To support these characteristics, the micro-architecture must provide effective data communication between the memory system and the execution units, low-overhead loop control, and an accumulator-based instruction set architecture.

[Figure 1: block diagram, image not recovered. Labels in the diagram: PCU, PC, IR, HW-DO; AGU, Address ALU (two instances), Addr. Reg.; PMEM, DMEM0, DMEM1; PADDR, D0ADDR, D1ADDR; PDATA, D0DATA, D1DATA; S1, S2; ALU(AU), AA, AH, AL, MPY(MU), Flag Unit (FU), Flags; EDATA, EADDR; Reset, Int., Clk.ctrl.]

Figure 1: The micro-architecture of MetaCore.

The heart of the proposed micro-architecture, shown in Figure 1, consists of three execution units operating in parallel: the ALU (Arithmetic and Logical Unit), the AGU (Address Generation Unit), and the PCU (Program Control Unit). The ALU, which consists of a MAC (Multiply-ACcumulate) unit and various basic hardware units, has been designed to optimize the time-critical inner-loop functions of DSP algorithms. The AGU is responsible for effective indirect addressing of data operands in memory. It contains dedicated interconnections among the address registers (ARs) and increment/decrement capabilities for efficient traversal of special memory structures. The PCU is a pipelined instruction decoder with branching (direct/indirect jump, call/return), interrupt handling (goto/return-from interrupt), and zero-overhead hardware do-loop capability.

The design style supported by the MetaCore system is a pipelined and parameterized micro-architecture. The pipeline is controlled in a data-stationary fashion. It consists of stages for instruction fetch, instruction decode, operand read, execute, and result write. The first two stages are identical for all instructions. The operand read, execute, and result write stages depend on the semantics of each instruction. The target micro-architecture can be fully described by specifying the ISA and a set of parameters such as the register file size, the bus width, the size and address space of the memories, and the bit-width of the functional units.

The MetaCore system has a predefined instruction set, which consists of instructions generally used for the implementation of DSP algorithms. Table 1 shows a summary of the predefined instruction set, where instructions are classified into two classes: the primitive class and the optional class. The instructions of the primitive class are essential and cannot be omitted from any ASIP designed under the MetaCore environment, while the instructions of the optional class can be eliminated to satisfy the given design constraints.
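As a rough behavioral illustration of how the three units cooperate in an inner loop, the following Python sketch models a multiply-accumulate loop over two data memories. It is a sequential software model under stated assumptions: the function name `mac_loop` and its parameters are hypothetical, the accumulator width and pipeline timing are not modeled, and the steps that hardware would perform in parallel are written one after another.

```python
def mac_loop(dmem0, dmem1, ar0, ar1, count):
    """Model `count` iterations of a MAC loop (hypothetical helper).

    ar0 and ar1 play the role of AGU address registers that are
    post-incremented while the ALU performs the multiply-accumulate.
    """
    acc = 0
    for _ in range(count):      # stands in for the PCU's hardware do-loop counter
        s1 = dmem0[ar0]         # operand read from data memory 0
        s2 = dmem1[ar1]         # operand read from data memory 1
        acc += s1 * s2          # multiply-accumulate in the ALU
        ar0 += 1                # AGU post-increment of each address register
        ar1 += 1
    return acc, ar0, ar1
```

For example, `mac_loop([1, 2, 3], [4, 5, 6], 0, 0, 3)` accumulates 1*4 + 2*5 + 3*6 and leaves both address registers at 3.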

class      category    instructions
primitive  arithmetic  add adc and andc ash ldi lhi lsh nop not or orc sub subi subr xor
primitive  data move   ld st pop push
primitive  control     d.cnd bd.cnd call callr for rep reti rets trap
optional   arithmetic  bf0 bf1 bts bcr bst clip ext idiv mac mpy mrg

Table 1: The predefined instruction set of the MetaCore system.
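The rule that primitive instructions are always present while optional ones may be dropped can be captured in a small Python sketch. The mnemonics are copied from Table 1 as printed; the dictionary layout and the helper `build_instruction_set` are assumptions introduced here for illustration and are not part of the MetaCore tools.

```python
# Mnemonics as listed in Table 1; the data layout is an illustrative assumption.
PRIMITIVE = {
    "arithmetic": "add adc and andc ash ldi lhi lsh nop not or orc sub subi subr xor".split(),
    "data move": "ld st pop push".split(),
    "control": "d.cnd bd.cnd call callr for rep reti rets trap".split(),
}
OPTIONAL = {
    "arithmetic": "bf0 bf1 bts bcr bst clip ext idiv mac mpy mrg".split(),
}


def build_instruction_set(keep_optional):
    """Return all primitive instructions plus the requested optional ones."""
    known_optional = {m for group in OPTIONAL.values() for m in group}
    unknown = set(keep_optional) - known_optional
    if unknown:
        raise ValueError(f"not an optional instruction: {sorted(unknown)}")
    selected = {m for group in PRIMITIVE.values() for m in group}
    return selected | set(keep_optional)
```

Calling `build_instruction_set(["mac", "mpy"])` yields the primitive instructions plus the two multiplier-related optional instructions; requesting a mnemonic that is not in the optional class raises `ValueError`.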

2.2 Formal Specifications

In the MetaCore system, the target ASIP is specified using a formal representation language, from which the instruction set of the target ASIP can be synthesized for the purpose of simulation and lower-level implementation. The descriptive power of the formal representation language corresponds to that of ISPS[5].

Figure 2 shows an example of a machine specification. The hardware clause describes the hardware configuration of EM1. EM1 includes a 36-bit accumulator, four 16-bit address pointer registers, a 2 Kword pmem (program memory), and two 1 Kword dmems (data memories). During each instruction cycle, the instruction fetch and instruction decode stages are identical for all instructions, so the behaviors of these two stages are omitted from the specification of the ISA. The expected machine behaviors in the operand read, execute, and result write stages of each instruction are specified in the def_inst clause and the operand clause as follows. The operand clause defines the kind of operand. The def_inst clause represents the instruction behavior in detail: the arithmetic function, the flag fields affected by execution of the instruction, and the number of execution stages in the pipeline needed to execute the instruction.

    // Specification of EM1
    (hardware
        ACC    1   data_format 4.16.16
        AR     4   addr_bit 16
        pmem   2K, [2047:0]
        dmem0  1K, [9215:8192]
        dmem1  1K, [10240:9216]
        ..
    )
    .
    (def_inst (operand ( ACC (extension (flag (exestage )
    (def_inst (operand (type1 (type2 (type3 )
    ADD type2)
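The arithmetic implied by the hardware clause can be checked with a short Python sketch. Two readings are assumed here and are not stated in the paper: that `data_format 4.16.16` lists three field widths whose sum is the accumulator width, and that `[hi:lo]` denotes an inclusive address range. The dictionary `EM1` and both helper functions are hypothetical.

```python
# Values transcribed from the hardware clause as printed; layout is an assumption.
EM1 = {
    "ACC": {"count": 1, "data_format": (4, 16, 16)},
    "AR": {"count": 4, "addr_bit": 16},
    "pmem": {"size": 2 * 1024, "range": (2047, 0)},
    "dmem0": {"size": 1 * 1024, "range": (9215, 8192)},
    "dmem1": {"size": 1 * 1024, "range": (10240, 9216)},
}


def accumulator_width(spec):
    """Sum the data_format fields (assumed to be field widths in bits)."""
    return sum(spec["ACC"]["data_format"])


def range_mismatches(spec):
    """Return memories whose inclusive [hi:lo] span differs from the declared size."""
    mismatches = {}
    for name, entry in spec.items():
        if "range" in entry:
            hi, lo = entry["range"]
            words = hi - lo + 1
            if words != entry["size"]:
                mismatches[name] = words
    return mismatches
```

Under these readings, `accumulator_width(EM1)` gives 36, matching the 36-bit accumulator described in the text, and the pmem and dmem0 ranges match their declared sizes. The dmem1 range as printed spans 1025 addresses rather than 1024, which may be an artifact of the printed figure or of the inclusive-range assumption.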
