SoC Design Environment with Automated Bus Architecture Generation for Rapid Prototyping with ISS

Sang-Heon Lee1, Jae-Gon Lee1, Ando Ki2, Chong-Min Kyung1

1 Electrical Engineering Department, KAIST, Daejeon, Korea
2 RND Center, Dynalith Systems Co., Ltd., Daejeon, Korea

Abstract – In SoC design, it is important that design and verification can be done easily and quickly. RT-level simulation remains a necessary verification method, but its use is limited by its low performance. We therefore propose an SoC verification environment in which the hardware parts are accelerated in an FPGA and the cores are modeled with an instruction set simulator (ISS). A bus functional model (BFM) connects the ISS, which works at a high abstraction level, to the emulator, which is pin-level accurate. For hardware debugging, a bus monitor is designed; post-processing the data it collects enables debugging and performance estimation. To make design and verification in the proposed environment easy and quick, we developed a tool that creates bus architectures automatically. With this tool, the design time from specification to FPGA-based prototyping is reduced remarkably, which makes fast verification and design space exploration possible. AMBA is chosen as the SoC bus protocol.

Keywords: SoC, BFM, AMBA, prototype, bus generation

1 Introduction

For system-on-chip (SoC) design, many design methodologies at a high abstraction level, for example those using SystemC[1], have been introduced and are being developed. RT-level test and verification is nevertheless still an important step in the SoC design flow. Co-simulation has been used for this purpose. It is generally based on an ISS and an HDL simulator, which run as two processes and communicate via an IPC channel. Co-simulation provides accurate verification, but its performance is limited. As designs get bigger, especially in the case of SoCs, HDL simulation is too slow to run sufficient tests. FPGA-based prototyping is therefore used as an alternative methodology. In a prototyping system, synthesized hardware designs are mapped into an FPGA to raise the operating clock speed. Co-verification must, of course, also support concurrent development of the software application that will be executed on an embedded core[3].

In some prototyping systems, a real chip is used to model the core. However, cores are not always available in test-chip form, so ISSs are used to model various cores. Communication between the ISS and the hardware is generally done by implementing a bus functional model (BFM), which takes commands at a higher abstraction level and generates pin- and cycle-accurate signals. In our SoC design environment, the core is modeled on the host computer with an ISS, and the hardware IPs are mapped into an FPGA. For high performance, communication between the ISS and the hardware takes place at the transaction level to reduce communication overhead. For this purpose, the BFM is implemented in the FPGA.

Meanwhile, the number of IPs in an SoC design is increasing, and the bus architecture is therefore becoming complicated and hierarchical. IP reuse has become common practice, and reusable IP libraries are growing richer. Accordingly, although a design requires a variety of IPs, most of them are probably already available, and only a small part of the whole SoC design needs to be newly created. For these reasons, designing the on-chip bus architecture becomes a relatively large part of the work. Because in many cases the bus protocol is fixed very early in the design, automatic bus generation can make SoC design and verification easier and faster. We therefore developed a tool that generates the bus architecture quickly and easily. This reduces the design time considerably and makes design space exploration for the bus architecture possible.

In this paper, the idea is realized on an FPGA-based prototype system[4]. The hardware prototype board is mounted in a PCI slot of the host computer, so communication between hardware and software takes place through the PCI interface. AMBA was chosen as the on-chip bus protocol. It is supported by ARM cores and is one of the most popular SoC bus protocols.

This paper is organized as follows. Related works are explained in section 2. Section 3 explains the prototyping and debugging methodology in the proposed environment. Section 4 describes a tool that generates the on-chip bus architecture automatically. Section 5 shows a case study on JPEG, and section 6 gives conclusions.

2 Related works

There are various prototyping systems. In [6], the co-verification system consists of a VLIW processor and an FPGA board. The software component is an ISS of the core running on the VLIW processor, and a BFM is used to generate the pin signals. In [7], a simulation-based method is used for design space exploration, after which the whole system is mapped onto a prototyping system in which a special DSP is implemented in an FPGA. ARM offers the Integrator family[3] to provide developers with a rapid prototyping environment that enables the integration of hardware and software IP. It consists of ARM test-chip-based Core Modules and FPGA-based Logic Modules. However, it mainly aims to assist software development and does not address hardware debugging.

For automatic bus generation, the Synopsys DesignWare AMBA On-chip Bus[5] provides synthesizable and configurable bus components. The components can be configured according to the bus specification with a GUI-based tool. However, this package does not provide a way to interact with cores. Moreover, the generated components are configured in a hardwired manner, so even for small changes the hardware has to be regenerated, re-synthesized and re-compiled for the FPGA. In our environment, almost all components are configurable through parameters, so parameters such as the address map, priorities and clock frequencies can be changed immediately.

3 Proposed SoC design environment

3.1 Co-simulation

Co-simulation is possible using the IPC library of the emulator[4]. The software application in C/C++ can run as native code or as cross-compiled code on an ISS. The hardware blocks, including the generated bus components (in the dotted box in Figure 1) and the application-specific IPs, run in an HDL simulator. Two processes run during co-simulation. The first is the C algorithm process, in which the APIs of the emulator interact with each other to generate stimulus for the HDL simulator. The second is the HDL simulator process, in which the HDL simulator executes the models of the bus components and IPs. The IPC channel is a pipe through which each process sends its messages to the other.

Figure 1. Co-simulation system

3.2 Prototype

Because the bus generator also provides synthesized bus components, the prototyping step can be reached easily and quickly. In the FPGA-based prototyping system used in this paper, the bus system clock is derived from the PCI clock, which is 33 MHz or 66 MHz. The emulation system runs hundreds of times more cycles per second than simulation, so exhaustive verification or design space exploration is possible. The clock frequency of the SoC would be about one or two hundred MHz, so the clock frequency of the prototype system can be close to that of the real system.

Figure 2. Co-emulation system
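To give a feel for the clock figures quoted in this section (a 33 MHz or 66 MHz PCI-derived bus clock, and emulation running hundreds of times faster than simulation), the arithmetic can be sketched as below. The speedup factor of 300 and the workload of one billion bus cycles are hypothetical values chosen only for illustration, and the sketch ignores any host-to-board communication overhead.

```python
def run_time_seconds(cycles: int, clock_hz: float) -> float:
    """Wall-clock time needed to execute `cycles` bus cycles at `clock_hz`."""
    return cycles / clock_hz


PCI_CLOCK_HZ = 33e6      # from the text: 33 MHz PCI clock (66 MHz is also possible)
ASSUMED_SPEEDUP = 300    # hypothetical stand-in for "hundreds of times"
SIM_RATE_HZ = PCI_CLOCK_HZ / ASSUMED_SPEEDUP   # implied simulation cycles per second

WORKLOAD_CYCLES = 1_000_000_000   # hypothetical test length

emulation_s = run_time_seconds(WORKLOAD_CYCLES, PCI_CLOCK_HZ)
simulation_s = run_time_seconds(WORKLOAD_CYCLES, SIM_RATE_HZ)

print(f"emulation : {emulation_s:.1f} s")
print(f"simulation: {simulation_s / 3600:.2f} h")
```

Under these assumptions a run that takes about half a minute on the prototype would take a few hours in simulation, which is why exhaustive tests become practical only on the prototype.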

3.2.1 Bus functional model

The bus functional model (BFM) enables the C algorithm to communicate with the hardware IPs at the transaction level. The BFM receives commands at a high abstraction level from the C algorithm and translates them into pin- and cycle-accurate transactions on the AHB bus. One side of the BFM is the PCI controller interface of the emulator, and the other side is an AHB interface. The BFM acts mainly as an AHB master, but it also has an AHB slave interface so that it can be used as an AHB slave or as both. The BFM is compliant with AMBA AHB rev 2.0 and supports almost all features of the AHB specification. In the C algorithm, the AMBA API is used to communicate with the BFM. The BFM takes the control information of a bus transaction through this API and can generate almost every kind of transaction needed.
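The translation a BFM performs, from one high-level command to per-cycle bus signals, can be illustrated with a small software model. This is only a sketch and not the paper's AMBA API: the function `expand_write_burst` and the `AhbCycle` record are hypothetical names, and the model assumes an incrementing write burst to a slave with zero wait states (HREADY always high). The HTRANS encodings are those of the AMBA rev 2.0 AHB specification.

```python
from dataclasses import dataclass
from typing import List, Optional

# AHB HTRANS encodings (AMBA rev 2.0).
IDLE, BUSY, NONSEQ, SEQ = 0b00, 0b01, 0b10, 0b11


@dataclass
class AhbCycle:
    htrans: int
    haddr: Optional[int]    # address phase of the current beat
    hwrite: Optional[int]
    hwdata: Optional[int]   # data phase of the beat addressed one cycle earlier


def expand_write_burst(addr: int, words: List[int], size_bytes: int = 4) -> List[AhbCycle]:
    """Expand one 'write burst' command into per-cycle AHB master signals.

    Hypothetical helper: assumes an incrementing burst and no wait states,
    so each data phase follows its address phase by exactly one cycle.
    """
    cycles: List[AhbCycle] = []
    for beat in range(len(words) + 1):
        in_addr_phase = beat < len(words)
        cycles.append(AhbCycle(
            htrans=(NONSEQ if beat == 0 else SEQ) if in_addr_phase else IDLE,
            haddr=addr + beat * size_bytes if in_addr_phase else None,
            hwrite=1 if in_addr_phase else None,
            hwdata=words[beat - 1] if beat > 0 else None,
        ))
    return cycles


# One transaction-level command becomes five pin-level cycles.
for cycle in expand_write_burst(0x1000, [0xA, 0xB, 0xC, 0xD]):
    print(cycle)
```

Sending the single command across the host interface and letting the FPGA-side BFM generate the individual cycles is what keeps the communication overhead low.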

3.2.2 Debugging

In a prototype system, debuggability is generally poor because the probing environment is limited. A logic analyzer can help, but using it is troublesome and its channel bandwidth is still insufficient. We therefore designed the AMBA monitor, a pin-accurate and cycle-accurate debugger for the AMBA development environment. The AMBA monitor is composed of a hardware part and a software part. The hardware part samples the AHB and APB signal values at every clock cycle of the respective bus and sends them to the software part. Several triggering conditions, each defined in terms of bus activity, are available. The hardware part starts sampling when a predefined condition, called the starting triggering condition, is met, and stops when another condition, called the stopping triggering condition, is met. The software part stores the sampled bus information in a file. When the output file size reaches 1 GByte, it closes the file and opens a new one for subsequent samples. The dump can be inspected with a waveform viewer for debugging. Post-processing the dump files also makes it possible to obtain bus activity statistics, test coverage and check for protocol violations. This information helps with performance measurement and with deciding the bus architecture.
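The start/stop triggering and the file rotation described above can be sketched in software as follows. The names `capture` and `split_into_files` are hypothetical, the triggers are modeled as plain predicates over one sampled cycle, and rotation is counted in samples rather than bytes to keep the example small (the monitor itself rotates at 1 GByte). Whether the samples that hit the triggers are themselves stored is not stated in the text; this sketch assumes they are.

```python
from typing import Callable, Dict, Iterable, List

Sample = Dict[str, int]              # signal values sampled in one bus cycle
Trigger = Callable[[Sample], bool]   # a triggering condition on bus activity


def capture(samples: Iterable[Sample], start: Trigger, stop: Trigger) -> List[Sample]:
    """Keep samples from the first one matching `start` up to and including
    the first later one matching `stop` (inclusive ends are an assumption)."""
    captured: List[Sample] = []
    active = False
    for sample in samples:
        if not active:
            if start(sample):
                active = True
                captured.append(sample)
            continue
        captured.append(sample)
        if stop(sample):
            break
    return captured


def split_into_files(captured: List[Sample], max_per_file: int) -> List[List[Sample]]:
    """Start a new dump 'file' each time the current one is full."""
    return [captured[i:i + max_per_file] for i in range(0, len(captured), max_per_file)]


trace = [
    {"haddr": 0x1000, "hwrite": 0, "htrans": 2},
    {"haddr": 0x2000, "hwrite": 1, "htrans": 2},
    {"haddr": 0x2004, "hwrite": 1, "htrans": 3},
    {"haddr": 0x0000, "hwrite": 0, "htrans": 0},
    {"haddr": 0x3000, "hwrite": 0, "htrans": 2},
]
# Start on a write to 0x2000, stop on the next idle cycle.
window = capture(trace,
                 start=lambda s: s["hwrite"] == 1 and s["haddr"] == 0x2000,
                 stop=lambda s: s["htrans"] == 0)
files = split_into_files(window, max_per_file=2)
print(len(window), [len(f) for f in files])
```

Statistics such as the fraction of idle cycles can then be computed over `window` in the same post-processing style.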

4 Automated bus generation

To meet the requirements described above, we made a tool that generates the bus architecture automatically from the bus specification. Figure 3 shows the design flow using the tool. The bus generator produces two kinds of bus models from the user's bus specification, using the AMBA bus component library described in section 4.1. One bus model is for simulation and is in HDL. The other is for emulation and is in EDIF format. The bus generator takes the bus specification through a graphical user interface and automatically connects all required components. Going from the specification to bus models that are ready for simulation or emulation takes only several minutes, so the time and effort required for the complicated SoC design process can be reduced greatly. In addition, when the bus architecture is not yet fixed, design space exploration can be done accurately and quickly. The bus architecture may include the bus hierarchy, arbitration scheme, memory map, clock speed and so on. Because the design iteration time is quite small, such design factors can be decided based on the results obtained through simulation or emulation.

Figure 3. Design flow with Bus generator (figure labels: Bus Specification, HW design, SW application, Bus component Library, Bus Generator, Bus model for Simulation, Bus model for Emulation, Synthesized Design, Co-simulation with HDL Simulator, Co-emulation with Emulator, C/C++ Compiler or ISS)

4.1 Reconfigurable bus components

An AMBA bus system contains several components. Figure 4 illustrates an example AMBA bus. The gray blocks are application-specific user IPs. There are two kinds of components: the basic AMBA bus blocks and the special function blocks. All components are configurable with parameters.

Figure 4. Bus architecture example (figure labels: Bus Configuration module, Clock Generator, Arbiter, Decoder, AHB0, AHB1)

The arbiter, decoder, muxes, APB bridge and AHB-to-AHB bridge are the basic blocks of the AMBA bus. The arbiter performs priority-based arbitration and supports up to 16 masters. The priorities of the masters are configurable through the configuration module, as is the default master. When no master is requesting the bus, the default master is granted the bus. The decoder has two address maps, normal and boot. Each address can be configured through the configuration module, and the active address map can be switched at run time by setting the address mode control bit. The decoder includes a default slave. There are two muxes, the master-to-slave mux and the slave-to-master mux, which support up to 16 masters and 16 slaves respectively. The APB bridge also has two address maps, normal and boot, and each address can be configured through the configuration module. The APB bridge includes an asynchronous FIFO, so the APB bus can have a clock that is independent of the AHB clock. The APB bridge supports 16 APB slaves. The AHB-to-AHB bridge connects two AHB buses and makes it possible to construct a hierarchical bus architecture. It has an asynchronous FIFO, so the clocks of the two AHB buses can be independent. It supports all burst modes of the master and all responses of the slave.

There are three special function blocks: the BFM, the bus configuration module and the clock generator. The bus functional model (BFM) is used to communicate with the software side at the transaction level. As described above, almost all bus components are configurable, and the bus configuration module performs the configuration. This module takes configuration information from software and distributes it to the configurable components. The clock generator is connected to the configuration module to obtain its configuration information. Based on this information, it generates divided clocks and resets for the bus system. The clock generator can generate up to 16 different clocks and 4 resets. Clocks are generated by dividing the PCI clock. The fastest clock has the same frequency as the PCI clock, and the slowest clock is 128 times slower than the PCI clock.
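The configurable behavior of three of the blocks described above (priority arbitration with a default master, address decoding with switchable normal and boot maps, and clock division) can be modeled in a few lines. This is a behavioral sketch only, not the generated hardware: the function names, the representation of an address map as (base, size, slave) tuples, and the assumption that any integer divisor from 1 to 128 is allowed are all hypothetical; the text states only the fastest and slowest ratios.

```python
from typing import Dict, List, Optional, Tuple

AddressMap = List[Tuple[int, int, int]]   # (base address, size in bytes, slave index)


def arbitrate(requests: List[int], priority: List[int], default_master: int) -> int:
    """Grant the requesting master that appears first in `priority`;
    if nobody is requesting, grant the default master."""
    for master in priority:
        if master in requests:
            return master
    return default_master


def decode(addr: int, maps: Dict[str, AddressMap], mode: str) -> Optional[int]:
    """Select the slave for `addr` using the active map ('normal' or 'boot').
    None stands for the default slave."""
    for base, size, slave in maps[mode]:
        if base <= addr < base + size:
            return slave
    return None


def divided_clock_hz(pci_clock_hz: float, divisor: int) -> float:
    """Frequency of a clock divided down from the PCI clock (ratio 1 to 128)."""
    if not 1 <= divisor <= 128:
        raise ValueError("divisor must be between 1 and 128")
    return pci_clock_hz / divisor


address_maps = {
    "normal": [(0x0000_0000, 0x1000, 0), (0x1000_0000, 0x1000, 1)],
    "boot":   [(0x0000_0000, 0x1000, 1)],
}
print(arbitrate([2, 1], priority=[0, 1, 2], default_master=0))
print(decode(0x10, address_maps, "normal"), decode(0x10, address_maps, "boot"))
print(divided_clock_hz(33e6, 128))
```

Because `priority`, `address_maps`, `mode` and `divisor` are plain data here, changing them needs no rebuild, which mirrors the point that these parameters can be changed through the configuration module without re-synthesis.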

5 Case study

We applied the proposed environment to a JPEG decoding system, shown in Figure 5. The header management part is implemented in the C algorithm on the host side. The other parts, namely VLD (Variable Length Decode), IDCT (Inverse Discrete Cosine Transform) and memory, are implemented in hardware. The BFM, which operates as a master, handles communication between the algorithm and the hardware. VLD and IDCT each have both a master and a slave interface, so there are 3 masters (the BFM, VLD and IDCT) and 3 slaves (VLD, IDCT and memory) in the system. The configuration module and the clock generator are omitted from the figure. Using the bus generation tool, the bus system could be created in a few minutes. The next step is connecting the bus with IPs such as VLD, IDCT and memory, which also takes a few minutes. To map the design into the FPGA, FPGA compilation is needed. The compile time depends on the FPGA device and the operating system and is generally about 1 or 2 hours. So when the required IPs are available, a few hours are enough to go from specification to co-emulation. The JPEG hardware ran at a clock frequency of 33 MHz.

Figure 5. JPEG decoding system

6 Conclusions

We introduced an SoC design environment with an FPGA-based emulation system. For debuggability, we designed the AMBA monitor, which samples all bus activity in a pin- and cycle-accurate way. The prototype system can run at a clock speed close to that of the real system, so exhaustive verification and design space exploration are possible. To improve the SoC design flow, we developed an automatic bus generation environment, in which the bus architecture can be generated easily and quickly from the bus specification using a library of configurable bus components. With it, the complicated SoC design process from specification to prototyping system can be completed with very little time and effort. This co-verification environment is pseudo-cycle accurate; further work is needed to synchronize the ISS with the designs in hardware.

References

[1] Benini, L., “Virtual In-Circuit Emulation for Timing Accurate System Prototyping”, ASIC/SOC Conference, pp. 49-53, Sept 2002.
[2] Schaumont, P., “Interactive Cosimulation with Partial Evaluation”, Design Automation and Test in Europe Conference and Exhibition, pp. 642-647, Feb 2004.
[3] http://www.arm.com/products/DevTools/IntegratorAP.html
[4] http://www.dynalith.com/2003/iprove.php
[5] http://www.synopsys.com/products/designware/dwlibrary.html
[6] Schnerr, J., “Instruction Set Emulation for Rapid Prototyping of SoCs”, Design Automation and Test in Europe Conference and Exhibition, pp. 56-567, Mar 2003.
[7] Bieger, J., “Rapid Prototyping for Configurable System-on-a-Chip Platforms: A Simulation Based Approach”, International Conference on VLSI Design, pp. 577-582, Jan 2004.
[8] ARM, AMBA specification Rev 2.0.
