SHORT PAPER International Journal of Recent Trends in Engineering, Vol 2, No. 7, November 2009
Module Based Implementation of Partial Reconfiguration Using VHDL on Xilinx FPGA
Solomon Raju Kota¹, Ashutosh Gupta², Shashikant Nayak³, and Sreekanth Varma⁴
¹ Scientist, Central Electronics Engineering Research Institute (CEERI)/Council of Scientific & Industrial Research (CSIR), Pilani, Rajasthan-333031, INDIA. Email: [email protected]
²,³,⁴ Project Assistant, Digital System Group, Central Electronics Engineering Research Institute (CEERI), Pilani, India. Email: {[email protected]}, {nayak.tezu, srikanthvarmaiiit}@gmail.com
Abstract—Reconfigurable computing is an emerging field in computer and electronics engineering that allows the system hardware to be changed periodically in order to execute different applications on the same hardware. Partial reconfiguration is a prerequisite for reconfigurable computing, as it allows time-sharing of physical resources for the execution of multiple design modules. Moreover, partially reconfigurable modules can be swapped in or out of the operating control environment on the fly while other modules in the base design continue functioning, without incurring any system downtime. This results in a dramatic increase in the speed and functionality of FPGA-based systems. This paper presents the algorithm for the partial reconfiguration flow and the implementation of reconfigurable modules (RMs) on a Xilinx Virtex-4 (XC4VFX12). All the reconfigurable modules are designed, simulated in ModelSim 6.0d to verify their functionality, and synthesized with Xilinx ISE 9.1i.
Index Terms— FPGA, RM (reconfigurable module), partial reconfiguration, reconfigurable computing
I. INTRODUCTION
Computation is conventionally performed in one of two ways: in hardware or in software. The first method uses hardware such as application-specific integrated circuits (ASICs) and application-specific instruction set processors (ASIPs) to execute critical tasks quickly. This approach delivers high performance because the hardware is optimized for a particular application, but it offers little flexibility, because an application cannot always be adapted to the hardware. The second method uses a von Neumann computer (general purpose processor, GPP) or a microcontroller. GPPs are more flexible because they can execute any kind of task, but in terms of silicon area, power usage and speed they fall well short of ASICs/ASIPs, because a general purpose processor interprets a linear sequence of instructions. Reconfigurable computing blends the benefits of both hardware and software. It tries to fill the gap between the hardware (ASIC/ASIP) and software (GPP/microcontroller) approaches, as shown in Fig.1 [1]-[2]. Reconfigurable computing is defined as the study of computation using reconfigurable devices [3]. The different models, architectures, compilation and scheduling of tasks, reconfiguration methods, optimal mapping of the design library on the RLU and the state of the art of reconfigurable computing systems (RCSs) are described in [4].
This paper is organized as follows: Section I gives a brief introduction to reconfigurable computing, Section II describes the different approaches to reconfiguration, Section III describes the implementation flow for partial reconfiguration (PR), Section IV discusses the implementation results, and Section V presents the conclusion and future work.
Fig.1: Flexibility vs. performance of processor classes [3]
II. RECONFIGURATION APPROACHES
A. Static Reconfiguration
This is the simplest and most common reconfiguration approach, also referred to as compile-time reconfiguration [5]. In this approach, each application consists of a single configuration bitstream. All reconfigurable modules are loaded with their respective configurations before operation commences. Once operation has started, the hardware resources remain static for the whole life span of the application, as depicted in Fig.2, which is why the approach is called static reconfiguration.
Fig.2: Static reconfiguration
Project number GAP 6212, sponsored by DIT, MCIT New Delhi.
19 © 2009 ACADEMY PUBLISHER
B. Dynamic Reconfiguration
This is an advanced technique that uses a dynamic scheme to re-allocate hardware resources at run-time [5]. It makes much better use of the physical hardware resources. As the application demands, it allows hardware resources to be logically added to or removed from the operating control environment on the fly while other base modules continue to operate. Dynamic reconfiguration (also called run-time reconfiguration, RTR) allows reconfiguration and execution to proceed at the same time, as depicted in Fig.3, whereas statically reconfigurable devices require execution to be interrupted. The idea behind partial dynamic reconfiguration, or the PR technique, is to reconfigure only the needed part of the device. Partial reconfiguration is not supported on all FPGAs. The Xilinx Virtex series of FPGAs (Virtex, Virtex-E, Virtex-II, Virtex-II Pro, Virtex-4), for example, allows partial reconfiguration.
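The behaviour described above can be illustrated with a small software model. This is only a toy sketch, not the authors' design: the class `ToyDevice`, its methods and the two module names are hypothetical, and the "static region" is reduced to a cycle counter. It shows only that swapping the module in the PR region leaves the state of the static region untouched.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class ToyDevice:
    """Toy model (hypothetical): one static region plus one PR region."""
    static_cycles: int = 0                   # work done by the base design
    active_module: Optional[str] = None      # module currently in the PR region
    library: Dict[str, Callable[[int, int], int]] = field(default_factory=dict)

    def tick(self) -> None:
        """The base design keeps running on every clock tick."""
        self.static_cycles += 1

    def load_partial(self, name: str) -> None:
        """Swap a module into the PR region; the static region is not reset."""
        if name not in self.library:
            raise KeyError(f"no partial configuration named {name!r}")
        self.active_module = name

    def run_module(self, a: int, b: int) -> int:
        """Execute whatever module currently occupies the PR region."""
        if self.active_module is None:
            raise RuntimeError("PR region is blank")
        return self.library[self.active_module](a, b)


dev = ToyDevice(library={"add": lambda a, b: a + b, "sub": lambda a, b: a - b})
dev.tick()
dev.load_partial("add")
sum_result = dev.run_module(7, 5)
dev.tick()
dev.load_partial("sub")      # reconfigure only the PR region
diff_result = dev.run_module(7, 5)
dev.tick()
print(sum_result, diff_result, dev.static_cycles)
```

In a statically reconfigured device the equivalent of `load_partial` would reload the whole configuration, so the counter standing in for the base design would not survive the swap.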
Fig.3: Dynamic Reconfiguration
III. IMPLEMENTATION FLOW
The FPGA is the fundamental computing block of a reconfigurable computing system. The step-by-step procedure for implementing partially reconfigurable modules on the target FPGA device is shown in Fig.4 and Fig.5.
Step 1 of the PR implementation flow is to write the HDL description of the design and synthesize it. The synthesis tool (XST in our case) generates .ngc/.ngo output files. The hierarchy must be strictly followed during HDL coding, because PR requires a hierarchical design approach.
Step 2 places constraints such as AREA GROUP, AREA GROUP RANGE, MODE and LOC, along with timing constraints (PERIOD, etc.), on the design for place and route.
Step 3 determines whether a large enough area is reserved for each PR module with the given AG (area group) range constraints.
Step 4 analyzes both the timing and the placement of the design. Timing and placement analysis is critical in optimizing the shape, size and location of the PR region. The timing analyzer generates a timing report for any paths that do not meet the timing constraints. During timing/placement analysis the designer can check whether the bus macros are placed effectively, and whether the shape and location of the PR region meet the timing constraints.
Step 5 implements the base design. During this step PAR (place and route) generates a static.used output file, which lists the routes required by the base design within the PR region. These routes cannot be used by PR modules.
Step 6 implements the PR modules, using the static.used file as an input. This file changes every time the base design is re-implemented. Therefore, the PR modules must be re-implemented if any modification is made to the base design.
Step 7 merges the top, base and PR modules. This creates several full and partial bitstreams: one partial bitstream for each PR module and one full bitstream for each PR module merged with the base design. It also generates a blank.bit file, which can be used to replace the PR region with blank logic, i.e., a PR region with no functionality. Partial bitstreams reconfigure only the PR region instead of the entire FPGA and can be loaded or unloaded on the fly, thereby improving the functionality and performance of the FPGA.
Fig.5: PR Implementation Flow
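The area check in Step 3 can be approximated by simple arithmetic. The sketch below is a rough illustration under stated assumptions, not the check performed by the Xilinx tools: it assumes two 4-input LUTs and two flip-flops per Virtex-4 slice (consistent with the 10,944 LUTs and 5,472 slices reported in Table I), treats the function generators of Table II as LUTs, and ignores routing and bus macro resources. The names `SliceRange` and `fits` and the 4x4 slice range are hypothetical.

```python
from dataclasses import dataclass

LUTS_PER_SLICE = 2  # assumption: two 4-input LUTs per Virtex-4 slice
FFS_PER_SLICE = 2   # assumption: two flip-flops per Virtex-4 slice


@dataclass(frozen=True)
class SliceRange:
    """Rectangular slice range with inclusive corners, e.g. X0Y0 to X3Y3."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def slices(self) -> int:
        """Number of slices enclosed by the rectangle."""
        return (self.x1 - self.x0 + 1) * (self.y1 - self.y0 + 1)


def fits(region: SliceRange, luts_needed: int, ffs_needed: int) -> bool:
    """True if the region offers enough LUTs and flip-flops for the module."""
    return (region.slices * LUTS_PER_SLICE >= luts_needed
            and region.slices * FFS_PER_SLICE >= ffs_needed)


region = SliceRange(0, 0, 3, 3)  # hypothetical PR region of 16 slices
rm1_fits = fits(region, luts_needed=4, ffs_needed=2)    # RM1 (Leds), Table II
rm2_fits = fits(region, luts_needed=33, ffs_needed=32)  # RM2 (ADD/SUB), Table II
print(region.slices, rm1_fits, rm2_fits)
```

With these assumptions a 16-slice region holds RM1 but is one LUT short for RM2, so the range would have to be enlarged before both modules could share the same PR region.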
IV. IMPLEMENTATION RESULTS
Table I shows the device utilization after synthesis and implementation of the top module, and Table II shows the resources used by the different modules.
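As a quick consistency check, the utilization percentages in Table I can be reproduced from the used and available counts. The sketch below assumes the tool reports the ratio truncated to a whole percent; that rounding rule is an inference from the numbers, not something stated in the paper. The function name `utilization_percent` is hypothetical.

```python
def utilization_percent(used: int, available: int) -> int:
    """Utilization as a whole percent, truncated (assumed rounding rule)."""
    return used * 100 // available


# (used, available, percentage reported in Table I)
table_1 = {
    "slice flip-flops": (281, 10_944, 2),
    "4-LUTs (logic utilization)": (477, 10_944, 4),
    "occupied slices": (344, 5_472, 6),
    "4-LUTs (logic distribution)": (530, 10_944, 4),
    "bonded IOBs": (10, 320, 3),
    "BUFGs": (1, 32, 3),
    "DSP48s": (17, 32, 53),
}

all_match = all(
    utilization_percent(used, available) == reported
    for used, available, reported in table_1.values()
)
lut_breakdown_ok = 477 + 34 + 19 == 530  # logic + route-thru + shift registers
print(all_match, lut_breakdown_ok)
```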
Fig.4: Algorithm for PR Implementation
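The outcome of the merge in Step 7 of this flow can be sketched as a simple enumeration: one partial bitstream per PR module, one full bitstream per PR module merged with the base design, and a blank bitstream for the PR region. Apart from blank.bit, all file and module names below are hypothetical placeholders; the actual names depend on the tool scripts used.

```python
from typing import List


def merged_bitstreams(pr_modules: List[str], base: str = "static") -> List[str]:
    """List the bitstreams the merge step is described as producing."""
    partial = [f"{m}_partial.bit" for m in pr_modules]    # one per PR module
    full = [f"{base}_{m}_full.bit" for m in pr_modules]   # base + each PR module
    return partial + full + ["blank.bit"]                 # blank PR region


files = merged_bitstreams(["rm1_leds", "rm2_addsub"])
for name in files:
    print(name)
```

For the two reconfigurable modules of this design, the sketch lists two partial bitstreams, two full bitstreams and the blank bitstream.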
TABLE I. DEVICE UTILIZATION FOR TOP MODULE (XC4VFX12, Speed Grade 10)
Resource | Used | Available | Utilization
Logic Utilization | | |
Number of slice flip-flops | 281 | 10,944 | 2%
Number of 4-LUTs | 477 | 10,944 | 4%
Logic Distribution | | |
Total number of occupied slices | 344 | 5,472 | 6%
Number of slices containing only related logic | 344 | 344 | 100%
Number of slices containing unrelated logic | 0 | 344 | 0%
Number of 4-LUTs | 530 | 10,944 | 4%
Number used as logic | 477 | |
Number used as a route-thru | 34 | |
Number used as shift registers | 19 | |
Number of bonded IOBs | 10 | 320 | 3%
Number of BUFGs | 1 | 32 | 3%
Number of hard macros | 10 | |
Number of DSP48s | 17 | 32 | 53%
Total equivalent gate count for design (not including hard macros) | 6,787 | |
Maximum combinational path delay: 4.317 ns. Maximum frequency: 134.340 MHz.
TABLE II. NUMBER OF RESOURCES USED BY STATIC, DYNAMIC AND TOP MODULE
Module | IOBs | 16-bit memory elements | Function generators (FGs) | D flip-flops (DFFs) | BUFG
Top | 10 | 179 | 484 | 362 | 1
RM1 (Leds) | - | - | 4 | 2 | -
RM2 (ADD/SUB) | - | - | 33 | 32 | -
Static (control RM1) | - | - | 368 | 158 | -
Static (control RM2) | - | - | 9 | 29 | -
A result of the reconfigurable module (Add/Sub), as displayed on HyperTerminal, is shown in Fig.6.
Fig.6: Implementation results of PR modules
V. CONCLUDING REMARKS
In this paper we have discussed the implementation flow for partial reconfiguration. Reconfigurable modules (RMs) have been modeled in VHDL and implemented on a Xilinx Virtex-4 (XC4VFX12) FPGA board with partial reconfiguration. Partial reconfiguration saves silicon area by allowing multiple configurations to be swapped in or out of the device, and provides the flexibility to selectively replace one configuration with another. It also reduces reconfiguration time by reloading data to only the needed part of the chip. In future we intend to apply these concepts to build a smart reconfigurable computing system.
ACKNOWLEDGEMENT
This work is part of the project “Design and development of system level reconfiguration techniques for reconfigurable computing systems” sponsored by DIT, MCIT, New Delhi. The authors thank the Director of CEERI, Dr. Chandrashekhar, and the Group Leader, Dr. P. Bhanu Prasad, for their constant support and encouragement.
REFERENCES
[1] K. Bondalapati and V. Prasanna, “Reconfigurable Computing Systems,” Proc. IEEE, vol. 90, no. 7, pp. 1201-1217, July 2002.
[2] Katherine Compton and Scott Hauck, “Reconfigurable Computing: A Survey of Systems and Software,” ACM Computing Surveys, vol. 34, no. 2, pp. 171-210, June 2002.
[3] Christophe Bobda, “Introduction to Reconfigurable Computing,” Springer, 2007.
[4] K. Solomon Raju, M. V. Kartikeyan, R. C. Joshi and Chandra Shekhar, “Reconfigurable Computing Systems Design: Issues at System-Level Architectures,” The 5th Annual Inter Research Institute Student Seminar in Computer Science (IRISS - 2006), IITM, Chennai, India, January 2006.
[5] Nikolaos S. Voros and Konstantinos Masselos, “System Level Design of Reconfigurable Systems-on-Chip,” Springer, 2005.
[6] “Two Flows for Partial Reconfiguration: Module Based or Difference Based,” Xilinx website [online], http://www.xilinx.com/support/documentation/application_notes/xapp290.pdf, http://www.xilinx.com/itp/xilinx7/books/data/docs/dev/dev0038_8.html
[7] P. Sedcole, B. Blodget, J. Anderson, P. Lysaght, T. Becker, “Modular partial reconfigurable in Virtex FPGAs,” International Conference on Field Programmable Logic and Applications, pp. 211-216, Aug. 2005.