International Conference on Emerging Technological Trends in Advanced Engineering Research [ICETT 2012], 2012 February 20-21.

Synergy and Systems in Silicon: Survey on FPGA Based Hardware Software Co-design

R. Radhika* and R. Manimegalai+

* Research Scholar, Anna University of Technology, Chennai, [email protected]

+ Professor, Vellammal Engineering College, Chennai, [email protected]

Abstract—Most electronic systems have a predominant digital component consisting of a hardware platform that executes software applications. Hardware/software co-design means meeting system-level objectives by exploiting the synergism of hardware and software through their concurrent design. A new co-design compiler provides a co-synthesis and co-simulation environment for mixed FPGA/processor architectures. The proposed software allows the user to add facilities for targeting the extra instructions to the compiler. Digital hardware design and software design are more or less similar to each other. Current integrated circuits can incorporate one or more processors and memory arrays on a single substrate. These "systems on silicon" provide flexibility for product evolution and differentiation purposes. This paper introduces the reader to various aspects of co-design, co-synthesis, co-simulation, and co-emulation.

Keywords—Hardware/software co-design, co-synthesis, co-simulation, co-emulation, Field Programmable Gate Array.

I. INTRODUCTION

Most engineering systems have several components whose combined operation provides useful services. Today these systems are electronic in nature, whether for monitoring or control, and they show a predominant digital component. The digital components of electronic systems are programmable: they have both hardware and software parts, and measured objectives such as manufacturing cost, performance, design reuse, and memory and CPU optimization depend on both. Components may be homogeneous or heterogeneous in nature. The cooperative design of hardware and software is hardware/software co-design. Research on co-design addresses the problems that arise in designing such heterogeneous systems; its aim is to shorten the time to market while reducing the design effort and the cost of the designed products.

The merits of using processors are manifold, because software is flexible and cheaper than hardware. This flexibility allows late design changes and simplifies debugging. The use of a processor is also very cheap compared with the development costs of ASICs, because processors are often produced in high volumes, leading to a significant price reduction.


Dedicated hardware is used by the designer only when processors are not able to meet the required performance. The tradeoff between hardware and software illustrates the optimization aspect of co-design; performed by humans, it is often a time-consuming and error-prone task. The recent rise of interest in hardware/software co-design is due to the introduction of computer-aided design (CAD) tools for co-design, and to the expectation that solutions to other co-design problems will also be supported by tools, thus raising the potential quality of products, shortening their development time, and reducing the design effort and cost of the designed products. Due to the extreme competitiveness of the marketplace, co-design tools are likely to play an important role. The forecast of worldwide revenues from integrated circuit sales, and in particular from circuits used in dedicated applications, explains the high demand for electronic system-level design tools, whose volume of sales is expected to grow at a compound annual rate of 53% over 45 years; no other technology in history has sustained such a high growth rate for so long. A report commissioned by the EU, Fig. 1, stipulates that the embedded system market is highly competitive and rapidly evolving as new technologies are introduced.

Fig. 1 Embedded system market report

The evolution of integrated circuit technology is motivating new approaches to digital circuit design.


Due to the complexity of hardware and software, their reuse is often a strategic key to commercial profitability. Software layers, ranging from the operating system to embedded software for user-oriented applications, have resulted in the standardization of cores and of ASICs. As a result, an increasingly large amount of software is found on semiconductor chips, which are often referred to as systems on silicon (SoS). Thus hardware and software can be viewed as commodities with large intellectual property (IP) value. Today, complete systems are found on single semiconductor chips, which are called systems on chip (SoC). Europe has a strong leadership position in the embedded system market over the US and China, Fig. 2; the market is highly segmented and delivers technology to end products in telecommunications, health, automobiles, etc. One important limitation imposed on any new technology is that the embedded system industry must adapt it to the existing code base. Hardware tends to evolve much faster than software, but embedded system companies do not want to lose that intellectual property, nor go through a redesign process with uncertain outcomes.

Fig. 2 Embedded systems market for various sectors

For entry-level multicore platforms, mapping applications onto them is considered problematic. Hence, designers focus their research on bridging the gap between hardware and software design. Today's embedded system development assumes a close interaction between hardware and software decisions, generally termed hardware/software co-design. The objective of hardware/software co-design is to exploit the synergism of hardware and software through their concurrent design. Co-design is perceived as an important problem, but the field is fragmented because most efforts are applied to specific design problems; thus, co-design has a different flavor according to the application in which it is used. Three cases can be distinguished: FPGA-based, ASIC-based, and SoC-based hardware/software co-design.


II. AUTOMATIC HARDWARE/SOFTWARE PARTITIONING

FPGAs can implement almost any function, as long as ample FPGA resources are available. The configuration memory determines the function implemented on the FPGA; an FPGA configuration of a design specifies the contents the configuration memory must hold to implement that design. Because the configuration memory is SRAM, it can be rewritten at run-time, which means that the functionality implemented on an FPGA can be changed at run-time. This is why FPGAs can be used for run-time reconfigurable implementations of applications.

Parameterisable configurations are adopted to perform run-time reconfiguration of FPGAs. A parameterisable configuration is one in which some of the bits are expressed as Boolean functions instead of Boolean values. From it, a regular FPGA configuration can be generated very quickly: the generation involves evaluating Boolean expressions only, and is therefore fast enough to be done at run-time. Current run-time reconfiguration techniques, such as the modular design flow from Xilinx [1], require a lot of below-RT-level design work and also demand that the user store all possible FPGA configurations externally. In the proposed tool flow, parameterisable configurations are used as a way to generate all possible configurations at run-time: only the parameterisable configuration needs to be stored, not every separate configuration. A detailed overview of this concept and the corresponding tool flow is given in [3] and [2]. Its main advantages are a high degree of automation and the possibility to rapidly generate new configurations at run-time.

The main principle of the proposed tool flow is that the inputs of the design are split into two groups: slowly changing inputs (parameters) and quickly changing inputs (regular signals). Certain bits in the parameterisable configuration are Boolean expressions in the parameters; evaluating them yields a regular FPGA configuration far faster than a conventional implementation flow would. When a parameter changes, however, a new configuration needs to be generated and the FPGA needs to be reconfigured. Both the generation and the reconfiguration take time, so run-time reconfiguration introduces overhead. The overhead, and its impact, is very application dependent; as a general rule, one run-time reconfiguration takes on the order of milliseconds.

The reconfiguration platform, independent of the reconfiguration technique, consists of two elements: the FPGA itself and a configuration manager. The configuration manager is a CPU (we will use a PowerPC on the FPGA, but that is not the only option). A lot of hardware/software partitioning research has focused on the system level [4][5] and on the algorithmic aspects of hardware/software partitioning [6][7]. By using our approach as a back end to conventional hardware/software partitioning tool flows or methods, such as [8][9], even more functionality can be shifted from costlier FPGA resources to cost-efficient CPU cycles.
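To make the notion of a parameterisable configuration concrete, the sketch below (in C++, the language SystemC itself builds on) models each configuration bit as either a fixed Boolean value or a Boolean function of the slowly changing parameters; generating a specialised configuration then reduces to evaluating those functions. This is only an illustration of the concept from [2][3] under our own naming (ParamConfig, specialise), not the actual tool flow.

    #include <functional>
    #include <variant>
    #include <vector>

    // The slowly changing inputs (parameters) of the design.
    using Params = std::vector<bool>;

    // A configuration bit is either a constant or a Boolean
    // function of the parameters (a "tunable" bit).
    using Bit = std::variant<bool, std::function<bool(const Params&)>>;

    struct ParamConfig {
        std::vector<Bit> bits;  // one entry per configuration-memory bit

        // Specialise: evaluate every tunable bit for the given
        // parameter values, yielding an ordinary FPGA configuration.
        std::vector<bool> specialise(const Params& p) const {
            std::vector<bool> out;
            out.reserve(bits.size());
            for (const Bit& b : bits) {
                if (const bool* c = std::get_if<bool>(&b))
                    out.push_back(*c);
                else
                    out.push_back(std::get<std::function<bool(const Params&)>>(b)(p));
            }
            return out;
        }
    };

Because specialisation only evaluates Boolean expressions, it is fast enough to run whenever a parameter changes, which is precisely why only the parameterisable configuration, and not every separate configuration, has to be stored.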


The proposed tool flow has the ability to do this because it makes use of the dynamic reconfigurability of the FPGA. The tool flow, presently used for run-time reconfiguration on FPGAs, is used to optimize a hardware design and to split it into a hardware part and a software part in a highly automated fashion. The tool generates a parameterisable configuration in which some bits are Boolean values (0 or 1); these form an incomplete FPGA configuration, the hardware part. The software part consists of the Boolean expressions in the parameterisable configuration, which depend only on the parameters. Evaluating these Boolean expressions produces the values needed to complete the FPGA configuration, and this evaluation is performed in software by the configuration manager.

From a hardware/software perspective, the hardware section contains all the parts of the design that depend on the quickly varying inputs. The sections of the design that depend only on the slowly changing inputs are moved to software. Every time the slowly changing inputs change, the software is rerun and the hardware is reconfigured. The hardware runs on the FPGA, the software on the PowerPC.

In a conventional hardware/software partitioning approach, Fig. 3, a method without run-time reconfiguration would be chosen. The slowly changing inputs of the design are selected, and these correspond to the parameters chosen in the proposed tool flow. Likewise, the functionalities that depend only on these parameters are identified, and the hardware/software boundary is determined. The next step is to replace the boundary by registers and the parameter-dependent functionalities by software equivalents. The actual hardware then consists of the registers and the remaining section of the design. The signal values on the boundary are computed by the software and then fed to those registers.
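The conventional scheme just described can be pictured with a short, hedged sketch: the parameter-dependent logic is re-implemented in software, and its outputs are written into the boundary registers that feed the remaining hardware. The register base address and the computeBoundary function are hypothetical placeholders, not part of any referenced tool.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Hypothetical memory-mapped boundary registers between the
    // software-evaluated logic and the remaining hardware.
    volatile std::uint32_t* const BOUNDARY_REGS =
        reinterpret_cast<volatile std::uint32_t*>(0x40000000u); // assumed base

    // Software re-implementation of the parameter-dependent logic
    // that was cut out of the hardware (placeholder computation).
    std::vector<std::uint32_t> computeBoundary(
            const std::vector<std::uint32_t>& params) {
        std::vector<std::uint32_t> vals(params.size());
        for (std::size_t i = 0; i < params.size(); ++i)
            vals[i] = params[i] ^ 0x1u;   // stand-in for the real logic
        return vals;
    }

    // Called whenever a slowly changing input (parameter) changes:
    // evaluate the boundary signals and feed them to the registers.
    void onParameterChange(const std::vector<std::uint32_t>& params) {
        const std::vector<std::uint32_t> vals = computeBoundary(params);
        for (std::size_t i = 0; i < vals.size(); ++i)
            BOUNDARY_REGS[i] = vals[i];
    }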

Fig. 3 Conventional hardware/software partitioning

Several things change when using an approach that includes run-time reconfiguration. Using the parameterisable configuration concept and the proposed tool flow, the hardware/software boundary can be extended by moving it into the configuration memory instead of adding registers to the design. In addition, since parameterisable configurations are used, the hardware is a unique circuit that is optimized for the current parameter values, in contrast to its generic hardware counterpart. As for the software, it consists of Boolean functions that are produced automatically, based on the hardware functionality they replace.

Fig. 4 Hardware/software partitioning using the proposed tool

Because the proposed tool flow is automatic once the parameters in the design are chosen, only one extra design decision has to be made in comparison with a normal design flow. This is shown in Fig. 4. The time needed for such a decision is influenced by a lot of different factors and circumstances. The advantages of the proposed tool are that the feasibility of any parameter selection can be checked in a few minutes, and that for some applications an extra area gain can be achieved by extending the hardware/software boundary using run-time reconfiguration.

III. CO-SYNTHESIS

In reconfigurable computing (RC) systems, the scheduler addresses processing elements (PEs) that are not fixed and can be tailored to the current processing needs. RC schedulers have to consider the resource constraints on the reconfigurable hardware, such as the number of FFs, LUTs, MULTs, and CLBs, as well as the reconfiguration overhead, routing overhead, size constraints, throughput, and power constraints, before selecting a task to map onto the FPGA. The scheduler also has to consider various implementations of the targeted function. A high-throughput implementation requires a large number of resources, making it difficult to map any other activity onto the same FPGA; the same task can instead be implemented to reuse cores and achieve a smaller footprint, allowing more functionality but suffering from a lower throughput. This increases the complexity of the scheduler, as it can no longer assume a single implementation of a task when mapping it to a processing element. Algorithms for partitioning an application base their decisions initially on the speed of the processor and the communication overhead. The computational abilities of an RC system are configurable and are modified based on the objective functions for a specific application. Scheduling on RC systems therefore needs careful consideration of resource constraints, reconfiguration overhead, routing overhead, size constraints, communication overhead, throughput, and power constraints before choosing a task to map onto the FPGA.
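As a concrete illustration of these checks, the sketch below models the FPGA's resource budget and the cost terms a scheduler must weigh before mapping a task. All field names and the cost model are illustrative, not taken from any specific scheduler.

    #include <cstddef>

    // Resource budget of the reconfigurable fabric (illustrative fields).
    struct FpgaResources {
        std::size_t luts, ffs, mults, clbs;
    };

    // Feasibility: a demand fits only if every resource is within
    // what is still free on the FPGA.
    bool fits(const FpgaResources& free, const FpgaResources& need) {
        return need.luts  <= free.luts  && need.ffs  <= free.ffs &&
               need.mults <= free.mults && need.clbs <= free.clbs;
    }

    // Effective cost of mapping a task to the FPGA: execution time plus
    // the reconfiguration and routing overheads listed in the text.
    double fpgaCost(double exec_time, double reconfig_overhead,
                    double routing_overhead) {
        return exec_time + reconfig_overhead + routing_overhead;
    }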

A. Modifying the MET Algorithm

The MET algorithm [29][27][28] uses an iterative search method to identify and assign a task to the PE that leads to the minimal execution time (MET). The algorithm essentially always runs a given task on the PE it is biased towards: it ignores the PE load and other constraints such as communication overhead, and assigns the task to that PE. This works well on systems with small workloads or with large intervals between task submissions. The algorithm has been changed to consider the various difficulties and limitations of an RC system, and it allots tasks according to the feasibility of the system. It is also extended to track the area used on the FPGA and to exploit parallelism. It first discovers the biases, then repeatedly searches for reusability in both PEs, and then allots the best PE for the task. The schedule is then run through an area-based optimizer, which iterates through all tasks scheduled for the FPGA and searches for parallelism.

B. Modifying the MCT Algorithm

The MCT algorithm [29][27][28] uses an iterative search method to sort the data-independent tasks according to their minimal completion time (MCT). The algorithm sorts the tasks so that the shortest tasks are allotted first. The MCT algorithm ignores the PE load and other constraints such as communication overhead, and simply allots the shortest activity to the next available PE. The algorithm has been changed to take into account the various restrictions and constraints of an RC system. It is also improved to track the area utilized on the FPGA and to exploit parallelism. It first discovers the task sizes, then repeatedly sorts the tasks so that the shorter tasks are assigned first, and then assigns each task to the next available PE. The schedule is then run through an area-based optimizer, which iterates through all tasks scheduled for the FPGA and searches for parallelism.
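A minimal sketch of the modified MET idea, reusing the illustrative FpgaResources and fits definitions from the previous sketch: each task goes to the PE with the smallest estimated execution time, but an FPGA assignment is accepted only while the task's implementation still fits in the remaining area. The MCT variant differs mainly in sorting tasks by completion time before assignment.

    #include <limits>
    #include <vector>

    enum class PeKind { Cpu, Fpga };

    struct Task {
        std::vector<double> exec_time;  // exec_time[i]: estimate on PE i
        FpgaResources area;             // area demand if mapped to an FPGA
    };

    struct Pe {
        PeKind kind;
        FpgaResources free;             // remaining area (FPGA PEs only)
    };

    // Modified MET: pick the PE with minimal execution time, skipping
    // FPGA PEs on which the task no longer fits; claim area on assignment.
    int scheduleMet(const Task& t, std::vector<Pe>& pes) {
        int best = -1;
        double bestTime = std::numeric_limits<double>::infinity();
        for (int i = 0; i < static_cast<int>(pes.size()); ++i) {
            if (pes[i].kind == PeKind::Fpga && !fits(pes[i].free, t.area))
                continue;                       // infeasible on this FPGA
            if (t.exec_time[i] < bestTime) {
                bestTime = t.exec_time[i];
                best = i;
            }
        }
        if (best >= 0 && pes[best].kind == PeKind::Fpga) {
            pes[best].free.luts  -= t.area.luts;    // reserve the area
            pes[best].free.ffs   -= t.area.ffs;
            pes[best].free.mults -= t.area.mults;
            pes[best].free.clbs  -= t.area.clbs;
        }
        return best;   // index of the chosen PE, or -1 if none feasible
    }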

C. Modifying the OLB Algorithm

The OLB algorithm [27][28][29] uses a quick heuristic that allots a task to the next idle PE. It uses opportunistic load balancing (OLB) to look for an idle PE and simply assigns a task to the next PE expected to become idle. This works well on systems with a regular workload that can benefit from random assignments. The algorithm has been changed to take into account the various limitations and constraints of an RC system, and it assigns tasks based on their feasibility on the system. It is also improved to track the area used on the FPGA and to exploit parallelism. It first discovers the idle PEs, and then assigns the task to the next PE likely to be idle. The schedule is then run through an area-based optimizer, which iterates through all tasks scheduled for the FPGA and searches for parallelism.
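For contrast with MET, a sketch of the modified OLB heuristic under the same illustrative types: the task goes to the PE expected to become idle first, subject to the same FPGA feasibility check, while the task's own execution time is deliberately ignored.

    #include <vector>

    // Modified OLB: assign the task to the PE expected to become idle
    // first; idle_at[i] is the time PE i is expected to become free.
    int scheduleOlb(const Task& t, const std::vector<Pe>& pes,
                    const std::vector<double>& idle_at) {
        int best = -1;
        for (int i = 0; i < static_cast<int>(pes.size()); ++i) {
            if (pes[i].kind == PeKind::Fpga && !fits(pes[i].free, t.area))
                continue;                       // infeasible on this FPGA
            if (best < 0 || idle_at[i] < idle_at[best])
                best = i;                       // earliest-idle feasible PE
        }
        return best;   // OLB ignores the task's execution time itself
    }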

D. Modifying the Min-Min and Min-Max Algorithms

The Min-Min and Min-Max algorithms [27][28][29] use an iterative method to sort the data-independent tasks according to their minimal completion time (MCT) in the case of Min-Min, and their maximum completion time in the case of Min-Max. The tasks are sorted so that the shortest tasks are assigned first for Min-Min, and the longest tasks first for Min-Max; the PE load and other constraints such as communication overhead are ignored. Min-Min works well on systems with many small tasks and a few larger ones, while Min-Max performs well when there are just a few long tasks. The algorithms have been changed to take into account the various restrictions and constraints of RC systems, allotting tasks based on the feasibility of the system. They are also improved to track the area utilized on the FPGA and to exploit parallelism. They first discover the task sizes, then iteratively sort the tasks, and then assign each task to the next available PE. The schedule is then run through an area-based optimizer, which iterates through all tasks scheduled for the FPGA and searches for parallelism.
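The Min-Min and Min-Max variants differ only in the order in which the ready tasks are taken; the sketch below, again reusing the types above, shows that sort step with an assumed completion-time estimate per PE.

    #include <algorithm>
    #include <cstddef>
    #include <limits>
    #include <vector>

    // Minimal completion time of a task over all feasible PEs:
    // ready time of the PE plus the task's execution time on it.
    double minCompletionTime(const Task& t, const std::vector<Pe>& pes,
                             const std::vector<double>& idle_at) {
        double best = std::numeric_limits<double>::infinity();
        for (std::size_t i = 0; i < pes.size(); ++i) {
            if (pes[i].kind == PeKind::Fpga && !fits(pes[i].free, t.area))
                continue;
            best = std::min(best, idle_at[i] + t.exec_time[i]);
        }
        return best;
    }

    // Order the ready, data-independent tasks: shortest completion time
    // first for Min-Min, longest first for Min-Max. Each task is then
    // assigned to the next available feasible PE, as described above.
    void orderTasks(std::vector<Task>& ready, const std::vector<Pe>& pes,
                    const std::vector<double>& idle_at, bool minMin) {
        std::sort(ready.begin(), ready.end(),
                  [&](const Task& a, const Task& b) {
                      double ca = minCompletionTime(a, pes, idle_at);
                      double cb = minCompletionTime(b, pes, idle_at);
                      return minMin ? ca < cb : ca > cb;
                  });
    }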


The parameters used to measure the performance of each algorithm are the scheduling time and the execution time required to execute the entire application, similar to the earlier experiments. The proposed enhancements produced schedules that exploited reconfigurability and also provided an order-of-magnitude improvement in speed. Figure 5 shows the statistics for the above algorithms.

Fig. 5 Near-optimal solution search time

IV. CO-SIMULATION AND CO-EMULATION

Presently, system-level design and functional verification methodologies based on high-level abstraction are increasingly important for raising the productivity of SoC design. In system-level design and verification, design space exploration determines the structure and performance of the final system. SystemC [14][15][16] is a design language that supports many abstraction levels and allows a design at a high abstraction level to be refined, step by step, into a synthesizable RT-level design.
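For readers unfamiliar with SystemC, a module at RT level is ordinary C++ built on the SystemC class library; the registered adder below is a generic illustration, not a design unit from the paper.

    #include <systemc.h>

    // A registered 8-bit adder: the combinational sum is captured on
    // the rising clock edge, i.e. classic RT-level style in SystemC.
    SC_MODULE(RegAdder) {
        sc_in<bool>          clk;
        sc_in<sc_uint<8> >   a, b;
        sc_out<sc_uint<8> >  sum;

        void tick() { sum.write(a.read() + b.read()); }

        SC_CTOR(RegAdder) {
            SC_METHOD(tick);
            sensitive << clk.pos();   // triggered on the rising edge only
            dont_initialize();
        }
    };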


SystemVerilog [17]-[21] is a set of extensions to the Verilog HDL that enables higher-level modeling and efficient verification of large digital systems. Co-simulation environments [10]-[13][22][23], which check the functional communication between the hardware part and the software part of a design, can be classified as homogeneous, heterogeneous, or semi-homogeneous based on the hardware and software modeling languages. As the hardware part becomes more complex and the scale of the design grows, simulation performance degrades markedly in co-simulation. The alternative is hardware/software co-emulation, in which the hardware part is modeled on real hardware such as an FPGA. Because the hardware and software parts are designed with SystemVerilog and SystemC, respectively, the co-simulation environment is heterogeneous. The link between SystemVerilog modules and SystemC design units is provided by the SystemVerilog Direct Programming Interface (DPI) [18]-[21][24]. Recently, ModelSim has supported SystemC simulation with a built-in SystemC compiler, so co-simulation of SystemC design units and SystemVerilog modules is performed as a single simulation process on ModelSim. The co-simulation environment based on SystemC and SystemVerilog is shown in Fig. 6. It is a native-code co-simulation in which an application program described in SystemC drives the signals of a SystemVerilog module by calling tasks that are defined and exported using DPI in the SystemVerilog module.
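The software side of such a DPI link is plain C/C++: a function exported from a SystemVerilog module with export "DPI-C" becomes callable as an external C symbol. A hedged sketch, with drive_filter_input as a hypothetical exported SystemVerilog function:

    #include <vector>

    // Declared on the SystemVerilog side (hypothetical names) as:
    //   export "DPI-C" function drive_filter_input;
    //   function void drive_filter_input(input int sample);
    // Through DPI it is visible here as an ordinary external C symbol.
    extern "C" void drive_filter_input(int sample);

    // SystemC-side application code: drive the SystemVerilog module's
    // signals by calling the exported DPI function for each stimulus.
    void sendStimulus(const std::vector<int>& samples) {
        for (int s : samples)
            drive_filter_input(s);   // call crosses into the SV simulator
    }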

Fig. 6 Native-code co-simulation environment

Co-emulation is done using iNCITE, an FPGA-based hardware emulator, and iNSPIRE, an integrated design environment released by Dynalith [11][25][26]. iNCITE is an FPGA board for hardware/software co-emulation provided by Dynalith; it offers a unique feature of hardware/software co-emulation through USB, where the user design is placed in the FPGA and the testing code runs on the host computer, Fig. 7. iNSPIRE is the integrated design environment for the FPGA-based emulation platform iNCITE. The EIF file and the proxy module for the hardware component, an FIR filter, are generated using iNSPIRE. The EIF file is used for FPGA mapping, and the proxy module is used in the SystemC simulation in place of the original SystemC filter design unit.
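The proxy-module idea can be pictured as a drop-in replacement for the original SystemC design unit that forwards each transaction to the FPGA instead of computing it in simulation; board_send and board_receive below stand in for the emulator's actual transport (here USB) and are purely hypothetical, not the Dynalith API.

    #include <cstdint>

    // Stand-ins for the emulator's transport to the FPGA board
    // (e.g. over USB); hypothetical declarations for illustration.
    void          board_send(std::uint32_t sample);
    std::uint32_t board_receive();

    // Proxy for the original FIR filter design unit: it keeps the same
    // interface, but the computation happens on the FPGA rather than
    // in the SystemC simulation.
    class FirFilterProxy {
    public:
        std::uint32_t process(std::uint32_t sample) {
            board_send(sample);       // ship the input sample to hardware
            return board_receive();   // read back the filtered output
        }
    };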

Fig. 7 The design flow using iNCITE and iNSPIRE

V. CONCLUSION

Hardware/software co-design presents an enormous challenge as well as an opportunity for system designers. This paper gives a detailed view of co-design, co-synthesis, co-simulation, and co-emulation, which form a wide field of research because of their varied applications, design styles, and implementation technologies. FPGA-based "systems on silicon" allow a very large design space to be explored, and the opportunity to provide very high communication bandwidth between the processor and the hardware means that co-design solutions have a good chance of producing more efficient designs.

REFERENCES



[1] Xilinx Inc., Two Flows for Partial Reconfiguration: Module Based or Small Bit Manipulation, Xilinx Inc., 2002.
[2] K. Bruneel and D. Stroobandt, "Automatic generation of run-time parameterizable configurations," in Proceedings of the International Conference on Field Programmable Logic and Applications, 2008, pp. 361-366.
[3] K. Bruneel and D. Stroobandt, "Reconfigurability-aware structural mapping for LUT-based FPGAs," in International Conference on Reconfigurable Computing and FPGAs, 2008, pp. 223-228.
[4] A. Kalavade and P. A. Subrahmanyam, "Hardware/software partitioning for multifunction systems," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 17, no. 9, pp. 819-837, 1998.
[5] L. Silva, A. Sampaio, and E. Barros, "A constructive approach to hardware/software partitioning," Formal Methods in System Design, vol. 24, no. 1, pp. 45-90, 2004.
[6] J. Wu, T. Srikanthan, and C. Yan, "Algorithmic aspects for power-efficient hardware/software partitioning," Mathematics and Computers in Simulation, vol. 79, no. 4, pp. 1204-1215, 2008; 5th Vienna International Conference on Mathematical Modelling / Workshop on Scientific Computing in Electronic


Engineering of the 2006 International Conference on Computational Science / Structural Dynamical Systems: Computational Aspects.
[7] W. Jigang, T. Srikanthan, and G. Chen, "Algorithmic aspects of hardware/software partitioning: 1D search algorithms," IEEE Transactions on Computers, vol. 59, no. 4, pp. 532-544, 2010.
[8] A. C. S. Beck and L. Carro, "Dynamic reconfiguration with binary translation: breaking the ILP barrier with software compatibility," in DAC '05: Proceedings of the 42nd Annual Design Automation Conference, New York, NY, USA: ACM, 2005, pp. 732-737.
[9] R. Lysecky, G. Stitt, and F. Vahid, "Warp processors," ACM Transactions on Design Automation of Electronic Systems, vol. 11, no. 3, pp. 659-681, 2006.
[10] Jason R. Andrews, Co-Verification of Hardware and Software for ARM SoC Design, Elsevier Inc., 2005.
[11] Ando Ki, SoC Design and Verification: Methodologies and Environment, Hongreung Science, 2008.
[12] Yongjoo Kim, Kyuseok Kim, Youngsoo Shin, Taekyoon Ahn, Wonyong Sung, Kiyoung Choi, and Soonhoi Ha, "An integrated hardware-software cosimulation environment for heterogeneous systems prototyping," in ASP-DAC, 1995, pp. 101-106.
[13] S. Chikada, S. Honda, H. Tomiyama, and H. Takada, "Cosimulation of ITRON-based embedded software with SystemC," in HLDVT, 2005, pp. 71-76.
[14] David C. Black and Jack Donovan, SystemC: From the Ground Up, Eklectic Ally, Inc., 2004.
[15] Thorsten Grotker, Stan Liao, Grant Martin, and Stuart Swan, System Design with SystemC, Kluwer Academic Publishers, 2002.
[16] SystemC Language Reference Manual, http://www.systemc.org.
[17] Stuart Sutherland, Simon Davidmann, and Peter Flake, SystemVerilog for Design (2nd Edition): A Guide to Using SystemVerilog for Hardware Design and Modeling, Springer, 2006.
[18] Chris Spear, SystemVerilog for Verification (2nd Edition): A Guide to Learning the Testbench Language Features, Springer, 2008.
[19] SystemVerilog 3.1a Language Reference Manual: Accellera's Extensions to Verilog, Accellera, Napa, California, 2004.
[20] Stuart Sutherland, "SystemVerilog, ModelSim, and You," Mentor User2User, 2004.
[21] Stuart Sutherland, "Integrating SystemC Models with Verilog and SystemVerilog Models Using the SystemVerilog Direct Programming Interface," SNUG Boston, 2004.
[22] S. Yoo and A. A. Jerraya, "Hardware/software cosimulation from interface perspective," IEE Proceedings - Computers and Digital Techniques, vol. 152, no. 3, 2005.
[23] T. Jozawa, L. Huang, T. Sakai, S. Takeuchi, and M. Kasslin, "Heterogeneous co-simulation with SDL and SystemC for protocol modeling," in RWS, 2006, pp. 603-606.
[24] ModelSim SE User's Manual, http://www.mentor.com.
[25] iNCITE User Manual, Dynalith Systems, http://www.dynalith.com.
[26] iNSPIRE User Manual, Dynalith Systems, http://www.dynalith.com.
[27] Howard Jay Siegel and Shoukat Ali, "Techniques for mapping tasks to machines in heterogeneous computing systems," Journal of Systems Architecture, vol. 46, pp. 627-639.
[28] T. D. Braun, H. J. Siegel, N. Beck, L. L. Boloni, M. Maheswaran, A. I. Reuther, J. P. Robertson, M. D. Theys, Bin Yao, D. Hensgen, and R. F. Freund, "A comparison study of static mapping heuristics for a class of meta-tasks on heterogeneous computing systems," in Proceedings of the Eighth Heterogeneous Computing Workshop (HCW '99), 12 April 1999, pp. 15-29.
[29] M. Maheswaran and H. J. Siegel, "A dynamic matching and scheduling algorithm for heterogeneous computing systems," in Proceedings of the Seventh Heterogeneous Computing Workshop (HCW '98), 30 March 1998, pp. 57-69.
