Preferred topic: TPC-5 Signal Processing and System Control: data reduction and signal processing, Title: Study of High Level design methodologies for a MPEG frames I Compressor directed to FPGA Implementation

Title: PhD student, Name: Antoni, Family name: Portero, Street: Edifici Q, Universitat Autònoma de Barcelona, Zip: 08193, City: Bellaterra, Country: Spain, Phone: +34 93 5813559, FAX: +34 93 5813033, Mail: [email protected]
Title: Mr, Name: Oscar, Family name: Navas, Street: Edifici Q, Universitat Autònoma de Barcelona, Zip: 08193, City: Bellaterra, Country: Spain, Phone: +34 93 5823559, FAX: +34 93 5813033, Mail: [email protected]
Title: Professor, Name: Jordi, Family name: Carrabina, Street: Edifici Q, Universitat Autònoma de Barcelona, Zip: 08193, City: Bellaterra, Country: Spain, Phone: +34 93 5811078, FAX: +34 93 5813033, Mail: [email protected]

Study of High Level design methodologies for a MPEG frames I Compressor directed to FPGA Implementation
Antoni Portero, Oscar Navas, Jordi Carrabina
Dep. Informàtica, Universitat Autònoma de Barcelona, 08193 Edifici Q, ETSE, Bellaterra, Spain

Abstract: Time to market is a crucial factor for success, so products must be developed as quickly as possible. Electronic systems normally contain some parts implemented in HW and others in SW, and the HW/SW partition is a crucial issue in the cost-performance trade-off. A bad choice usually requires redesigning or remapping parts of the system, with the consequent waste of time and money. In recent years many new concepts have appeared for system design, such as HW/SW co-design based on System-on-Chip (SoC) and virtual components. These concepts rely on methodologies that try to integrate HW and SW design techniques into one consistent system-level methodology, making the work more secure (specifications and developments are verifiable), more efficient (cost can be analysed) and automatable (through system synthesis). A demonstrator based on MPEG video compression has been designed and validated. It implements video coding using the standard ISO/IEC 13818-2 | ITU-T H.262 (also known as "MPEG-2 Video"), for the Main Profile at Main Level (720x480, 30 fps). The encoder implements I frames (I-pictures).

Index Terms: HW/SW co-design, rapid prototyping, MPEG coding, MATLAB HW design, embedded processors, IP cores, FPGAs, SystemC, embedded software, CoCentric System Studio.

I. INTRODUCTION
This paper presents two HW/SW co-design methodologies for video compression applications, based on the use of reconfigurable platforms and embedded processor cores (IPs) for SoC design. One methodology is based on the Matlab environment with FPGA toolboxes, used to model, design and verify system performance. The other is based on a SystemC description, using the CoCentric System Studio environment from Synopsys and the SystemC libraries to model, design and verify, and CoCentric SystemC Compiler for synthesis.

The basic idea in HW/SW co-design [3][5] for an embedded system is to start the design from a system-level description of the behaviour, taking into account the architectural and implementation restrictions and directives of the system. This description is passed through the co-design process, which estimates the cost of different HW and SW alternatives. The result is a HW description (VHDL, Verilog...) that will be synthesized for specific hardware (ASIC, FPGA...) and software code that will be compiled for a microprocessor, DSP, and so on. However, it is not easy to find a good CAD environment able to integrate all co-design aspects, including those concerning the different computational models used for system specification.

For this reason, two teams started from the same system specification in C, the standard model of ISO/IEC 13818-2 | ITU-T H.262 [2] (MPEG-2 video), to implement a HW/SW MPEG-2 I-frame encoder. One methodology is based on Matlab/Simulink and the DSP Builder toolbox from Altera to implement the HW blocks, together with Quartus II, which allows the implementation of soft-core microprocessors (NIOS, using the SOPC Builder tool). The second methodology is based on the SystemC language and the CoCentric System Studio environment to verify and synthesize.

The paper is structured as follows: Section II gives a short introduction to the MPEG-2 I-frame video encoder; Section III presents the Matlab/Simulink and CoCentric System Studio environments; Section IV shows the platform used in the implementation of the system; Section V details the two co-design methodologies; Section VI gives a DCT synthesis example with both methodologies. Results for the MPEG-2 I-frame encoder are presented in Section VII, which leads to the conclusions of the research work.

II. MPEG-2 FRAME I DESCRIPTION
Any signal to be coded [1] is composed of two components: a novel part and a redundant part. The novel part (the entropy) is the true information, whereas the rest is redundant because it is not essential. Redundancy can be either spatial, as in an area of an image where neighbouring pixels have almost the same value, or temporal, when the similarity occurs among successive images. Every compression system works by separating the entropy from the redundancy at the encoder. In the encoder we find some usual parts:

Fig 1. Encoding Process

1) Redundancy extraction. In a video encoder this redundancy can be temporal or spatial, and it is extracted using transformation and prediction techniques.
2) Entropy reduction using quantization factors.
3) Lossless coding of the entropy.

In MPEG, these parts are mapped to the specific mathematical algorithms that we summarize next [4],[6]-[8]. Spatial redundancy is extracted using the discrete cosine transform (DCT). Entropy reduction is achieved through a quantization matrix that takes into account the sensitivity of the human eye to different frequencies, and lossless encoding is mapped to Huffman variable length coding.
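As a minimal sketch of the entropy-reduction step and of the zig-zag reordering used by the encoder, the Python fragment below divides an 8x8 block of DCT coefficients by a quantization matrix and then reads the result in zig-zag order. The coefficient values and the step sizes in `qmatrix` are hypothetical and chosen only for illustration; they are not the matrices of the standard, and the plain divide-and-round rule is a simplification of the quantizer defined in H.262.

```python
def zigzag_order(n=8):
    """Return the (row, col) visiting order of the classic zig-zag scan."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even diagonals are walked from bottom-left to top-right,
        # odd diagonals from top-right to bottom-left.
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def quantize(block, qmatrix):
    """Divide each coefficient by its step size and round (simplified rule)."""
    n = len(block)
    return [[round(block[r][c] / qmatrix[r][c]) for c in range(n)]
            for r in range(n)]


def zigzag_scan(block):
    """Flatten a square block into a list following the zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


if __name__ == "__main__":
    # Hypothetical DCT coefficients with energy in the low frequencies.
    coeffs = [[0] * 8 for _ in range(8)]
    coeffs[0][0], coeffs[0][1], coeffs[1][0], coeffs[2][2] = 400, -60, 33, 9
    # Hypothetical step sizes that grow with frequency (not the standard matrix).
    qmatrix = [[8 + 2 * (r + c) for c in range(8)] for r in range(8)]
    print(zigzag_scan(quantize(coeffs, qmatrix))[:8])
```

After quantization most high-frequency coefficients become zero, and the zig-zag scan groups them at the end of the sequence, which is what makes the subsequent variable length coding effective.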

Fig 2. Encoder MPEG-2 Video Frames I

1) Image Sensor / pre-processing: captures the image from the sensor and performs the pre-processing tasks.
2) DCT: unit that computes the discrete cosine transform.
3) Q: quantizer unit.
4) Zig-Zag: unit that reorders the coefficients in zig-zag order.
5) VLC-HUFFMAN: unit that encodes the coefficients with the variable length coding algorithm.
6) Header Builder: builds the MPEG-2 video headers.
7) Transmitter

III. SYSTEM C VS. MATLAB-DSP BUILDER ENVIRONMENT FOR SOC DESIGN IN FPGA PLATFORMS

One of the main current problems in system design is the system specification, the architecture partition and the final system verification. Describing the system from a golden model and deciding which parts to target to hardware and which to software is crucial. Hardware engineers usually work isolated from software engineers and, if there is a bug in the design, it is usually the software engineers who have to retarget the design to fit the specifications. For these reasons, new methodologies and tools are emerging.

A. MATLAB-DSP BUILDER
The HW partition of the system implements complex signal processing algorithms that follow a classical data-flow model of computation. Matlab/Simulink has a long tradition in the implementation of such systems, and component-specific toolboxes like DSP Builder clearly improve design productivity for their implementation and verification. The DSP Builder tool flow allows algorithm design using Matlab, system integration with Simulink and export of the HDL design to synthesis tools, in this case Quartus II. From a system representation built using a bottom-up design methodology, DSP Builder automatically generates an RTL description and a test-bench for Simulink. Altera offers a design flow based on C using embedded processors and software tools like SOPC Builder and DSP Builder. The DSP Builder toolset is integrated with the SOPC Builder toolset, making SW design a standard process. The SOPC Builder interface allows the user to build systems that include designs from Simulink and from Altera (embedded processors and IP cores).

B. SYSTEM C – CoCentric
SystemC is an emerging architecture-level modelling language. It aims to close the large modelling gap between low-level HDLs (VHDL, Verilog) and high-level languages (C, C++). As a language it adds support for concurrency, a notion of time (clocks, delays), a communication model, reactivity to events, and additional data types for hardware design. CoCentric System Studio provides extensive support for SystemC. It enables complete end-to-end system simulation of a SoC in realistic virtual environments.

IV. PLATFORM TO IMPLEMENT THE SYSTEM
To select the prototyping platform, we took into account the component requirements together with the availability of new FPGA-based technologies such as SoC support or embedded DSP modules. The selected platform is the DSP-BOARD/S25 (shown in Fig. 3). It contains an FPGA device (Stratix EP1S25) running with an 80 MHz clock (which can be increased up to 200 MHz through the use of embedded PLLs). It gives very good support (SOPC Builder) for NIOS microprocessor implementation and it also contains DSP modules. The toolset proposed to work with this platform using the Matlab methodology is composed of Matlab/Simulink from Mathworks; Quartus II, DSP Builder and SOPC Builder from Altera; and GNUPro from Red Hat.

Fig. 3. DSP-BOARD/S25 Development Board

V. CODESIGN METHODOLOGIES
This section explains the methodologies, developed according to the available tools, that we used for the implementation of MPEG-2 video systems. As a starting point we propose to use descriptions of video standard models (i.e. ITU-T H.262) based on the standard source software (written in C). The main advantage of a system-level description in the C/C++ programming language is that it allows reusing "good" software code as well as its verification environments. The encoder input is a set of files that contain image data in YUV format with 720x480 resolution and 4:2:0 chroma format. We use a C/C++ compiler for compiling, linking and debugging the system-level model in C, and we execute it on a PC. The program execution generates a file containing compressed data (*.m2v) that is later analysed with the commercial tool MPROBE to check the correct behaviour of the system-level model.

A. Matlab Methodology
The next step in the Matlab methodology is the computational complexity analysis of the system in terms of: estimated number of operations per second, number of clock cycles, program code and data memory, and the memory space required to store images. After choosing the architecture, and taking into account the computational cost associated with the implementation of each block, we produce the HW/SW partition. Its result is the set of requirements for the implementation of the software modules, the hardware modules and the interface between hardware and software.

HW modules are modelled at system level with Matlab/Simulink and the DSP Builder toolbox from Altera, which still allows functional and temporal verification. Simulink simulation results are compared with the results obtained from the C code; if the results differ, we refine the HW module until it behaves correctly. The HW/SW interface is modelled in Matlab/Simulink and simulated with the Simulink models of the HW and SW components. The software is adapted bearing in mind the algorithms that it will implement. Execution of the software modules generates a file (*.m2v) that can later be verified with MPROBE. Behavioural verification also produces the test-bench for the Matlab hardware subsystem.

The analysis of the software and of the HW/SW interface implementation drives the design of the NIOS microprocessor architecture with the SOPC Builder tool for Quartus II. Once the right options are selected, the whole SoC subsystem is generated, producing VHDL or Verilog files with the description of the microprocessor, together with a test bench to simulate the SW system at HW level with ModelSim. The generated software environment contains header files, software libraries, and drivers for peripherals, either general purpose or those included by designers as IPs.
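The first figures of such a complexity analysis follow directly from the input format stated above. The sketch below is a back-of-the-envelope calculation, not the authors' analysis: it counts samples per 4:2:0 frame, 8x8 blocks per frame, and the clock cycles available per sample at 30 fps. The 80 MHz value is the board clock given in Section IV; the function names are hypothetical.

```python
WIDTH, HEIGHT, FPS = 720, 480, 30   # Main Profile at Main Level input
CLOCK_HZ = 80_000_000               # board clock stated in Section IV


def samples_per_frame(width, height):
    """4:2:0: one full-resolution luma plane plus two chroma planes
    subsampled by 2 horizontally and vertically."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return luma + chroma


def blocks_per_frame(width, height, block=8):
    """Number of block x block units that the DCT must process per frame."""
    return samples_per_frame(width, height) // (block * block)


def cycles_per_sample(clock_hz, width, height, fps):
    """Clock cycles available for each input sample in real time."""
    return clock_hz / (samples_per_frame(width, height) * fps)


if __name__ == "__main__":
    print("samples per frame:", samples_per_frame(WIDTH, HEIGHT))
    print("8x8 blocks per frame:", blocks_per_frame(WIDTH, HEIGHT))
    print("samples per second:", samples_per_frame(WIDTH, HEIGHT) * FPS)
    print("cycles per sample:",
          round(cycles_per_sample(CLOCK_HZ, WIDTH, HEIGHT, FPS), 2))
```

With these inputs a frame holds 518,400 samples (8,100 blocks), so real-time operation requires about 15.6 million samples per second, which leaves roughly five clock cycles per sample at 80 MHz. A HW data path that accepts one sample per clock therefore has margin, while a purely sequential software implementation of the DCT would not fit in that budget.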
We rebuild the application software code for the resulting NIOS architecture, including the corresponding libraries and header files. At this point, we can simulate the program code running on the embedded microprocessor or we can prototype it on the board. To verify the embedded software we can again generate an output file (*.m2v), using the standard "printf" function and sending it out through the UART (generated in the synthesis of the SoC), in order to validate it with MPROBE. After that functional verification, we check that the temporal restrictions are satisfied according to real-time image conditions (30 fps of 720x480 images).

The HW is then synthesized at RTL level using DSP Builder, generating at the same time the test-bench for ModelSim simulation, whose results can be analysed at Simulink level. It is also possible to co-simulate the synthesized HW blocks in ModelSim with the rest of the HW blocks modelled in Simulink. Of course, we can also prototype the HW on the board. The HW/SW interface is synthesized from a Matlab/Simulink system model by generating an RTL description that can be simulated with ModelSim. Once the previous steps have been completed, we have an RTL description of the whole system, which can be verified either through co-simulation of the SW, the HW and the HW/SW interface, or through on-board prototyping.

B. System C Methodology
CoCentric SystemC Compiler synthesizes hardware from SystemC source code. It is a tool for design teams that need a fast and high-quality path from a system-level hardware description coded in C/C++ to gates or to a synthesizable Verilog or VHDL RTL description. This methodology is based on the information extracted from [10][11]. The HW/SW partition has been done in the same way as explained earlier. Hence, DCT, Q and zig-zag are intended to be implemented in HW, while the VLC and the video layer are developed and refined in SW. The HW subsystem description is written and refined at behavioural level. One of the advantages is that the whole system is described in the same environment and language, which helps a lot in finding bugs at system level. Simulations at behavioural level are considerably shorter than the equivalent Matlab simulations; unfortunately, behavioural synthesis is error prone and time consuming, since the tool is complex. To start with SystemC Compiler, the target FPGA technology for the design must be defined by setting the target_library and synthetic_library variables.

Fig 4: SystemC Compiler Flow

dc_shell>
> target_library = {"stratix-5.db"}
> synthetic_library = {"stratix-5.sldb"}
> link_library = {"*"} + target_library + synthetic_library

Analyzing and Elaborating the Source Code
Before using SystemC Compiler, we ensured that our behavioral description correctly reflects the functionality of our design by thoroughly simulating it in a C++ development environment. This ensures that the design is functionally correct and meets the functional specification. It is also valuable for detecting and correcting any C++ syntax and semantic errors.

Setting the Environment for BCView Analysis
SystemC Compiler provides a graphical analysis environment called BCView for evaluating the design that SystemC Compiler synthesizes. We set the bc_enable_analysis_info variable to true prior to starting synthesis:
> bc_enable_analysis_info = true

Elaborating the Design
The compile_systemc command is used to read our SystemC source and check it for compliance with synthesis policy, C++ syntax, and C++ semantics. If there are no errors, SystemC Compiler produces an internal database (.db) ready for timing analysis. This process is called elaboration. The compile_systemc command, using the default settings for its options, does the following:
• Checks C++ syntax and semantics.
• Replaces source code arithmetic operations with DesignWare components.
• Performs optimizations such as constant propagation, constant folding, dead code elimination, function timing, and algebraic simplification.
• For a behavioral module, performs the necessary elaboration steps to prepare the SystemC description for timing analysis, scheduling, and logic synthesis.
> compile_systemc top_unit + ".cpp"

Setting the Clock Period
The create_clock command is used to set the clock period. Our design has a port named clk, so we entered:
> create_clock clk -period 100
The clock period is specified in the time unit that is defined in the target technology library.

Checking the Design
The bc_check_design command is used to check for errors that would prevent our design from being scheduled or synthesized with SystemC Compiler. We entered:
> bc_check_design -io_mode [cycle_fixed] [superstate_fixed]
The -io_mode option specifies how SystemC Compiler handles I/O operations while performing synthesis. For behavioral synthesis, superstate_fixed is strongly recommended, because the tool can then insert more clock stages if events need more than one clock cycle. If the bc_check_design command reports errors in the description, we use the details in the error report and the man pages for the errors to make the necessary changes in our source code. Afterwards, we repeat the steps, starting with the compile_systemc command.

Estimating Time and Area
In the timing step, SystemC Compiler explores all the components in the specified synthetic libraries that can possibly be used to implement the operations in our behavioral description. SystemC Compiler uses the target technology to build the components and to estimate the delay through them and their area.

Timing the Design
The bc_time_design command is used to perform timing and area estimation:
> bc_time_design -fastest
This command annotates the current design with the timing and area data for later use by the schedule command.

Scheduling the Design and Allocating Resources
The schedule command completes behavioral synthesis by performing the following tasks:
• Scheduling, which selects the clock cycle for the execution of each operation.
• Allocation, which selects the numbers and types of synthetic components and registers required in the synthesized design and assigns operations and variables to the allocated hardware.
• Data path generation, which builds a netlist containing the allocated hardware appropriately interconnected with wires and multiplexers.
• Controller generation, which builds an FSM controller to control the data path and ensure that it executes the functionality specified in the behavioral description.
The output of scheduling is a structural RTL netlist containing a design for the behavioral description. To target the design to an Altera FPGA device, we entered:
> set_fpga -target "STRATIX" -device "EP1S25F780" module -speed "C5"
To execute the schedule command, we entered:
> schedule -io_mode [cycle_fixed | superstate_fixed ]
The -io_mode option specifies how SystemC Compiler handles I/O operations while performing synthesis. By default, the schedule command optimizes the design for the lowest possible latency in terms of clock cycles and then for the smallest design that achieves that latency. We influenced the optimization goal of scheduling, and the amount of time spent looking for the best design, by using schedule command options. Constraints set prior to running the schedule command also influence behavioral synthesis.

VI. ONE-DIMENSIONAL DISCRETE COSINE TRANSFORM EXAMPLE
The basic computation in a DCT-based system is the transformation of an NxN image block from the spatial domain to the DCT domain. For the image compression standards, N = 8. The DCT is an orthogonal transform; in matrix form, the DCT output is $Y = TXT^t$. The transformation, commonly referred to as the forward DCT or simply the DCT, is expressed as

$$F(u,v) = \frac{C(u)C(v)}{4} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}, \quad u,v = 0,1,\dots,7 \qquad (1)$$

and

$$C(u), C(v) = \begin{cases} \dfrac{1}{\sqrt{2}} & u,v = 0 \\ 1 & \text{otherwise} \end{cases}$$

An important property of the 2D-DCT is separability. The 1D-DCT is computed as

$$Z(u) = \frac{C(u)}{2} \sum_{x=0}^{7} f(x) \cos\frac{(2x+1)u\pi}{16}, \quad u = 0,1,\dots,7 \qquad (2)$$

$$C(u) = \begin{cases} \dfrac{1}{\sqrt{2}} & u = 0 \\ 1 & \text{otherwise} \end{cases}$$

This equation can also be expressed in vector-matrix form as $Z = Tx^t$, where $T$ is an 8x8 matrix whose elements are the cosine function values defined in (2), $x = \{x_0, x_1, \dots, x_7\}$ is a row vector, and $Z$ is a column vector. From (1), the output of the 2D-DCT can be expressed as

$$F(u,v) = \frac{C(u)}{2} \sum_{x=0}^{7} \left[ \frac{C(v)}{2} \sum_{y=0}^{7} f(x,y) \cos\frac{(2y+1)v\pi}{16} \right] \cos\frac{(2x+1)u\pi}{16} \qquad (3)$$

$$Z(x,v) = \frac{C(v)}{2} \sum_{y=0}^{7} f(x,y) \cos\frac{(2y+1)v\pi}{16}, \quad x = 0,1,\dots,7 \qquad (4)$$

Equation (4) denotes the output of the 1D-DCT of the rows of f(x,y). The above equations imply that the 2D-DCT can be obtained by first performing the 1D-DCT of the rows of f(x,y), followed by the 1D-DCT of the columns of Z(x,v). In matrix notation, $Y = TXT^t$, which can also be expressed as $Z = TX^t$, $Y = TZ^t = TXT^t$.
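The separability property can be checked numerically. The following Python sketch, written only as an illustration of equations (1), (2) and (4), computes the 2D-DCT of an 8x8 block both directly from the double sum and by applying the 1D-DCT first to the rows and then to the columns of the intermediate result. The function names are hypothetical, and floating point is used instead of the fixed-point arithmetic a HW implementation would need.

```python
import math

N = 8


def c(k):
    """Normalization factor C(k) from equations (1) and (2)."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct_1d(f):
    """1D-DCT of a length-8 sequence, equation (2)."""
    return [c(u) / 2 * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / 16)
                           for x in range(N))
            for u in range(N)]


def dct_2d_direct(f):
    """2D-DCT computed with the double sum of equation (1)."""
    return [[c(u) * c(v) / 4 * sum(f[x][y]
                                   * math.cos((2 * x + 1) * u * math.pi / 16)
                                   * math.cos((2 * y + 1) * v * math.pi / 16)
                                   for x in range(N) for y in range(N))
             for v in range(N)]
            for u in range(N)]


def dct_2d_separable(f):
    """2D-DCT as row transforms, equation (4), followed by column transforms."""
    z = [dct_1d(row) for row in f]                  # z[x][v] = Z(x, v)
    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        column = dct_1d([z[x][v] for x in range(N)])  # transform over x
        for u in range(N):
            out[u][v] = column[u]                   # out[u][v] = F(u, v)
    return out


if __name__ == "__main__":
    block = [[(3 * x + 5 * y) % 17 for y in range(N)] for x in range(N)]
    direct = dct_2d_direct(block)
    separable = dct_2d_separable(block)
    error = max(abs(direct[u][v] - separable[u][v])
                for u in range(N) for v in range(N))
    print("maximum difference:", error)
```

The direct form needs 64 multiply-accumulate steps per output coefficient, whereas the row-column form needs 8 per coefficient in each of its two passes, which is why hardware implementations are usually built from a 1D-DCT unit.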

A. MATLAB-DSP BUILDER
The DSP Builder tool flow allows algorithm design using Matlab, system integration with Simulink and export of the HDL design to synthesis tools, in this case Quartus II.

Fig. 5. 1D-DCT implementation (block diagram: the pixel value feeds eight parallel MAC units with coefficients Ti1 to Ti8, which produce the 1D-DCT values)

As an example, we show the implementation of the 1D-DCT for the rows of an 8x8 image block from the spatial domain to the DCT domain. This implementation uses a parallel scheme that is compatible with a throughput of one pixel per clock, as shown in Fig. 5.

We build the 1D-DCT algorithm as an RTL-level Matlab/Simulink description and simulate it with Simulink. An HDL description is then generated that can be simulated with HW simulation tools. The system can be synthesized and implemented in the same environment. Verification is done by comparing implementation results with simulation results. The implementation can be extended by combining it with basic elements of the DSP Builder toolbox, for example to implement the 2D-DCT.

B. SystemC CoCentric System Studio
The code is a SystemC description of the one-dimensional discrete cosine transform [12] algorithm for BLOCK by BLOCK pixels; in this case BLOCK equals 8.

Fig. 6. SystemC Code DCT

The innermost loop carries a Synopsys directive that unrolls it, so this directive produces 8 multipliers working in parallel. The loop labelled dcti has the same directive and is also unrolled, so there are 8x8 multipliers and 8 adders. Another issue is that a loop can only be pipelined if it is described with while, so the outermost loop is described with a while; in addition, only unrolled loops can be pipelined. However, if we do not constrain scheduling, the tool allocates minimum resources and generates just one multiplier and one adder. To prevent this minimum resource utilization, these two commands have to be written:
> dct1i = find(cell, -hier, "dct1i")
> set_max_cycles 76 -from_beginning dct1i -to_end dct1i
They find the labelled loop that we want to pipeline and limit its maximum number of cycles. The next command indicates the initiation interval of the pipeline and the complete latency of the loop:
> pipeline_loop dct1i -initiation_interval 1 -latency 76

Finally, the maximum number of cycles of both loops should be limited. The bc_view command shows a lot of information, including the fact that there is a matrix of 64 multipliers. After the gate netlist is obtained, we load it in the Synplify Pro tool from Synplicity and finally in Quartus II, and the 64 multipliers are implemented in the DSP blocks of the Altera FPGA. By changing the loop unrolling directives it is easy to build a vector of 8 multipliers instead of a matrix of 64.

Fig 7. Synthesis summary: 64 multipliers in the DSP blocks
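The effect of the unrolling choices on latency can be illustrated with a toy resource-constrained scheduler. The sketch below is not the algorithm used by SystemC Compiler: it is a simple greedy list scheduler, written under the assumptions that every operation takes one clock cycle, that each of the 8 output coefficients is accumulated through a chain of 7 additions, and that pipelining and I/O are ignored. All names in the code are hypothetical.

```python
from collections import defaultdict


def list_schedule(ops, deps, capacity):
    """Greedy list scheduling with unit-latency operations.

    ops: dict name -> resource kind; deps: dict name -> predecessor names;
    capacity: dict kind -> units usable per cycle (each must be >= 1).
    Returns dict name -> 0-based clock cycle.
    """
    cycle_of = {}
    remaining = list(ops)
    cycle = 0
    while remaining:
        used = defaultdict(int)
        chosen = []
        for name in remaining:
            kind = ops[name]
            # An operation is ready once all predecessors ran in earlier cycles.
            ready = all(p in cycle_of for p in deps.get(name, ()))
            if ready and used[kind] < capacity[kind]:
                used[kind] += 1
                chosen.append(name)
        for name in chosen:
            cycle_of[name] = cycle
        remaining = [n for n in remaining if n not in cycle_of]
        cycle += 1
    return cycle_of


def dct_1d_graph(n=8):
    """Operations of an n-point 1D-DCT: n*n multiplications and, for each
    output u, a chain of n-1 additions accumulating the products."""
    ops, deps = {}, {}
    for u in range(n):
        for x in range(n):
            ops[f"mul_{u}_{x}"] = "mul"
            if x >= 1:
                prev = f"mul_{u}_0" if x == 1 else f"add_{u}_{x - 1}"
                ops[f"add_{u}_{x}"] = "add"
                deps[f"add_{u}_{x}"] = [prev, f"mul_{u}_{x}"]
    return ops, deps


def latency(multipliers, adders=8):
    """Total cycles needed for one 1D-DCT with the given resources."""
    ops, deps = dct_1d_graph()
    cycles = list_schedule(ops, deps, {"mul": multipliers, "add": adders})
    return max(cycles.values()) + 1


if __name__ == "__main__":
    for m in (64, 8, 1):
        print(m, "multipliers ->", latency(m), "cycles")
```

Under these assumptions the model needs 8 cycles with 64 multipliers, 15 cycles with 8 multipliers and 65 cycles with a single multiplier. The absolute numbers are not comparable with the latencies reported by the tool, but the trend shows why the scheduler must be constrained when a high-throughput data path is wanted.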

VII. MPEG-2 VIDEO FRAMES-I IMPLEMENTATION AND RESULTS
A. MATLAB-DSP BUILDER & SystemC CoCentric System Studio & QUARTUS II
From a system representation built using a bottom-up design methodology, DSP Builder automatically generates an RTL description and a test-bench for Simulink. Synthesis results are shown in Table I.

TABLE I
SYNTHESIS OF HW SUBSYSTEMS
Resource | Matlab | SystemC
Device | EP1S25F780C6 | EP1S25F780C6
Total logic elements | 2,029 / 25,660 (7 %) | 32,438 / 25,660 (126 %)
Total pins | 24 / 597 (4 %) | 23 / 597 (4 %)
Total memory bits | 338,212 / 1,944,576 (17 %) | 0 / 1,944,576 (0 %)
DSP block 9-bit elements | 64 / 80 (80 %) | 64 / 80 (80 %)
Total PLLs | 1 / 6 (17 %) | 1 / 6 (17 %)
Total DLLs | 0 / 2 (0 %) | 0 / 2 (0 %)

From Table I we observe that our SystemC design currently does not use the Stratix FPGA memories: everything is implemented in registers, and for this reason the total number of logic elements is too large. In the future, we will add Stratix memory models to the synthetic library so that they can be used at high level. These memories are generated with the genmem tool from Altera. The achieved throughput for the system is one pixel per clock. Table II shows the latency results for the different HW modules.

TABLE II
LATENCY OF HW SUBSYSTEMS IN CLOCKS
Module | Matlab | SystemC
DCT | 84 | 76
Q | 5 | 5
ZigZag | 1 | 1
TOTAL | 90 | 81

The software subsystem has been implemented in the Nios microprocessor created with the SOPC Builder tool. Results are shown in Table III.

TABLE III
SYNTHESIS OF THE NIOS MICROPROCESSOR
Resource | Value
Device | EP1S25F780C6
Total logic elements | 3,863 / 25,660 (15 %)
Total pins | 215 / 597 (36 %)
Total memory bits | 421,888 / 1,944,576 (21 %)
DSP block 9-bit elements | 2 / 80 (2 %)
Total PLLs | 0 / 6 (0 %)
Total DLLs | 0 / 2 (0 %)

Table III shows synthesis results obtained from Quartus II. The system verification has been done using the MPROBE software, as shown in Fig. 8.

Fig. 8. Design Verification with MPROBE

CONCLUSIONS
We have built demonstrators based on real-time MPEG video compression, ISO/IEC 13818-2 | ITU-T H.262 (also known as "MPEG-2 Video"), for the Main Profile at Main Level. These demonstrators validate a new HW/SW co-design methodology based on the use of reconfigurable platforms and embedded processor cores (IPs) for SoC design, using the Matlab and SystemC environments with toolboxes oriented to system prototyping, in order to model, design and verify complete system implementations with real-time restrictions. We are currently working on memory modelling in high-level SystemC. In the future, we will add new features such as motion estimation and motion compensation algorithms for inter-frame compression.

REFERENCES

[1] A Guide to MPEG Fundamentals and Protocol Analysis (Including DVB and ATSC). Tektronix.
[2] ISO/IEC 13818, "Generic coding of moving pictures and associated audio (MPEG-2)"; ITU-T Rec. H.262 | ISO/IEC 13818-2, "Generic Coding of Moving Pictures and Associated Audio", Draft Int. Standard, Oct. 1994.
[3] M. Serra, "Adaptació d'una metodología HW/SW pel prototipat de sistemes encastats reactius", Experimental Work, UAB, Bellaterra, Spain, April 2001.
[4] O. Ferraz, "Real Time Implementation of an Image Compression Algorithm", Final Degree Project, UAB, September 2001.
[5] F. Balarin, M. Chiodo, P. Giusto, H. Hsieh, A. Jurecska, L. Lavagno, C. Passerone, A. Sangiovanni-Vincentelli, E. Sentovich, K. Suzuki, and B. Tabbara, "Hardware-Software Co-Design of Embedded Systems: The POLIS Approach." Kluwer Academic Publishers, 1997.
[6] V. Bhaskaran, K. Konstantinides, "Image and Video Compression Standards, Algorithms and Architectures." Kluwer Academic Publishers, 1996.
[7] A. Bovik, "Handbook of Image and Video Processing." Academic Press, 2000.
[8] K. Jack, "Video Demystified." LLH Technology Publishing, 2001.
[9] ISO/IEC 11172-2: Information Technology - Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to 1.5 Mbit/s. Part 2: Video.
[10] CoCentric SystemC Compiler, Behavioral User and Modeling Guide. Version U-2003.06, June 2003.
[11] Thorsten Schubert, et al., "Evaluation of a Refinement-Driven SystemC-Based Design Flow", Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, Designers' Forum (DATE-04).
[12] Roger Endrigo Carvalho Porto, Luciano Volcan Agostini, "Project Space Exploration on the 2-D DCT Architecture of a JPEG Compressor Directed to FPGA Implementation", Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, Designers' Forum (DATE-04).
